Détail du package

@clickup/pg-snowflake

clickup122kMIT2.15.3

Generate Snowflake ID Standard compatible ids with optional cluster number

PostgreSQL, Snowflake, primary key, id generation

readme

pg-snowflake: generate Snowflake ID Standard compatible ids with optional cluster number

Exposes snowflake_gen() function that generates Snowflake ID Standard compatible ids (with the number of bits customizable, see below):

# Snowflake ID compatible (each letter is 1 bit):
0tttttttttttttttttttttttttttttttttt | CCCCC | iiiiiiiiii | ssssssssssssss
^                                     ^       ^            ^
35                                    5       10           14 bits

Here,

  • "t" denotes the bits of timestamp, which is the current Unix time in milliseconds minus the value of CONST_SNOWFLAKE_TS_START psql variable. Only CONST_SNOWFLAKE_TS_BITS highest bits of the timestamp are used.
  • "C" denotes "cluster number". If CONST_SNOWFLAKE_CLUSTER_NO_BITS is 0, then this field is not present, and the entire leftmost range is used for the timestamp.
  • "i" is "island no" aka "Snowflake Machine ID", must be unique across all master nodes within the same cluster.
  • "s" is "sequence number bits".

You can customize the length of the bit fields in your pg-snowflake.config.sql file, see below.

Also, you can pass custom arguments to snowflake_gen() function itself.

The library is typically installed in schema public on every PostgreSQL node.

Installation

First, copy pg-snowflake.config.sql.example to pg-snowflake.config.sql in the root of your app and optionally update CONST_* according to your needs.

\set CONST_SNOWFLAKE_TS_START 1475796139634
\set CONST_SNOWFLAKE_TS_BITS 35
\set CONST_SNOWFLAKE_CLUSTER_NO_BITS 5
\set CONST_SNOWFLAKE_ISLAND_NO_BITS 10
\set CONST_SNOWFLAKE_SEQ_BITS 14

Then run in psql console:

-- The library is typically installed in schema "public".
SET search_path TO public;

-- Optional: if you don't want to pass PG_SNOWFLAKE_CLUSTER_NO environment
-- variable, create this configuration function manually.
--CREATE OR REPLACE FUNCTION snowflake_cluster_no() RETURNS integer LANGUAGE sql
--  SET search_path FROM CURRENT AS 'SELECT 1';

-- Optional: if you don't want to pass PG_SNOWFLAKE_ISLAND_NO environment
-- variable, create this configuration function manually.
--CREATE OR REPLACE FUNCTION snowflake_island_no() RETURNS integer LANGUAGE sql
--  SET search_path FROM CURRENT AS 'SELECT 1';

-- Install the library to the current schema.
\ir .../pg-snowflake-up.sql

The pg-snowflake-up.sql installation script will search for pg-snowflake.config.sql file in the parent folders (it will fail if the file is not found).

By default, cluster number is read from PG_SNOWFLAKE_CLUSTER_NO shell env variable. This way, you can assign it to e.g. 1 for development, 2 for staging, 3 for production environments (or 3, 4, 5, ... for different clusters in different AWS regions). If for some reason you don't want this behavior, just define snowflake_cluster_no() function manually.

The same applies to PG_SNOWFLAKE_ISLAND_NO and snowflake_island_no() functions.

snowflake_gen([seq_name regclass][,ts_start bigint][,ts_bits][,cluster_no_bits][,island_no_bits][,seq_bits])

Generates a Snowflake ID Standard compatible ids.

  • seq_name: by default, the function uses snowflake_seq sequence to avoid within-millisecond (or within-second) collisions, but you can also pass your own sequence name instead.
  • ts_start: optional epoch start timestamp in milliseconds. Defaults to CONST_SNOWFLAKE_TS_START (see pg-snowflake.config.sql.example).
  • ts_bits: optional number of bits for the timestamp part. Defaults to CONST_SNOWFLAKE_TS_BITS.
  • cluster_no_bits: optional number of bits for the cluster number part. Defaults to CONST_SNOWFLAKE_CLUSTER_NO_BITS.
  • island_no_bits: optional number of bits for the island number part. Defaults to CONST_SNOWFLAKE_ISLAND_NO_BITS.
  • seq_bits: optional number of bits for the sequence part. Defaults to CONST_SNOWFLAKE_SEQ_BITS.

The default format is t35-C5-i10-s14. To customize, use PostgreSQL named arguments syntax, e.g.:

SELECT snowflake_gen(ts_bits:=30, cluster_no_bits:=10, island_no_bits:=10);

The benefit of this function is that you can order by id to get the approximate timestamp-based order, even across microshards/nodes. It also uses bits in the bigint id efficiently. The downside is that the id is not easily parsable by just the human eye, since it's bits-based (not decimal positions based).

Example of id generated:

  • 4705186322862407685