Elixir
Fast and efficient generation of cryptographically strong probably unique identifiers (puid, aka random string) of specified entropy from various character sets.
Puid
Simple, fast, flexible and efficient generation of probably unique identifiers (puid, aka random strings) of intuitively specified entropy using pre-defined or custom characters.
iex> defmodule(RandId, do: use(Puid, chars: :alpha, total: 1.0e5, risk: 1.0e12))
iex> RandId.generate()
"YAwrpLRqXGlny"
Overview
Puid provides a means to create modules for generating random IDs. Specifically, Puid allows full control over all three key characteristics of generating random strings: entropy source, ID characters and ID randomness.
A general overview provides information relevant to the use of Puid for random IDs.
Usage
Puid is used to create individual modules for random ID generation. Creating a random ID generator module is as simple as:
iex> defmodule(SessionId, do: use(Puid))
iex> SessionId.generate()
"8nGA2UaIfaawX-Og61go5A"
The code above uses default parameters, so Puid creates a module suitable for generating session IDs (ID entropy for the default module is 132 bits). Options allow easy and complete control of all three important facets of ID generation.
Entropy Source
Puid uses :crypto.strong_rand_bytes/1 as the default entropy source. The rand_bytes option can be used to specify any function of the form (non_neg_integer) -> binary as the source:
iex> defmodule(PrngId, do: use(Puid, rand_bytes: &:rand.bytes/1))
iex> PrngId.generate()
"bIkrSeU6Yr8_1WHGvO0H3M"
Characters
By default, Puid uses the RFC 4648 file system & URL safe characters. The chars option can be used to specify any of the 31 pre-defined character sets or custom characters, including Unicode:
iex> defmodule(HexId, do: use(Puid, chars: :hex))
iex> HexId.generate()
"13fb81e35cb89e5daa5649802ad4bbbd"
iex> defmodule(Base58Id, do: use(Puid, chars: :base58))
iex> Base58Id.generate()
"vRxen9A4vejoX4U66iaHna"
iex> defmodule(DingoskyId, do: use(Puid, chars: "dingosky"))
iex> DingoskyId.generate()
"yiidgidnygkgydkodggysonydodndsnkgksgonisnko"
iex> defmodule(DingoskyUnicodeId, do: use(Puid, chars: "dîñgø$kyDÎÑGØßK¥", total: 2.5e6, risk: 1.0e15))
iex> DingoskyUnicodeId.generate()
"øßK$ggKñø$dyGîñdyØøØÎîk"
Captured Entropy
The default Puid module generates IDs that have 132-bit entropy. Puid provides a simple, intuitive way to specify ID randomness by declaring a total number of possible IDs with a specified risk of a repeat in that many IDs.
To generate up to 10 million random IDs with a 1 in a quadrillion chance of repeat:
iex> defmodule(MyPuid, do: use(Puid, total: 10.0e6, risk: 1.0e15))
iex> MyPuid.generate()
"T0bFZadxBYVKs5lA"
The bits option can be used to directly specify an amount of ID randomness:
iex> defmodule(Token, do: use(Puid, bits: 256, chars: :hex_upper))
iex> Token.generate()
"6E908C2A1AA7BF101E7041338D43B87266AFA73734F423B6C3C3A17599F40F2A"
General Note
The mathematical approximations used by Puid always favor conservative estimation:
- overestimate the bits needed for a specified total and risk
- overestimate the risk of generating a total number of puids
- underestimate the total number of puids that can be generated at a specified risk
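The relationship between total, risk and bits can be sketched with the usual birthday-bound approximation. This is an illustration only, not Puid's source: the module and function names are hypothetical, and the formula is an assumption about the kind of approximation involved. Using total squared rather than total × (total - 1) errs on the high side, which is consistent with the conservative estimates listed above.

```elixir
defmodule EntropySketch do
  # Hypothetical helper, not part of Puid.
  # Bits needed so that `total` IDs have about a 1-in-`risk` chance of a repeat:
  #   bits ≈ 2 * log2(total) + log2(risk) - 1
  def bits(total, risk) do
    2 * :math.log2(total) + :math.log2(risk) - 1
  end

  # Number of characters needed to carry those bits when the
  # character set has `char_count` distinct characters.
  def id_length(total, risk, char_count) do
    ceil(bits(total, risk) / :math.log2(char_count))
  end
end

# total: 10.0e6, risk: 1.0e15 with the default 64 characters
IO.inspect(EntropySketch.id_length(10.0e6, 1.0e15, 64))
# 16, the length of the MyPuid example ID
```

Because the length is rounded up to a whole number of characters, the resulting ID usually carries slightly more entropy than requested.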
Installation
Add puid to mix.exs dependencies:
def deps,
do: [
{:puid, "~> 2.1"}
]
Update dependencies
mix deps.get
Module API
Puid modules have the following functions:
- generate/0: Generate a random puid
- total/1: total puids which can be generated at a specified risk
- risk/1: risk of generating total puids
- encode/1: Encode bytes into a puid
- decode/1: Decode a puid into bytes
- info/0: Module information
The total/1, risk/1 functions provide approximations to the risk of a repeat in some total number of generated puids. The mathematical approximations used purposely overestimate risk and underestimate total.
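As a rough illustration of what such approximations look like, the birthday-bound relation can be solved for either quantity. The module below is a hypothetical sketch, not Puid's implementation; it takes the entropy bits explicitly, and Puid's own functions may round or bound their results differently.

```elixir
defmodule RepeatSketch do
  # Hypothetical helper, not part of Puid.
  # IDs that can be generated before the chance of a repeat reaches 1 in `risk`.
  def total(bits, risk), do: :math.sqrt(:math.pow(2, bits + 1) / risk)

  # The risk (as in "1 in risk") of a repeat after generating `total` IDs.
  def risk(bits, total), do: :math.pow(2, bits + 1) / (total * total)
end

# 132 bits of entropy, 1 in a million risk of a repeat
IO.inspect(RepeatSketch.total(132, 1.0e6))
# about 1.04e17
```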
The encode/1, decode/1 functions convert String.t() puids to and from bitstring bits to facilitate binary data storage, e.g. as an Ecto type.
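For a 64-character set, where every character carries exactly 6 bits, the conversion can be pictured as mapping each character to its 6-bit index and back. The sketch below is a hypothetical stand-alone module that assumes this index-based mapping; it is not Puid's code and does not cover character sets whose size is not a power of 2.

```elixir
defmodule Safe64Sketch do
  # Hypothetical helper, not part of Puid.
  @chars String.graphemes(
           "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"
         )

  # Each character becomes its 6-bit index in the character list.
  # Raises if the string contains a character outside the set.
  def decode(puid) do
    puid
    |> String.graphemes()
    |> Enum.reduce(<<>>, fn ch, acc ->
      index = Enum.find_index(@chars, &(&1 == ch))
      <<acc::bitstring, index::size(6)>>
    end)
  end

  # Each 6-bit chunk becomes the character at that index.
  def encode(bits) do
    chars = for <<index::size(6) <- bits>>, do: Enum.at(@chars, index)
    Enum.join(chars)
  end
end

bits = Safe64Sketch.decode("CSWEPL3AiethdYFlCbSaVC")
IO.inspect(bit_size(bits))            # 132
IO.inspect(Safe64Sketch.encode(bits)) # "CSWEPL3AiethdYFlCbSaVC"
```

A 22-character ID yields 132 bits, which is 16 full bytes plus 4 trailing bits, so the result is a bitstring rather than a binary.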
The info/0 function returns a Puid.Info structure consisting of:
- source characters
- name of pre-defined Puid.Chars or :custom
- entropy bits per character
- total entropy bits
  - may be larger than the specified bits since it is a multiple of the entropy bits per character
- entropy representation efficiency
  - ratio of puid entropy to bits required for puid string representation
- entropy transform efficiency
  - ratio of puid entropy bits to avg entropy source bits required for ID generation
- entropy source function
- puid string length
Example
iex> defmodule(SafeId, do: use(Puid))
iex> SafeId.generate()
"CSWEPL3AiethdYFlCbSaVC"
iex> SafeId.total(1_000_000)
104350568690606000
iex> SafeId.risk(1.0e12)
9007199254740992
iex> SafeId.decode("CSWEPL3AiethdYFlCbSaVC")
<<9, 37, 132, 60, 189, 192, 137, 235, 97, 117, 129, 101, 9, 180, 154, 84, 32>>
iex> SafeId.encode(<<9, 37, 132, 60, 189, 192, 137, 235, 97, 117, 129, 101, 9, 180, 154, 84, 2::size(4)>>)
"CSWEPL3AiethdYFlCbSaVC"
iex> SafeId.info()
%Puid.Info{
characters: "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_",
char_set: :safe64,
entropy_bits: 132.0,
entropy_bits_per_char: 6.0,
ere: 0.75,
ete: 1.0,
length: 22,
rand_bytes: &:crypto.strong_rand_bytes/1
}
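The ere value above follows from the definition of entropy representation efficiency: entropy bits divided by the bits needed to store the string. A minimal sketch, assuming UTF-8 storage and averaging the byte size over the character set (the module name is hypothetical and this is not Puid's code):

```elixir
defmodule EreSketch do
  # Hypothetical helper, not part of Puid.
  # Ratio of entropy bits per character to storage bits per character,
  # using the average UTF-8 byte size of the characters in the set.
  def ere(chars) do
    count = chars |> String.graphemes() |> length()
    bits_per_char = :math.log2(count)
    avg_bytes_per_char = byte_size(chars) / count
    bits_per_char / (8 * avg_bytes_per_char)
  end
end

safe64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"
IO.inspect(EreSketch.ere(safe64))
# 0.75: 6 entropy bits carried by each 8-bit character

# Half of these 16 characters take two bytes in UTF-8, so the ratio is lower.
IO.inspect(EreSketch.ere("dîñgø$kyDÎÑGØßK¥"))
```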
Characters
Puid Predefined Charsets
| Name | Count | ERE | ETE | Characters |
|------|--------|-----|-----|------------|
| :alpha | 52 | 5.7 | 0.84 | ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz |
| :alpha_lower | 26 | 4.7 | 0.81 | abcdefghijklmnopqrstuvwxyz |
| :alpha_upper | 26 | 4.7 | 0.81 | ABCDEFGHIJKLMNOPQRSTUVWXYZ |
| :alphanum | 62 | 5.95 | 0.97 | ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 |
| :alphanum_lower | 36 | 5.17 | 0.65 | abcdefghijklmnopqrstuvwxyz0123456789 |
| :alphanum_upper | 36 | 5.17 | 0.65 | ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 |
| :base16 | 16 | 4.0 | 1.0 | 0123456789ABCDEF |
| :base32 | 32 | 5.0 | 1.0 | ABCDEFGHIJKLMNOPQRSTUVWXYZ234567 |
| :base32_hex | 32 | 5.0 | 1.0 | 0123456789abcdefghijklmnopqrstuv |
| :base32_hex_upper | 32 | 5.0 | 1.0 | 0123456789ABCDEFGHIJKLMNOPQRSTUV |
| :base36 | 36 | 5.17 | 0.65 | 0123456789abcdefghijklmnopqrstuvwxyz |
| :base36_upper | 36 | 5.17 | 0.65 | 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ |
| :base45 | 45 | 5.49 | 0.78 | 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./: |
| :base58 | 58 | 5.86 | 0.91 | 123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz |
| :base62 | 62 | 5.95 | 0.97 | ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 |
| :base85 | 85 | 6.41 | 0.77 | !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu |
| :bech32 | 32 | 5.0 | 1.0 | 023456789acdefghjklmnpqrstuvwxyz |
| :boolean | 2 | 1.0 | 1.0 | TF |
| :crockford32 | 32 | 5.0 | 1.0 | 0123456789ABCDEFGHJKMNPQRSTVWXYZ |
| :decimal | 10 | 3.32 | 0.62 | 0123456789 |
| :dna | 4 | 2.0 | 1.0 | ACGT |
| :geohash | 32 | 5.0 | 1.0 | 0123456789bcdefghjkmnpqrstuvwxyz |
| :hex | 16 | 4.0 | 1.0 | 0123456789abcdef |
| :hex_upper | 16 | 4.0 | 1.0 | 0123456789ABCDEF |
| :safe_ascii | 90 | 6.49 | 0.8 | !#$%&()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_abcdefghijklmnopqrstuvwxyz{|}~ |
| :safe32 | 32 | 5.0 | 1.0 | 2346789bdfghjmnpqrtBDFGHJLMNPQRT |
| :safe64 | 64 | 6.0 | 1.0 | ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_ |
| :symbol | 28 | 4.81 | 0.89 | !#$%&()*+,-./:;<=>?@[]^_{|}~ |
| :url_safe | 66 | 6.04 | 0.63 | ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~ |
| :word_safe32 | 32 | 5.0 | 1.0 | 23456789CFGHJMPQRVWXcfghjmpqrvwx |
| :z_base32 | 32 | 5.0 | 1.0 | ybndrfg8ejkmcpqxot1uwisza345h769 |
Note: The Metrics section explains ERE and ETE.
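The per-character figures in the third column of the table match the base-2 logarithm of the character count, and the ID length follows by dividing the requested bits by that figure and rounding up. The sketch below is a hypothetical helper, not part of Puid. The 128 requested bits are an assumption inferred from the example IDs in the Characters section, not a documented default.

```elixir
defmodule CharsetSketch do
  # Hypothetical helper, not part of Puid.
  # Entropy bits carried by one character drawn uniformly from `chars`.
  def bits_per_char(chars) do
    chars |> String.graphemes() |> length() |> :math.log2()
  end

  # Characters needed to carry at least `bits` of entropy.
  def id_length(chars, bits) do
    ceil(bits / bits_per_char(chars))
  end
end

IO.inspect(Float.round(CharsetSketch.bits_per_char("0123456789"), 2))
# 3.32, as listed for :decimal

IO.inspect(CharsetSketch.id_length("0123456789abcdef", 128))
# 32, the length of the HexId example

IO.inspect(CharsetSketch.id_length("dingosky", 128))
# 43, the length of the DingoskyId example
```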
Description of non-obvious character sets
| Name | Description |
| :---------------- | :--------------------------------------------------------- |
| :base16 | https://datatracker.ietf.org/doc/html/rfc4648#section-8 |
| :base32 | https://datatracker.ietf.org/doc/html/rfc4648#section-6 |
| :base32_hex | Lowercase of :base32_hex_upper |
| :base32_hex_upper | https://datatracker.ietf.org/doc/html/rfc4648#section-7 |
| :base36 | Used by many URL shorteners |
| :base58 | Bitcoin base58 alphabet (excludes 0, O, I, l) |
| :base85 | Used in Adobe PostScript and PDF |
| :bech32 | Bitcoin SegWit address encoding |
| :dna | DNA nucleotide bases (Adenine, Cytosine, Guanine, Thymine) |
| :ascii85 | Same as :base85 |
| :ascii90 | Same as :safe_ascii |
| :crockford32 | https://www.crockford.com/base32.html |
| :geohash | Used for encoding geographic coordinates |
| :safe_ascii | Printable ASCII that does not require escape in String |
| :safe32 | Alpha and nu
