Generate UID like YouTube.
The library generates a unique identifier from a 64-character alphabet with a default length of 10 characters (you can change the length of the identifier). This gives us 64^10 = 2^60 = 1 152 921 504 606 846 976 combinations.
To put this number in perspective: generating one ID every microsecond, it would take 36 559 years to go through all possible identifiers with a length of 10 characters.
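The arithmetic above can be checked with a few lines of PHP. This is only a sketch; it assumes a 64-bit PHP build and 365-day years, and the variable names are illustrative.

```php
<?php
// Number of distinct IDs for a 64-character alphabet and length 10.
$combinations = 64 ** 10; // 2^60, still fits in a 64-bit PHP integer

// Time needed to go through all of them at one ID per microsecond,
// assuming 365-day years.
$microsecondsPerYear = 1000000 * 60 * 60 * 24 * 365;
$years = (int) round($combinations / $microsecondsPerYear);

echo $combinations, PHP_EOL; // 1152921504606846976
echo $years, PHP_EOL;        // 36559
```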
UUID works on the same principle, but its main drawback is that it's too long. It is not convenient to use it as a public identifier, for example in a URL. To get at least as many combinations as a UUID (2^128) we need 22 characters, since 64^21 = 2^126 falls just short and 64^22 = 2^132. That is still much shorter than the 36 characters of a UUID. And if we take an identifier of the same length as the UUID, we get 64^36 = 2^216 combinations against 2^128 for the UUID.
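The length needed to cover a given number of bits can be estimated with a small helper. `requiredLength()` is a hypothetical name used for illustration and is not part of the library.

```php
<?php
// Smallest ID length whose keyspace covers the given number of bits.
// Hypothetical helper, not part of the library.
function requiredLength(int $bits, int $charsetSize): int
{
    // Each character contributes log2(charsetSize) bits.
    return (int) ceil($bits / log($charsetSize, 2));
}

echo requiredLength(128, 64), PHP_EOL; // 22
```

Note that 21 characters of a 64-character alphabet give 2^126 combinations, so the full 2^128 range of a UUID needs one more character.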
The most important advantage of this approach is that you control the number of combinations yourself by changing the length of the string and the character set. This lets you optimize the length of the identifier for your business requirements.
The probability of collision of identifiers can be calculated by the formula:
p(n) ≈ 1 - exp(N * (ln(N - 1) - ln(N - n)) + n * (ln(N - n) - ln(N) - 1) - (ln(N - 1) - ln(N) - 1))
Where
- N - number of possible options;
- n - number of generated keys.
Take an identifier with a length of 11 characters, like YouTube, which gives us N = 64^11 = 2^66, and we get:
- p(2^25) ≈ 7.62 × 10^-6
- p(2^30) ≈ 0.0077
- p(2^36) ≈ 0.9999
That is, by generating 2^36 = 68 719 476 736 identifiers you are almost guaranteed to get a collision.
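These figures can be approximated without an external calculator. The sketch below does not use the formula above. It swaps in the standard birthday-bound approximation p(n) ≈ 1 - exp(-n(n-1) / (2N)), which gives very similar values for these inputs. The function name is illustrative.

```php
<?php
// Birthday-bound approximation of the collision probability.
// Note: this is a simpler formula than the one quoted in the text.
function collisionProbability(float $n, float $N): float
{
    // expm1() keeps precision when the exponent is very close to zero.
    return -expm1(-$n * ($n - 1) / (2 * $N));
}

$N = 2 ** 66; // 64^11; overflows a PHP integer, so PHP stores it as a float

printf("%.3e\n", collisionProbability(2 ** 25, $N)); // about 7.6e-6
printf("%.4f\n", collisionProbability(2 ** 30, $N)); // about 0.0078
printf("%.4f\n", collisionProbability(2 ** 36, $N)); // above 0.9999
```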
For calculations with large numbers, I recommend this online calculator.
Pretty simple with Composer, run:
composer require gpslab/base64uid
use GpsLab\Component\Base64UID\Base64UID;
$uid = Base64UID::generate(); // iKtwBpOH2E
With a length of 6 characters (64^6 = 68 719 476 736 combinations).
$uid = Base64UID::generate(6); // nWzfgA
A floating-length identifier gives more unique identifiers (64^8 + 64^9 + 64^10 = 1 171 217 378 093 039 616 combinations).
$uid = Base64UID::generate(random_int(8, 10));
You can customize charset.
$charset = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ+/';
$uid = Base64UID::generate(11, $charset);
$charset = '0123456789abcdef';
$uid = Base64UID::generate(11, $charset);
Generate a fixed-length UID from random characters of a charset.
$generator = new RandomCharGenerator();
$uid = $generator->generate(); // iKtwBpOH2E
Limit the length of the UID and the charset.
$charset = '0123456789abcdef';
$generator = new RandomCharGenerator(6, $charset);
$uid = $generator->generate(); // fa6c7d
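To illustrate the idea behind picking characters from a charset, here is a minimal standalone sketch. It is not the library's implementation; `sketchRandomChars()` is a hypothetical helper that assumes each character is drawn independently with `random_int()`.

```php
<?php
// Hypothetical helper, not the library's implementation.
function sketchRandomChars(int $length, string $charset): string
{
    $max = strlen($charset) - 1;
    $uid = '';
    for ($i = 0; $i < $length; $i++) {
        // random_int() draws from a cryptographically secure source.
        $uid .= $charset[random_int(0, $max)];
    }

    return $uid;
}

$charset = '0123456789abcdef';
$uid = sketchRandomChars(6, $charset); // six hexadecimal characters
```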
Generate random bytes and encode it in Base64.
$generator = new RandomBytesGenerator();
$uid = $generator->generate(); // YCfGKBxd9k4
$generator = new RandomBytesGenerator(5);
$uid = $generator->generate(); // Mm7dpkM
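The lengths of the examples above (11 and 7 characters) match what Base64 produces for 8 and 5 random bytes once the padding is removed. The sketch below shows that relationship. It assumes a URL-safe alphabet and stripped `=` padding, which may differ from what the library actually does, and `sketchRandomBytesUid()` is a hypothetical name.

```php
<?php
// Hypothetical helper, not the library's implementation.
function sketchRandomBytesUid(int $bytes = 8): string
{
    $encoded = base64_encode(random_bytes($bytes));

    // Assumption: URL-safe alphabet ('-' and '_') and no '=' padding.
    return rtrim(strtr($encoded, '+/', '-_'), '=');
}

$long = sketchRandomBytesUid();   // 11 characters for 8 bytes
$short = sketchRandomBytesUid(5); // 7 characters for 5 bytes
```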
Generate a bitmap of random bits and encode it in Base64. The bitmap is 64 bits long, so it requires a 64-bit processor architecture.
$binary_generator = new RandomBinaryGenerator(32);
$encoder = new HexToBase64BitmapEncoder();
$generator = new EncodeBitmapGenerator($binary_generator, $encoder);
$uid = $generator->generate(); // 7MWx2BuWJUw
Generate a bitmap containing the current time in milliseconds and encode it in Base64. The bitmap is 64 bits long, so it requires a 64-bit processor architecture.
$binary_generator = new TimeBinaryGenerator();
$encoder = new HexToBase64BitmapEncoder();
$generator = new EncodeBitmapGenerator($binary_generator, $encoder);
$uid = $generator->generate(); // koLfRhzAoI0
$uid = $generator->generate(); // zALfRhzAovg
$uid = $generator->generate(); // 18LfRhzAoQw
The generated bitmap has the following structure:
{first bit}{random prefix}{current time}{random suffix}
- first bit - bitmap limiter for fixed size of bitmap;
- prefix - random bits used in the prefix of the bitmap. The number of generated bits is configured with $prefix_length;
- time - bits of the current time in milliseconds;
- suffix - random bits used in the suffix of the bitmap. The length is calculated as 64 - 1 - $prefix_length - $time_length.
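The layout can be illustrated with a string of '0' and '1' characters. This is a sketch of the structure only, not the library's code. `randomBits()` and `sketchTimeBitmap()` are hypothetical helpers, and the time value is the 2020-01-01 timestamp that also appears in the offset table of this document.

```php
<?php
// Hypothetical helpers, not the library's implementation.
function randomBits(int $count): string
{
    $bits = '';
    for ($i = 0; $i < $count; $i++) {
        $bits .= (string) random_int(0, 1);
    }

    return $bits;
}

function sketchTimeBitmap(int $time, int $prefixLength, int $timeLength): string
{
    $suffixLength = 64 - 1 - $prefixLength - $timeLength;
    $timeBits = str_pad(decbin($time), $timeLength, '0', STR_PAD_LEFT);

    if ($suffixLength < 0 || strlen($timeBits) > $timeLength) {
        throw new InvalidArgumentException('Time does not fit in the bitmap.');
    }

    // {first bit}{random prefix}{current time}{random suffix}
    return '1' . randomBits($prefixLength) . $timeBits . randomBits($suffixLength);
}

$bitmap = sketchTimeBitmap(1577836800000, 9, 41); // suffix gets 13 bits
```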
Responsibly select the number of bits allocated to store the current time. The $time_length defines the limit of the stored date:
Bits limit | Maximum available bitmap | Unix timestamp (ms) | Date |
---|---|---|---|
40-bits | 1111111111111111111111111111111111111111 | 1099511627775 | 2004-11-03 19:53:48 (UTC) |
41-bits | 11111111111111111111111111111111111111111 | 2199023255551 | 2039-09-07 15:47:36 (UTC) |
42-bits | 111111111111111111111111111111111111111111 | 4398046511103 | 2109-05-15 07:35:11 (UTC) |
43-bits | 1111111111111111111111111111111111111111111 | 8796093022207 | 2248-09-26 15:10:22 (UTC) |
44-bits | 11111111111111111111111111111111111111111111 | 17592186044415 | 2527-06-23 06:20:44 (UTC) |
45-bits | 111111111111111111111111111111111111111111111 | 35184372088831 | 3084-12-12 12:41:29 (UTC) |
To reduce the size of the saved time, you can use a $time_offset that allows you to move the starting point of time:
Offset (milliseconds) | Offset date | Maximum available date for 41-bits |
---|---|---|
0 | 1970-01-01 00:00:00 (UTC) | 2039-09-07 15:47:36 (UTC) |
1577836800000 | 2020-01-01 00:00:00 (UTC) | 2089-09-06 15:47:36 (UTC) |
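The dates in both tables can be reproduced by treating the values as millisecond timestamps, which is what the listed dates imply. `maxStorableDate()` below is a hypothetical helper written for this check, not a library function.

```php
<?php
// Latest date that fits in $bits bits of time, counted from $offsetMs.
// Hypothetical helper; values are treated as milliseconds.
function maxStorableDate(int $bits, int $offsetMs = 0): string
{
    $maxMs = (1 << $bits) - 1 + $offsetMs;

    // gmdate() expects seconds and always formats in UTC.
    return gmdate('Y-m-d', intdiv($maxMs, 1000));
}

echo maxStorableDate(40), PHP_EOL;                // 2004-11-03
echo maxStorableDate(41), PHP_EOL;                // 2039-09-07
echo maxStorableDate(41, 1577836800000), PHP_EOL; // 2089-09-06
```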
FloatingTimeGenerator is similar to the previous generator, TimeBinaryGenerator, but the position of the current-time bits is floating. That is, the lengths of the prefix and suffix are randomly generated each time. Identifiers generated at the same moment look less alike, but the likelihood of a collision increases.
$binary_generator = new FloatingTimeGenerator();
$encoder = new HexToBase64BitmapEncoder();
$generator = new EncodeBitmapGenerator($binary_generator, $encoder);
$uid = $generator->generate(); // 5mqhb6MPH7g
$uid = $generator->generate(); // kFvow8joJys
$uid = $generator->generate(); // 8QRC30YeP3E
Snowflake-id uses the time in milliseconds and a generator ID. This allows you to customize the generator to your environment and reduce the likelihood of a collision, but the identifiers are very similar to each other and reveal the scheme of your internal infrastructure. Snowflake-id is used by Twitter, Instagram, etc.
$generator_id = 0; // value 0-1023
$binary_generator = new SnowflakeGenerator($generator_id);
$encoder = new HexToBase64BitmapEncoder();
$generator = new EncodeBitmapGenerator($binary_generator, $encoder);
$uid = $generator->generate(); // gBFKQeuAAAA
$uid = $generator->generate(); // gBFKQeuAAAE
$uid = $generator->generate(); // gBFKQevAAAA
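To show why consecutive Snowflake identifiers look so alike, here is a sketch of the bit packing. It assumes the classic Twitter layout (41 bits of time, 10 bits of generator ID, 12 bits of sequence), which may not match the library's exact layout. `sketchSnowflake()` is a hypothetical helper.

```php
<?php
// Assumed layout: {41 bits time}{10 bits generator id}{12 bits sequence}.
// Hypothetical helper, not the library's implementation.
function sketchSnowflake(int $time, int $generatorId, int $sequence): int
{
    if ($generatorId < 0 || $generatorId > 1023) {
        throw new InvalidArgumentException('Generator id must be 0-1023.');
    }

    return ($time << 22) | ($generatorId << 12) | ($sequence & 0xFFF);
}

// Two IDs from the same generator in the same time slot
// differ only in the low sequence bits.
$first = sketchSnowflake(1000, 5, 0);
$second = sketchSnowflake(1000, 5, 1);
```

Because the time and generator ID occupy the high bits, anyone who sees a few identifiers can read both values back out, which is the infrastructure leak mentioned above.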
How to use it in your domain.
For example, create an ArticleId ValueObject:
class ArticleId
{
    private $id;

    public function __construct(string $id)
    {
        $this->id = $id;
    }

    public function id()
    {
        return $this->id;
    }
}
Repository interface for Article:
interface ArticleRepository
{
    public function nextId();

    // more methods ...
}
Concrete repository for Article:
use GpsLab\Component\Base64UID\Base64UID;

class ConcreteArticleRepository implements ArticleRepository
{
    public function nextId()
    {
        return new ArticleId(Base64UID::generate());
    }

    // more methods ...
}
Now we can create a new entity with ArticleId:
$article = new Article(
    $repository->nextId(),
    // more article parameters ...
);
This bundle is under the MIT license. See the complete license in the file: LICENSE