All about UUIDs
In which community members share their favorite UUID-related blogs and videos. And other topics of interest for SaaS Developers.
No matter what you are building, you are going to need some unique identifiers. Look at any SaaS, and they are everywhere: user ID, tenant ID, blog post ID, transaction ID, client ID… everything in the world has to have a unique identifier.
Where do these identifiers come from?
Either from sequence generators (for example, database identity columns) or from various types of UUIDs: 128-bit (16-byte) values, usually partly or fully random, that are virtually guaranteed to be unique. Each method has its pros and cons, which can get somewhat nuanced.
Since changing identifiers for existing data is challenging, picking the method of ID generation is one of the earliest “one-way doors” when developing new services.
What SaaS Developers should know about UUID
Not surprisingly, this topic pops up in the SaaS Developer Slack regularly and leads to lively conversations. Here are some of the community’s favorite resources on the topic of UUIDs.
Andrew Atkinson recommended:
Haki Benita’s “Unconventional ways to index UUIDs in PostgreSQL”
Lucas Stephens recommended:
Segment’s blog post on the history of UUIDs. Segment also has an OSS library for generating sortable UUIDs (although today, you should just use UUIDv7, which is now standardized in RFC 9562 and implemented in most languages).
The blog post explains the importance of the library: sorting and order can be desirable traits in IDs for application and optimization reasons. Historically, this was a challenge with UUIDs. Thankfully, there are now many variations of sortable UUIDs.
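To make "sortable" concrete, here is a minimal Python sketch of a UUIDv7-style value: a 48-bit Unix timestamp in milliseconds in the most significant bits, followed by random bits, with the version and variant fields set. The function name uuid7_sketch is hypothetical and this is an illustration of the layout, not a production generator.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a UUIDv7-style value (hypothetical helper, for illustration only)."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000

    # 48-bit millisecond timestamp goes in the most significant bits.
    value = (unix_ms & ((1 << 48) - 1)) << 80
    # Fill the remaining 80 bits with random data.
    value |= int.from_bytes(os.urandom(10), "big")

    # Overwrite the 4-bit version field with 7.
    value &= ~(0xF << 76)
    value |= 0x7 << 76
    # Overwrite the 2-bit variant field with binary 10.
    value &= ~(0x3 << 62)
    value |= 0x2 << 62

    return uuid.UUID(int=value)


earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
# IDs created in a later millisecond sort after earlier ones,
# both as 128-bit values and in their text form.
print(earlier < later, str(earlier) < str(later))
```

Two IDs generated within the same millisecond are ordered only by their random bits in this sketch, which is why such IDs are usually described as roughly, not strictly, time-ordered.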
Lalit Pagaria recommended:
Instagram’s blog on generating unique identifiers in a sharded Postgres system. It has one of the clearest comparisons of the pros/cons of various solutions. Instagram’s post included a link to a blog about Twitter’s Snowflake - a mostly monotonic ID generation service.
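For readers who have not seen this family of IDs, here is a rough Python sketch of a Snowflake-style generator. It assumes the commonly described 64-bit layout (41 bits of milliseconds since a custom epoch, 10 bits of worker ID, 12 bits of per-millisecond sequence). The class name, the epoch value, and the injectable clock are all assumptions made for illustration, not the actual Twitter or Instagram implementation.

```python
import threading
import time

EPOCH_MS = 1_600_000_000_000  # arbitrary custom epoch, chosen for this sketch
WORKER_BITS = 10
SEQUENCE_BITS = 12


class SnowflakeSketch:
    """Hypothetical Snowflake-style ID generator (illustration only)."""

    def __init__(self, worker_id, clock=None):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id does not fit in 10 bits")
        self.worker_id = worker_id
        # The clock returns the current time in milliseconds; injectable for tests.
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                # Same millisecond: bump the sequence counter.
                self._sequence = (self._sequence + 1) & ((1 << SEQUENCE_BITS) - 1)
                if self._sequence == 0:
                    # Counter exhausted: wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )


gen = SnowflakeSketch(worker_id=5)
first, second = gen.next_id(), gen.next_id()
print(first < second)  # IDs from one worker always increase
```

IDs from a single worker are strictly increasing, but IDs from different workers created in the same millisecond are ordered by worker ID rather than by creation time, which is where the "mostly" comes from.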
Specifically for Postgres users:
An overview of the UUID standard and its versions and how to generate different types of UUID in Postgres.
A performance concern about UUIDs that applies to MySQL but not Postgres. And you may also want to read a related discussion that popped up when I posted this on Twitter the other day.
My highlights from the Twitter discussion:
Many, many, many folks mentioned that UUIDv7 and ULID are both better choices than the common v4, since they are semi-monotonic and sortable.
A lot of folks shared a great blog by Vlad Mihalcea, where he clearly explains some of the issues with UUIDs and recommends TSIDs instead. TSIDs are not only sortable, they are also half the size (64 bits instead of 128), so they are more efficient in memory.
A repeated misconception seemed to be that UUIDs are basically very large strings. This is fake news. On almost every system, UUIDs are stored as 16 bytes; the 36-character form is only their text representation.
A little-known fact is that modern CPUs and modern compilers can compare UUIDs in a single operation, so sorting and indexing are as efficient as with numbers. For the extra nerdy, you may want to check the conversation where Jeremy Schneider and I exchange assembly code that proves this.
Stephen Garland clarified that UUIDs still have performance implications when inserting and shared links to his benchmark script, data generator, and results.
Richard Banffy explained one under-rated benefit of UUID:
Since a UUID will probably be needed anyway (because one doesn’t simply expose a sequential PK), it makes sense to not have a sequential PK in the first place.
While not as critical as “thou shalt never save plain-text passwords to a table” it could result in involuntarily disclosing data such as insertion order of inclusion or gaps with hidden records. A number of things can be inferred from sequential ids.
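Coming back to the size misconception in the highlights above, a short Python sketch using the standard uuid module shows the difference between the 16-byte value and its 36-character text form. The sample UUID is an arbitrary value chosen for illustration.

```python
import uuid

# Arbitrary sample value, used only for illustration.
u = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")

text_form = str(u)      # hex digits plus four hyphens
binary_form = u.bytes   # the raw 128-bit value

print(len(text_form))    # 36
print(len(binary_form))  # 16

# The binary form round-trips to the same UUID.
assert uuid.UUID(bytes=binary_form) == u

# Comparing UUID objects compares the underlying 128-bit integers,
# which matches a byte-wise comparison of the 16-byte form.
a, b = uuid.uuid4(), uuid.uuid4()
assert (a < b) == (a.bytes < b.bytes) == (a.int < b.int)
```

Whether a database stores the compact form depends on the column type: a native UUID or 16-byte binary column keeps it small, while a text column stores the full 36 characters.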
Postgres Performance for Rails
Andrew Atkinson took a Rails web application that was struggling with load and optimized it to handle over 9000 HTTP requests per second with an average end-to-end latency of 35ms. The optimized application handles a much higher load on a smaller RDS instance, with lower latencies.
He then shared his expertise by writing a book: "High-Performance Postgres with Rails." You can watch my video interview with him below or read the blog post he wrote following our conversation.
Andrew often hangs out and answers questions on the SaaS Developer Slack. Ping him if you want to discuss Postgres, Rails or maybe even get a discount on his book.