Hacking SaaS #20 - Database Digest
Preview of PostgreSQL 16 and more proxies, models, struggles, and scalability solutions!
Lots of cool milestones this time! We have 20 editions of Hacking SaaS, over 2500 subscribers, and… Postgres 16 is almost here. Let’s party!
Let’s start with the breaking news - The first preview release of Postgres 16 is now available. Highlights include:
Quite a few performance improvements
Notably, COPY from files should be up to 300% faster
Logical replication from standby
Client-side load balancing
Client-side load balancing is especially interesting since Postgres currently has a lively ecosystem of proxies. There’s a Hacker News discussion of the new release in general and whether the new client load-balancing will replace the proxies or not.
And speaking of Postgres proxies, I found an interesting new one, PgCat. It is written in Rust, and in addition to load balancing and connection pooling, it also handles failover scenarios. In general, I like it when tool names include cats.
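To make the client-side load balancing mentioned above a bit more concrete, here is a minimal sketch. As far as I understand it, libpq in Postgres 16 adds a `load_balance_hosts` connection parameter, and setting it to `random` makes the client try the listed hosts in random order. The host names, database name, and user below are made up for illustration, and `connection_order` only imitates the shuffling idea; it is not libpq's actual code.

```python
import random


def build_conninfo(hosts, dbname, user, load_balance=True):
    """Build a libpq keyword/value connection string that lists several hosts.

    `hosts` is a list of (hostname, port) pairs; all names are hypothetical.
    """
    parts = [
        "host=" + ",".join(name for name, _ in hosts),
        "port=" + ",".join(str(port) for _, port in hosts),
        f"dbname={dbname}",
        f"user={user}",
    ]
    if load_balance:
        # Postgres 16 libpq parameter: try the hosts in random order.
        parts.append("load_balance_hosts=random")
    return " ".join(parts)


def connection_order(hosts, rng=random):
    """Imitate the idea: shuffle the host list, then try each host in turn."""
    order = list(hosts)
    rng.shuffle(order)
    return order


if __name__ == "__main__":
    hosts = [
        ("pg1.example.internal", 5432),
        ("pg2.example.internal", 5432),
        ("pg3.example.internal", 5433),
    ]
    print(build_conninfo(hosts, "app", "app_user"))
    print(connection_order(hosts))
```

Note that this only spreads connections across hosts. It does not pool connections or handle failover, which is a big part of what the proxies do.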
Over in the MySQL world:
The company formerly known as Facebook published a blog post describing their Raft-based distributed MySQL.
We enhanced MySQL and made it a true distributed system. Realizing that control plane operations like promotions and membership changes were the trigger of most issues, we wanted the control plane and data plane operations to be part of the same replicated log. For this, we used the well-understood consensus protocol Raft. This also meant that the source of truth of membership and leadership moved inside the server (mysqld). This was the single biggest contribution of bringing in Raft because it enabled provable correctness (safety property) across promotions and membership changes into the MySQL server.
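The key idea in the quote, putting control plane and data plane operations in the same replicated log, can be sketched in a few lines. This is a toy illustration, not Meta's implementation and not Raft itself: it skips elections, quorums, and replication entirely and only shows that replicas replaying one ordered log agree on data, membership, and leadership. The entry format and all names are my own assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReplicaState:
    members: set = field(default_factory=set)
    leader: Optional[str] = None
    data: dict = field(default_factory=dict)


def apply_entry(state, entry):
    """Apply one log entry; data and control entries share the same path."""
    kind = entry["kind"]
    if kind == "write":  # data plane
        state.data[entry["key"]] = entry["value"]
    elif kind == "add_member":  # control plane
        state.members.add(entry["node"])
    elif kind == "remove_member":  # control plane
        state.members.discard(entry["node"])
        if state.leader == entry["node"]:
            state.leader = None
    elif kind == "promote":  # control plane
        if entry["node"] not in state.members:
            raise ValueError("cannot promote a non-member")
        state.leader = entry["node"]
    else:
        raise ValueError(f"unknown entry kind: {kind}")


def replay(log):
    """Replay the whole log in order and return the resulting state."""
    state = ReplicaState()
    for entry in log:
        apply_entry(state, entry)
    return state
```

Because promotions and membership changes are ordered relative to writes, any two replicas that replay the same log prefix end up with the same view of who the leader is.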
On Data Modeling:
Hacker News is discussing when to use indexes. Shockingly, it sounds like the answer is “It depends.” Still a good discussion with a lot of interesting viewpoints and anecdotes.
Knowlo’s data modeling blog post describes how they tried to create a reasonable data model for their MVP without spending too much time on it. They started with their two MVP use cases and went from there. I found the data modeling process with DynamoDB and the help from AI quite interesting.
Knowlo’s build-in-public process is quite interesting in general. I ended up reading more posts and cheering for their young SaaS.
Ease of use vs. flexibility
When you build a platform, you need to balance two opposing requirements:
Making things easy for beginners or developers with simple requirements. Abstracting away things they shouldn’t have to think about.
Allowing advanced users to implement complex or uncommon requirements while benefiting from the rest of the platform.
It was interesting to read how these two play out in Remix, where an enthusiastic developer struggled as their requirements evolved.
In the community Slack: How to authenticate in gRPC?
Good discussion in the community Slack on the best way to authenticate in the gRPC protocol.
I was wondering if anyone could critique how we are thinking of providing authentication for a grpc-based SaaS (maybe it should be called PaaS) product. Basically, we expose an API that does useful stuff over gRPC, and we have an admin console which you can access using an SSO provider of your choice. Here's my thinking:
We'll go with mTLS because that's the only non-google-specific out-of-the-box grpc authentication mechanism: https://grpc.io/docs/guides/auth/
The client libraries require CA Cert, private key, and public key for authentication. We will distribute all three securely through the dashboard (which users login to using their SSO + MFA)
Cert rotation is pretty simple, just provide a new private key and public key which can be downloaded from the admin console.
When we want to revoke credentials (eg. if the customer has a leak), what we do is create a new CA cert + private key + private cert combo, and configure the server to accept requests signed by the new CA rather than the old one. This would be a very rare occasion.
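For readers less familiar with mTLS, here is a minimal sketch of the two sides of the proposal above. It uses Python's standard `ssl` module rather than a gRPC library, so it shows the underlying TLS setup and not gRPC's own credentials API. The function names and file-path parameters are hypothetical.

```python
import ssl


def server_context(trusted_ca_file=None, cert_file=None, key_file=None):
    """Server side of mTLS: require a client certificate signed by a trusted CA."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Reject any client that does not present a valid certificate.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if trusted_ca_file:
        # Only client certificates signed by this CA will be accepted.
        ctx.load_verify_locations(cafile=trusted_ca_file)
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def client_context(ca_file=None, cert_file=None, key_file=None):
    """Client side: the three files the dashboard would hand out."""
    # PROTOCOL_TLS_CLIENT verifies the server certificate and hostname by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # The client certificate and its private key.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

In this sketch, the revocation step from the last point would amount to building a new `server_context` with `trusted_ca_file` pointing at the new CA, so certificates signed by the old CA stop verifying.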
What do you think? Join our Slack to respond, or let us know in the comments.
Scalability and Multi-tenancy
Shayon wrote a great blog post on his approach to scalability issues and how his team at Loom applied it to their database as the company hyper-scaled.
He also joined us on YouTube to discuss database scalability, performance, and multi-tenancy.