Hacking SaaS #15 - Databases from the Future
In which we discuss current problems and future trends in database technology. Plus topics of interest on the SaaS Developer Slack.
When you start really thinking about a topic, suddenly you see it everywhere. This is true when looking for mushrooms in the forest - and also true when looking for ways databases can be more supportive of developers. Suddenly all you see are blogs by people with database problems and blogs about future data trends. So let’s look at those trends and problems - as well as a few other topics that came up in our community Slack.
Hacking SaaS, the SaaS Developer Slack, and the SaaS Developer podcast are all community projects by Nile, the database for modern SaaS applications.
Tenants and Sharding
Anyone who hangs out in our community Slack knows that one of the most common problems when designing a data layer for SaaS applications is finding an effective model for tenant isolation, and then evolving that model as usage increases and the data layer has to scale.
Ram, founder and CEO of Nile, has solved this issue multiple times, talked to 100+ companies that have the same issue, and found many related blogs on the topic - and he mapped out the journey for us (btw, he’s still collecting stories! Ping us on the community Slack if you have interesting stories to share).
You’ll find many good blogs in the show notes - from Atlassian, Cloudflare, Airtable, Notion, Slack, Sentry and more.
If you want to go even deeper into the art and science of sharing resources between tenants, Marc Brooker has an amazing blog in general and his writeup about False Sharing vs Perfect Placement does a great job bringing statistics to the tenant placement problem.
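To make the tenant placement problem a bit more concrete, here is a minimal Python sketch of one common starting point: hash each tenant onto a shared shard by default, and keep a small override table for tenants that outgrow the shared pool and get moved. The class name, shard names and tenant IDs are made up for illustration; this is not how Nile or any of the companies above implement it.

```python
import hashlib


class TenantRouter:
    """Hypothetical router: hash placement by default, explicit overrides for moved tenants."""

    def __init__(self, shared_shards):
        # Shards that tenants are spread across by hashing.
        self.shared_shards = list(shared_shards)
        # tenant_id -> shard name, for tenants that were placed by hand.
        self.overrides = {}

    def shard_for(self, tenant_id):
        # An explicit placement always wins over the hash.
        if tenant_id in self.overrides:
            return self.overrides[tenant_id]
        # A stable hash keeps the same tenant on the same shard across restarts.
        digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shared_shards)
        return self.shared_shards[index]

    def move_tenant(self, tenant_id, shard):
        # Record the new home; copying the tenant's data is out of scope here.
        self.overrides[tenant_id] = shard


router = TenantRouter(["shard-a", "shard-b", "shard-c"])
before = router.shard_for("acme")
router.move_tenant("acme", "shard-dedicated-1")
after = router.shard_for("acme")
```

The override table is the part that tends to grow over time: it is what lets you evolve from "everyone shares everything" to giving the largest tenants their own resources without re-hashing everybody else.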
Future of Databases
At the end of January, I gave a keynote at Data Days Texas where I ranted about all the features I’ve had to build again and again - and asked why databases can’t just do more for application developers. Word spread, people asked to learn more, and I ended up blogging the rant - Things Databases Don’t Do - But Should!
While we shouldn't expect a single DB to be a perfect fit for any situation, we can and should expect databases to do a lot more for us.
Technology keeps improving because users always ask for faster and better products. And when it comes to databases, we are the users. If we don't ask for more - we won't get it. In an age where GitHub Copilot can write our boilerplate code and ChatGPT can write our CV, we can't just accept that soft delete, version control, tenant isolation or reasonable data APIs are impossible.
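For anyone who hasn't had to build it yet, here is roughly what hand-rolled soft delete looks like, sketched in Python with the standard library's sqlite3 module. The table, view and function names are invented for this example; the point is only to show the kind of boilerplate (an extra column, a filtered view, custom delete and restore helpers) that applications end up re-implementing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The extra deleted_at column is the whole trick: NULL means the row is live.
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, deleted_at TEXT)"
)
# Every read path has to remember to filter, so we hide that behind a view.
conn.execute(
    "CREATE VIEW active_customers AS "
    "SELECT id, name FROM customers WHERE deleted_at IS NULL"
)


def soft_delete(conn, customer_id):
    # Mark the row as deleted instead of removing it.
    conn.execute(
        "UPDATE customers SET deleted_at = datetime('now') "
        "WHERE id = ? AND deleted_at IS NULL",
        (customer_id,),
    )


def restore(conn, customer_id):
    # Undo a soft delete by clearing the marker.
    conn.execute("UPDATE customers SET deleted_at = NULL WHERE id = ?", (customer_id,))


conn.executemany("INSERT INTO customers (name) VALUES (?)", [("Acme",), ("Globex",)])
soft_delete(conn, 1)
active = [row[0] for row in conn.execute("SELECT name FROM active_customers ORDER BY id")]
total = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

After the soft delete, the view only returns the remaining customer while the base table still holds both rows, which is exactly the behavior every team ends up writing (and testing) on its own.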
Meanwhile, my friend Eric Sammer, founder and CEO of Decodable, wrote his own thoughts on the future of data. According to Eric - The Future of Data is Real Time:
Life is real-time by default. Continuous processing is more natural than discretized chunks of work. The impacts of the batch approach are self-evident: reduced throughput and therefore increased latency driving apps and services to use stale data. Correctness also suffers with records being lost or delivered multiple times. Impact assessment is complex, determining who is impacted by a failure and estimating time to recover. Due to the limitations of batch, no amount of money thrown at the problem can make it go away.
To use a common analogy, the horse can’t go any faster and it’s time to buy that car.
Lee Robinson from Vercel wrote about Databases for Serverless and the Edge, and discussed a point that I also mentioned but didn’t dive as deeply into - a big problem with traditional databases is that their connection handling and protocols have not yet evolved to meet modern application demands:
A new programming model is required for workloads to be compatible with serverless compute and modern runtimes.
These solutions must be:
Connectionless: Developers don't want to think about manual connection management. Traditional database protocols are stateful, whereas HTTP is mostly stateless, making it easier to use with scale to zero compute. Exposed through an SDK or HTTP API, “connectionless” solutions provide an abstraction over connection pooling.
Web native: Browser data-fetching APIs (e.g. Web fetch) and protocols are eating the world. New databases use HTTP APIs or WebSockets, rather than opening direct connections to the database. This makes them compatible with all forms of compute (including the lighter runtime used in edge compute).
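To illustrate what "connectionless" means in practice, here is a small Python sketch of a query expressed as a single stateless HTTP request. The endpoint URL, the JSON payload shape and the token are all assumptions made up for this example, not the API of any particular database; the request is only built, not sent.

```python
import json
import urllib.request

# Hypothetical endpoint: real HTTP-based databases each define their own URL and payload.
API_URL = "https://db.example.com/v1/query"


def build_query_request(sql, params, token):
    """Package one query as a self-contained HTTP POST (no session, no pool)."""
    body = json.dumps({"sql": sql, "params": params}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Each request carries its own credentials, so no connection state is needed.
            "Authorization": f"Bearer {token}",
        },
    )


request = build_query_request(
    "SELECT name FROM customers WHERE tenant_id = ?",
    ["acme"],
    "example-token",
)
# Against a real service you would send it with urllib.request.urlopen(request).
```

Because everything the server needs travels with the request, a function that scales to zero can run this without opening or pooling a database connection first.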
And finally, while everyone is talking about separating compute and storage, our friends at Rockset went a step further to separate compute from compute.
In the Community Slack
We discussed Google’s new Service Weaver:
And Felix GV, the creator of Venice, quipped that we now have “Modulith” in addition to our existing “Minilith” and “Macroservice”. I wanted to learn more - and the next day Felix provided us with the definitive guide to the microservices architecture zoo.
Shomik Ghosh, a VC by day and fan of developer tooling in his spare time, wrote about the rise of Platform Engineering.
And last but not least, Bytewax joined the SaaS Developer podcast to discuss how they created an amazing developer experience in their product - with next-level developer guides and a very creative community.