Hacking SaaS #34 - A little data, a few patterns
Some AI, some vectors, some file formats, some design patterns and a new social network.
My mental space is still 80% occupied by AI. It feels inescapable - at every dinner, meeting or conference, AI seems to dominate the conversation. In this edition, I share some exciting news about language models and pgvector. But! If you need a break from AI, just scroll down a bit to find links to great blogs about integration patterns, file formats and Bluesky - the new social network.
AI News: Same results at 25% of the cost
If you are not familiar with quantization, you should be. AI models are all about large vectors and matrices. Typically, the numbers in these matrices are 32-bit floating point values. Quantization is the process of replacing them with lower-precision numbers, such as 16-bit or 8-bit floating point (or even 8-bit or 4-bit integers). If the 16-bit or 8-bit model performs (in AI terms) as well as the 32-bit version, you just saved 50% or 75% of the GPU, memory, network and storage requirements. This is a huge deal, but it obviously depends on whether the quantized models deliver.
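To make the arithmetic concrete, here is a toy sketch in plain Python. It is not how real inference stacks quantize a model, and the "weights" are made-up numbers. It stores the same eight values as 32-bit floats, as 16-bit floats, and as 8-bit integers with one shared scale (a simple symmetric INT8 scheme I picked for illustration, since FP8 is not available in the standard library), then checks how much precision each round trip loses.

```python
import struct

def packed_size(values, fmt):
    """Bytes needed to store the values using a struct format code."""
    return len(struct.pack(f"<{len(values)}{fmt}", *values))

def to_float16(values):
    """Round-trip the values through IEEE 754 half precision."""
    packed = struct.pack(f"<{len(values)}e", *values)
    return list(struct.unpack(f"<{len(values)}e", packed))

def quantize_int8(values):
    """Symmetric INT8 quantization: one shared scale, integers in [-127, 127]."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # avoid a zero scale
    return [round(v / scale) for v in values], scale

def dequantize_int8(quantized, scale):
    """Map the stored integers back to approximate floats."""
    return [q * scale for q in quantized]

def max_error(original, restored):
    return max(abs(a - b) for a, b in zip(original, restored))

# Hypothetical weights, chosen only for illustration.
weights = [0.12, -0.5, 0.33, 0.9, -0.07, 0.0, 0.61, -0.98]

size32 = packed_size(weights, "f")   # 32-bit floats
size16 = packed_size(weights, "e")   # 16-bit floats
q, scale = quantize_int8(weights)
size8 = packed_size(q, "b")          # 8-bit signed integers

print(f"bytes: fp32={size32} fp16={size16} int8={size8}")
print(f"max error fp16: {max_error(weights, to_float16(weights)):.6f}")
print(f"max error int8: {max_error(weights, dequantize_int8(q, scale)):.6f}")
```

The storage shrinks by exactly 50% and 75%. Whether the small rounding errors matter for model quality is the empirical question.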
Now there’s a paper that shows:
We present a comprehensive empirical study of quantized accuracy, evaluating popular quantization formats (FP8, INT8, INT4) across academic benchmarks and real-world tasks, on the entire Llama-3.1 model family…. Our investigation, encompassing over 500,000 individual evaluations, yields several key findings: (1) FP8 weight and activation quantization (W8A8-FP) is lossless across all model scales,
FP8 quantization is lossless! That essentially means you can replace all your Llamas with much smaller Llamas that use 25% of the resources of a 32-bit model and yield the same results.
If that’s not huge news, I don’t know what is.
OSS Language Models
It is common to refer to language models as either proprietary (ChatGPT or Claude, for example) or OSS (Meta’s Llama, for example). But what makes an OSS model actually “open source”? Models are neural networks - there is source code involved, but also weights and training data. The OSI (Open Source Initiative) has been discussing this exact topic and has some definitions for us, but it looks like the debate is very far from over.
You can find a good writeup by Joe Brockmeier on LWN and another, more opinionated, take by Stephen O’Grady at RedMonk. Both are great reads and link to many interesting resources on the topic.
pgvector 0.8.0 is out
Postgres’ vector extension, pgvector, released version 0.8.0 with much-awaited improvements and optimizations. I blogged about the new features (including a quick demo). If you are curious about pgvector in general, I also recorded an introduction video and blogged about some of the misconceptions around pgvector.
Integration Patterns
Colt McNealy, well known in the SaaS community as the founder of LittleHorse, is working on a multi-part blog series about data integration patterns.
So far he has covered the Saga, Transactional Outbox and Queueing patterns. Highly recommended if you are new-ish to the world of architecture patterns, and also if you are an experienced architect looking for a friendly resource to refer to.
Data Formats
I just recently learned about the Vortex file format. This blog post explains the format and compares it to Arrow. I learned quite a bit from it - both about Arrow and about Vortex. And if you are interested in learning more about file format internals, this is a short but interesting blog about Parquet pruning as implemented in DataFusion.
Bluesky!
Last but not least, I’m becoming more active in the new social network - Bluesky. It now has a lively tech community and many interesting conversations going on at all times. Bluesky is built around an open protocol, which means everyone can hack their own ecosystem for this social network. The best way to get started is by following a starter pack or two. For example, the Cloud Native pack, the Infrastructure Engineers pack or the Distributed Systems pack. And of course, if you join, give me a ping and say “hello” - I love making new friends or finding new ways to connect with old friends.