How I Learned to Stop Worrying and Love the Cache
Why our 97% cost reduction came from forgetting everything we knew about modern architecture
Shipping semantic search in Esy was a journey through the world of embeddings, vector databases, and the subtle art of measuring similarity between ideas. Semantic search is about understanding meani
How do language models learn to use tools? In this article, I analyze the surprising ways GPT-4 adapts to new tasks with minimal instruction. Emergent behaviors in LLMs often arise from scale and dat
My first experiments with Runway Gen-2 for real-time video generation were a mix of excitement and chaos. Here's what I learned. Video generation is still in its infancy—expect weird artifacts and lo
When building Esy, I faced a choice: chase the latest tech trends or pick tools that just work. I chose the latter, and here's why. "Boring" technology is often the most reliable and cost-effective.
Fine-tuning Llama 2 on my own writing was both exciting and a little unsettling. Here's what happened. Personalized models can mimic your style—sometimes too well. I gathered three years of blog pos
Chain-of-thought (CoT) prompting has become a go-to technique for improving LLM reasoning. But how well does it really work? CoT can boost accuracy, but not all models benefit equally. I tested CoT
My first experiments with MusicGen were a wild ride. Some tracks were beautiful, others... not so much. AI-generated music is unpredictable—embrace the weirdness! I used MusicGen to create ambient s