Creating Impossible Architecture with AI
Featured experiment

Exploring the boundaries between physical constraints and digital imagination

What happens when you ask AI to design buildings that defy physics? I spent a week generating architectural concepts that could never exist in reality, revealing fascinating biases in how models under

Zev Uhuru · March 28, 2025 · 12 min
Shipping Semantic Search
experiment

Shipping semantic search in Esy was a journey through the world of embeddings, vector databases, and the subtle art of measuring similarity between ideas. Semantic search is about understanding meani

March 26, 2025 · 8 min
Emergent Tool Use in GPT-4
research

How do language models learn to use tools? In this article, I analyze the surprising ways GPT-4 adapts to new tasks with minimal instruction. Emergent behaviors in LLMs often arise from scale and dat

March 24, 2025 · 18 min
Real-time Video Generation Tests
experiment

My first experiments with Runway Gen-2 for real-time video generation were a mix of excitement and chaos. Here's what I learned. Video generation is still in its infancy—expect weird artifacts and lo

March 22, 2025 · 6 min

Latest Posts

How I Learned to Stop Worrying and Love the Cache
experiment · April 15, 2025 · 18 min

Last Tuesday at 3:47 AM, I discovered our RAG system was making the same embedding call 47,000 times per hour. The same query. The same vector. The same response. Over and over, like an extremely expen

Why I Chose Boring Technology for Esy
build · March 20, 2025 · 10 min

When building Esy, I faced a choice: chase the latest tech trends, or pick tools that just work. I chose the latter, and here's why. "Boring" technology is often the most reliable and cost-effective.

Training a LoRA on My Writing Style
experiment · March 18, 2025 · 15 min

Fine-tuning Llama 2 on my own writing was both exciting and a little unsettling. Here's what happened. Personalized models can mimic your style—sometimes too well. I gathered three years of blog pos

Benchmarking Chain-of-Thought Prompting
research · March 16, 2025 · 22 min

Chain-of-thought (CoT) prompting has become a go-to technique for improving LLM reasoning. But how well does it really work? CoT can boost accuracy, but not all models benefit equally. I tested CoT
