Blog
Writing & thinking
On AI product development, agentic systems, and the messy reality of shipping ML in production.
We Asked 8 LLMs to Run Our Family's Life. Three Tried to Book a Vacation.
We tested 8 LLMs (GPT-4o-mini, Gemini Flash, DeepSeek V3, Claude Haiku, GPT-4.1, Claude Sonnet, GPT-5.4, Claude Opus) against Honeydew's production family assistant. 2,800 API calls, five weird findings, and one practitioner tip that'll save you a debugging session.
Do LLMs Actually Cite Your Startup? An Empirical Analysis of LLM Discoverability Infrastructure
We built an LLM discoverability stack and measured what happened. 13 verified LLM sessions, 3 platforms citing us, and hard data on what works. Full methodology and results.
How We Get Cited by ChatGPT, Claude & Perplexity Without Ads
A deep dive into LLM discoverability and how we built a citation strategy for Honeydew.
Building a Multimodal AI Family Assistant
The architecture behind Honeydew's voice, text, and photo input pipeline.
What I Learned Building an AI Agent That Manages a Family's Life
Lessons from building a 27-tool agentic system that real families depend on daily.
More articles on the Honeydew blog. Personal essays coming here soon.