Booking.com began experimenting with conversational recommendations and tool-driven automation before most enterprises had a name for agentic systems, and that early start is now shaping a more measured strategy. In a VentureBeat podcast, AI product development lead Pranav Pathak described a layered, modular stack that pairs small, travel-specific models for fast, low-cost inference with larger LLMs for reasoning, and adds in-house evaluations where precision matters. With this hybrid approach and selective collaboration with OpenAI, the company reports that accuracy has doubled across key retrieval, ranking, and customer-interaction tasks. It is still testing the right balance between highly specialized agents and a smaller set of more general ones.
Pathak said the goal is to move recommendations from guesswork to context-driven personalization without crossing into behavior that feels invasive, especially when long-term memory is involved. Booking.com has expanded from early intent and topic detection models into an orchestrated system that classifies queries, triggers retrieval-augmented generation, and calls APIs or specialized models. The company reports a 2x lift in topic detection and a 1.5x to 1.7x increase in human-agent bandwidth as more issues shift to self-service. It is also using free-text filtering to translate what customers ask for into tailored search filters, while keeping build-versus-buy decisions and monitoring choices flexible to avoid irreversible architectural commitments. Pathak distilled the same advice for other teams: start simple, prove product-market fit, and customize deeply only when standard APIs no longer meet the need.
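
The orchestration pattern described above (classify a query, then route it to retrieval, an API call, or a search path that derives filters from free text) can be illustrated with a minimal sketch. It is not Booking.com's implementation: the intent labels, keyword rules, filter names, and handler functions are all hypothetical stand-ins, and in a real system a small trained model would replace the keyword classifier while the handlers would call actual retrieval and booking services.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical mapping from free-text keywords to search filter flags.
FILTER_KEYWORDS = {
    "pool": "has_pool",
    "beach": "near_beach",
    "breakfast": "breakfast_included",
}


@dataclass
class Routed:
    intent: str    # label produced by the classifier
    handler: str   # name of the path the query was sent to
    response: str  # placeholder result from that path


def classify_intent(query: str) -> str:
    """Toy keyword classifier standing in for a small intent/topic model."""
    q = query.lower()
    if any(word in q for word in ("cancel", "refund")):
        return "booking_change"
    if any(word in q for word in ("policy", "allowed")):
        return "policy_question"
    return "search"


def extract_filters(query: str) -> Dict[str, bool]:
    """Turn free text into search filters (assumed keyword matching)."""
    q = query.lower()
    return {flag: True for word, flag in FILTER_KEYWORDS.items() if word in q}


def call_booking_api(query: str) -> str:
    # Placeholder for a call to an internal booking API.
    return "forwarded to booking API"


def answer_with_retrieval(query: str) -> str:
    # Placeholder for retrieval-augmented generation over help content.
    return "answered from retrieved documents"


def run_search(query: str) -> str:
    # Placeholder search that applies the extracted filters.
    return f"search with filters {sorted(extract_filters(query))}"


# Each intent maps to a (handler name, handler function) pair.
ROUTES: Dict[str, tuple[str, Callable[[str], str]]] = {
    "booking_change": ("api_call", call_booking_api),
    "policy_question": ("retrieval", answer_with_retrieval),
    "search": ("search", run_search),
}


def route(query: str) -> Routed:
    """Classify the query, then dispatch it to the matching handler."""
    intent = classify_intent(query)
    name, handler = ROUTES[intent]
    return Routed(intent=intent, handler=name, response=handler(query))


if __name__ == "__main__":
    for text in (
        "Can I cancel my reservation?",
        "What is the pet policy?",
        "Hotel with pool near the beach",
    ):
        print(route(text))
```

Keeping the classifier, the route table, and the handlers separate means any one piece can be swapped (for example, replacing the keyword rules with a trained model) without touching the others, which fits the article's point about avoiding irreversible architectural commitments.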

