Top Voices in Search Tech: Uri Goren


The "Top Voices in Search-Tech" initiative is a carefully curated showcase of the most impactful and influential search-tech professionals from around the world that you can connect with and learn from.


About Uri

Uri Goren is a veteran search-and-recommendation specialist with more than 15 years of machine-learning experience. Earlier in his career he served as a research engineer at Microsoft, Yahoo! Labs, and AT&T, where he helped design and scale ranking, personalization, and knowledge-graph systems.

Today Uri is the Founder, CEO, and Chief Data Scientist of ArgmaxML, an AI-matching company that leverages cutting-edge language models, vector search, and real-time decisioning to transform search and discovery for clients in finance, e-commerce, healthcare, and tech.

Before launching ArgmaxML, Uri co-founded a legal-technology start-up that automated contract drafting through generative AI suggestions and anomaly detection, dramatically shortening legal review cycles for global law firms and in-house counsel.

Uri co-hosts ExplAInable, a weekly Hebrew-language podcast on machine learning and AI.


Let’s start from the beginning — how did you get involved in the search tech industry?

Back in 2014 I joined Yahoo Research to help launch what became the Yahoo Native Ads platform. Overnight I went from tuning classifiers to wrestling with a fully distributed IR stack: billions of documents, sub-100 ms latency targets, and a brand-new serving engine that later spun out as Vespa.ai. Debugging ranking features and click-through predictions for every query across Yahoo News, Finance, and Sports was a real-time crash course in indexing, feature-store design, and online A/B culture. I was hooked the moment I saw how a single relevance tweak could move revenue (and user happiness) the very next hour.

A few years later I co-founded BestPractix, a legal-tech startup that processed millions of long, highly structured contracts. We used the same hybrid retrieval and anomaly-detection ideas, only now the goal was to surface the three clauses a lawyer actually cares about rather than the best ad. The domain was different, but the playbook from Yahoo (streaming feature pipelines, rapid offline-to-online iteration, tight latency budgets) proved invaluable.

Those two experiences convinced me that robust search-and-matching tech belongs everywhere, not just in web search or ads. That insight is what led me to start ArgmaxML, where we now apply those lessons to everything from Hebrew insurance FAQs to real-time bidding and recommendation engines for publishers and advertisers.

Tell us about your current role and what you’re working on these days.

My job is an endless sprint on two fronts: outrunning the breakneck pace of AI innovation, and cementing the playful, research-first culture that made Argmax special even as we expand into a distributed, multinational team. Every week I toggle between prototyping new RAG and vector-search pipelines to stay ahead of the next model release, and designing rituals (paper clubs, “one-experiment-per-sprint” rules, and a Relevance Bootcamp) that turn curiosity into a system, so our growing roster of engineers across multiple time zones can keep shipping bold ideas instead of settling into big-company inertia.

Could you describe a ‘favorite failure’—a setback that ultimately led to an important lesson or breakthrough in your work?

Early in my career I helped build Microsoft’s next-generation 3-D sensing camera, certain that Kinect-style hardware would dominate homes, and that conviction pushed me to co-found BestPractix to commercialize the tech. We poured ourselves into perfecting depth imaging but ignored whether anyone truly needed it, and when deep-learning-based vision took off, our solution became irrelevant almost overnight. That misstep taught me to anchor every project in a validated customer problem, not a cool technology, and to design products flexible enough to endure sudden shifts in the innovation landscape.

What are some of the biggest misconceptions about search that you often encounter?

Many people assume that if prompt-engineering tricks and a few rules can get ChatGPT-powered search to about 70% accuracy, a bit more tweaking will painlessly lift it into the 90% range. In truth, those last gains demand entirely different tools: solid retrieval infrastructure, relevance ranking, feedback loops, and rigorous evaluation. Clever prompts alone can’t substitute for a purpose-built search stack.

How do you envision AI and machine learning impacting search relevance and data insights over the next 2-3 years?

Over the next two to three years I expect search relevance and data insights to converge around retrieval-augmented generation: high-quality search pipelines feed LLMs fresher, long-tail evidence, while those same models learn semantic taxonomies and intent signals that loop back to boost ranking and personalization. Vector databases, lightweight domain-specific models, and continuous user-feedback tuning will make search feel more conversational and analytics more exploratory, so the traditional gap between relevance engineers and machine-learning engineers will dissolve into a single stack where the same embeddings power autocomplete, dashboards, and on-the-fly narrative answers.
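To make that loop concrete, here is a toy, self-contained sketch of the RAG pattern: retrieve the top-k evidence for a query, then condition generation on it. The bag-of-words "embedding" and the prompt string are deliberate stand-ins for a real embedding model and LLM call; the documents and function names are illustrative, not from any production system.

```python
# Minimal RAG skeleton: retrieve top-k evidence, then condition generation on it.
import math

DOCS = [
    "OpenSearch supports hybrid lexical and vector retrieval.",
    "Redis can cache query results to cut tail latency.",
    "NDCG measures ranking quality against graded judgments.",
]

def embed(text: str) -> dict[str, float]:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    vec: dict[str, float] = {}
    for token in text.lower().split():
        vec[token] = vec.get(token, 0.0) + 1.0
    return vec

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def answer(query: str) -> str:
    evidence = retrieve(query)
    context = "\n- ".join(evidence)
    # In production this prompt would go to an LLM; here we just show the contract.
    return f"Answer using only this evidence:\n- {context}\nQ: {query}"

print(answer("measure ranking quality"))
```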

Can you share an example of a particularly challenging production issue you’ve encountered in your work with search technologies, and the process you used to resolve it?

When an e-commerce client pivoted from electronics to home décor, our production ranker, trained on years of electronics click logs, started surfacing totally irrelevant results, tanking conversion by 18%. Because historical behavior offered no guidance for the new catalog, we built a "zero-history" test harness: (1) generated thousands of synthetic décor queries and relevance judgments from the merchandising taxonomy; (2) replayed them through the stack to expose blind spots; (3) injected an exploration layer that randomly promoted long-tail items while collecting fresh user interactions; and (4) shipped a daily offline-to-online feedback loop that retrained the ranker only on post-pivot signals. Within two weeks we had enough real clicks to replace the synthetic data, relevance scores rebounded past pre-pivot levels, and the exploration layer could be dialed back, proving that simulation plus rigorous measurement can rescue search when past data suddenly goes stale.
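As an illustration of step (3), a minimal epsilon-greedy exploration layer might look like the sketch below. The function, parameters, and item names are hypothetical, not the client's actual code.

```python
# Sketch of an exploration layer: with probability epsilon, promote a random
# long-tail item into the visible results so fresh interaction data accumulates
# for the post-pivot catalog.
import random

def rerank_with_exploration(ranked_ids, long_tail_ids, epsilon=0.1, slot=3, seed=None):
    """Return ranked_ids with an occasional long-tail item injected at `slot`."""
    rng = random.Random(seed)
    results = list(ranked_ids)
    if long_tail_ids and rng.random() < epsilon:
        candidate = rng.choice(long_tail_ids)
        if candidate in results:
            results.remove(candidate)
        results.insert(slot, candidate)  # promote it high enough to be seen
    return results

# Example: the production ranker's order, plus unexplored décor items.
print(rerank_with_exploration(
    ranked_ids=["tv-stand", "lamp", "rug", "vase", "mirror"],
    long_tail_ids=["wall-art", "planter"],
    epsilon=1.0,  # force exploration for the demo
    seed=42,
))
```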

Are there any open-source tools or projects—beyond Elasticsearch and OpenSearch—that have significantly influenced your work?

Yes. Superlinked, an open-source Python framework for "vector computers," has become a cornerstone for us because it lets you fuse heterogeneous signals (image pixels, text descriptions, categorical tags, even numeric features) into a single, composite embedding space. That unified vector makes it far simpler to reason about multimodal relevance, run similarity search, and experiment with weighting or pruning each modality without rewriting the whole pipeline.
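To show the composite-embedding idea itself (this is a generic sketch, not Superlinked's actual API): embed each modality separately, normalize and weight it, then concatenate everything into one vector a similarity index can search. All vectors and weights below are made up for illustration.

```python
# Fuse per-modality vectors into one composite embedding.
import numpy as np

def composite_embedding(parts: dict[str, np.ndarray],
                        weights: dict[str, float]) -> np.ndarray:
    """Concatenate per-modality vectors, each L2-normalized and scaled by its weight."""
    chunks = []
    for name, vec in parts.items():
        norm = np.linalg.norm(vec)
        unit = vec / norm if norm else vec
        chunks.append(weights.get(name, 1.0) * unit)
    return np.concatenate(chunks)

item = composite_embedding(
    parts={
        "text":     np.array([0.2, 0.9, 0.1]),  # e.g. a sentence embedding
        "image":    np.array([0.7, 0.3]),       # e.g. CNN features
        "category": np.array([1.0, 0.0, 0.0]),  # one-hot tag
        "price":    np.array([0.45]),           # scaled numeric feature
    },
    weights={"text": 1.0, "image": 0.5, "category": 0.8, "price": 0.2},
)
print(item.shape)  # one unified vector: (9,)
```

Because the weights live outside the per-modality encoders, reweighting or pruning a signal becomes a one-line change rather than a pipeline rewrite, which is exactly the appeal of the approach.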

Is there a log error/alert that terrifies/annoys you in particular?

Nothing makes my heart sink faster than the “search-server-down” alert—relevance metrics can drift and we’ll triage them, but when the core search node drops offline the entire discovery experience collapses instantly.

What is a golden tip for optimizing search performance that you’ve picked up in your years of experience?

Start every optimization by pinning down the real business KPI (conversion rate, average order value, time-to-answer, whatever actually matters), because tuning for convenient offline metrics like accuracy or NDCG is meaningless if they don't move the needle the business cares about.
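For reference, NDCG itself is trivial to compute, which is exactly why it is so tempting to over-optimize. A minimal NDCG@k over graded relevance judgments, as a sketch:

```python
# NDCG@k: discounted gain of the actual ranking vs. the ideal ranking.
import math

def dcg(relevances):
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg_at_k(relevances, k):
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal else 0.0

# Graded judgments for a ranked result list (3 = perfect, 0 = irrelevant):
print(round(ndcg_at_k([3, 2, 0, 1], k=4), 3))  # ~0.985
```

A score of 0.985 looks excellent offline, yet it says nothing about whether those results convert, which is the point of the tip above.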

What is the most unexpected or unconventional way you’ve seen search technologies applied?

The most unconventional use I’ve seen is “fractal” or hierarchical search inside massive codebases: instead of querying the whole monolith, a system recursively slices the repository into ever-smaller semantic chunks—modules, classes, then functions—and runs relevance ranking at each level before zooming in on the exact snippet the assistant needs. This multi-resolution crawl lets AI tools reason about millions of lines of code without blowing context limits, and the same idea transfers to other domains where knowledge naturally nests, from legal treaties down to clauses or from medical literature down to sentences.
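A minimal sketch of that multi-resolution idea, with a toy keyword-overlap scorer standing in for the per-level embeddings a real system would use; the repository structure and names are entirely illustrative.

```python
# "Fractal" search: score coarse chunks first, then recurse into the best
# chunk's children until reaching a leaf snippet.

def score(query: str, text: str) -> float:
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def fractal_search(query, node):
    """node = {'text': ..., 'children': [...]}; returns the best leaf's text."""
    children = node.get("children") or []
    if not children:
        return node["text"]
    best = max(children, key=lambda c: score(query, c["text"]))
    return fractal_search(query, best)

repo = {
    "text": "repository",
    "children": [
        {"text": "billing module: invoice payment refund",
         "children": [
             {"text": "charge_card: capture a card payment", "children": []},
             {"text": "send_invoice: email an invoice to the user", "children": []},
         ]},
        {"text": "search module: ranking relevance scoring", "children": []},
    ],
}
print(fractal_search("send the invoice", repo))
```

The recursion only ever scores one level's worth of chunks at a time, which is how the approach sidesteps context limits on millions of lines of code.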

If you're building something from scratch - what does your ideal search tech stack look like?

These days I'd probably start a search project with OpenSearch plus Redis for the search layer, and Python plus Postgres for the backend.
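A minimal sketch of how those two search-layer pieces might fit together, assuming a local OpenSearch node on port 9200 and Redis on 6379; the index name, field, and TTL are illustrative.

```python
# OpenSearch for retrieval, Redis as a short-lived query cache in front of it.
import json

import redis
from opensearchpy import OpenSearch

os_client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])
cache = redis.Redis(host="localhost", port=6379)

def search(query: str, index: str = "products", ttl: int = 60):
    key = f"search:{index}:{query}"
    if (hit := cache.get(key)) is not None:
        return json.loads(hit)  # serve the cached result set
    body = {"query": {"match": {"title": query}}, "size": 10}
    hits = os_client.search(index=index, body=body)["hits"]["hits"]
    results = [h["_source"] for h in hits]
    cache.set(key, json.dumps(results), ex=ttl)  # expire stale entries
    return results

print(search("ceramic vase"))
```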

Google’s era of near-total search dominance appears to be fading, highlighting the need for a more diverse ecosystem so no single company can steer or suppress access to information.


