Top Voices in Search Tech: Atita Arora


The "Top Voices in Search-Tech" initiative is a carefully curated showcase of the most impactful and influential search-tech professionals from around the world that you can connect with and learn from.


About Atita

Atita Arora is a seasoned and esteemed professional in information retrieval systems. Over a professional journey spanning nearly two decades, she has decoded complex business challenges and pioneered innovative information retrieval solutions.

She is currently Director of Information Retrieval & AI Solutions at Voyager Search. Her background includes impactful contributions as a committer to various information retrieval projects, and she is an active presenter and participant at many international conferences.

She has a keen interest in making revolutionary tech innovations accessible and implementable to solve real-world problems.

She specializes in vector and hybrid search, seeking to uncover insights that can enhance their practical applications and effectiveness.

Atita also champions diversity in tech as co-leader of Women in Search.


Let’s start from the beginning — how did you get involved in the search tech industry?

In late 2007, I convinced my soon-to-be employer, Amsoft Systems, to let me start early on a stipend for my final college project.

What I thought would be simple coding work turned into a Facebook application loaded with every buzzword of that time: VOIP, maps, caching, ETL pipelines, and this thing called "search." Three of us total—basically the most glorified team frantically Googling everything.

Thankfully, I had the most humble and knowledgeable mentor, Kishore, who gave me my first mission: make Lucene work with Solr 1.2. I spent weeks in trial-and-error purgatory, but once that first query worked, I was hooked. We did a lot of cool things to boost the performance of our local deployment setup: custom cache policies, cache warming strategies, a homegrown Solr cluster (pre-SolrCloud days), and monitoring tools built with GraphViz that made our setup look like some beautiful, complex organism.

As weird as it may sound, I never thought about contributing any of those things to Solr open source. I was just having way too much fun building.

That chaotic, self-directed learning experience turned out to be perfect training for the roles to come—embracing uncertainty, learning from sparse to no documentation, and knowing when to experiment boldly versus when to think strategically.

Plus, that early lesson about not contributing back taught me the value of open collaboration and community knowledge sharing whenever possible.

Tell us about your current role and what you’re working on these days.

My current role is Director of Information Retrieval & AI Solutions at Voyager Search, and it sits at the intersection of AI/ML technology, product management, and user experience design—essentially simplifying complex data in the geospatial domain and making it more accessible and valuable through AI-powered tools and interfaces.

I’m currently leading initiatives focused on how users discover and extract value from structured and unstructured data, along with AI-driven search functionality and conversational AI interfaces that help users comprehend and interact in a more geospatially aware context.

Voyager already has a valuable pool of data connectors that can process and enrich data from 2,500 different sources. The idea now is to take multimodal and multilingual capabilities to the next level in the geospatial world.

Could you describe a ‘favorite failure’—a setback that ultimately led to an important lesson or breakthrough in your work?

While I've had many failures that shaped my career, one stands out for how fundamentally it changed my approach to both learning and leadership.

In 2014–15, I was approached by an online education company in India to develop and deliver an Apache Solr course.

My brain went: "Teaching? How hard can it be? I know Solr!" What I conveniently ignored was that I'd never actually taught anything—or, frankly, even taken an online course myself. But hey, confidence was never my problem.

I spent weeks crafting what I thought were brilliant materials, even watched a few of their existing courses to "get the vibe." I walked into my first live session feeling like a rockstar, ready to enlighten the masses about the wonders of search technology.

Then reality hit like a freight train. My "students" weren't eager beginners hanging on my every word—they were grizzled veterans with 10+ years of experience asking questions like, "Can you explain the intricate details of how Lucene's segment merging affects real-time indexing performance under high-concurrency scenarios?" I'm sitting there thinking, "Segment what now?" while frantically googling under my desk.

It was like showing up to teach a cooking class and realizing your audience is all Michelin-starred chefs asking about molecular gastronomy techniques while you're still figuring out how to boil water without burning it. That spectacular crash taught me something that's been invaluable in all my roles thereafter: there's a massive difference between knowing something and being able to guide others through it.

Now, when I'm leading AI initiatives or mentoring my team, I don't just focus on the technical aspects—I think about how to translate complex concepts, relate them to something familiar, anticipate the curveball questions, and create psychological safety for people to admit when they don't know something (because trust me, we're all still learning in this field).

Plus, that experience of being humbled by my audience? Pure gold for leading in AI, where the technology is evolving so fast that yesterday's expert can be tomorrow's student. It keeps me curious, collaborative, and just a little bit paranoid about over-preparing—which, in AI leadership, is exactly where you want to be.

What are some of the biggest misconceptions about search that you often encounter?

Glad you said misconceptions (plural), because I’ve discovered so many.

Biggest one? That search is just about finding things.

Guess what—it’s not. Or let’s say, it’s not only that. People want to build search engines that read their users' minds, and those minds are filled with this nebulous concept of "relevance" that shifts dramatically from domain to domain. I’ve worked across everything from e-commerce to medical to news platforms, and the reality is way more nuanced than just matching text.

Take e-commerce, where I’ve spent most of my time. Sure, text matching matters—but what really drives conversions? Newness, popularity, discounts, add-to-cart rates, inventory status, compatibility with what’s already in your cart, even demographic signals. Compare that to medical search, where ingredients, salt substitutes, and allergy information can literally be life-or-death factors. Or news, where freshness trumps almost everything. In second-hand marketplaces, suddenly distance from the buyer becomes critical. Each domain doesn’t just have different ranking factors—it has a completely different definition of success.
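To make those domain differences tangible, here is a minimal sketch of the kind of re-ranking blend an e-commerce system might layer on top of the engine's text score. The signals and weights are illustrative assumptions for this interview, not a formula from any system Atita describes; a news site would weight freshness far higher, and a second-hand marketplace would add a distance signal.

```python
from dataclasses import dataclass

@dataclass
class Product:
    text_score: float   # base relevance score from the search engine
    popularity: float   # e.g. normalized add-to-cart rate, 0..1
    freshness: float    # e.g. decayed recency signal, 0..1
    discount: float     # discount depth, 0..1
    in_stock: bool

def rerank_score(p: Product) -> float:
    """Blend text relevance with business signals (illustrative weights)."""
    if not p.in_stock:
        return 0.0  # hard business rule: never surface out-of-stock items
    return (0.6 * p.text_score
            + 0.2 * p.popularity
            + 0.1 * p.freshness
            + 0.1 * p.discount)

results = [Product(0.9, 0.2, 0.1, 0.0, True), Product(0.7, 0.9, 0.8, 0.3, True)]
results.sort(key=rerank_score, reverse=True)
```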

The second big misconception—especially in this AI era—is that more data automatically equals better search. I see teams throwing everything at an LLM thinking, "It’ll figure it out!" But it’s never about quantity—it’s about quality and structure. You need homogeneous data states, exploration-friendly catalogs, and honestly, you need to understand whether your data even supports the kind of search experience you’re trying to build.

What’s fascinating is that these same misconceptions are now plaguing AI implementations. People expect ChatGPT-style interfaces to magically solve enterprise search problems—without considering that the underlying challenges (understanding intent, domain-specific relevance, data quality) haven’t changed. The interface got shinier, but the fundamental problems are still there.

How do you envision AI and machine learning impacting search relevance and data insights over the next 2-3 years?

I’m looking forward to—and already seeing—AI fundamentally transforming how we approach relevance and insights, in ways that would have seemed like science fiction just a few years ago.

Take multimodal and multilingual capabilities—I’m seeing search engines that can retrieve matching products based on a picture from the user, who may be speaking in broken English mixed with their native language, and yet still receive relevant results (or even a combination of results). Having spent years building complex ranking algorithms that tried to guess user intent from limited text, this feels revolutionary. :)
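As a rough illustration of how such multimodal, multilingual retrieval is commonly wired up, the sketch below uses a CLIP-style model from the sentence-transformers library to embed catalog images and a mixed-language text query into one shared vector space. The model name, file paths, and query are assumptions for illustration, not a description of any particular system mentioned here.

```python
from PIL import Image
from sentence_transformers import SentenceTransformer, util

# CLIP maps images and text into one shared vector space.
model = SentenceTransformer("clip-ViT-B-32")

# Index side: embed (hypothetical) catalog images.
catalog = ["red_dress.jpg", "blue_sneaker.jpg"]
image_embeddings = model.encode([Image.open(path) for path in catalog])

# Query side: a photo upload and mixed-language text both land in the
# same space, so one vector index can serve either modality.
query_embedding = model.encode("rotes Kleid with long sleeves")  # mixed DE/EN

scores = util.cos_sim(query_embedding, image_embeddings)
print(catalog[scores.argmax().item()])
```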

In my e-commerce days, we’d spend a lot of time manually auditing and profiling catalog data to understand why search results were poor. Now I’m seeing systems that can automatically surface data quality issues, identify gaps in product information, and even suggest what missing attributes are hurting discoverability.

Personalization and recommendation systems have already come a long way—and I look forward to seeing them improve even more in understanding not just what you searched for, but when, why, and what you might need next, based on patterns across many similar users.

Can you share an example of a particularly challenging production issue you’ve encountered in your work with search technologies, and the process you used to resolve it?

There have been many, as I have been learning constantly and getting better—just like AI ;)

I remember a couple that were challenging enough to make us rethink the entire application’s architecture—especially in the early days of real-time data processing, such as applying stock and pricing changes in an e-commerce use case. That’s what introduced me to event-driven, high-throughput, low-latency architectures.
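For readers unfamiliar with that pattern, here is a minimal sketch of the event-driven approach she mentions: consume price and stock change events from a queue and apply them as partial document updates, rather than re-indexing the full catalog. It uses kafka-python and the Elasticsearch Python client; the topic name, fields, and index are invented for illustration.

```python
import json
from kafka import KafkaConsumer
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
consumer = KafkaConsumer(
    "product-updates",                       # hypothetical topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

# Apply each change as a partial update to the already-indexed document,
# so price/stock edits become searchable without a full re-index.
for event in consumer:
    change = event.value  # e.g. {"sku": "123", "price": 19.99, "stock": 4}
    es.update(
        index="products",
        id=change["sku"],
        doc={"price": change["price"], "in_stock": change["stock"] > 0},
    )
```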

There was another one that was quite tricky, actually—involving legal urgency at MyToys: ensuring products of two competing brands (LEGO and Playmobil) did not appear in the same result set. That one gave me a lot of sleepless nights.
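The interview doesn't say how that constraint was ultimately solved, but one workable approach is a post-filter over the ranked results, where whichever restricted brand appears first locks out its competitor for that page. The sketch below is an illustrative guess along those lines, not the actual MyToys implementation.

```python
COMPETING = {"LEGO": "Playmobil", "Playmobil": "LEGO"}

def enforce_brand_exclusivity(ranked_hits: list[dict]) -> list[dict]:
    """Drop whichever competing brand did not appear first on the page."""
    blocked: set[str] = set()
    page = []
    for hit in ranked_hits:
        brand = hit.get("brand")
        if brand in blocked:
            continue
        if brand in COMPETING:
            blocked.add(COMPETING[brand])  # lock out the competitor
        page.append(hit)
    return page

hits = [{"brand": "LEGO"}, {"brand": "Playmobil"}, {"brand": "LEGO"}]
print(enforce_brand_exclusivity(hits))  # the Playmobil hit is filtered out
```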

Are there any open-source tools or projects—beyond Elasticsearch and OpenSearch—that have significantly influenced your work?

As you may have guessed—I’ve worked a lot on Solr. I was also involved in using and contributing to Apache OpenNLP and Tika.

During my time at OpenSource Connections, I contributed the first Vector Search demo for Solr in Chorus and added visualization capabilities in Quepid.

Apart from that, Vespa, Weaviate, and Qdrant have definitely influenced and inspired me to a great extent.

Is there a log error/alert that terrifies/annoys you in particular?

Classic OOMs, model inference timeouts, internal errors, and connection timeouts: all weird and daunting to debug and fix.

What is a golden tip for optimizing search performance that you’ve picked up in your years of experience?

I would say: understand the nuances that help you see the bigger picture—and focus on those. For example, optimize for business outcomes, not just technical needs. As we say: measure the full funnel—latency, quality, clicks, and conversions.

That said, user intent and behavior are also critical aspects of search performance—do not ignore them! And you can’t achieve either of the above unless you adopt an evaluation mindset—treat your search like a product, not a project.
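As one concrete piece of that evaluation mindset, here is a minimal sketch of nDCG, a standard ranking-quality metric, computed from human relevance judgments for a single query. The judgments in the example are made up for illustration.

```python
import math

def dcg(relevances: list[float]) -> float:
    # graded gain, discounted by rank position (positions are 1-indexed)
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances: list[float], k: int = 10) -> float:
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal else 0.0

# Human judgments (0-3) for the top results of one query, in ranked order:
print(round(ndcg([3, 2, 3, 0, 1]), 3))
```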

What is the most unexpected or unconventional way you’ve seen search technologies applied?

I have not only seen one but helped develop it as well: a search system that detects money laundering by treating transaction patterns like search queries against known suspicious behavior patterns.
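That system isn't described in detail here, so the sketch below is purely hypothetical: it illustrates the stated idea of treating a transaction pattern as a query vector and "searching" it against known suspicious patterns with nearest-neighbour similarity. The features and threshold are invented.

```python
import numpy as np

# Rows are known suspicious transaction patterns; columns might be
# normalized amount, transaction frequency, counterparty risk, hour-of-day.
suspicious_patterns = np.array([
    [0.9, 0.8, 0.7, 0.1],
    [0.2, 0.9, 0.9, 0.9],
])

def suspicion_score(transaction: np.ndarray) -> float:
    """Cosine similarity to the closest known suspicious pattern."""
    sims = suspicious_patterns @ transaction / (
        np.linalg.norm(suspicious_patterns, axis=1) * np.linalg.norm(transaction)
    )
    return float(sims.max())

tx = np.array([0.85, 0.75, 0.6, 0.2])
if suspicion_score(tx) > 0.95:  # invented threshold
    print("flag for manual review")
```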

If you're building something from scratch, what does your ideal search tech stack look like?

This is probably the trickiest question of all—and I can play my consultant card here and say: it depends :)

I say this because, especially in today’s landscape, there are so many tech options available that it’s hard to make a choice unless I know more about the kind of search I’m building, along with its functional and non-functional requirements. As I always suggest: you should never start with a technology or tool—the driving factor should always be the problem you’re solving.

I would consider:

  • The complexity of the search experience that’s needed
  • Scalability and performance needs
  • Team skills and the existing tech stack it needs to integrate with
  • And, importantly, what future evolution looks like

The goal is picking technologies that solve today's problem excellently while positioning us well for tomorrow’s requirements.

The best search stack is the one that grows with your business.

Can you suggest a lesser-known book, blog, or resource that would be valuable to others in the community?

You know, I've been diving into some books that sit at this fascinating intersection of technology and human behavior, and they've completely changed how I approach building search systems.

Take Invisible Women — it's a wake-up call for anyone in product or AI. The book reveals how most datasets default to male as "normal" and treat women as edge cases. Imagine launching a voice search feature that works perfectly for your male users but fails for women because your training data was skewed. That’s not just a technical bug — that’s bad business that alienates half your potential users. What struck me most is how this isn’t intentional bias; it’s unconscious exclusion through lack of consideration.

Then there's Thinking, Fast and Slow — its insights into cognitive biases are pure gold for search design and understanding how users think and interact with different search experiences.

And Mismatch ties it all together beautifully with this framework: recognize, include, learn. When you design search that works for someone with limited typing ability, you often end up creating better search for everyone. Solve for one, extend to many.

Here’s what excites me: as AI starts handling more of the heavy lifting in dev, we can focus on these human-centered aspects that we've been overlooking — and actually improve the core product strategy.

Everyone's building AI-powered search to avoid admitting their fundamentals are broken.

Most search failures aren’t because systems can’t get what users want—or because users can't express what they want. They're failing because the underlying data is garbage, taxonomies are either broken or nonexistent, and the criteria governing search ranking have never been tied to actual business goals. But fixing data quality and information architecture? That’s boring, unglamorous work that doesn’t get you promoted.

So instead, what do teams do? They obsess over GenAI and shiny new tech without forming a single hypothesis or running a single experiment or evaluation to assess whether it actually fits their problem. It’s become the ultimate technical-debt avoidance strategy.

Here’s the kicker: adding a conversational layer on top of fundamentally flawed search just makes users more frustrated—because now they think the system should be smart enough to read their minds. You’ve literally weaponized their expectations against them.

I’ve seen companies spend millions on vector databases and LLMs while their product catalog still has descriptions like “blue thing, size medium, kinda nice.” You can’t ChatGPT your way out of bad data management. My suggestion? Go easy on the hype and spend a few months getting your basics right. Clean your data, align your ranking with business goals, invest in search evaluation and metrics.

Remember: AI is a multiplier, not a foundation. And multiplying garbage just gives you more garbage.

Anything else you want to share? Feel free to tell us about a product or project you’re working on, or anything else you think the search community will find valuable.

Thank you so much for this opportunity to contribute as one of the "Top Voices in Search Tech". What I love about this initiative is that it’s purely merit-based — focused on credentials, expertise, and the value people bring, period.

It’s refreshing to be in a space where I can focus entirely on my passion for search technology and the problems I’m excited to solve.

This is exactly what our industry needs more of, and it gives me hope that we’re moving toward evaluating everyone purely on what they bring to the table.

Thank you for creating that environment — it means a lot.

