Top Voices in Search Tech: Samuel Herman

Samuel Herman

The "Top Voices in Search-Tech" initiative is a carefully curated showcase of the most impactful and influential search-tech professionals from around the world that you can connect with and learn from.


About Samuel

Samuel is a distinguished engineer and researcher with 10+ years of expertise in database technologies, vector search, distributed systems, and large-scale data processing.

He has a deep technical focus on search indexing systems, time series databases, distributed architectures, and machine learning infrastructure at petabyte scale.

Today, Samuel is a technical lead for OpenSearch at DataStax (now an IBM company) and a member of the OpenSearch Technical Steering Committee. Prior to DataStax, Samuel was the architect and lead engineer of OCI Observability and later OCI Search Services at Oracle Cloud Infrastructure.

Samuel also led large-scale observability and storage projects at AWS EBS, and prior to that, at various startups.


Let’s start from the beginning — how did you get involved in the search tech industry?

My path into search actually started with a security problem that existing databases couldn’t solve. Back in 2012, I was tasked with building cross-device algorithms to scan everything on our network—from operating systems to applications—protecting sensitive data from potential attacks.

The breakthrough moment came when I realized this wasn’t just an algorithmic challenge. The real bottleneck was that traditional database systems simply couldn’t handle the scale and real-time requirements we needed. That’s when I went down the rabbit hole of building custom storage engines, indexing strategies, and query planners from scratch.

When I joined Oracle Cloud Infrastructure in 2017, I encountered logging data at unprecedented scale—hundreds of millions of events per second, petabytes of data daily. Building a proprietary database to handle all of OCI’s logging needs taught me that the line between “database” and “search” gets pretty blurry when you're operating at cloud scale. Whether you’re doing SIEM analysis, security ML, or traditional search, it really comes down to answering questions as accurately and quickly as possible.

The real turning point was when I was tapped to architect something entirely new for OCI—not just another massive observability database, but a true search platform. We chose OpenSearch as our foundation because the open ecosystem allowed us to push performance boundaries in ways proprietary systems couldn’t match. That became OCI Search Services, powered by OpenSearch.

What fascinated me was how the performance optimization techniques I’d developed for security scanning and logging directly applied to search relevance problems. Now at DataStax with jVector and search on OpenSearch, I'm bringing that same systems-level thinking to vector search, where the indexing trade-offs remind me of those early database optimization challenges—just with a lot more dimensions.

Tell us about your current role and what you’re working on these days.

At DataStax, now part of IBM, I’m architecting our next-generation search capabilities for enterprise AI applications. While we have Astra DB—our serverless Cassandra offering—we’re strategically adding OpenSearch to create a unified platform that gives customers the best of both worlds: Cassandra’s operational simplicity for transactional workloads and OpenSearch’s advanced search and analytics capabilities.

My primary focus is pushing the boundaries of vector search quality through jVector, our core vector search library, while simultaneously evolving OpenSearch’s storage engine and KNN capabilities to support features that simply don’t exist elsewhere.

The challenge that keeps me up at night is the recall-performance trade-off problem. Every customer has different vector datasets, different latency requirements, and different cost constraints. A recommendation system serving millions of users has completely different needs than a RAG application doing document retrieval. We’re building adaptive systems that can automatically optimize these trade-offs based on actual usage patterns rather than forcing customers to become vector search experts themselves.
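To make that trade-off concrete for readers, here is a minimal sketch that sweeps an HNSW-style search parameter and measures recall against exact search. It assumes the hnswlib and numpy libraries (neither is mentioned in the interview) and only illustrates the kind of knob an adaptive system would tune per workload; it is not the system described above.

```python
# Minimal sketch of the recall/latency trade-off, assuming the hnswlib and numpy
# libraries. This only illustrates the knob (ef) an adaptive system would tune
# per workload, not the adaptive system described in the interview.
import time
import numpy as np
import hnswlib

dim, n_items, n_queries, k = 128, 50_000, 200, 10
rng = np.random.default_rng(42)
data = rng.random((n_items, dim), dtype=np.float32)
queries = rng.random((n_queries, dim), dtype=np.float32)

# Exact (brute-force) nearest neighbours as ground truth.
# Squared L2 distance, dropping the constant ||q||^2 term which doesn't affect ranking.
d2 = -2.0 * queries @ data.T + (data ** 2).sum(axis=1)
truth = np.argpartition(d2, k, axis=1)[:, :k]

index = hnswlib.Index(space="l2", dim=dim)
index.init_index(max_elements=n_items, ef_construction=200, M=16)
index.add_items(data)

for ef in (10, 50, 200):
    index.set_ef(ef)                          # higher ef: better recall, higher latency
    start = time.perf_counter()
    labels, _ = index.knn_query(queries, k=k)
    latency_ms = (time.perf_counter() - start) * 1000 / n_queries
    recall = np.mean([len(set(found) & set(true)) / k
                      for found, true in zip(labels, truth)])
    print(f"ef={ef:4d}  recall@{k}={recall:.3f}  avg query latency={latency_ms:.2f} ms")
```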

Could you describe a ‘favorite failure’—a setback that ultimately led to an important lesson or breakthrough in your work?

There’s a pattern in my career where I start by thinking “I can build a better X” and end up realizing “Wait, do we even need X in the first place?”

The most recent example happened when I was obsessing over our vector bulk loading performance. I spent days figuring out offline index construction. We were getting decent improvements, but I kept hitting walls whenever I thought through real-world use cases.

What about incremental updates? What happens when documents change or get deleted? The offline approach added too many edge cases and too much operational complexity.

The breakthrough came during a particularly frustrating debugging session at 2 a.m. I was debugging plumbing issues for an offline-generated index when I had this moment of clarity: I was optimizing the wrong thing entirely. Instead of increasing throughput by introducing bulk loading, what if we made regular inserts so efficient that bulk loading became unnecessary?

That complete mental shift led to what became our in-place graph modification approach—where every single insert gets most of the benefits we used to reserve for bulk operations. Instead of having two completely different code paths with different performance characteristics, we have one path that’s fast for everything.
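To illustrate the "one path for everything" idea, here is a toy, single-layer proximity graph with purely incremental inserts. It is loosely in the spirit of HNSW/Vamana-style indices and is an illustrative assumption, not jVector's actual in-place graph modification: each insert runs a greedy search, links to the closest nodes found, and prunes neighbor lists, so there is no separate bulk-build phase.

```python
# A toy, single-layer proximity graph with purely incremental (in-place) inserts.
# Loosely HNSW/Vamana-flavoured and deliberately simplified; an illustrative
# assumption, not jVector's actual in-place graph modification algorithm.
import heapq
import numpy as np

class ToyGraphIndex:
    def __init__(self, dim: int, max_degree: int = 16, beam: int = 32):
        self.dim, self.max_degree, self.beam = dim, max_degree, beam
        self.vectors: list[np.ndarray] = []
        self.neighbors: list[list[int]] = []   # adjacency lists: node id -> neighbour ids

    def _dist(self, q: np.ndarray, node: int) -> float:
        return float(np.sum((q - self.vectors[node]) ** 2))

    def _search(self, q: np.ndarray, k: int) -> list[tuple[float, int]]:
        """Greedy beam search from node 0; returns up to k (distance, node) pairs."""
        if not self.vectors:
            return []
        width = max(k, self.beam)
        visited = {0}
        frontier = [(self._dist(q, 0), 0)]                # min-heap of candidates
        best = [(-frontier[0][0], 0)]                     # max-heap (negated) of results
        while frontier:
            d, node = heapq.heappop(frontier)
            if d > -best[0][0] and len(best) >= width:
                break                                     # frontier cannot improve the results
            for nb in self.neighbors[node]:
                if nb in visited:
                    continue
                visited.add(nb)
                dn = self._dist(q, nb)
                if len(best) < width or dn < -best[0][0]:
                    heapq.heappush(frontier, (dn, nb))
                    heapq.heappush(best, (-dn, nb))
                    if len(best) > width:
                        heapq.heappop(best)
        return sorted((-d, n) for d, n in best)[:k]

    def insert(self, vec) -> int:
        """Single code path: every insert searches, links, and prunes in place."""
        vec = np.asarray(vec, dtype=np.float32)
        node = len(self.vectors)
        self.vectors.append(vec)
        self.neighbors.append([])
        if node == 0:
            return node
        for _, nb in self._search(vec, self.max_degree):
            self.neighbors[node].append(nb)
            self.neighbors[nb].append(node)
            if len(self.neighbors[nb]) > self.max_degree:
                # Prune the neighbour's adjacency list back down by distance.
                self.neighbors[nb].sort(key=lambda other: self._dist(self.vectors[nb], other))
                self.neighbors[nb] = self.neighbors[nb][: self.max_degree]
        return node

    def query(self, vec, k: int = 10) -> list[tuple[float, int]]:
        return self._search(np.asarray(vec, dtype=np.float32), k)

# Usage: inserts and queries share the same graph and the same code path.
rng = np.random.default_rng(1)
index = ToyGraphIndex(dim=32)
for v in rng.random((2000, 32), dtype=np.float32):
    index.insert(v)
print(index.query(rng.random(32, dtype=np.float32), k=5))
```

A production index would of course add concurrency control, deletes, and disk-aware layouts on top of this, which is where most of the real engineering effort sits.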

The lesson I took from this—as well as from similar cases—stuck with me: sometimes the biggest breakthroughs come not from making something better, but from questioning whether you need it at all. It’s changed how I approach every optimization problem since.

What are some of the biggest misconceptions about search that you often encounter?

The biggest misconception I encounter is this artificial wall people put up between “search” and “databases”—as if they’re completely different worlds. I’ve sat in countless meetings where engineers say “we need a search solution” when what they really need is better indexing, or “we need an eventually consistent key-value store” when their existing search solution (OpenSearch) already provides that capability.

Here’s the thing: whether you’re doing vector similarity search, traditional text search, or complex analytics queries, you’re fundamentally solving the same problems. You have a question, you need to scan data efficiently, you need to bring the right data to compute, and you need to optimize how you ask that question. The core challenges around storage, filtering, retrieval, and query planning haven’t changed—we just have trendier logos on the solutions.

Another big one is the belief that vector search indices can somehow be developed in a vacuum. Vector indices are tightly coupled to data distribution, which is itself tightly coupled to the model generating the vectors. The reality is that search quality depends far more on the model generating your vectors than on index optimization.

I see teams spending months fine-tuning recall parameters while using outdated embedding models, or debugging index performance when their real problem is that their vectors don’t actually capture the semantic relationships they need.

The misconceptions matter because they lead to over-engineered solutions. Teams will build separate systems for “search” and “analytics” when one well-designed platform could handle both. Or they’ll neglect the basics—like whether they’re generating the right vectors—before jumping into advanced index tuning. Success comes from focusing on the core problem without getting distracted by industry labels and artificial boundaries.
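One cheap way to "check the basics" before tuning an index is a sanity test on the embeddings themselves. The sketch below assumes the sentence-transformers library and a small public model (both assumptions for illustration, not tools named in the interview): it verifies that queries land closer to their known-relevant documents than to unrelated ones before anyone touches recall parameters.

```python
# Sanity check: do the embeddings capture the relationships we need?
# Assumes the sentence-transformers library and a small public model --
# purely illustrative, not a tool mentioned in the interview.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# (query, known-relevant document, unrelated document) triples from your own domain.
triples = [
    ("how do I reset my password",
     "Visit account settings and choose 'Forgot password' to reset it.",
     "Our quarterly revenue grew by 12 percent year over year."),
    ("error connecting to the database",
     "Check that the connection string and firewall rules allow access to the DB host.",
     "The office will be closed for the national holiday on Monday."),
]

def cos(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

failures = 0
for query, relevant, unrelated in triples:
    q, r, u = model.encode([query, relevant, unrelated])
    if cos(q, r) <= cos(q, u):
        failures += 1
        print(f"Model ranks an unrelated doc above the relevant one for: {query!r}")

print(f"{failures}/{len(triples)} triples failed -- fix the model before tuning the index.")
```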

How do you envision AI and machine learning impacting search relevance and data insights over the next 2-3 years?

I think vector search will continue to be a dominant player, and new techniques like multi-vector search will have an increasingly large impact over time. If the trend of dense vector search dominance continues to grow, the focus of the retriever will shift to the properties of the model, as the index technology becomes more of a commodity.
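For readers unfamiliar with the term, multi-vector search represents a query and a document as sets of vectors (for example, one per token) and scores them with a late-interaction rule such as MaxSim, popularized by ColBERT-style models. A minimal numpy sketch of that scoring step, purely as an illustration of the idea rather than any specific product's implementation:

```python
# MaxSim late-interaction scoring over multi-vector representations (numpy only).
# An illustrative sketch of the idea behind ColBERT-style multi-vector search.
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """query_vecs: (q_tokens, dim), doc_vecs: (d_tokens, dim), both L2-normalized.
    Each query vector takes its best-matching document vector; the sum is the score."""
    sims = query_vecs @ doc_vecs.T            # cosine similarities, (q_tokens, d_tokens)
    return float(sims.max(axis=1).sum())

rng = np.random.default_rng(0)
query = normalize(rng.standard_normal((8, 64)).astype(np.float32))     # 8 query token vectors
docs = [normalize(rng.standard_normal((int(rng.integers(20, 60)), 64)).astype(np.float32))
        for _ in range(100)]                                            # 100 variable-length docs

scores = np.array([maxsim_score(query, d) for d in docs])
top5 = np.argsort(-scores)[:5]
print("Top 5 docs by MaxSim score:", top5.tolist(), scores[top5].round(2).tolist())
```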

Can you share an example of a particularly challenging production issue you’ve encountered in your work with search technologies and the process you used to resolve it?

The most confusing issue I had to deal with was a machine running out of resources (usually memory), but for reasons that weren’t immediately evident, like file handles closing too slowly or too many connections being opened and closed. The number of host metrics can be overwhelming, even if you're only looking at network or file system metrics. The correlations aren’t always straightforward and can take time to detect.
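As an illustration of correlating those signals, the sketch below samples memory, open file descriptors, connections, and threads together for one process, assuming the psutil library (an assumption, not tooling mentioned in the interview). Sampled side by side, a slow descriptor or socket leak shows up next to the memory curve instead of being discovered only at the out-of-memory crash.

```python
# Sample memory, file descriptors, connections, and threads for one process,
# so slow resource leaks can be correlated with memory growth over time.
# Assumes the psutil library; num_fds() is Unix-only, and net_connections()
# needs psutil >= 6.0 (older versions expose the same data as connections()).
import time
import psutil

def sample(proc: psutil.Process) -> dict:
    with proc.oneshot():                      # batch the underlying OS reads
        return {
            "rss_mb": round(proc.memory_info().rss / 1e6, 1),
            "open_fds": proc.num_fds(),       # slowly-closed file handles pile up here
            "inet_connections": len(proc.net_connections(kind="inet")),
            "threads": proc.num_threads(),
        }

if __name__ == "__main__":
    proc = psutil.Process()                   # monitoring ourselves; pass a real pid in practice
    for _ in range(5):
        print(sample(proc))
        time.sleep(1)
```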

Is there a log error/alert that terrifies or annoys you in particular?

Any unstructured log, and any log message that omits the thread or request ID. :)


