Charlie Hull - Search Tech Top Voices

The "Top Voices in Search-Tech" initiative is a carefully curated showcase of the most impactful and influential search-tech professionals from around the world that you can connect with and learn from.

About Charlie

Charlie is an expert consultant with over 25 years' experience helping clients across the world build better and more accurate search systems. In 2001 he founded, and went on to run, Flax, the UK's leading open source search consultancy. Flax worked with clients including the UK Government, Reed Specialist Recruitment and The Financial Times. In 2019 Flax merged with OpenSource Connections and Charlie became a Managing Consultant there, working with companies across the world from multi-billion-dollar e-commerce giants to startups. In addition, Charlie ran the popular Haystack conference for five years, is the co-author of the book Searching the Enterprise for NOW Publications, and is a founding member of The Search Network.

In Charlie’s own words: “Search isn't working? Let me help! As The Search Juggler I can audit and develop your search strategy, offer expert consulting in search & AI, help you run an amazing search-themed event and assist you as you bring your new service or product to the search sector. If I can't help, my wide and deep network allows me to find the people you need. I'm honest, neutral and pragmatic and able to work with all levels of your business from executive to technical.”

Let’s start from the beginning - how did you get involved in the search tech industry?

Early in my career as a developer, I got a job at a company called Muscat, one of the very early players in the search space, co-founded by Martin Porter, who wrote the famous Porter Stemmer algorithm. They had recently been funded by a venture group, and their ambition was to build a web search engine. Google wasn’t really around at that time, and people were more familiar with AltaVista and similar services. I believe I was the second person hired, and the company grew to about 30 people. We built a web search engine of over half a billion pages in about 18 months, which was a great introduction to the world of search and something I found fascinating.

Unfortunately, the venture money ran out, and it didn’t become the next Google. Around 2001, I left with one of the other top-level people and together we started our own consulting company. The interesting thing was, the web search engine had been based on open-source software, so we were able to continue using it. We set ourselves up as consultants, helping people build search engines on that open-source platform. Over the next 17 years, we worked with many different companies worldwide, helping them implement and improve search. We were never a huge company, but we were very specialized and started to build a reputation. We moved into different technologies, mainly open source, though we also dealt with other software.

We worked for governments, recruitment companies, e-commerce companies, financial institutions - you name it - a really wide range of clients. That was my start: simply being employed at a search-focused company, enjoying the work, and enjoying the whole field of search. That’s ultimately what led me to where I am today.

Tell us about your current role and what you’re working on these days.

I was with a company called OpenSource Connections for six years. I left in November when they closed their UK office. Since that time, I've been developing an independent identity as a search consultant, which I've named “The Search Juggler”. I chose that name because in my other life, I’m also a juggler and stilt walker, and I’ve taught and performed circus skills for many years. I think juggling is a good metaphor for search - there are so many different objectives and pressures coming from business and technology, and it’s a constant juggling act to keep all those things in the air and reach a good outcome.

What I’m doing now is multifaceted. One part is straightforward search consulting - helping people get better results from their search systems, and taking advantage of AI where it makes sense. It’s not always necessary, but it’s another useful tool in the box. I also help companies that are breaking into the search market or want a bigger footprint, since I have a wide knowledge of the search world and its various players. I help them communicate what they do, find new markets, and figure out ways to grow. I’m also involved with search events. I ran the Haystack Search Conference for five years, both in the US and the EU, and it became a leading event in our space. I’m an experienced conference speaker, presenter, and host, so I can help companies make their search events shine.

Overall, it’s a mix of technical search consulting and the wider business aspects - promotion, marketing, and sales - within the search world.

What are some of the biggest misconceptions about search that you often encounter?

One common misconception I often see in organisations is the idea that simply buying or implementing a new search engine will fix all their problems. I’ve encountered this multiple times: people hate the old search engine, so they decide a new one will solve everything, believing all the marketing and technical claims. But often the real issues are structural - process-related. It’s about knowing what your data is, what the quality of that data is, and matching user intention (which can be tricky) to the language of your content. It’s also about measurement. Many people don’t know how to measure good or bad search; they just respond to complaints instead of intentionally working to improve search quality.
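To make the measurement point concrete, here is a minimal sketch of two common offline relevance metrics, precision at k and nDCG. It is an illustration rather than anything taken from Charlie's answer: the document ids, the 0-3 grading scale and the result list are all hypothetical, and a real programme would collect judgments for many queries.

```python
import math

def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that were judged relevant (k must be > 0)."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k

def dcg(gains):
    """Discounted cumulative gain: gains at later positions count for less."""
    return sum(gain / math.log2(position + 2) for position, gain in enumerate(gains))

def ndcg_at_k(ranked_ids, judgments, k):
    """DCG of the actual ranking divided by the DCG of the ideal ranking."""
    actual = dcg([judgments.get(doc_id, 0) for doc_id in ranked_ids[:k]])
    ideal = dcg(sorted(judgments.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0

# Hypothetical judgments for one query: document id -> grade (0 = irrelevant, 3 = ideal).
judgments = {"d1": 3, "d2": 0, "d3": 2, "d4": 1}
relevant = {doc_id for doc_id, grade in judgments.items() if grade > 0}

# Hypothetical result list returned by the engine for that query.
ranked = ["d2", "d1", "d4", "d3"]

print("precision@2:", precision_at_k(ranked, relevant, k=2))
print("nDCG@4:", round(ndcg_at_k(ranked, judgments, k=4), 3))
```

Tracking numbers like these over time, per query and in aggregate, is one way to move from reacting to complaints to deliberately improving search quality.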

I’ve seen companies start implementing a new search engine, thinking it will take three months, only for it to drag on for 18 months. Then, when it’s done, their boss asks why the search still isn’t better and why people still hate it. The answer is they missed the point: making search better isn’t just a technology issue - though technology is part of it - it’s about the entire process.

Search covers so many different areas, which is why it’s such a fascinating field. It involves language, psychology, human-computer interaction, data, and scale - and now, of course, AI as well. Because of that, you really have to consider all of these angles if you want to make search better.

How do you envision AI and machine learning impacting search relevance and data insights over the next 2-3 years?

A couple of years ago, people started saying, “We don’t need normal search engines anymore - AI systems will answer all our questions.” That made some in the search industry a bit nervous. That prediction hasn’t turned out to be accurate, but we’ve come to realize AI does several things for us in search. It helps solve some of the oldest problems, like extracting data from certain file formats. For example, you can use a vision model to extract features from a PDF and make it searchable. It also helps address the fundamental issue of language mismatch - how I describe something in my query might not match how it’s described in the content. Because language models are so good at language, they help bridge that gap.

AI also opens up new ways of doing search, like multimodal search. I can give you an image, and you return text; or I can type text and get an image back, or search within video. It can also help summarize search results, so instead of just seeing a list of links, I can ask a chatbot or use a voice interface and get the answer I’m looking for.

The caveat is that a lot of what’s labeled “AI” isn’t really new - we’ve had similar techniques for years. There’s also an assumption that AI is automatically better than traditional methods, which isn’t always true. Often, you need a hybrid approach. We should see AI as a new set of tools in our toolbox, not an entirely different toolbox. That’s something people struggle with because they’re being told to “do AI,” executives are excited about it, and the world is excited about it. But building these systems is harder than you think, and it often comes down to fundamental questions about how we construct retrieval systems.
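One widely used way to build a hybrid of this kind is reciprocal rank fusion (RRF), which merges a keyword ranking and a vector ranking using only the positions of the results. The sketch below is an illustration under assumptions, not a method named in the interview: the document ids and both result lists are hypothetical, and k=60 is simply the constant commonly used with RRF.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists; each document scores 1 / (k + rank) per list."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results for the same query from two different retrievers.
keyword_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from a traditional text index
vector_hits = ["doc_c", "doc_a", "doc_d"]   # e.g. from an embedding index

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that both retrievers rank highly rise to the top, while a document found by only one of them still appears further down, so neither approach has to be thrown away.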

So that’s my view. We need to figure out which specific uses of AI truly help our users, rather than throwing everything out and starting over. Also, don’t get swept up in every new model that appears. Just because a new model is announced today doesn’t mean you should drop everything. Focus on the real problems your users face.

Are there any open-source tools or projects - beyond Elasticsearch and OpenSearch - that have significantly influenced your work?

What we’ve seen over the last couple of years is an absolute explosion in the number of available open-source search engines and search technologies. We used to have the Lucene stack - OpenSearch, Solr, Elasticsearch - as the main options. OpenSearch was a slightly later player, but it’s become very influential now with Amazon’s support. We also have Weaviate and many other engines that emerged from the vector/AI side and are now, in some cases, adding more traditional features. There are lots of well-funded new players out there.

One of the most interesting ones, I think, is Vespa, which comes out of Trondheim in Norway. It was initially built within Yahoo and spun out a year or two ago. I think they’ll be a major player in the next couple of years. They’re aggressively expanding, redeveloping their business, and they have a very solid technical foundation. Their challenge is they’re not as well-known, and it’s a big box of tricks that might be a bit intimidating at first.

So there are many new options now. One question I’m often asked is, “Which search engine should I choose?” But that’s somewhat nonsensical because it depends so much on context. I often give the consultant’s standard answer: “It depends.” Making that choice is hard, and sometimes the right answer is, “Don’t change what you have - stick with what you know.” But I think it’s become a lot tougher for companies these days: which technology should they use, which direction should they go, and what options will they have going forward? There are always pros and cons.

Is there a log error/alert that terrifies/annoys you in particular?

I’ve been lucky enough not to be woken up in the middle of the night by errors, because I don’t do on-call support - thank goodness. In the old days, the big problem was indexing failures: you’d run a massive, multi-day process to ingest all your data, then one strange document would make the whole thing crash. Technology has improved, so that doesn’t happen as often anymore. Still, one of the worst situations for search is ending up with an incomplete index - when it doesn’t contain what you think it does. Long processes failing in the middle of the night for mysterious reasons is something that keeps search engineers awake.
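The two failure modes described above, one strange document crashing the whole run and an index that silently ends up incomplete, can be sketched in a few lines. This is a toy illustration only: the in-memory `store`, the `toy_index_one` function and the sample documents are hypothetical stand-ins for a real engine and a real ingestion pipeline.

```python
def index_all(documents, index_one):
    """Index each document, setting failures aside instead of aborting the run."""
    indexed = 0
    failed = []
    for doc in documents:
        try:
            index_one(doc)
            indexed += 1
        except Exception as exc:  # broad on purpose: one bad document must not stop the job
            failed.append((doc.get("id"), str(exc)))
    return indexed, failed

# Stand-in for a real search engine: a dict, plus a check that rejects non-text bodies.
store = {}

def toy_index_one(doc):
    if not isinstance(doc.get("body"), str):
        raise ValueError("body is not text")
    store[doc["id"]] = doc["body"]

documents = [
    {"id": "1", "body": "annual report"},
    {"id": "2", "body": None},  # the "one strange document"
    {"id": "3", "body": "meeting notes"},
]

indexed, failed = index_all(documents, toy_index_one)

# Completeness check: compare what the source holds with what the index holds.
missing = len(documents) - len(store)
print(f"indexed={indexed} failed={failed} missing={missing}")
```

Recording the failures and comparing source and index counts at the end of every run turns an index that quietly lacks content into an explicit report that someone can act on.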

On the querying side, there’s always a query that can break the search engine - a huge, complicated Boolean expression or something that touches every corner of your index. Wildcards are notorious for this. These queries really stress your infrastructure, and you never know what a user is going to try. It’s entirely possible for someone to send a query that ties up the engine for a long time, and that’s definitely cause for worry.
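A simple guard in front of the engine can reject the most obviously expensive queries before they run. The sketch below is a hedged illustration: the limits `MAX_CLAUSES` and `MIN_PREFIX_CHARS` are hypothetical values chosen for the example, the parsing is deliberately naive (whitespace-separated terms with upper-case operators), and it complements rather than replaces whatever limits and timeouts the engine itself offers.

```python
MAX_CLAUSES = 20      # hypothetical limit on Boolean clauses per query
MIN_PREFIX_CHARS = 3  # hypothetical minimum literal prefix before a wildcard

def check_query(query):
    """Return a list of reasons to reject the query; an empty list means it may run."""
    problems = []
    terms = query.split()

    # Count clauses naively: one more than the number of Boolean operators.
    clauses = 1 + sum(1 for term in terms if term in {"AND", "OR", "NOT"})
    if clauses > MAX_CLAUSES:
        problems.append(f"too many Boolean clauses: {clauses}")

    for term in terms:
        wildcard_positions = [i for i, ch in enumerate(term) if ch in "*?"]
        if not wildcard_positions:
            continue
        prefix_length = wildcard_positions[0]
        if prefix_length == 0:
            # A leading wildcard may force a scan of a large part of the term dictionary.
            problems.append(f"leading wildcard: {term}")
        elif prefix_length < MIN_PREFIX_CHARS:
            problems.append(f"wildcard prefix too short: {term}")
    return problems

for query in ["juggling stilts", "*ing", "ab* search", " OR ".join(["x"] * 25)]:
    print(repr(query[:30]), "->", check_query(query))
```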

What is the most unexpected or unconventional way you’ve seen search technologies applied?

When I started working in search, it was all about textual search - Google-style queries where you type something and get back a list of links. That’s what everything revolved around. One of the great shifts happened in the Elasticsearch world, where people realized that the nature of search indexes could be used for analytics. All of a sudden, Elasticsearch moved into the analytics space - territory that companies like Splunk had occupied - and it became hugely lucrative and a big part of their model. I find it fascinating that search indexes could be repurposed like that. Because we have counts of everything in an index, we can use those counts in different ways, produce visualizations, and discover insights we didn’t see before.
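The idea that an index already holds the counts needed for analytics can be shown with a toy inverted index over log lines. Everything here is a hypothetical illustration: the field names, the sample log lines and the `build_index` and `facet` helpers are invented for the sketch and do not reflect any particular engine's API.

```python
from collections import defaultdict

def build_index(docs, fields):
    """Inverted index: (field, value) -> set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, doc in enumerate(docs):
        for field in fields:
            index[(field, doc[field])].add(doc_id)
    return index

def facet(index, field, within=None):
    """Count documents per value of a field, optionally restricted to a result set."""
    counts = {}
    for (indexed_field, value), postings in index.items():
        if indexed_field != field:
            continue
        hits = postings if within is None else postings & within
        if hits:
            counts[value] = len(hits)
    return counts

# Hypothetical log lines, already parsed into fields.
log_lines = [
    {"level": "ERROR", "service": "checkout", "hour": "10"},
    {"level": "INFO", "service": "search", "hour": "10"},
    {"level": "ERROR", "service": "search", "hour": "11"},
    {"level": "ERROR", "service": "checkout", "hour": "11"},
    {"level": "INFO", "service": "checkout", "hour": "11"},
]

index = build_index(log_lines, ["level", "service", "hour"])
errors = index[("level", "ERROR")]  # the "query": every ERROR line

print(facet(index, "level"))                   # counts across the whole index
print(facet(index, "service", within=errors))  # errors broken down by service
print(facet(index, "hour", within=errors))     # errors per hour, ready to chart
```

The counts come straight from the sizes of the posting lists, which is why the same structure that answers text queries can also feed dashboards and visualizations.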

That development took search technology into new areas and also fed back into the technology itself, prompting us to scale differently and build bigger systems in ways we hadn’t done before - especially when dealing with billions of log lines. That’s been one of the most interesting things in recent years. But now it’s all about AI and thinking about new ways to deliver information using AI techniques, which is making the search world rethink its approach.

I think almost every traditional search engine has now added the word “AI” somewhere in its marketing. Everyone’s acknowledging that this is the next big trend, no matter how cynical we might feel about whether it’s truly AI. That’s the current shift: figuring out how to use these techniques to better serve users.

We can see some of the big players - Google and OpenAI, for instance - making a lot of noise about how they’re going to help with enterprise search or enterprise processes. They’re trying to break into the enterprise market. Google has done this before with the Google Search Appliance (literally a yellow box with “Google” written on it), which attempted to address enterprise search. Now, I think we’ll see these companies claiming, “We’re going to bring AI to the enterprise. We’ll solve all your business problems. You’ll be able to have conversations with your PDFs,” which, frankly, isn’t the most exciting concept, but there we go.

I believe they’ll find this very difficult. Enterprise search - something I’ve written a book about and dealt with for many years - is a thorny, complex problem. It’s nothing like consumer-facing search or e-commerce search. Enterprise data is scattered, locked in silos, and stored in strange old file formats. The queries are often brand-new or highly specialized. In general, enterprise data doesn’t want to be found - metadata is terrible, and titles are inconsistent: “version 3.1,” “No, this is the actual final version,” and so on. Breaking into this market is going to be tough.

I see a lot of startups saying, “We have AI; we can search all your enterprise data!” But when you look closer, they might only index Google Docs and a couple of other APIs. Enterprise data is much bigger and more complicated than that. Ownership is unclear and security is a huge challenge. I just read an article about a law firm restricting AI tools because employees were uploading legal documents to ChatGPT, which introduces major confidentiality and business risks.

So I suspect these companies promoting AI-driven enterprise search will discover it’s harder than they think. I’ll be interested to see whether they can truly solve it.

Can you suggest a lesser-known book, blog, or resource that would be valuable to others in the community?

There’s a book called Designing the Search Experience, written by Tyler Tate and Tony Russell-Rose, which is a great resource for anyone thinking about different user interfaces for search. It’s not very well-known, but I often recommend it and find that people haven’t heard of it.

I recommend that anyone involved in search check out the Relevance & Matching Tech Slack community. We created it at OpenSource Connections about five years ago, after a conference. It started with around 50 to 100 attendees chatting about search topics, and it’s now grown to about 5,000 people worldwide. It’s a fantastic resource and community that covers all aspects of search.

