Despite quite a bit of snow and a big job change, I was equally psyched to get selected to return to State of the Net to give a second lightning talk on the future of search. Full video is above and my original remarks are below.

Fixing Search with Index Access

Internet search is a fundamental utility for how we navigate the web. Search is such basic functionality that we often take it for granted, even as we use it day after day to answer questions and seek advice. But we may be on the verge of a revolution in what internet users expect from search.

Search Is Evolving

This is where I hype everyone up about how AI will change search. Google’s CEO recently suggested that Google will “change profoundly” this year. OpenAI has launched SearchGPT; Perplexity proposes to be an “answer engine.” At DuckDuckGo, we also have a suite of private AI tools. 

But this isn’t a talk about AI; this is a talk about the value of web indexes and how access to Google’s index could be an important unlock for improving search competition and access to information.

An essential ingredient for AI-based search tools is real-time information that comes from an index. Search is powered by a robust index of digital content that is crawled, sorted, and able to be returned to users in response to a query.
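To make "crawled, sorted, and able to be returned" concrete, here is a minimal sketch of an inverted index, the basic data structure behind most search indexes. The pages, URLs, and function names are invented for illustration; a production web index also handles crawling, ranking, freshness, and far more scale.

```python
from collections import defaultdict

# Hypothetical mini "web": URL -> page text (illustrative data only).
PAGES = {
    "example.org/a": "search engines crawl the web",
    "example.org/b": "an index maps words to pages",
    "example.org/c": "engines answer a query from the index",
}

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def lookup(index, query):
    """Return URLs containing every word in the query, sorted for stable output."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)

INDEX = build_index(PAGES)
print(lookup(INDEX, "engines index"))
```

The expensive part is building and refreshing the index; answering a query is a cheap lookup, which is why access to an existing index matters so much.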

A sudden shift to large language models or generative AI does not change this. Instead, the rise of LLMs and genAI has made web indexes even more valuable. Without access to an index, an AI’s “knowledge” is limited to what an LLM is trained on.

“Biden Is the President”?

While web indexes have long powered search, their role in AI is less appreciated. Indexes are used to “ground” an LLM and ensure it provides up-to-date information. For example, without a search index, an LLM may not be able to answer timely questions like “Who’s the president of the United States?”

“Retrieval augmented generation,” or RAG, is a technique to pay attention to. RAG is important because it combines the reasoning capabilities of an LLM with search’s ability to fetch information, ensuring that LLMs can pull in new information after any training cutoff date. Instead of being retrained on all of this indexed data, an LLM can simply query a web index directly.
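The RAG pattern can be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular product works: the documents and source names are made up, retrieval is simple word overlap rather than a real index query, and the final call to an LLM is left out, so the sketch stops at the grounded prompt.

```python
import re

# Hypothetical index snippets (illustrative data only), each with a source label.
DOCUMENTS = [
    {"source": "news-site.example", "text": "The city council approved the new transit budget on Tuesday."},
    {"source": "weather.example", "text": "Heavy snow is forecast for the region this weekend."},
    {"source": "sports.example", "text": "The home team won the championship final last night."},
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query; keep the top k that overlap at all."""
    query_words = tokenize(query)
    scored = []
    for doc in documents:
        overlap = len(query_words & tokenize(doc["text"]))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

def build_prompt(query, retrieved):
    """Assemble a prompt that asks the model to answer from cited sources only."""
    context = "\n".join(
        f"[{i}] ({doc['source']}) {doc['text']}" for i, doc in enumerate(retrieved, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"Sources:\n{context}\n"
        f"Question: {query}"
    )

question = "Is snow forecast this weekend?"
retrieved = retrieve(question, DOCUMENTS)
prompt = build_prompt(question, retrieved)
print(prompt)  # This prompt would then be sent to an LLM of your choice.
```

Because the retrieved text arrives with its source attached, the model can both answer from fresh information and point to where that information came from.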

RAG is less expensive, and importantly for users, can also reduce the incidence of hallucinations and simultaneously increase cited reference sources, addressing some of the “buyer beware” messaging around LLMs. There is a huge opportunity here for innovation in search, but also a big risk when it comes to lock-in and market competition.

LLM Competition v. Index Monopoly

The two most reliable and extensive web indexes available to license are from Google and Microsoft. There are few other options. This stands in sharp contrast to the number of LLMs available: there is lots of differentiation among LLMs, but much less among the accessible web indexes they can leverage.

That means a company like Google can carry its search monopoly over into AI. Many critics of ongoing antitrust investigations into Google point to AI as evidence of robust competition, but that may miss the forest for the trees. Take, for example, “Grounding with Google Search,” a Gemini product that already demonstrates how a web index can improve the accuracy and recency of responses in a separate AI service.

The fiercest competition in search will be over who can combine AI and a web index into the user experience that leverages the best of both. That’s the future of search.

Google’s sheer size gives its index and search results a leg up, but there is ample evidence that Google’s search results are getting worse. What explains this discrepancy? It is not that Google’s index is getting worse, but rather that the company’s own decisions to maximize profits and attention may have impacted what is shown and elevated in search results. 

A Remedy Proposal

This has implications for how everyone experiences search, and it suggests that other entities — DuckDuckGo, AI companies, and startups — have an opportunity to create differentiated and unique search experiences that are tailored to different internet users. We need to decouple Google’s index from Google’s search results. I am not the only person to say this. 

No less than the Department of Justice has proposed making Google’s Search Index available to competitors in order “to remove barriers to entry, pry open the monopolized markets to competition, and deprive Google of the fruits” of its monopoly. (Unfortunately, getting into the minutiae of licensing a web index and discussing RAG is not nearly as headline-grabbing as “break-up Google.”)

This remedy also doesn’t punish Google so much as it prioritizes unlocking competition. If any search provider or AI startup could gain access to Google’s search index at marginal cost, it would allow different services to prioritize privacy, design, and UX/UI customization with the same high-quality information.

Wrapping Up

Imagine a search engine that allows you to tweak its ranking algorithms to prioritize different factors:

Perhaps you want results that emphasize privacy, or that highlight diverse and underrepresented voices, or that surface the most up-to-date information on a breaking news story.

Rather than being confined to filter bubbles created by opaque algorithms, users could customize their search experience to align with their own values and priorities.

Or picture search tools that give people unprecedented transparency into how results are ranked and ordered. Rather than accepting a search engine’s decisions at face value, anyone could dive into the underlying logic and see exactly why certain results were surfaced. This level of algorithmic accountability could foster greater trust and empower users to make more informed decisions about the information they consume.
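The two ideas above, user-tunable ranking and visible scoring logic, can be sketched together in a few lines of Python. Everything here is hypothetical: the result URLs, the factor names (relevance, privacy, freshness), and the scores are made up, and a real engine would use far richer signals than a weighted sum.

```python
# Hypothetical results with made-up factor scores between 0 and 1.
RESULTS = [
    {"url": "big-portal.example", "relevance": 0.9, "privacy": 0.2, "freshness": 0.5},
    {"url": "indie-blog.example", "relevance": 0.7, "privacy": 0.9, "freshness": 0.4},
    {"url": "live-news.example", "relevance": 0.6, "privacy": 0.5, "freshness": 1.0},
]

def explain(result, weights):
    """Return each factor's contribution (weight * factor score) to the total."""
    return {factor: weight * result[factor] for factor, weight in weights.items()}

def rank(results, weights):
    """Order results by weighted total, attaching the per-factor breakdown."""
    ranked = []
    for result in results:
        breakdown = explain(result, weights)
        ranked.append({
            "url": result["url"],
            "score": sum(breakdown.values()),
            "breakdown": breakdown,
        })
    ranked.sort(key=lambda r: r["score"], reverse=True)
    return ranked

# Two users, two sets of priorities, same underlying results.
privacy_first = {"relevance": 1.0, "privacy": 2.0, "freshness": 0.5}
news_first = {"relevance": 1.0, "privacy": 0.5, "freshness": 2.0}

for name, weights in [("privacy_first", privacy_first), ("news_first", news_first)]:
    print(name)
    for entry in rank(RESULTS, weights):
        print(" ", entry["url"], round(entry["score"], 2), entry["breakdown"])
```

The same three results come back in a different order for each set of weights, and the breakdown shows which factor pushed each result up or down.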

We are on the cusp of a potential revolution in online search. It’s not just the emergence of AI tools but access to robust web indexes that will shape the future of how we find information online. By decoupling Google’s index from its search results, we can unlock a universe of innovative search experiences that empower users to search on their own terms.
