The challenge of using ChatGPT for search engines

Following the launch of ChatGPT in November 2022, large corporations have been captivated by the idea of building AI into their systems. Microsoft and Google have led the charge, announcing plans to build models such as GPT-3 and Bard into their search engines so that queries can be answered with contextual, humanised responses.

More corporations will likely follow this lead and integrate large language models (LLMs) into search engines. These models, which include GPT-3, have captivated much of the tech world through their ability to interpret and mimic human patterns of speech and writing.

Trained on vast amounts of internet-sourced data and built with billions of parameters, LLMs can pick up on the subtleties of human speech. They are thus capable of interpreting colloquialisms, subtexts, and nuanced questions, and of producing output that seems equally rich to us.

However, while LLMs can understand and produce natural language, it doesn’t follow that they naturally produce value for many organisations at present. Rather, the very size and scale of LLMs often make them unsuitable for work in fields where factual accuracy is the primary concern. That includes many online search queries.

LLMs don’t perform well in fields that require specialist knowledge. In fact, they don’t do well at all in any revenue-generating role where factual reliability is required. Why is this the case? And what does this mean for the use of AI in business?

LLMs: a problem of scale

The factual weakness of LLMs comes down to the very quality that allows them to interpret and imitate the speech of laypeople: the scope of data they’re trained on.

To train LLMs, teams at organisations like OpenAI or Google scrape countless millions of examples of text from across the open internet. This mix covers just about every type of content available on the public web and across nearly every conceivable topic. From this, LLMs can develop an idea of the language used by the average person.

But the problem manifests once you move into a niche field. Even before area-specific jargon is introduced, specialist fields often give words precise definitions that differ from their everyday use. Nor is it only definitions that differ: the relationships between concepts and terms in one discipline may be very different from those between the same concepts and terms in another.

As a result, LLMs that are posed requests in a specialist field often end up falsely analogising from an unrelated area, using the wrong definitions when sourcing information, or outright misunderstanding the question being asked of them.

Pivoting to smarter models for search

The big problem for LLMs in search is that many aren’t designed to handle requests for specialist or niche knowledge. So what room is there for AI in the world of search?

The answer lies in smart language models. These are models trained on curated, high-quality datasets, alongside scientific content, and targeted from the outset at a particular business context or field of expertise. Furthermore, there is a strong focus both on the factual accuracy of the results generated and on citing the sources used to arrive at those results.

These smart language models stand in strong contrast to the LLMs currently being used for search, where there is no guarantee of factual correctness and no citing of sources. On top of this, the internet operates with links, ranks and ads, which further complicate the analysis that LLMs and search depend on.

There are many lessons LLMs can take from smart language models. If explainability and accuracy are prioritised, the impressive abilities of LLMs can be turned to disrupting traditional search engines.

Source: Information Age