Widely used AI-powered conversational services like ChatGPT and Google’s Bard have a remarkable ability to mimic human language. They assemble words with the help of a technique called “word embedding,” which organizes words and sentences according to their semantic proximity. This technique produces compelling and often accurate responses when the subject matter is well represented in the corpus the large language model (LLM) was trained on. But when concepts or information are missing from that corpus, the model fills the gap with fabricated, approximate answers that can be plain wrong. We call those answers “hallucinations.”
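To make “semantic proximity” concrete, here is a minimal sketch in Python. It assumes each word is represented by a short list of numbers and treats two words as close when their vectors point in a similar direction (cosine similarity). The three-dimensional vectors and the helper names `cosine_similarity` and `nearest` are invented for illustration; real models learn vectors with hundreds or thousands of dimensions from their training corpus.

```python
import math

# Hypothetical toy embeddings, made up for this example only.
# Real embeddings are learned from data and are far larger.
TOY_EMBEDDINGS = {
    "rain": [0.9, 0.1, 0.0],
    "storm": [0.8, 0.2, 0.1],
    "stock": [0.0, 0.9, 0.3],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, embeddings):
    """Return the other word whose vector is most similar to `word`."""
    target = embeddings[word]
    candidates = (w for w in embeddings if w != word)
    return max(candidates, key=lambda w: cosine_similarity(target, embeddings[w]))


if __name__ == "__main__":
    # "rain" sits much closer to "storm" than to "stock" in this toy space.
    print(nearest("rain", TOY_EMBEDDINGS))
```

In this toy space, “rain” lands next to “storm” rather than “stock.” Note that `nearest` always returns some neighbor, however distant, which loosely mirrors how a model can produce a plausible-sounding answer even when the corpus has little to say on the subject.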
There are other drawbacks to these models. ChatGPT, for example, isn’t a dependable or exhaustive source. The system is trained on data up to 2021, which means that companies that rely on real-time information, like the media, will be working with outdated information. It is also trained on a broad range of internet text that can include biased data and misinformation, and its filters aren’t yet robust enough to reliably identify inappropriate content.
This is a real challenge for media organizations that deal with facts, real-time data, and news: it implies that every single word provided by such a model must be verified by a human.
So, in this context, the question is: How can media companies leverage generative AI technology to accelerate content identification, production and distribution, while maintaining a competitive advantage over search engines?
At a high level, news platforms produce a range of content that can be mapped along a spectrum running from “pure fact” (weather, stocks, sports results, for example) to “pure stories” (political op-eds, interviews, critiques, etc.), with everything else in between. Search engines have long since won the battle of distributing pure facts directly to mobile devices, so audiences often don’t even need to visit news websites for that information.
Publishers remain, for now, the true owners of interesting stories, authorship, passionate debate, and opinions. But the new generation of LLM-based search engines and conversational bots seems to be climbing up the spectrum of information, from pure fact towards “human-sounding” stories, thanks to their ability to mimic human language.
Thus, a new competitive landscape is emerging in which search engines are no longer limited to delivering weather and stock prices.
In light of this new competitive landscape, media organizations cannot wait for the tectonic shift to happen. They must begin experimenting with generative AI today, even if it’s imperfect, unreliable, and untrustworthy. That means building muscle with simple AI-powered workflows for newsrooms, training their teams to use them, getting smart on how to train their own LLMs, and automating content creation and distribution in lower-risk categories, all while keeping search engines at bay.