The opening keynote at Black Hat 2023, unsurprisingly, was on the topic of AI. Specifically, the presentation discussed the implications of AI’s large language models (LLMs) for the cybersecurity industry and the wider ecosystem. The presenter kicked off the conversation by talking about the timeline of investments made by Google and Microsoft. The numbers are large, as you would expect: Microsoft alone has invested $13 billion so far.
In fact, that number caused me to digress from the topic of the presentation, and I started wondering why a company would spend $13Bn, and – more importantly – why it is rushing the technology to market when many experts, governments, and industry commentators are urging caution and a slower adoption of AI by society.
AI causing de-monetization of content
There are many uses of AI, and any of them could be part of the reason. The presenter gave an easily understood example: AI used in video conferencing to analyze the video, audio, and materials shared, and then to summarize the meeting in more detail than a transcript would. Is this why you invest $13Bn, or is this a play to own the future search market? Will the word Google stop being a verb?
While society may or may not have an issue with the ethics of AI, I am curious whether the adoption of AI in search causes an issue that de-monetizes the internet for many content providers.
Traditional search engines, such as Bing and Google, index content and use algorithms to determine and deliver what they believe to be the most relevant results in the search engine results page (SERP), and in the process deliver a few sponsored ads at the top. If you are a content producer and have a website nowadays, then your monetization model likely includes advertising, or the content is reserved for subscribers or placed behind a paywall. In either case, you are likely to be reliant, at least in part, on the traffic generated via search: someone clicks a link in the SERP and browses your website content directly.
What happens when a large language model (LLM) is responsible for delivering the answer to a search query, circumventing the need for the SERP? The model has at its disposal all the content that is accessible to the search engine, and that content becomes the data used to train the LLM so it can generate a human-like answer to the query. Thus, we end up with a single answer to the query that may have been formulated using many different content sites, with no attribution of the content used to form the answer, and no option for the content creator to monetize the creation and hosting of the content.
Did this just become less about a technology race and more about how to grab market share for search and to monetize it? Microsoft is among the biggest search engine providers, although most of the market share still belongs to Google. Any impact in a market valued at $225 billion per year is significant, which may explain the investment in AI LLMs. Replacing the familiar list of search results with a single answer means the person creating the query never leaves this new ‘SERP’ page, retaining all the traffic for the search engine provider to monetize directly through ads and the like.
An already pressing issue
We have already seen some similar implications. For example, news content is sometimes displayed directly in the SERP or on social media pages. While the attribution to the content source is displayed, the person generating the query does not need to visit the news site, and thus no advertising or paywall traffic is generated. The Canadian Government has pioneered legislation, Bill C-18, to protect news content creators; it forces the platform using the content to negotiate compensation with the creator, so that creators can monetize the content they produced.
Broaden this issue to all content and remove the attribution. This could cause many content providers, such as sites that have unique niche content, to stop providing good quality information due to a lack of finances. Fast forward ten years and if the LLM is basing its answers on the content available at the time, and the content providers have stopped providing updated relevant content, then the result is going to become less reliable than it is today.
Cybersecurity on the chopping block?
Why is this important to cybersecurity? The reasons are many: a lack of funding may cause website owners to cease updating software or paying to secure their sites, trust could erode when query results return the wrong information, and cybercriminals may start publishing their own content to game the LLM. Importantly, in this transition the plight of the content creator needs to be considered so that the internet remains a source of income, and thus a source of factual and accurate information.