A recent article in The Atlantic makes an interesting point: the web will likely be transformed by the junk that can be created with LLMs such as ChatGPT. This is likely to become a growing problem, and I have already seen cases where current search engines return links to the “word salad” pages described in the article.
The problem extends to local news, which has already been exploited by malicious actors and is now exceedingly easy to fake, according to a Poynter Institute article. Since most people prefer local news relevant to their own area, this opens the door to fake news in many communities. For example, a Michigan university newspaper’s old domain was taken over by someone who is reportedly misrepresenting themselves as the previous university newspaper.
One way to spot a non-AI site
One feature you may have noticed recently on Google is the “…” menu next to each result. The “about the source” page it opens lists, near the bottom, something like:
Site first indexed by Google
More than 10 years ago
Now, if a site is well known and was not created with recent ChatGPT/Bard or other AI tools, it should show as having been first indexed years ago. Second, unless a tool was automated and kept running on a server for years, older sites generally will not be thin, low-quality content. Spammy websites are usually abandoned after a few years, once a domain or software glitch breaks them, because there are no real people or developers behind them to do the maintenance. Of course, there is the case mentioned above of the university whose old newspaper domain lapsed and was taken over; in that situation, looking up an online WHOIS history could hint at the change in domain registrar.
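For the curious, here is a minimal sketch of how you could pull a domain’s current registration data yourself, using only Python’s standard library and the plain WHOIS protocol (RFC 3912). The server name used below, whois.verisign-grs.com, answers for .com/.net domains only; other TLDs have their own servers, and spotting a past ownership change (as with the lapsed university newspaper domain) would still require a historical WHOIS service, which this does not provide.

```python
import socket

def whois_query(domain: str, server: str = "whois.verisign-grs.com") -> str:
    """Send a raw WHOIS query over TCP port 43 and return the text response.

    whois.verisign-grs.com handles .com/.net; other TLDs use other servers.
    """
    with socket.create_connection((server, 43), timeout=10) as sock:
        sock.sendall((domain + "\r\n").encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

if __name__ == "__main__":
    text = whois_query("example.com")
    # Print the fields that hint at the domain's age and current registrar.
    for line in text.splitlines():
        if line.strip().startswith(("Creation Date:", "Updated Date:", "Registrar:")):
            print(line.strip())
```

A recent “Creation Date” or “Updated Date” on a site that claims a long history, or a registrar that does not match older records, is the kind of mismatch worth a second look.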