January 8, 2024
Narrativa’s Automated Content and Google’s Search Algorithm
By Sofía Sánchez González
We’ve read a lot of news lately about how content generated by artificial intelligence might be penalized by Google’s algorithm. This year, the California company released a guide warning that only “useful content” would benefit from its algorithm. They even went further and claimed that content generated by artificial intelligence violates the Google Webmaster Guidelines.
This leads us to ask ourselves several questions. What is “useful content”? How can Google know that an article has been generated by artificial intelligence? What consequences does this have for the automated content produced by Narrativa?
Luckily, we have the answers to these questions.
How does Google identify automated content?
Text generated by language models such as GPT-3 (a language model developed by OpenAI that learns from existing text and provides various ways of composing sentences) is statistically predictable. That is to say, when writing several articles from a text prompt, a model tends to choose the most statistically probable words.
The researchers Bender, Gebru, McMillan-Major and Shmitchell sum it up like this: Text generated by a language model is not grounded in communicative intent because the training data never included sharing thoughts with a listener, nor does the machine have the ability to do that. A language model is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot. Even when the generation parameters are tweaked to make the model sound more “natural,” the final text remains predictable.
Human-written text, by contrast, contains unexpected (in other words, less predictable) words because humans are more creative, and creativity is not one of the strengths of regular language models. Although the creative capacity of these models is often put to the test, the results are not fully satisfactory: GPT-like models tend to go off track and generate text that is not consistent with reality.
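To make the idea of “statistically predictable” concrete, here is a minimal sketch that scores how surprising each word is under a toy bigram model. The tiny corpus, the function names and the add-one smoothing are all assumptions made for illustration; real language models and real detectors are far more sophisticated, and nothing here describes Google’s actual method.

```python
import math
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count single words and word pairs in a token list."""
    unigrams = Counter(tokens)
    bigrams = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        bigrams[prev][nxt] += 1
    return unigrams, bigrams


def mean_surprisal(tokens, unigrams, bigrams):
    """Average -log2 P(word | previous word), using add-one smoothing."""
    vocab_size = len(unigrams) + 1  # +1 reserves probability for unseen words
    total = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        followers = bigrams.get(prev, Counter())
        prob = (followers[nxt] + 1) / (sum(followers.values()) + vocab_size)
        total += -math.log2(prob)
    return total / (len(tokens) - 1)


# Toy training text (an assumption for illustration only).
corpus = "the cat sat on the mat the cat sat on the rug".split()
unigrams, bigrams = train_bigram(corpus)

predictable = "the cat sat on the mat".split()  # follows frequent word pairs
surprising = "the mat sat on the cat".split()   # contains an unseen word pair

low = mean_surprisal(predictable, unigrams, bigrams)
high = mean_surprisal(surprising, unigrams, bigrams)
print(low < high)  # the predictable sentence scores lower surprisal
```

A lower average surprisal means the text stays closer to the most probable word sequences, which is the property attributed above to model-generated text.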
Based on that assumption, Google can detect AI-generated content in two ways:
- Repetitive patterns
- Statistical patterns
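As an illustration of what a “repetitive pattern” signal could look like, the sketch below measures the share of word sequences (n-grams) in a text that occur more than once. The function name, the choice of three-word sequences and the whitespace tokenization are assumptions; this is a simple heuristic, not a description of how Google detects anything.

```python
from collections import Counter


def repeated_ngram_ratio(text, n=3):
    """Fraction of n-word sequences that appear more than once in the text."""
    tokens = text.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # text shorter than n words has no n-grams to compare
    counts = Counter(ngrams)
    repeated = sum(count for count in counts.values() if count > 1)
    return repeated / len(ngrams)


looping_text = "a b c a b c a b c"
varied_text = "one two three four five"
print(repeated_ngram_ratio(looping_text) > repeated_ngram_ratio(varied_text))
```

A ratio close to 1 means almost every sequence is a repeat, while a ratio of 0 means no three-word sequence occurs twice.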
What has changed in 2024?
Now the question for Google is not “Should we ban AI-generated content?” Instead, it is “What should we do with AI-generated content?”
As Google points out, about 10 years ago there was concern about the rise of content mass-produced by individuals. It would not have seemed reasonable to anyone to ban all content generated by individuals as a result. Instead, it made more sense for the company to improve its systems to reward quality content, and that’s precisely what happened.
The new guidelines are as follows: users aiming for a good position on Google Search should strive to create original, high-quality content targeted at people, a task that we at Narrativa are already undertaking.
- The proper use of AI or automation does not violate Google’s guidelines.
- Automation has been used in the editorial world for a long time to create useful content.
- The use of AI doesn’t provide any special benefit to the content. It’s simply content.
Why doesn’t this impact Narrativa’s content?
Firstly, because the content generated by our natural language technology is not statistically predictable. Our artificial intelligence system extracts human-written content that is then clustered, classified and ordered to create a connected graph of small segments. These segments are used to generate natural-sounding sentences by transposing the data present in the structured datasets that Narrativa licenses. Artificial intelligence adds context to the graph using classification algorithms which ensure the right contextualization of the sentence generated by the system. This approach has allowed Narrativa’s automated content to perform very well with regard to SEO for the last eight years, and continues to do so.
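The general idea of a graph of human-written segments filled in from structured data can be sketched with a toy example. Everything below is hypothetical and invented for illustration: the segment texts, the graph edges, the rule standing in for the classification step and the sample record. It is not Narrativa’s actual system, only a small model of the approach described above.

```python
from string import Formatter

# Hypothetical human-written segments with slots for structured data.
SEGMENTS = {
    "open_win": "{team} beat {opponent} {score}",
    "open_draw": "{team} and {opponent} drew {score}",
    "scorer": ", with {scorer} scoring the decisive goal",
    "end": ".",
}

# Hypothetical graph: which segments may follow which.
EDGES = {
    "open_win": ["scorer", "end"],
    "open_draw": ["end"],
    "scorer": ["end"],
}


def slots(template):
    """Names of the data fields a segment needs."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def fillable(template, record):
    """True when the record has a value for every slot in the segment."""
    return all(record.get(slot) is not None for slot in slots(template))


def classify(record):
    """Stand-in for the classification step: pick the opening segment."""
    if record["goals_for"] > record["goals_against"]:
        return "open_win"
    return "open_draw"


def generate(record):
    """Walk the graph from the chosen opening segment to the end segment."""
    node = classify(record)
    parts = []
    while True:
        parts.append(SEGMENTS[node].format(**record))
        if node == "end":
            return "".join(parts)
        # Follow the first edge whose slots the record can fill.
        node = next(n for n in EDGES[node] if fillable(SEGMENTS[n], record))


match = {"team": "Rovers", "opponent": "United", "score": "2-1",
         "goals_for": 2, "goals_against": 1, "scorer": "Diaz"}
print(generate(match))
```

Because the wording comes from human-written segments and the facts come from the data record, the output is constrained by the data rather than sampled word by word from a probability distribution.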
Secondly, the distribution of words in the automated content that Narrativa generates is less probabilistic and more human. Our team of automation editors ensures the quality of our content at every step of development. Their job is to verify its correctness (in terms of what is considered “human”) so that all outputs read as if they were written by a human.
In short, Narrativa’s automated content is useful content for users and, therefore, also for Google. Our narratives have a meaningful purpose: to inform readers about particular subjects based on validated and truthful datasets. In addition, our approach takes full advantage of artificial intelligence language models without neglecting the human element. The combination of both guarantees the generation of human-like outputs at scale.
About Narrativa
Narrativa is an internationally recognized content services company that uses its proprietary artificial intelligence and machine learning platforms to build and deploy digital content solutions for enterprises. The tools in its technology suite, which cover data extraction, data analysis, natural language processing (NLP) and natural language generation (NLG), work together seamlessly to power a lineup of smart content creation, automated business intelligence reporting and process optimization products for a variety of industries.
Contact us to learn more about our solutions!