Let me catch you up. A few weeks ago, Search Engine Journal released an article titled “Google’s New Algorithm Creates Original Articles From Your Content”. It summarizes a new algorithm Google has researched that extracts a summary of your content and then rewrites it to create something “original”. Scary news for Wikipedia.
Why should I care?
Google has for years told webmasters to focus on their users by creating unique, valuable content that solves searchers’ problems. If Google is able to programmatically extract the most useful pieces of your content, rewrite them in its own words, and display the result to users, then it can drain traffic from your website while avoiding copyright infringement. Because the text is “original”, Google owns it and doesn’t need to send any traffic to your website. If you own a website or enjoy a website’s content for free, this could be bad for you, and if you’re an SEO consultant, you probably don’t need to be told why this is scary.
The most likely initial applications are featured snippets and voice search responses. If you are ranking for featured snippets, you are probably getting a strong, healthy flow of free traffic that could be cut off when this algorithm is introduced. Voice search results are also based on featured snippets, so you would lose that traffic as well.
Why would Google do this?
It’s about retaining searchers on google.com instead of sending them off to another website. To understand this, you have to understand Google’s revenue streams and just how massive a fractional increase in their revenue would be. Let’s talk about how Google makes money.
Google sells ads. They have various ad formats, but the ones most people recognize are the green text ads at the top and the shopping ads. Whenever someone clicks on an ad, Google gets paid, and they have a couple of tactics for growth.
1. They can increase the cost of those ads.
2. They can increase the number of people clicking on those ads.
AdWords operates as an auction. Google can manipulate the price people pay and drive up bids by changing the format of their ads. Not long ago I wrote about Local Service Ads and how Google is taking a bite out of that category. They created another ad format and reduced the number of paid search ads displayed at the top of the page from four to two. Demand remained the same, so everyone’s bids (which are automated by Google’s platform) increased. Cost per lead skyrocketed for advertisers.
The other option is to increase the number of people clicking on ads. Let’s talk about how that relates to featured snippets. Right now the click through rate for featured snippets is very high, somewhere around 30% on average, though it varies from snippet to snippet. If Google were to eliminate the click through for that feature, they would retain that 30% of searchers on their own website, giving those users another chance to click on an ad instead. Google’s total ad click through rate is somewhere between 2% and 3%, and they are a multibillion-dollar company. Keeping users on google.com instead of sending them to another website is a known strategy that Google uses, and the search community has been tracking it for several years now. To give you an idea of how much impact a change like this could have, an increase of one tenth of one percent on ad click through rate would increase their revenue by several billion dollars each year.
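To make that arithmetic concrete, here is a rough sketch of the calculation. The 2.5% baseline is the midpoint of the range above. The annual ad revenue figure is a hypothetical placeholder, not a reported number, and the function assumes revenue scales linearly with ad clicks, which is a simplification.

```python
def revenue_lift(baseline_ctr, ctr_increase, annual_ad_revenue):
    """Estimate extra yearly revenue from a small ad CTR increase.

    Assumes revenue scales linearly with ad clicks (a simplification).
    """
    relative_lift = ctr_increase / baseline_ctr
    return annual_ad_revenue * relative_lift


# Hypothetical inputs for illustration only.
BASELINE_CTR = 0.025        # midpoint of the 2% to 3% range
CTR_INCREASE = 0.001        # one tenth of one percent
ANNUAL_AD_REVENUE = 100e9   # placeholder figure, not a reported number

extra = revenue_lift(BASELINE_CTR, CTR_INCREASE, ANNUAL_AD_REVENUE)
print(f"Relative lift: {CTR_INCREASE / BASELINE_CTR:.1%}")
print(f"Extra revenue: ${extra / 1e9:.1f} billion per year")
```

The point is that a tenth of a percentage point sounds tiny, but on a base of 2% to 3% it is a relative increase of several percent in ad clicks.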
This graph shows that over time Google is resolving more of its users’ searches without the need for a click to a website, and that the click through rate on ads has increased correspondingly. Credit to Rand Fishkin, who compiled this while he was at Moz.
Why do they need a new algorithm? Why wouldn’t they just stop linking to websites now?
Two words: copyright infringement. If they were to display the text from your website without linking to it (citing the source) they would be making money off of other people’s content at scale, while demonetizing the content creators. To put it mildly, there would be a lot of upset companies with big lawyers getting real litigious. This algorithm potentially removes that risk by keeping within the letter of the law, if not its spirit.
So featured snippets and voice search, is that all?
Nope. This technology is dangerous because of its wide applicability. The research is titled “Generating Wikipedia by Summarizing Long Sequences”. Wikipedia is one of Google’s dominant search results due to the high quality and quantity of peer-reviewed information. For Wikipedia this is scary: Google would essentially scrape the content from their site, rewrite it, and axe their traffic. But they’re not the only ones. Most search specialists who focus on driving traffic to websites have at least a passing familiarity with the concept of searcher intent. I like to use this one from STAT because they’re awesome:
Any website that does a good job of informing users in the research phase (top of funnel) could have their content scraped by Google and repurposed to improve Google’s bottom line by increasing ad CTR, which we displayed in that graph above.
With this algorithm, Google could do things like extract summaries from the top 10 search results, paraphrase them into its own version of the content on its own webpage that loads incredibly fast, and give itself prime placement. Maybe include a few ads on that webpage as well. Think about it this way: over 50% of all searches are done on mobile devices, which use a vertical screen with limited real estate. So, if there are 4 text ads, a shopping ads carousel, a “wiki” generated by Google, and a webpage generated by Google, then you have about 7 results that are Google-owned web properties driving people towards ads. You would need to scroll down 4 “screens” to get to the organic (free) results. In other words, Google isn’t limited to “generating Wikipedia”. Google could rewrite the entire internet, and the prize would be billions.
How close are we to Google implementing this?
We’re a lot closer than the article from Search Engine Journal indicates. They’re a great publication, but after reading through the research (which is real exciting) I think they missed a couple of key points. First, the SEJ article mentions that about 30% of the facts generated by this algorithm are fake. That’s not entirely accurate. The research indicates that they have reduced that rate from 30% to 6% by encoding facts. That was a big improvement, but to top it off, this is a machine learning algorithm. As they train the algo, it will improve, and with Google’s massive dataset and experience in this area, they may one day activate it and it will take over instantly. Google improved its translation service using AI literally overnight in this manner, so there is some precedent for it.
I can’t give a deadline, and Google may decide not to implement this because there is another layer of risk for management to consider.
If they suddenly flip the switch on this and shut off traffic from featured snippets to websites, they could find that copyright law is rewritten to restrict Google. Even if they tried to introduce this slowly so it wasn’t a huge shock all at once, they would still face scrutiny from regulators.
There is also the issue of scale. Because Google is so massive and operates at such a large scale, this type of implementation would demonetize many content creators who rely on traffic from Google. The incentive to create high quality content could diminish dramatically across the entire internet, because it becomes much less profitable if Google is able to copy your content at scale and claim it as their own. This could cause the internet to stagnate, with content being updated less frequently and with less effort put into it. I can’t imagine this scenario would be profitable for Google in the long run either.
And in the era of fake news, 6% is too high. Watchdogs out there would have headlines for months if Google started using this algorithm and accidentally declared Obama the reptile king in voice search or released news declaring Donald Trump is being impeached through a misinterpretation.
Is there anything this doesn’t affect?
If you’re operating in the local search space, you’re probably safe for now. Local search is at the transactional level of searcher intent. If a searcher is searching “running shoes near me”, they don’t need or want a summary of website content from Google. They want the map pack so they can go and buy or browse running shoes at the store. This algorithm doesn’t quite fit there.
In fact, many searches at the transactional level are probably safe. Think of searches like “buy camp stove” or “best painters near me”. Those are probably not going to be impacted by this algorithm. But if you operate a recipe or news site and you rely on traffic from Google, buckle up.
What do we do to keep traffic coming to our website?
Content takes many forms and this algorithm only appears to apply to text currently. Images, videos, and tools are still great ways to generate traffic that Google would have a hard time copying programmatically for both legal and technological reasons. Those are good areas to invest in.
Storytelling and brand awareness are crucial. You can’t summarize a brand or a story, and you can’t paraphrase it either. Someone searching for “sherwin williams paint colors” does not want a paraphrased chunk of text from Google. They want the brand site with the current paint colors. And storytelling gives you a competitive edge that a robot can’t match.
Let’s take two examples from the DIY space: Ana White and wikiHow. WikiHow gives step-by-step instructions for many basic to advanced DIY projects, and includes their own artwork to accompany the instructions. Their text could probably be paraphrased, but their images are so important for the searcher that they would be unlikely to lose all traffic. They might lose a chunk on the projects that don’t require an image as much, like “how to replace a fluorescent light bulb”.
Ana White would be virtually unaffected. She’s an Alaskan DIY blogger/interior designer who has built a community around her projects. She maintains an active social media presence, earns traffic from Google, has a strong brand, and maintains online forums where people share their own projects inspired by the ones she posts. She engages regularly with her fans, and each of her projects is accompanied by original, unique content that is written with scraps of her life and family woven into the project.
It’s no longer a manual (like wikiHow), it’s a story that connects with her readers. If you actually research her website you’ll see her traffic is fairly diverse, but her top searches all start with her name in various spellings. She is a master of storytelling + brand, whether she really knows it or not.
The 15 minute risk assessment
Let me be clear: there is no guarantee that this algorithm will be used at all, or used the way that we think it will. However, I think it is very likely that it will be, based on statements from upper management to the press about Google’s mission and some changes at their corporate headquarters. The best move right now for companies who know that they get a lot of traffic from search is to do a risk assessment.
Go to Google Analytics and follow this path:
- Behavior > Site Content > Landing Pages
- Change Segment from “All Users” to “Organic Traffic” and hit Apply.
- Set your date range and click “export” at the top of the page (Excel or CSV).
Note: if you want to see more than 10 rows of data, go to the bottom of the page, click “show rows”, and select the number of rows you want to export.
This gives you a list of all the pages that receive traffic from search engines to your website. Try to determine how much of that traffic is disrupt-able. If you have a blog for example, that tends to rank better for those informational/research type searches, and that traffic is probably at higher risk.
Identify where the majority of your traffic lands. In this case I’m using a client who gets 70% of their traffic from their top ten pages. Of those pages, 1 is an FAQ, 1 is the homepage, and 8 are blog posts. I’m going to assess the risk by comparing pages and conversions.
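If you would rather script this step than eyeball it, a minimal sketch might look like the following. The page paths and session counts are made up for illustration; in practice they would come from the Landing Pages export.

```python
def top_page_share(sessions_by_page, top_n=10):
    """Return the fraction of total sessions that land on the top_n pages."""
    total = sum(sessions_by_page.values())
    if total == 0:
        return 0.0
    top = sorted(sessions_by_page.values(), reverse=True)[:top_n]
    return sum(top) / total


# Made-up sample data; real values come from your own export.
sample = {"/": 500, "/faq": 120, "/blog/a": 80, "/blog/b": 300}
share = top_page_share(sample, top_n=2)
print(f"Top 2 pages: {share:.0%} of organic sessions")
```

A high share concentrated in a handful of pages means you only need to review those pages to understand most of your exposure.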
Select the data set, go to the Insert tab in Excel, and click PivotTable. Set your rows to landing page and your values to goal completions. You should be able to see how many people converted after landing on each page from search. For this client, the vast majority (99%) of the conversions from the top 10 pages came from the homepage. This makes sense because we have a different conversion path for the other pages.
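The same pivot can be sketched in a few lines of Python if you prefer working from the CSV export. The column names “Landing Page” and “Goal Completions” are assumptions here, so match them to the header row of your own file, and the sample rows are invented.

```python
import csv
import io
from collections import defaultdict


def conversions_by_landing_page(csv_file, page_col="Landing Page",
                                goal_col="Goal Completions"):
    """Sum goal completions per landing page, like the pivot table above.

    The column names are assumptions; match them to your export's header row.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # Exports may format numbers with thousands separators, e.g. "1,204".
        totals[row[page_col]] += int(row[goal_col].replace(",", ""))
    # Highest-converting pages first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up rows standing in for a real export.
sample_export = io.StringIO(
    "Landing Page,Goal Completions\n"
    "/,99\n"
    "/blog/how-to-example,1\n"
    "/,\"1,000\"\n"
)
for page, goals in conversions_by_landing_page(sample_export):
    print(page, goals)
```

With a real file you would pass `open("export.csv", newline="")` instead of the in-memory sample; the filename is a placeholder.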
From here it’s easiest to do a gut check. You can go into Google Search Console and set your filters to look at queries for a specific page, download those queries, and check to see if those queries trigger a featured snippet that you rank for. You can get really organized with a spreadsheet and track all this, marking whether it’s an informational query or a featured snippet and get super technical, but you should be able to just look at the page and say yeah, Google could truncate and summarize this, and the searcher’s needs would be satisfied.
So in the above I can see this page is ranking for a bunch of “how to…” queries, the page itself is short and descriptive, and by adding CTR as a filter I can even guess which queries are generating featured snippets by their very high percentages, some over 30%. When I visit the page, I’m able to just glance through it and come away with the same answer a summary would give me. I would mark this in a spreadsheet as “high risk” for losing traffic.
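If you do want to get organized about it, the gut check above can be roughed out as a small script. This is only a sketch of one possible heuristic: the question-word prefixes, the 30% CTR cutoff, and the risk labels are my assumptions for illustration, and the queries are made up.

```python
# Assumed markers of informational intent; extend for your own niche.
INFORMATIONAL_PREFIXES = ("how ", "what ", "why ", "when ", "can ", "does ")
# Assumed cutoff, loosely based on the roughly 30% snippet CTR mentioned above.
SNIPPET_CTR_THRESHOLD = 0.30


def assess_query(query, ctr):
    """Return a rough risk label for one query and its click through rate."""
    informational = query.lower().startswith(INFORMATIONAL_PREFIXES)
    likely_snippet = ctr >= SNIPPET_CTR_THRESHOLD
    if informational and likely_snippet:
        return "high"
    if informational or likely_snippet:
        return "moderate"
    return "low"


# Made-up queries for illustration.
for query, ctr in [("how to patch drywall", 0.34),
                   ("what is primer", 0.08),
                   ("acme painting contractors", 0.12)]:
    print(f"{query!r}: {assess_query(query, ctr)}")
```

A script like this only narrows down which pages to look at. You still need to read the page and decide whether a summary would satisfy the searcher.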
For this client, Google can summarize 9 of the top 10 pages, diverting a large chunk of traffic. However, the homepage is where nearly all the conversions come from, and that is mostly brand traffic and variations of keywords for the different services they offer. This traffic is likely to be unaffected by the new algorithm.
The risk assessment for this client is moderate. They won’t lose all of their traffic and their converting traffic will mostly remain intact, but they will lose opportunities higher up the funnel in the worst case scenario.
This assessment can be done within 15 minutes for free as long as you have Google Analytics and Google Search Console set up. If all you have is Google Analytics, you could do this with just that, but it would be a little more of a gut check than with Search Console.
Hope this helps, feel free to reach out on either social media or contact me through the site. Thanks for reading!