Google Addresses Website Ranking Drop Caused by Googlebot Crawl Overload

A recent case brought to Google’s attention involved a website that experienced an overwhelming surge in requests from Googlebot, leading to a significant drop in search visibility. The site was hit with millions of crawl requests targeting non-existent URLs, with one URL alone receiving over 2.4 million hits in a single month, a volume comparable to a DDoS attack.

Background: Removal of NoIndex Pages and Use of 410 Status Code

The affected publisher had previously removed around 11 million pages that were never meant to be indexed. These pages were intentionally taken down and now return a 410 Gone HTTP status code, which tells crawlers that the content is permanently gone and unlikely to return. This differs from a 404 Not Found, which simply indicates a missing page without saying whether the removal is permanent.
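For illustration, here is a minimal sketch of how a Next.js site (the stack mentioned later in this piece) might serve a 410 for a removed section via middleware. The path pattern is purely hypothetical and not taken from the affected site.

```ts
// middleware.ts — minimal sketch of returning "410 Gone" for removed URLs
// (the path prefix below is a hypothetical example, not the publisher's)
import { NextRequest, NextResponse } from 'next/server';

const REMOVED_PATH_PREFIX = '/removed-section/';

export function middleware(request: NextRequest) {
  const { pathname } = request.nextUrl;

  // 410 signals that the page is gone for good, unlike a generic 404
  if (pathname.startsWith(REMOVED_PATH_PREFIX)) {
    return new NextResponse(null, { status: 410 });
  }

  return NextResponse.next();
}

export const config = {
  // Limit the middleware to the removed section to avoid extra overhead
  matcher: '/removed-section/:path*',
};
```

By default a missing route would simply return a 404; explicitly serving a 410 is what tells crawlers the removal is intentional and permanent.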

Despite these measures, Googlebot continued crawling the removed URLs aggressively for months, prompting concerns about crawl budget waste and potential negative impacts on SEO performance.

User Concerns Shared With Google’s John Mueller

In a follow-up discussion, the site owner shared that within just 30 days, they received 5.4 million Googlebot requests for nonexistent URLs, including 2.4 million hits on one specific URL with query parameters:
https://example.net/software/virtual-dj/?feature=...

They also noted a visible drop in search rankings during this time and suspected a link between the excessive crawling and their reduced visibility.

The issue originated when the URLs were briefly exposed via a Next.js-generated JSON payload, even though they weren’t linked anywhere on the site. The team corrected this by changing how features are handled (using a different query parameter, ?mf) and added that new parameter to robots.txt to block crawling.
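As an illustration of that kind of rule, a Next.js App Router project can emit robots.txt from a metadata route. The sketch below assumes that setup, and the disallow patterns for ?feature= and ?mf are shown only as examples of blocking parameterized URLs, not as the publisher's actual configuration.

```ts
// app/robots.ts — minimal sketch of blocking query-parameter URLs via robots.txt
// (assumes a Next.js App Router project; patterns are illustrative)
import type { MetadataRoute } from 'next';

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: '*',
        // Matches URLs where the query string starts with these parameters
        disallow: ['/*?feature=', '/*?mf='],
      },
    ],
  };
}
```

Googlebot supports wildcard matching in Disallow rules, so a pattern like /*?feature= covers any path that carries that leading query parameter.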

Their goal was to stop the flood of crawl requests affecting server logs and possibly harming site performance or SEO.

Google’s Official Response From John Mueller

John Mueller of Google confirmed that it’s standard behavior for Googlebot to revisit URLs that once existed, even if they now return 404 or 410 responses. Google does this to check whether pages have been restored, especially since removals are sometimes accidental.

He reassured the publisher that having many such URLs isn’t inherently problematic and that disallowing them in robots.txt is a valid approach to reduce crawl load.

However, he issued a caution about implementing this fix:

  • Check whether any frontend JavaScript or JSON files still reference the ?feature= URLs, as blocking them could break rendering (a quick way to scan for such references is sketched after this list).
  • Use Chrome DevTools to simulate what happens when those URLs are blocked.
  • Monitor Google Search Console for Soft 404 errors, which might indicate unintended consequences like broken page elements or failed indexing.
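One low-effort way to act on the first point is to search the built assets for the old parameter before blocking it. The sketch below assumes a Node/TypeScript toolchain and a default .next build directory; both are assumptions rather than details from this case.

```ts
// check-references.ts — minimal sketch: scan built assets for ?feature= URLs
// before disallowing them in robots.txt (paths are assumptions; adjust to
// your actual build output directory)
import { readdirSync, readFileSync, statSync } from 'node:fs';
import { join, extname } from 'node:path';

const BUILD_DIR = '.next'; // typical Next.js build output
const NEEDLE = '?feature=';

// Recursively collect files whose contents mention the old parameter
function scan(dir: string): string[] {
  const hits: string[] = [];
  for (const entry of readdirSync(dir)) {
    const fullPath = join(dir, entry);
    if (statSync(fullPath).isDirectory()) {
      hits.push(...scan(fullPath));
    } else if (['.js', '.json', '.html'].includes(extname(fullPath))) {
      if (readFileSync(fullPath, 'utf8').includes(NEEDLE)) {
        hits.push(fullPath);
      }
    }
  }
  return hits;
}

const matches = scan(BUILD_DIR);
console.log(
  matches.length
    ? `Found ${NEEDLE} references in:\n${matches.join('\n')}`
    : `No ${NEEDLE} references found; blocking looks safe to test.`,
);
```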

Mueller specifically warned that blocking a resource used by client-side rendered pages (like JavaScript-powered content) could prevent important pages from rendering correctly or being indexed at all.

Final Advice: Dig Deeper Before Concluding

While the aggressive crawling seems like the obvious cause of the ranking drop, Mueller advised the site owner not to jump to conclusions. He encouraged further investigation into other possible technical issues or changes that might better explain the decline in visibility. It’s a classic reminder: just because something seems to be the problem doesn’t mean it actually is.

This situation highlights the importance of careful technical SEO audits, especially after large-scale site changes or exposure of unintended URLs.
