🔴 Issue — noindex not visible to Wave 1

This page has no <meta name="robots"> tag in its raw HTML. The noindex directive is injected by JavaScript after the page loads. Check view-source and you will not find a robots meta tag. Open browser DevTools → Elements → <head> and you will see it, injected after the JavaScript runs. To verify, open Command Prompt (Windows) or Terminal (Mac) and run: curl -L https://sallymills.com/indexing/noindex-javascript/ | grep -i noindex. None of the matching lines is a <meta name="robots"> tag, so the response contains no noindex directive; the word only appears in places such as the page title, the canonical URL, and the inline script. (Windows: replace | grep -i noindex with | findstr /i noindex.)

What this demonstrates

This page intentionally omits a <meta name="robots" content="noindex"> tag from its raw HTML. Instead, the noindex directive is injected into the <head> by a DOMContentLoaded JavaScript handler. The intent is to noindex this page — but the directive isn't present when Googlebot fetches the raw HTML on Wave 1.

This is the same pattern as the JavaScript SEO: noindex Loaded via JavaScript demo — the same failure mode shown from an indexing-controls perspective rather than a rendering perspective.

Why it matters

A developer or marketer wants to noindex a page. They add a noindex tag via Google Tag Manager, a JavaScript CMS hook, or a client-side component. In the browser it looks correct — inspect the <head> in DevTools and the tag is there. A quick browser check gives the impression the directive is working.

But Googlebot's Wave 1 crawl sees no noindex directive. The raw HTML has none. Wave 1 may queue this page for indexing before Wave 2 runs the JavaScript and sees the noindex. Once a page is in the index, it takes additional crawl cycles to remove it — and the window for accidental indexation can be hours to days.

This is a common mistake with Google Tag Manager — using a GTM tag to inject noindex onto pages. GTM fires after the page loads, meaning the directive is always added client-side. It is not a reliable method for noindexing a page.

The code

The JavaScript that injects the noindex tag — and why it doesn't appear in view-source.

<!-- Raw HTML <head>: NO noindex tag here -->
<head>
  <meta charset="UTF-8">
  <title>Indexing: noindex via JavaScript</title>
  <link rel="canonical" href="https://sallymills.com/indexing/noindex-javascript/">
  <!-- <meta name="robots"> is NOT here: it is added by JavaScript -->
</head>

<!-- JavaScript injects noindex after DOM load -->
<script>
  document.addEventListener('DOMContentLoaded', function() {
    var meta = document.createElement('meta');
    meta.name = 'robots';
    meta.content = 'noindex';
    document.head.appendChild(meta);
  });
</script>

<!-- Correct implementation: put it in raw HTML instead -->
<meta name="robots" content="noindex">

What Google does

  1. Googlebot crawls this page and receives the raw HTML response.
  2. Wave 1: scans the <head> for indexing directives. No <meta name="robots"> tag is found.
  3. Wave 1: no noindex directive detected. This page may be queued for indexing.
  4. Wave 2: JavaScript executes, the noindex meta tag is injected into <head> by the script.
  5. Wave 2: Google now sees the noindex and may remove the page from the index — but only after it may have already been indexed from Wave 1.
  6. The timing gap between Wave 1 and Wave 2 creates a window where the page can appear in search results despite the developer's intent to noindex it.

How to detect it

  • view-source: Press Ctrl+U (Windows) or Cmd+Option+U (Chrome and Safari on Mac; Firefox on Mac uses Cmd+U) and search the page source for a <meta name="robots"> tag. You will not find one. Then open DevTools → Elements → inspect <head>: the tag is there, injected by JavaScript after the page loads. The discrepancy between view-source and DevTools is the diagnostic signal.
  • curl: Open Command Prompt (Windows) or Terminal (Mac) and run: curl -L https://sallymills.com/indexing/noindex-javascript/ | grep -i 'noindex' → None of the matching lines is a <meta name="robots"> tag, so the raw HTTP response contains no noindex directive. Any matches come from other places the word appears, such as the page title, the canonical URL, and the inline script. Compare this to the meta robots demo, where the same command returns the noindex tag. (Windows: replace | grep -i 'noindex' with | findstr /i noindex.)
  • Google Search Console: This page may appear as indexed despite the developer's intent. It will not appear under "Excluded by 'noindex' tag" based on Wave 1 alone. If Wave 2 processes it and finds the JS noindex, GSC may later show it as excluded, but this is inconsistent and unreliable.
  • Screaming Frog: With JavaScript rendering OFF, the Meta Robots column shows blank or "index", with no noindex directive visible. With JavaScript rendering ON, the Meta Robots column shows "noindex". The difference between these two modes on the same URL confirms a JavaScript-injected directive.
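
The raw-versus-rendered comparison behind these checks can also be scripted. The sketch below, written for Node.js, is a rough illustration rather than a finished audit tool. extractRobotsDirectives and classifyNoindex are hypothetical helper names, the regex matching is a shortcut that a real HTML parser would handle more robustly, and the rendered HTML is assumed to come from a headless browser or a copy of the DOM taken from DevTools.

```javascript
// Sketch: compare robots meta directives in raw HTML vs. rendered HTML.
// Function names are hypothetical helpers, not part of any library.

// Return the lower-cased directives from every <meta name="robots"> tag.
function extractRobotsDirectives(html) {
  // Ignore commented-out markup and inline scripts so that only live tags count.
  const live = html
    .replace(/<!--[\s\S]*?-->/g, '')
    .replace(/<script\b[\s\S]*?<\/script>/gi, '');
  const directives = [];
  const metaTags = live.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    const name = /\bname\s*=\s*["']?([^"'\s>]+)/i.exec(tag);
    const content = /\bcontent\s*=\s*["']([^"']*)["']/i.exec(tag);
    if (!name || !content || name[1].toLowerCase() !== 'robots') continue;
    for (const part of content[1].split(',')) {
      const directive = part.trim().toLowerCase();
      if (directive) directives.push(directive);
    }
  }
  return directives;
}

// "none" is shorthand for "noindex, nofollow", so it counts as noindex too.
function hasNoindex(html) {
  const directives = extractRobotsDirectives(html);
  return directives.includes('noindex') || directives.includes('none');
}

// Classify where the noindex directive comes from, if anywhere.
function classifyNoindex(rawHtml, renderedHtml) {
  if (hasNoindex(rawHtml)) return 'raw-html';
  if (hasNoindex(renderedHtml)) return 'js-injected';
  return 'none';
}

// Example input modelled on this page: no robots tag in the raw <head>,
// one present in the DOM after the script has run.
const rawHtml = `<head>
  <title>Indexing: noindex via JavaScript</title>
  <!-- <meta name="robots"> is NOT here -->
</head>`;
const renderedHtml = `<head>
  <title>Indexing: noindex via JavaScript</title>
  <meta name="robots" content="noindex">
</head>`;

console.log(classifyNoindex(rawHtml, renderedHtml)); // js-injected
```

A result of js-injected corresponds to the failure mode on this page. A result of raw-html means the directive is present in the first HTTP response.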

How to fix it

Place the noindex directive in the raw HTML <head> — not in JavaScript. Remove any GTM tags, JavaScript handlers, or client-side CMS hooks that are adding <meta name="robots" content="noindex">. Add the tag directly to the HTML template or CMS configuration so it is present in the first HTTP response.

If you are using GTM to manage noindex across many pages, move that logic to the server side — either in your CMS template layer or your web server configuration. GTM is appropriate for analytics tags and conversion tracking, not for SEO directives that must be present before JavaScript executes.
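
As one possible shape for that server-side logic, here is a minimal Node.js sketch. It assumes a hand-rolled server with a hypothetical PAGES route table; renderPage and handleRequest are illustrative names, and a real CMS or web server would expose its own template or header configuration instead. The point it illustrates is that the noindex decision is made before the response is sent, so the directive appears both in the raw HTML and as an X-Robots-Tag header.

```javascript
// Sketch: emit noindex on the server so it is part of the first HTTP response.
// PAGES, renderPage and handleRequest are hypothetical names for illustration.

const PAGES = {
  '/indexing/noindex-javascript/': {
    title: 'Indexing: noindex via JavaScript',
    body: '<h1>Demo page</h1>',
    noindex: true,
  },
  '/': { title: 'Home', body: '<h1>Home</h1>', noindex: false },
};

// Build the HTML document; the robots meta tag is written into the raw <head>.
// Titles and bodies are trusted, hard-coded strings here, so nothing is escaped.
function renderPage(page) {
  const robotsMeta = page.noindex
    ? '  <meta name="robots" content="noindex">\n'
    : '';
  return (
    '<!DOCTYPE html>\n<html>\n<head>\n' +
    '  <meta charset="UTF-8">\n' +
    `  <title>${page.title}</title>\n` +
    robotsMeta +
    `</head>\n<body>\n${page.body}\n</body>\n</html>\n`
  );
}

// Request handler compatible with Node's http.createServer.
// It matches req.url exactly, so query strings are not handled in this sketch.
function handleRequest(req, res) {
  const page = PAGES[req.url];
  if (!page) {
    res.writeHead(404, { 'Content-Type': 'text/plain; charset=utf-8' });
    res.end('Not found');
    return;
  }
  const headers = { 'Content-Type': 'text/html; charset=utf-8' };
  // The header carries the same directive without relying on the HTML at all.
  if (page.noindex) headers['X-Robots-Tag'] = 'noindex';
  res.writeHead(200, headers);
  res.end(renderPage(page));
}

// To try it locally (the port is arbitrary):
// require('node:http').createServer(handleRequest).listen(8080);
```

With this running, the curl check from the detection section would show the <meta name="robots" content="noindex"> tag in the raw response, and curl -I would show the X-Robots-Tag header.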

See also: correct meta robots noindex and X-Robots-Tag via HTTP header — two reliable methods that work on Wave 1.