This page has no <meta name="robots"> tag in its raw HTML <head>. The noindex directive is injected by JavaScript after the page loads. Check view-source — no noindex tag. Open browser DevTools → Elements → <head> and you'll see it appear after JavaScript runs. To verify, open Command Prompt (Windows) or Terminal (Mac) and run: curl -L https://sallymills.com/seo-javascript-issues/noindex-loaded-in-javascript/ | grep noindex — the raw response contains no noindex directive. (Windows: replace | grep noindex with | findstr noindex.)
What this demonstrates
This page is otherwise a normal article. Its title, meta description, canonical, H1, and body content are all in raw HTML. The one thing that is not: the <meta name="robots" content="noindex"> tag. That directive is injected into the <head> by a JavaScript DOMContentLoaded handler.
The intent is to noindex this page. But the directive arrives after the page loads — which means Googlebot's Wave 1 crawl processes the raw HTML without seeing any noindex. By the time Wave 2 executes the JavaScript, the page may already have been queued for indexing.
See also: Indexing: noindex via JavaScript — the same failure mode covered from an indexing-directive perspective rather than a JavaScript rendering perspective.
Why it matters
This is one of the most consequential Wave 1 / Wave 2 gaps. It's not just a content latency issue — it's a case where a page you intend to keep out of the index may appear there.
The pattern is extremely common in Google Tag Manager setups. A developer adds a noindex tag via a GTM custom HTML tag, or a marketer uses a GTM variable to conditionally inject noindex onto certain page types. In the browser it looks correct — DevTools confirms the noindex is present. But GTM fires after the page loads, making every GTM-injected noindex a client-side directive that Wave 1 will miss.
Once a page appears in the index, removing it takes additional crawl cycles and time. The window between Wave 1 indexing and Wave 2 noindex detection can be hours to days. For pages that are genuinely sensitive — staging content accidentally exposed, duplicate content you're trying to suppress — this window creates real risk.
The code
The JavaScript that injects the noindex tag — and the correct placement for comparison.
<!-- Raw HTML <head>: NO noindex tag -->
<head>
  <title>JavaScript SEO: noindex Loaded via JavaScript</title>
  <meta name="description" content="...">
  <link rel="canonical" href="https://sallymills.com/...">
  <!-- No <meta name="robots"> here -->
</head>

<!-- JavaScript injects noindex after page loads -->
<script>
  document.addEventListener('DOMContentLoaded', function() {
    var meta = document.createElement('meta');
    meta.name = 'robots';
    meta.content = 'noindex';
    document.head.appendChild(meta);
  });
</script>

<!-- Correct: put it in raw HTML in the <head> -->
<meta name="robots" content="noindex">
What Google does
- Googlebot crawls this page and receives the full HTML response.
- Wave 1: the <head> is scanned. No <meta name="robots"> tag is found. No noindex directive detected.
- Wave 1: this page has a title, meta description, canonical, H1, and readable content. Googlebot may queue it for indexing.
- JavaScript execution is deferred to Wave 2.
- Wave 2: the DOMContentLoaded handler runs. The noindex meta tag is appended to <head>.
- Wave 2: Google now sees the noindex. The page may be removed from the index, but only after it may have already appeared in search results following Wave 1.
How to detect it
- view-source: Ctrl+U (Windows) / Cmd+Option+U (Mac, Chrome; Cmd+U in Firefox) → search for noindex in the <head>. Not found. Then open browser DevTools (F12) → Elements → inspect the <head>: the noindex meta tag appears after JavaScript has run. The discrepancy between view-source and DevTools is the tell.
- curl: Open Command Prompt (Windows) or Terminal (Mac) and run: curl -L https://sallymills.com/seo-javascript-issues/noindex-loaded-in-javascript/ | grep -i 'noindex' → No <meta name="robots"> tag appears in the output. Any matches are ordinary page text, such as the title, which contains the word noindex. The raw HTTP response contains no noindex directive. This is what Googlebot sees on Wave 1. (Windows: replace | grep -i 'noindex' with | findstr /i noindex.)
- Google Search Console: This page may appear as indexed in GSC despite the developer's intent to noindex it. It will not reliably appear under "Excluded by 'noindex' tag" based on Wave 1 processing alone.
- Screaming Frog: JavaScript rendering OFF → Meta Robots column is blank or "index", with no noindex visible. JavaScript rendering ON → Meta Robots column shows "noindex". The mismatch between these two modes on the same URL confirms a JavaScript-injected directive. This is the most systematic way to detect this pattern across an entire site.
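The raw-HTML check can also be scripted. The sketch below is one possible approach, not a reference implementation: it looks only at <meta> tags in the unrendered response, so the word noindex in a title, body text, or inline script does not count as a directive. The function names (findRobotsDirectives, rawHtmlHasNoindex, checkUrl) are hypothetical, the parsing is regex-based for brevity, and it assumes Node 18+ for the global fetch.

```javascript
// Sketch: inspect raw HTML (the unrendered server response) for robots meta directives.
// Limitation: regex-based, so a <meta> tag inside an HTML comment would also be counted.

// Returns lowercase directives found in <meta name="robots"> or <meta name="googlebot"> tags.
function findRobotsDirectives(rawHtml) {
  const metaTags = rawHtml.match(/<meta\b[^>]*>/gi) || [];
  const directives = [];
  for (const tag of metaTags) {
    const name = /\bname\s*=\s*["']([^"']*)["']/i.exec(tag);
    const content = /\bcontent\s*=\s*["']([^"']*)["']/i.exec(tag);
    if (!name || !content) continue;
    if (!['robots', 'googlebot'].includes(name[1].trim().toLowerCase())) continue;
    for (const part of content[1].split(',')) {
      const directive = part.trim().toLowerCase();
      if (directive) directives.push(directive);
    }
  }
  return directives;
}

// True when the raw HTML itself carries noindex ("none" is shorthand for noindex, nofollow).
function rawHtmlHasNoindex(rawHtml) {
  return findRobotsDirectives(rawHtml).some((d) => d === 'noindex' || d === 'none');
}

// Hypothetical helper: fetch a URL without rendering JavaScript and report what is there.
// Defined but not called here; usage: checkUrl('https://example.com/').then(console.log)
async function checkUrl(url) {
  const response = await fetch(url, { redirect: 'follow' });
  const rawHtml = await response.text();
  return {
    metaNoindex: rawHtmlHasNoindex(rawHtml),
    xRobotsTag: response.headers.get('x-robots-tag'), // null when the header is absent
  };
}
```

If metaNoindex is false here but DevTools shows a noindex meta tag in the rendered <head>, the directive is being injected client-side.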
How to fix it
Place the noindex tag in the raw HTML <head>. Remove any JavaScript handlers, GTM tags, or client-side CMS hooks that are adding the noindex meta tag. The directive must be present in the initial HTML response from the server — not added after the page loads.
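As a rough illustration of keeping the directive in the initial response, the sketch below builds the <head> on the server so the noindex tag is already in the HTML that is sent. The renderHead function and its options are hypothetical names for this example; a real CMS or framework will have its own template mechanism.

```javascript
// Sketch: render the <head> server-side so noindex is part of the raw HTML.
// renderHead and its option names are hypothetical, for illustration only.

// Escape text before placing it into HTML content or attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

function renderHead({ title, canonical, noindex = false }) {
  const lines = [
    '<head>',
    `  <title>${escapeHtml(title)}</title>`,
    `  <link rel="canonical" href="${escapeHtml(canonical)}">`,
  ];
  if (noindex) {
    // Emitted in the server response, so it is visible without running JavaScript.
    lines.push('  <meta name="robots" content="noindex">');
  }
  lines.push('</head>');
  return lines.join('\n');
}
```

Calling renderHead({ title: 'Example', canonical: 'https://example.com/page', noindex: true }) returns a <head> string that already contains the robots meta tag, which is what view-source and curl should then show.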
If you're using Google Tag Manager to manage noindex, this is the wrong tool for the job. Move noindex logic to your CMS template layer, your web server configuration (via X-Robots-Tag), or your build process. GTM is for analytics and conversion tracking — not for SEO directives that must reach the crawler before JavaScript runs.
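For the X-Robots-Tag route, a minimal sketch using Node's built-in http module might look like the following. The path prefixes and the handleRequest and shouldNoindex names are assumptions made up for this example; on Apache, nginx, or a CDN the same header would be set in that system's own configuration.

```javascript
// Sketch: send noindex as an HTTP response header, so it never depends on JavaScript.
const http = require('http');

// Hypothetical rule: paths that should stay out of the index.
const NOINDEX_PREFIXES = ['/staging/', '/internal-search/'];

function shouldNoindex(pathname) {
  return NOINDEX_PREFIXES.some((prefix) => pathname.startsWith(prefix));
}

function handleRequest(req, res) {
  // req.url is a path like "/staging/page?x=1"; the base is only needed for parsing.
  const { pathname } = new URL(req.url, 'http://localhost');
  if (shouldNoindex(pathname)) {
    res.setHeader('X-Robots-Tag', 'noindex');
  }
  res.setHeader('Content-Type', 'text/html; charset=utf-8');
  res.end('<!doctype html><html><head><title>Example</title></head><body>Example</body></html>');
}

const server = http.createServer(handleRequest);
// To try it locally, call server.listen(8080) and then run:
//   curl -I http://localhost:8080/staging/page
```

The header arrives with the response itself, so it can be checked with curl -I before any HTML is parsed.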