This page carries <meta name="robots" content="noindex"> in its <head> in the raw HTML. It is genuinely excluded from Google's index. Check view-source on this page to see the live directive.
What this demonstrates
This page implements the standard HTML noindex directive: <meta name="robots" content="noindex"> in the <head>. The directive is present in raw HTML, visible in view-source, and readable by Googlebot on its first fetch — no JavaScript execution required.
This is the most common and reliable method for preventing a page from appearing in search results. It is correctly implemented here — the page is genuinely excluded from the index.
Why it matters
The meta robots noindex is the go-to tool for page-level indexation control. It works for any HTML page, it's visible in view-source, which makes it easy to audit, and Googlebot processes it on Wave 1 (the initial fetch of the raw HTML), with no dependency on JavaScript rendering. When you need a page not to appear in search results, this is the right directive to reach for.
The key requirement: it must be in the raw HTML. A noindex directive added by JavaScript (via GTM or a JS CMS hook) won't be seen by Wave 1 — see the noindex via JavaScript demo for that failure mode.
Note that noindex does not prevent crawling. Google may still visit this page periodically. It also doesn't affect what Google can discover from this page — links found here are still followed and queued for crawling. noindex controls only whether the page appears in search results.
The code
The noindex directive on this page — in the raw HTML <head>.
<!-- Standard noindex — in <head>, in raw HTML -->
<head>
<meta name="robots" content="noindex">
<link rel="canonical" href="https://sallymills.com/indexing/meta-robots-noindex/">
<!-- rest of head -->
</head>
<!-- Combined directives: noindex AND nofollow -->
<meta name="robots" content="noindex, nofollow">
<!-- Googlebot-specific (only affects Googlebot, not other crawlers) -->
<meta name="googlebot" content="noindex">
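To make the difference between these variants concrete, here is a minimal Python sketch of how a content value such as "noindex" or "noindex, nofollow" breaks down into separate index and follow decisions. The function name parse_robots_content is hypothetical, and this is a simplified model for illustration, not a description of how Googlebot is implemented. It assumes only the documented values noindex, nofollow, and none (shorthand for both).

```python
def parse_robots_content(content: str) -> dict:
    """Split a meta robots content value into index/follow flags.

    Hypothetical helper: a simplified model, not Google's actual logic.
    """
    # Directives are comma-separated and case-insensitive.
    directives = {part.strip().lower() for part in content.split(",") if part.strip()}
    # "none" is shorthand for "noindex, nofollow".
    noindex = "noindex" in directives or "none" in directives
    nofollow = "nofollow" in directives or "none" in directives
    return {"index": not noindex, "follow": not nofollow}


print(parse_robots_content("noindex"))            # {'index': False, 'follow': True}
print(parse_robots_content("noindex, nofollow"))  # {'index': False, 'follow': False}
```

With noindex alone, the follow flag stays true: the page is kept out of results, but its links can still be followed.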
What Google does
- Googlebot crawls this page and reads the raw HTML response.
- In the <head>, it finds <meta name="robots" content="noindex">.
- The noindex directive is processed on Wave 1; no JavaScript execution is required.
- Google removes this page from (or prevents its entry into) the index.
- The page may still be crawled again periodically. Links from this page are still followed.
- In Google Search Console, this page may appear under "Excluded by 'noindex' tag" in the Page indexing report.
How to detect it
- view-source: Ctrl+U (Windows) / Cmd+Option+U (Mac; Cmd+U in Firefox) → search for robots → you'll see <meta name="robots" content="noindex"> in the <head>. This is the most direct check.
- curl: Open Command Prompt (Windows) or Terminal (Mac) and run:
curl -L https://sallymills.com/indexing/meta-robots-noindex/ | grep -i 'robots'
→ Returns the meta robots tag from the raw HTML. (Windows: replace | grep -i 'robots' with | findstr /i robots.)
- Google Search Console: This page appears in the Page indexing report (formerly Coverage) under "Excluded by 'noindex' tag" (or "Page with redirect" if GSC is following the canonical). It will not appear as indexed.
- Screaming Frog: Crawl this URL → Directives tab (or Meta Robots column) → shows "noindex". The Response Codes tab will show status 200: the page exists, it's just noindexed. Screaming Frog's "Follow Noindex" mode can be toggled to control whether it continues crawling noindexed pages.
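The view-source and curl checks above can also be scripted. The following is a minimal sketch using only Python's standard library. The names RobotsMetaFinder and find_noindex are hypothetical, and the sample HTML is a stand-in modeled on the snippet in "The code", not a copy of the live page. Because it parses the HTML as plain text and never executes JavaScript, it sees only what is in the raw response.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect <meta name="robots"> and <meta name="googlebot"> tags from raw HTML."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = []  # tuples of (name, content, inside_head)

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            if name in ("robots", "googlebot"):
                self.found.append((name, attr.get("content") or "", self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_noindex(raw_html: str) -> list:
    """Return every robots/googlebot meta tag whose content includes noindex."""
    finder = RobotsMetaFinder()
    finder.feed(raw_html)
    finder.close()
    return [t for t in finder.found if "noindex" in t[1].lower()]


# Stand-in for the raw HTML response (assumed, for illustration only).
sample = """<html><head>
<meta name="robots" content="noindex">
<link rel="canonical" href="https://sallymills.com/indexing/meta-robots-noindex/">
</head><body><p>Demo</p></body></html>"""

print(find_noindex(sample))  # [('robots', 'noindex', True)]
```

To run it against a live URL, the raw HTML could be fetched with urllib.request.urlopen(url).read().decode("utf-8", errors="replace") and passed to find_noindex. The third element of each tuple reports whether the tag was found inside the <head>.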
How to fix it
This page is intentionally noindexed — there's nothing to fix here. This is a correctly implemented noindex. To make a page indexable, remove the <meta name="robots" content="noindex"> tag from the HTML, redeploy, and allow Google time to recrawl and update its index.
For the inverse — pages that should be noindexed but aren't — see the noindex via JavaScript demo (directive too late in the rendering cycle) and the X-Robots-Tag demo (noindex in the HTTP header, invisible to view-source).