Google's 2MB HTML limit, explained

What Google's 2MB Googlebot indexing limit actually means, what counts toward it, and how to know if your pages are at risk of being partially indexed.


In late 2024, Google updated its public documentation to clarify file-size limits. Three numbers come up: 2MB for supported web-search HTML files, 64MB for PDFs, and 15MB as a broader default crawler cap.

For SEO on regular web pages, the 2MB rendered-HTML limit is the only one most teams need to worry about. This post is about what that limit actually means, and what it doesn't.

What "2MB" actually measures

The cap applies to the rendered HTML of a page, which is the DOM after JavaScript executes. Not the raw HTML your server sends. That distinction matters more than anything else in this post.

If you have a JS-heavy app that serializes large state into the HTML (Next.js' __NEXT_DATA__, Nuxt's __NUXT__, Apollo's cached query results), that serialized data counts. Even if it's invisible to the user, Googlebot reads it.
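
If you want to see how much of a page is serialized state, a quick browser-console check works. A minimal sketch, assuming a Next.js page (the __NEXT_DATA__ element ID is Next.js-specific; adapt the selector for other frameworks):

```ts
// Paste into the browser console on a Next.js page. Compares the
// serialized state blob against the full rendered HTML.
const state = document.getElementById('__NEXT_DATA__');
if (state) {
  const enc = new TextEncoder();
  const stateBytes = enc.encode(state.textContent ?? '').length;
  const pageBytes = enc.encode(document.documentElement.outerHTML).length;
  console.log(`__NEXT_DATA__: ${(stateBytes / 1024).toFixed(0)} KB of ` +
              `${(pageBytes / 1048576).toFixed(2)} MB rendered HTML`);
}
```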

Compressed transfer size doesn't matter for this limit. Whether your HTML is gzipped, brotli'd, or sent raw over the wire, Google measures the uncompressed bytes after rendering.
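
To approximate the number Google measures, render the page in a headless browser and count the bytes of the serialized DOM after JavaScript runs. A rough sketch using Puppeteer (an assumption: any headless browser works, and 'networkidle0' is only a heuristic for "hydration is done"):

```ts
import puppeteer from 'puppeteer';

async function renderedHtmlBytes(url: string): Promise<number> {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Wait until the network goes quiet so client-side JS has had a
  // chance to hydrate and insert content into the DOM.
  await page.goto(url, { waitUntil: 'networkidle0' });
  const html = await page.evaluate(() => document.documentElement.outerHTML);
  await browser.close();
  // Uncompressed bytes of the serialized DOM, not the transfer size.
  return Buffer.byteLength(html, 'utf8');
}

renderedHtmlBytes('https://example.com').then((bytes) =>
  console.log(`${(bytes / 1048576).toFixed(2)} MB rendered HTML`)
);
```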

What counts toward the 2MB

Counts:

  • All HTML markup (tags, attributes, text)
  • Inlined CSS (<style> blocks)
  • Inlined JavaScript (<script> contents)
  • Inlined SVG
  • data: URI images (base64)
  • Schema.org JSON-LD
  • Comments and whitespace
  • Hidden DOM (display:none elements)

Doesn't count:

  • External CSS files
  • External JavaScript files (their output counts once executed)
  • Images (external <img src>)
  • External SVG referenced via <use>
  • Server response headers
  • Resources blocked by robots.txt
  • Content past the 2MB cutoff (Google just stops reading)

External resources don't count toward the limit themselves, but what they do to the page can. If JavaScript runs and inserts content into the DOM, that inserted content does count, because it's part of the rendered HTML.
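
To make that concrete, here's a hedged sketch of the pattern: the external script file is free, but the markup it injects is not. The dataset and selector are hypothetical:

```ts
// Served from an external .js file, so the file's own bytes don't
// count toward the 2MB. Everything it writes into the DOM does,
// because it becomes part of the rendered HTML.
const products = Array.from({ length: 50_000 }, (_, i) => ({ name: `SKU-${i}` }));
const rows = products.map((p) => `<tr><td>${p.name}</td></tr>`).join('');
document.querySelector('#product-table tbody')!.innerHTML = rows;
// Well over 1MB of <tr> markup now counts against the cap.
```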

What happens at the cutoff

Google doesn't error or block the page. It just stops reading. Anything past byte 2,097,152 is invisible to:

  • Indexing: content past the cutoff can't appear in search results
  • Schema parsing: if your JSON-LD falls past the cutoff, structured data won't trigger rich results
  • Internal link discovery: links Google would otherwise follow are never seen
  • Canonical and hreflang tags placed late in <head> (rare, but possible if your <head> is huge)

The order of content in your HTML matters because of this. Critical SEO elements should be early. Heavy footer markup is the safest place to be over budget.
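
If you want to verify where your critical tags land, you can compute their byte offsets in the rendered DOM. A console sketch, assuming each tag's serialized form appears verbatim in the document's outerHTML (usually, but not always, true):

```ts
// Paste into the browser console. Flags critical SEO tags whose byte
// offset in the rendered HTML falls past Google's 2,097,152-byte cap.
const LIMIT = 2 * 1024 * 1024;
const html = document.documentElement.outerHTML;
const enc = new TextEncoder();
const selectors = [
  'link[rel=canonical]',
  'link[rel=alternate][hreflang]',
  'script[type="application/ld+json"]',
];
for (const sel of selectors) {
  const el = document.querySelector(sel);
  if (!el) continue;
  const idx = html.indexOf(el.outerHTML);
  if (idx === -1) continue; // serialization differs; skip
  const offset = enc.encode(html.slice(0, idx)).length;
  console.log(sel, `byte ${offset}`, offset < LIMIT ? 'OK' : 'PAST THE CUTOFF');
}
```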

Are you affected?

Most static and content-driven sites are nowhere near 2MB rendered HTML. The pages that consistently exceed it tend to be:

  • Single Page Applications with heavy client-side state hydration
  • E-commerce product listing pages with hundreds of products inlined
  • News sites with infinite-scroll content rendered server-side on first paint
  • Sites using older frameworks without partial hydration or streaming SSR
  • Pages with embedded large datasets (charts, tables, configuration UIs)

If any of those describe you, run a check before assuming you're fine.

Different from page speed (Core Web Vitals)

The 2MB limit is an indexing constraint, not a ranking signal, at least not directly. Google has been clear that page weight isn't a Core Web Vital, doesn't directly factor into search rankings, and isn't the same thing as performance.

There's an obvious second-order effect, though: pages with 2MB+ rendered HTML are almost always slow, and slow pages do affect rankings via Core Web Vitals (LCP, INP, CLS). Fixing the page-weight problem usually fixes the performance problem along with it.

What if I have content past the cutoff?

You've got two options. Move the important stuff up: restructure the page so the SEO-critical content sits early, and let comments, related articles, and footers fall past 2MB if anything has to. Or cut the bloat directly. Our guide covers the specific levers.

For 99% of sites that fail this check, the fix is shrinking the rendered HTML, not restructuring it. The bloat is almost always serialized state or inlined assets, not your actual content.
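
What that usually looks like in practice is trimming what you serialize. A hedged Next.js sketch, since Next.js was the earlier example; fetchProducts and the Product shape are hypothetical, but the principle (everything returned as props lands in __NEXT_DATA__) holds:

```ts
import type { GetServerSideProps } from 'next';

type Product = { id: string; name: string; description: string; specs: object };
declare function fetchProducts(): Promise<Product[]>; // hypothetical data layer

// Everything returned here is serialized into __NEXT_DATA__ and counts
// toward the rendered HTML. Ship only the fields the page renders.
export const getServerSideProps: GetServerSideProps = async () => {
  const products = await fetchProducts();
  return {
    props: {
      // Drop description/specs; keep just what the listing displays.
      products: products.map(({ id, name }) => ({ id, name })),
    },
  };
};
```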

If you're not sure where your site stands, test it: the checker renders your URL in a real browser and reports the uncompressed rendered-HTML byte count, which is the same number Google sees.

Check your page size now

Test any URL against Google's 2MB Googlebot HTML limit in seconds.
