Front-End Performance 2021: Delivery Optimizations

About The Author

Vitaly Friedman loves beautiful content and doesn’t like to give in easily. When he is not writing, he’s most probably running front-end & UX … More about Vitaly ↬

Email Newsletter

Weekly tips on front-end & UX.
Trusted by 176.000 folks.

Quick summary ↬ Let’s make 2021… fast! An annual front-end performance checklist with everything you need to know to create fast experiences on the web today, from metrics to tooling and front-end techniques. Updated since 2016.

Table Of Contents

  1. , once async scripts arrive, they are executed immediately — as soon as the script is ready. If that happens very fast, for example when the script is in cache aleady, it can actually block HTML parser. With defer, browser doesn’t execute scripts until HTML is parsed. So, unless you need JavaScript to execute before start render, it’s better to use defer. Also, multiple async files will execute in a non-deterministic order.

    It's worth noting that there are a few misconceptions about async and defer. Most importantly, async doesn’t mean that the code will run whenever the script is ready; it means that it will run whenever the scripts is ready and all preceding sync work is done. In Harry Roberts' words, "If you put an async script after sync scripts, your async script is only as fast as your slowest sync script."

    Also, it's not recommended to use both async and defer. Modern browsers support both, but whenever both attributes are used, async will always win.

    If you'd like to dive into more details, Milica Mihajlija has written a very detailed guide on Building the DOM faster, going into the details of speculative parsing, async and defer.

  2. Lazy load expensive components with IntersectionObserver and priority hints.
    In general, it’s recommended to lazy-load all expensive components, such as heavy JavaScript, videos, iframes, widgets, and potentially images. Native lazy-loading is already available for images and iframes with the loading attribute (only Chromium). Under the hood, this attribute defers the loading of the resource until it reaches a calculated distance from the viewport.
    <!-- Lazy loading for images, iframes, scripts.
    Probably for images outside of the viewport. -->
    <img loading="lazy" ... />
    <iframe loading="lazy" ... />
    <!-- Prompt an early download of an asset.
    For critical images, e.g. hero images. -->
    <img loading="eager" ... />
    <iframe loading="eager" ... />

    That threshold depends on a few things, from the type of image resource being fetched to effective connection type. But experiments conducted using Chrome on Android suggest that on 4G, 97.5% of below-the-fold images that are lazy-loaded were fully loaded within 10ms of becoming visible, so it should be safe.

    We can also use importance attribute (high or low) on a <script>, <img>, or <link> element (Blink only). In fact, it’s a great way to deprioritize images in carousels, as well as re-prioritize scripts. However, sometimes we might need a bit more granular control.

    When the browser assigns "High" priority to an image,
    but we don’t actually want that.
    <img src="less-important-image.svg" fetchpriority="low" ... />
    We want to initiate an early fetch for a resource,
    but also deprioritize it.
    <link rel="preload" fetchpriority="low" href="/script.js" as="script" />

    The most performant way to do slightly more sophisticated lazy loading is by to lazy-load anything that you actually want people to see quickly, e.g. product page images, hero images or a script required for the main navigation to become interactive.

An example showing an old treshold of 3000px with 160KB downloads (left) while the new threshold has an amount of 1250px with only 90KB downloads (right) showing an improvement of img loading lazy data savings
On fast connections (e.g 4G), Chrome’s distance-from-viewport thresholds was recently reduced from 3000px to 1250px, and on slower connections (e.g 3G), the threshold changed from 4000px to 2500px. (Large preview)
An illustration with text around a mobile phone with the Twitter UI showwn, explaining tooling improvements from lazy-load translation strings
By lazy-loading translation strings, Mobile Twitter managed to achieve 80% faster JavaScript execution from the new internationalization pipeline. (Image credit: Addy Osmani) (Large preview)
  1. Load images progressively.
    You could even take lazy loading to the next level by adding progressive image loading to your pages. Similarly to Facebook, Pinterest, Medium and Wolt, you could load low quality or even blurry images first, and then as the page continues to load, replace them with the full quality versions by using the BlurHash technique or LQIP (Low Quality Image Placeholders) technique.

    Opinions differ if these techniques improve user experience or not, but it definitely improves time to First Contentful Paint. We can even automate it by using SQIP that creates a low quality version of an image as an SVG placeholder, or Gradient Image Placeholders with CSS linear gradients.

    These placeholders could be embedded within HTML as they naturally compress well with text compression methods. In his article, Dean Hume has described how this technique can be implemented using Intersection Observer.

    Fallback? If the browser doesn’t support intersection observer, we can still lazy load a polyfill or load the images immediately. And there is even a library for it.

    Want to go fancier? You could trace your images and use primitive shapes and edges to create a lightweight SVG placeholder, load it first, and then transition from the placeholder vector image to the (loaded) bitmap image.

  2. Three different versions showing the SVG lazy loading technique by José M. Pérez, a version similar to Cubism art on the left, a pixelated blurred version in the middle, and a proper picture of José himself on the right
    SVG lazy loading technique by José M. Pérez. (Large preview)
  3. Do you defer rendering with content-visibility?
    For complex layout with plentiful of content blocks, images and videos, decoding data and rendering pixels might be a quite expensive operation — especially on low-end devices. With content-visibility: auto, we can prompt the browser to skip the layout of the children while the container is outside of the viewport.

    For example, you might skip the rendering of the footer and late sections on the initial load:

    footer {
      content-visibility: auto;
      contain-intrinsic-size: 1000px;
      /* 1000px is an estimated height for sections that are not rendered yet. */

    Note that .

    And if you need to get a bit more granular, with CSS Containment, you can manually skip layout, style and paint work for descendants of a DOM node if you need only size, alignment or computed styles on other elements — or the element is currently off-canvas.

The rendering performance on initial load is 2,288 ms for baseline (left) and 13,464 ms for chunks with content-visibility:auto (right)
In the demo, applying content-visibility: auto to chunked content areas gives a 7× rendering performance boost on initial load. (Large preview)
  1. Do you defer decoding with decoding="async"?
    Sometimes content appears offscreen, yet we want to ensure that it’s available when customers need it — ideally, not blocking anything in the critical path, but decoding and rendering asynchronously. We can use decoding="async" to give the browser a permission to decode the image off the main thread, avoiding user impact of the CPU-time used to decode the image (via Malte Ubl):

    <img decoding="async" … />

    Alternatively, for offscreen images, we can display a placeholder first, and when the image is within the viewport, using IntersectionObserver, trigger a network call for the image to be downloaded in background. Also, we can or download the image if the .

  2. Do you generate and serve critical CSS?
    To ensure that browsers start rendering your page as quickly as possible, it’s become a common practice to collect all of the CSS required to start rendering the first visible portion of the page (known as "critical CSS" or "above-the-fold CSS") and include it inline in the <head> of the page, thus reducing roundtrips. Due to the limited size of packages exchanged during the slow start phase, your budget for critical CSS is around 14KB.

    If you go beyond that, the browser will need additional roundtrips to fetch more styles. CriticalCSS and Critical enable you to output critical CSS for every template you're using. In our experience though, no automatic system was ever better than manual collection of critical CSS for every template, and indeed that's the approach we've moved back to recently.

    You can then inline critical CSS and lazy-load the rest with critters Webpack plugin. If possible, consider using the conditional inlining approach used by the Filament Group, or convert inline code to static assets on the fly.

    If you currently load your full CSS asynchronously with libraries such as loadCSS, it’s not really necessary. With media="print", you can trick browser into fetching the CSS asynchronously but applying to the screen environment once it loads. (thanks, Scott!)

    <!-- Via Scott Jehl. -->
    <!-- Load CSS asynchronously, with low priority -->
    <link rel="stylesheet"
      onload="'all'" />

    When collecting all the critical CSS for each template, it’s common to explore the "above-the-fold" area alone. However, for complex layouts, it might be a good idea to include the groundwork of the layout as well to avoid massive recalculation and repainting costs, hurting your Core Web Vitals score as a result.

    What if a user gets a URL that’s linking directly to the middle of the page but the CSS hasn’t been downloaded yet? In that case, it has become common to .

    One important advantage of streaming the entire HTML response is that HTML rendered during the initial navigation request can take full advantage of the browser’s streaming HTML parser. Chunks of HTML that are inserted into a document after the page has loaded (as is common with content populated via JavaScript) can’t take advantage of this optimization.

    Browser support? with partial support in Chrome, Firefox, Safari and Edge supporting the API and Service Workers being . And if you feel adventurous again, you can check an experimental implementation of streaming requests, which allows you to start sending the request while still generating the body. Available in Chrome 85.

An image summarizing the save-data usage on Android Chrome and the average img hits or sessions discovered by Cloudinary research in November 2019 and April 2020
18% of global Android Chrome users have Lite Mode enabled (aka Save-Data), according to . For example, with React, we could write a component that renders differently for different connection types. As Max suggested, a <Media /> component in a news article might output:

  • Offline: a placeholder with alt text,
  • 2G / save-data mode: a low-resolution image,
  • 3G on non-Retina screen: a mid-resolution image,
  • 3G on Retina screens: high-res Retina image,
  • 4G: an HD video.

Dean Hume provides a .

  • React Adaptive Loading Hooks & Utilities provides code snippets for React,
  • Netanel Basel explores Connection-Aware Components in Angular,
  • Theodore Vorilas shares how Serving Adaptive Components Using the Network Information API in Vue works.
  • Umar Hansa show how to selectively download/execute expensive JavaScript.
  • Consider making your components device memory-aware.
    Network connection gives us only one perspective at the context of the user though. Going further, you could also dynamically adjust resources based on available device memory, with the Device Memory API. navigator.deviceMemory returns how much RAM the device has in gigabytes, rounded down to the nearest power of two. The API also features a Client Hints Header, Device-Memory, that reports the same value.

    Bonus: Umar Hansa shows how to defer expensive scripts with dynamic imports to change the experience based on device memory, network connectivity and hardware concurrency.

  • A break-down showing how different resources are prioritized in Blink as of Chrome 46 and beyond
    A break-down showing how different resources are prioritized in Blink as of Chrome 46 and beyond. (Image credit: (which performs a DNS lookup in the background), (which asks the browser to start the connection handshake (DNS, TCP, TLS) in the background), (which asks the browser to request a resource) and in modern browsers, with support , ranging from a huge memory footprint and bandwidth usage to multiple registered analytics hits and ad impressions.

    Unsurprinsingly, it was deprecated, but the Chrome team has brought it back as NoState Prefetch mechanism. In fact, Chrome treats the prerender hint as a NoState Prefetch instead, so we can still use it today. As Katie Hempenius explains in that article, "like prerendering, NoState Prefetch fetches resources in advance; but unlike prerendering, it does not execute JavaScript or render any part of the page in advance."

    NoState Prefetch only uses ~45MiB of memory and subresources that are fetched will be fetched with an IDLE Net Priority. Since Chrome 69, NoState Prefetch adds the header Purpose: Prefetch to all requests in order to make them distinguishable from normal browsing.

    Also, watch out for prerendering alternatives and portals, a new effort toward privacy-conscious prerendering, which will provide the inset preview of the content for seamless navigations.

    Using resource hints is probably the easiest way to boost performance, and it works well indeed. When to use what? As Addy Osmani has explained, it’s reasonable to preload resources that we know are very likely to be used on the current page and for future navigations across multiple navigation boundaries, e.g. Webpack bundles needed for pages the user hasn’t visited yet.

    Addy’s article on "Loading Priorities in Chrome" shows how exactly Chrome interprets resource hints, so once you’ve decided which assets are critical for rendering, you can assign high priority to them. To see how your requests are prioritized, you can enable a "priority" column in the Chrome DevTools network request table (as well as Safari).

    Most of the time these days, we’ll be using at least preconnect and dns-prefetch, and we’ll be cautious with using prefetch, preload and prerender. Note that even with preconnect and dns-prefetch, the browser has a limit on the number of hosts it will look up/connect to in parallel, so it’s a safe bet to order them based on priority (thanks Philip Tellis!).

    Since fonts usually are important assets on a page, sometimes it’s a good idea to request the browser to download critical fonts with preload. However, double check if it actually helps performance as there is a puzzle of priorities when preloading fonts: as preload is seen as high importance, it can leapfrog even more critical resources like critical CSS. (thanks, Barry!)

    <!-- Loading two rendering-critical fonts, but not all their weights. -->
    crossorigin="anonymous" is required due to CORS.
    Without it, preloaded fonts will be ignored.
    <link rel="preload" as="font"
          media="only screen and (min-width: 48rem)" />
    <link rel="preload" as="font"
          media="only screen and (min-width: 48rem)" />

    Since <link rel="preload"> accepts a media attribute, you could choose to selectively download resources based on @media query rules, as shown above.

    Furthermore, we can use imagesrcset and imagesizes attributes to preload late-discovered hero images faster, or any images that are loaded via JavaScript, e.g. movie posters:

    <!-- Addy Osmani. -->
    <link rel="preload" as="image"
            poster_400px.jpg 400w,
            poster_800px.jpg 800w,
            poster_1600px.jpg 1600w"

    We can also preload the JSON as fetch, so it’s discovered before JavaScript gets to request it:

    <!-- Addy Osmani. -->
    <link rel="preload" as="fetch" href="" crossorigin>

    We could also load JavaScript dynamically, effectively for lazy execution of the script.

    /* Adding a preload hint to the head */
    var preload = document.createElement("link");
    link.href = "myscript.js";
    link.rel = "preload"; = "script";
    /* Injecting a script when we want it to execute */
    var script = document.createElement("script");
    script.src = "myscript.js";

    A few closer to the initial request, but preloaded assets land in the memory cache which is tied to the page making the request. preload plays well with the HTTP cache: a network request is never sent if the item is already there in the HTTP cache.

    Hence, it’s useful for late-discovered resources, hero images loaded via background-image, inlining critical CSS (or JavaScript) and pre-loading the rest of the CSS (or JavaScript).

    An example using the cover of the Greyhound movie starring Tom Hanks to show that preloaded images load earlier as there is no need to wait on JavaScript to discover
    Preload important images early; no need to wait on JavaScript to discover them. (Image credit: “ though).

    , ). Plus, Priority Hints will help us indicate loading priorities for scripts.

    Beware: if you’re using preload, as must be defined or nothing loads, plus preloaded fonts without the crossorigin attribute will double fetch. If you’re using prefetch, beware of the Age header issues in Firefox.

    A graph showing first contentful paint (by server worker status) with count from 0 to 150 across a given period of time (in ms)
    With a service worker, we can request just the bare minimum of data, and then transform that data into a full HTML document to improve FCP. (via and the fallback is the network anyway. Does it help boost performance? Oh yes, it does. And it’s getting better, e.g. with Background Fetch allowing background uploads/downloads via a service worker as well.

    There are a number of use cases for a service worker. For example, you could implement "Save for offline" feature, handle broken images, introduce messaging between tabs or provide different caching strategies based on request types. In general, a common reliable strategy is to store the app shell in the service worker’s cache along with a few critical pages, such as offline page, frontpage and anything else that might be important in your case.

    There are a few gotchas to keep in mind though. With a service worker in place, we need to beware range requests in Safari (if you are using Workbox for a service worker it has a range request module). If you ever stumbled upon DOMException: Quota exceeded. error in the browser console, then look into Gerardo’s article When 7KB equals 7MB.

    As Gerardo writes, “If you are building a progressive web app and are experiencing bloated cache storage when your service worker caches static assets served from CDNs, make sure the proper CORS response header exists for cross-origin resources, you do not cache opaque responses with your service worker unintentionally, you opt-in cross-origin image assets into CORS mode by adding the crossorigin attribute to the <img> tag.”

    There is plenty of great resources to get started with service workers:

    • Service Worker Mindset, which helps you understand how service workers work behind the scenes and things to understand when building one.
    • Chris Ferdinandi provides a great series of articles on service workers, explaining how to create offline applications and covering a variety of scenarios, from saving recently viewed pages offline to setting an expiration date for items in a service worker cache.

    • Service Worker Pitfalls and Best Practices, with a few tips about the scope, delaying registering a service worker and service worker caching.
    • Great series by Ire Aderinokun on "Offline First" with Service Worker, with a strategy on precaching the app shell.
    • Service Worker: An Introduction with practical tips on how to use service worker for rich offline experiences, periodic background syncs and push notifications.
    • It's always worth referring to good ol' Jake Archibald’s Offline Cookbook with a number of recipes on how to bake your own service worker.
    • Workbox is a set of service worker libraries built specifically for building progressive web apps.
  • Are you running servers workers on the CDN/Edge, e.g. for A/B testing?
    At this point, we are quite used to running service workers on the client, but with CDNs implementing them on the server, we could use them to tweak performance on the edge as well.

    For example, in A/B tests, when HTML needs to vary its content for different users, we could use Service Workers on the CDN servers to handle the logic. We could also stream HTML rewriting to speed up sites that use Google Fonts.

  • A graph showing Timeseries of service worker installations on desktop and mobile with percent of pages across time between January 2016 and July 2020
    Timeseries of service worker installation. Only 0.87% of all desktop pages register a service worker, according to to inform the browser of which elements and properties will change.

    Whenever you are experiencing, debug unnecessary repaints in DevTools:

    Are you using a Masonry layout? Keep in mind that might be able to build a Masonry layout with CSS grid alone, very soon.

    If you want to dive deeper into the topic, Nolan Lawson has shared tricks to accurately measure layout performance in his article, and Jason Miller suggested alternative techniques, too. We also have a lil' article by Sergey Chikuyonok on how to get GPU animation right.

    High performance animations including Position, Scale, Rotation and Opacity
    Browsers can animate transform and opacity cheaply. .

  • Have you optimized for perceived performance?
    While the sequence of how components appear on the page, and the strategy of how we serve assets to the browser matter, we shouldn’t underestimate the role of under "Experience" in the Performance Panel.

    Bonus: if you want to reduce reflows and repaints, check Charis Theodoulou’s guide to Minimising DOM Reflow/Layout Thrashing and Paul Irish’s list of What forces layout / reflow as well as, a reference table on CSS properties that trigger layout, paint and compositing.

  • Table Of Contents

    1. Getting Ready: Planning And Metrics
    2. Setting Realistic Goals
    3. Defining The Environment
    4. Assets Optimizations
    5. Build Optimizations
    6. Delivery Optimizations
    7. Networking, HTTP/2, HTTP/3
    8. Testing And Monitoring
    9. Quick Wins
    10. Everything on one page
    11. Download The Checklist (PDF, Apple Pages, MS Word)
    12. Subscribe to our email newsletter to not miss the next guides.
    slots empire bonus codes
 Editorial (il)