Steve Kinney

Performance at Scale

Any client-side framework looks reasonable when the session lasts three minutes. A user lands on a marketing page, scrolls, maybe clicks a CTA, and leaves. The performance conversation for that kind of surface is mostly about first impressions—how fast did the hero image paint, how quickly did the page become interactive, did anything shift while the user was trying to click.

Enterprise applications are a different animal. Your users keep the tab open for hours. They interact with data grids containing thousands of rows. They trigger complex state transitions across deeply nested component trees. They open dashboards in the morning and don’t close them until they leave for the day. The things that matter for a landing page still matter here—but they’re joined by a whole category of concerns that consumer-facing optimization guides barely mention: memory hygiene, garbage collection pressure, rendering cost at density, and the slow accumulation of cruft that turns a snappy application into something that needs a page refresh by 2 p.m.

This lecture covers the strategies that keep enterprise frontends fast over the long haul. For the deep dive on setting and enforcing specific metric thresholds, see the performance budgets lecture. For production telemetry and alerting, see the observability lecture.

Why Enterprise Performance Is Different

The performance profile of an internal tool, a trading platform, or a management dashboard differs from a consumer landing page in ways that aren’t just quantitative.

Session duration is the big one. A consumer session might last two or three minutes. An enterprise session can run eight hours or longer. Gmail’s engineering team documented this directly—they found that even modest memory growth, invisible in a short session, compounded into real pain when users kept tabs open all day. Minor leaks that a consumer site never notices become browser crashes in an enterprise context.

Data density is the second. Enterprise UIs regularly display grids with hundreds of columns and tens of thousands of rows. The performance bottleneck isn’t network bandwidth—it’s the number of DOM nodes the browser has to lay out and paint, and the cost of recalculating styles when something changes in a sea of cells.

Interaction frequency is the third. A user on a content site might click three times. An enterprise user might perform hundreds of clicks, keypresses, and filter operations in a single session. That distinction is exactly why Google replaced First Input Delay with Interaction to Next Paint as a Core Web Vital—FID only measured the first interaction, while INP tracks the latency of all interactions throughout the session. For enterprise apps, that shift was overdue.

Core Web Vitals in an Enterprise Context

The Core Web Vitals give you a triangulated view of loading, responsiveness, and visual stability. The current thresholds, measured at the 75th percentile of page visits:

MetricGoodNeeds improvementPoor
Largest Contentful Paint (LCP)< 2.5s2.5s–4.0s> 4.0s
Interaction to Next Paint (INP)< 200ms200ms–500ms> 500ms
Cumulative Layout Shift (CLS)< 0.10.1–0.25> 0.25

LCP measures when the largest visible element finishes rendering. In an enterprise dashboard, that’s usually the primary data grid or chart. If the main content area takes three seconds to paint, the user is staring at a skeleton screen wondering why they didn’t go into farming.

INP replaced FID in 2024 and measures responsiveness across the entire session—not just the first click. For enterprise apps with complex filtering, sorting, and navigation, INP is the metric that tells you whether the app stays responsive, not just whether it starts responsive.

CLS quantifies visual instability. In applications that load multiple asynchronous data streams simultaneously—a dashboard with six independently-fetching widgets, say—layout shift is a constant threat. Every widget that pushes content around as it resolves contributes to the score.

These metrics are a starting point. For enterprise-specific measurement strategies and how to wire them into CI, the performance budgets lecture covers field budgets, lab budgets, and asset budgets in detail. For production telemetry pipelines, see the observability lecture.

Real User Monitoring Versus Synthetic Testing

You need both. Real User Monitoring (RUM) captures how the application behaves on the actual hardware your users have—which in many enterprises means lower-powered corporate laptops, not the M4 MacBook Pro you’re developing on. Synthetic monitoring provides a controlled, repeatable baseline for catching regressions in CI.

The key nuance is that they diverge for real reasons. Lab tests can’t measure INP because there’s no real user interacting. CLS in the lab looks artificially low because lab runs typically don’t scroll or interact. LCP can differ because of caching, redirects, and personalization that only exist in production. web.dev’s guidance is explicit: use synthetic tests for CI gates and developer workflows, use RUM for understanding what users actually experience. The performance budgets lecture walks through the tooling for both.

Bundle Optimization

Large-scale applications accumulate JavaScript the way old houses accumulate paint layers. Each feature, each library, each “just one more script” adds to the total payload that the browser must download, parse, and execute before the application becomes interactive.

Code Splitting Strategies

Code splitting divides a JavaScript bundle into smaller chunks loaded only when needed. In enterprise environments, this typically operates at three granularities.

Route-based splitting is the foundation. The JavaScript for /settings doesn’t load when the user is on /dashboard. Every modern bundler supports this through dynamic import() at the router level, and it’s the single highest-leverage optimization most apps can make.

Component-based splitting goes further. A heavy charting library, a complex modal, or an entire below-the-fold section can be lazily loaded using React.lazy and Suspense. The code doesn’t arrive until the component actually mounts.

const AnalyticsChart = React.lazy(() => import('./AnalyticsChart'));

function Dashboard() {
  return (
    <Suspense fallback={<ChartSkeleton />}>
      <AnalyticsChart />
    </Suspense>
  );
}

Interaction-based splitting is the most aggressive form. Code loads only after a specific user action—clicking an “Export to CSV” button, opening an advanced settings panel, hovering over a tooltip that needs a rich rendering library. The Import on Interaction pattern from Patterns.dev is the canonical reference here.

StrategyWhen code loadsGood for
Route-basedOn navigationSeparate views, admin sections
Component-basedOn mountHeavy tabs, modals, charts
Interaction-basedOn user actionExport utilities, advanced dialogs

Tree Shaking Pitfalls

Tree shaking removes unreachable code by analyzing the static structure of ES module imports. Bundlers like webpack and Vite are efficient at this, but enterprise projects frequently disable the optimization without realizing it.

The primary culprit is side effects—code that does something when imported, like modifying the global window object or injecting styles. If a module isn’t pure, the bundler can’t safely remove it even when none of its exports are used. webpack’s tree-shaking guide is explicit: mark internal packages with "sideEffects": false in package.json so the bundler knows it’s safe to eliminate unused exports.

The second culprit is barrel files—those index.ts files that re-export everything from a package. When you write import { Button } from '@acme/ui' and that barrel re-exports 200 components, the bundler may not be able to prune the other 199 without explicit side-effect declarations. In large monorepos, barrel files are one of the most common reasons tree shaking quietly stops working.

Bundle Analysis and Shared Chunks

You can’t optimize what you can’t see. Tools like webpack-bundle-analyzer and source-map-explorer give you a visual map of what’s actually in the bundle. In my experience, the first time a team runs one of these tools on a mature enterprise application, someone gasps. The discovery is always the same: a library nobody remembers adding, a duplicate copy of something that should be shared, or a dev dependency that accidentally leaked into production.

In multi-application or microfrontend architectures, shared chunk optimization becomes critical. A common “vendor” chunk can hold libraries like React, your design system, or shared utilities. The browser downloads it once, caches it, and reuses it across applications. The Module Federation lecture covers shared dependency negotiation in depth—the singleton, requiredVersion, and share-scope machinery that prevents five copies of React from loading simultaneously.

Rendering Performance

When your UI displays thousands of data points and handles hundreds of interactions per session, rendering performance stops being about “fast first paint” and becomes about sustained responsiveness under load.

Virtualization

The naïve approach to rendering a 50,000-row data grid is to create 50,000 DOM nodes. The browser then spends its time laying out, painting, and reflowing elements the user can’t even see. Virtualization (also called windowing) fixes this by rendering only the rows currently visible in the viewport, plus a small buffer above and below for smooth scrolling.

Libraries like react-window and TanStack Virtual provide the infrastructure. For complex enterprise grids with frozen columns, dynamic row heights, and horizontal scrolling across hundreds of columns, teams often end up with custom implementations tuned to their specific data shape. Shopify’s engineering team documented this approach—they rewrote major screens using lazy scroll views and saw load time reductions up to 50% on data-heavy admin pages.

The architectural rule is simple: if you’re rendering a list of more than a few hundred items, you should be virtualizing it. The DOM was not designed to handle tens of thousands of nodes efficiently, and no amount of memoization will fix a layout engine that’s recalculating geometry for rows the user will never scroll to.

Re-render Prevention and Scheduling

In a complex component tree, a single state change at the root can trigger a cascade of re-renders all the way down. The standard prevention tools—React.memo, useMemo, useCallback—exist to short-circuit unnecessary work by maintaining referential equality of props. But, they’re not free. The comparison itself has a cost, and wrapping every component in React.memo without profiling first is premature optimization wearing a trench coat.

The more interesting lever is scheduling. Not all work needs to happen immediately. requestIdleCallback lets the browser perform background tasks—telemetry logging, data prefetching, non-critical state updates—when it isn’t busy handling user input or animations. For heavy computation like data parsing or complex filtering, offloading to Web Workers is the right move. The Comlink lecture covers how to make that boundary ergonomic.

The practical threshold is 50 milliseconds. Any synchronous computation that takes longer than that is a long task that degrades INP. If you’re parsing a CSV, running a search index, evaluating spreadsheet formulas, or generating a PDF on the client, that work belongs off the main thread.

CSS Architecture and Runtime Cost

Your styling strategy has runtime implications that most teams underestimate. Runtime CSS-in-JS libraries like styled-components and Emotion execute JavaScript to generate and inject styles into the DOM on every render. That’s fine for moderate interaction frequencies, but in enterprise apps with high-frequency re-renders—a live data grid, a trading interface, anything that updates dozens of times per second—the cumulative cost becomes visible.

Static CSS extraction and compile-time CSS-in-JS solutions like Vanilla Extract and Linaria generate standard CSS files at build time. The browser’s CSS parsing engine is heavily optimized and hardware-accelerated in ways that JavaScript-driven style injection is not. For enterprise applications where rendering throughput matters, compile-time solutions tend to be the better default.

This isn’t a universal rule. If your app doesn’t have high-frequency rendering, the difference is negligible. But, if you’re debugging jank in a dense data display and you’re running a runtime CSS-in-JS library, profiling the style injection cost is worth doing before you start blaming React.

Network Optimization

In a globally distributed enterprise, your users are not all on gigabit fiber sitting next to the data center. Network optimization is about reducing the cost of the requests that have to happen and eliminating the ones that don’t.

Resource Hints

Resource hints give the browser advance notice about what it’s going to need. preconnect and dns-prefetch establish early connections to critical domains—your API gateway, your font provider, your CDN—so the DNS lookup and TLS handshake are already done when the first real request fires. preload fetches high-priority resources for the current page. prefetch fetches resources expected for the next navigation.

For module-heavy applications, modulepreload is worth knowing about. It tells the browser to fetch and parse ES modules in parallel, avoiding the serial dependency chain that happens when the browser discovers each import only after evaluating the previous module.

<link rel="preconnect" href="https://api.example.com" />
<link rel="modulepreload" href="/chunks/dashboard.js" />
<link rel="prefetch" href="/chunks/settings.js" />

The ordering matters. Preconnect the domains you’ll definitely hit. Preload what the current page needs. Prefetch what the next page probably needs. Get the priority wrong and you’ll burn bandwidth on speculative fetches while the critical resources wait in line.

Caching Strategies

Service workers give you a programmable network proxy between the application and the server. The three caching strategies that matter most for enterprise apps:

  • Stale-while-revalidate for application shells and static assets. The user gets an instant cache hit while the service worker fetches a fresh copy in the background. The next visit gets the updated version. This is the sweet spot for most UI assets.
  • Cache-first for immutable, versioned assets. If the filename contains a content hash, the file never changes. Serve it from cache forever.
  • Network-first for critical business data. Accuracy matters more than speed. Fall back to cached data only when the network is genuinely unavailable.

HTTP/2, HTTP/3, and CDN

HTTP/2’s multiplexing lets the browser send multiple requests over a single connection, which means the old practice of domain-sharding assets across CDN subdomains is no longer necessary and actually counterproductive—each extra domain adds a DNS lookup and TLS handshake. HTTP/3 improves on this further with QUIC’s zero-round-trip connection resumption.

Shopify found that consolidating static content under a single domain—instead of using a separate cdn.shopify.com subdomain—improved TTFB by 35% globally because it eliminated the extra DNS and TLS overhead and let HTTP/3 prioritization do its job.

For image-heavy dashboards and admin screens, automatic format negotiation (serving AVIF or WebP based on browser support) reduces the visual payload without requiring per-image decisions from developers.

Long-Session Memory Hygiene

This is where enterprise performance diverges most sharply from consumer optimization. A minor memory leak that’s invisible in a three-minute session becomes a browser crash after eight hours. Gmail’s engineering team reported that fixing unbounded caches, orphaned callbacks, and leaked event listeners reduced memory usage at the 99th percentile by 80%.

Common Leak Patterns

Detached DOM nodes are elements removed from the visible page but still referenced in a JavaScript variable, array, or map. The garbage collector can’t reclaim them because the code still holds a reference, even though nothing renders them.

Closure-based leaks happen when an inner function captures a large variable from its outer scope. If that closure outlives the component that created it—attached to a global event handler, stored in a module-level cache, referenced by a timer that was never cleared—the entire captured scope stays alive in memory.

Event listener accumulation is the most insidious pattern. Listeners attached to window or document that are never removed effectively keep the entire component scope alive for the lifetime of the page. In a single-page application where components mount and unmount as the user navigates, forgetting to clean up a window.addEventListener('resize', ...) in an effect cleanup function means every visit to that route adds another listener that never goes away.

The diagnostic tool is Chrome DevTools’ Memory tab. Take heap snapshots before and after a suspected leak, compare them, and look for objects whose retained size—the memory that would be freed if the object were collected—keeps growing. If you navigate to a route, navigate away, force a garbage collection, and the heap doesn’t return to baseline, something is holding on to references it shouldn’t be.

Garbage Collection Pressure

Even without leaks, you can degrade performance by allocating and discarding objects too rapidly. V8’s garbage collector uses a generational strategy—young-generation collections are fast but frequent, old-generation collections are thorough but pause the main thread. Every allocation moves you closer to a GC pause. If your application creates and discards thousands of short-lived objects per second—common with high-frequency state updates, immutable data patterns, or aggressive memoization—the GC runs more often and the pauses become noticeable as jank.

The practical mitigation is to keep large, non-reactive datasets out of the reactive state tree. Historical logs, complex configuration objects, reference data that doesn’t change—these can live in IndexedDB instead of a JavaScript state store. The reactive system only needs to hold what the UI actually renders. Everything else is overhead that makes the garbage collector work harder than it should.

Microfrontend-Specific Performance

Microfrontends introduce performance challenges that don’t exist in a monolithic architecture. The Module Federation lecture covers the composition model; this section covers the performance implications.

Bundle Duplication

Without coordination, each microfrontend independently bundles common dependencies. If three remotes each ship their own copy of React, the user downloads React three times. Module Federation’s shared configuration with singleton: true is the standard mitigation—only the first remote to load provides the library, and the rest reuse it. The shared dependencies section of the Module Federation lecture covers the mechanics.

The organizational discipline matters as much as the configuration. Group platform dependency upgrades so shared runtime libraries move atomically across remotes. Use CI checks (tools like syncpack or centralized version presets) to catch version drift before it ships. Publish dependency-alignment dashboards so teams can see which remotes are lagging behind and prioritize accordingly. Performance regressions in federated architectures are frequently dependency-governance failures, not rendering algorithm failures.

Network Waterfalls

If the shell loads, discovers it needs Remote A, loads that, which then discovers it needs an async chunk, loads that—you’ve built a serial waterfall where each request has to complete before the next one starts. Mitigations include preloading frequently-needed remote scripts with <link rel="preload"> in the shell, using Module Federation 2.0’s manifest for coordinated preloading, and ensuring HTTP/2 multiplexing is active so at least the parallel requests share a connection.

American Express reportedly imposed a strict budget of roughly 250 KB per microfrontend, enforced through externals and dynamic imports. Setting per-remote performance budgets is one of the more effective guardrails against any single team’s remote becoming a payload liability for the whole platform.

Streaming SSR and Partial Hydration

Advanced microfrontend architectures can stream HTML as fragments render—similar to Zalando’s Tailor approach—so users see content incrementally rather than waiting for every remote to finish. Partial hydration ensures only the truly interactive parts of a microfrontend run JavaScript on the client, reducing the initial execution cost. The server components and streaming lecture covers the React-specific streaming model, and the islands architecture lecture covers the broader pattern of selective hydration.

Off-Main-Thread Architecture

Any synchronous computation that takes more than 50 milliseconds should be a candidate for a Web Worker. Enterprise applications are full of candidates: CSV parsing, client-side search indexing, data transformation pipelines, spreadsheet formula evaluation, PDF generation.

A Worker runs in a separate thread with its own event loop. The main thread stays responsive because the heavy work never touches it.

// workers/csv-parser.worker.ts
self.onmessage = (event: MessageEvent<{ csv: string }>) => {
  const rows = parseCSV(event.data.csv);
  const summary = computeAggregations(rows);
  self.postMessage({ rows, summary });
};

// components/DataImport.tsx
const worker = new Worker(new URL('../workers/csv-parser.worker.ts', import.meta.url));
worker.postMessage({ csv: rawFileContent });
worker.onmessage = (event) => {
  setTableData(event.data.rows);
};

SharedWorker extends this for scenarios where multiple tabs or microfrontends need the same background process—a WebSocket connection manager, a shared search index, a cross-tab state synchronization layer. In federated architectures, a SharedWorker can maintain a single WebSocket connection to the backend instead of each remote opening its own.

The Comlink lecture covers how to turn the raw postMessage ceremony into an ergonomic RPC-style API, including transfer() for zero-copy handoff of large buffers and proxy() for passing callbacks across the boundary.

Case Studies

Netflix

Netflix optimized their logged-out landing page by removing client-side React entirely and transitioning to vanilla JavaScript. React still ran on the server for developer experience and SSR, but stripping the client-side framework cut the bundle by 200 KB and improved time-to-interactive by 50%. For subsequent steps in the sign-up flow, they used XHR prefetching to download the React application in the background while the user was still reading the landing page. The lesson isn’t “don’t use React.” It’s that the landing page and the application are different performance contexts, and treating them identically is a missed opportunity.

Walmart

Walmart Global Tech treats performance as a north-star metric and focuses on the 75th percentile rather than medians or averages, because averages hide pain. Their governance model includes pull-request-level performance checks in CI, canary monitoring for INP in production (since INP requires real user interactions and can’t be measured synthetically), and real-time alerting. Thousands of developers contribute to the platform, and the performance governance is what keeps the aggregate from drifting.

Shopify

Shopify’s engineering team rewrote major admin screens around lazy scroll views and intelligent layout, achieving dramatically faster initial renders for data-heavy pages. They also consolidated static assets under a single domain instead of a separate CDN subdomain, leveraging HTTP/3 prioritization and eliminating the latency of extra DNS lookups and TLS handshakes. The result was a 35% improvement in TTFB globally. Perceived performance—how fast the app feels—was their explicit optimization target, not just how fast synthetic benchmarks said it was.

The Organizational Part

Technical optimizations erode without organizational support. Performance budgets belong in CI. Memory regression tests belong in the release process. Performance data belongs on business dashboards, not just engineering dashboards.

Farfetch built a “Site Speed Business Case Calculator” that lets product managers estimate the revenue impact of performance changes—turning speed from a technical concern into a business priority. Walmart runs weekly operational-excellence reviews where performance is visible to leadership. The pattern is the same everywhere it works: performance has to be somebody’s job, and the metrics have to be legible to people who don’t read flame charts.

The performance budgets lecture covers the mechanics—field budgets, lab budgets, asset budgets, CI enforcement with Lighthouse CI, and how to structure thresholds that are strict enough to prevent drift without being so brittle that teams start reaching for --no-verify.

The Short Version

Enterprise frontend performance is a long game. The first paint matters, but so does the 10,000th interaction. Memory management matters as much as bundle size. Organizational governance matters as much as technical optimization.

Measure in the field and the lab. Split bundles at route, component, and interaction boundaries. Virtualize large datasets. Move heavy computation off the main thread. Keep non-reactive data out of the reactive state tree. Coordinate shared dependencies across federated remotes. Set budgets, enforce them in CI, and make the numbers visible to people who can prioritize the work.

The applications that stay fast are the ones where performance is treated as an ongoing constraint—built into the architecture, enforced in the pipeline, and visible in the culture—rather than a quarterly initiative that produces a slide deck and then quietly evaporates.


Slides

Slide: Performance in Enterprise Context

Enterprise performance ≠ consumer performance.

  • Consumer apps optimize for first visit (LCP, FCP).
  • Enterprise apps optimize for long sessions — users stay for hours.

Enterprise-specific concerns:

  • Memory leaks — detached DOM nodes, closure-based leaks, event listener accumulation.
  • Re-render storms — dense data grids updating every few seconds.
  • Bundle duplication — multiple remotes shipping the same library.
  • Network waterfalls — remote A loads → discovers it needs remote B → loads remote B.

Slide: Code Splitting Strategies

Three levels of splitting for microfrontends.

LevelHowWhen to load
Route-basedEach route is a lazy-loaded chunkOn navigation
Component-basedHeavy components split outOn visibility / interaction
Interaction-basedLoad code on user actionOn click, hover, scroll
  • Barrel file trap: index.ts that re-exports everything defeats tree-shaking. Import directly from the module.
  • Shared chunks: Configure bundler to extract common dependencies into shared chunks across remotes.
  • Use modulepreload for critical federated remotes the user will likely need.

Last modified on .