Guide · Dec 20, 2024 · 7 min read

Next.js Performance Audits Without the Drama

The checklist I follow to keep Core Web Vitals green across complex Next.js properties.


By Marina Álvarez
#Web #Performance


Give me clear budgets over last-minute hacks any day.

Context

I maintain a portfolio of web properties, including high-traffic landing pages, extensive documentation hubs, and dynamic product surfaces that the marketing team edits on a weekly basis. For a long time, performance was an afterthought. It used to degrade quietly, commit by commit, until a VP of Marketing noticed our main landing page's Largest Contentful Paint (LCP) was over 4 seconds during a critical campaign. That was the wake-up call.

The problem was a classic case of "death by a thousand cuts." A new tracking script here, a high-resolution hero image there, and a few kilobytes of extra CSS from a new component—it all added up. We were flying blind, with no visibility into how our day-to-day changes were impacting the user experience.

To solve this, I designed and implemented a repeatable and automated performance audit program. This program is built on a foundation of performance budgets, continuous integration checks, bundle analysis, real-user monitoring (RUM), and a rotating "performance guard" duty. Now, performance regressions are caught and addressed before they ever make it to production.

Stack I leaned on

  • Lighthouse CI + GitHub Actions: This is the first line of defense. We run Lighthouse on every pull request using GitHub Actions. It provides immediate feedback on performance, accessibility, and SEO, right in the PR comments. We chose this because it's free, open-source, and highly configurable.
  • WebPageTest + Calibre: While Lighthouse gives us lab data, WebPageTest and Calibre give us synthetic and real-user monitoring (RUM) data. WebPageTest allows us to run tests from different locations and on different devices, while Calibre provides ongoing monitoring and alerts. This combination gives us a complete picture of our performance in the wild.
  • Next.js analyzer + Bundle Buddy: These tools are essential for JavaScript diagnostics. The Next.js analyzer gives us a visual representation of our bundle sizes, and Bundle Buddy helps us identify duplicate dependencies and opportunities for code splitting.
  • Cloudinary/ImageKit: Media optimization is a huge part of web performance. We use Cloudinary and ImageKit to automatically optimize and serve responsive images and videos in modern formats like AVIF and WebP.
  • Grafana dashboard fed by CrUX + PostHog RUM: We use Grafana to visualize our performance data. We pull in data from the Chrome User Experience Report (CrUX) and our own Real User Monitoring (RUM) data from PostHog. This allows us to see how our performance is trending over time and how it varies by country, device, and connection speed.
  • n8n bots: Automation is key to making this program sustainable. We use n8n to create bots that post weekly performance digests to Slack and automatically create tickets in Linear when metrics drift. This keeps the team informed and accountable without adding manual work.

Budgets & Config

performance.budgets.json (excerpt):

{
  "lcp": { "mobile": 2600, "desktop": 2200 },
  "cls": 0.1,
  "inp": 200,
  "js_kb": 180,
  "css_kb": 90,
  "api_latency_ms": 400
}

Budgets live in Git; only platform + product leads approve changes. GitHub workflow compares Lighthouse results + bundle sizes to the file and fails fast.

Route-specific overrides live in performance.routes.json, letting us set tighter budgets for /pricing than /blog. We review overrides quarterly to avoid “budget creep.”
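The CI step reads these thresholds via a Lighthouse CI config. A minimal sketch of what a lighthouserc.cjs could look like, assuming the budget values above map onto LHCI assertions; the routes, run count, and upload target here are illustrative, not our exact setup:

```javascript
// lighthouserc.cjs — illustrative sketch; routes and upload target are assumptions.
// Threshold values mirror performance.budgets.json above (ms for LCP, unitless for CLS).
module.exports = {
  ci: {
    collect: {
      url: ['http://localhost:3000/', 'http://localhost:3000/pricing'],
      numberOfRuns: 3,
    },
    assert: {
      assertions: {
        // LHCI asserts against Lighthouse audit IDs.
        'largest-contentful-paint': ['error', { maxNumericValue: 2600 }],
        'cumulative-layout-shift': ['error', { maxNumericValue: 0.1 }],
      },
    },
    upload: { target: 'temporary-public-storage' },
  },
};
```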

GitHub Action Snippet

name: performance
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  lighthouse:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm run build
      - run: npm run start &
      - run: npx wait-on http://localhost:3000
      - run: npx @lhci/cli autorun --config=./lighthouserc.cjs

If LHCI fails, a bot comments with before/after metrics and blocking resources.

Workflow

  1. Instrument RUM: PostHog + Calibre capture real-user metrics per route; Grafana compares lab vs. field.
  2. CI enforcement: preview URLs get audited via Lighthouse (desktop + mobile). Failures block merge and comment JSON + HTML artifacts into PR.
  3. Bundle reviews: ANALYZE=true next build (via @next/bundle-analyzer) runs nightly. Bundle Buddy highlights shared chunks; we split or lazy-load accordingly.
  4. Media strategy: Next.js Image, Cloudinary transformations, AVIF/WebP, streaming video via Mux/HLS keep bytes low.
  5. API audits: log SSR/data fetch times; budgets fail if p95 > 400ms. Use caching + stale-while-revalidate.
  6. Synthetic monitoring: WebPageTest + Calibre hit top routes hourly from three regions; alerts mirror budgets.
  7. Incident response: PagerDuty alerts guard on breach; we roll back offending PR or toggle feature flag using LaunchDarkly/PostHog.
  8. Weekly performance review: share digest (#perf-watch) summarizing regressions, fixes, and backlog items.
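Step 5's p95 gate is small enough to sketch. A minimal version, assuming latencies are collected per route; the function names and sample data are hypothetical, while the 400 ms budget comes from performance.budgets.json:

```javascript
// p95-check.js — hypothetical sketch of the API latency budget gate.
// Nearest-rank percentile: sort samples, take the value at rank ceil(p% * n).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, idx)];
}

function checkApiBudget(latenciesMs, budgetMs = 400) {
  const p95 = percentile(latenciesMs, 95);
  return { p95, pass: p95 <= budgetMs };
}

// Example: 20 samples, mostly fast with one slow outlier.
const samples = [120, 130, 140, 150, 160, 170, 180, 190, 200, 210,
                 220, 230, 240, 250, 260, 270, 280, 290, 300, 450];
console.log(checkApiBudget(samples)); // → { p95: 300, pass: true }
```

Note that a single outlier does not fail the gate; p95 only trips once the slow tail is sustained, which keeps the check resistant to one-off network blips.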

Common Performance Pitfalls in Next.js

  • Large, unoptimized images: This is the most common issue. Use next/image to automatically resize, optimize, and serve images in modern formats. Also, make sure to provide width and height attributes to prevent layout shifts.
  • Blocking the main thread with large JavaScript bundles: Use the Next.js dynamic import feature to code-split your application and only load the JavaScript that's needed for the current page.
  • Slow API calls in getServerSideProps: If your API calls are slow, your page will be slow. Use stale-while-revalidate caching strategies and make sure your APIs are fast.
  • Not using next/font: This can lead to font-related layout shifts. next/font will automatically optimize your fonts and remove external network requests for improved privacy and performance.
  • Shipping too much third-party JavaScript: Be mindful of the third-party scripts you add to your application. Use next/script with the lazyOnload strategy to defer the loading of non-critical scripts.
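Most of these pitfalls have one-line fixes at the component level. A sketch pulling them together; the component names, image path, and script URL are hypothetical:

```jsx
import Image from 'next/image';
import Script from 'next/script';
import dynamic from 'next/dynamic';

// Code-split the heavy widget; its JS loads only when this page renders it.
const Calculator = dynamic(() => import('../components/Calculator'), {
  loading: () => <p>Loading…</p>,
});

export default function Landing() {
  return (
    <main>
      {/* Explicit width/height reserve space up front, preventing layout shift. */}
      <Image src="/hero.jpg" alt="Hero" width={1200} height={630} priority />
      <Calculator />
      {/* Defer the non-critical third-party script until the browser is idle. */}
      <Script src="https://example.com/widget.js" strategy="lazyOnload" />
    </main>
  );
}
```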

Audit Checklist

  • [ ] Primary routes Lighthouse ≥ budget (desktop + mobile).
  • [ ] JS delta vs. main ≤ 20 KB.
  • [ ] Fonts served via next/font with subsets.
  • [ ] Images use <Image> from next/image with defined sizes + priority flags.
  • [ ] Third-party scripts behind next/script with strategy="lazyOnload".
  • [ ] API calls cached or SSR/ISR tuned; no blocking >400ms.
  • [ ] RUM dashboard updated with new baseline screenshot.

PR templates require linking Lighthouse report + analyzer screenshot.
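The "JS delta vs. main" check boils down to diffing two bundle-size maps. A hedged sketch; the stats shape is an assumption for illustration, not the real next build output format:

```javascript
// js-delta-check.js — hypothetical sketch of the 20 KB JS delta gate.
// Each stats object maps chunk name → size in bytes.
function jsDeltaKb(mainStats, prStats) {
  const total = (stats) => Object.values(stats).reduce((a, b) => a + b, 0);
  return (total(prStats) - total(mainStats)) / 1024;
}

function gate(mainStats, prStats, limitKb = 20) {
  const delta = jsDeltaKb(mainStats, prStats);
  return { deltaKb: Math.round(delta * 10) / 10, pass: delta <= limitKb };
}

const main = { 'app.js': 150 * 1024, 'vendor.js': 300 * 1024 };
const pr = { 'app.js': 162 * 1024, 'vendor.js': 310 * 1024 };
console.log(gate(main, pr)); // → { deltaKb: 22, pass: false }
```

Comparing totals rather than per-chunk sizes keeps the gate tolerant of chunk reshuffles where code merely moves between files.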

Automation Highlights

  • GitHub comment bot posts sparkline of LCP/CLS for the PR route.
  • Calibre webhooks open Linear tickets when metrics drift for 2 consecutive runs.
  • n8n digest posts Monday summary (best/worst routes, JS shipped, outstanding tickets).
  • Chromatic + Percy run visual diffs; flagged changes automatically join the performance review.
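The n8n digest is mostly string formatting plus a webhook POST. A minimal formatter sketch; the field names and values are assumptions, and the actual Slack delivery via n8n is omitted:

```javascript
// digest.js — hypothetical shape of the Monday #perf-watch digest payload.
function formatDigest({ best, worst, jsShippedKb, openTickets }) {
  return [
    '*Perf digest — week in review*',
    `Best route: ${best.route} (LCP ${best.lcpMs} ms)`,
    `Worst route: ${worst.route} (LCP ${worst.lcpMs} ms)`,
    `JS shipped: ${jsShippedKb} KB · Open perf tickets: ${openTickets}`,
  ].join('\n');
}

const digest = formatDigest({
  best: { route: '/blog', lcpMs: 1800 },
  worst: { route: '/pricing', lcpMs: 2900 },
  jsShippedKb: 174,
  openTickets: 3,
});
console.log(digest);
```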

Performance Guard Rotation

  • Guard IC (weekly rotation) receives alerts, triages regressions, coordinates fixes.
  • Comms lead posts Slack updates + status notes.
  • Analyst compares RUM vs. lab data, updates dashboards.
  • Playbook lives in Notion; guard roster sits in PagerDuty. If guard sees three breaches in a week, we pause new marketing experiments until budgets recover.

We also run quarterly “perf drills” where new engineers walk through diagnosing a staged regression so they learn the tooling without pressure.

Field vs. Lab Strategy

  • Lab results catch regressions pre-merge.
  • Field data (Calibre RUM, CrUX) catches CDN, geo, or device-specific issues.
  • Dashboard shows both; we annotate marketing campaigns so we know if a hero video spiked CLS, for example.

Instrumenting APIs & Edge

  • Each Route Handler logs timing metrics to Grafana.
  • Edge Middleware tracks total execution time + cache hit ratio.
  • Budgets include SSR/ISR metrics; slow API = slow LCP.
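A route-handler timing wrapper can be as small as the sketch below; the Grafana push is stubbed as a report callback, the names are illustrative, and real handlers would be async (kept synchronous here for brevity):

```javascript
// route-timing.js — sketch of per-route timing; report() stands in for Grafana.
function timed(route, handler, report) {
  const start = process.hrtime.bigint();
  const result = handler();
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  report({ route, ms }); // in production this would push to a metrics backend
  return result;
}

// Usage: wrap the handler body and collect the emitted metric.
const metrics = [];
const response = timed('/api/pricing', () => ({ status: 200 }), (m) => metrics.push(m));
```

Because the wrapper returns the handler's result unchanged, it can be layered onto existing Route Handlers without touching their logic.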

Cost Snapshot

  • Calibre Pro: $180/mo for multi-route monitoring.
  • WebPageTest API credits: ~$40/mo.
  • Chromatic: $60/mo.
  • Total performance tooling: <$300/mo—worth it compared to lost conversions when LCP spikes.

Case Study: Pricing Revamp

  • Marketing added autoplay hero video + interactive calculator.
  • CI flagged JS budget violation (+120KB) and LCP 3.5s mobile.
  • We converted video to streaming HLS, lazy loaded chat widget, split calculator into React Server Components with Suspense.
  • Launch delayed 24h but shipped with mobile LCP 2.1s and CLS 0.04.

Bonus Case: Localization Rollout

When we added five localized hero images, CLS spiked due to missing dimensions. Visual regression suite + Lighthouse caught it. Fix was simply adding sizes + width/height plus reserving space with CSS aspect-ratio. Having automation saved hours of detective work.

Metrics & Telemetry

  • Improved LCP: The average Largest Contentful Paint (LCP) is now under 2.2 seconds on desktop and 2.6 seconds on mobile.
  • Reduced JavaScript footprint: We've reduced the amount of JavaScript shipped per page by 35% year-over-year.
  • Fewer performance incidents: We've had near-zero performance incidents since implementing the audit program.
  • Full Lighthouse adoption: 100% of pull requests now include Lighthouse evidence.
  • Faster regression detection: The median time to detect a regression is now 10 minutes, down from hours.
  • Fewer budget breaches: We've reduced the number of budget breaches per quarter from 11 to 2 or fewer.

Lessons Learned

  • Audits must live inside the sprint, not as a side quest.
  • Budgets only work if every squad knows them by heart—print them in the repo.
  • Field data ends debates; show CrUX before arguing.
  • Automate nagging; humans forget, bots don’t.
  • Performance debt behaves like interest; pay it weekly or compound pain later.

Performance Backlog Buckets

  1. Quick wins (under 2 hours): image format fixes, lazy load toggles.
  2. Structural (1–3 days): bundle splits, caching strategies.
  3. Strategic (1–2 sprints): redesigning hero layout, migrating to streaming.

We keep backlog prioritized in Linear with owner + expected gain (ms saved). Guards pull from it during slow weeks.

Cost of Ignoring Performance

Before this program we lost ~12% conversion during a seasonal campaign because LCP hit 4s. That single incident cost more than a year of performance tooling. Use numbers like that to justify the investment.

Implementation Timeline

  • Week 1: define budgets, wire Lighthouse CI, publish playbook.
  • Week 2: add RUM instrumentation + Grafana dashboards, configure Calibre/WebPageTest.
  • Week 3: analyze bundles, refactor hot paths, document guard rotation.
  • Week 4: run first performance retro, tune alerts, socialize weekly digests.

Keeping it iterative avoided a monolithic “perf project” that never ships.

What I'm building next

I’m sharing a GitHub Actions workflow with configurable budgets, WebPageTest hooks, and Slack digests. Want it? Let me know.

Next on my list: adopt Next.js Flight Recorder + profiling via react-devtools-profile in CI so we catch slow renders even before Lighthouse notices.

FAQ

  • Do marketing experiments still move fast? Yes—if an experiment needs extra JS, they budget for it and commit to cleanup. Budgets are negotiation tools, not handcuffs.
  • What about pages outside Next.js? We wrap legacy pages with Calibre monitoring so they follow the same guardrails until migrated.
