Feature Flags with PostHog Without Breaking Prod

Flags stopped being random strings—they’re governed like code now.

Context

In a fast-moving product, we need to roll out features safely, run A/B tests, and respond to incidents quickly. That meant cohort-based rollouts, an experimentation framework, and reliable kill switches for new functionality. LaunchDarkly is powerful, but it was overkill and too expensive for our small team.

We found that PostHog's feature flags, combined with disciplined governance, covered about 90% of what we needed at a fraction of the cost. The key insight was to treat feature flags not as temporary toggles but as first-class citizens, like code. That meant clear naming conventions, an explicit contract for each flag, audited changes, middleware integration, and dashboards that track status and impact.

This playbook outlines our end-to-end governance strategy for feature flags. It empowers our product, growth, and RevOps teams to launch new features and experiments safely, with confidence, and without fear of breaking production.

Stack I leaned on

PostHog feature flags (boolean, multivariate, condition-based): PostHog is the core of our feature flagging system. It allows us to create various types of flags—simple boolean toggles, multivariate flags for A/B/n testing, and condition-based flags that target specific user cohorts. Its flexibility is crucial for our diverse experimentation needs.
PostHog cohorts built from real events and properties: One of PostHog's strengths is its integrated analytics. We leverage this by building dynamic cohorts directly from real user events and properties. This ensures our flag targeting is precise and always up-to-date, reflecting actual user behavior.
Next.js middleware + server components to evaluate flags at the edge: For our Next.js applications, we integrate PostHog flags directly into the middleware and server components. This allows us to evaluate flags at the edge, ensuring that users see the correct experience from the very first render, minimizing layout shifts and improving perceived performance.
Supabase edge functions for non-Next surfaces and backend jobs: For services outside our Next.js frontend, such as backend jobs or other microservices, we use Supabase edge functions. These functions can fetch flag statuses via the PostHog API, ensuring consistent flag evaluation across our entire stack.
Linear + Slack workflows for approval, change logs, and cleanup tasks: To maintain governance, we've integrated our flag management with Linear for approvals and task tracking. Slack workflows provide real-time notifications for flag changes and facilitate discussions, while also serving as a channel for automated change logs and cleanup reminders.
Metabase dashboards summarizing flag status and experiment outcomes: Transparency is key. We use Metabase to create custom dashboards that provide a comprehensive overview of all active flags, their statuses, and the real-time outcomes of our experiments. This allows product managers, growth leads, and engineers to monitor impact and make data-driven decisions.

Pain Points Before Governance

Random flag names (flag123, test_newcta) cluttered the dashboard.
Permanent flags living for months because nobody cleaned them.
No answer to "Who flipped this?": zero audit trail or change log.
Inconsistent rollouts: some flags gated components, others only hid buttons.

We fixed this by building a governance model similar to what we use for experiments.

Flag Taxonomy & Naming

Format: surface_segment_feature_goal (e.g., web_signup_copytest_ctr).
Types:
- launch: kill switch or phased rollout for new features.
- experiment: A/B/n tests tied to hypotheses.
- ops: temporary toggles for pricing, messaging, partners.
Metadata: owner Slack handle, creation date, expiry date, related Linear ticket, and rollback plan.

PostHog’s tags + descriptions store metadata; we also sync to Supabase for reporting.
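
The naming format is easy to check mechanically, for example in CI or in the intake workflow. Below is a minimal sketch, not our production code: `parseFlagName`, `ParsedFlagName`, and the `SURFACES` allow-list are hypothetical names, and the allowed surfaces are an assumption.

```typescript
// Hypothetical allow-list of surfaces; adjust to the surfaces you actually ship.
const SURFACES = ["web", "api", "email"] as const;

export interface ParsedFlagName {
  surface: string;
  segment: string;
  feature: string;
  goal: string;
}

// Returns the parsed parts, or null when the name breaks the convention.
export function parseFlagName(name: string): ParsedFlagName | null {
  const parts = name.split("_");
  if (parts.length !== 4) return null;
  // Each part must be lowercase alphanumeric and start with a letter.
  if (!parts.every((p) => /^[a-z][a-z0-9]*$/.test(p))) return null;
  const [surface, segment, feature, goal] = parts;
  if (!(SURFACES as readonly string[]).includes(surface)) return null;
  return { surface, segment, feature, goal };
}
```

With this check, `web_signup_copytest_ctr` parses into its four parts, while `flag123` and `test_newcta` are rejected. A strict four-part rule also rejects shorter keys, so relax the part count if you allow type-prefixed names.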

Playbook

Intake & Approval
- Create a Linear ticket using the “Feature Flag” template.
- Specify purpose, cohorts, success metric, and risk level.
- Slack workflow notifies product + growth leads for approval.
Create Flag in PostHog
- Name using taxonomy.
- Add description, owner, expiry date (default 30 days).
- Configure filters/cohorts (e.g., country = "US" AND plan != "Enterprise").
- Link to metrics dashboards (PostHog insight) so owners can monitor impact quickly.
Integrate with Next.js
- Use @posthog/nextjs and middleware to fetch flags server-side.
- Expose flags via React context/hook for client components.
- For edge cases (SSR + client hydration), rely on posthog.isFeatureEnabled with caching.
Backend & Jobs
- Supabase edge functions fetch flags via PostHog API for scheduled jobs (e.g., B2B nurture send).
- Use fallback defaults in code to avoid failing open when PostHog is unreachable.
Experimentation Workflow
- If flag type = experiment, auto-create PostHog experiment entry referencing the flag.
- Cohorts auto-update from live events (no manual CSV).
- PostHog calculates significance; once concluded, Slack bot posts summary + recommended action.
Change Logging
- n8n listens to PostHog flag update webhooks; logs changes (who toggled, when) into Supabase and posts to #flag-log channel.
Cleanup
- Daily script checks expired flags; if past expiry, open Linear cleanup ticket and ping owner.
- Run “flag debt” review during weekly growth standup to ensure we remove or merge code paths.
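
The cleanup step can be sketched as a pure function that turns synced flag records into cleanup tasks. This is an illustration under assumptions: `FlagRecord`, `CleanupTask`, and `findExpiredFlags` are hypothetical names, and creating the Linear ticket and pinging the owner are left to whatever automation consumes the result.

```typescript
// Hypothetical shape for a synced flag record; field names are assumptions.
interface FlagRecord {
  key: string;
  owner: string; // Slack handle
  expiresAt: string; // ISO date, e.g. "2024-05-01"
  active: boolean;
}

interface CleanupTask {
  title: string;
  assignee: string;
  daysOverdue: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns one cleanup task per active flag whose expiry date has passed.
export function findExpiredFlags(flags: FlagRecord[], now: Date): CleanupTask[] {
  return flags
    .filter((f) => f.active && new Date(f.expiresAt).getTime() < now.getTime())
    .map((f) => ({
      title: `Remove expired flag ${f.key}`,
      assignee: f.owner,
      daysOverdue: Math.floor(
        (now.getTime() - new Date(f.expiresAt).getTime()) / MS_PER_DAY,
      ),
    }));
}
```

Passing `now` in as a parameter keeps the function deterministic, so the daily job is easy to test without mocking the clock.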

Key Principles for Feature Flag Governance

Treat flags as code: Apply software development best practices to feature flags, including version control, code reviews, and automated testing.
Clear ownership: Every feature flag must have a clear owner (and a backup) who is responsible for its lifecycle.
Defined lifecycle: Flags should have a defined lifecycle, from creation to deprecation, with clear expiry dates for temporary flags.
Automated monitoring: Implement automated monitoring and alerting for flag changes and their impact on key metrics.
Transparency and communication: Ensure all stakeholders are aware of active flags, their purpose, and any changes.
Rollback readiness: Always have a clear and tested rollback plan for every feature flag.

Data Contracts for Flags

Each flag entry includes:

Owner (primary + backup)
Purpose (launch, experiment, ops) with Linear ticket link
Affected surfaces (web, API, lifecycle email)
Rollback path (command or code to revert)
Metrics (PostHog insight ID + success threshold)
Expiry (default 30 days, extend via PR)

Contracts live in a JSON file synced with PostHog via API, so dashboards and automation stay consistent.
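
To make the contract concrete, here is a sketch of how one entry could be typed and validated before it is synced. The JSON keys are illustrative, since the list above names the fields but not their exact keys; `FlagContract`, `validateContract`, and `defaultExpiry` are hypothetical names.

```typescript
type FlagPurpose = "launch" | "experiment" | "ops";

// Field names are illustrative assumptions, not the actual JSON keys.
interface FlagContract {
  key: string;
  owner: { primary: string; backup: string };
  purpose: FlagPurpose;
  linearTicket: string;
  surfaces: string[];
  rollback: string;
  metrics: { insightId: string; successThreshold: string };
  createdAt: string; // ISO date
  expiresAt: string; // ISO date
}

const DEFAULT_TTL_DAYS = 30;

// Returns a list of human-readable problems; an empty list means the contract passes.
export function validateContract(c: FlagContract): string[] {
  const problems: string[] = [];
  if (!c.owner.primary || !c.owner.backup) problems.push("owner and backup are required");
  if (c.surfaces.length === 0) problems.push("at least one affected surface is required");
  if (!c.rollback.trim()) problems.push("rollback path is required");
  const created = new Date(c.createdAt).getTime();
  const expires = new Date(c.expiresAt).getTime();
  if (Number.isNaN(created) || Number.isNaN(expires)) {
    problems.push("createdAt and expiresAt must be valid dates");
  } else if (expires <= created) {
    problems.push("expiry must be after creation");
  }
  return problems;
}

// Default expiry: creation date plus 30 days, returned as an ISO date.
export function defaultExpiry(createdAt: string): string {
  const d = new Date(createdAt);
  d.setUTCDate(d.getUTCDate() + DEFAULT_TTL_DAYS);
  return d.toISOString().slice(0, 10);
}
```

Running a check like this in the PR that edits the JSON file would catch a missing rollback path or backup owner before the flag reaches PostHog.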

Middleware Pattern (Next.js)

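// middleware.ts
import { NextResponse, type NextRequest } from "next/server";
import { createClient } from "@posthog/nextjs";

// Limit the middleware to the signup route so other paths are never rewritten.
export const config = { matcher: "/signup" };

export async function middleware(req: NextRequest) {
  const posthog = createClient({ apiKey: process.env.POSTHOG_KEY });
  // NextRequest (not the plain Request type) exposes the cookies API.
  const sessionId = req.cookies.get("ph_id")?.value;
  const orgId = req.headers.get("x-org-id");

  const flags = await posthog.getFeatureFlags({
    distinctId: sessionId ?? "anonymous",
    // Only pass a group when the header is present.
    groups: orgId ? { organization: orgId } : {},
  });

  if (!flags["web_signup_copytest_ctr"]) {
    // Flag off (or user outside the test): serve the control page.
    return NextResponse.rewrite(new URL("/signup/control", req.url));
  }
  return NextResponse.next();
}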

Middleware ensures routing decisions happen before render; no flashes between variants.

For client-only widgets, we wrap components with a useFlag hook that hydrates from server-provided defaults to avoid mismatched UI.
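
The core of such a hook is deciding which value to render with at each stage. Here is a minimal sketch of that decision as a pure function. `resolveFlag` and its parameters are hypothetical, and the real hook may differ. In a React hook, `hydrated` would typically be a piece of state set to true inside `useEffect`.

```typescript
type FlagValue = boolean | string;
type FlagMap = Partial<Record<string, FlagValue>>;

// Picks the value a client component should render with.
// Until hydration has finished, only the server-evaluated value is used so the
// first client render matches the server HTML.
export function resolveFlag(
  key: string,
  serverFlags: FlagMap,
  clientFlags: FlagMap | null, // null until the client SDK has loaded flags
  hydrated: boolean,
  fallback: FlagValue = false,
): FlagValue {
  if (!hydrated || clientFlags === null) {
    return serverFlags[key] ?? fallback;
  }
  return clientFlags[key] ?? serverFlags[key] ?? fallback;
}
```

Keeping the server value until hydration completes is what prevents the mismatch: a fresher client value can only take over after the first render has matched the server output.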

Backend Pattern (Supabase)

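import { createClient } from "@supabase/supabase-js";

// Env var names for the Supabase client are assumptions; adjust to your setup.
const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!,
);

export async function shouldSendNurture(contactId: string): Promise<boolean> {
  const { data, error } = await supabase
    .from("contacts")
    .select("ph_distinct_id")
    .eq("id", contactId)
    .single();
  if (error || !data) return false; // unknown contact: do not send

  // The decide endpoint takes the project API key in the body, not a Bearer header.
  const res = await fetch("https://app.posthog.com/decide/?v=3", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ api_key: process.env.POSTHOG_KEY, distinct_id: data.ph_distinct_id }),
  });
  if (!res.ok) return false; // PostHog error: fail closed and skip the send

  // With v=3, featureFlags is an object keyed by flag name.
  const { featureFlags } = await res.json();
  return !featureFlags?.["ops_nurture_pause"];
}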

Server jobs call PostHog’s Decide API so non-Next services respect the same toggles.
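
When PostHog cannot be reached at all, each flag needs a safe value to fall back to, so a job never fails open. The sketch below shows one way to encode that. `SAFE_DEFAULTS` and `flagOrDefault` are hypothetical names, and the chosen defaults are assumptions.

```typescript
type FlagValue = boolean | string;

// Hypothetical per-flag safe values used when PostHog cannot be reached.
// For a pause switch, the safe value is "paused" so no email goes out by accident.
const SAFE_DEFAULTS: Partial<Record<string, FlagValue>> = {
  ops_nurture_pause: true,
  web_signup_copytest_ctr: false,
};

// `flags` is the parsed flag map, or null when the request failed or timed out.
export function flagOrDefault(
  flags: Partial<Record<string, FlagValue>> | null,
  key: string,
): FlagValue {
  if (flags === null) return SAFE_DEFAULTS[key] ?? false;
  // PostHog answered: a flag missing from the response is treated as off.
  return flags[key] ?? false;
}
```

The distinction between "PostHog said the flag is off" and "PostHog did not answer" matters for pause-style flags, where the safe behavior during an outage is to stay paused.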

Metrics & telemetry

Orphaned flags: 0 (daily cleanup + expiry).
Time to launch an experiment: 1 day (from ticket to live).
Incidents caused by misconfigured flags: 0 in 6 months.
Flags with documented rollback steps: 100%.
Avg. lifetime of temporary flags: 19 days (vs. 90 before).
Flag change audit coverage: 100% logged in Supabase + Slack.
Percentage of flags with automated cleanup tasks: 100%.
Weekly "flag debt" backlog items: <3.

Monitoring Dashboard

Metabase dashboard displays:

Active flags by type with owners and expiry.
Experiments nearing significance (plus effect sizes).
Flags per repo/service (to detect code hotspots).
Cleanup backlog with due dates.

We review this in the weekly release meeting.

Incident Response

If a flag causes issues (e.g., variant crash):

Owner hits the “Kill switch” button in PostHog (preconfigured action).
n8n logs the change, reverts to control variant, and posts to #flag-log with context.
Linear incident ticket auto-creates with checklist: collect stack traces, disable code path, communicate to stakeholders.
After resolution, we update the flag contract to document the issue and prevent reactivation without a new review.

Because PostHog provides audit logs, compliance can see exactly when toggles changed.
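
For the message that lands in #flag-log, a small formatter keeps entries consistent. This is a sketch under assumptions: `FlagChange` is a hypothetical record, not the actual PostHog webhook payload, and `formatFlagChange` is an illustrative name.

```typescript
// Hypothetical change record; the real webhook payload shape may differ.
interface FlagChange {
  key: string;
  actor: string; // who toggled the flag
  changedAt: string; // ISO timestamp
  previous: boolean | string;
  next: boolean | string;
  reason?: string;
}

// Builds the one-line message posted to the log channel.
export function formatFlagChange(c: FlagChange): string {
  const base = `[${c.changedAt}] ${c.actor} changed ${c.key}: ${String(c.previous)} -> ${String(c.next)}`;
  return c.reason ? `${base} (${c.reason})` : base;
}
```

Including the previous and next values in every entry means the log alone is enough to reconstruct a flag's history during a postmortem.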

Cost Snapshot

PostHog Scale plan: ~$200/mo in our tenant (covers product analytics + flags).
n8n automation: $15/mo on Fly.io.
Supabase metadata store: $25/mo.

Total incremental cost to run disciplined flags: <$250/mo.

What I'm building next

I'm writing reusable middleware snippets (Next.js, Supabase functions, Remix) plus a Supabase + Slack bot that enforces expiry. Want them? Let me know and I’ll share.

Lessons Learned

Every flag needs an owner and a death date.
Document activation/deactivation steps to avoid late-night nerves.
Cohorts should use real behavioral data; static lists rot fast.
Logging toggles builds trust—product knows who changed what.

Want me to help you replicate this module? Drop me a note and we’ll build it together.
