Setting Up End-to-End Testing with cm-test-gate

In the era of “Vibe Coding,” where AI can generate hundreds of lines of code in seconds, the bottleneck is no longer creation—it is verification. You’ve likely experienced the “AI Honeymoon”: you describe a feature, the agent builds it perfectly, and for five minutes, you feel like a god. Then, you try to deploy. Suddenly, the login button is unresponsive, the CSS is bleeding across components, and a subtle logic error in the backend has turned your “Vibe” into a “Nightmare.”

This is where cm-test-gate becomes your most valuable asset. It is not just another testing library; it is a Quality Sentinel designed to bridge the gap between AI-driven speed and production-grade reliability. In this article, we will explore how to set up cm-test-gate to ensure that your “Vibe” remains unbroken from local development to the final deployment.

The Vibe Coding Quality Gap

The fundamental problem with AI-assisted development is Verification Debt. When you write code manually, you (hopefully) unit test as you go. When an AI writes a feature, it often ignores edge cases that the prompt doesn’t explicitly mention. Without an automated “Gate,” you are forced to click through your app by hand after every turn, which kills the momentum of Vibe Coding.

cm-test-gate solves this by enforcing a Verification-Before-Completion protocol. It demands evidence before it allows a task to be marked as “done.” It transforms the vague feeling that “it looks right” into a hard, green checkmark.

Core Concepts: How cm-test-gate Works

At its heart, cm-test-gate is a skill within the Cody Master ecosystem that orchestrates four critical layers of protection:

1. Stack Detection: It automatically identifies your project type (Astro, Next.js, FastAPI, etc.) and wires up the appropriate test runners.
2. The Four Core Test Files: It scaffolds a standardized test/ directory with specific focus areas:
   - frontend-safety.test.ts: Checks for broken links, console errors, and accessibility.
   - business-logic.test.ts: Verifies core workflows (e.g., “Can a user sign up?”).
   - api-routes.test.ts: Ensures your endpoints return the correct status codes and payloads.
   - security-scan.test.ts: Scans for exposed secrets and common injection vulnerabilities.
3. Secret Hygiene: It validates that no .env files or API keys are accidentally leaked into the codebase during the rapid-fire generation process.
4. The Deploy Gate: It integrates with your CI/CD (like GitHub Actions or Cloudflare Pages) to prevent “Dark Deploys,” meaning pushes that happen without a successful test pass.
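To make the stack-detection idea concrete, here is a minimal sketch of how such a check could work. It is not cm-test-gate's actual implementation: the names detectStack, ProjectFiles, and runnersFor are hypothetical, and the mapping from stack to test runners is an assumption for illustration.

```typescript
// Hypothetical sketch: none of these names come from cm-test-gate itself.
type Stack = 'astro' | 'next' | 'fastapi' | 'unknown';

interface ProjectFiles {
  // Merged dependencies and devDependencies from package.json, if present.
  npmDependencies?: Record<string, string>;
  // Raw contents of requirements.txt or pyproject.toml, if present.
  pythonManifest?: string;
}

// Guess the project type from the dependency manifests alone.
function detectStack(project: ProjectFiles): Stack {
  const deps = project.npmDependencies ?? {};
  if ('astro' in deps) return 'astro';
  if ('next' in deps) return 'next';
  if (/\bfastapi\b/i.test(project.pythonManifest ?? '')) return 'fastapi';
  return 'unknown';
}

// Assumed mapping from detected stack to the test runners to wire up.
const runnersFor: Record<Stack, string[]> = {
  astro: ['vitest', 'playwright'],
  next: ['vitest', 'playwright'],
  fastapi: ['pytest'],
  unknown: [],
};
```

For example, runnersFor[detectStack({ npmDependencies: { next: '14.0.0' } })] yields the Vitest and Playwright pair used in the rest of this article.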

Step-by-Step: From Zero to Gate

Let’s walk through a practical setup for a modern web application.

Phase 1: Activation and Scaffolding

The first step is to tell the Cody Master agent to prepare the workspace. You do this by activating the skill.

Command: activate_skill cm-test-gate

Once activated, you issue a directive to bootstrap the testing infrastructure: “Set up a test gate for this project. Ensure we have Playwright for the frontend and Vitest for the business logic.”

The agent will then:

1. Detect your package.json.
2. Install playwright and @playwright/test, plus vitest for the business-logic tests requested in the directive.
3. Create a playwright.config.ts if one doesn’t exist.
4. Scaffold the test/ directory.

Phase 2: Writing the High-Signal “Vibe” Test

In Vibe Coding, we don’t want 1,000 unit tests. We want 10 high-signal End-to-End (E2E) tests that cover the “Happy Path” and the “Critical Failure Path.”

Let’s say you’re building a dashboard. Your test/frontend-safety.test.ts should look like this:

import { test, expect } from '@playwright/test';

test.describe('Critical UI Vibe Check', () => {
  test('should load the homepage without console errors', async ({ page }) => {
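    // Collect console errors and uncaught exceptions raised while the page loads
    const logs: string[] = [];
    page.on('console', msg => {
      if (msg.type() === 'error') logs.push(msg.text());
    });
    page.on('pageerror', error => logs.push(error.message));

    // A relative URL resolves against baseURL in playwright.config.ts
    await page.goto('/');

    // The "Vibe" check: Ensure no major JS crashes occurred
    expect(logs).toHaveLength(0);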
    
    // The "Branding" check: Ensure the logo is visible
    const logo = page.locator('nav img#logo');
    await expect(logo).toBeVisible();
  });

  test('main navigation works', async ({ page }) => {
    await page.goto('/');
    await page.click('text=Features');
    await expect(page).toHaveURL(/.*features/);
    await expect(page.locator('h1')).toContainText('Our Features');
  });
});

Phase 3: Wiring the Script Gates

For cm-test-gate to be effective, it must be part of your package.json lifecycle. The agent will typically add or update your scripts:

{
  "scripts": {
    "test:gate": "npm run test:unit && npm run test:e2e",
    "test:unit": "vitest run",
    "test:e2e": "playwright test",
    "prepush": "npm run test:gate"
  }
}

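By naming the command test:gate, you create a single checkpoint that the agent can call before suggesting a git push. One caveat: npm only runs a prepush script before a script that is itself named push, not on git push, so to enforce the gate automatically you would wire npm run test:gate into a Git pre-push hook (for example with a tool like Husky).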
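The && in the test:gate script short-circuits: if the unit tests fail, the E2E suite never starts. The sketch below models that behavior in plain TypeScript so the ordering is explicit. GateStep, GateResult, and runGate are hypothetical names for illustration, and each step's run function is assumed to return a process-style exit code.

```typescript
// A step is a named command plus a function that runs it.
interface GateStep {
  name: string;
  run: () => number; // 0 means success, like a process exit code
}

interface GateResult {
  passed: boolean;
  failedStep?: string;
  executed: string[]; // names of the steps that were actually started
}

// Mirrors `a && b`: run steps in order and stop at the first non-zero exit code.
function runGate(steps: GateStep[]): GateResult {
  const executed: string[] = [];
  for (const step of steps) {
    executed.push(step.name);
    if (step.run() !== 0) {
      return { passed: false, failedStep: step.name, executed };
    }
  }
  return { passed: true, executed };
}
```

Putting the fast unit tests first keeps the feedback loop short: a broken business rule is reported in seconds, without waiting for a browser to launch.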

Integrating with the “Plan-Act-Validate” Cycle

The true power of cm-test-gate is realized when you combine it with your standard development lifecycle. Here is exactly how it addresses the “Hallucination Problem”:

1. Research: You ask the agent to add a “Dark Mode” toggle.
2. Strategy: The agent plans the CSS changes.
3. Execution (Act): The agent modifies tailwind.config.js and Header.tsx.
4. Validation (The Gate):
   - Instead of saying “I’ve changed the code,” the agent runs npm run test:gate.
   - The Playwright test opens a headless browser, clicks the toggle, and checks whether the html element has the dark class.
   - If the test fails (e.g., the AI forgot to export the useTheme hook), the agent must fix the error before it can report success to you.

This “Closed Loop” ensures that you only see finished work that actually functions.
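The closed loop can be sketched as a small control function: run the gate, attempt a fix on failure, and stop after a bounded number of tries. This is an illustration only, not how the agent is actually implemented; validateWithRetries, gatePasses, attemptFix, and the default of three attempts are all assumptions.

```typescript
interface LoopOutcome {
  status: 'passed' | 'gave-up';
  attempts: number; // how many times the gate was run
}

// Run the gate; on failure let the agent attempt a fix, up to maxAttempts gate runs.
function validateWithRetries(
  gatePasses: () => boolean,
  attemptFix: (attempt: number) => void,
  maxAttempts = 3,
): LoopOutcome {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (gatePasses()) return { status: 'passed', attempts: attempt };
    // No fix after the final failed run: hand control back to the human instead.
    if (attempt < maxAttempts) attemptFix(attempt);
  }
  return { status: 'gave-up', attempts: maxAttempts };
}
```

The attempt cap matters as much as the loop itself: a 'gave-up' outcome is the signal to stop the agent and read the failure output yourself rather than let it keep retrying.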

Best Practices & Tips for Intermediate Users

1. Use “Ephemeral” Test Data

In intermediate setups, avoid testing against your production database. Use a setup/ script within your test gate to seed a local SQLite database or a temporary Neon branch. cm-test-gate excels at managing these transient environments.

2. The “Hallucination Audit”

If you suspect the AI is giving you “fake” passes, ask cm-test-gate to perform a Hallucination Audit: “Run the test gate, but also take a screenshot of the failure states and save them to /tmp/vibe-audit. I want to see exactly what the browser sees.” This forces the agent to provide visual evidence of its claims.

3. Frontend Safety is Non-Negotiable

Always include a test that checks for 404s on your asset files. There is nothing worse than a beautiful UI with broken icons because the AI used the wrong path for an SVG.
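One way to approach the asset check is to record every network response during a page load and filter for failed asset requests. The helper below is a sketch under assumptions: SeenResponse, findBrokenAssets, and the list of asset extensions are hypothetical and should be adapted to your project.

```typescript
// Minimal record of a network response observed during a page load.
interface SeenResponse {
  url: string;
  status: number;
}

// Assumed list of static asset extensions; an optional query string or hash may follow.
const ASSET_EXTENSIONS = /\.(svg|png|jpe?g|gif|webp|ico|css|js|woff2?)(?:[?#].*)?$/i;

// Return the URLs of asset requests that came back with a 4xx or 5xx status.
function findBrokenAssets(responses: SeenResponse[]): string[] {
  return responses
    .filter(r => r.status >= 400 && ASSET_EXTENSIONS.test(r.url))
    .map(r => r.url);
}

// Inside a Playwright test, the responses could be collected like this:
//   const seen: SeenResponse[] = [];
//   page.on('response', res => seen.push({ url: res.url(), status: res.status() }));
//   await page.goto('/');
//   expect(findBrokenAssets(seen)).toEqual([]);
```

Keeping the filter separate from the browser wiring means the same helper can be reused across every page you check.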

4. Secret Scanning

In your security-scan.test.ts, use a simple grep script to ensure that no strings matching the pattern of a Stripe secret key (sk_test_... or sk_live_...) or an AWS access key exist in your dist/ or src/ folders. cm-test-gate can automate this as a “pre-deployment” check.
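As an alternative to shelling out to grep, the same check can be written as a small TypeScript function that a test can call directly. This is a heuristic sketch, not a complete secret scanner: scanForSecrets, SecretFinding, and SECRET_PATTERNS are hypothetical names, and the two regular expressions only approximate the common shapes of Stripe secret keys and AWS access key IDs.

```typescript
interface SecretFinding {
  file: string;
  line: number; // 1-based line number
  kind: string;
}

// Heuristic patterns (assumptions): they catch common key shapes, not every secret.
const SECRET_PATTERNS: { kind: string; pattern: RegExp }[] = [
  { kind: 'stripe-secret-key', pattern: /sk_(?:test|live)_[0-9a-zA-Z]{10,}/ },
  { kind: 'aws-access-key-id', pattern: /\bAKIA[0-9A-Z]{16}\b/ },
];

// Scan a map of file path -> file contents and report every matching line.
function scanForSecrets(files: Record<string, string>): SecretFinding[] {
  const findings: SecretFinding[] = [];
  for (const file of Object.keys(files)) {
    files[file].split('\n').forEach((text, index) => {
      for (const { kind, pattern } of SECRET_PATTERNS) {
        if (pattern.test(text)) findings.push({ file, line: index + 1, kind });
      }
    });
  }
  return findings;
}
```

In a real test you would read the contents of src/ and dist/ from disk, pass them in, and assert that the returned list is empty.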

Troubleshooting Common Issues

The “Flaky Vibe”: Sometimes E2E tests fail because the dev server didn’t start fast enough.

Fix: Use Playwright’s webServer config option to automatically manage the startup and shutdown of your app during tests.
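A minimal playwright.config.ts using webServer might look like the sketch below. The dev command, port, and two-minute timeout are assumptions; swap in whatever your project actually uses.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    // Lets tests call page.goto('/') with a relative path.
    baseURL: 'http://localhost:3000',
  },
  webServer: {
    // Assumed dev command and port; adjust to your project.
    command: 'npm run dev',
    url: 'http://localhost:3000',
    // Reuse a server you already have running locally, but start a fresh one in CI.
    reuseExistingServer: !process.env.CI,
    // Give a slow dev server up to two minutes to become reachable.
    timeout: 120_000,
  },
});
```

With this in place, Playwright starts the server, waits until the URL responds, runs the tests, and shuts the server down afterwards, which removes the startup race behind most of these flaky failures.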

The “Infinite Loop”: If the AI keeps trying to fix a test but keeps failing, the “Vibe” is fundamentally broken.

Fix: Stop the agent. Use read_file on the reports in the test-results/ folder to see the error log yourself. Often, the AI is trying to click an element that it renamed in the last turn without updating the test to match.

Conclusion: Mastering the Gate

In Vibe Coding, speed is your engine, but cm-test-gate is your brakes. Without brakes, you will eventually crash into a production outage. By setting up a disciplined test gate, you transform your AI agent from a “Code Generator” into a “Software Engineer.”

You no longer have to fear the “Big Refactor” or the “Quick Fix.” You can prompt with confidence, knowing that if the AI’s “Vibe” is off, the Gate will catch it before your users do.

Your next step: Open your terminal, run activate_skill cm-test-gate, and ask your agent to “Secure the Vibe.” It’s time to move fast without breaking things.