1. What is Gherkin and Why Use It for SDD?
Gherkin is a structured, plain-English specification language that describes software behavior in a format both humans and machines can understand. Originally created for Behavior-Driven Development (BDD) tools like Cucumber, Gherkin has become the lingua franca of software specifications — and the foundation of the Don Cheli SDD framework.
The core of Gherkin is the Given / When / Then syntax, which maps directly to the three concerns of any software behavior:
- Given — The preconditions. What state is the system in before the action? (context)
- When — The trigger. What action or event occurs? (stimulus)
- Then — The observable outcome. What should the system do in response? (assertion)
```gherkin
Feature: User Registration
  As a new visitor
  I want to create an account
  So that I can access the platform

  Scenario: Successful registration with valid data
    Given no account exists with email "user@example.com"
    When I submit registration with name "Jane Doe", email "user@example.com", and password "S3cur3P@ss!"
    Then the system creates a new user account
    And sends a verification email to "user@example.com"
    And returns HTTP 201 with the user's public profile
```
The power of Gherkin for SDD goes far beyond documentation. When you write a Gherkin specification, you are making explicit, testable contracts about system behavior. Every scenario is an executable test. Every Given maps to a test fixture. Every Then maps to an assertion. This means your specification is your test suite — written before a single line of production code exists.
"Gherkin bridges the gap between business intent and engineering implementation. When a product manager writes 'users should be able to log in,' a Gherkin scenario turns that into an unambiguous, verifiable contract."
In the context of SDD, Gherkin provides three critical advantages:
- Unambiguity — Natural language is notoriously imprecise. "The system should respond quickly" means nothing testable. "Then the API responds within 200ms" means everything. Gherkin forces precision.
- Executability — Gherkin scenarios can be directly mapped to test code using step definitions. Your spec becomes your test harness.
- AI-readability — Large language models like Claude parse Gherkin with exceptional accuracy. When you feed a Gherkin spec to an AI coding agent, it generates far more accurate code than when given a loose English description.
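The executability point can be made concrete: a step definition is just a pattern that binds a Gherkin line to code. The sketch below is a minimal, self-contained illustration of the idea in plain JavaScript — a real project would use a BDD runner such as Cucumber rather than this hand-rolled registry, and the step text here is hypothetical:

```javascript
// Minimal step-definition registry: maps Gherkin step patterns to functions.
// Illustrative only -- real projects use @cucumber/cucumber or similar.
const steps = [];

function defineStep(pattern, fn) {
  steps.push({ pattern, fn });
}

// Find the first pattern matching a Gherkin line and run its function.
function runStep(line, world) {
  for (const { pattern, fn } of steps) {
    const match = line.match(pattern);
    if (match) return fn(world, ...match.slice(1));
  }
  throw new Error(`Undefined step: ${line}`);
}

// A Given maps to a test fixture: it prepares state in the shared "world".
defineStep(/^no account exists with email "(.+)"$/, (world, email) => {
  world.accounts = world.accounts || {};
  delete world.accounts[email];
});

// A Then maps to an assertion against that state.
defineStep(/^the system creates a new user account$/, (world) => {
  if (!world.lastCreatedAccount) throw new Error('No account was created');
});

const world = {};
runStep('no account exists with email "user@example.com"', world);
console.log(Object.keys(world.accounts).length); // 0 -- fixture state prepared
```

The same mechanism scales up: each Given builds fixtures, each When performs the action, and each Then asserts, which is exactly why a Gherkin file doubles as a test suite.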
2. The Gherkin-First Workflow
The Don Cheli SDD framework centers the entire development process around a single command: /dc:specify. This command initiates the Gherkin-first workflow, which means no planning, no architecture, and no code until you have a validated Gherkin specification.
Starting a Specification Session
When you run /dc:specify, the framework guides you through a structured interview process. It asks targeted questions about:
- The business goal of the feature
- The primary actors (users, systems, external services)
- The happy path scenarios
- The edge cases and failure modes
- The non-functional requirements (performance, security, availability)
From your answers, the framework generates a complete .feature file following Gherkin syntax. But it doesn't stop there: it also auto-generates a DBML schema by analyzing the data entities referenced in your scenarios.
```bash
# Run the specify command
/dc:specify

# The framework will ask:
# > What feature are you building?
# > Who are the primary users?
# > What is the main success scenario?
# > What can go wrong?
# > What are the performance requirements?

# Output: /specs/feature-name/spec.feature + /specs/feature-name/schema.dbml
```
Auto-Generated DBML Schema
One of the most powerful aspects of the Gherkin-first workflow is automatic schema generation. When your Gherkin spec references entities like "a registered user," "an active subscription," or "a product with SKU," the framework identifies these domain objects and generates a corresponding DBML (Database Markup Language) schema:
```dbml
// Auto-generated from spec.feature by /dc:specify
// Do not edit manually — regenerate with /dc:specify --regen-schema

Table users {
  id uuid [pk, default: `gen_random_uuid()`]
  email varchar(255) [unique, not null]
  name varchar(100) [not null]
  password_hash varchar(255) [not null]
  email_verified_at timestamp
  created_at timestamp [default: `now()`]
  updated_at timestamp [default: `now()`]

  indexes {
    email [unique]
  }
}

Table verification_tokens {
  id uuid [pk, default: `gen_random_uuid()`]
  user_id uuid [ref: > users.id]
  token varchar(255) [unique, not null]
  expires_at timestamp [not null]
  used_at timestamp
}
```
This tight coupling between the specification and the data model eliminates an entire class of bugs: those caused by a mismatch between what the spec says and what the database actually stores.
3. Writing Effective Gherkin Specifications
Not all Gherkin is equal. Poorly written Gherkin is verbose, brittle, and difficult to maintain. Well-written Gherkin is concise, declarative, and serves as living documentation. Here are the core principles the Don Cheli framework enforces.
Principle 1: Describe Behavior, Not Implementation
Bad Gherkin describes how the system works internally. Good Gherkin describes what the system does from the user's perspective.
```gherkin
# BAD: Implementation-coupled (fragile, hard to maintain)
Scenario: Login
  Given I open the MySQL connection to the users table
  When I run SELECT * WHERE email = "user@example.com"
  Then I get a row back and set the JWT using HS256

# GOOD: Behavior-focused (resilient, meaningful)
Scenario: Successful login with valid credentials
  Given a registered user with email "user@example.com" and password "MyP@ssw0rd"
  When they submit a POST request to /auth/login with their credentials
  Then they receive HTTP 200
  And the response body contains a "token" field
  And the token is a valid JWT expiring in 24 hours
```
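The final step asserts a 24-hour expiry, and checking that in a step definition only requires decoding the JWT payload's standard `exp` claim. The helpers below are hypothetical sketches (this article's later test code assumes helpers with these names), not a library API — production code should verify signatures with a real JWT library, which this sketch deliberately does not do:

```javascript
// Hypothetical helpers for asserting JWT shape and expiry in a step definition.
// A JWT is three base64url segments: header.payload.signature.
// NOTE: this only decodes the payload; it does NOT verify the signature.
function isValidJWT(token) {
  const parts = token.split('.');
  if (parts.length !== 3) return false;
  try {
    JSON.parse(Buffer.from(parts[1], 'base64url').toString('utf8'));
    return true;
  } catch {
    return false;
  }
}

function getJWTExpiry(token) {
  const payload = JSON.parse(Buffer.from(token.split('.')[1], 'base64url').toString('utf8'));
  return payload.exp * 1000; // exp claim is seconds since epoch; convert to ms
}

// Example: an unsigned token expiring 24 hours from now
const exp = Math.floor(Date.now() / 1000) + 86400;
const payload = Buffer.from(JSON.stringify({ sub: 'user@example.com', exp })).toString('base64url');
const token = `eyJhbGciOiJub25lIn0.${payload}.`;
console.log(isValidJWT(token)); // true
```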
Principle 2: One Scenario, One Behavior
Each Gherkin scenario should test exactly one behavior. If a scenario has more than 5-6 steps, it is almost always testing multiple behaviors bundled together. The framework's Auto-QA system flags scenarios that exceed the complexity threshold.
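A check like the Auto-QA complexity flag is straightforward to sketch: count the step lines in each scenario and flag any that exceed the threshold. The parser below is a simplified illustration of the idea, not the framework's actual implementation:

```javascript
// Simplified sketch of a scenario-complexity check: flag scenarios whose
// step count (Given/When/Then/And/But lines) exceeds a threshold.
function flagComplexScenarios(featureText, maxSteps = 6) {
  const flagged = [];
  let current = null;
  let count = 0;
  for (const raw of featureText.split('\n')) {
    const line = raw.trim();
    if (line.startsWith('Scenario')) {
      if (current && count > maxSteps) flagged.push({ scenario: current, steps: count });
      current = line;
      count = 0;
    } else if (/^(Given|When|Then|And|But)\b/.test(line)) {
      count++;
    }
  }
  // Flush the last scenario in the file.
  if (current && count > maxSteps) flagged.push({ scenario: current, steps: count });
  return flagged;
}
```

A scenario flagged this way is usually hiding two or more behaviors that should be split into separate scenarios.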
Principle 3: Use Background for Shared Context
When multiple scenarios share the same preconditions, use a Background block instead of repeating Given steps:
```gherkin
Feature: Product Catalog API

  Background:
    Given the API is running at "https://api.example.com/v1"
    And the database contains 50 active products
    And I am authenticated as an admin user

  Scenario: List products with default pagination
    When I GET /products
    Then the response status is 200
    And the response contains 20 products
    And the response includes a "nextCursor" field

  Scenario: Filter products by category
    When I GET /products?category=electronics
    Then the response status is 200
    And all returned products have category "electronics"

  Scenario: Search products by name
    When I GET /products?search=laptop
    Then the response status is 200
    And all returned product names contain "laptop" (case-insensitive)
```
Principle 4: Cover Authentication and Authorization Explicitly
Security scenarios are non-negotiable in the Don Cheli framework. Every protected endpoint must have explicit Gherkin scenarios for both authenticated and unauthenticated access:
```gherkin
Feature: API Authentication Guards

  Scenario: Reject request without Authorization header
    Given no Authorization header is set
    When I GET /api/protected-resource
    Then the response status is 401
    And the response body contains error code "MISSING_TOKEN"

  Scenario: Reject request with expired JWT
    Given an Authorization header with an expired JWT token
    When I GET /api/protected-resource
    Then the response status is 401
    And the response body contains error code "TOKEN_EXPIRED"

  Scenario: Reject request with insufficient permissions
    Given I am authenticated as a "viewer" role user
    When I DELETE /api/resource/123
    Then the response status is 403
    And the response body contains error code "INSUFFICIENT_PERMISSIONS"
```
Principle 5: Specify Data Validation Boundaries
Use Scenario Outline with an Examples table to test data validation across multiple inputs efficiently:
```gherkin
Scenario Outline: Reject registration with invalid email formats
  Given no existing account with email "<email>"
  When I POST /auth/register with email "<email>"
  Then the response status is 422
  And the response contains validation error for field "email"

  Examples:
    | email               |
    | notanemail          |
    | @missing-local.com  |
    | missing@.com        |
    | two@@at.com         |
    | spaces in@email.com |
```
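The same Examples table can drive a plain unit test once the step definitions exist. Here is a sketch with a deliberately simple validator — the regex below is illustrative only, and real registration code should delegate email validation to a vetted library rather than a hand-written pattern:

```javascript
// Illustrative email format check -- NOT a complete RFC 5322 validator.
// Rejects a missing local part, leading-dot domains, double @, and spaces.
function isPlausibleEmail(email) {
  return /^[^\s@]+@[^\s@.][^\s@]*\.[^\s@]+$/.test(email);
}

// The rows from the Examples table above, all expected to be rejected.
const invalidExamples = [
  'notanemail',
  '@missing-local.com',
  'missing@.com',
  'two@@at.com',
  'spaces in@email.com',
];

for (const email of invalidExamples) {
  console.log(email, '->', isPlausibleEmail(email)); // all false
}
console.log('user@example.com', '->', isPlausibleEmail('user@example.com')); // true
```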
4. From Specification to Implementation: The Pipeline
Once your Gherkin specification is written and validated, it flows through the Don Cheli 6-phase SDD pipeline. Each phase consumes the output of the previous one, creating a tight dependency chain that prevents shortcuts.
| Phase | Command | Input | Output |
|---|---|---|---|
| 1. Specify | /dc:specify | Business requirements | spec.feature + schema.dbml |
| 2. Clarify | /dc:clarify | spec.feature | Resolved ambiguities, updated spec |
| 3. Plan | /dc:tech-plan | Validated spec | Architecture blueprint, ADRs |
| 4. Breakdown | /dc:breakdown | Blueprint | TDD task list with parallelism markers |
| 5. Implement | /dc:implement | TDD tasks | Passing tests + production code |
| 6. Review | /dc:review | Implemented code | 7-dimension review report, merge approval |
The critical insight is that every phase traces back to the Gherkin spec. The architect references the spec when choosing technologies. The task breakdown references specific scenario IDs. The TDD tests are step definitions of the Gherkin scenarios. The review verifies spec compliance. Nothing floats free from the original specification.
Spec Traceability in Practice
Every TDD task generated by /dc:breakdown is tagged with a scenario reference:
```javascript
// Task: TASK-007
// Spec reference: spec.feature#Scenario:Successful-login
// Iron Law: TDD required
// Parallelizable: YES (no dependency on TASK-008)

// RED phase (write failing test first)
describe('POST /auth/login', () => {
  it('returns 200 and JWT token for valid credentials', async () => {
    // Given: a registered user
    const user = await createUser({ email: 'dev@test.com', password: 'P@ssw0rd!' });

    // When: they submit valid credentials
    const res = await request(app)
      .post('/auth/login')
      .send({ email: 'dev@test.com', password: 'P@ssw0rd!' });

    // Then: they receive a JWT token
    expect(res.status).toBe(200);
    expect(res.body).toHaveProperty('token');
    expect(isValidJWT(res.body.token)).toBe(true);
    expect(getJWTExpiry(res.body.token)).toBeCloseTo(Date.now() + 86400000, -3);
  });
});
```
5. Auto-QA: Catching Ambiguities Before They Become Bugs
The /dc:clarify command runs the framework's Auto-QA system, which performs an 8-check multi-layer validation on your Gherkin specification. This runs automatically after /dc:specify and blocks the pipeline until all ambiguities are resolved.
The 8-Check Validation Layer
- Completeness check — Are there scenarios for all identified actors and use cases? Missing scenarios are flagged with the specific gap.
- Ambiguity check — Does any step contain vague language ("quickly," "correctly," "properly")? Every vague term must be replaced with a measurable criterion.
- Contradiction check — Do any two scenarios specify conflicting behaviors for the same input? Contradictions are surfaced before a single line of code is written.
- Boundary check — Are edge cases covered? The framework checks for missing null/empty inputs, maximum values, concurrent access, and error conditions.
- Security check — Does the spec address OWASP Top 10 relevant to the feature? Missing authentication, authorization, or injection scenarios are flagged.
- Idempotency check — For mutating operations, are retry/duplicate scenarios specified? A missing idempotency scenario is a common source of production bugs.
- Data consistency check — Does the spec's data model (inferred from Gherkin entities) match the auto-generated DBML schema? Divergences are flagged immediately.
- Nyquist Validation — Are there at least N×2 scenarios for a feature with N primary behaviors? Inspired by the Nyquist sampling theorem, this heuristic ensures sufficient coverage density to catch aliased bugs.
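The Nyquist heuristic itself reduces to a simple arithmetic rule. A sketch of the rule exactly as described above — not the framework's actual code:

```javascript
// Nyquist-style coverage heuristic: a feature with N primary behaviors
// needs at least 2*N scenarios before the pipeline unblocks.
function nyquistCheck(primaryBehaviors, scenarioCount) {
  const required = 2 * primaryBehaviors;
  return {
    pass: scenarioCount >= required,
    required,
    deficit: Math.max(0, required - scenarioCount),
  };
}

// The failing case from the clarify report below: 8 scenarios, 6 behaviors.
console.log(nyquistCheck(6, 8)); // { pass: false, required: 12, deficit: 4 }
```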
"The best time to catch a bug is before you write the code. Nyquist Validation gives us a mathematical framework for knowing when our specification has enough resolution to prevent the bugs we haven't thought of yet."
A Clarify Session in Action
After running /dc:clarify, the framework produces a structured report like this:
```text
/dc:clarify output for spec.feature

CHECK 1 (Completeness): PASS - 8 scenarios covering 4 actors
CHECK 2 (Ambiguity): FAIL
  - Line 23: "the system responds quickly" -> define SLA in milliseconds
  - Line 47: "valid credentials" -> specify exact validation rules for password
CHECK 3 (Contradiction): PASS
CHECK 4 (Boundary): FAIL
  - Missing: scenario for password with only spaces
  - Missing: scenario for email with Unicode characters
  - Missing: scenario for concurrent login attempts (race condition)
CHECK 5 (Security): FAIL
  - Missing: brute-force protection scenario (rate limiting)
  - Missing: account lockout after N failed attempts
CHECK 6 (Idempotency): PASS - login is naturally idempotent
CHECK 7 (Data Consistency): PASS - DBML schema aligns with entities
CHECK 8 (Nyquist): FAIL - 8 scenarios for 6 behaviors (need ≥12)

Pipeline BLOCKED. Resolve 4 issues to proceed to /dc:tech-plan.
```
This blocking behavior is intentional and critical. By refusing to let the pipeline advance until ambiguities are resolved, the framework prevents the most expensive category of bugs: those that arise from incomplete thinking at the specification stage.
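The ambiguity check (CHECK 2 above) can be approximated with a simple term scan over the spec text. This sketch illustrates the idea with a hypothetical term list; the framework's actual scanner is presumably more sophisticated:

```javascript
// Flag vague, untestable terms in a spec, reporting 1-based line numbers.
// The term list is illustrative, not the framework's actual dictionary.
const VAGUE_TERMS = ['quickly', 'correctly', 'properly', 'fast', 'soon', 'user-friendly'];

function detectAmbiguity(specText) {
  const findings = [];
  specText.split('\n').forEach((line, i) => {
    for (const term of VAGUE_TERMS) {
      if (line.toLowerCase().includes(term)) {
        findings.push({ line: i + 1, term, text: line.trim() });
      }
    }
  });
  return findings;
}

const spec = 'Scenario: Login\n  Then the system responds quickly';
console.log(detectAmbiguity(spec)); // [{ line: 2, term: 'quickly', ... }]
```

Note how the measurable rewrite ("responds within 200ms") would pass this scan while the vague original fails it — exactly the property the pipeline blocks on.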
6. Real Example: Building a REST API with Gherkin SDD
Let's walk through a complete Gherkin SDD workflow for a simple task management REST API. We'll go from specification to a working implementation following the full pipeline.
Step 1: Write the Gherkin Specification
```gherkin
Feature: Task Management API
  As a project member
  I want to create, read, update, and delete tasks
  So that my team can track work progress

  Background:
    Given the API is running at "https://api.example.com/v1"
    And I am authenticated with a valid JWT token as user "alice@team.com"

  # --- CREATE ---

  Scenario: Create a task with all required fields
    When I POST /tasks with body:
      """
      { "title": "Implement login", "priority": "high", "dueDate": "2026-04-01" }
      """
    Then the response status is 201
    And the response body contains a "taskId" field (UUID)
    And the task is persisted in the database
    And the response body "createdBy" equals "alice@team.com"

  Scenario: Reject task creation with missing title
    When I POST /tasks with body:
      """
      { "priority": "high", "dueDate": "2026-04-01" }
      """
    Then the response status is 422
    And the response body contains error code "VALIDATION_ERROR"
    And the error details list "title" as required

  Scenario: Reject task creation with past dueDate
    When I POST /tasks with body:
      """
      { "title": "Old task", "dueDate": "2020-01-01" }
      """
    Then the response status is 422
    And the response body contains error code "INVALID_DUE_DATE"

  # --- READ ---

  Scenario: Retrieve an existing task
    Given a task exists with id "task-123" belonging to the current user
    When I GET /tasks/task-123
    Then the response status is 200
    And the response body contains "taskId", "title", "priority", "status", "createdBy"

  Scenario: Return 404 for non-existent task
    Given no task exists with id "task-999"
    When I GET /tasks/task-999
    Then the response status is 404
    And the response body contains error code "TASK_NOT_FOUND"

  Scenario: Prevent access to another user's task
    Given a task with id "task-456" belongs to user "bob@team.com"
    When I GET /tasks/task-456 as user "alice@team.com"
    Then the response status is 403
    And the response body contains error code "ACCESS_DENIED"

  # --- UPDATE ---

  Scenario: Update task status to completed
    Given a task exists with id "task-123" and status "in_progress"
    When I PATCH /tasks/task-123 with body: { "status": "completed" }
    Then the response status is 200
    And the response body "status" equals "completed"
    And the response body "completedAt" is a valid ISO timestamp

  # --- DELETE ---

  Scenario: Soft-delete a task
    Given a task exists with id "task-123"
    When I DELETE /tasks/task-123
    Then the response status is 204
    And the task is marked as deleted (soft delete) in the database
    And subsequent GET /tasks/task-123 returns 404
```
Step 2: Run Auto-QA with /dc:clarify
The framework runs the 8 checks. It flags two missing scenarios: one for concurrent task creation (race condition check) and one for listing tasks with pagination. We add those scenarios and re-run. All checks pass. The pipeline unblocks.
Step 3: Generate the Technical Plan
Running /dc:tech-plan produces an architecture blueprint. The framework recommends Express.js + PostgreSQL (given the relational nature of the DBML schema), specifies the folder structure, and records Architecture Decision Records (ADRs) for key choices like soft deletes and JWT-based auth.
Step 4: Break Down into TDD Tasks
/dc:breakdown generates 12 TDD tasks from the specification, marking 7 of them as parallelizable. Each task carries its scenario reference. For example:
```text
TASK-001 [spec: Scenario:Create-task-all-required] [parallel: YES]
  RED:      Write failing test for POST /tasks -> 201 + taskId
  GREEN:    Implement POST /tasks handler
  REFACTOR: Extract validation middleware

TASK-002 [spec: Scenario:Reject-missing-title] [parallel: YES]
  RED:      Write failing test for POST /tasks -> 422 missing title
  GREEN:    Add title validation
  REFACTOR: Consolidate validation errors

TASK-003 [spec: Scenario:Retrieve-existing-task] [parallel: YES]
  RED:      Write failing test for GET /tasks/:id -> 200
  GREEN:    Implement GET /tasks/:id handler
  REFACTOR: Extract task serializer

... (9 more tasks)
```
Step 5: Implement with TDD
Running /dc:implement TASK-001 inside a Docker container, the AI agent writes the failing test first, confirms it fails, then writes the minimum production code to make it pass, then refactors. The entire cycle is auditable and reproducible.
```javascript
// RED: failing test (TASK-001)
test('POST /tasks creates a task and returns 201', async () => {
  const res = await supertest(app)
    .post('/tasks')
    .set('Authorization', `Bearer ${aliceToken}`)
    .send({ title: 'Implement login', priority: 'high', dueDate: '2026-04-01' });

  expect(res.status).toBe(201); // FAILS (no handler yet)
  expect(res.body.taskId).toMatch(/^[0-9a-f-]{36}$/); // UUID format
  expect(res.body.createdBy).toBe('alice@team.com');
});

// GREEN: minimal implementation
app.post('/tasks', authenticate, async (req, res) => {
  const { title, priority, dueDate } = req.body;
  const task = await db.tasks.create({
    title,
    priority: priority ?? 'normal',
    dueDate: new Date(dueDate),
    createdBy: req.user.email,
  });
  res.status(201).json({ taskId: task.id, createdBy: task.createdBy });
});

// REFACTOR: add input validation middleware, extract task serializer
```
Step 6: 7-Dimension Review
Before merging, /dc:review evaluates the implementation across seven dimensions. For our task API, the review surfaces one issue: the POST /tasks handler doesn't sanitize dueDate against SQL injection in the raw query fallback path. The code is sent back for a targeted fix before merge approval is granted.
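A targeted fix for that finding would validate and normalize dueDate before the value ever reaches a query, and pass it only through parameterized queries. The validator below is an illustrative sketch (the error code matches the INVALID_DUE_DATE scenario in the spec; the function name is hypothetical):

```javascript
// Illustrative dueDate validation: reject unparseable or past dates before
// the value reaches the persistence layer. Passing the resulting Date object
// to a parameterized query avoids string concatenation into SQL entirely.
function validateDueDate(input) {
  const date = new Date(input);
  if (Number.isNaN(date.getTime())) {
    return { ok: false, code: 'INVALID_DUE_DATE' }; // unparseable input
  }
  if (date.getTime() < Date.now()) {
    return { ok: false, code: 'INVALID_DUE_DATE' }; // due date in the past
  }
  return { ok: true, date };
}

console.log(validateDueDate('2020-01-01').ok); // false -- past date
console.log(validateDueDate('not-a-date').ok); // false -- unparseable
```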
7. Getting Started
You can start using Gherkin-first SDD with Don Cheli in under 60 seconds. The framework installs globally and works with any existing project — you don't need to rewrite anything.
```bash
# Option 1: npx (no global install required)
npx don-cheli-sdd init

# Option 2: global install via npm
npm install -g don-cheli-sdd
dc init

# Option 3: clone and install
git clone https://github.com/doncheli/don-cheli-sdd.git
cd don-cheli-sdd && bash scripts/instalar.sh --global

# Initialize SDD in your existing project
/dc:init --type api --name "my-task-api"

# Write your first Gherkin specification
/dc:specify

# Run Auto-QA on your spec
/dc:clarify

# Launch the full pipeline
/dc:start
```
The framework works natively with Claude Code (full 72+ command support), Google Gemini (via 14 skills and 9 workflows), Cursor IDE (via .cursorrules), and any agent that reads AGENTS.md.
For Gherkin-specific workflows, the most useful commands are:
- /dc:specify — Start a new Gherkin specification session with guided interview
- /dc:clarify — Run Auto-QA (8-check validation + Nyquist) on current spec
- /dc:validate-spec — Validate spec against DBML schema for consistency
- /dc:detect-ambiguity — Standalone ambiguity scanner (subset of clarify)
- /dc:breakdown — Generate TDD task list from validated Gherkin spec
Traditional BDD vs. Gherkin SDD
| Aspect | Traditional BDD | Gherkin SDD (Don Cheli) |
|---|---|---|
| Who writes specs? | QA + BA collaboration | Developer + AI agent (guided interview) |
| Spec validation | Manual peer review | Automated 8-check + Nyquist Validation |
| Schema generation | Separate DBA process | Auto-generated DBML from Gherkin entities |
| Pipeline integration | Standalone (Cucumber/SpecFlow) | Feeds directly into 6-phase SDD pipeline |
| TDD enforcement | Optional | Iron Law — pipeline is blocked without it |
| Security scenarios | Not enforced | OWASP check flags missing security specs |
| Spec traceability | Manual tagging | Every task auto-tagged with scenario ID |
| AI integration | Not native | AI reads Gherkin to generate code + tests |
Gherkin SDD is not a replacement for traditional BDD tooling — it's an evolution. You can still use Cucumber, Playwright, or Behave to execute your feature files. What the Don Cheli framework adds is the scaffolding that makes Gherkin the authoritative source of truth for the entire development lifecycle, not just a documentation artifact that gets out of sync with the real code.
The result is software that is demonstrably correct relative to its specification, auditable at every step, and maintainable over time because every behavior is documented in a language that business stakeholders and engineers share equally.
Ready to build with Gherkin SDD?
Get the Don Cheli SDD framework and start writing specifications that feed directly into a validated, TDD-enforced pipeline — free and open source.
Get Started on GitHub →