Describe what to test. Get production-grade test code.

Seven test automation platforms, one architecture. Selenium, Playwright, API, Docker, DeepEval, SSH, Database. Every platform follows the same 5-layer pattern. Every engineer writes the same way. Every test is maintainable.

Each QA Platform is an agent harness - a kernel-governed runtime for a specific test discipline. Drop a domain spec in, get a working agent that runs Selenium / Playwright / Docker / SSH tests with mechanical quality gates.

Each platform is a harness built on the Isagawa Kernel. See /kernel.html for the underlying governance model.

The Problem

UI tests are brittle. Every engineer writes them differently.

AI-generated tests make it worse. More code, same chaos, no architecture. Selectors break on every deploy. Test suites become unmaintainable. Teams can't share patterns. Every framework is different.

How do you keep locators in one place?
How do you ensure every engineer follows the same pattern?
How do you make AI-generated tests maintainable, not chaotic?

The Architecture

Five layers. One pattern. Seven platforms.

Inspired by the Screenplay Pattern and redesigned for AI-generated code. The strict layer separation is what gives the AI the discipline to produce maintainable code every time, not just the first time. Every test follows the same architecture, whether you are testing web UI with Selenium or Playwright, APIs, databases, container images, LLM outputs, or SSH configurations. No inheritance. Composition all the way down. State checks in one place. Constants isolated. Tests stay clean.

Interface

Connection management. Browser driver setup, SSH sessions, Docker clients, database connection pools. Retry logic, timeout handling, and connection pooling live here. No business logic. The rest of the stack never touches the transport layer directly.
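As a rough illustration of what an Interface layer might look like, here is a minimal sketch that owns one connection and its retry policy. The class name `RetryingInterface`, the `flaky_connect` factory, and the choice to retry on `ConnectionError` are assumptions made for this example, not the platform's actual API.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class RetryingInterface(Generic[T]):
    """Hypothetical Interface-layer wrapper: owns one connection and its retry policy."""

    def __init__(self, connect: Callable[[], T], retries: int = 3, delay_s: float = 0.0) -> None:
        self._connect = connect      # factory that opens the transport (driver, SSH session, ...)
        self._retries = retries      # maximum number of connection attempts
        self._delay_s = delay_s      # pause after each failed attempt
        self._conn: Optional[T] = None

    def open(self) -> T:
        """Return the cached connection, or create it, retrying on ConnectionError."""
        if self._conn is not None:
            return self._conn
        last_error: Optional[Exception] = None
        for _ in range(self._retries):
            try:
                self._conn = self._connect()
                return self._conn
            except ConnectionError as exc:
                last_error = exc
                time.sleep(self._delay_s)
        raise ConnectionError(f"gave up after {self._retries} attempts") from last_error


# Assumed stand-in transport: fails twice, then succeeds.
attempts = {"count": 0}


def flaky_connect() -> dict:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transport not ready")
    return {"session": "open"}


interface = RetryingInterface(flaky_connect, retries=3)
connection = interface.open()
```

Because the retry loop lives in one class, the layers above it can ask for a connection without knowing how many attempts it took.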

Domain Object

One class per domain concept. Page Objects for UI, Validators for SSH, Image Objects for Docker, Metric Objects for LLM eval, Table Objects for database. Constants and state-check methods live here. Return self for fluent chaining.
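A Page Object in this style could look roughly like the sketch below. `LoginPage`, its locator values, and the `FakeDriver` stand-in (used so the example runs without a browser) are all hypothetical names chosen for illustration.

```python
from typing import Dict, List, Set, Tuple

Locator = Tuple[str, str]  # (strategy, value) pair


class FakeDriver:
    """Stand-in for the Interface layer so the sketch runs without a browser."""

    def __init__(self) -> None:
        self.typed: Dict[Locator, str] = {}
        self.clicked: List[Locator] = []
        self.visible: Set[Locator] = set()

    def type_into(self, locator: Locator, text: str) -> None:
        self.typed[locator] = text

    def click(self, locator: Locator) -> None:
        self.clicked.append(locator)

    def is_visible(self, locator: Locator) -> bool:
        return locator in self.visible


class LoginPage:
    """Hypothetical Page Object: locator constants and state checks live here."""

    USERNAME: Locator = ("id", "username")
    PASSWORD: Locator = ("id", "password")
    SUBMIT: Locator = ("css selector", "button[type='submit']")
    ERROR_BANNER: Locator = ("css selector", ".login-error")

    def __init__(self, driver: FakeDriver) -> None:
        self._driver = driver

    def enter_username(self, value: str) -> "LoginPage":
        self._driver.type_into(self.USERNAME, value)
        return self  # returning self enables fluent chaining

    def enter_password(self, value: str) -> "LoginPage":
        self._driver.type_into(self.PASSWORD, value)
        return self

    def submit(self) -> "LoginPage":
        self._driver.click(self.SUBMIT)
        return self

    def is_error_displayed(self) -> bool:
        """State check: tests assert on this instead of touching locators."""
        return self._driver.is_visible(self.ERROR_BANNER)


driver = FakeDriver()
page = LoginPage(driver)
result = page.enter_username("hr.manager").enter_password("s3cret").submit()
```

If a selector changes, only the constant on `LoginPage` needs to be updated; nothing above this layer refers to it.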

Task

Domain operations. Composes domain objects. One responsibility per method. No constants, no locators. Returns None. Orchestration lives in Roles, not Tasks.
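The Task rules above can be sketched as follows. `LoginTasks` and the recording `LoginPage` stand-in are hypothetical; the point is only that a Task composes a domain object, holds no locators, and returns None.

```python
from typing import List, Tuple


class LoginPage:
    """Minimal stand-in domain object that records calls instead of driving a browser."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str]] = []

    def enter_username(self, value: str) -> "LoginPage":
        self.calls.append(("username", value))
        return self

    def enter_password(self, value: str) -> "LoginPage":
        self.calls.append(("password", value))
        return self

    def submit(self) -> "LoginPage":
        self.calls.append(("submit", ""))
        return self


class LoginTasks:
    """Hypothetical Task class: one responsibility per method, no constants, no locators."""

    def __init__(self, login_page: LoginPage) -> None:
        self._login_page = login_page  # composition, not inheritance

    def enter_credentials(self, username: str, password: str) -> None:
        self._login_page.enter_username(username).enter_password(password)

    def submit_login(self) -> None:
        self._login_page.submit()


page = LoginPage()
tasks = LoginTasks(page)
returned = tasks.enter_credentials("hr.manager", "s3cret")
tasks.submit_login()
```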

Role

User workflows. Composes Tasks in sequence. One workflow method per role behavior. No return values. Tests call one role method and assert via domain object state-checks.
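A Role might compose Tasks along these lines. The `HRManagerRole` class, the two stand-in Task classes, and the shared log used to record call order are assumptions for this sketch.

```python
from typing import List


class LoginTasks:
    """Stand-in Task class; the shared log records the order of operations."""

    def __init__(self, log: List[str]) -> None:
        self._log = log

    def log_in(self, username: str, password: str) -> None:
        self._log.append(f"log_in:{username}")


class EmployeeTasks:
    """Stand-in Task class for employee operations."""

    def __init__(self, log: List[str]) -> None:
        self._log = log

    def open_new_employee_form(self) -> None:
        self._log.append("open_form")

    def fill_employee_details(self, name: str) -> None:
        self._log.append(f"fill:{name}")

    def save_employee(self) -> None:
        self._log.append("save")


class HRManagerRole:
    """Hypothetical Role: one workflow method that sequences Tasks and returns nothing."""

    def __init__(self, login: LoginTasks, employees: EmployeeTasks) -> None:
        self._login = login
        self._employees = employees

    def create_employee(self, username: str, password: str, employee_name: str) -> None:
        self._login.log_in(username, password)
        self._employees.open_new_employee_form()
        self._employees.fill_employee_details(employee_name)
        self._employees.save_employee()


log: List[str] = []
role = HRManagerRole(LoginTasks(log), EmployeeTasks(log))
outcome = role.create_employee("hr.manager", "s3cret", "Ada Lovelace")
```

The workflow reads as a sequence of named steps, which is what lets a test call a single role method.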

Test

Assertions and data. Load test data from JSON. Call one role workflow. Assert via domain object state-check methods. No orchestration. Clean, readable test code.
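A test in this shape could look like the sketch below. The inline JSON string stands in for a data file, and `EmployeeListPage` and `HRManagerRole` are simplified, hypothetical stand-ins so the example runs on its own; pytest would collect the `test_` function as written.

```python
import json
from typing import Dict, List

# Inline stand-in for a JSON data file; a real suite would read this from disk.
TEST_DATA_JSON = (
    '{"username": "hr.manager", "password": "s3cret", "employee_name": "Ada Lovelace"}'
)


class EmployeeListPage:
    """Stand-in domain object exposing one state-check method."""

    def __init__(self) -> None:
        self.employees: List[str] = []

    def has_employee(self, name: str) -> bool:
        return name in self.employees


class HRManagerRole:
    """Stand-in role; a real one would compose Tasks rather than touch the page."""

    def __init__(self, page: EmployeeListPage) -> None:
        self._page = page

    def create_employee(self, username: str, password: str, employee_name: str) -> None:
        self._page.employees.append(employee_name)


def test_hr_manager_creates_employee() -> None:
    # Data: loaded from JSON, never hard-coded in the test body.
    data: Dict[str, str] = json.loads(TEST_DATA_JSON)
    page = EmployeeListPage()
    role = HRManagerRole(page)

    # One role workflow call, no orchestration in the test.
    role.create_employee(data["username"], data["password"], data["employee_name"])

    # Assertion goes through the domain object's state-check method.
    assert page.has_employee(data["employee_name"])
```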

How It Works

Five steps. From description to passing tests.

You provide a persona, a URL, and a plain-English workflow. The AI agent handles everything else: credential strategy, element discovery, BDD scenario generation, code construction, and test execution. If a test fails, you triage it together.

User Input

Provide three things: the persona (who is testing), the target URL, and what to test in plain English. "HR Manager logs in and creates an employee." That is the entire input.
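To make the shape of that input concrete, the three fields could be modeled as a small value object. `WorkflowRequest`, its validation rules, and the example URL are hypothetical; the platform itself takes this input conversationally.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class WorkflowRequest:
    """Hypothetical container for the three inputs: persona, URL, workflow."""

    persona: str
    url: str
    workflow: str

    def __post_init__(self) -> None:
        # Reject blank fields early so later steps never see partial input.
        for field_name in ("persona", "url", "workflow"):
            if not getattr(self, field_name).strip():
                raise ValueError(f"{field_name} must not be empty")
        parsed = urlparse(self.url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"url must be an absolute http(s) URL: {self.url!r}")


request = WorkflowRequest(
    persona="HR Manager",
    url="https://hr.example.test/login",  # placeholder URL, assumed for the example
    workflow="HR Manager logs in and creates an employee.",
)
```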

Pre-flight

The agent resolves the credential strategy, validates access to the target application, and confirms the environment is ready. No test code is written until pre-flight passes.
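One way to picture the pre-flight gate is a set of named checks that must all pass before construction starts. The function names and the three check names below are assumptions for this sketch, not the agent's real internals.

```python
from typing import Callable, Dict, List

Check = Callable[[], bool]


def run_preflight(checks: Dict[str, Check]) -> List[str]:
    """Run every check and return the names of the ones that failed."""
    failures: List[str] = []
    for name, check in checks.items():
        try:
            passed = check()
        except Exception:
            passed = False  # a check that raises counts as a failure
        if not passed:
            failures.append(name)
    return failures


def begin_construction(checks: Dict[str, Check]) -> str:
    """Refuse to proceed unless pre-flight is clean."""
    failures = run_preflight(checks)
    if failures:
        raise RuntimeError("pre-flight failed: " + ", ".join(failures))
    return "construction may begin"


# Hypothetical checks mirroring the three pre-flight concerns.
checks: Dict[str, Check] = {
    "credentials_resolved": lambda: True,
    "target_reachable": lambda: True,
    "environment_ready": lambda: False,
}
failed = run_preflight(checks)
```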

AI Processing

The agent generates BDD scenarios from your description, defines expected states, then discovers every element on the target pages. Selectors, interactions, and state transitions are mapped automatically.

Construction

The agent reads the reference implementations and builds all five layers: an Interface for connection management, Domain Objects with locators, Tasks that compose them, Roles that orchestrate workflows, and Tests that assert via state-check methods. Every file follows the pattern.

Execution

Tests run via pytest. If they pass, you are done. If a test fails, the agent presents the failure for human-in-the-loop triage. You decide: fix the test, fix the app, or skip. No silent failures.
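The "no silent failures" rule can be sketched as a triage step that refuses to continue until every failed test has an explicit human decision. `TriageDecision`, `triage`, and the test names below are hypothetical.

```python
from enum import Enum
from typing import Dict, List


class TriageDecision(Enum):
    """The three outcomes a human can choose for a failing test."""

    FIX_TEST = "fix the test"
    FIX_APP = "fix the app"
    SKIP = "skip"


def triage(
    failed_tests: List[str], decisions: Dict[str, TriageDecision]
) -> Dict[str, TriageDecision]:
    """Return a decision per failed test; raise if any failure was left undecided."""
    undecided = [name for name in failed_tests if name not in decisions]
    if undecided:
        raise ValueError("no triage decision for: " + ", ".join(undecided))
    return {name: decisions[name] for name in failed_tests}


failed_tests = ["test_create_employee", "test_login_error_banner"]
decisions = {
    "test_create_employee": TriageDecision.FIX_APP,
    "test_login_error_banner": TriageDecision.FIX_TEST,
}
plan = triage(failed_tests, decisions)
```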

See It In Action

One command. Five layers. Passing tests.

Invoke /qa-workflow inside Claude Code. Provide a persona, a URL, and a workflow. The agent handles the rest.


The Platforms

Same architecture. Different stacks.

Whether you test web browsers, APIs, databases, container images, LLM outputs, or SSH configurations, the same 5-layer architecture applies. Each harness enforces the pattern mechanically - no drift, no exceptions. One architecture. Seven harnesses.

Platform Selenium

Stack: Python, Selenium, pytest

Tests: Web browsers via Selenium WebDriver

Use when: You need cross-browser Selenium tests with strict architecture

View on GitHub ↗

Platform Playwright

Stack: TypeScript, Playwright Test

Tests: Web UI, API endpoints, and hybrid workflows

Use when: You need TypeScript tests with full browser and API control

View on GitHub ↗

Platform API

Stack: TypeScript, Playwright APIRequestContext

Tests: REST and GraphQL API endpoints, contract testing

Use when: You need API response validation and contract testing

View on GitHub ↗

Platform Docker

Stack: Docker, pytest, Python

Tests: Container image compliance and behavior

Use when: You need infrastructure and container testing

View on GitHub ↗

Platform DeepEval

Stack: Python, DeepEval, LLM eval

Tests: LLM output quality and correctness

Use when: You need AI output validation and assertions

Contact for access ↗

Platform SSH

Stack: Python, SSH, compliance frameworks

Tests: SSH configuration and compliance validation

Use when: You need infrastructure compliance testing

View on GitHub ↗

Platform Database

Stack: Python, pyodbc, pytest

Tests: Query correctness, schema integrity, data migrations

Use when: You need to validate database state, stored procedures, or migration correctness

Contact for access ↗

Tech Stack

Python, TypeScript, Selenium, Playwright, pytest, Docker, ODBC, MCP

Results

7 Complete platforms

5-layer Architecture pattern

100% Code consistency

Built for Claude Code.

Who This Is For

QA LEAD / CONSISTENCY

QA leads

Evaluating test automation frameworks. Need a pattern every engineer follows. Consistent locator management. Maintainable AI-generated tests.

ENGINEERING MANAGER / PATTERNS

Engineering managers

Wanting consistent test patterns across teams. Reduce maintenance overhead. No more "everyone writes tests differently" chaos.

TEAM / AI ADOPTION

Teams adopting AI testing

Using AI agents to generate tests. Need guardrails, not chaos. A strict architecture that keeps AI-generated code clean and maintainable.