Seven test automation platforms, one architecture. Selenium, Playwright, API, Docker, DeepEval, SSH, Database. Every platform follows the same 5-layer pattern. Every engineer writes the same way. Every test is maintainable.
Each QA Platform is an agent harness - a kernel-governed runtime for a specific test discipline. Drop a domain spec in, get a working agent that runs Selenium / Playwright / Docker / SSH tests with mechanical quality gates.
Each platform is a harness built on the Isagawa Kernel. See /kernel.html for the underlying governance model.
The Problem
Selectors break on every deploy. Test suites become unmaintainable. Teams can't share patterns because every framework is different. AI-generated tests make it worse: more code, same chaos, no architecture.
The Architecture
Inspired by the Screenplay Pattern and redesigned for AI-generated code. Strict layer separation gives the AI the discipline to produce maintainable code every time, not just the first time. Every test follows the same architecture, whether you are testing web UI with Selenium or Playwright, API endpoints, container images, LLM outputs, SSH configurations, or databases. No inheritance. Composition all the way down. State checks in one place. Constants isolated. Tests stay clean.
Layer 1: Connection management. Browser driver setup, SSH sessions, Docker clients, database connection pools. Retry logic, timeout handling, and connection pooling live here. No business logic. The rest of the stack never touches the transport layer directly.
Layer 2: Domain Objects. One class per domain concept. Page Objects for UI, Validators for SSH, Image Objects for Docker, Metric Objects for LLM eval, Table Objects for database. Constants and state-check methods live here. Return self for fluent chaining.
Layer 3: Tasks. Domain operations. Composes domain objects. One responsibility per method. No constants, no locators. Returns None. Orchestration lives in Roles, not Tasks.
Layer 4: Roles. User workflows. Composes Tasks in sequence. One workflow method per role behavior. No return values. Tests call one role method and assert via domain object state-checks.
Layer 5: Tests. Assertions and data. Load test data from JSON. Call one role workflow. Assert via domain object state-check methods. No orchestration. Clean, readable test code.
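As a rough illustration of how the five layers described above might fit together, here is a minimal Python sketch. Every name in it (FakeBrowser, LoginPage, LoginTasks, HrManagerRole, the locators) is hypothetical and is not taken from the actual harnesses; an in-memory fake stands in for a real browser driver so the example runs without Selenium.

```python
import json


class FakeBrowser:
    """Layer 1 stand-in: owns the 'connection'. A real harness would wrap a driver here."""

    def __init__(self):
        self.fields = {}
        self.page = "login"

    def type(self, locator, text):
        self.fields[locator] = text

    def click(self, locator):
        # Pretend the app navigates to the dashboard after a valid login.
        if locator == "#submit" and self.fields.get("#password") == "secret":
            self.page = "dashboard"


class LoginPage:
    """Layer 2 domain object: locators (constants) and state checks live here."""

    USERNAME = "#username"
    PASSWORD = "#password"
    SUBMIT = "#submit"

    def __init__(self, browser):
        self._browser = browser

    def enter_username(self, name):
        self._browser.type(self.USERNAME, name)
        return self  # returning self allows fluent chaining

    def enter_password(self, password):
        self._browser.type(self.PASSWORD, password)
        return self

    def submit(self):
        self._browser.click(self.SUBMIT)
        return self

    def is_logged_in(self):
        return self._browser.page == "dashboard"


class LoginTasks:
    """Layer 3: one domain operation per method, no locators, returns None."""

    def __init__(self, login_page):
        self._login_page = login_page

    def log_in(self, username, password):
        self._login_page.enter_username(username).enter_password(password).submit()


class HrManagerRole:
    """Layer 4: composes tasks into a user workflow, no return value."""

    def __init__(self, login_tasks):
        self._login_tasks = login_tasks

    def sign_in(self, credentials):
        self._login_tasks.log_in(credentials["username"], credentials["password"])


def test_hr_manager_can_sign_in():
    """Layer 5: load data, call one role workflow, assert via a state check."""
    # Inline JSON keeps the sketch self-contained; a real test would load a file.
    data = json.loads('{"username": "hr_manager", "password": "secret"}')
    page = LoginPage(FakeBrowser())
    HrManagerRole(LoginTasks(page)).sign_in(data)
    assert page.is_logged_in()
```

The test never sees a locator or a driver call, so under this layout a selector change would touch only the domain object.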
How It Works
You provide a persona, a URL, and a plain-English workflow. The AI agent handles everything else: credential strategy, element discovery, BDD scenario generation, code construction, and test execution. If a test fails, you triage it together.
Step 1. Provide three things: the persona (who is testing), the target URL, and what to test in plain English. "HR Manager logs in and creates an employee." That is the entire input.
Step 2. The agent resolves the credential strategy, validates access to the target application, and confirms the environment is ready. No test code is written until pre-flight passes.
Step 3. The agent generates BDD scenarios from your description, defines expected states, then discovers every element on the target pages. Selectors, interactions, and state transitions are mapped automatically.
Step 4. The agent reads the reference implementations and builds all five layers: the connection layer, Domain Objects with locators, Tasks that compose them, Roles that orchestrate workflows, and Tests that assert via state-check methods. Every file follows the pattern.
Step 5. Tests run via pytest. If they pass, you are done. If a test fails, the agent presents the failure for human-in-the-loop triage. You decide: fix the test, fix the app, or skip. No silent failures.
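The gating described above (nothing is built until pre-flight passes, and every failure goes to a human) could be sketched roughly as follows. This is not the agent's actual implementation: WorkflowRequest, preflight, run_pipeline, and the simple checks inside them are hypothetical names and assumptions chosen for illustration, and the build, execute, and ask_human callables are supplied by the caller.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class WorkflowRequest:
    """The three inputs: who is testing, where, and what to test."""
    persona: str
    url: str
    workflow: str


class TriageDecision(Enum):
    FIX_TEST = "fix the test"
    FIX_APP = "fix the app"
    SKIP = "skip"


class PreflightError(RuntimeError):
    """Raised when the environment is not ready; nothing is generated after this."""


def preflight(request):
    # Hypothetical checks only; a real agent would also resolve credentials
    # and confirm it can reach the target application.
    problems = []
    if not request.persona.strip():
        problems.append("persona is empty")
    if not request.url.startswith(("http://", "https://")):
        problems.append("url must be http(s)")
    if not request.workflow.strip():
        problems.append("workflow is empty")
    if problems:
        raise PreflightError("; ".join(problems))


def run_pipeline(request, build, execute, ask_human):
    """Gate construction on pre-flight, then route every failure to a human."""
    preflight(request)      # raises before any test code is built
    tests = build(request)  # discovery and construction, supplied by the caller
    outcomes = {}
    for name in tests:
        if execute(name):
            outcomes[name] = "passed"
        else:
            # Every failure gets an explicit human decision; none are dropped.
            outcomes[name] = ask_human(name).value
    return outcomes
```

Passing the stages in as callables keeps the sketch runnable with simple stand-ins, for example a lambda that always answers TriageDecision.SKIP.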
See It In Action
Invoke /qa-workflow inside Claude Code. Provide a persona, a URL, and a workflow. The agent handles the rest.
The Platforms
Whether you test web browsers, APIs, databases, container images, LLM outputs, or SSH configurations, the same 5-layer architecture applies. Each harness enforces the pattern mechanically - no drift, no exceptions. One architecture. Seven harnesses.
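To illustrate how the same layering might carry over to a non-browser discipline, here is a small database-flavored sketch. It is an assumption-laden example, not code from the Database harness: the standard-library sqlite3 module stands in for pyodbc so the block runs anywhere, and EmployeeTable, EmployeeTasks, and DbAdminRole are hypothetical names.

```python
import sqlite3


def open_connection():
    """Layer 1: connection setup. In-memory SQLite stands in for a real pool."""
    return sqlite3.connect(":memory:")


class EmployeeTable:
    """Layer 2 Table Object: SQL constants and state checks live here."""

    CREATE = "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    INSERT = "INSERT INTO employee (name) VALUES (?)"
    COUNT_BY_NAME = "SELECT COUNT(*) FROM employee WHERE name = ?"

    def __init__(self, connection):
        self._connection = connection

    def create(self):
        self._connection.execute(self.CREATE)
        return self  # fluent chaining, as with page objects

    def add(self, name):
        self._connection.execute(self.INSERT, (name,))
        return self

    def has_employee(self, name):
        row = self._connection.execute(self.COUNT_BY_NAME, (name,)).fetchone()
        return row[0] > 0


class EmployeeTasks:
    """Layer 3: one operation per method, no SQL constants, returns None."""

    def __init__(self, table):
        self._table = table

    def set_up_schema(self):
        self._table.create()

    def register(self, name):
        self._table.add(name)


class DbAdminRole:
    """Layer 4: sequences tasks into a workflow, no return value."""

    def __init__(self, tasks):
        self._tasks = tasks

    def onboard_employee(self, name):
        self._tasks.set_up_schema()
        self._tasks.register(name)


def test_onboarded_employee_is_stored():
    """Layer 5: call one role workflow, assert via Table Object state checks."""
    connection = open_connection()
    table = EmployeeTable(connection)
    DbAdminRole(EmployeeTasks(table)).onboard_employee("Ada")
    assert table.has_employee("Ada")
    assert not table.has_employee("Grace")
    connection.close()
```

Only the bottom two layers know they are talking to a database; the Task, Role, and Test shapes are the same as they would be for a UI test.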
Selenium
Stack: Python, Selenium, pytest
Tests: Web browsers via Selenium WebDriver
Use when: You need cross-browser Selenium tests with strict architecture
View on GitHub

Playwright
Stack: TypeScript, Playwright Test
Tests: Web UI, API endpoints, and hybrid workflows
Use when: You need TypeScript tests with full browser and API control
View on GitHub

API
Stack: TypeScript, Playwright APIRequestContext
Tests: REST and GraphQL API endpoints, contract testing
Use when: You need API response validation and contract testing
View on GitHub

Docker
Stack: Docker, pytest, Python
Tests: Container image compliance and behavior
Use when: You need infrastructure and container testing
View on GitHub

DeepEval
Stack: Python, DeepEval, LLM eval
Tests: LLM output quality and correctness
Use when: You need AI output validation and assertions
Contact for access

SSH
Stack: Python, SSH, compliance frameworks
Tests: SSH configuration and compliance validation
Use when: You need infrastructure compliance testing
View on GitHub

Database
Stack: Python, pyodbc, pytest
Tests: Query correctness, schema integrity, data migrations
Use when: You need to validate database state, stored procedures, or migration correctness
Contact for access

Tech Stack
Results
Built for Claude Code.
Who This Is For
Teams evaluating test automation frameworks who need a pattern every engineer follows, consistent locator management, and maintainable AI-generated tests.
Organizations that want consistent test patterns across teams and lower maintenance overhead. No more "everyone writes tests differently" chaos.
Teams that use AI agents to generate tests and need guardrails, not chaos: a strict architecture that keeps AI-generated code clean and maintainable.