SafeTest Forge

SafeTest Forge is an AI-powered test generation and debugging tool built with the Claude Agent SDK. It is a local-first TypeScript tool that generates, runs, repairs, inspects, and rewinds Python pytest tests. Point it at a Python repository, specify a target module, and the agent writes tests, executes them locally, and fixes failures — all while you watch the live event trace in the UI.

The repository is open source. Live Claude runs require your own Anthropic API key, but local development, tests, and smoke evaluation work in fake mode without one. New language support is planned for future releases.

What makes this project interesting is the end-to-end agent loop. The Claude Agent SDK handles streaming events, structured output, checkpoint capture, and subagent definitions. When a generated test fails, the agent reads the error output, diagnoses the issue, and automatically attempts one repair round. If something goes wrong, you can rewind to any checkpoint and inspect exactly what the agent was thinking at that point. The entire flow, from test generation to execution to repair, happens locally on your machine with full visibility into every step.
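The generate, run, repair flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the names TestAgent, Runner, generateRunRepair, fakeAgent, and fakeRunner are all hypothetical, and the fake agent only mimics the idea of a deterministic path that needs no API key.

```typescript
// Hypothetical types for illustration; none of these names come from SafeTest Forge.
type TestResult = { passed: boolean; output: string };

interface TestAgent {
  // Produce test source for the target module.
  generate(targetModule: string): Promise<string>;
  // Produce a revised test given the failing test and its captured output.
  repair(testSource: string, failureOutput: string): Promise<string>;
}

// Runs a test and reports whether it passed (stands in for local pytest execution).
type Runner = (testSource: string) => Promise<TestResult>;

interface LoopOutcome {
  testSource: string;
  result: TestResult;
  repairRounds: number;
}

async function generateRunRepair(
  agent: TestAgent,
  run: Runner,
  targetModule: string,
  maxRepairRounds = 1, // assumption: mirrors the single repair round described above
): Promise<LoopOutcome> {
  let testSource = await agent.generate(targetModule);
  let result = await run(testSource);
  let repairRounds = 0;

  // Feed the failure output back to the agent until the budget is spent.
  while (!result.passed && repairRounds < maxRepairRounds) {
    testSource = await agent.repair(testSource, result.output);
    result = await run(testSource);
    repairRounds += 1;
  }
  return { testSource, result, repairRounds };
}

// Deterministic stand-ins: the first draft is wrong, the repair corrects it.
const fakeAgent: TestAgent = {
  async generate() {
    return "assert add(1, 2) == 4";
  },
  async repair(testSource) {
    return testSource.replace("== 4", "== 3");
  },
};

const fakeRunner: Runner = async (testSource) =>
  testSource.includes("== 3")
    ? { passed: true, output: "1 passed" }
    : { passed: false, output: "AssertionError" };
```

Calling generateRunRepair(fakeAgent, fakeRunner, "calculator") resolves to a passing result after one repair round. Keeping the agent and runner behind small interfaces is what makes such a loop testable without a live model.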

Current Capabilities

CLI-First V1 Flow - Run, cancel, report, trace, and rewind commands through a complete CLI interface
Python Repo Validation - Package-shape detection, plus rejection of ambiguous monorepos when --target is omitted
Policy Enforcement - Test-only writes and a restricted shell allowlist keep the agent from touching production code
Claude Agent SDK Integration - Streaming events, structured output, checkpoint capture, and subagent definitions
Deterministic Fake-Agent Path - Unit, integration, and CLI smoke tests run without an API key
Local Pytest Execution - Stdout/stderr capture, timeout classification, cancellation polling, and one repair round
Live-Run Cancellation - Propagation through the run-level abort controller, including persisted cancel requests observed across processes
Flat-File Persistence - Runs, traces, reports, checkpoints, and fake rewind snapshots stored under .safetest-forge/
React UI - 2-column layout with color-coded trace badges, phase progress bar, click-to-copy run ID, stat grid report panel, and REST + SSE wiring
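
To make the Policy Enforcement item above concrete, here is a minimal sketch of what test-only write checks and a shell allowlist might look like. Everything in it is an assumption for illustration: the function names, the file-name patterns (pytest's default test_*.py and *_test.py discovery names, plus conftest.py), and the allowlisted commands are not taken from SafeTest Forge's actual policy.

```typescript
// Hypothetical allowlist; the real tool's permitted commands are not documented here.
const ALLOWED_COMMANDS = new Set<string>(["pytest", "python"]);

// Returns true only for repo-relative paths whose file name looks like a pytest file.
// Assumes forward-slash separators and rejects absolute paths and ".." traversal.
function isTestOnlyWrite(relativePath: string): boolean {
  const segments = relativePath.split("/");
  if (relativePath.startsWith("/") || segments.includes("..")) {
    return false;
  }
  const fileName = segments[segments.length - 1] ?? "";
  return (
    /^test_.+\.py$/.test(fileName) ||
    /^.+_test\.py$/.test(fileName) ||
    fileName === "conftest.py"
  );
}

// Checks the executable name (first argv entry) against the allowlist.
function isAllowedCommand(argv: string[]): boolean {
  const [command] = argv;
  return command !== undefined && ALLOWED_COMMANDS.has(command);
}
```

Under these assumptions, isTestOnlyWrite("tests/test_math.py") is accepted while isTestOnlyWrite("src/math.py") is rejected, and isAllowedCommand(["rm", "-rf", "."]) is refused. Checking an argv array rather than a raw shell string avoids having to parse quoting or chained commands.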

For setup instructions, usage details, and architecture overview, see the GitHub repository.

Future Plans

Support for additional programming languages beyond Python
