🔧 Signal-Foundry

Evaluation-first AI systems: harness engineering for reliable automation

3 tasks
89 committed eval cases
7 LLM models
354 unit tests
BYOK (Bring Your Own Key): The server-bundled NVIDIA key is free-tier and may expire or be rate-limited at any time, so we recommend pasting your own key to avoid errors. Free NVIDIA signup is available at build.nvidia.com (~4 req/min rate limit). For Gemini / Claude / GPT, top up an OpenRouter account: better quality, much faster, pay-per-call. LangSmith is purely optional: paste a key from smith.langchain.com/settings to send traces to your own console. Keys are stored in your browser session only (never persisted server-side).
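If you script calls against a free-tier key, a client-side throttle helps you stay under the roughly 4 requests per minute mentioned above. The following is a minimal sketch, not part of Signal-Foundry itself; the class name `SlidingWindowLimiter` and its parameters are hypothetical, and the exact limit enforced by the provider may differ.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls within any `period`-second window.

    Hypothetical helper for illustration; the defaults mirror the
    approximate free-tier limit (~4 req/min) quoted in the text.
    """

    def __init__(self, max_calls=4, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._stamps = deque()   # timestamps of recent calls, oldest first

    def wait_time(self):
        """Return how many seconds to wait before the next call is allowed."""
        now = self._clock()
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.period - (now - self._stamps[0])

    def acquire(self):
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            self._sleep(delay)
            self.wait_time()     # purge entries that expired while sleeping
        self._stamps.append(self._clock())
```

Call `limiter.acquire()` immediately before each request; the fifth call inside a minute blocks until the oldest call leaves the window.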

📋 Task 1: CI/CD Skills Engine

GitHub CI/CD workflows packaged as reusable, precisely-triggered Claude Skills with sandbox execution and idempotency.

4 skills · OSV.dev advisories · ruff/eslint/bandit · semver release planner
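To give a feel for what a semver release planner does, here is a minimal sketch. It assumes commit subjects follow the Conventional Commits style (`feat:`, `fix:`, `!` for breaking changes); that convention, and the function names `plan_bump` and `next_version`, are assumptions for illustration, not the project's confirmed implementation.

```python
import re

# Conventional Commits header, e.g. "feat(api)!: drop v1 endpoints".
_HEADER = re.compile(r"^(?P<type>\w+)(?:\([^)]*\))?(?P<bang>!)?:")
_SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def plan_bump(commit_subjects):
    """Return 'major', 'minor', 'patch', or 'none' for a list of subjects."""
    level = "none"
    for subject in commit_subjects:
        if "BREAKING CHANGE" in subject:
            return "major"
        match = _HEADER.match(subject)
        if match is None:
            continue  # not a conventional commit; ignore it
        if match.group("bang"):
            return "major"
        if match.group("type") == "feat":
            level = "minor"
        elif match.group("type") == "fix" and level == "none":
            level = "patch"
    return level


def next_version(current, level):
    """Apply a bump level to a plain MAJOR.MINOR.PATCH version string."""
    match = _SEMVER.match(current)
    if match is None:
        raise ValueError(f"not a plain semver string: {current!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current  # 'none': nothing to release
```

The sketch ignores pre-release tags and the looser pre-1.0 rules, which a real planner would need to handle.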

๐ŸŒ Task 2: Browser Automation Agent

Self-healing browser agent with AOM-first locators, semantic state tracking, and silent failure prevention.

13-class healer · 50+ silent-failure phrases · vision-language opt-in · 30-case eval set
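A silent failure is an action that appears to succeed (no exception, no error status) while the page text says otherwise. One simple way to catch these is to scan the visible text for known failure phrases after each step. The sketch below assumes that approach; the four phrases are hypothetical samples, not the project's actual 50+ phrase list, and `find_silent_failures` is an illustrative name.

```python
import re

# Hypothetical sample phrases; the real list is not reproduced here.
SILENT_FAILURE_PHRASES = (
    "something went wrong",
    "session expired",
    "please try again",
    "access denied",
)

_PATTERN = re.compile(
    "|".join(re.escape(phrase) for phrase in SILENT_FAILURE_PHRASES),
    re.IGNORECASE,
)


def find_silent_failures(page_text: str) -> list[str]:
    """Return the distinct failure phrases found in the page text, sorted."""
    # Collapse whitespace so phrases split across lines still match.
    normalized = " ".join(page_text.split())
    return sorted({m.group(0).lower() for m in _PATTERN.finditer(normalized)})
```

An agent could treat any non-empty result as a failed step and hand control to its healer, even though the click or form submission itself raised no error.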

📊 Task 3: SEC 10-K Extraction

Hybrid rule+LLM pipeline for structured extraction with XBRL cross-validation and cost discipline.

4-stage pipeline · 16-case eval · 10-K + 10-K/A + 20-F · char_range grounding
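Grounding by `char_range` can be read as: every extracted value carries the character offsets of the span it came from, so a verifier can check the value against the source text rather than trusting the model. The sketch below assumes that reading; the `GroundedValue` type and both helper functions are hypothetical, and the sample sentence in the usage note is made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GroundedValue:
    field: str
    value: str
    char_range: tuple[int, int]  # [start, end) offsets into the source text


def is_grounded(source: str, item: GroundedValue) -> bool:
    """True only if the source text at char_range is exactly the value."""
    start, end = item.char_range
    if not (0 <= start < end <= len(source)):
        return False
    return source[start:end] == item.value


def ground(source: str, field: str, value: str) -> Optional[GroundedValue]:
    """Attach a char_range to a value, or return None if it is not in the source."""
    if not value:
        return None
    start = source.find(value)
    if start == -1:
        return None  # value does not appear verbatim: treat as ungrounded
    return GroundedValue(field, value, (start, start + len(value)))
```

For example, with `source = "Total revenue was $1,234 million in fiscal 2023."`, calling `ground(source, "revenue", "$1,234 million")` yields a value whose range passes `is_grounded`, while a value the model invented returns `None`.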

Quick launchpad