An AI agent with full control over your Windows desktop. It sees your screen, operates any app, browses the web, runs terminal commands, hears your voice — and learns new skills from JSON definitions.
Every interface your computer exposes — unified under one AI brain.
Full Windows UI Automation. Click buttons by name, type into fields, read element trees — no coordinates needed.
Chrome DevTools Protocol. Navigate, click elements, fill forms, intercept network traffic, and execute JavaScript.
Native Windows Pseudo Console (ConPTY). Run commands and manage processes in isolated terminals, each with a unique tracking ID.
OpenCV-powered webcam capture. The AI can see the physical world — read documents, identify objects, observe the environment.
Whisper-powered speech recognition with wake word detection. Say "Hey Cortex" and speak your command naturally.
JSON-driven deterministic skills. The AI inspects any app, generates a skill definition, saves it, and executes it — all in one session.
From inspecting an app to creating and running a skill — in seconds.
Define app automations as JSON. The agent executes deterministic step sequences — no fumbling, no screenshots.
{
  "App": "Spotify",
  "WindowTitleMatch": "Spotify",
  "Actions": [{
    "Name": "SpotifySearch",
    "Parameters": [{
      "Name": "Query",
      "Required": true
    }],
    "Steps": [
      { "Type": "FocusWindow" },
      { "Type": "PressKeyCombination",
        "Keys": ["Control", "L"] },
      { "Type": "TypeText",
        "Text": "{Query}" },
      { "Type": "PressKey",
        "Key": "Return" }
    ]
  }]
}
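To make the "{Query}" placeholder concrete, here is a minimal Python sketch of how a parameter value might be substituted into the steps before they run. The agent's actual expansion logic is not shown in the source, so the function name expand_steps and the rule that unknown placeholders are left untouched are assumptions made for illustration only.

```python
import json
import re

# The skill definition from above, embedded so the sketch is self-contained.
SKILL_JSON = """
{
  "App": "Spotify",
  "WindowTitleMatch": "Spotify",
  "Actions": [{
    "Name": "SpotifySearch",
    "Parameters": [{ "Name": "Query", "Required": true }],
    "Steps": [
      { "Type": "FocusWindow" },
      { "Type": "PressKeyCombination", "Keys": ["Control", "L"] },
      { "Type": "TypeText", "Text": "{Query}" },
      { "Type": "PressKey", "Key": "Return" }
    ]
  }]
}
"""

PLACEHOLDER = re.compile(r"\{(\w+)\}")


def expand_steps(action, arguments):
    """Return a copy of the action's steps with {Name} placeholders filled in.

    Hypothetical helper: the real agent may expand parameters differently.
    """
    missing = [
        p["Name"]
        for p in action.get("Parameters", [])
        if p.get("Required") and p["Name"] not in arguments
    ]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")

    def fill(value):
        # Substitute inside strings, recurse into lists, pass everything else through.
        if isinstance(value, str):
            return PLACEHOLDER.sub(
                lambda m: str(arguments.get(m.group(1), m.group(0))), value
            )
        if isinstance(value, list):
            return [fill(v) for v in value]
        return value

    return [{key: fill(val) for key, val in step.items()} for step in action["Steps"]]


skill = json.loads(SKILL_JSON)
action = skill["Actions"][0]
steps = expand_steps(action, {"Query": "daft punk"})
for step in steps:
    print(step)
```

Calling expand_steps without a value for Query raises a ValueError, which mirrors the "Required": true flag in the definition.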
Use SkillScaffold or CLIScaffold to discover an app's UI elements, keyboard shortcuts, and supported patterns.
Write a JSON skill with deterministic steps: FocusWindow, PressKey, TypeText, InvokeElement, and more.
The agent writes the JSON file via WriteFile, which is native C# and needs no terminal. Deduplication prevents loops.
Call RunNewSkill to execute immediately in the same session. No restart. No rebuild. Instant.
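As a rough illustration of what "deterministic step sequences" can mean in practice, the Python sketch below dispatches each step type to a fixed handler and runs the steps strictly in order. It is not the agent's implementation: RecordingBackend and run_steps are hypothetical names, the backend only records what would happen instead of touching the desktop, and InvokeElement is deliberately left out, so it is rejected as unsupported here.

```python
from typing import Callable, Dict, List


class RecordingBackend:
    """Stand-in for real UI automation: it only records what would happen."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def focus_window(self, title: str) -> None:
        self.log.append(f"focus:{title}")

    def press_keys(self, keys: List[str]) -> None:
        self.log.append("keys:" + "+".join(keys))

    def type_text(self, text: str) -> None:
        self.log.append(f"type:{text}")


def run_steps(steps: List[dict], window_title: str, backend: RecordingBackend) -> None:
    """Run steps in order; each step type maps to exactly one handler."""
    handlers: Dict[str, Callable[[dict], None]] = {
        "FocusWindow": lambda s: backend.focus_window(window_title),
        "PressKeyCombination": lambda s: backend.press_keys(s["Keys"]),
        "PressKey": lambda s: backend.press_keys([s["Key"]]),
        "TypeText": lambda s: backend.type_text(s["Text"]),
    }
    for index, step in enumerate(steps):
        handler = handlers.get(step["Type"])
        if handler is None:
            # Fail loudly instead of guessing what an unknown step should do.
            raise ValueError(f"step {index}: unsupported type {step['Type']!r}")
        handler(step)


# Steps from the SpotifySearch action, with the Query parameter already filled in.
steps = [
    {"Type": "FocusWindow"},
    {"Type": "PressKeyCombination", "Keys": ["Control", "L"]},
    {"Type": "TypeText", "Text": "daft punk"},
    {"Type": "PressKey", "Key": "Return"},
]

backend = RecordingBackend()
run_steps(steps, "Spotify", backend)
print(backend.log)
```

Because every step type has a single handler and no step depends on a screenshot or a model decision, the same input always produces the same sequence of actions.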
Give your AI full autonomy over your desktop.