MACP Control Plane — Troubleshooting
Runtime Connection Failures
Symptom: `GET /readyz` returns `runtime.ok: false`
Checks:
- Verify the runtime is running: `grpcurl -plaintext 127.0.0.1:50051 list`
- Check that the `RUNTIME_ADDRESS` env var matches the runtime's listen address
- If using TLS, ensure `RUNTIME_TLS=true` and that the certificates are valid
- Check `RUNTIME_REQUEST_TIMEOUT_MS` (default 30s); increase it if the runtime is slow to respond
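The environment checks above can be partly automated. The following is a minimal sketch, not part of the control plane itself: `checkRuntimeConfig` is a hypothetical helper, and the accepted value formats (a plain `host:port` address, `true`/`false` for TLS, an integer number of milliseconds for the timeout) are assumptions rather than documented behavior.

```typescript
// Hypothetical pre-flight check. Variable names come from the checklist above;
// the accepted value formats are assumptions, not documented behavior.
type Env = { [key: string]: string | undefined };

function checkRuntimeConfig(env: Env): string[] {
  const problems: string[] = [];

  // Assumption: RUNTIME_ADDRESS looks like "host:port" (IPv6 literals are not handled).
  const address = env["RUNTIME_ADDRESS"];
  const match = address ? /^([^:\s]+):(\d+)$/.exec(address) : null;
  if (!match) {
    problems.push("RUNTIME_ADDRESS is missing or not in host:port form");
  } else {
    const port = Number(match[2]);
    if (port < 1 || port > 65535) {
      problems.push("RUNTIME_ADDRESS port is out of range");
    }
  }

  // Assumption: RUNTIME_TLS, when set, is either "true" or "false".
  const tls = env["RUNTIME_TLS"];
  if (tls !== undefined && tls !== "true" && tls !== "false") {
    problems.push("RUNTIME_TLS should be 'true' or 'false'");
  }

  // The _MS suffix suggests milliseconds; unset means the default applies.
  const timeout = env["RUNTIME_REQUEST_TIMEOUT_MS"];
  if (timeout !== undefined && !/^[1-9]\d*$/.test(timeout)) {
    problems.push("RUNTIME_REQUEST_TIMEOUT_MS should be a positive integer (milliseconds)");
  }

  return problems;
}
```

In a Node process this could be called as `checkRuntimeConfig(process.env)` at startup, logging each returned problem before the first gRPC call is attempted.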
Migration Issues
Symptom: Application fails to start with database errors
Steps:
- Ensure PostgreSQL is running and accessible via `DATABASE_URL`
- Run migrations: `npm run drizzle:migrate`
- Check the `drizzle/` directory for migration files
- Use `npm run drizzle:studio` to inspect database state
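Before running migrations, it can help to confirm which host and database `DATABASE_URL` actually points at. The sketch below assumes the common `postgres://user:pass@host:port/dbname` shape; `describeDatabaseUrl` is a hypothetical helper, and it deliberately omits credentials from its output.

```typescript
// Hypothetical helper; the accepted URL shape is an assumption.
function describeDatabaseUrl(url: string | undefined): string {
  if (!url) {
    return "DATABASE_URL is not set";
  }
  // Accepts postgres:// or postgresql://, optional credentials,
  // a host, an optional port, and a database name.
  const m = /^postgres(?:ql)?:\/\/(?:[^@\/\s]+@)?([^:\/\s]+)(?::(\d+))?\/([^?\s]+)/.exec(url);
  if (!m) {
    return "DATABASE_URL does not look like a PostgreSQL connection string";
  }
  // 5432 is PostgreSQL's default port when none is given.
  const port = m[2] ? m[2] : "5432";
  return "host=" + m[1] + " port=" + port + " database=" + m[3];
}
```

Logging the result at startup makes it easier to tell a wrong target apart from a database that is simply down.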
Stuck Runs
Symptom: Runs stay in the `starting` or `running` state indefinitely
Steps:
- Check stream consumer logs for reconnection errors
- Verify runtime session state: `GET /readyz`
- Check `STREAM_MAX_RETRIES` (default 5) and `STREAM_IDLE_TIMEOUT_MS` (default 120s)
- Manually cancel the run: `POST /runs/{id}/cancel`
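One way to find candidates for manual cancellation is to flag active runs that have been silent for longer than the idle timeout. This is a rough sketch under stated assumptions: the `RunSummary` shape, including the `lastEventAtMs` field, is hypothetical and the real run schema may differ. Only the status names and the cancel path come from the steps above.

```typescript
// Hypothetical run record; the real schema may differ.
interface RunSummary {
  id: string;
  status: string;        // e.g. "starting", "running", "completed"
  lastEventAtMs: number; // epoch milliseconds of the most recent event
}

// Returns the IDs of runs that are still active but have been silent
// for longer than idleTimeoutMs.
function findStuckRuns(runs: RunSummary[], nowMs: number, idleTimeoutMs: number): string[] {
  const stuck: string[] = [];
  for (const run of runs) {
    const active = run.status === "starting" || run.status === "running";
    if (active && nowMs - run.lastEventAtMs > idleTimeoutMs) {
      stuck.push(run.id);
    }
  }
  return stuck;
}

// Builds the cancel path from the steps above for a given run ID.
function cancelPath(runId: string): string {
  return "/runs/" + encodeURIComponent(runId) + "/cancel";
}
```

Passing the 120s idle timeout as `120000` keeps the threshold aligned with `STREAM_IDLE_TIMEOUT_MS`; each returned ID can then be sent as a `POST` to its `cancelPath`.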
High Memory Usage
Causes:
- Too many active SSE subscribers: StreamHub cleans up idle subjects after 60s
- Large replay queries: replay now uses cursor-based pagination (batch size configurable via `REPLAY_BATCH_SIZE`)
- Database connection pool exhaustion: check `DB_POOL_MAX` (default 20)
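To illustrate why cursor-based pagination bounds memory, here is a simplified sketch of the general technique. It is not the control plane's actual replay code: `EventRow`, the `seq` cursor column, and `fetchPage` are hypothetical, and the loop is synchronous for brevity where a real database query would be asynchronous.

```typescript
// Hypothetical event row; "seq" stands in for whatever monotonically
// increasing column the replay query orders by.
interface EventRow {
  seq: number;
  payload: string;
}

// Returns up to `limit` rows with seq > afterSeq, ordered by seq ascending.
type FetchPage = (afterSeq: number, limit: number) => EventRow[];

// Walks the event log one batch at a time, so at most `batchSize` rows are
// held in memory at once. Returns the total number of rows processed.
function replayInBatches(
  fetchPage: FetchPage,
  batchSize: number,
  onBatch: (rows: EventRow[]) => void
): number {
  if (batchSize < 1) {
    throw new Error("batchSize must be at least 1");
  }
  let cursor = 0; // assumes seq values start above 0
  let total = 0;
  while (true) {
    const rows = fetchPage(cursor, batchSize);
    if (rows.length === 0) break;
    onBatch(rows);
    total += rows.length;
    cursor = rows[rows.length - 1].seq; // resume after the last row seen
    if (rows.length < batchSize) break; // a short page means the end was reached
  }
  return total;
}
```

A smaller batch size lowers peak memory per replay at the cost of more round trips, which is the trade-off `REPLAY_BATCH_SIZE` exposes.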
Common Error Codes
| Code | Meaning |
|---|---|
| `RUN_NOT_FOUND` | The specified run ID does not exist |
| `INVALID_STATE_TRANSITION` | Cannot transition the run to the requested state |
| `RUNTIME_UNAVAILABLE` | Cannot connect to the gRPC runtime |
| `RUNTIME_TIMEOUT` | gRPC call exceeded deadline |
| `STREAM_EXHAUSTED` | Max retries reached for stream reconnection |
| `SESSION_EXPIRED` | Runtime session has expired |
| `KICKOFF_FAILED` | A kickoff message failed to send |
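
A client or operator script might map these codes back to the relevant section of this guide. The pairing below is a suggestion inferred from the sections above, not a documented contract, and `nextStepFor` is a hypothetical helper.

```typescript
// Suggested next step per error code. The codes come from the table above;
// the pairing with troubleshooting steps is an assumption.
const NEXT_STEP: { [code: string]: string } = {
  RUN_NOT_FOUND: "Confirm the run ID used in the request",
  INVALID_STATE_TRANSITION: "Check the run's current state before retrying",
  RUNTIME_UNAVAILABLE: "See Runtime Connection Failures: verify RUNTIME_ADDRESS and RUNTIME_TLS",
  RUNTIME_TIMEOUT: "Consider raising RUNTIME_REQUEST_TIMEOUT_MS",
  STREAM_EXHAUSTED: "See Stuck Runs: check STREAM_MAX_RETRIES and the stream consumer logs",
  SESSION_EXPIRED: "Verify runtime session state via GET /readyz",
  KICKOFF_FAILED: "Not covered by a dedicated section; inspect logs around the failed kickoff message",
};

function nextStepFor(code: string): string {
  // hasOwnProperty guards against inherited keys such as "toString".
  if (Object.prototype.hasOwnProperty.call(NEXT_STEP, code)) {
    return NEXT_STEP[code];
  }
  return "Unknown error code: " + code;
}
```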