AIOps Platform · 2025-Present
OpsRabbit
AI-powered CloudOps and SRE incident investigation platform.
Role and Outcome
Responsible for technical architecture and hands-on delivery across system design, deployment, DevOps, cost, scale, reliability, and production readiness.
Built OpsRabbit hands-on as a founder, with a focus on reducing MTTR, improving incident investigation, and helping engineering teams connect alerts, logs, deploys, ownership, and operational context.
Scale
Built as a founder-led product initiative for modern SRE, CloudOps, and incident response teams.
Architecture
- Designed the product around AI-assisted incident investigation and operational evidence gathering.
- Focused workflows on CloudOps, SRE, incident response, and reliability engineering use cases.
- Built the platform around practical automation that helps teams move from noisy alerts to clear next actions.
- Kept the product grounded in real operational workflows rather than generic chatbot-style responses.
Lessons Learned
- AI for operations must connect incidents to real engineering context: ownership, deploys, alerts, logs, and system behavior.
- SRE automation needs explainability and traceability because engineers will not trust black-box answers during production incidents.
- The product architecture has to fit existing incident workflows while gradually improving speed, consistency, and reliability.
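To make the first two lessons concrete, here is a minimal Python sketch of linking an alert to recent deploys and service ownership while recording every piece of evidence used. It is illustrative only and not OpsRabbit's actual implementation: the `Alert`, `Deploy`, and `Finding` types, the `investigate` function, and the 30-minute lookback window are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Deploy:
    service: str
    version: str
    deployed_at: datetime


@dataclass
class Alert:
    service: str
    message: str
    fired_at: datetime


@dataclass
class Finding:
    summary: str
    owner: str
    # Every statement in the summary should be backed by an entry here.
    evidence: list = field(default_factory=list)


def investigate(alert, deploys, owners, window=timedelta(minutes=30)):
    """Attach recent deploys and ownership to an alert (hypothetical helper).

    `window` is an assumed lookback period, not a recommended value.
    """
    # Keep only deploys to the same service that landed shortly before the alert.
    recent = [
        d for d in deploys
        if d.service == alert.service
        and timedelta(0) <= alert.fired_at - d.deployed_at <= window
    ]

    # Record what was looked at so an engineer can verify the conclusion.
    evidence = [f"alert: {alert.message} at {alert.fired_at.isoformat()}"]
    evidence += [
        f"deploy: {d.version} at {d.deployed_at.isoformat()}" for d in recent
    ]

    if recent:
        summary = f"{alert.service}: {len(recent)} recent deploy(s) preceded this alert"
    else:
        summary = f"{alert.service}: no recent deploys found before this alert"

    owner = owners.get(alert.service, "unassigned")
    return Finding(summary=summary, owner=owner, evidence=evidence)
```

Returning the evidence list alongside the summary is the point of the sketch: the conclusion can be checked against the exact alerts and deploys it was derived from, rather than taken on trust.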