Building the Future of Root Cause Analysis Together
Welcome to the OperationsPAI community repository! This is your starting point for understanding our vision, values, and how to participate in building a self-evolving RCA ecosystem.
OperationsPAI is an open-source community building the world's first self-evolving training environment for Root Cause Analysis (RCA) in microservices.
Our mission is to bridge the gap between academic research and industrial practice by creating an intelligent platform where:
- Researchers can develop and evaluate RCA algorithms with continuously generated data
- Practitioners can test and deploy production-ready RCA solutions
- Students can learn distributed systems through hands-on experience
Unlike static benchmarks, we use intelligent fault injection that evolves with your algorithms:
graph LR
A[Intelligent Fault<br/>Injection] --> B[Microservices]
B --> C[Data Collection]
C --> D[Algorithm<br/>Evaluation]
D --> E[Fitness<br/>Feedback]
E --> A
style A fill:#e1f5ff
style B fill:#fff4e1
style C fill:#e8f5e9
style D fill:#f3e5f5
style E fill:#ffe1e1
The stronger your algorithm, the harder the faults become.
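The feedback loop above can be illustrated with a toy sketch. This is not the actual scheduler used in OperationsPAI; the class `AdaptiveFaultSampler`, the fault labels, and the multiplicative update rule are all assumptions chosen for illustration. The sketch only shows the general idea: faults the algorithm already localizes lose sampling weight, while faults it misses gain weight.

```python
from dataclasses import dataclass, field


@dataclass
class AdaptiveFaultSampler:
    """Hypothetical sampler that shifts probability toward faults the algorithm misses."""

    fault_types: list[str]
    learning_rate: float = 0.5
    weights: dict[str, float] = field(default_factory=dict, init=False)

    def __post_init__(self) -> None:
        # Start with a uniform preference over all fault types.
        self.weights = {name: 1.0 for name in self.fault_types}

    def probabilities(self) -> dict[str, float]:
        # Normalize the weights into a sampling distribution.
        total = sum(self.weights.values())
        return {name: w / total for name, w in self.weights.items()}

    def feedback(self, fault: str, localized: bool) -> None:
        # Fitness feedback: shrink the weight on success, grow it on failure.
        if localized:
            factor = 1.0 - self.learning_rate
        else:
            factor = 1.0 + self.learning_rate
        self.weights[fault] *= factor


def toy_algorithm(fault: str) -> bool:
    # Stand-in for an RCA algorithm: it only localizes CPU faults correctly.
    return fault == "cpu_stress"


# Illustrative fault labels, not names taken from any real fault catalog.
sampler = AdaptiveFaultSampler(["cpu_stress", "network_delay", "pod_kill"])
for _ in range(5):
    for fault in sampler.fault_types:
        sampler.feedback(fault, toy_algorithm(fault))

probs = sampler.probabilities()
print(probs)
```

After a few rounds, nearly all of the sampling probability sits on the two fault types the toy algorithm cannot localize, which is the "harder faults for stronger algorithms" behavior in miniature.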
Core Principles:
- Open by Default: All code, data, and research openly shared
- Research Meets Practice: Academic rigor with production readiness
- Community First: Success measured by community growth
- Quality Over Speed: Build it right, not just fast
For Researchers:
- No more "no data" problem: Dynamic data generation for any fault scenario
- Reproducible experiments: Standardized environment and evaluation metrics
- Focus on algorithms: Skip the infrastructure setup, dive into innovation
For Practitioners:
- Evaluate before deploy: Compare RCA algorithms on standardized benchmarks
- Learn from research: Access cutting-edge algorithms from academia
- Contribute scenarios: Share your real-world challenges (anonymized)
For Students:
- Hands-on experience: Work with realistic distributed systems
- Multiple entry points: From testing to algorithm development
- Build your portfolio: Contribute to a growing open-source project
- Understand Our Vision: Read Vision & Mission
- Learn Our Values: Review Core Values
- Join the Conversation: GitHub Discussions
- Find Your First Task: Browse Good First Issues
- Code Contributions: See Contributing Guide
- Documentation: Help improve our docs
- Testing: Verify installation on different platforms
- Community Support: Answer questions in GitHub Discussions
- Quick Start: 30-minute demo (coming soon)
- Documentation: Technical docs
- Repositories: Browse our GitHub organization
OperationsPAI consists of multiple interconnected repositories:
| Repository | Description | Status |
|---|---|---|
| AegisLab | Core orchestration platform | Active |
| Pandora | Intelligent fault scheduler | In Progress |
| RCABench Platform | Algorithm evaluation framework | Active |
| chaos-experiment | Fault injection framework | Active |
| train-ticket | Demo microservices app | Active |
| loadgenerator | Traffic generation tool | Active |
Current Focus (2026 Q1-Q2):
- Community infrastructure (website, documentation)
- Quick Start guide (30-minute demo)
- Code cleanup and standardization
- Intelligent fault scheduling loop
- Additional microservice targets
See our detailed roadmap for the full 18-month plan.
- Decision Making: See Governance Model
- Code of Conduct: Read our Code of Conduct
- Contribution Process: Follow the Contributing Guide
- GitHub Discussions: Q&A and general discussion
- GitHub Issues: Bug reports and feature requests
- Twitter: @OperationsPAI - Updates and announcements
- Documentation: docs/community/resources.md
- Vision: docs/community/vision.md
- Values: docs/community/values.md
- Vision & Mission - Our long-term goals
- Core Values - Principles that guide us
- Resources - Links and learning materials
- Roadmap - Project timeline and milestones
- Governance - How decisions are made
- Contributing - How to get involved
- Code of Conduct - Community standards
- Technical Architecture
- API Reference
- Quick Start Guide
- Repository-specific docs in each repo's README
If you use OperationsPAI in your research:
- Cite Our Work: Citation format coming soon
- Share Your Papers: We'll list papers using OperationsPAI
- Collaborate: Join our research discussions in GitHub Discussions
- Contribute Algorithms: Add your RCA algorithms to the platform
Papers using OperationsPAI: [Coming soon]
OperationsPAI is released under the Apache License 2.0.
OperationsPAI is built on top of several excellent open-source projects:
- Chaos Mesh - Chaos engineering platform
- OpenTelemetry - Observability framework
- TrainTicket - Microservice benchmark
Built with ❤️ by the OperationsPAI Community