Infrastructure monitoring framework turning DevOps runbooks into automated actions
Updated Sep 2, 2018 - Python
The open standard for Home Assistant instance health monitoring.
Fully Autonomous AI Research System with Self-Evolution, built natively on Claude Code
One prompt. A full AI engineering team. Go lie on the couch. 🧠
PDF extraction that checks its own work. #2 reading order accuracy — zero AI, zero GPU, zero cost.
ArgoCD-Basics-to-Production is a beginner-friendly repository designed to help you understand GitOps and Argo CD from fundamentals to real-world production use. It covers GitOps concepts, Argo CD architecture, and hands-on deployment workflows, organized as a progressive learning series.
A layer that lets you connect any agent, any tool, and any API together.
A Robot Framework library that automatically repairs failing Robot Framework tests via AI
ARF is an agentic reliability intelligence platform that separates decision intelligence (OSS) from governed execution (Enterprise), enabling autonomous operations with deterministic safety guarantees.
Self-healing multi-agent system using ontologies and MCP to automatically adapt when database schemas change
LedgerMind — an autonomous living memory for AI agents. It self-heals, resolves conflicts, distills experience into rules, and evolves without human intervention. SQLite + Git + reasoning layer. Perfect for multi-agent systems and on-device deployment.
A Docker Healer - Auto Restarting Unhealthy Containers
Autonomous Software Engineering Agents — self-healing, self-diagnosing development team powered by Claude Code and A2A protocol
AI-powered open-source monitoring platform with auto-remediation. 6 built-in runbooks, MCP integration (global first), DeepSeek root cause analysis. 5-minute Docker setup.
Robot Framework and OpenAI integration
AI-powered autonomous DBA agent — detects, diagnoses, and fixes Oracle database problems automatically
Healix is an AI-assisted automation framework that keeps your Linux system healthy by monitoring logs, services, and resources and auto-healing issues in real time. Designed for hackathons, DevOps, cybersecurity, and system automation.
A resilient, fault‑tolerant telemetry analytics pipeline designed to validate, benchmark, and stress‑test high‑frequency sensor data streams under real‑world failure conditions. Includes chaos testing, DLQ repair, GPU‑accelerated ingestion, and end‑to‑end reliability validation for motorsport‑grade telemetry environments.