Deployment Guide¶
Overview¶
Deploying the AI Engineering Maintenance Bot is simple - it runs from ONE central repository and requires NO installation in individual repositories.
Deployment Checklist¶
Phase 1: Initial Setup (30 minutes)¶
- [ ] 1.1: Create Anthropic API key
- [ ] 1.2: Create GitHub PAT with org-wide access
- [ ] 1.3: Add secrets to this repository
ANTHROPIC_API_KEYORG_ACCESS_TOKEN- [ ] 1.4: Enable GitHub Actions
- [ ] 1.5: Verify workflows appear in Actions tab
Phase 2: Testing (1-2 hours)¶
- [ ] 2.1: Manual test run of monitor workflow
- [ ] 2.2: Test fix workflow on specific PR
- [ ] 2.3: Verify bot comments appear on test PR
- [ ] 2.4: Check workflow logs for errors
- [ ] 2.5: Confirm bot can access org repos
Phase 3: Limited Monitoring (Week 1-2)¶
- [ ] 3.1: Enable scheduled runs (automatic every 6 hours)
- [ ] 3.2: Monitor for 2-3 days
- [ ] 3.3: Check PRs the bot processes
- [ ] 3.4: Verify auto-merge works correctly
- [ ] 3.5: Review any fixes the bot applies
- [ ] 3.6: Adjust prompt templates if needed
Phase 4: Full Production (Week 3+)¶
- [ ] 4.1: Confirm bot stability
- [ ] 4.2: Enable auto-merge in more repos
- [ ] 4.3: Announce to VectorInstitute team
- [ ] 4.4: Document any customizations made
- [ ] 4.5: Set up monitoring dashboard
Quick Deployment¶
For experienced admins who just want to get it running:
# 1. Add secrets via GitHub UI
# Settings → Secrets → Actions → New secret
# - ANTHROPIC_API_KEY
# - ORG_ACCESS_TOKEN
# 2. Enable workflows
# Actions tab → Enable workflows
# 3. Test
gh workflow run monitor-org-bot-prs.yml --repo VectorInstitute/aieng-bot-maintain
# 4. Monitor
# Check Actions tab for runs every 6 hours
What Happens After Deployment¶
First 6 Hours¶
- Monitor workflow runs automatically
- Scans all VectorInstitute repos
- Finds any open Dependabot PRs
- Processes passing PRs (auto-merge)
- Triggers fixes for failing PRs
- Bot comments appear on processed PRs
- Auto-merges happen for passing PRs
- Fix attempts made on failing PRs
First Day¶
- 4 scan cycles (every 6 hours)
- All active Dependabot PRs processed
- Pattern of bot behavior visible
- Any issues surface quickly
First Week¶
- Bot handles most Dependabot PRs automatically
- Team sees reduced manual PR review load
- Fix success rate becomes clear
- Prompt adjustments may be needed
Monitoring After Deployment¶
Daily Checks (First 2 Weeks)¶
Monitor these metrics:
# Check recent runs
gh run list --workflow=monitor-org-bot-prs.yml --limit 10
# Check for failures
gh run list --workflow=monitor-org-bot-prs.yml --status failure --limit 5
# View specific run
gh run view RUN_ID --log
Track: - PRs found per run - PRs auto-merged - PRs fixed successfully - Any workflow failures - API errors or rate limits
Weekly Review¶
Review bot activity: 1. Actions tab: Check run success rate 2. Across org repos: Find PRs with bot comments 3. Merged PRs: Verify quality of auto-merged changes 4. Fixed PRs: Check if fixes worked correctly 5. Metrics: Count successes vs failures
Success Metrics¶
Target metrics after 2 weeks: - ✅ Monitor workflow: 95%+ success rate - ✅ Auto-merge: 90%+ of passing PRs merged - ✅ Auto-fix: 50%+ of fixable issues resolved - ✅ False positives: 0 (no incorrect merges) - ✅ Team feedback: Positive
Common Post-Deployment Adjustments¶
Adjust Scan Frequency¶
If you want more or less frequent scanning:
# .github/workflows/monitor-org-bot-prs.yml
on:
schedule:
- cron: '0 */12 * * *' # Every 12 hours (less frequent)
# or
- cron: '0 */3 * * *' # Every 3 hours (more frequent)
Exclude Specific Repositories¶
If certain repos need manual review:
# In monitor-org-bot-prs.yml, after getting REPOS
REPOS=$(echo "$REPOS" | grep -v "critical-production-repo")
REPOS=$(echo "$REPOS" | grep -v "experimental-repo")
Adjust Fix Aggressiveness¶
If bot makes too many/few fix attempts:
Edit .github/prompts/ templates to be more/less aggressive in fixes.
Fine-Tune Merge Criteria¶
If auto-merging incorrectly:
# In monitor-org-bot-prs.yml, adjust check logic
# Add filters for specific check names or PR titles
Rollback Plan¶
If something goes wrong:
Immediate Disable (30 seconds)¶
# Disable scheduled runs
gh workflow disable monitor-org-bot-prs.yml \
--repo VectorInstitute/aieng-bot-maintain
Or via GitHub UI: 1. Actions → Monitor Organization Bot PRs 2. Click "..." → Disable workflow
Investigation (While Disabled)¶
- Check recent workflow runs for errors
- Review PRs the bot touched recently
- Check for incorrect merges or fixes
- Identify the issue
Fix and Re-enable¶
- Make necessary corrections (prompts, logic, secrets)
- Test manually first
- Re-enable workflow:
Full Rollback (If Needed)¶
The bot makes no permanent infrastructure changes: - Remove secrets from repository - Disable workflows - No cleanup needed in target repositories - Bot comments remain (harmless documentation)
Cost Estimation¶
Claude API Costs¶
Assuming: - 10 Dependabot PRs/day across org - 50% need fixes (5/day) - ~2000 tokens per fix request - Claude Sonnet: ~$0.001 per 1K tokens
Estimated cost: $0.01-0.05 per day = $0.30-1.50 per month
GitHub API Costs¶
- Organization scanning: Free (read operations)
- PR operations: Free (within rate limits)
- Actions minutes: Free for public repos, included in plan for private
Total estimated cost: <$2/month
Team Communication¶
Announcement Template¶
## 🤖 New: AI Engineering Maintenance Bot
We've deployed an automated bot to help manage Dependabot PRs across all VectorInstitute repositories.
**What it does:**
- Automatically merges Dependabot PRs when all checks pass
- Attempts to fix common issues (linting, tests, security) in failing PRs
- Leaves comments on PRs it processes
**What you need to do:**
- Nothing! The bot works automatically
- You can still manually review and merge if you prefer
- Report any issues to the AI Engineering team
**More info:** https://github.com/VectorInstitute/aieng-bot-maintain
Post in: - Organization Slack/Teams channel - Team meetings - Internal wiki/docs
Long-Term Maintenance¶
Monthly Tasks¶
- Review bot effectiveness
- Update prompts based on patterns
- Check API costs vs budget
- Rotate tokens if needed
Quarterly Tasks¶
- Analyze metrics and trends
- Consider feature enhancements
- Update to new Gemini models
- Review security and permissions
Annual Tasks¶
- Full security audit
- Architecture review
- Cost-benefit analysis
- Team satisfaction survey
Scaling Considerations¶
Current Capacity¶
- Handles 100s of repos
- Processes unlimited PRs
- Limited by: GitHub API rate limits, Claude API quota
If Scaling Needed¶
- Increase Claude API quota
- Use GitHub App instead of PAT (higher rate limits)
- Shard by repository (multiple bot instances)
- Cache PR status to reduce API calls
Support and Updates¶
Getting Updates¶
Watch this repository for: - Bug fixes - New features - Improved prompts - Gemini model updates
Contributing Improvements¶
Submit PRs with: - Better prompt templates - Enhanced error handling - New failure type detection - Improved logging
🤖 AI Engineering Maintenance Bot - Maintaining Vector Institute Repositories built by AI Engineering