
Testing Guide

Test Environment Setup

  1. Fork a repository or use a test repository
  2. Enable Dependabot: Settings → Security → Dependabot
  3. Install workflows: Copy .github/workflows/ to test repo
  4. Add secrets: Set ANTHROPIC_API_KEY
  5. Enable auto-merge: Settings → General → Pull Requests

Test Cases

Test Case 1: Auto-Merge Success

Goal: Verify bot auto-merges PRs when all checks pass

Steps:

  1. Wait for or create a Dependabot PR
  2. Ensure all checks pass
  3. Monitor the auto-merge-dependabot workflow

Expected:

  - Bot approves the PR with a comment
  - Auto-merge is enabled
  - PR merges automatically
  - Success comment is posted
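To confirm auto-merge was actually enabled, the PR's `autoMergeRequest` field can be inspected. The JSON below is a hypothetical stand-in for what `gh pr view PR_NUMBER --json autoMergeRequest` returns:

```shell
# In a real run, pipe from: gh pr view PR_NUMBER --json autoMergeRequest
# The sample below is a hypothetical stand-in for that output.
echo '{"autoMergeRequest":{"enabledAt":"2024-01-01T00:00:00Z"}}' \
  | jq '.autoMergeRequest != null'   # prints "true" when auto-merge is enabled
```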

Test Case 2: Fix Test Failures

Goal: Verify bot fixes failing tests

Test PR: aieng-template-mvp#17

Steps:

  1. Trigger on a PR with frontend-tests failing
  2. Monitor the fix-failing-pr workflow
  3. Check bot comments
  4. Review pushed changes

Expected:

  - Bot detects the test failure
  - Comments "Attempting to fix"
  - Pushes a fix commit
  - Comments the result

Test Case 3: Fix Linting Issues

Goal: Verify bot fixes linting problems

Setup:

  1. Create a PR that introduces linting errors
  2. Ensure linting checks fail

Expected:

  - Bot identifies lint failures
  - Runs auto-fixers (eslint --fix, prettier, black)
  - Commits fixes
  - Checks pass
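One way to check whether the failures are within reach of the auto-fixers is ESLint's JSON report, which counts how many errors `--fix` can handle. The report below is a hypothetical stand-in for `npx eslint . -f json` output:

```shell
# In a real run, pipe from: npx eslint . -f json
# Sample report (hypothetical): two errors, both auto-fixable.
echo '[{"filePath":"src/app.js","errorCount":2,"fixableErrorCount":2}]' \
  | jq '[.[] | select(.errorCount > .fixableErrorCount)] | length == 0'
```

This prints `true` when every reported error is fixable by `eslint --fix`, i.e. a good candidate for the bot.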

Test Case 4: Security Vulnerabilities

Goal: Verify bot updates vulnerable dependencies

Setup:

  1. Dependabot PR with pip-audit failures
  2. Security scan shows CVEs

Expected:

  - Bot identifies vulnerable packages
  - Updates to patched versions
  - Updates requirements.txt
  - Commits with CVE references
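The vulnerable packages, patched versions, and CVE ids can be pulled out of pip-audit's JSON report. The report below is a hypothetical sketch of the shape `pip-audit -r requirements.txt -f json` emits:

```shell
# In a real run, pipe from: pip-audit -r requirements.txt -f json
# Sample report (hypothetical shape):
echo '{"dependencies":[{"name":"requests","version":"2.25.0","vulns":[{"id":"CVE-2023-32681","fix_versions":["2.31.0"]}]}]}' \
  | jq -r '.dependencies[] | select(.vulns | length > 0)
           | "\(.name) \(.version) -> \(.vulns[0].fix_versions[0]) (\(.vulns[0].id))"'
```

The output (`requests 2.25.0 -> 2.31.0 (CVE-2023-32681)`) is exactly what should show up in the bot's commit message for this test case.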

Test Case 5: Build Failures

Goal: Verify bot fixes build errors

Setup:

  1. PR with TypeScript compilation errors
  2. Build check fails

Expected:

  - Bot analyzes build logs
  - Identifies type errors
  - Updates type definitions
  - Build passes
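Type errors can be pulled out of the build log with a simple filter on the `error TS` marker `tsc` uses. The log lines below are hypothetical samples:

```shell
# In a real run, pipe from the build log (e.g. gh run view RUN_ID --log).
# Sample tsc output (hypothetical):
printf '%s\n' \
  "src/app.ts(10,5): error TS2322: Type 'string' is not assignable to type 'number'." \
  "src/app.ts(22,1): error TS2554: Expected 2 arguments, but got 1." \
  "Found 2 errors." \
  | grep -c 'error TS'   # counts the type errors
```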

Test Case 6: Manual Intervention Required

Goal: Verify bot correctly identifies unfixable issues

Setup:

  1. PR with complex breaking changes
  2. Multiple critical failures

Expected:

  - Bot attempts a fix
  - Recognizes its limitations
  - Comments: "Could not automatically fix"
  - Suggests manual review

Manual Testing

Trigger Workflows Manually

# Using GitHub CLI
gh workflow run auto-merge-dependabot.yml --repo VectorInstitute/your-repo

# Or via Actions tab:
# Actions → Select workflow → Run workflow

Test Individual Components

Test prompt loading:

# Check prompt files exist and are valid markdown
find .github/prompts -name "*.md" -exec npx markdownlint {} \;

Test failure detection:

# Simulate failure detection
gh pr view PR_NUMBER --json statusCheckRollup
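To pull the failing checks out of that JSON, a jq filter like the one below works. The sample is a hypothetical stand-in for the `statusCheckRollup` shape gh returns:

```shell
# In a real run, pipe from: gh pr view PR_NUMBER --json statusCheckRollup
echo '{"statusCheckRollup":[{"name":"frontend-tests","conclusion":"FAILURE"},{"name":"lint","conclusion":"SUCCESS"}]}' \
  | jq -r '.statusCheckRollup[] | select(.conclusion == "FAILURE") | .name'
```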

Test Claude API:

# Verify API key works
curl -s https://api.anthropic.com/v1/models \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01"

Integration Testing

Test Across Multiple Repos

  1. Select diverse repos:

     - Python project
     - JavaScript/TypeScript project
     - Mixed stack project

  2. Test different scenarios:

     - Patch updates (x.y.Z)
     - Minor updates (x.Y.0)
     - Major updates (X.0.0)
     - Security updates
     - Multiple dependency updates

  3. Monitor for:

     - False positives (incorrect merges)
     - False negatives (missed opportunities)
     - Failed fixes (bot breaks things)
     - API errors (Claude API failures)

Rollback Testing

Scenario: Bot makes incorrect changes

Steps:

  1. Create a PR with intentional issues
  2. Let the bot attempt a fix
  3. Verify the rollback mechanism
  4. Check git history

Expected:

  - Commits are atomic
  - Easy to revert
  - No data loss
  - Clear commit messages
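The revert path can be exercised locally in a scratch repository before trusting it on a real PR; a minimal sketch (identities and file names here are placeholders):

```shell
# Minimal local sketch: confirm a bot-style commit reverts cleanly.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.name=test -c user.email=test@example.com commit -q --allow-empty -m "initial"
echo "suspect change" > fix.txt
git add fix.txt
git -c user.name=test -c user.email=test@example.com commit -q -m "bot: attempt fix"
git -c user.name=test -c user.email=test@example.com revert --no-edit HEAD
test ! -f fix.txt && echo "reverted cleanly"
```

Because the bot's commit touched only one file, a single `git revert` undoes it completely, which is the "atomic, easy to revert" property this test case checks.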

Performance Testing

Metrics to Track

# Average time to auto-merge
# Average time to fix
# Success rate (fixes / attempts)
# API call count
# Cost per fix (Claude API)
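Time-to-merge can be estimated from workflow run timestamps. In a real run the JSON would come from `gh run list --workflow=auto-merge-dependabot.yml --json createdAt,updatedAt --limit 50`; the sample below is hypothetical:

```shell
# In a real run, pipe from:
#   gh run list --workflow=auto-merge-dependabot.yml --json createdAt,updatedAt --limit 50
# Sample (hypothetical): two runs taking 60s and 120s.
echo '[{"createdAt":"2024-01-01T00:00:00Z","updatedAt":"2024-01-01T00:01:00Z"},
      {"createdAt":"2024-01-01T01:00:00Z","updatedAt":"2024-01-01T01:02:00Z"}]' \
  | jq '[.[] | (.updatedAt | fromdateiso8601) - (.createdAt | fromdateiso8601)]
        | add / length'   # average run duration in seconds
```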

Load Testing

Scenario: Multiple PRs simultaneously

Setup:

  1. Create 10+ Dependabot PRs
  2. Some passing, some failing
  3. Monitor the workflow queue

Expected:

  - All PRs processed
  - No race conditions
  - No duplicate fixes
  - Workflows don't block each other
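One common guard against race conditions and duplicate fixes is a per-PR concurrency group in the workflow file. A sketch, assuming a pull_request-style trigger (the group name is illustrative, and the right key expression depends on the workflow's actual trigger):

```yaml
# Sketch: serialize runs per PR so concurrent events don't race (names illustrative)
concurrency:
  group: fix-failing-pr-${{ github.event.pull_request.number }}
  cancel-in-progress: false
```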

Debugging Tests

Check Workflow Logs

# Get latest run for a workflow
gh run list --workflow=auto-merge-dependabot.yml --limit 1

# View logs
gh run view RUN_ID --log

# Download logs
gh run download RUN_ID

Common Issues

| Issue                    | Debug Steps                          |
|--------------------------|--------------------------------------|
| Workflow doesn't trigger | Check event triggers, PR author      |
| API errors               | Verify secrets, check quotas         |
| Fixes don't work         | Review prompt, check model           |
| Can't push               | Check permissions, branch protection |
Test Documentation

Record test results:

## Test Run: YYYY-MM-DD

**Environment**: [staging/production/test-repo]
**Test Cases**: [list]
**Results**: [pass/fail counts]
**Issues Found**: [list]
**Actions Taken**: [fixes applied]

Continuous Testing

Schedule regular tests:

  - Weekly: Run test suite
  - Monthly: Load testing
  - Quarterly: Full integration test
  - After changes: Regression testing

Success Criteria

✅ Auto-merge: 95%+ success rate
✅ Auto-fix: 70%+ success rate
✅ No false positives: 0 incorrect merges
✅ Fast execution: <5 min average
✅ Cost effective: <$1 per fix average


🤖 AI Engineering Maintenance Bot - Maintaining Vector Institute Repositories built by AI Engineering