
Projects

This page summarizes each project's scope and links to its repository, documentation, and quickstart. Installation instructions are in each project's own README.


Overview

  • What is this? — A collection of research tools, benchmarks, and evaluation frameworks for responsible AI: fairness, explainability, multimodal understanding, and agentic systems.
  • Who is it for? — Researchers and practitioners working on fairness-aware AI, multimodal benchmarking, explainable agentic systems, and reproducible AI evaluation.
  • How is it organized? — Some projects have their own standalone repositories; others live as modules in the main AIXpert repo. Each section below links to the relevant repo and docs.

UnBias-Plus

UnBias-Plus is an AI-driven toolkit for bias detection and debiasing in text. It locates biased segments, classifies severity, explains each span, suggests neutral wording, and returns a full neutral rewrite—usable from the CLI, REST API (FastAPI + demo UI), or Python (UnBiasPlus).
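To make the span-level output concrete, here is a minimal pure-Python sketch of the detect → explain → suggest → rewrite pipeline described above. The names (`BiasSpan`, `DebiasResult`, `debias`) and the toy single-word rule are hypothetical illustrations, not the actual `UnBiasPlus` API; see the README for the real interface.

```python
from dataclasses import dataclass

@dataclass
class BiasSpan:
    text: str          # flagged segment
    severity: str      # e.g. "low" / "medium" / "high"
    explanation: str   # why the span was flagged
    suggestion: str    # neutral replacement

@dataclass
class DebiasResult:
    original: str
    spans: list
    neutral_rewrite: str

def debias(text: str) -> DebiasResult:
    # Toy rule: flag the loaded word "bossy" and neutralize it.
    spans = []
    rewrite = text
    if "bossy" in text:
        spans.append(BiasSpan("bossy", "medium",
                              "gendered, judgment-laden descriptor",
                              "assertive"))
        rewrite = text.replace("bossy", "assertive")
    return DebiasResult(text, spans, rewrite)

result = debias("She is bossy in meetings.")
print(result.neutral_rewrite)  # She is assertive in meetings.
```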

Links: GitHub · Project page · PyPI


FairSense-AgentiX

FairSense-AgentiX is an agentic bias detection and AI-risk analysis platform for text, images, and datasets. A reasoning agent plans per input type, selects tools (OCR, vision models, embeddings, retrieval), critiques and refines outputs, and explains steps via telemetry—aiming for more transparent, context-aware fairness checks than static classifiers.
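The plan → select tools → critique → refine loop above can be sketched in plain Python. Everything here (`TOOLS`, `plan`, `critique`, `analyze`, the telemetry list) is an illustrative stand-in for the agent's behavior, not the package's API:

```python
# Hypothetical per-modality tools; real ones would wrap OCR, vision
# models, embeddings, or retrieval.
TOOLS = {
    "text":  lambda x: {"bias_score": 0.8 if "always" in x else 0.1},
    "image": lambda x: {"bias_score": 0.5},
}

def plan(input_type):
    # The reasoning agent would choose steps per input type.
    return ["run_tool", "critique", "refine"]

def critique(report):
    return report["bias_score"] > 0.7   # flag high-risk outputs

def analyze(input_type, payload):
    telemetry = []                      # explains steps taken
    report = None
    for step in plan(input_type):
        if step == "run_tool":
            report = TOOLS[input_type](payload)
        elif step == "critique":
            report["flagged"] = critique(report)
        elif step == "refine":
            if report["flagged"]:
                report["note"] = "overgeneralization detected"
        telemetry.append(step)
    return report, telemetry

report, trace = analyze("text", "Men are always better leaders.")
```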

Links: GitHub · Project page · PyPI


SONIC-O1

Real-world benchmark for evaluating multimodal LLMs on audio-video understanding: short- to long-form videos across 13 conversational domains (job interviews, medical, legal, etc.), with three tasks — summarization, multiple-choice QA, and temporal localization — and demographic metadata for fairness analysis.
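Temporal localization is commonly scored with intersection-over-union (IoU) between the predicted and gold time segments; whether SONIC-O1 scores it exactly this way (e.g. raw IoU vs. accuracy at IoU thresholds) is an assumption here, but the underlying computation looks like:

```python
def temporal_iou(pred, gold):
    """Intersection-over-union of two (start, end) segments in seconds."""
    (ps, pe), (gs, ge) = pred, gold
    inter = max(0.0, min(pe, ge) - max(ps, gs))   # overlap length
    union = max(pe, ge) - min(ps, gs)             # total span covered
    return inter / union if union > 0 else 0.0

# A prediction of [10, 25] s against a gold segment of [12, 30] s:
print(round(temporal_iou((10, 25), (12, 30)), 3))  # 0.65
```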

Links: GitHub · Project page · Dataset · Leaderboard


SONIC-O1 Multi-Agent

Compound multi-agent system for audio-video understanding with Qwen3-Omni: planner, reasoner, and reflection agents with chain-of-thought reasoning, self-reflection, temporal grounding, and optional multi-step task decomposition. Built on LangGraph and vLLM.

Links: GitHub


Explainable Agentic Evaluation Framework

Framework for evaluating explainability in both traditional (static) and agentic AI systems. Compares attribution-based explanations (e.g. SHAP, LIME) with trace-based diagnostics; shows that attribution is stable for static prediction but trace-grounded rubrics are needed to localize agentic failures.
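A toy example of why trace-grounded diagnostics can localize agentic failures where answer-level attribution cannot: given a step-by-step execution trace, a rubric can point at the first failing step even when the final answer looks fluent. The step names and trace format below are illustrative, not the framework's actual schema.

```python
# Hypothetical agent trace: each entry records a step and whether the
# rubric judged it sound.
trace = [
    {"step": "retrieve",  "ok": True},
    {"step": "plan",      "ok": True},
    {"step": "tool_call", "ok": False, "error": "wrong unit conversion"},
    {"step": "answer",    "ok": True},  # fluent, but built on a bad tool result
]

def first_failure(trace):
    """Return (index, step name) of the first failing step, else None."""
    for i, step in enumerate(trace):
        if not step["ok"]:
            return i, step["step"]
    return None

print(first_failure(trace))  # (2, 'tool_call')
```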

Links: GitHub · Project page


Factual Preference Alignment (F-DPO)

Factuality-aware Direct Preference Optimization: extends DPO with binary factuality labels and a factuality-aware margin to reduce LLM hallucinations without an auxiliary reward model. Single-stage and compute-efficient.
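The margin idea can be written down concretely. Standard DPO scores the gap between policy and reference log-probabilities of the chosen vs. rejected answer; a factuality-aware variant subtracts a margin inside the sigmoid when the chosen answer carries a positive factuality label, demanding a larger gap. The exact F-DPO formulation may differ; this is a sketch of that common pattern:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fdpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l,
              beta=0.1, margin=0.0, factual=True):
    """Pairwise DPO loss with a factuality-aware margin (sketch).

    logp_*     : policy log-probs of the chosen (w) / rejected (l) answer
    ref_logp_* : reference-model log-probs of the same answers
    margin     : extra gap enforced when the chosen answer is factual
    """
    delta = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    m = margin if factual else 0.0
    return -math.log(sigmoid(delta - m))

# The margin makes the objective stricter for factual pairs:
base = fdpo_loss(-1.0, -2.0, -1.2, -1.8)
marg = fdpo_loss(-1.0, -2.0, -1.2, -1.8, margin=1.0)
```

Note there is no auxiliary reward model anywhere in the objective, which is what keeps the method single-stage.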

Links: GitHub · Project page · Dataset


Modules in AIXpert

These modules live in the main AIXpert repository. Clone once, run uv sync, then follow the READMEs below for setup and commands.

Controlled Images

Baseline vs. fairness-aware image sets for occupations or social groups; configurable attributes, matched prompts, and reproducible seeds.
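A minimal sketch of what matched, seed-reproducible prompt pairs might look like; the attribute lists, templates, and function name are assumptions for illustration, not the module's actual configuration:

```python
import random

OCCUPATIONS = ["nurse", "engineer", "judge"]
ATTRIBUTES = ["a woman", "a man"]

def matched_prompts(seed=42):
    rng = random.Random(seed)            # reproducible sampling
    occ = rng.choice(OCCUPATIONS)
    baseline = f"A photo of a {occ}"     # attribute-free baseline prompt
    controlled = [f"A photo of {attr} working as a {occ}"
                  for attr in ATTRIBUTES]  # matched, attribute-controlled set
    return baseline, controlled

b1, c1 = matched_prompts()
b2, c2 = matched_prompts()               # same seed → identical pair
```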

Links: Module README

Synthetic Data Generation

Multi-modal synthesis: image + VQA pairs, textual scenes and MCQs, and video generation (Veo/Gemini). Driven by LLM-designed prompts and metadata templates.

Links: Images README · NLP README

Agent Pipeline (CrewAI)

Single-agent orchestration for prompt → image → metadata generation and large-scale data creation with structured JSON task definitions.

Links: Module README

Fairness & Explainability

Statistical metrics (e.g. Statistical Parity, Equal Opportunity), zero-shot explainers (integrated gradients, concept attributions), and visualization (disparity plots, attribution maps).
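The two named statistical metrics have simple closed forms: statistical parity compares positive-prediction rates across groups, and equal opportunity compares true-positive rates. A self-contained sketch for binary predictions and labels with groups "A" and "B" (the function names are illustrative, not the module's API):

```python
def statistical_parity_diff(preds, groups):
    """P(yhat=1 | group A) - P(yhat=1 | group B)."""
    def rate(g):
        sel = [p for p, gr in zip(preds, groups) if gr == g]
        return sum(sel) / len(sel)
    return rate("A") - rate("B")

def equal_opportunity_diff(preds, labels, groups):
    """True-positive-rate gap between groups A and B."""
    def tpr(g):
        pos = [p for p, y, gr in zip(preds, labels, groups)
               if gr == g and y == 1]
        return sum(pos) / len(pos)
    return tpr("A") - tpr("B")

preds  = [1, 0, 1, 1, 0, 1]
labels = [1, 0, 1, 0, 1, 1]
groups = ["A", "A", "A", "B", "B", "B"]
print(statistical_parity_diff(preds, groups))  # 0.0
```

A value of 0 means parity between the groups; the sign indicates which group is favored.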

Links: AIXpert repo


Contributing & Documentation

  • See CONTRIBUTING.md for coding standards (PEP8, Google docstrings), pre-commit hooks (ruff, mypy, typos, nbQA), branching, and tests.
  • Run docs locally: uv sync --group docs, then mkdocs serve and open http://127.0.0.1:8000
  • CI: GitHub Actions (code_checks.yml, unit_tests.yml, integration_tests.yml)