Implementation Details

Each repository in this catalog contains implementations of specific machine learning techniques and algorithms. The following information is provided for each repository:

- Repository Link: Direct link to the GitHub repository
- Description: Brief introduction to the repository's purpose and links to relevant research papers
- Algorithms: List of ML algorithms demonstrated in the repository
- Datasets: Information on datasets used, with links to publicly available data
- Type: The category of implementation:
  - bootcamp: Educational implementations developed for workshops and learning purposes
  - tool: Production-ready, reusable libraries and frameworks for practical use
  - applied-research: Research implementations tied to specific papers or novel methodologies
- Year: The year the implementation was published
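
As a rough illustration of the fields listed above, a catalog entry could be modelled as a small record. This is only a sketch: the class name `CatalogEntry`, the attribute names, and the placeholder values are assumptions made for illustration, not the catalog's actual schema.

```python
from dataclasses import dataclass, field
from typing import List

# The three implementation types named in the catalog.
VALID_TYPES = {"bootcamp", "tool", "applied-research"}


@dataclass
class CatalogEntry:
    """Hypothetical record mirroring the fields listed above."""

    repository_link: str
    description: str
    type: str
    year: int
    algorithms: List[str] = field(default_factory=list)
    datasets: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject anything that is not one of the three documented types.
        if self.type not in VALID_TYPES:
            raise ValueError(f"unknown type: {self.type!r}")


# Placeholder values only; this is not a real catalog entry.
entry = CatalogEntry(
    repository_link="https://github.com/example-org/example-repo",
    description="Placeholder description.",
    type="tool",
    year=2024,
    algorithms=["example-algorithm"],
)
```

Validating `type` at construction time keeps entries limited to the three documented categories.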

Usage Notes

Note

Many repositories contain code for reference purposes only. To run them, you may need to update the code and environment files.

Links are provided only for publicly available datasets. Many datasets used in the repositories are available only on the Vector cluster.

Repository Categories

The catalog is organized by implementation type to help you quickly find the resources you need. Each category serves different audiences and use cases:

🛠️ Tool Category

Purpose: Reusable libraries and frameworks for practical use

Key Characteristics:

Reliable, well-maintained software with clear examples
Comprehensive documentation and APIs
Focus on ease of use and integration
Broad applicability across use cases

Examples:

fl4health: Modular federated learning library with 15+ implemented algorithms
cyclops: Healthcare ML toolkit with data processing and model deployment capabilities
vector-inference: LLM inference system with CLI, Python API, and OpenAI-compatible server
fair-sense-ai: AI bias analysis tool for text and visual content
florist: Framework for federated learning workflows

🎓 Bootcamp Category

Purpose: Educational implementations developed for workshops and learning purposes

Key Characteristics:

Step-by-step learning materials
Simplified implementations for educational clarity
Workshop-ready code examples
Focus on understanding core concepts

🔬 Applied-Research Category

Purpose: Research implementations tied to specific papers or novel methodologies

Key Characteristics:

Often include paper_url and bibtex fields
Implement specific research contributions
Focus on reproducibility and experimentation
May include novel datasets or pre-trained models
Code directly supports published research

Examples:

bias-mitigation-unlearning: Implements specific EMNLP 2024 paper methods (Negation via Task Vectors, PCGU)
kg-rag: Framework for Knowledge Graph RAG research with specific implementations
atomgen: Transformer-based atomistic graph models with pre-trained models
pmc-data-extraction: Research toolkit for Open-PMC data benchmarking

Each implementation includes algorithm tags, dataset information, and other metadata to aid in discovery.
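
To make the discovery idea concrete, the sketch below filters a handful of entries by type and by algorithm tag. Everything in it is hypothetical: the entry names, tags, years, and the helper `find_entries` are placeholders and do not reflect the catalog's real data or tooling.

```python
from typing import Dict, List, Optional

# Hypothetical entries: names, tags, and years are placeholders, not catalog data.
ENTRIES: List[Dict] = [
    {"name": "repo-a", "type": "tool", "algorithms": ["algo-x", "algo-y"], "year": 2023},
    {"name": "repo-b", "type": "bootcamp", "algorithms": ["algo-x"], "year": 2022},
    {"name": "repo-c", "type": "applied-research", "algorithms": ["algo-z"], "year": 2024},
]


def find_entries(
    entries: List[Dict],
    type_: Optional[str] = None,
    algorithm: Optional[str] = None,
) -> List[Dict]:
    """Return entries matching every filter that is set (None means 'any')."""
    results = []
    for entry in entries:
        if type_ is not None and entry["type"] != type_:
            continue
        if algorithm is not None and algorithm not in entry["algorithms"]:
            continue
        results.append(entry)
    return results


tools = find_entries(ENTRIES, type_="tool")
uses_algo_x = find_entries(ENTRIES, algorithm="algo-x")
print([e["name"] for e in tools])        # ['repo-a']
print([e["name"] for e in uses_algo_x])  # ['repo-a', 'repo-b']
```

Both filters can be combined, for example `find_entries(ENTRIES, type_="bootcamp", algorithm="algo-x")`, to narrow results by category and tag at once.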

🤝 Contributing

If you are a Vector researcher or engineer and would like to add your implementation to this catalog, you can contribute by following our contribution guidelines.

📝 Submit Issues or Suggestions

Please use our provided templates:

🐛 Report a bug - for reporting problems or errors
✨ Request a feature - for suggesting improvements or new additions

❓ Questions

For any questions, please reach out to the AI Engineering team at Vector Institute.