
Implementation Details

Each repository in this catalog contains implementations of specific machine learning techniques and algorithms. The following information is provided for each repository:

  • Repository Link: Direct link to the GitHub repository
  • Description: Brief introduction to the repository's purpose and links to relevant research papers
  • Algorithms: List of ML algorithms demonstrated in the repository
  • Datasets: Information on datasets used, with links to publicly available data
  • Type: The category of implementation:
    • bootcamp: Educational implementations developed for workshops and learning purposes
    • tool: Production-ready, reusable libraries and frameworks for practical use
    • applied-research: Research implementations tied to specific papers or novel methodologies
  • Year: The year the implementation was published
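
As a rough illustration, the fields above could be modeled as a small record type. This is a minimal sketch, not the catalog's actual schema: the class name `CatalogEntry`, the attribute names, and the example values are all assumptions made for illustration. Only the three type values come from the list above.

```python
from dataclasses import dataclass, field
from typing import List

# The three implementation types listed in the catalog description.
VALID_TYPES = {"bootcamp", "tool", "applied-research"}


@dataclass
class CatalogEntry:
    """Hypothetical record for one repository (names are illustrative)."""

    repository_link: str
    description: str
    type: str
    year: int
    algorithms: List[str] = field(default_factory=list)
    datasets: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject any type outside the three documented categories.
        if self.type not in VALID_TYPES:
            raise ValueError(
                f"unknown type {self.type!r}; expected one of {sorted(VALID_TYPES)}"
            )


# Placeholder values only; this is not a real catalog entry.
entry = CatalogEntry(
    repository_link="https://example.com/org/example-repo",
    description="Placeholder description of an example repository.",
    type="tool",
    year=2024,
    algorithms=["example-algorithm"],
    datasets=["example-dataset"],
)
print(entry.type, entry.year)
```

Validating the type at construction time keeps a misspelled category from silently creating a fourth group when entries are later sorted by type.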

Usage Notes

Note

Many repositories contain code for reference purposes only. Running them may require updates to the code and environment files.

Links are provided only for publicly available datasets. Many of the datasets used in the repositories are available only on the Vector cluster.

Repository Categories

The catalog is organized by implementation type to help you quickly find the resources you need. Each category serves different audiences and use cases:

🛠️ Tool Category

Purpose: Reusable libraries and frameworks for practical use

Key Characteristics:

  • Reliable, well-maintained software with clear examples
  • Comprehensive documentation and APIs
  • Focus on ease of use and integration
  • Broad applicability across use cases

Examples:

  • fl4health: Modular federated learning library with 15+ implemented algorithms
  • cyclops: Healthcare ML toolkit with data processing and model deployment capabilities
  • vector-inference: LLM inference system with CLI, Python API, and OpenAI-compatible server
  • fair-sense-ai: AI bias analysis tool for text and visual content
  • florist: Framework for federated learning workflows

🎓 Bootcamp Category

Purpose: Educational implementations developed for workshops and learning purposes

Key Characteristics:

  • Step-by-step learning materials
  • Simplified implementations for educational clarity
  • Workshop-ready code examples
  • Focus on understanding core concepts

🔬 Applied-Research Category

Purpose: Research implementations tied to specific papers or novel methodologies

Key Characteristics:

  • Often include paper_url and bibtex fields
  • Implement specific research contributions
  • Focus on reproducibility and experimentation
  • May include novel datasets or pre-trained models
  • Code directly supports published research

Examples:

  • bias-mitigation-unlearning: Implements methods from an EMNLP 2024 paper (Negation via Task Vectors, PCGU)
  • kg-rag: Framework for Knowledge Graph RAG research with specific implementations
  • atomgen: Transformer-based atomistic graph models with pre-trained models
  • pmc-data-extraction: Research toolkit for Open-PMC data benchmarking

Each implementation includes algorithm tags, dataset information, and other metadata to aid in discovery.
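
To show how such metadata can support discovery, here is a minimal sketch that filters entries by implementation type and algorithm tag. The in-memory list, its dictionary keys, the entry names, and the `find_entries` helper are hypothetical and exist only for this example. They do not describe how the catalog itself stores or searches its data.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory catalog; every entry here is a placeholder.
catalog: List[Dict] = [
    {"name": "repo-a", "type": "tool", "algorithms": ["federated learning"]},
    {"name": "repo-b", "type": "bootcamp", "algorithms": ["rag"]},
    {
        "name": "repo-c",
        "type": "applied-research",
        "algorithms": ["rag", "knowledge graphs"],
        "paper_url": "https://example.com/paper",  # placeholder URL
    },
]


def find_entries(
    entries: List[Dict],
    impl_type: Optional[str] = None,
    algorithm: Optional[str] = None,
) -> List[Dict]:
    """Return entries matching the given type and/or algorithm tag."""
    results = []
    for item in entries:
        if impl_type is not None and item["type"] != impl_type:
            continue
        if algorithm is not None:
            # Compare tags case-insensitively.
            tags = [tag.lower() for tag in item.get("algorithms", [])]
            if algorithm.lower() not in tags:
                continue
        results.append(item)
    return results


rag_names = [e["name"] for e in find_entries(catalog, algorithm="RAG")]
research_names = [
    e["name"]
    for e in find_entries(catalog, impl_type="applied-research", algorithm="rag")
]
print(rag_names)       # ['repo-b', 'repo-c']
print(research_names)  # ['repo-c']
```

Both filters are optional, so the same helper can list everything in one category, everything carrying one tag, or the intersection of the two.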

🤝 Contributing

If you are a Vector researcher or engineer and would like to add your implementation to this catalog, you can contribute by following our contribution guidelines.

📝 Submit Issues or Suggestions

Please use our provided templates.

❓ Questions

For any questions, please reach out to the AI Engineering team at Vector Institute.