Implementation Details
Each repository in this catalog contains implementations of specific machine learning techniques and algorithms. The following information is provided for each repository:
- Repository Link: Direct link to the GitHub repository
- Description: Brief introduction to the repository's purpose and links to relevant research papers
- Algorithms: List of ML algorithms demonstrated in the repository
- Datasets: Information on datasets used, with links to publicly available data
- Type: The category of implementation:
  - bootcamp: Educational implementations developed for workshops and learning purposes
  - tool: Production-ready, reusable libraries and frameworks for practical use
  - applied-research: Research implementations tied to specific papers or novel methodologies
- Year: The year the implementation was published
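The fields listed above can be pictured as a small record type. The following is a minimal sketch only: the class name `CatalogEntry`, its attribute names, and all sample values are assumptions made for illustration and are not taken from the catalog's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# The three implementation types named in the list above.
VALID_TYPES = {"bootcamp", "tool", "applied-research"}


@dataclass
class CatalogEntry:
    """Hypothetical record mirroring the per-repository fields described above."""

    repository_link: str
    description: str
    algorithms: list[str] = field(default_factory=list)
    datasets: list[str] = field(default_factory=list)
    type: str = "tool"
    year: int = 0

    def __post_init__(self) -> None:
        # Reject any type outside the three documented categories.
        if self.type not in VALID_TYPES:
            raise ValueError(f"unknown type: {self.type!r}")


# Placeholder values only; this is not a real catalog entry.
entry = CatalogEntry(
    repository_link="https://github.com/example-org/example-repo",
    description="Placeholder description of the repository.",
    algorithms=["example-algorithm"],
    datasets=["example-dataset"],
    type="tool",
    year=2024,
)
print(entry.type, entry.year)
```

Validating the type in `__post_init__` is one possible design choice: it keeps a mistyped category from silently entering a collection of entries.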
Usage Notes
Note
Many repositories contain code for reference purposes only. Running them may require updating the code and environment files.
Links are provided only for publicly available datasets. Many datasets used in the repositories are available only on the Vector cluster.
Repository Categories
The catalog is organized by implementation type to help you quickly find the resources you need. Each category serves different audiences and use cases:
🛠️ Tool Category
Purpose: Reusable libraries and frameworks for practical use
Key Characteristics:
- Reliable, well-maintained software with clear examples
- Comprehensive documentation and APIs
- Focus on ease of use and integration
- Broad applicability across use cases
Examples:
- fl4health: Modular federated learning library with 15+ implemented algorithms
- cyclops: Healthcare ML toolkit with data processing and model deployment capabilities
- vector-inference: LLM inference system with CLI, Python API, and OpenAI-compatible server
- fair-sense-ai: AI bias analysis tool for text and visual content
- florist: Framework for federated learning workflows
🎓 Bootcamp Category
Purpose: Educational implementations developed for workshops and learning purposes
Key Characteristics:
- Step-by-step learning materials
- Simplified implementations for educational clarity
- Workshop-ready code examples
- Focus on understanding core concepts
🔬 Applied-Research Category
Purpose: Research implementations tied to specific papers or novel methodologies
Key Characteristics:
- Often include paper_url and bibtex fields
- Implement specific research contributions
- Focus on reproducibility and experimentation
- May include novel datasets or pre-trained models
- Code directly supports published research
Examples:
- bias-mitigation-unlearning: Implements specific EMNLP 2024 paper methods (Negation via Task Vectors, PCGU)
- kg-rag: Framework for Knowledge Graph RAG research with specific implementations
- atomgen: Transformer-based atomistic graph models with pre-trained models
- pmc-data-extraction: Research toolkit for Open-PMC data benchmarking
Each implementation includes algorithm tags, dataset information, and other metadata to aid in discovery.
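As a rough illustration of how type and algorithm-tag metadata can support discovery, the sketch below groups entries by type and looks them up by tag. The dictionary keys, repository names, and tags are hypothetical placeholders, not the catalog's real data or API.

```python
from collections import defaultdict

# Hypothetical entries; names, tags, and the URL are placeholders.
entries = [
    {"name": "repo-a", "type": "tool", "algorithms": ["tag-one", "tag-two"]},
    {"name": "repo-b", "type": "bootcamp", "algorithms": ["tag-three"]},
    {
        "name": "repo-c",
        "type": "applied-research",
        "algorithms": ["Tag-Three"],
        "paper_url": "https://example.org/paper",
    },
]


def group_by_type(items):
    """Return a mapping from implementation type to repository names."""
    groups = defaultdict(list)
    for item in items:
        groups[item["type"]].append(item["name"])
    return dict(groups)


def find_by_algorithm(items, tag):
    """Return names of entries carrying the tag (case-insensitive match)."""
    wanted = tag.lower()
    return [
        item["name"]
        for item in items
        if wanted in (a.lower() for a in item["algorithms"])
    ]


print(group_by_type(entries))
print(find_by_algorithm(entries, "tag-three"))
```

Matching tags case-insensitively is an assumption made here so that small differences in capitalization do not hide an entry.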
🤝 Contributing
If you are a Vector researcher or engineer and would like to add your implementation to this catalog, you can contribute by following our contribution guidelines.
📝 Submit Issues or Suggestions
Please use our provided templates:
- 🐛 Report a bug - for reporting problems or errors
- ✨ Request a feature - for suggesting improvements or new additions
❓ Questions
For any questions, please reach out to the AI Engineering team at Vector Institute.