
A curated collection of high-quality AI implementations developed by researchers and engineers at the Vector Institute

115 Implementations
7 Years of Research

Browse Implementations by Type

atomgen

2024 applied-research

A library for handling atomistic graph datasets, with a focus on transformer-based implementations. It includes utilities for training various models and experimenting with different pre-training tasks, plus a suite of pre-trained models with Hugging Face integrations

AtomFormer, SchNet, TokenGT

bias-mitigation-unlearning

2024 applied-research

A repository for social bias mitigation in LLMs using machine unlearning

kg-rag

2025 applied-research

A comprehensive framework for Knowledge Graph Retrieval Augmented Generation (KG-RAG).

Datasets: SEC 10-Q

pmc-data-extraction

2024 applied-research

A toolkit to download, augment, and benchmark Open-PMC data

A repository of reference implementations for retrieval-augmented generation

Web Search, Document Search, SQL Search, Cloud Search, PubMed QA, RAG Evaluation

ai-deployment

2024 bootcamp

A repository with reference implementations for deploying AI models in production environments, focusing on best practices and cloud-native solutions.

anomaly-detection

2023 bootcamp

A repository with implementations of anomaly detection techniques

Logistic Regression (Supervised), Random Forest (Supervised), XGBoost (Supervised), CatBoost (Supervised), LightGBM (Supervised), TabNet (Supervised and Semi-supervised), Autoencoder (AE) (Unsupervised), Isolation Forest (Unsupervised)

diffusion-models

2024 bootcamp

A repository with demos for various diffusion models for tabular and time series data

TabDDPM, TabSyn, ClavaDDPM, CSDI, TSDiff

finetuning-and-alignment

2024 bootcamp

A repository with implementations of advanced fine-tuning techniques and approaches that enhance Large Language Model performance and reduce computational cost, with a focus on alignment with human values

FSDP, DDP, Instruction Tuning, PEFT, Quantization, Supervised Fine-tuning

interpretability

2025 bootcamp

A repository providing reference implementations and resources for the 2025 Bootcamp on Interpretable and Explainable AI, covering both post-hoc explainability methods and interpretable models

A repository with implementations of privacy-enhancing techniques for machine learning

Differential Privacy (tensorflow_privacy), PATE, Membership Inference Attacks, Horizontal Federated Learning, Vertical Federated Learning, Homomorphic Encryption

recommender-systems

2022 bootcamp

A repository with implementations of recommender systems

Matrix Factorization, Collaborative Filtering, Content-Based Filtering, Sequence Aware Recommender Systems, Session-Based Recommender Systems, Knowledge Graph-Based Recommender Systems

self-supervised-learning

2024 bootcamp

A repository with reference implementations of self-supervised learning techniques

cyclops

2024 tool

A toolkit for facilitating research and deployment of ML models for healthcare

Binary Classification, Multi-label Classification, Tabular Data Processing, Time-series Data Processing, Image Data Processing, Dataset Shift Detection, Model Report Card Generation

fair-sense-ai

2025 tool

An AI-powered tool designed to analyze bias in text and visual content, with a focus on risk identification, mitigation, and promoting sustainable and trustworthy AI systems

Text Bias Analysis, Image Bias Analysis, Batch Text CSV Analysis, Batch Image Analysis, AI Risk Management, Green AI Optimization, Bias Scoring and Assessment

fed-rag

2025 tool

A framework for fine-tuning retrieval-augmented generation (RAG) systems.

Basic fine-tuning with FL, RA-DIT

fl4health

2024 tool

A flexible, modular, and easy-to-use library to facilitate federated learning research and development in healthcare settings

florist

2024 tool

A platform to launch and monitor Federated Learning (FL) training jobs, designed to bridge the gap between FL algorithm implementations and practical healthcare applications

FL Job Orchestration, Training Job Monitoring, Client-Server Communication, FL4Health Integration, Web-based Job Configuration, Docker-based Deployment, Multi-client Support

mmlearn

2024 tool

A toolkit for research on multimodal representation learning

Contrastive Pretraining, I-JEPA

odyssey

2024 tool

A comprehensive library for developing foundation models using Electronic Health Record (EHR) data, with a focus on advanced medical data processing and modeling

EHRMamba, CEHR-BERT, BigBird, MultiBird, LSTM, XGBoost, Multitask Prompted Finetuning (MPF), Next Token Prediction (NTP)
Datasets: MIMIC-IV

vector-inference

2024 tool

Efficient LLM inference on Slurm clusters using vLLM

CLI, Python API, OpenAI-compatible server