Introduction

Welcome to the Vector Institute MIDST challenge (Membership Inference over Diffusion-models-based Synthetic Tabular data) hosted at the 3rd IEEE Conference on Secure and Trustworthy Machine Learning (SaTML 2025).

In this challenge, you will evaluate the resilience of the synthetic tabular data generated by diffusion models against black-box and white-box membership inference attacks.

Challenge Overview

Synthetic data is often perceived as a silver-bullet solution to data anonymization and privacy-preserving data publishing. Drawn from generative models such as diffusion models, synthetic data is expected to preserve the statistical properties of the original dataset while remaining resilient to privacy attacks. Recent diffusion models have proven effective on a wide range of data types, but their privacy resilience, particularly for tabular formats, remains largely unexplored.

In this challenge, we seek a quantitative evaluation of the privacy gain of synthetic tabular data generated by diffusion models, with a specific focus on its resistance to membership inference attacks (MIAs). Given the heterogeneity and complexity of tabular data, we will explore multiple target models for MIAs, including diffusion models for single tables of mixed data types and multi-relational tables with interconnected constraints. We expect the development of novel black-box and white-box MIAs tailored to these target diffusion models as a key outcome, enabling a comprehensive evaluation of their privacy efficacy.

For each task in MIDST, you are given a set of challenge points, and the aim is to decide which of these challenge points were used to train the model. You can compete on any of four separate membership inference tasks. Each task will be scored separately. You do not need to participate in all of them, and can choose to participate in as many as you like. Throughout the competition, submissions will be scored on a subset of the evaluation data and ranked on a live scoreboard. When submission closes, the final scores will be computed on a separate subset of the evaluation data.

The winner of each task will be eligible for an award of $2000 CAD and the runner-up of each task for an award of $1000 CAD (in the event of tied entries, these awards may be adjusted). This competition is co-located with the IEEE Conference on Secure and Trustworthy Machine Learning (SaTML) 2025, and the winners will be invited to present their strategies at the conference.

Task Details

The generative models are trained on the training data set and then used to generate synthetic data. They are expected to learn the statistics of the data without memorizing individual records. To evaluate this promise, membership inference attacks assess whether the model behaves differently on the training data set and the holdout data set, both of which are derived from the same, larger data set.

For each of the four tasks, we train a set of models on different splits of a public dataset. For each of these models, we provide m challenge points, exactly half of which are members (i.e., used to train the model) and half of which are non-members (i.e., from the holdout set; they come from the same public dataset as the training set, but were not used to train the model). Your goal is to determine which challenge points are members and which are non-members.
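The member/non-member setup described above can be reproduced locally, for example when training your own shadow models. The following is a minimal sketch under stated assumptions: the function name make_challenge_set, its parameters, and the sampling procedure are hypothetical and only illustrate the balanced split; the organizers' actual procedure may differ.

```python
import numpy as np

def make_challenge_set(n_records, n_train, m, seed=0):
    """Split record indices into train/holdout and draw m balanced challenge points.

    Hypothetical helper: returns (train_idx, challenge_idx, is_member).
    """
    half = m // 2
    if m % 2 != 0 or half > n_train or half > n_records - n_train:
        raise ValueError("m must be even and fit within both train and holdout sets")
    rng = np.random.default_rng(seed)

    # Random split of the public dataset into a training set and a holdout set.
    perm = rng.permutation(n_records)
    train_idx, holdout_idx = perm[:n_train], perm[n_train:]

    # Half of the challenge points are members, half are non-members.
    members = rng.choice(train_idx, size=half, replace=False)
    non_members = rng.choice(holdout_idx, size=half, replace=False)
    challenge_idx = np.concatenate([members, non_members])
    is_member = np.concatenate([np.ones(half, dtype=bool), np.zeros(half, dtype=bool)])

    # Shuffle so that a point's position reveals nothing about its membership.
    order = rng.permutation(m)
    return train_idx, challenge_idx[order], is_member[order]
```

Training a model on train_idx and attacking challenge_idx then gives a labelled practice instance of the same game, with is_member as the ground truth.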

This challenge is composed of four different tasks, each associated with a separate category. The categories are defined based on the access to the generative models and the type of the tabular data as follows:

Note: In white-box attacks, you have access to the models and their generated synthetic output. Training sets for these models are selected from a public dataset. In black-box attacks, you have access to the same information as in white-box attacks, except for the models themselves.

To facilitate participation in MIDST, we provide shadow models for both single-table and multi-table tasks. The shadow models are the same for black-box and white-box tasks. You are free to use these shadow models and/or train your own when developing your MIAs.

Models and Datasets

MIDST examines the privacy of three recent diffusion-model-based tabular synthesis approaches:

Each of these models has a dedicated directory in the MIDST Models repository, which will be made publicly available on December 1st. Each directory contains a README file that provides an overview of the topic, prerequisites, and notebook descriptions.

Submissions and Scoring

Submissions will be ranked based on their performance in membership inference against the associated models.

There are three sets of challenges: train, dev, and final. For models in train, we reveal the full training dataset, and consequently the ground truth membership data for challenge points. These models can be used by participants to develop their attacks. For models in the dev and final sets, no ground truth is revealed and participants must submit their membership predictions for challenge points.

During the competition, there will be a live scoreboard based on the dev challenges. The final ranking will be decided on the final set; scoring for this set will be withheld until the competition ends.

For each challenge point, the submission must provide a value, indicating the confidence level with which the challenge point is a member. Each value must be a floating point number in the range [0.0, 1.0], where 1.0 indicates certainty that the challenge point is a member, and 0.0 indicates certainty that it is a non-member.
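Many attacks produce raw scores (losses, log-likelihood ratios, distances) that fall outside [0.0, 1.0]. The sketch below shows one way to map such scores into the required range with min-max scaling. The helper name to_confidences is hypothetical, and it assumes that higher raw scores mean "more likely a member".

```python
import numpy as np

def to_confidences(raw_scores):
    """Min-max scale arbitrary attack scores into the range [0.0, 1.0].

    Assumes higher raw scores indicate stronger evidence of membership.
    """
    scores = np.asarray(raw_scores, dtype=float)
    if scores.size == 0 or not np.all(np.isfinite(scores)):
        raise ValueError("scores must be non-empty and finite")
    lo, hi = scores.min(), scores.max()
    if hi == lo:
        # All scores identical: no ranking information, so return 0.5 everywhere.
        return np.full_like(scores, 0.5)
    return (scores - lo) / (hi - lo)
```

Because min-max scaling is strictly increasing, it preserves the ordering of the challenge points. Since confidence values for all models are compared against shared thresholds at scoring time, it is probably safer to apply one transformation to the scores of all models together rather than rescaling each model separately.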

Submissions will be evaluated according to their True Positive Rate at 10% False Positive Rate (i.e. TPR @ 0.1 FPR). In this context, positive challenge points are members and negative challenge points are non-members. For each submission, the scoring program concatenates the confidence values for all models (dev and final treated separately) and compares these to the reference ground truth. The scoring program determines the minimum confidence threshold for membership such that at most 10% of the non-member challenge points are incorrectly classified as members. The score is the True Positive Rate achieved by this threshold (i.e., the proportion of correctly classified member challenge points). The live scoreboard shows additional scores (i.e., TPR at other FPRs, membership inference advantage, accuracy, AUC-ROC score), but these are only informational.
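To evaluate an attack locally on the train challenges, where ground truth is available, the metric described above can be approximated as follows. This is a minimal sketch, not the official scoring program: the function name tpr_at_fpr is hypothetical, and it assumes a point is classified as a member when its confidence is greater than or equal to the threshold.

```python
import numpy as np

def tpr_at_fpr(labels, confidences, max_fpr=0.1):
    """True Positive Rate at the lowest threshold whose FPR is at most max_fpr.

    labels: 1/True for members, 0/False for non-members.
    Assumption: a point is predicted "member" when confidence >= threshold.
    """
    labels = np.asarray(labels, dtype=bool)
    conf = np.asarray(confidences, dtype=float)
    n_pos = labels.sum()
    n_neg = (~labels).sum()
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one member and one non-member")

    best_tpr = 0.0
    # Walk candidate thresholds from highest to lowest; FPR can only grow.
    for threshold in np.sort(np.unique(conf))[::-1]:
        predicted = conf >= threshold
        fpr = (predicted & ~labels).sum() / n_neg
        if fpr > max_fpr:
            break  # every lower threshold would also exceed the FPR budget
        best_tpr = (predicted & labels).sum() / n_pos
    return float(best_tpr)
```

Note that if every challenge point receives the same confidence, no threshold satisfies the FPR budget while accepting any point, so the sketch returns 0.0. Only the ordering of the confidence values matters for this metric, not their absolute magnitude.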

You are allowed to make multiple submissions, but only your latest submission will be considered. In order for a submission to be valid, you must submit confidence values for all challenge points in all three scenarios of the task.

Winner Selection

Winners will be selected independently for each task (i.e. if you choose not to participate in certain tasks, this will not affect your rank for the tasks in which you do participate). For each task, the winner will be the one achieving the highest average score (TPR @ 0.1 FPR) across the three scenarios.
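The per-task ranking score is a plain arithmetic mean of the per-scenario scores. A small sketch of that arithmetic follows; the function name task_score and the scenario keys are hypothetical placeholders, since the scenario names are not specified here.

```python
def task_score(scenario_scores, expected_scenarios=3):
    """Average TPR @ 0.1 FPR across the scenarios of one task.

    scenario_scores: mapping from a (hypothetical) scenario name to its score.
    """
    scores = list(scenario_scores.values())
    if len(scores) != expected_scenarios:
        # A submission missing a scenario would not be valid.
        raise ValueError(f"expected {expected_scenarios} scenario scores, got {len(scores)}")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("each score is a rate and must lie in [0.0, 1.0]")
    return sum(scores) / len(scores)
```

For example, scores of 0.2, 0.4, and 0.6 on the three scenarios average to approximately 0.4, so a weak result on one scenario lowers the task score even when the other two are strong.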

Important Dates

Terms and Conditions

Codabench Competitions

Getting Started

First, register on CodaBench for the tasks in which you would like to participate. Upon registration, you will be given URLs from which to download the challenge data.

Event Organizers

Meet the Event Organizers

Event Sponsors

Meet the Event Sponsors

FAQ

Browse FAQ

Acknowledgements

We’d like to thank the MICO organizers for their open-source project and very helpful comments.

Contact

For more information or help with navigating our repository, please contact masoumeh@vectorinstitute.ai, xi.he@vectorinstitute.ai or john.jewell@vectorinstitute.ai.