Announcing the Winners of the MIDST Challenge!
The Vector Institute MIDST challenge (Membership Inference over Diffusion-models-based Synthetic Tabular data) will be hosted at the 3rd IEEE Conference on Secure and Trustworthy Machine Learning (SaTML 2025). The competition was launched in December 2024, and final submissions were due on February 28th, 2025. We are excited to announce the winning submissions!
The goal of this challenge was to evaluate the resilience of the synthetic tabular data generated by diffusion models against black-box and white-box membership inference attacks. We sought a quantitative evaluation of the privacy gain of synthetic tabular data generated by diffusion models, with a specific focus on its resistance to membership inference attacks (MIAs). Given the heterogeneity and complexity of tabular data, we explored multiple target models for MIAs, including diffusion models for single tables of mixed data types and multi-relational tables with interconnected constraints. We expected the development of novel black-box and white-box MIAs tailored to these target diffusion models as a key outcome, enabling a comprehensive evaluation of their privacy efficacy. The competition materials are available in the MIDST GitHub repository.
- Challenge Tasks
- Evaluation Criteria
- Accessibility
- Transparency
- Result
- Analysis
- Event Organizers
- Event Sponsors
- Frequently Asked Questions
- Acknowledgements
- Next Steps
Challenge Tasks
This challenge was composed of four different tasks, each associated with a separate category. The categories were defined based on the access to the generative models and the type of the tabular data as follows:
- Access to the models: black-box, Data: single table
- Access to the models: white-box, Data: single table
- Access to the models: black-box, Data: multi-table
- Access to the models: white-box, Data: multi-table
To facilitate participation in MIDST, we developed shadow models for both the single-table and multi-table tasks. The shadow models were the same for the black-box and white-box tasks. Participants were free to use these shadow models and/or generate their own if needed in developing their MIAs.
We hosted the competition tasks as separate competitions on CodaBench.
Evaluation Criteria
Submissions were evaluated and ranked based on their true positive rate at a 10% false positive rate (TPR @ 10% FPR). This metric reflects a realistic attack scenario in which an adversary aims to accurately identify as many members as possible while allowing only a small margin for error. We also plotted full Receiver Operating Characteristic (ROC) curves for each attack and reported additional metrics, including the Area Under the Curve (AUC), overall accuracy, and membership inference advantage (defined as TPR - FPR).
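As an illustration of the ranking metric, the following is a minimal sketch of how TPR @ 10% FPR and the membership inference advantage could be computed from attack scores. It is not the official scoring code (that lives in the competition repository); the function name `tpr_at_fpr`, the threshold sweep, and the toy labels and scores are assumptions made for this example.

```python
import numpy as np

def tpr_at_fpr(labels, scores, max_fpr=0.10):
    """Return (TPR, FPR) at the best threshold whose FPR does not exceed max_fpr.

    labels: 1 for members, 0 for non-members.
    scores: attack confidence that a record is a member (higher = more likely).
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_members = labels.sum()
    n_non_members = (~labels).sum()

    best_tpr, best_fpr = 0.0, 0.0
    # Try every distinct score as a threshold; predict "member" when score >= threshold.
    for threshold in np.unique(scores):
        predicted = scores >= threshold
        tpr = (predicted & labels).sum() / n_members
        fpr = (predicted & ~labels).sum() / n_non_members
        if fpr <= max_fpr and tpr > best_tpr:
            best_tpr, best_fpr = float(tpr), float(fpr)
    return best_tpr, best_fpr

# Toy data (hypothetical): 5 members followed by 10 non-members.
labels = [1] * 5 + [0] * 10
scores = [0.9, 0.8, 0.7, 0.4, 0.2,
          0.75, 0.6, 0.5, 0.3, 0.3, 0.25, 0.2, 0.1, 0.1, 0.05]

tpr, fpr = tpr_at_fpr(labels, scores, max_fpr=0.10)
advantage = tpr - fpr  # membership inference advantage, defined as TPR - FPR
print(f"TPR @ 10% FPR: {tpr:.2f}, FPR: {fpr:.2f}, advantage: {advantage:.2f}")
```

Sweeping every distinct score as a threshold is quadratic in the number of records, which is fine for a small illustration; a sorted cumulative count would be the natural choice for larger evaluation sets.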
Accessibility
We structured the competition to ensure accessibility, minimizing the need for extensive computational resources. To support this, we provided a calibrated set of shadow models so that participants were not required to train additional models themselves. For context, training the 450 models made available during the competition required approximately 1500 GPU hours. Participants were welcome to join any subset of the four competition tracks.
Transparency
The implementations for this competition are based on the Diffusion Model Bootcamp provided by the Vector Institute. A more detailed technical description of the competition as well as the code used to train models and score submissions is available on the competition GitHub repository.
Result
We received entries from 71 distinct participants across the 4 tracks. We congratulate all participants for taking part in this competition, and we are particularly excited to announce the winner and runner-up in each track.
Track | Winner | Runner-up |
---|---|---|
Black-box Single Table | Tartan Federer | CITADEL & UQAM |
White-box Single Table | Tartan Federer | Yan Pang |
Black-box Multi Table | Tartan Federer | Cyber@BGU |
White-box Multi Table | Tartan Federer | ** |
The winner of each track is eligible for an award from Vector of $2000 CAD; runners-up are eligible for an award of $1000 CAD.
** We received several submissions for the white-box multi-table task; however, their performance did not significantly exceed that of random guessing.
Analysis
Findings:
- In the white-box track, our top-performing teams used different approaches in their attack development. Tartan Federer used SecMI as a starting point for their attack design. While SecMI has shown success in image-based diffusion models, its original design proved less effective for tabular data – highlighting the effect of data domain on attack development. Tartan Federer identified noise initialization as a key factor influencing attack efficacy and proposed a machine-learning-driven approach that leverages loss features across different noises and time steps. Inspired by the success of the GSA approach in computer vision, Yan Pang leveraged the differences in gradients between member and non-member samples for their attack development.
- In the black-box track, the best-performing submissions also employed a diverse set of techniques. The Cyber@BGU team leveraged shadow models, auxiliary machine learning models, and an attack classifier to craft their attack. Tartan Federer also used shadow model parameters for their attack development. CITADEL & UQAM performed their MIA through an ensemble technique. In addition to shadow-model-based predictions of RMIA and DOMIAS, their meta-classifier takes continuous features of the data as well as several measurements of Gower distance between the data points and the synthetic dataset as inputs.
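For readers unfamiliar with Gower distance, the sketch below shows one common way to compute it for mixed numeric and categorical records, and how the distance to the nearest synthetic record could serve as an attack feature. This is only an illustration: the exact measurements used by CITADEL & UQAM are not described here, and the function name `gower_distances`, the range normalization taken from the synthetic table, and the toy data are all assumptions.

```python
import numpy as np

def gower_distances(record_num, record_cat, synth_num, synth_cat, num_ranges):
    """Gower distance from one record to every row of a synthetic table.

    Numeric columns contribute |x - y| / range; categorical columns
    contribute 0 on a match and 1 on a mismatch. The result is the
    unweighted mean over all columns.
    """
    # Guard against constant columns (range 0) to avoid division by zero.
    safe_ranges = np.where(num_ranges > 0, num_ranges, 1.0)
    num_part = np.abs(synth_num - record_num) / safe_ranges
    cat_part = (synth_cat != record_cat).astype(float)
    return np.concatenate([num_part, cat_part], axis=1).mean(axis=1)

# Toy synthetic table (hypothetical): two numeric columns, one categorical column.
synth_num = np.array([[10.0, 100.0], [20.0, 300.0], [30.0, 500.0]])
synth_cat = np.array([["a"], ["b"], ["a"]])
num_ranges = synth_num.max(axis=0) - synth_num.min(axis=0)

# Candidate record whose membership is being tested.
record_num = np.array([20.0, 200.0])
record_cat = np.array(["a"])

distances = gower_distances(record_num, record_cat, synth_num, synth_cat, num_ranges)
nearest = float(distances.min())  # one possible feature for a meta-classifier
print(distances, nearest)
```

The intuition behind such a feature is that a training record may sit closer to the synthetic data than a record the generator never saw, so small distances can hint at membership.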
Interesting observations for further investigation:
- TabSYN vs TabDDPM: MIDST provided two models with different structures for the single-table tracks: TabSYN and TabDDPM. For ranking, the competition considered the highest score achieved in attacking either model. Most submitted attacks targeted TabDDPM, and the few that attacked both achieved higher scores on TabDDPM. It remains an open question whether this preference arises because latent-space diffusion models like TabSYN are less explored, or because their structure makes them more resilient against membership inference attacks. One piece of evidence for the former explanation is the SecMI attack, where latent-space diffusion models are handled in an extension of the attack rather than in its default design.
- Single table vs multi-table: MIDST uses the Transaction table from the Berka dataset for the single-table tracks. For the multi-table tracks, the other tables from the Berka dataset are added as well. However, the MIDST challenge points for all tracks were restricted to the Transaction table. Intuitively, attacks designed for single-table models should therefore carry over to multi-table ones, with a similar success rate if they ignore the additional information from the other tables and a higher success rate if they use it. However, the submitted results, particularly in the white-box track, do not follow this intuition.
- Comparison with other generative AI approaches for tabular synthesis: Diffusion models perform exceptionally well for tabular synthesis. MIDST results show that this synthesis is not free of privacy leakage. However, without further investigation and comparison with other generative AI approaches, it remains unclear whether this privacy leakage is specific to diffusion models.
Event Organizers
Event Sponsors
FAQ
Acknowledgements
We would like to thank the MICO organizers for their open-source project and very helpful comments.
Next Steps
MIDST is part of ongoing efforts at the Vector Institute to provide guidance on the privacy evaluation of synthetic data. If you are interested in joining a discussion with our team or collaborating with us on the topic, please contact us at the following emails: masoumeh@vectorinstitute.ai, xi.he@uwaterloo.ca, or veronica.chatrath@vectorinstitute.ai.