Maty Bohacek

Based out of Stanford University & San Francisco, CA

I am a student researcher at Stanford University advised by Prof. Maneesh Agrawala. I work at the intersection of AI, computer vision, and media forensics. In a world that is becoming increasingly AI-mediated, my goal is to build AI systems that are inherently trustworthy and interpretable, enabling productive, democratic discourse.

News

(November 2025)

  • Presented one paper at ACM MM in Dublin, Ireland.

(October 2025)

  • Delivered remarks at the Planeta AI conference.

  • Delivered a talk at the INN conference.

  • Hosted a workshop and presented two workshop papers at ICCV in Honolulu, HI.

(September 2025)

  • Started my visit to Bill Freeman’s lab at MIT.

  • Received the Czech Senate Medal (a civilian honor) from Senate Speaker Milos Vystrcil in Prague.

(August 2025)

  • Our IJCAI-W paper ‘Human Action CLIPs’ won the Best Paper Award (the conference was held in Montreal, Canada).

See the full archive in my journal here.

Upcoming

(January 2026)

  • The last two episodes of our podcast Shell Game (which reached #1 among tech podcasts on Apple Podcasts in December) are coming out!

(March 2026)

  • I will deliver a keynote at the SAFE workshop at WACV in Tucson, AZ.

(June 2026)

  • Our workshop will be hosted at CVPR in Denver, CO. More info soon.

Highlighted Publications

* Equal contribution. Publications are not necessarily listed chronologically.

Uncovering Competency Gaps in Large Language Models and Their Benchmarks

Bohacek M., Scherrer N., Dufour N., Leung T., Bregler C., and Chan S.C.Y. Under Review.
Paper — Surfacing actionable insights for improved evals and models.

LouvreSAE: Sparse Autoencoders for Interpretable and Controllable Style Transfer

Panda R., Fein D., Singhal A., Fiore M., Agrawala M., and Bohacek M. Under Review.
Project Paper Code Weights — Interpretable style in vision models.

Compliance Rating Scheme: A Data Provenance Framework for Generative AI Datasets

Bohacek M. and Vilanova I. ACM MM 2025.
Paper — Leveraging media provenance to construct and evaluate compliant datasets.

Uncovering Conceptual Blindspots in Generative Image Models Using Sparse Autoencoders

Bohacek M.*, Fel T.*, Agrawala M., and Lubana E. Under Review.
Project Paper Code Weights — A method for identifying blindspots in T2I models: concepts that the model was trained on but cannot generate.

Human Action CLIPs: Detecting AI-generated Human Motion

Bohacek M. and Farid H. IJCAI-W 2025.
Paper Dataset — This paper proposes a method for distinguishing real from AI-generated (text-to-video) video using multi-modal semantic embeddings, evaluated on DeepAction, a new dataset of real and AI-generated human motion.

The DeepSpeak Dataset

Barrington S., Bohacek M., and Farid H. arXiv:2408.05366.
Paper Dataset — This paper introduces DeepSpeak, a large-scale dataset of real and deepfake footage designed to support research on detecting state-of-the-art face-swap and lip-sync deepfakes.

Nepotistically Trained Generative-AI Models Collapse

Bohacek M. and Farid H. ICLR-W 2025.
Paper — This paper demonstrates how some generative AI models, when retrained on their own outputs, produce distorted images and struggle to recover even after retraining on real data.

GenAI Confessions: Black-box Membership Inference for Generative Image Models

Bohacek M. and Farid H. ICCV-W 2025.
Paper Dataset — A method to determine whether a generative AI model was trained on specific images.

Synthetic Human Action Video Data Generation with Pose Transfer

Knapp V. and Bohacek M. CVPR-W 2025.
Project Paper Code Data — We show that synthetic human-motion data improves the performance of action classification and understanding models.

For a complete list of my academic publications, please refer to this page or my Google Scholar profile.

Contact & Misc.

  • Email: maty (at) stanford (dot) edu

  • Resume (coming soon)