[{"data":1,"prerenderedAt":72},["ShallowReactive",2],{"term-m\u002Fmodel-inversion":3,"related-m\u002Fmodel-inversion":58},{"id":4,"title":5,"acronym":6,"body":7,"category":40,"description":41,"difficulty":42,"extension":43,"letter":44,"meta":45,"navigation":46,"path":47,"related":48,"seo":52,"sitemap":53,"stem":56,"subcategory":6,"__hash__":57},"terms\u002Fterms\u002Fm\u002Fmodel-inversion.md","Model Inversion",null,{"type":8,"value":9,"toc":33},"minimark",[10,15,19,23,26,30],[11,12,14],"h2",{"id":13},"eli5-the-vibe-check","ELI5 — The Vibe Check",[16,17,18],"p",{},"Model inversion is reconstructing training data from a trained ML model — the privacy attack that makes ML teams sweat. You trained a model on private medical records. Someone probes your model carefully, analyzing its outputs and confidence scores across thousands of queries. Over time, they reconstruct data that looks suspiciously like your private training set. The model learned the data too well and is now accidentally leaking it.",[11,20,22],{"id":21},"real-talk","Real Talk",[16,24,25],{},"Model inversion attacks work by querying a model repeatedly and using the outputs (predictions, confidence scores, embeddings) to infer information about training data. Fredrikson et al. (2015) demonstrated extracting facial images from a facial recognition model. Defenses include differential privacy (adding noise during training), output perturbation, confidence score masking, and limiting API query rates. The attack is particularly relevant for models trained on PII, medical, or financial data.",[11,27,29],{"id":28},"when-youll-hear-this","When You'll Hear This",[16,31,32],{},"\"Model inversion is why we don't expose raw confidence scores in the API.\" \u002F \"Fine-tuning on customer data without differential privacy is a model inversion risk.\"",{"title":34,"searchDepth":35,"depth":35,"links":36},"",2,[37,38,39],{"id":13,"depth":35,"text":14},{"id":21,"depth":35,"text":22},{"id":28,"depth":35,"text":29},"security","Model inversion is reconstructing training data from a trained ML model — the privacy attack that makes ML teams sweat.","advanced","md","m",{},true,"\u002Fterms\u002Fm\u002Fmodel-inversion",[49,50,51],"Machine Learning","AI Safety","Alignment",{"title":5,"description":41},{"changefreq":54,"priority":55},"weekly",0.7,"terms\u002Fm\u002Fmodel-inversion","C4-TVM9wbifZGmZTzCaHHpSgsBsnBA56CMR-O4h0QQ4",[59,64,67],{"title":50,"path":60,"acronym":6,"category":61,"difficulty":62,"description":63},"\u002Fterms\u002Fa\u002Fai-safety","ai","intermediate","AI Safety is the field of making sure AI doesn't go off the rails.",{"title":51,"path":65,"acronym":6,"category":61,"difficulty":62,"description":66},"\u002Fterms\u002Fa\u002Falignment","Alignment is the AI safety challenge of making sure AI does what we actually want, not just what we literally said.",{"title":49,"path":68,"acronym":69,"category":61,"difficulty":70,"description":71},"\u002Fterms\u002Fm\u002Fmachine-learning","ML","beginner","Machine Learning is teaching a computer by showing it thousands of examples instead of writing out every rule.",1775560914106]