[{"data":1,"prerenderedAt":82},["ShallowReactive",2],{"term-t\u002Ftraining":3,"related-t\u002Ftraining":62},{"id":4,"title":5,"acronym":6,"body":7,"category":40,"description":41,"difficulty":42,"extension":43,"letter":44,"meta":45,"navigation":46,"path":47,"related":48,"seo":56,"sitemap":57,"stem":60,"subcategory":6,"__hash__":61},"terms\u002Fterms\u002Ft\u002Ftraining.md","Training",null,{"type":8,"value":9,"toc":33},"minimark",[10,15,19,23,26,30],[11,12,14],"h2",{"id":13},"eli5-the-vibe-check","ELI5 — The Vibe Check",[16,17,18],"p",{},"Training is the long, expensive process where an AI learns from data. You feed it millions of examples, it makes predictions, it checks how wrong it was (loss), and it adjusts its internal numbers (weights) to do better next time. Repeat billions of times. Now you have a model. It costs a fortune in GPU time and electricity.",[11,20,22],{"id":21},"real-talk","Real Talk",[16,24,25],{},"Training is the optimization process that adjusts a model's parameters to minimize a loss function over a training dataset. Each iteration (step) involves a forward pass to compute predictions, a loss calculation, and backpropagation to compute gradients, followed by a weight update via gradient descent. Training large models requires distributed compute across hundreds or thousands of GPUs.",[11,27,29],{"id":28},"when-youll-hear-this","When You'll Hear This",[16,31,32],{},"\"Training took 3 months on 1024 GPUs.\" \u002F \"Don't confuse training cost with inference cost.\"",{"title":34,"searchDepth":35,"depth":35,"links":36},"",2,[37,38,39],{"id":13,"depth":35,"text":14},{"id":21,"depth":35,"text":22},{"id":28,"depth":35,"text":29},"ai","Training is the long, expensive process where an AI learns from data.","intermediate","md","t",{},true,"\u002Fterms\u002Ft\u002Ftraining",[49,50,51,52,53,54,55],"Inference","Epoch","Batch","Loss Function","Gradient Descent","Backpropagation","Weights",{"title":5,"description":41},{"changefreq":58,"priority":59},"weekly",0.7,"terms\u002Ft\u002Ftraining","Wd4dtp0EOdNeM2oFb_iSshroJsoMisgzaeRm9aMnvNA",[63,67,70,73,76,79],{"title":54,"path":64,"acronym":6,"category":40,"difficulty":65,"description":66},"\u002Fterms\u002Fb\u002Fbackpropagation","advanced","Backpropagation is how errors flow backwards through a neural network during training.",{"title":51,"path":68,"acronym":6,"category":40,"difficulty":42,"description":69},"\u002Fterms\u002Fb\u002Fbatch","A batch is a small group of training examples that the model processes at once before updating its weights.",{"title":50,"path":71,"acronym":6,"category":40,"difficulty":42,"description":72},"\u002Fterms\u002Fe\u002Fepoch","An epoch is one complete pass through your entire training dataset. If you have 100,000 examples, one epoch means the model has seen all 100,000 once.",{"title":53,"path":74,"acronym":6,"category":40,"difficulty":65,"description":75},"\u002Fterms\u002Fg\u002Fgradient-descent","Gradient Descent is how an AI learns — it's the algorithm that nudges the model's weights in the right direction after each mistake.",{"title":49,"path":77,"acronym":6,"category":40,"difficulty":42,"description":78},"\u002Fterms\u002Fi\u002Finference","Inference is when the AI actually runs and generates output — as opposed to training, which is when it's learning.",{"title":52,"path":80,"acronym":6,"category":40,"difficulty":42,"description":81},"\u002Fterms\u002Fl\u002Floss-function","The loss function is the AI's score of how badly it's doing.",1776518319737]