Computer Vision
ELI5 — The Vibe Check
Computer Vision is teaching AI to understand images and video. How does your phone unlock with your face? Computer Vision. How does a self-driving car see the road? Computer Vision. How does Instagram know there's a dog in your photo? You guessed it. It's AI with eyes, and it's gotten shockingly good.
Real Talk
Computer Vision is the AI field focused on enabling machines to interpret and understand visual data (images, video). Key tasks include image classification, object detection, segmentation, and image generation. Modern approaches use convolutional neural networks (CNNs) and vision transformers (ViTs). The field is increasingly multimodal: models like GPT-4V and Claude combine vision with natural language processing (NLP).
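To make the "convolutional" part of CNNs less abstract, here is a minimal sketch of the core operation: sliding a small filter over an image and summing the products at each position. The function name `conv2d`, the toy image, and the hand-picked edge filter are all illustrative assumptions, not taken from any particular library. A real CNN learns its filter values during training and stacks many such layers.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1).

    Each output value is the sum of elementwise products between the
    kernel and the image patch under it. This is the cross-correlation
    form that deep learning frameworks usually call "convolution".
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy 4x6 grayscale image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Hand-picked vertical-edge filter (an assumption for illustration;
# a trained CNN would learn values like these on its own).
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d(image, kernel)
print(feature_map)  # [[0, 3, 3, 0], [0, 3, 3, 0]]
```

The output is zero where the image is flat and large where dark meets bright, so this filter acts as a simple edge detector. Roughly speaking, classification and detection models build on many learned filters of this kind.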
When You'll Hear This
"Computer vision is used for quality control on the assembly line." / "The app uses computer vision to identify plants from photos."
Related Terms
Deep Learning
Deep Learning is Machine Learning that's been hitting the gym.
Diffusion Model
Diffusion models generate images by learning to reverse noise. In training, you take an image and slowly add random noise until it's pure static.
Generative AI
Generative AI is AI that creates new stuff — text, images, code, music, video — rather than just classifying or predicting. ChatGPT writes essays.
Neural Network
A neural network is a system loosely inspired by the human brain — lots of little math nodes connected together, passing numbers to each other.