AI - What is it?

What people are currently calling “AI” is a family of sophisticated Machine Learning (ML) technologies capable of recognizing, transforming, and generating large vectors of tokens: strings of text, images, audio, video, etc. A model is a giant pile of linear algebra which acts on these vectors. Large Language Models, or LLMs, operate on natural language: they work by predicting statistically likely completions of an input string, much like a phone autocomplete. Other models process audio, video, or still images, and some systems link several kinds of models together.
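To make “predicting statistically likely completions” concrete, here is a deliberately tiny sketch. It is not how a real LLM is built: it counts which word follows which in a made-up corpus and keeps appending the most frequent follower. The corpus and the names `train_bigrams` and `complete` are invented for this illustration.

```python
from collections import Counter, defaultdict

# Made-up training text; a real model sees vastly more than this.
corpus = "the cat sat on the mat . the cat sat on the sofa ."

def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def complete(counts, prompt, max_words=3):
    """Extend the prompt by repeatedly appending the most frequent follower."""
    words = prompt.split()
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # this word never appeared mid-corpus; nothing to predict
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

model = train_bigrams(corpus)
print(complete(model, "the"))
```

A real LLM replaces the lookup table with billions of learned weights and conditions on the whole input rather than the last word, but the loop has the same shape: predict a likely next token, append it, repeat.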

Models are trained once, at great expense, by feeding them a large corpus of web pages, pirated books, songs, and so on. Once trained, a model can be run again and again cheaply. This is called inference.

Models do not (broadly speaking) learn over time. They can be tuned by their operators, or periodically rebuilt with new inputs or feedback from users and experts. Models also do not remember things intrinsically: when an AI chatbot references something you said an hour ago, it is because the entire chat history is fed to the model at every turn. Longer-term “memory” is achieved by asking the chatbot to summarize a conversation, and dumping that shorter summary into the input of every run.
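The replay trick can be sketched in a few lines. Everything here is hypothetical: `fake_model` is a stand-in for a real model and sees only the text it is handed, while `Chat` is an invented wrapper that keeps the transcript and resends all of it on every turn.

```python
def fake_model(prompt: str) -> str:
    """Hypothetical stateless model: its only input is `prompt`."""
    marker = "my name is "
    lowered = prompt.lower()
    if "what is my name" in lowered:
        if marker in lowered:
            # The name is only recoverable if it appears in the prompt text.
            start = lowered.index(marker) + len(marker)
            name = prompt[start:].split()[0].strip(".,!?")
            return f"Your name is {name}."
        return "I don't know your name."
    return "Noted."

class Chat:
    """Invented wrapper: the 'memory' lives here, not in the model."""

    def __init__(self):
        self.history = []

    def send(self, user_text: str) -> str:
        self.history.append(f"User: {user_text}")
        # The whole transcript so far becomes the model's input.
        reply = fake_model("\n".join(self.history))
        self.history.append(f"Bot: {reply}")
        return reply

chat = Chat()
chat.send("Hi, my name is Ada.")
print(chat.send("What is my name?"))     # answered from the replayed history
print(fake_model("What is my name?"))    # same question, no history supplied
```

The second call to `fake_model` gets the identical question but no transcript, so it has nothing to draw on. Swapping the full history for a shorter summary, as described above, changes only what `Chat` chooses to put in the prompt.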

LLMs are trained to complete tasks. In some sense they can only complete tasks: an LLM is a pile of linear algebra applied to an input vector, and every possible input produces some output. This means that LLMs tend to complete tasks even when they shouldn’t. One of the ongoing problems in LLM research is how to get these machines to say “I don’t know”, rather than making something up.
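A small sketch of why “every possible input produces some output”, under heavy simplifying assumptions: the vocabulary is five invented tokens, the weights are random numbers standing in for trained ones, and the “model” is a single matrix-vector product followed by a softmax. None of this mirrors a real architecture; it only shows that the arithmetic has no way to decline.

```python
import math
import random

VOCAB = ["yes", "no", "Paris", "42", "banana"]  # hypothetical toy vocabulary

random.seed(0)
# One row of weights per vocabulary entry; random stand-ins for training.
WEIGHTS = [[random.uniform(-1, 1) for _ in range(4)] for _ in VOCAB]

def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_token(x):
    """Matrix-vector product, softmax, then pick the likeliest token."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in WEIGHTS]
    probs = softmax(scores)
    best = max(range(len(VOCAB)), key=probs.__getitem__)
    return VOCAB[best], probs

# Sensible, empty, and nonsensical inputs all yield a token.
for x in ([1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], [9.9, -3.1, 0.2, 7.0]):
    token, probs = next_token(x)
    print(token, round(sum(probs), 6))
```

Even the all-zero input, which carries no information at all, yields a token: the probabilities come out uniform and the tie is broken by position. The only way such a system says “I don’t know” is if those words are themselves a likely output, which is why it has to be trained in rather than falling out of the math.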

AI models do make stuff up! This phenomenon is known as “hallucination”: a model generates false, inaccurate, or misleading information while sounding highly confident. These errors occur because generative models are built to predict the next likely word, not to verify truth, so they fill in gaps with plausible-sounding text when their training data is insufficient or biased.