What is "AI"?
Behind the hype.
When people talk about "AI" in a modern context, they are usually talking about "generative AI" or "large language models". These are the techniques behind products like ChatGPT, Copilot, Gemini and others.
Here we will break down how this technology works in an easy-to-understand way, and point out what it can and cannot do.
How it works
If you have used a cell phone in the last 20 years, you have likely encountered predictive text.
If you type "good mo", the predictive text system on your phone will most likely suggest "good morning". It does this because it has a database of which words most frequently come after other words, and it displays the most likely one.
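To make this concrete, here is a minimal sketch of that kind of predictive text in Python. It is an illustration only, not how any particular phone does it: the sample text and the names `build_next_word_table` and `predict` are made up for this example.

```python
from collections import Counter, defaultdict

# Tiny made-up sample of text; a real phone learns from far more.
SAMPLE_TEXT = (
    "good morning everyone . good morning team . good night all . "
    "good morning to you . good move there ."
)


def build_next_word_table(text):
    """Count how often each word follows each other word."""
    words = text.split()
    table = defaultdict(Counter)
    for previous, following in zip(words, words[1:]):
        table[previous][following] += 1
    return table


def predict(table, previous_word, typed_so_far=""):
    """Return the most frequent word seen after previous_word that
    starts with the letters typed so far, or None if there is none."""
    followers = table.get(previous_word, Counter())
    candidates = {
        word: count
        for word, count in followers.items()
        if word.startswith(typed_so_far)
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


table = build_next_word_table(SAMPLE_TEXT)
print(predict(table, "good", "mo"))  # morning
```

In this sample, "morning" follows "good" three times and "move" only once, so "morning" is the suggestion. Nothing in the code knows what a morning is; it only counts.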
Generative AI is effectively a super-charged version of the text prediction that you have on your phone.
Image: Elise Racine
What is a model?
Programs like ChatGPT output text based on their "model". A model is like a database of information about words and phrases and how they are connected. A model might store that the word "cow" often comes up near the word "grass". Or that "bovine" comes up less often than "cow", but is still somehow connected to "grass".
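As a rough illustration of words being "connected", the sketch below counts how often two words appear in the same sentence. It is a deliberate simplification: the sentences and the `count_pairs` function are invented for this example, and real models store learned numbers rather than a literal table of counts.

```python
from collections import Counter
from itertools import combinations

# Made-up example sentences, standing in for a huge body of text.
SENTENCES = [
    "the cow eats grass",
    "a cow stands in the grass",
    "the bovine grazes on grass",
    "the cow sleeps",
]


def count_pairs(sentences):
    """Count how many sentences each pair of words shares."""
    pairs = Counter()
    for sentence in sentences:
        # Sort the words so each pair is always stored in the same order.
        unique_words = sorted(set(sentence.split()))
        for first, second in combinations(unique_words, 2):
            pairs[(first, second)] += 1
    return pairs


pairs = count_pairs(SENTENCES)
print(pairs[("cow", "grass")])     # 2
print(pairs[("bovine", "grass")])  # 1
```

Here "cow" appears alongside "grass" more often than "bovine" does, but both have some link to it.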
Neither the model nor ChatGPT "knows" what a cow is. The model only has a number, or symbol, associated with the letters c o w, along with links to all the other symbols that are somehow related to it.
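The idea of a word being just a number can be sketched in a few lines. This is a simplified assumption for illustration: `build_vocabulary` and `encode` are hypothetical names, and real systems usually split text into pieces smaller than whole words.

```python
def build_vocabulary(text):
    """Give each distinct word the next unused number."""
    vocabulary = {}
    for word in text.split():
        if word not in vocabulary:
            vocabulary[word] = len(vocabulary)
    return vocabulary


def encode(text, vocabulary):
    """Replace each word with its number."""
    return [vocabulary[word] for word in text.split()]


vocabulary = build_vocabulary("the cow eats grass and the cow sleeps")
print(vocabulary["cow"])                         # 1
print(encode("the cow eats grass", vocabulary))  # [0, 1, 2, 3]
```

To the program, "cow" is simply the number 1. Everything it does afterwards works on these numbers, not on any idea of an animal.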
How do you make a model?
A model is a huge collection of words and their connections, so creating one requires an extremely large amount of text. An expensive algorithm is then run over that text to find the relationships between the words.
Making a model is usually called "training".
Models used by products like ChatGPT are created by downloading huge amounts of text from the internet, using whatever is available online: newspapers, books, forums, personal blogs, wikis on fictional subjects, spam websites.
This training data is not filtered to remove information that is not true, so it contains:
- Hateful content, stereotypes, misinformation.
- Jokes, satire, parodies.
- Fictional stories, conspiracy theories.
- Information that was previously believed to be correct, but later disproven.