A Predictive Breakthrough
Artificial intelligence is about prediction. And for a long time, that meant numerical prediction, running complex mathematical algorithms so that Netflix could recommend a show for you based on the other shows you'd watched, Amazon could figure out where to site its next warehouse, or Tesla could figure out how to use data to help its cars drive themselves. The thing these systems were bad at predicting was the next word in a sentence. So if your sentence ended with the word “filed,” the system didn’t know whether you were filing your taxes or filing your nails.
In 2017, a breakthrough paper called “Attention Is All You Need” outlined a new kind of AI architecture called the Transformer, built around an attention mechanism. It basically let the AI pay attention not just to the final word in the sentence, but to the entire context of the sentence, the paragraph, the page, and so on. And that let the AI make accurate predictions about which word or part of a word (called a token) comes next. Basically, AI is a very fancy autocomplete. It predicts the next token, a word or part of a word, and that’s what large language models do.
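To make the idea concrete, here is a minimal sketch of the attention calculation, not the full Transformer from the paper: each token's prediction is built from a weighted mix of every token in the context, rather than from the last word alone. The toy vectors and the example sentence are purely illustrative assumptions.

```python
# A minimal sketch of scaled dot-product attention (illustrative only).
import numpy as np

def attention(query, keys, values):
    """Weigh every token in the context by its relevance to the query."""
    d_k = query.shape[-1]
    scores = query @ keys.T / np.sqrt(d_k)                    # similarity to each context token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # softmax -> attention weights
    return weights @ values                                   # context-aware summary

# Toy context: four tokens ("I", "filed", "my", "taxes") as random 8-d embeddings.
rng = np.random.default_rng(0)
context = rng.normal(size=(4, 8))
latest_token = context[-1:]                                   # the most recent token

summary = attention(latest_token, context, context)
print(summary.shape)  # (1, 8): a blend of the whole sentence, used to predict the next token
```

The point of the sketch is only that the "filed" ambiguity disappears once the whole context, not just the final word, feeds into the prediction.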
But it turns out, unexpectedly, that when large language models get big enough, they also do all kinds of other things we didn’t expect. We can’t fully explain why it’s as good as it is: why something that predicts the next token can pass tests at a very high level, seems to do creative work, seems to respond to you in conversation, seems like a person you’re interacting with even though you’re not. You’re interacting with a program. We call this set of unexpected abilities emergent phenomena.
There’s a lot of debate in the academic community over what’s emergent, what isn’t, and where we’re being fooled by an AI that seems intelligent but isn’t. The threshold, the thing that happened, was the release of GPT-3.5. Even though it was very similar to earlier AI models, something happened when the system reached that scale of language: it started to produce much higher-quality, much more coherent text. It took everybody by surprise, I think, that these models were as powerful as they were.
An Existential Crisis
If you haven’t stayed up three nights being anxious about AI, if you haven’t had an existential crisis about it, you probably haven’t really experienced AI. This seems to be almost universal: once you use it enough (about ten hours is kind of my threshold), you start asking: What will I do for a living? What will my kids do? What does it mean that it’s better than me at some of this stuff? I think you can be productive after you go through that, but the crisis seems inevitable.
I don’t think the existential crisis piece is a hopeless thing. I think it’s a very neutral thing. We built this thing, large language models, that has practical implications and philosophical implications, and everybody can access it, as opposed to technologies hidden somewhere inside large companies. So I think having that moment of “whoa, what does this all mean?” is not a negative thing. It’s a prerequisite to working with these systems.
General Purpose Technology
Interestingly, GPT doesn’t just stand for the Generative Pre-trained Transformer in ChatGPT. It also stands for General Purpose Technology, one of those once-in-a-generation technologies that those of us who study innovation think about a lot: things like steam power, the Internet, or electrification that change everything they touch. They alter society, they alter how we work, they alter how we relate to each other in ways that are really hard to anticipate. All the evidence we have right now is that large language models are exactly that kind of technology, that they will have a large, widespread influence that’s somewhat hard to predict.
I am optimistic that we can use this, regardless of how good it gets, in ways that help us thrive, and potentially ban it or stop it if it starts to reach a point where things are going badly for us. But I don’t know if that’s the case. I don’t know if that’s true, and people have lots of debate over this. I know a lot of people look at things like the lawsuits against AI companies, or regulation, and think maybe AI is gonna go away or stop developing. Maybe there’s not gonna be enough data to train it. Everything we see in the world indicates that that’s not the case, that AI is here to stay and will almost certainly keep developing, at least in the medium term.
We don’t know how good it’s going to get. There’s disagreement even among experts in the field about how much further AI can develop, but it is here to stay. It is something we have to deal with, learn to work with, and learn to thrive with, rather than just be scared of. When you ask people about the future of AI, there’s a term AI insiders use called p(doom), which is your probability that we are all going to die. I do not have a p(doom) that I really think about, because I do not think we can assign a probability to things going wrong. And again, that framing makes the technology the agent. We get to decide how this thing is used.