This is how the world ends

headpiece full of artificial straw, alas

There's been so much chatter about the apocalyptic possibilities of Artificial Intelligence, most of it centered on overheated notions of ChatGPT's capabilities. I do feel like everyone should calm down; we're not verging on a Paperclip Problem scenario yet. But there are definitely a few ways that AI (which right now reminds me of the joke that "Artificial Intelligence is what you call Machine Learning when you're talking to the marketing department") could cause problems at a societal level.

But first, a comment — ChatGPT is a machine learning model. A very impressive one, which has learned which word embeddings correlate with which other word embeddings and can produce long tracts of text that, more often than not, read as a decent summary of existing literature. However, it is not intelligent, and it does not comprehend. There are a few great examples on YouTube of ChatGPT "playing chess": it knows the theory positions, because the first twenty moves of the Najdorf Sicilian are thoroughly documented, and then it goes completely off the rails. We do not have an "Artificial Intelligence" in ChatGPT, although that's what people are calling it. Calling it AI is a marketing term, and most of the bloviating about its risks starts from the faulty assumption that it is one. Which brings me to the first risk…

Undue Credulity Towards a Model’s Knowledge

There have already been several cases, as of this writing, where lawyers cited nonexistent court cases in briefs they "wrote" on behalf of their clients and then submitted to the court. I live in terror of the first medical malpractice case centering on a course of treatment prescribed uncritically by a doctor on the advice of a large language model.

The chatbot models are great at summarizing vast corpora of text, but they don't understand, and they are prone to hallucination. If I want a high-level summary of a topic, or I want to get past Blank Page Syndrome, these AI chatbots are a pretty good source. But they aren't capable of making decisions that aren't just regurgitations of the existing literature, which means they cannot innovate, because real innovation requires doing something that hasn't been done before and is therefore not in the literature. And they aren't capable of comprehending context, as evidenced by some of the examples MIT Technology Review put together for the previous iteration of GPT. "But that was the last generation!" you protest, certain that ChatGPT is the philosopher-king that was promised. Well, read this long-winded gibberish.

Our AI Chatbots make very convincing sources, but they aren’t experts with actual knowledge.

“Knowledge” isn’t just an accumulation of facts — I know this as a semi-professional fact-knower who would not consider himself an expert on a lot of things I know facts about — it’s understanding how those facts fit together in context. There’s a world of difference between skimming a Wikipedia article and spending years doing the work that gets cited in the Wikipedia article.

Large language models, and machine learning models more broadly, don't really have "knowledge". They capture correlations, and sometimes they capture those correlations so well that it can look like they "understand" the thing. But an expert can spot that the model is basically summarizing entry-level information copied from a textbook, not comprehending the thing itself.

Watch ChatGPT play chess and you can see this. So long as the positions handed to it are straight out of theory — meaning the line has been described by a human expert or worked out by a dedicated chess engine such as Stockfish or Leela — ChatGPT will regurgitate the notation it has read. But the moment you get out of theory, well, the chess speaks for itself.

My concern here is that people who make important decisions will start asking a chatbot what they should do and simply copying its answers. It's already being done as a bit of a stunt with unverified results, but now imagine someone making layoff decisions for their manufacturing company, or local city council members making zoning decisions based on the recommendations of ChatGPT (actually… ). The chatbot isn't capable of understanding your specific context, so it's not going to give advice anywhere near the quality of a real expert's. But it sure could sound like it does if you're not careful.

Unintentional Feedback Loops Between Models

On May 6, 2010, at 2:32 PM, the Dow Jones Industrial Average started to drop, and within a few minutes roughly a trillion dollars of stock value had vanished, before the market almost immediately recovered by 3 PM. This so-called Flash Crash was blamed on high-frequency trading, and in particular on the way the models are designed to trade against conventional market making. Because of the volume of market activity that high-frequency trading accounts for — one estimate was that 85% of the market at the time was HFT algorithms trading with each other — the algorithms fed back on themselves and created a feedback loop that crashed the market for a moment. The results were briefly catastrophic.

When we train a machine learning model, we feed it data that we assume comes from whatever it's trying to emulate, say post-war British artwork or the tweets of Law Twitter. The model then tries to reproduce whatever it's been given as output, and it can usually be made to do a half-decent job of it. The trouble comes when you get two models talking to each other.

Here's an example. Let's say you're a financial institution looking to identify fraudulent transactions. Because of the sheer volume of transactions, you want to thin the haystack with a machine learning model that flags a transaction as potentially fraudulent. But, because of Things and Decisions, you decide you aren't going to bother checking the model: anything it calls fraudulent with 90%+ confidence isn't flagged for human review, it's just marked as fraudulent with no recourse. Now, when you retrain this model on new data, the only fraud labels it ever sees are the ones it produced itself, so it just re-learns its own criteria for 90%+ confidence.

The model trains on itself.
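A minimal, hypothetical sketch of that loop (the function, thresholds, and model choice here are invented for illustration, not taken from any real fraud system):

```python
# Hypothetical sketch of the degenerate retraining loop described above;
# names, thresholds, and model choice are made up for illustration.
from sklearn.linear_model import LogisticRegression

def retrain_on_own_labels(model, new_transactions):
    """Retrain a fraud model using only its own high-confidence calls as labels."""
    # The current model scores a batch of new, unlabeled transactions...
    fraud_probability = model.predict_proba(new_transactions)[:, 1]
    # ...and anything at 90%+ confidence is marked fraudulent, no human review.
    auto_labels = (fraud_probability >= 0.90).astype(int)
    # The "ground truth" for the next training run is the old model's own output,
    # so the new model can only re-learn the old model's decision boundary.
    refreshed_model = LogisticRegression(max_iter=1000)
    refreshed_model.fit(new_transactions, auto_labels)
    return refreshed_model
```

Nothing in that loop can ever surface a kind of fraud the original model missed, which is exactly the problem.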

This is an obvious problem that's easily resolved by sticking a human in the loop and using some sort of reinforcement learning paradigm to smack the model when it's wrong and pet the model when it's right. But what about a more decentralized problem, like, say, 50% of ad copy starting to be written by large language models that are trained on ad copy? These may be disparate models, but they will start to converge toward sounding like one another, since they're learning from each other's output. Eventually the ad copy will become even more generic than ad copy already is, but there's also the interesting possibility of the models developing their own model-language, as if you left a slew of two-year-olds alone on an island long enough and they started making up words for things and situations they haven't seen before.

This could lead to some very unusual situations.

Say you're a lawyer, and you need to write a brief for the court while representing your client. With Google, you can search for similar briefs from similar cases using the right keywords. If you're particularly lazy, you can find an existing brief that closely matches your situation, twiddle a few details, and submit it as your own work. That's pretty unethical, but since you found the brief through various records, it's probably safe to say it's a legitimate brief with real facts in it, one that the search engine has merely indexed to make easier to find.

Now say you gave LawGPT a prompt to write this brief. Because of the fundamental nature of large language models, what comes out will emulate the form of a legal brief, possibly incredibly well, but the model does not comprehend the function of the brief, nor is it guaranteed to cite real case law. At first blush, though, the output looks like a real brief that just needs to be cleaned up and submitted. But, knowing humans, the brief isn't cleaned up as well as it should be, because deadlines. Now it's out in the world, a half-edited brief that probably has correct facts and references, we hope.

But then LawGPT picks up that brief and trains on it, and the loop closes. The tool will degrade because its errors are propagating, uncorrected by human experts, back into its training data. We hope it doesn't just start writing gibberish briefs after a while, but preventing that requires human intervention and, well, see above about being busy.

But, you may protest, there's a human in the loop, and they'll surely do their due diligence and clean up the output of any chatbot model, right? Well, it's already happening on Mechanical Turk, which means the output of one model is being fed back in, either as input to itself or to other models, and now they're talking to each other.

This is the sort of language-degrading nightmare George Orwell wrote about in Politics and the English Language, but with a decidedly Huxleyan origin: a human antipathy to struggle and a desire to do things more easily. So far this is limited to things like ad copy, but you could imagine it for everything from supply chain management to research papers: the models start to feed back on themselves, and it's very likely that they'll eventually spiral.

edit: I learned about the existence of this paper hours after posting this blog post, and it’s looking at precisely the scenario I described above, with large language models learning off each other’s output:

“The Curse of Recursion: Training on Generated Data Makes Models Forget”, https://arxiv.org/abs/2305.17493

The tl;dr is that the models "forget" low-probability outcomes, effectively shrinking the variability of the training set, and this happens across a variety of tested models, not just large language models.
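You can get the flavor of this with a toy simulation of my own (a sketch, not something from the paper): fit a Gaussian to some data, sample the next generation's "training data" from that fit, refit, and repeat. Each generation only ever sees a finite sample of the previous generation's output, so estimation error compounds and nothing ever pulls the model back toward the original distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "real" data drawn from a standard normal distribution.
data = rng.normal(loc=0.0, scale=1.0, size=100)

for generation in range(25):
    # "Train" this generation's model: fit a Gaussian to the current data.
    mu, sigma = data.mean(), data.std()
    print(f"generation {generation:2d}: mu = {mu:+.3f}, sigma = {sigma:.3f}")
    # The next generation never sees the real data, only samples from this fit.
    data = rng.normal(loc=mu, scale=sigma, size=100)
```

Run it with a few different seeds and watch the fitted parameters wander away from the originals; the paper's point is that the rare, low-probability values are the first casualties of that wandering.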

The Slow Degradation of Human Ability

This is getting into the realm of science fiction, Frank-Herbert-Butlerian-jihad-against-the-machines territory, but it’s not irrelevant to think about.

Machine learning right now is a tool to show us correlations, and we have to bring our own brains to figure out whether those correlations are useful, or the model is hallucinating, or it's spitting out true but useless correlations, or something else entirely. Insofar as machine learning innovates, it helps its human users discover unexpected correlations in large datasets faster. But we need to be careful about assuming that ChatGPT, or even a near-future ChatGPT, is going to be able to write at the level of a talented human. To a certain leadership mindset, though, ChatGPT looks like a really great replacement for the sort of deep thinking and careful consideration that real human expertise requires.

As alarmist as this sounds, it’s a sentiment expressed by someone who was, at one point, one of the wealthiest people on Earth. To wit:

“I don’t want to say no book is ever worth reading, but I actually do believe something pretty close to that. … If you wrote a book, you f---ed up, and it should have been a six-paragraph blog post.”

-Sam Bankman-Fried, Interview

Before we dismiss this as the opinion of someone who, in retrospect, was obviously not a role model, remember that this guy was a Forbes 30 Under 30 honoree whom 5 million users trusted and who was at the head of a company valued, at its peak, at $32bn before it all collapsed. Tell me someone like that wouldn't lay off their entire legal team in favor of a chatbot that's a lot cheaper.

It's already happening — GitHub's Copilot is writing, by GitHub's own estimate, about 45% of new code. As someone who asked ChatGPT to write a regex function and got back code that sorta-worked-but-only-after-I-fixed-it, I'm not sanguine about the quality of that code. Aggregator websites are summarizing other websites with chatbots, removing any human curation. Germany's largest newspaper is planning to replace writers with AI as of this writing. In the words of Axel Springer's Chief Executive, Mathias Döpfner:

He predicted that AI would soon be better at the “aggregation of information” than human journalists and said that only publishers who created “the best original content” – such as investigative journalism and original commentary – would survive.

Isn't a newspaper already mostly investigative journalism and original commentary? Is the local-interest piece just going to die? And where is the information that's going to be aggregated supposed to come from in the first place?

We're nowhere near the level of technology required to abdicate the role of the human expert, or even to have the computer participate as an equal. But when supposed experts talk about these models as if they are sentient and intelligent, and that gets reported credulously to the general public, actual experts are going to be pushed out by decision-makers in favor of an AI chatbot that looks and sounds pretty smart if you have no idea what it's talking about.

These are also, by and large, the "learn to code" people who belittle the liberal arts but who would benefit greatly from, say, one of the philosophy classes on epistemology or ontology that are disappearing in favor of university departments with a more vocational focus. Those departments are usually doing something with machine learning that leverages the output the disappearing humanities departments would have produced as input for a new, fancy model to emulate.

Once again, the model trains on itself.

This is a human version of the Feedback Loops above — human experts are displaced by AI chatbots, which then do a mediocre job, but because there are fewer human experts left to step in — either because they've been fired or because they're too busy playing inept-AI whack-a-mole — much of the mediocre output slips through and becomes the new standard. The dwindling next generation of experts learns to rely heavily on the chatbot, because the goal of their university degree was "get a job" and the way you do that is "prompt engineering".

Which experts are most likely to be displaced? The expensive ones, most likely, will go first. Sorry, marketing teams and software engineers. But don't worry, you'll be joined by lawyers and medical professionals soon enough.

The process here is much slower than the Flash Crash, which actually makes it more insidious. Unlike the Mechanical Turk example, where the human was just a pass-through masking the fact that the models were training each other, here human experts are learning from model output, which degrades the experts themselves. We won't notice the degradation in skill until it becomes extremely hard to reverse course — after all, we fired all the schoolteachers and replaced them with a chatbot to lower property taxes (tell me, with total confidence, that this is not a thing we'll be debating in a year or two in some places).

This is the slowest, insidious-est doom of the three, and also the least likely, assuming the marketing buzz and hullabaloo surrounding language models dies down after they get deployed in production and give pretty middling results.


These are a few of my actual concerns with the current revolution in large language models that's being marketed as a major breakthrough in artificial intelligence. Some of them are already happening. Some are speculative. They're all easily avoided by understanding the limitations of these models and by directing a bit of good human-expert skepticism at their output. We have an impressive chatbot, but we're still very much the experts.
