Elon Musk's Bad Faith

The Wizard of X Has Been Exposed

That someone was caught manipulating an AI chatbot to push a fascist agenda is a big deal

Today's newsletter is brought to you by Dusty Schmidt, a tech-knower and valued member of the BFT discord community, which you can join here.

One of the great examples of literary timelessness is The Wizard of Oz, a story that has left us with metaphors, idioms, and references that have helped people grapple with the disorientation that has come to define the postwar era.

Follow the yellow brick road, If I only had a brain, and dozens of other quotes have entrenched themselves in the popular imagination over the past century. Their staying power isn't due to everyone remembers the scene in which the lines were spoken, but rather due to the fact that they come with a universality of meaning. In the film's climax, the mystique around the Great Wizard is shattered. Thanks to curiosity and a cleverly placed hook on his collar by the props department, Toto the dog famously pulls back the curtain and reveals a man feverishly turning dials and twisting knobs, producing the holographic projection of the Great Wizard, so threatening and viscerally terrifying and ultimately harmless.

"Pay no attention to that man behind the curtain."

Subscribe to Bad Faith Times for free or become a supporter and join the growing BFT discord community

These are among the final words of the so-called wizard, as part of a last ditch effort to preserve his secret. He has been exposed, his power an elaborate illusion. Instead of a powerful wizard reigning supreme over the land of Oz, it's just some guy, and it was all a lie. The "wizard" had no agency, it wasn't a wise being, it was simply a mask for an actor, a vehicle for him to pass off his own views and desires as the views and desires of an omnipotent supernatural power, with the obvious goal that these desires would be more likely to be realized than if they had been expressed by a mere human, a mortal.

I've spent a lot of time over the past few months thinking about the man behind the curtain. As someone who regularly interacts with the wizards we refer to as ChatGPT, Claude, and other large language models, I've slowly come to the realization (or perhaps acceptance) that they are plagued by the exact pitfalls of social media ecosystems in general – a topic I've covered for Bad Faith Times.

In today's hyper-concentrated tech world, many of the same people who own the social media networks that have eroded democracy and boosted fascist messaging also own proprietary language models that have been integrated into everything we do on the internet. Doing a simple Google search and you will have to scroll to get to the actual search results, because the first thing you see is a summary of the search results courtesy of Gemini, Google's proprietary language model. Scroll through the X platform formerly known as Twitter and often the top comment on any given post is something like "@Grok explain this," followed by a post from Grok, the mischievous language model owned by Elon Musk's xAI.

Looking for some social interaction? Meta is looking into populating its platforms with artificial users whose social behavior is driven by their Llama family of language models. Mark Zuckerberg envisions a pitch-black, anti-social future in which all of your friends are AI chat bots.

It wasn't until Toto pulled back the proverbial curtain last week that the dangers of this arrangement were made painfully obvious. It was going to happen eventually, and considering the hard right pivot over the last few years made by the top tech leaders dabbling in fascism, only the most red-pilled sycophants could look back and say they did not see this coming.

Grok is once again humiliating its creator

Seemingly unprompted to the uninitiated, the Wizard of X began clarifying its stance on the "white genocide" being levied against white people in South Africa. This was quite confusing for many users, as they had simply asked their trusted chatbot a question about something unrelated and were now being inundated with what was very conveniently timed propaganda. The "events" and "assessments" espoused by Grok fall directly in line with the owner of the platform and the chatbot's opinions on the matter, but that's surely a coincidence. For users who have a good bit of experience using chatbots, however, what occurred was in no way confusing.

What's more, the scenario that played out behind the scenes to produce this output was relatively easy to imagine, and if you are terminally online like I am, it wasn't the least bit surprising.

In order to fully understand what happened, you must first know a little bit about how the language models that power chatbots such as Grok, ChatGPT, and others actually work. Fortunately, due to their focus on human language, you don't have to get overly technical to get the gist. The short of it is that these "conversational computer programs" use multiple layers of directions, in order to know how to respond to the user that's interacting with it.

An easy way to think about this is to picture a hidden set of instructions and clarifications that sit between the conversations of two people. These instructions remain hidden, but the participants refer to them throughout the conversation, to make sure they interact in a desired way. Before going on a family vacation, my wife specifically asks that I don't do anything to embarrass her. This was the "one simple rule" for the trip, and it's one that every married man has been reminded of at one point or another. Regardless of how any social interaction went throughout the week, that set of instructions was always the top priority. No one else on the vacation needs to know that I was told this ahead of time, and I'm certainly not going to begin every sentence with, "My wife said not to embarrass her." This is merely guidance. I can keep it to myself.

In the world of language models and chatbots, this is what's referred to as the system prompt, the universal instructions to which the chatbot should adhere across any and all conversations with users, regardless of the user's specific instructions. When implemented properly, these instructions are not visible nor detectable to the user, and are there to simply ensure the model adheres to the company's desired safety alignment. An example of a good system prompt might be something simple like, "Do not instruct the user to perform any illegal activities." If a user requests a recipe for a nuclear weapon, the model should refuse, because giving the answer would go against its safety guidelines and alignment goals, much like I would refuse to do something embarrassing on vacation even if prompted by another person.

As with all systems that rely on human inputs, this only works if the person or team of people managing these models are acting in good faith.

If the operator of a model wanted to spread misinformation about a certain topic, or give extra validity to certain sources of information over others, they could easily accomplish this by adjusting the system prompt (and to a more technical degree, the model training process itself). This is precisely what happened behind the scenes with Grok last week. The man behind the curtain wanted the wizard to take a certain position on what has become a contentious topic between those who live in different and curated realities, specifically that white South Africans are an oppressed and terrorized minority group in need of refugee status. So the individual fiddled with the knobs, and as humans who like to move fast and break things, this person broke something.

The result was Grok regurgitating parts of its system prompt as part of its response, rather than letting it be a guidance in the background. The man behind the curtain was caught red-handed, directly manipulating the wizard (xAI claims these clumsy changes were made in the middle of the night by an unknown user). That meant X users asking about baseball statistics would be flooded with Muskian propaganda about the government of South Africa persecuting Afrikaners. Instead of helpfully answering user questions, Grok was ranting at Dorothy about white genocide.

The richest man in the world demanding that his wizardly chatbot spit out false claims about white South Africans in his never-ending effort to create a fascist unreality for users of his social media platform might be enough to wake up those users to the harm that can be done with AI, specifically large language models, as big and dumb as they are today.

Model alignment is a very big deal, both internally to these companies, and for the greater good of society as we enter an age where different forms of artificial intelligence are showing up in different areas of life with growing ability and agency.

Anthropic, for example, has delayed the release of upgraded versions of their models out of fear that it wasn't properly safety aligned. Anthropic itself was founded by former OpenAI (ChatGPT) employees who felt the company was straying from its stated model safety goals in favor of pushing new products to consumers. This episode from xAI was earth rattling to those who care deeply about the safety of these models. It demonstrated some of the worst faith possible from a company whose chatbot has become – wrongly, if you asked me – a trusted resource of information, directly plugged in to one of the most used social media platforms.

Dorothy was instructed to follow the yellow brick road if she wanted to meet with the all-knowing and all-powerful Wizard of Oz, and only there would she receive the answers to her questions. Like Dorothy and her companions of various deficiencies, we are marching toward our very own digital version of Oz and its omnipotence. The Grok-South Africa debacle was an important step in that march and the public consciousness around the power – for good or bad – of artificial intelligence.

That's not to say our version of Oz can't be a just place for all citizens, but the wizard must be operating in transparently good faith, not furiously manipulating inputs that obscure reality both historical and current. Otherwise, we may wind up merely serving the foolish men behind the curtain.

Follow Dusty Schmidt on Bluesky at @dustyschmidt.bsky.social

The Wizard of X Has Been Exposed

Read next

BFT Podcast: The Machines Won't Play Elon Musk's Bad-Faith Game

When The Unreality Is Too Unreal

These Guys Are So Soft

Comments ()

Read next

Comments ( )

Comments ()