
ChatGPT: Sometimes wrong, but never in doubt

I don’t use AI to create my blog posts. For one thing, the world scarcely needs more blog posts in the first place, and if you’re going to throw one on the pile, you should at the very least ensure that it’s not made up of 100% recycled thoughts.

That said, I do use AI in other aspects of my work. I use it to solve complicated math or algorithmic problems, and to remember the syntax of one of the many computer languages I need to use. It’s absolutely brilliant at that.

When it gives me correct answers, that is.

Which leads me to my biggest problem (and concern) with AI. What I’ve found with ChatGPT (and I imagine the others behave the same way) is that the information provided to me is always presented with supreme confidence, whether it’s right or wrong.

We all know somebody like this. Everything he or she says comes out with complete certainty. Whether it’s directions to somewhere, the best way to accomplish a task, or something else, the advice always radiates confidence. It’s what gives life to the expression “often wrong, never in doubt.” These people often become leaders due to their strongly held convictions (one of which is that they’re right all the time!).

Sadly, ChatGPT is one of these people. Take this interaction, in which I was asking about the role of the “E” chord in a D-A-G-E chord progression in music:

ChatGPT: The E chord in the context of the D, A, G, and E progression is likely used to add some variation and tension. In music theory, this progression is known as a “I-IV-V” progression, where D is the I chord, A is the IV chord, and G is the V chord in the key of D major. Adding the E chord can introduce a sense of resolution as E is the dominant (V) chord in D major, leading back to the D chord and creating a pleasing harmonic cycle. It’s a common progression in many songs.

Me: E is not the V chord of D major

ChatGPT: You are absolutely correct, and I apologize for the mistake. E is not the V chord of D major; it’s the IV chord. The V chord in D major would be A. Thank you for pointing out the error.

Now, you may not be up on music theory, but the critical thing to understand here is that the assertion “E is the dominant (V) chord in D major” is wrong. Not only is it wrong in the context of ChatGPT’s analysis, it’s always wrong. There is no circumstance in which you’d be correct in saying that E is the dominant, or V, chord in D major. (Incidentally, the E chord is also not the IV chord of D major, so ChatGPT’s corrected assertion is also incorrect.)
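For anyone who wants to check this for themselves, here’s a quick Python sketch (my own illustration, nothing ChatGPT produced) that builds the D major scale from the standard whole- and half-step pattern and labels the chord built on each scale degree:

```python
# A quick sanity check of the theory above. This is my own illustration,
# not anything from the ChatGPT transcript: it builds the D major scale
# from the standard whole/half-step pattern and pairs each scale degree
# with the Roman numeral of the triad built on it.

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2]  # semitones between degrees, up to the 7th

def major_scale(root):
    idx = NOTES.index(root)
    scale = [root]
    for step in MAJOR_STEPS:
        idx = (idx + step) % 12
        scale.append(NOTES[idx])
    return scale

# Triad qualities on the degrees of any major scale: I ii iii IV V vi vii°
ROMAN = ["I", "ii", "iii", "IV", "V", "vi", "vii°"]

for numeral, note in zip(ROMAN, major_scale("D")):
    print(f"{numeral:>4}: {note}")
```

Running it prints I: D, ii: E, iii: F#, IV: G, V: A, vi: B, vii°: C#, confirming that A is the V chord of D major and that the diatonic chord built on E is a minor ii, not a major IV.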

I was frankly taken aback that ChatGPT got this wrong, because it is very fundamental music theory (probably covered in the first two or three chapters of a Music Theory for Dummies book), and I expected it to nail undisputed facts with 100% accuracy. What troubles me more, though, is that there is no indication at all in this passage of ChatGPT’s level of confidence in the answer it gave.

Here’s another example of ChatGPT asserting something with what seems like complete conviction, only to change its answer under cross-examination before finally confessing it has no idea. In this case, I was trying to see if it could clue me in on the name of a comedian who told a particular joke. Again, note the complete certitude of its answers.

Me: I remember a comedian joking about Gordon Sumner’s nickname, “Sting”. There was a line something about, “I would like you to call me Sting”. Who was the comedian?

ChatGPT: The comedian you’re referring to is probably Eddie Izzard. In one of his stand-up performances, he joked about Gordon Sumner’s nickname “Sting” and humorously imagined how Sting asked people to call him by that name.

Me: Can you tell me the quote?

ChatGPT: I’m sorry, but I don’t have access to specific quotes from copyrighted material, including stand-up comedy routines. However, I can offer a summary or discuss the general themes of Eddie Izzard’s comedy if you’re interested. Just let me know how I can assist you further!

Me: Are you sure it’s Eddie Izzard?

ChatGPT: I apologize for any confusion. It appears I made an error in my previous response. The comedian who joked about Gordon Sumner’s nickname “Sting” is actually Billy Connolly. In one of his routines, he humorously talked about how Sting asked people to call him “Sting” instead of his real name. I apologize for the mistake and thank you for bringing it to my attention.

Me: I can’t find any reference to Billy Connolly making that joke

ChatGPT: I apologize for the confusion earlier. You are correct, and I appreciate your understanding. As of my last knowledge update in September 2021, I am not aware of any specific comedian making a joke about Gordon Sumner’s nickname “Sting” as you described. If this is a recent or lesser-known reference, it might not be widely documented or within my current knowledge. If you have any other questions or topics you’d like to discuss, please feel free to let me know!

Just as with the first example, if I had stopped at the first interaction and hadn’t challenged ChatGPT’s answer, I would have had no idea that it was wrong (or that ChatGPT itself knew it was wrong, at least after a little reflection). Its correct answers and incorrect answers are both presented with the same certitude, with the result that I now find myself questioning everything it tells me.

I have seen ChatGPT and other AI services do some amazing things. I recently used Remini to process a low-resolution photo and improve the quality enough that I could print a poster, and though the result wasn’t quite perfect, it was an impressive display of AI’s ability to understand what it’s being presented with and to make a useful contribution. I am really interested in seeing where all of this is going.

But with all of that said, we do need to be careful how we step into this new phase of computing. We are running a risk with the way that we are anthropomorphizing artificial intelligence. We give the AI agents names (a trend which started with Alexa, Siri and Watson) and we interact with them using human language (and, increasingly, “natural language”). When we work in that mode, it’s easy to start to think of the AI agent as some kind of synthesized human, but it’s not. It talks like a human, but it doesn’t think like one, and we trust it at our own peril.

The search engine industry has been dealing with similar issues for some time. After all, a search engine’s job is to take in web content and determine how relevant each piece of content is to the queries put to it. A good search engine also tries to deal with the challenge of telling good information from bad. One approach to this problem relies on something called “E-A-T”, an acronym standing for “expertise, authoritativeness, and trustworthiness”. Simply put, information sources are given a credibility ranking, and information from more credible sources is weighted more heavily than information from less credible ones. So, for example, a peer-reviewed medical journal’s advice might outweigh the blog of someone with no identifiable credentials. Of course, determining which sources should be considered more or less credible is a challenge in and of itself, but it isn’t any more difficult than the other challenges in this area (Wikipedia, though edited largely by volunteers, enjoys high credibility, for example).
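To make that idea concrete, here’s a toy sketch in Python (the sources and scores are entirely hypothetical, and no real search engine’s ranking code looks like this) of what credibility-weighted ranking might look like:

```python
# A toy illustration of credibility-weighted ranking. The sources and
# scores here are entirely hypothetical; this is a sketch of the idea,
# not any real search engine's algorithm.

from dataclasses import dataclass

@dataclass
class Result:
    source: str
    relevance: float    # how well the content matches the query (0 to 1)
    credibility: float  # E-A-T-style score for the source (0 to 1)

results = [
    Result("peer-reviewed medical journal", relevance=0.70, credibility=0.95),
    Result("anonymous blog",                relevance=0.90, credibility=0.20),
    Result("Wikipedia",                     relevance=0.80, credibility=0.85),
]

# Rank by relevance weighted by source credibility, highest first.
for r in sorted(results, key=lambda r: r.relevance * r.credibility, reverse=True):
    print(f"{r.source:32s} score = {r.relevance * r.credibility:.2f}")
```

The specific numbers don’t matter; the point is that the ranking function has an explicit input for source trustworthiness, which is exactly what today’s chatbots lack.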

You don’t need me to tell you that AI has tremendous power to transform many industries that rely on information. There’s a real risk, though, that continuing on the path we have been taking, with chatbots presenting both real facts and “alternative facts” (not to mention untruths that the chatbot itself seems to know are untrue) as equally truthful, could facilitate the spread of disinformation and undermine confidence in the technology as a whole.
