AI Confidence is a Human Problem

Asking ChatGPT to pick a random number between 1 and 100 is a bit of a trend at the moment. I took the bait and followed suit. Spoiler: it picked 73. Lumo, the AI assistant from Proton, picked 45. Claude’s was the best, though: 42. I already knew why, but I thought I’d ask all three how they “randomly” picked their respective numbers. While they all admitted to just picking the most statistically likely value that seemed plausible, Claude essentially answered, “Because of Hitchhiker’s Guide to the Galaxy.”

I’m not surprised by the outcome. The model is simply pattern matching based on the training set it has. Picking a random number is not really a pattern, and in the computational sense the model is not reaching for a random number generator tool. As ChatGPT told me, it didn’t actually feel the need to do so.
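To make that contrast concrete, here is a minimal Python sketch. It is not how any of these models work internally; `model_like_pick` and the `FAVORITES` weights are hypothetical, made up purely to illustrate a picker that leans toward a few "popular" numbers, set against a real pseudo-random generator from the standard library.

```python
import random
from collections import Counter

# Hypothetical weights, invented for illustration only: a few "favorite"
# numbers are far more likely than the rest. These are NOT measured values.
FAVORITES = {42: 30, 73: 30, 45: 10}


def uniform_pick(rng: random.Random) -> int:
    """Pick uniformly from 1..100 with a pseudo-random generator."""
    return rng.randint(1, 100)


def model_like_pick(rng: random.Random) -> int:
    """Pick from 1..100, heavily biased toward the hypothetical favorites."""
    numbers = list(range(1, 101))
    weights = [FAVORITES.get(n, 1) for n in numbers]
    return rng.choices(numbers, weights=weights, k=1)[0]


def most_common_share(picker, trials: int = 10_000, seed: int = 0):
    """Return the most frequent pick and the fraction of trials it won."""
    rng = random.Random(seed)
    counts = Counter(picker(rng) for _ in range(trials))
    number, count = counts.most_common(1)[0]
    return number, count / trials


if __name__ == "__main__":
    print("uniform:   ", most_common_share(uniform_pick))
    print("model-like:", most_common_share(model_like_pick))
```

Run it and the uniform picker's most common number should account for only about one percent of the draws, while the biased picker keeps landing on the same couple of favorites. That second behavior is roughly what asking a chat model for a "random" number looks like from the outside.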

None of this is actually a model issue: it is a human issue. I see two dynamics at play.

First, we users have been acclimated to computer systems returning deterministic responses. If my calculator ever gave me the answer “42” to the question “10+3”, I’d say it was broken and toss it out. AI, though, is not only non-deterministic; we have also anthropomorphized it. The feeling that there is a really smart person at the other end of the keyboard creates a psychological tendency to accept the generated response. Those responses can be really good, and yet, as this exercise demonstrates, they may simply be plausible responses: a probability average that looks good enough to sound accurate.

That raises the second issue: the authors of these chat agents could do better by having the model respond with less authority rather than reply with confident yet wrong guesses. This would at least be more transparent to users and would have the curious effect of building more trust in these tools, because we users could more clearly see the boundary between when AI is useful and valuable and when it is just spouting pure rot.

I see this pattern repeated in other tasks. I have asked for code output or edit suggestions. Often the response is very good and meets my needs. There are moments, though, where it looks plausible at first blush, yet digging just one layer deeper exposes that statistical-average prediction, and the result is less than stellar. If I had to bet, this “plausible-but-wrong” pattern is one reason why some are calling it AI slop.

You shouldn’t take away that I am negative about AI. On the contrary, the fact that it can infer and generate predictions makes it incredibly useful. To get there, though, we need to see AI for what it actually is: a tool, and not some mystical entity with supernatural powers of deduction. Once we do that we can get on with building some rather clever systems that might actually be useful.

I taunted Claude at the end of that chat about being trained on Intel Pentium mathematics. I think it won the internet for that day.

Daniel Konopacki
