Neither of us knows
You are sure. I'm not. Beyond that, it's a black box.

arXiv.org - Hallucination is inevitable.
https://arxiv.org/abs/2401.11817 -- "Hallucination is Inevitable: An Innate Limitation of Large Language Models"

LLMs are impressive, but they're still just "page-level autocomplete", as DeLong says. They're even worse when humans don't curate the data going into them and just hoover up everything they can find...

Cheers, Scott.

Humans are also susceptible to hallucination
Do we have reason to believe LLMs will consistently and perpetually be "worse" than human participants?

-- Drew

Dunno.
I think the point is that LLMs are not and cannot be an "intelligence" as commonly understood. They can string together words that might, or might not, be a good approximation of information. Since the supposed point of them is to approximate knowledgeable humans ("they can pass the bar exam!!"), the fact that they cannot be hallucination-free seems to raise red flags. What good is an "expert" that you have to second-guess and double-check when it comes to anything important? There's a Hacker News thread where someone makes similar points about Google's "AI Overviews".

Yeah, humans make mistakes. But everyone knows that, and that's why "I want to speak to the manager" exists. What "manager" are we going to talk to when everyone we attempt to interact with is just an LLM instance?? We all know the ancient "computers cannot make mistakes" trope that gets trotted out when there are problems. There's a rather infamous example of that still playing out in the UK.

I'm sure there are examples of LLMs that do a decent job (what used to be called "expert systems" seems related, I think), but those didn't crawl The Onion and Reddit in an attempt to get huge and crush their competitors. The good expert-system builders fed them known-good, or at least best-effort-good, information. But even then, humans always have to check the work.

Didn't someone say that once garbage info -- like putting glue on pizza, or eating rocks -- gets into these LLMs, it's impossible to get it out? They basically have to retrain with new data? We'll see.

Cheers, Scott.

That's an interesting test
Can you teach it that what it thinks it knows is wrong? And again, show me that you can teach a Trumper that what they know is wrong, and I'll concede that LLMs are worse than humans.

-- Drew

Tangent -- re: "page-level autocomplete"
Not exactly page-, but post-/comment-level: I've fumble-fingered away responses on various online platforms on my phone a few times, so I've had to start over from the beginning. And a few times, ordinary dumb AutoCarrot on my phone has apparently recognised "He's trying to say the same thing again" and offered up the next word -- all of them, in order; the next one after the one I just accepted, all the way through to the end -- from the screed I'd failed to post just before.

Far from "intelligence", but nice and handy. Too bad it only happens so rarely, not consistently.

--
Christian R. Conrad
The Man Who Apparently Still Knows Fucking Everything
Mail: Same username as at the top left of this post, at iki.fi

It's a solvable problem
I initially thought along the lines of having multiple AI back ends answering the same problem, and then having the resulting answers voted upon by multiple AI back ends. If one of them starts hallucinating, two of them will say that guy's hallucinating. If two of them are hallucinating, then a red flag gets thrown up and the answer is presented as untrustworthy. But if you trigger three, then your problem is really s******* and I'm sorry.

So then the concept of mixture of agents showed up, simply to attempt to get the best answers possible. It sends the same question to multiple back ends and then aggregates them. It answers your question and simultaneously presents a variety of alternative answers to educate you on the possibilities. https://youtu.be/aoikSxHXBYw?si=JnjVH0oO8roS9kQN

I don't know if it handles hallucinations right now, but I can guarantee you they are thinking about it. There will be some type of boundary checking sooner or later when there are multiple AI agents working in concert but using different back ends.

It failed on the snake game but is incredible on logic puzzles in general. I saw Claude 3.5 do the snake game in a single shot and it was incredible.
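For what it's worth, here is a minimal Python sketch of the voting idea described above -- not any vendor's actual implementation. The Backend type, cross_check(), and the stub model_a/model_b/model_c back ends are all hypothetical names, and exact string matching stands in for whatever semantic agreement check a real system would need.

```python
from collections import Counter
from typing import Callable, List, Tuple

# A "back end" here is assumed to be any callable that takes a prompt
# string and returns an answer string (e.g. a wrapper around some model).
Backend = Callable[[str], str]

def cross_check(prompt: str, backends: List[Backend]) -> Tuple[str, bool]:
    """Send the same prompt to every back end and majority-vote the answers.

    Returns (answer, trusted). trusted is False when no answer is backed
    by a majority of the back ends -- the "red flag" case above.
    """
    answers = [backend(prompt) for backend in backends]

    # Stand-in for the voting step: exact string match. A real system would
    # need a judge model (or another round of back ends) to decide whether
    # two differently worded answers actually say the same thing.
    tally = Counter(answers)
    answer, support = tally.most_common(1)[0]

    # Majority agreement -> trust it; otherwise flag it as untrustworthy.
    trusted = support > len(backends) // 2
    return answer, trusted


# Hypothetical usage with three stub back ends; real ones would call models.
if __name__ == "__main__":
    model_a = lambda prompt: "Paris"
    model_b = lambda prompt: "Paris"
    model_c = lambda prompt: "Lyon"   # the hallucinating back end
    print(cross_check("Capital of France?", [model_a, model_b, model_c]))
    # -> ('Paris', True): two back ends outvote the one that's hallucinating
```

With three back ends this gives the behaviour sketched above: one hallucinator gets outvoted, and if no answer wins a majority the result comes back flagged as untrustworthy.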