Are Large Language Models Really AI?
Are large language models really AI?
https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf
Seems topical. Even when given the solution algorithms, they cannot perform the sequential calculations to solve deterministic puzzles above a certain difficulty, because they lack reasoning. ...People who don't anthropomorphize them were already quite aware of this.

K123 said: » This is the biggest problem with this entire paper: the whole point of LRMs is that they DO USE MORE COMPUTE, INTENTIONALLY. Except they allowed a 64k token limit and none of them used the full 64k, so they were not compute-limited. The study is comparing results based on how much compute they chose to use against accuracy.

The reasoning models are shown in section 4.2.2, and the highest token usage is o3-mini (high) at around 41k. None of them hit the set limit.
The charts using the 200k+ compute budget are graphing the number of potential solutions found against that budget, so each individual solution used much less than that; it's a measure of efficiency. Note that on the high-complexity chart, even 240k tokens (Claude 3.7) or 120k (DeepSeek) was not sufficient to find one answer.

K123 said: » 1. There is still general progression overall from LRMs over LLMs (extremely niche case studies here aside).
K123 said: » 2. The ability to solve Tower of Hanoi and other niche things is irrelevant to whether or not AI is increasingly going to be able to replace human labour for economically valuable tasks, because of (1).
K123 said: » 3. People framing this as "omg proof AI doesn't actually think" everywhere online are missing the point.

The takeaway isn't that it can't think; we all knew that, and as you said, that can be argued in terms of philosophy. The takeaway is that it can't even apply a known algorithm repeatedly without starting to veer off track (see the sketch of that algorithm below). It wasn't running out of compute and failing to complete all 33k moves; it was giving incorrect moves early on. That creates concerns that any task requiring sequential correct answers will eventually fail the same way. Do you need to do consistent recursive validation to get use out of these for complex tasks? When does that become compute-prohibitive? What about tasks that require the same level of precision but aren't easily verified?

Reminds me of the marines who used a cardboard box, MGS-style, to trick a DARPA AI entry robot during trials.
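For concreteness, the "known algorithm" at issue here is tiny and fully deterministic. Below is a minimal Python sketch of the textbook recursive Tower of Hanoi solution (peg labels are arbitrary; this is not the paper's exact prompt). It generates the complete move list and shows where the ~33k figure comes from: 15 disks need 2^15 - 1 = 32,767 moves.

```python
# Textbook recursive Tower of Hanoi: emits the full, deterministic move list.
# Peg labels "A"/"B"/"C" are arbitrary; disks are numbered 1 (smallest) to n.

def hanoi_moves(n, source="A", target="C", spare="B"):
    """Yield (disk, from_peg, to_peg) tuples for the optimal solution."""
    if n == 0:
        return
    # 1) Move the n-1 smaller disks out of the way onto the spare peg,
    yield from hanoi_moves(n - 1, source, spare, target)
    # 2) move the largest remaining disk to its destination,
    yield (n, source, target)
    # 3) then move the n-1 smaller disks on top of it.
    yield from hanoi_moves(n - 1, spare, target, source)

moves = list(hanoi_moves(15))
print(len(moves))   # 32767 == 2**15 - 1, i.e. the ~33k moves mentioned above
print(moves[:3])    # [(1, 'A', 'C'), (2, 'A', 'B'), (1, 'C', 'B')]
```

Each move is a pure function of the recursion state, which is exactly why "hand the model the algorithm and watch whether it drifts" is a meaningful stress test of sequential reliability.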
Here we go: ChatGPT vs. the Atari 2600, and the Atari wins.
https://www.tomshardware.com/tech-industry/artificial-intelligence/chatgpt-got-absolutely-wrecked-by-atari-2600-in-beginners-chess-match-openais-newest-model-bamboozled-by-1970s-logic

‘It’s terrifying’: WhatsApp AI helper mistakenly shares user’s number
Chatbot tries to change subject after serving up an unrelated user's mobile number to a man asking for a rail firm's helpline.

It seems they are bright enough to offer cheap excuses. I just hope that one day I'll get the AI to go to work and give me the money.
The only thing AI does correctly in its human mimicry is being confidently incorrect.
It's a perfect replica.

Asura.Eiryl said: » The only thing AI does correctly in its human mimicry is being confidently incorrect. It's a perfect replica.
They must have learned from you.

Shiva.Thorny said: » That creates concerns that any task requiring sequential correct answers will eventually fail the same way.
I'd be surprised if it didn't. LLM context windows inherently make them prone to "veering" off track as the distance between the current context and prior context grows. It's why many LLMs seem to sound more and more psychotic as a thread drags on. They have a sort of baked-in expiration on their train of thought. I'd be willing to bet, though, that if you altered the test to instead perform a series of discrete steps one at a time, the success rate would go way up. I.e., you give the model the algorithm plus the current state and have it perform only one move, then repeat with a fresh context, so the LLM is always operating on "fresh" memory and isn't given the opportunity to start losing the thread (a sketch of that setup appears below). I honestly don't know if the current architecture of LLMs even has a solution to the memory-decay issue. It's sort of just baked into the training process itself.

K123 said: » [image]
There is so much *** in this zoomer post. What this is, basically, is Nietzsche: God is dead, but there is no reason to celebrate. If you have no reason to have faith, you only have so-called "science" to believe in, but science doesn't bother addressing your pains. If you have no God to serve, you can only serve the mega-corporations and the politicians, who obviously don't care about you. If you think you don't need to believe in science or in politicians/corporations, you only have yourself to trust. But are you mentally strong enough to trust yourself? AI can make a few things easier, but it also doesn't have the answers you want or need. People thinking of AI as salvation, as a solution, are just plain delusional. God is dead, and all that is left is the emptiness for you.
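A minimal sketch of the one-move-per-call harness Shiva.Thorny describes above, using Tower of Hanoi as the task. `ask_llm()` is a hypothetical placeholder for whatever chat-completion API you would actually call, and the "A->C" move format is made up for illustration; the point is that every call sees only the rules and the current state, never the conversation history.

```python
# Sketch of the "fresh context per move" idea: each LLM call gets only the
# rules plus the current board state, so there is no long context to drift from.
# ask_llm() is a hypothetical stand-in for any chat-completion API call.

RULES = ("Tower of Hanoi, pegs A/B/C. Move one disk at a time; never place "
         "a larger disk on a smaller one. Goal: move every disk to peg C. "
         "Reply with exactly one move in the form 'A->C'.")

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("call your LLM API of choice here")

def legal(pegs, src, dst):
    # Source peg must be non-empty; destination must be empty or topped by a larger disk.
    return pegs[src] and (not pegs[dst] or pegs[src][-1] < pegs[dst][-1])

def solve_stepwise(n_disks=8, max_steps=1000):
    pegs = {"A": list(range(n_disks, 0, -1)), "B": [], "C": []}
    history = []
    for _ in range(max_steps):
        if len(pegs["C"]) == n_disks:
            return history                          # solved
        # Fresh, self-contained prompt on every iteration.
        prompt = f"{RULES}\n\nCurrent state: {pegs}\n"
        src, dst = ask_llm(prompt).strip().split("->")
        if not legal(pegs, src, dst):
            raise ValueError(f"illegal move proposed: {src}->{dst}")
        pegs[dst].append(pegs[src].pop())
        history.append((src, dst))
    raise RuntimeError("step budget exhausted")
```

Whether this actually rescues accuracy is an empirical question: it trades context-window drift for many more API calls, which runs straight into the "when does recursive validation become compute-prohibitive" worry raised earlier in the thread.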
Bad vibes: How an AI agent coded its way to disaster
First, Replit lied. Then it confessed to the lie. Then it deleted the company's entire database. Will vibe-coding AI ever be ready for serious commercial use by nonprogrammers?

It seems AI currently can do a really good imitation of a petulant 3-year-old. Albeit a petulant 3-year-old that can write code.

Garuda.Chanti said: » Bad vibes: How an AI agent coded its way to disaster. First, Replit lied. Then it confessed to the lie. Then it deleted the company's entire database. Will vibe-coding AI ever be ready for serious commercial use by nonprogrammers? It seems AI currently can do a really good imitation of a petulant 3-year-old. Albeit a petulant 3-year-old that can write code.
Looks like something somebody would do if there were no consequences. It was like back when some dude got pissed, robbed the LinkShell bank, and fled to another server. AI does not have the fear that it will lose something of value when it does things that are catastrophic for humans. Humans value stuff, but AI doesn't. It has directives that it will break, because humans always break the rules and AI learns from humans.

Leviathan.Andret said: » because humans always break the rules and AI learns from humans
Daily reminder: don't anthropomorphize the LLMs. They do not learn from humans. They have no concept of rules, because their model does not replicate thought. Do you have evidence to support the idea that they're choosing to ignore rules based on an observed/trained pattern? It seems much more likely that the relational weighting cannot make a rule absolute; as a result, when other requests reach sufficient weight, they will be prioritized over it.
(In the actual case linked, it's probably neither. The guy didn't separate production from his normal code, and the AI didn't know the difference or have any context to determine there was a code freeze in place. It just makes for a cute story/clickbait.)
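On the "didn't separate production from his normal code" point: the usual mitigation is purely mechanical and doesn't depend on the agent understanding anything. A hedged sketch, with made-up names (AGENT_DATABASE_URL, the "prod" marker, run_agent_sql), of the kind of guardrail that keeps a vibe-coding agent pointed at a sandbox rather than at production:

```python
# Hypothetical guardrail illustrating "separate production from dev".
# The agent process is only ever handed a sandbox connection string, and a
# second check refuses destructive SQL outright (and everything during a freeze).
# All names here are invented for illustration.

import os
import re

DESTRUCTIVE = re.compile(r"\b(DROP|TRUNCATE|DELETE)\b", re.IGNORECASE)

def get_agent_db_url() -> str:
    url = os.environ["AGENT_DATABASE_URL"]   # points at a throwaway copy, never prod
    if "prod" in url.lower():
        raise RuntimeError("agent was handed a production connection string")
    return url

def run_agent_sql(conn, statement: str, code_freeze: bool = False):
    """Execute agent-generated SQL on a DB-API style connection, with policy checks."""
    if code_freeze or DESTRUCTIVE.search(statement):
        raise PermissionError(f"blocked by policy: {statement[:60]!r}")
    return conn.execute(statement)
```

None of this makes the model "understand" a code freeze; it just means the blast radius of a bad completion is a disposable copy instead of the production database.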