Are large language models really AI?
Server: Asura
Game: FFXI
Posts: 1147
By Asura.Iamaman 2025-11-25 15:46:13
This is a fair example and one I was aware of; the issue I take with it is that this isn't how AI is being promoted by the industry.
The way most people expect AI to work is this: give me code, I put it in AI, tell AI to find bugs, AI gives me bugs. Now we, here, understand this is not the case, but your typical CISO does not. The goal is to promote the idea that you can remove the person and automate results using AI, something they claimed for years prior to LLMs, and it wasn't the case then either. In this case, the bug was found 8% of the time, so between weeding out false positives and running it enough times for it to find the bug, is that really working the way it's being promoted, especially in the context of the Anthropic posts?
In the case above, you had someone who understood the code enough to find a bug on their own (he kind of hints at this but IMO underplays the value here), who then fed the LLM the specific code necessary AND who understands how LLMs work well enough to know how to prompt it and provide what is needed. You'd also have to have enough familiarity with the code to weed out false positives, again, something that requires manual intervention and review. This code is not hard to get through, for sure, but still...there's a prerequisite that someone can interpret the result and make sense of it.
In the context of the other discussion, exploiting an issue like this is also extremely volatile. Most memory corruption bugs I've found in the course of my career are not practically exploitable, but they get CVEs anyway because they segfault and most people don't care whether they're actually exploitable. Getting an LLM to reliably exploit a UAF bug would require understanding of the compiled code, allocator internals, thread state, and a number of other factors that it's just not capable of handling well enough to produce a working exploit. So yeah, for bug hunting there is some optimization, but not enough to replace people or even (in my experience) provide meaningful output. You still need someone who understands these internals to really turn it into something useful, but it involves correlation of so many different factors (some of which are non-deterministic, like allocator state) that LLMs can't begin to handle it.
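For anyone who hasn't chased one of these down, here's a deliberately contrived sketch of the bug class (not the actual bug being discussed; the struct and names are made up purely for illustration). The point is that nothing in the source tells you what the freed chunk gets reused for; that's runtime allocator state, which is exactly the part an LLM staring at code can't see:
```cpp
#include <cstring>
#include <iostream>

// Contrived session object; the function pointer is what makes chunk reuse interesting.
struct Session {
    void (*on_close)(Session*);
    char name[56];
};

static void legit_close(Session* s) { std::cout << "closing " << s->name << "\n"; }

int main() {
    Session* s = new Session{legit_close, "guest"};

    delete s;  // freed, but the pointer below still dangles

    // Elsewhere, an allocation of similar size may land in the same chunk.
    // The 0x41 bytes stand in for attacker-controlled data.
    char* other = new char[sizeof(Session)];
    std::memset(other, 0x41, sizeof(Session));

    // The use-after-free. Whether this still calls legit_close, crashes, or
    // branches to a bogus address depends on allocator reuse and heap layout,
    // none of which is visible in the source itself.
    s->on_close(s);

    delete[] other;
    return 0;
}
```
Turning something like that into a reliable exploit means controlling what lands in that chunk, which is where the compiled code, allocator internals, and thread timing all come in.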
I actually ran a similar test not long ago for a bug (also in the Linux kernel, actually) that related to the way state was handled for a certain kernel module. I prompted it and hand-held it to see if I could get it even close to finding the bug, even going to the point of asking "is this a bug?" while pointing at the bug, and it basically told me to read the code myself after giving me the wrong answer repeatedly. Granted, the code in my case is a little more complex, as it relates to data fed in from userspace that is interacted with very indirectly, whereas data pulled through a command handler off the network is a more linear path (which is what was done here). In another case, I was interacting with a service over a publicly available IPC API and it just invented header files, function calls, and data structures that didn't exist, and even pointing at the code, it couldn't get past whatever it was hallucinating. I used Claude and Copilot, though, so maybe I need to try o3 and replicate his process here a little better than I have in the past (our current work isn't strictly related to this at the moment).
I think the overall point here is that yeah, there is value in some cases: these tools can optimize what someone who already knows what they are doing is capable of. The problem is that's not the pitch. The pitch is a lot simpler and is the same pitch that's been out there for years prior to LLMs, but it doesn't match the reality when you are dealing with complex targets. These tools can make someone more effective at times, and at other times they end up chasing ghosts. That's before even asking whether simpler testing methods, like simply fuzzing the SMB protocol in this case with the proper instrumentation, would've identified the same bug with less work and less core understanding of the code (initially, anyway).
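For what it's worth, "proper instrumentation" here means something like a coverage-guided harness with ASan. A minimal sketch is below; toy_parse_command is a made-up stand-in for whatever the real command dispatcher would be, and the build line is just the typical libFuzzer invocation, not a verified recipe for any particular target:
```cpp
// Rough libFuzzer + ASan harness sketch.
// Typical build (assumption): clang++ -g -O1 -fsanitize=fuzzer,address harness.cpp
#include <cstdint>
#include <cstddef>
#include <cstring>

// Toy stand-in for a network command parser; NOT real SMB code.
// It trusts a length byte from the packet, the sort of bug ASan flags instantly.
static void toy_parse_command(const uint8_t* buf, size_t len) {
    if (len < 2) return;
    uint8_t claimed_len = buf[1];               // attacker-controlled length field
    char payload[32];
    std::memcpy(payload, buf + 2, claimed_len); // overflows payload when claimed_len > 32
    (void)payload;
}

// libFuzzer calls this with mutated inputs; coverage feedback guides the mutation.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    toy_parse_command(data, size);
    return 0;
}
```
A harness that dumb tends to find a length-field bug like this in seconds; the expensive part is still a human deciding whether the crash is reachable and exploitable in the real target.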
VIP
Server: Fenrir
Game: FFXI
Posts: 1182
By Fenrir.Niflheim 2025-11-25 18:49:44
Quote: In the case above, you had someone who understood the code enough to find a bug on their own (he kind of hints at this but IMO underplays the value here), who then fed the LLM the specific code necessary AND who understands how LLMs work well enough to know how to prompt it and provide what is needed. You'd also have to have enough familiarity with the code to weed out false positives, again, something that requires manual intervention and review. This code is not hard to get through, for sure, but still...there's a prerequisite that someone can interpret the result and make sense of it.
Yep, and for the other side of it we have "death by a thousand slops", where the maintainer of curl discusses how they are inundated with false reports: those 92% false positives from the LLM security researchers.
And another: Google sends a bug report with a scheduled disclosure timeline to ffmpeg. If a company has the time to find the exploits, maybe they should fix the bug as well. Also worth noting the exploit impacts an irrelevant section of ffmpeg.
Server: Asura
Game: FFXI
Posts: 1147
By Asura.Iamaman 2025-11-25 19:28:39
This is something that absolutely boils my *** blood; it's been going on for decades in the security industry.
They inflate EVERYTHING, then act like they are sooo hot because they...dumped a 50MB PDF on some vendor that triggers a NULL pointer dereference that they insist is a buffer overflow (when it isn't) and demand credit despite doing zero triage and zero reduction of the repro, having basically just used CPU cycles to find the magic pattern that triggers the crash. Some of the biggest names in the space are notorious for doing this and most people don't realize it, because their public persona is very analytical, detailed, etc., but they wear every CVE as a badge of honor when they did little to no work to actually identify the issue, and most of the time it isn't even an exploitable bug category, let alone an exploitable/reachable bug.
A lot of these stick with me. One was a bug in some dumb document parser and everyone was melting down about it. It reproduced on Linux and, in theory, reproduced on Windows. The problem is, it didn't: it gracefully exited, because Visual C++ defaults to checked iterators, and when the out-of-bounds access occurred, the iterator check caught it and the process exited cleanly without a segfault or corruption. The Linux version didn't do that because gcc didn't have that feature. There was all sorts of media hysteria and no one bothered to attach a *** debugger to see what was going on. This type of *** happens all the time and even the big vendors do it.
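For anyone who hasn't seen the behavior, the split looks roughly like this contrived example. I'm assuming the Windows build had MSVC's checked/secure iterators on (older Visual C++ releases enabled them by default) and the Linux build was stock libstdc++ without assertions; exact defaults vary by compiler version:
```cpp
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 3};

    // Iterator deliberately walked past end(): a classic out-of-bounds access.
    auto it = v.begin() + 5;

    // MSVC with checked iterators enabled (_ITERATOR_DEBUG_LEVEL, or the old _SECURE_SCL):
    // the invalid iterator is detected and the process terminates cleanly, so there is
    // no memory corruption to exploit.
    // libstdc++ without -D_GLIBCXX_ASSERTIONS: plain undefined behavior; it reads
    // whatever sits past the buffer and may or may not segfault.
    std::cout << *it << "\n";
    return 0;
}
```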
The problem is that, in the open source space especially, this leads to what you mention above. The maintainers can't handle the number of reports, people aren't doing any real triage or work, and in the middle of this mess real bugs don't get fixed, or they get fixed and prioritized wrong because the fix was wrapped up in another code change but the bug wasn't identified (which is bad when you consider how downstream maintainers cherry-pick fixes). The reason people like Google don't go and offer fixes is that they don't look at the code closely enough to make any meaningful attempt at patching it. They'll fuzz it out or see ASan trigger a problem, then report it without much more analysis. In a lot of cases, they aren't even capable of offering a fix.
I don't really interact with the community the same way as I did before, so it never really occurred to me that LLMs would cause more of this to happen or make it worse. I can only imagine, it was bad enough before.
Server: Asura
Game: FFXI
Posts: 55
By Asura.Cossack 2025-11-26 03:11:56
"There is no life on this planet!
Jehovah-One replaced all life with machinery five centuries ago
The so-called industrial revolution was just another hoax
And we all fell for it, 'cause we were all programmed to
Even I fell for it, I believe in the steam engine
Even though I don't believe in anything"
Garuda.Chanti
Server: Garuda
Game: FFXI
Posts: 12085
By Garuda.Chanti 2025-11-26 09:46:03
Each Time AI Gets Smarter, We Change the Definition of Intelligence
Or someday there may be intelligence on this planet, but when that happens we can always move the goalposts so we are safe?
Quote: Why would a code writing AI improve itself? I wasn't suggesting it would have sentience. If a code writing AI could fully replace the job of a software developer, the people who created it would be foolish not to use that capability to exponentially improve its own codebase. It's not scifi, it doesn't recognize that it's improving itself. It's just the logical progression of a product that reaches that state.
But software has been used to improve its own function for well over a decade. And design its own hardware too. So I may have leapt to a conclusion that you were talking about an NHI event.
Server: Fenrir
Game: FFXI
Posts: 380
By Fenrir.Brimstonefox 2025-11-26 10:14:07
Quote: Why would a code writing AI improve itself? I wasn't suggesting it would have sentience. If a code writing AI could fully replace the job of a software developer, the people who created it would be foolish not to use that capability to exponentially improve its own codebase. It's not scifi, it doesn't recognize that it's improving itself. It's just the logical progression of a product that reaches that state.
Quote: But software has been used to improve its own function for well over a decade. And design its own hardware too.
Much longer than that; it's Moore's law, in essence. Semiconductor development is not unlike gear progression in FFXI: I'd wager it would be considerably easier for a level 1 character to solo V25 Bumba than to take all the knowledge needed to make a 1nm process but have none of the requisite hardware or software available to do so.
Garuda.Chanti
Server: Garuda
Game: FFXI
Posts: 12085
By Garuda.Chanti 2025-11-28 13:52:12
Shiva.Thorny
Server: Shiva
Game: FFXI
Posts: 3652
By Shiva.Thorny 2025-11-28 14:10:32
I'd interpret it to mean that they serve the purpose of translating data into a variety of languages and formats to share the data. The implication is that they're disseminating information rather than synthesizing novel information. Information dissemination systems of the past would distribute the exact same text to everyone (such as a static html file or a printed set of encyclopedias). In some cases, they might have several language options each with a static set of text, but that was the extent of variety. They regurgitated exact data they were provided to present to the user.
In contrast, LLMs are capable of dynamically presenting the information they have available in numerous ways. The LLM is trained on information and it can present that information using language as a medium, which would be using the communication function of language.
I mostly agree with the takeaways. We know that LLMs cannot think because they lack a basis for truth. While they can branch into determinative reasoning for certain problems (such as higher mathematics or the application of formulas), it only happens because a human programmed the model to use a different subroutine to solve that style of problem.
tl;dr: LLMs can't innovate. It doesn't really have much bearing on how they'll stack up to workers at current tasks, though; most workers do very little innovation.
By K123 2025-11-28 15:19:25
Most humans hardly "think" and aren't really "intelligent" by the standards we are holding LLMs to.
AI models are discovering things humans haven't, simply by raw compute. We might not call it "innovation", but that's just semantics. AI and LLMs are both creating value: functional, financial, and for the welfare of humans.
Garuda.Chanti
Server: Garuda
Game: FFXI
Posts: 12085
By Garuda.Chanti 2025-11-28 16:00:36
@Thorny
The part that confuses me:
Quote: “The problem is that according to current neuroscience, human thinking is largely independent of human language — and we have little reason to believe ever more sophisticated modeling of language will create a form of intelligence that meets or surpasses our own.”
I cannot imagine thought without words. I have studied zen, the art of no thought. One approach to zen is to divorce yourself from words. "Silencing the monkey." That monkey that chatters on inside our heads.
By Pantafernando 2025-11-28 16:08:19
I cannot imagine thought without words
Images?
Shiva.Thorny
Server: Shiva
Game: FFXI
Posts: 3652
By Shiva.Thorny 2025-11-28 16:18:24
I cannot imagine thought without words.
Thought is the process of connecting related concepts. Words are the way you express them. Consider someone who speaks multiple languages; even if language is a component of creating an idea, the idea continues to exist independent of the language.
I'm not sure if that really helps; I'm certainly no expert. I see thought as a level of abstraction higher than language. Obviously different people have different verbal IQs and ways of thinking; someone with aphantasia would likely have a very difficult time differentiating thought from language.
By K123 2025-11-28 16:47:29
The Cattell Culture Fair IQ test is language-independent because it's pattern recognition. I have a 142 IQ on that but only 133 on Cattell III B because of *** language. None of them mean that much in reality, though. There are many forms of intelligence. I knew a guy who day to day was pretty dumb, noticeably, but you put him on Halo and he was a machine. His reactions, speed, etc. were insane.
VIP
Server: Fenrir
Game: FFXI
Posts: 1182
By Fenrir.Niflheim 2025-11-28 22:40:26
Quote: I cannot imagine thought without words
Some people have no "inner monologue"; they simply think differently than people with an inner monologue. Similarly, some people cannot "picture an apple" in their mind.
There is a wide range of human experience and we are usually oblivious to how different people might be when it concerns something so core to how we experience the world. Until I got my first pair of glasses, I did not think other people could see the leaves on a tree; they were always so blurry I just figured everyone saw them the same as me.
By K123 2025-11-29 02:42:26
Most people are oblivious to the fact that half the planet can't even read or write, or that half the people in the developed world are NPCs when it comes to average intelligence, too.
And if not, what would or could be? (This assumes that we are intelligent.)
Sub questions:
1. Is self-awareness needed for intelligence?
2. Is consciousness needed for intelligence?
3. Would creativity be possible without intelligence?
Feel free to ask more.
I say they aren't. To me they are search engines that have leveled up once or twice but haven't evolved.
They use so much electricity because they have to sift through darn near everything for each request. Intelligence, at a minimum, would prune search paths far better than LLMs do; enough to reduce power consumption by several orders of magnitude.
After all, if LLMs aren't truly AI, then whatever is will suck way more power unless they evolve.
I don't think that LLMs' hallucinations are disqualifying. After all, I and many of my friends spent real money for hallucinations.