The Unix Heritage Society mailing list
* [TUHS] Wikipedia anecdotes - LLM generalizations [was On the unreliability of LLM-based search results (was: Listing of early Unix source code from the Computer History Museum)
       [not found]     ` <CAEdTPBcr2ajyAQh24LPtiQLBjfe2G2MYwoq8x_3bt6TzOT1_BA@mail.gmail.com>
@ 2025-05-26 18:45       ` Charles H Sauer (he/him)
  0 siblings, 0 replies; 5+ messages in thread
From: Charles H Sauer (he/him) @ 2025-05-26 18:45 UTC (permalink / raw)
  To: COFF; +Cc: The Eunuchs Hysterical Society

TUHS->COFF
>     > It's like Wikipedia.
> 
>     No, Wikipedia has (at least historically) human editors who
>     supposedly have some knowledge of reality and history.
> 
>     An LLM response is going to be a series of tokens predicted based on
>     probabilities from its training data. ...
> 
>     Assuming the sources it cites are real works, it seems fine as a
>     search engine, but the text that it outputs should absolutely not be
>     thought of as something arrived at by similar means as text produced
>     by supposedly knowledgeable and well-intentioned humans.
> 
> An LLM can weigh sources, but it has to be taught to do that.  A human 
> can weigh sources, but they have to be taught to do that.

Before LLMs, Wikipedia, the World Wide Web, ..., adages such as "Trust, 
but verify" and "Inspect what you expect" were appropriate, and they 
still are.

Dabbling in editing and creating Wikipedia articles has reinforced those 
notions. A few anecdotes here -- I could cite others.

1. I think my first experience was trying in 2008 to fix what is now at 
https://en.wikipedia.org/wiki/Vulcan_Gas_Company_(1967%E2%80%931970), 
because the article had so much erroneous content, and because I had 
worked/performed at that venue in 1969-70. Much of what I did in 2008 was 
accepted without anyone else verifying it. But others broke and changed 
things, even renamed the original article and replaced it with an 
article about a newer club that adopted the name. A few years ago, I 
tried to make corrections, citing poster images at 
https://concerts.fandom.com/wiki/Vulcan_Gas_Company. Those changes were 
vetoed because fandom.com was considered unreliable. I copied the images 
from fandom to https://technologists.com/VGC/, and citing those images 
was then accepted by the editors involved. (The article has since been 
changed dramatically and is still seriously deficient, IMO, but I'm not 
interested in fixing it.)

2. Last year, I created https://en.wikipedia.org/wiki/Hub_City_Movers, 
citing sources I considered reliable. Citations to images at discogs.com 
were vetoed as unreliable, based on an analogous bias against that site. 
Partly to see what was possible, I engaged with the editors, found 
citations they would accept, and ultimately produced a better article.

3. Later last year, I edited https://en.wikipedia.org/wiki/IBM_AIX to 
fix obviously erroneous discussion of AIX 1/2/3. Even though I used my 
own writings as references, the changes were accepted.

I still use the Web, Wikipedia, and even LLMs, but cautiously.

Charlie
-- 
voice: +1.512.784.7526       e-mail: sauer@technologists.com
fax: +1.512.346.5240         Web: https://technologists.com/sauer/
Facebook/Google/LinkedIn/mas.to: CharlesHSauer



* [TUHS] Re: On the unreliability of LLM-based search results (was: Listing of early Unix source code from the Computer History Museum)
       [not found] <F7093F5EDCBB735E2C7C473314D40D5A.for-standards-violators@oclsc.org>
       [not found] ` <CAEdTPBeFUcxAZWn1=mZnwTmF2a3DN-1GnXXB6WmV5gaqZHz1Lw@mail.gmail.com>
@ 2025-05-26 20:36 ` Noel Hunt
       [not found]   ` <CAKzdPgzNBUiT4GeQUnBX38+3dtNY=2Gw=9mFUy2anMoO4DUECg@mail.gmail.com>
       [not found] ` <DEBB648F-52A0-4E52-AC26-E2067FE7E0CD@humeweb.com>
  2 siblings, 1 reply; 5+ messages in thread
From: Noel Hunt @ 2025-05-26 20:36 UTC (permalink / raw)
  To: Norman Wilson; +Cc: tuhs


On Tue, 27 May 2025 at 02:40, Norman Wilson <norman@oclsc.org> wrote:

> an LLM is pretty much just a much-fancier and better-automated
> descendant of Mark V Shaney: https://en.wikipedia.org/wiki/Mark_V._Shaney


I am glad someone has finally pointed that out.
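
(For anyone who missed the reference: Mark V Shaney's posts were
generated with a word-level Markov chain trained on the newsgroup's own
traffic, per the Wikipedia article above. A minimal sketch of the
technique in Python -- not the original program, and the corpus filename
below is a stand-in:

    import random
    from collections import defaultdict

    def build_chain(words, order=2):
        # Map each `order`-word prefix to every word seen to follow it.
        chain = defaultdict(list)
        for i in range(len(words) - order):
            chain[tuple(words[i:i + order])].append(words[i + order])
        return chain

    def generate(chain, length=50):
        # Walk the chain, sampling a random successor of the current prefix.
        prefix = random.choice(list(chain))
        out = list(prefix)
        for _ in range(length):
            successors = chain.get(prefix)
            if not successors:  # prefix only occurred at the end of the corpus
                break
            word = random.choice(successors)
            out.append(word)
            prefix = prefix[1:] + (word,)
        return " ".join(out)

    words = open("posts.txt").read().split()  # hypothetical training text
    print(generate(build_chain(words)))

Everything such a generator "knows" is the successor statistics of its
training text; how far that comparison stretches to LLMs is what the
rest of this thread argues about.)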



* [TUHS] Re: On the unreliability of LLM-based search results
       [not found]     ` <87frgqequk.fsf@gmail.com>
@ 2025-05-27  3:08       ` George Michaelson
  0 siblings, 0 replies; 5+ messages in thread
From: George Michaelson @ 2025-05-27  3:08 UTC (permalink / raw)
  To: Alexis; +Cc: tuhs


> We're way off topic. Warren should send a kill.

That said: please don't repeat the "hallucinate" label. It's
self-aggrandisement. It's deliberate, to foster the belief that "it's
like thinking".

It's not a hallucination, it's bad model data and bad constraint
programming. They're not thinking, or dreaming, or demanding not to be
turned off, or threatening or bullying. They're not Markov chains either,
but they're a damn sight closer to a machine than a mind.


G



* [TUHS] Mark V Shaney (Re: Re: On the unreliability of LLM-based search results (was: Listing of early Unix source code from the Computer History Museum))
       [not found]   ` <CAKzdPgzNBUiT4GeQUnBX38+3dtNY=2Gw=9mFUy2anMoO4DUECg@mail.gmail.com>
@ 2025-05-27 18:11     ` Dan Cross
  0 siblings, 0 replies; 5+ messages in thread
From: Dan Cross @ 2025-05-27 18:11 UTC (permalink / raw)
  To: Rob Pike; +Cc: TUHS

On Tue, May 27, 2025 at 2:44 AM Rob Pike <robpike@gmail.com> wrote:
> My name is Rob Pike and I approve this message.

Regarding Mark V. Shaney, I once heard a story that someone was so
upset about some of its posts on USENET that they drove to Bell Labs
to confront it, not realizing "he" was in fact a program, and not a
person. When they arrived at the Labs, someone explained what was what;
they went home, and that was the end of it.

But, in the version I heard, they drove from, like, Ohio...or
somewhere in the Midwest.  I find this very difficult to believe:
surely by the time they got to Harrisburg, perhaps, they'd have cooled
down enough to realize they were making poor life choices and should
turn around and go back home.

Is there any truth to this, or is the story (as I suspect) just apocryphal?

        - Dan C.


* [TUHS] Re: On the unreliability of LLM-based search results (was: Listing of early Unix source code from the Computer History Museum)
       [not found]         ` <CAO2qRdMHAUHdPj9odydp3c9YwfaaU2pZiR6nmNS8O3r=rjKfWw@mail.gmail.com>
@ 2025-05-31 22:47           ` Luther Johnson
  0 siblings, 0 replies; 5+ messages in thread
From: Luther Johnson @ 2025-05-31 22:47 UTC (permalink / raw)
  To: tuhs


I think we could call many of these responses "mis-ambiguation", or 
conflation: they mush everything together as long as the questions posed 
and the answers they provide are "buzzword-adjacent", in a very 
superficial, mechanical way. There's no intelligence here; it's just 
amazing how much we project onto these bots because we want to believe 
in them.

On 05/31/2025 03:36 PM, James Johnston wrote:
> Well, I have to say that my experiences with "AI based search" have 
> been beyond grossly annoying. It keeps trying to "help me" by sliding 
> in common terms it actually knows about instead of READING THE DAMN QUERY.
>
> I had much, much better experiences with very literal search methods, 
> and I'd like to go back to that when I'm looking for obscure papers, 
> names, etc.  Telling me "you mean" when I damn well DID NOT MEAN THAT 
> is a worst-case experience.
>
> Sorry, not so much a V11 experience here, but I have to say AI search 
> may serve the public, though only by guiding them back into boring, 
> middle-of-the-road, 'average mean-calculating' responses that simply 
> neither enlighten nor serve the original purpose of search.
>
> jj - a grumpy old signal processing/hearing guy who used a lot of real 
> operating systems back when and kind of misses them.
>
> On Sat, May 31, 2025 at 2:53 PM Luther Johnson 
> <luther.johnson@makerlisp.com> wrote:
>
>     I agree.
>
>     On 05/31/2025 01:09 PM, arnold@skeeve.com wrote:
>     > It's been going on for a long time, even before AI. The amount
>     > of cargo cult programming I've seen over the past ~ 10 years
>     > is extremely discouraging.  Look up something on Stack Overflow
>     > and copy/paste it without understanding it.  How much better is
>     > that than relying on AI?  Not much in my opinion.  (Boy, am I glad
>     > I retired recently.)
>     >
>     > Arnold
>     >
>     > Luther Johnson <luther.johnson@makerlisp.com> wrote:
>     >
>     >> I think when no one notices anymore how wrong automatic
>     >> information is, and how often, it will have effectively
>     >> redefined reality, and humans, who have lost the ability to
>     >> reason for themselves, will declare that AI has met and
>     >> exceeded human intelligence. They will be right, partly
>     >> because of AI's improvements, but to a larger extent because
>     >> we will have forgotten how to think. I think AI is having
>     >> disastrous effects on the education of younger generations
>     >> right now; I see it in my workplace, every day.
>     >>
>     >> On 05/31/2025 12:31 PM, andrew@humeweb.com wrote:
>     >>> generally, i rate norman’s missives very high on the
>     >>> believability scale. but in this case, i think he is wrong.
>     >>>
>     >>> if you take as a baseline the abilities of LLMs (such as
>     >>> earlier versions of ChatGPT) 2-3 years ago, they were quite
>     >>> suspect. certainly better than mark shaney, but not
>     >>> overwhelmingly.
>     >>>
>     >>> those days are long past. modern systems are amazingly adept.
>     >>> not necessarily intelligent, but they can (though not always)
>     >>> pass realistic tests, SAT tests and bar exams, math olympiad
>     >>> tests and so on. and people can use them to do basic (but
>     >>> realistic) data analysis including experimental design,
>     >>> generate working code, and run that code against synthetic
>     >>> data and produce visual output.
>     >>>
>     >>> sure, there are often mistakes. the issue of hallucinations is
>     >>> real. but where we are now is almost astonishing, and will
>     >>> likely get MUCH better in the next year or three.
>     >>>
>     >>> end-of-admonishment
>     >>>
>     >>>     andrew
>     >>>
>     >>>> On May 26, 2025, at 9:40 AM, Norman Wilson <norman@oclsc.org> wrote:
>     >>>>
>     >>>> G. Branden Robinson:
>     >>>>
>     >>>>    That's why I think Norman has sussed it out accurately.
>     >>>>    LLMs are fantastic bullshit generators in the Harry G.
>     >>>>    Frankfurt sense,[1] wherein utterances are undertaken
>     >>>>    neither to enlighten nor to deceive, but to construct a
>     >>>>    simulacrum of plausible discourse. BSing is a close
>     >>>>    cousin to filibustering, where even plausibility is
>     >>>>    discarded, often for the sake of running out a clock or
>     >>>>    impeding achievement of consensus.
>     >>>>
>     >>>> ====
>     >>>>
>     >>>> That's exactly what I had in mind.
>     >>>>
>     >>>> I think I had read Frankfurt's book before I first started
>     >>>> calling LLMs bullshit generators, but I can't remember for
>     >>>> sure.  I don't plan to ask ChatGPT (which still, at least
>     >>>> sometimes, credits me with far greater contributions to Unix
>     >>>> than I have actually made).
>     >>>>
>     >>>> Here's an interesting paper I stumbled across last week
>     >>>> which presents the case better than I could:
>     >>>>
>     >>>> https://link.springer.com/article/10.1007/s10676-024-09775-5
>     >>>>
>     >>>> To link this back to actual Unix history (or something much
>     >>>> nearer that), I realized that `bullshit generator' was a
>     >>>> reasonable summary of what LLMs do after also realizing that
>     >>>> an LLM is pretty much just a much-fancier and better-automated
>     >>>> descendant of Mark V Shaney:
>     >>>> https://en.wikipedia.org/wiki/Mark_V._Shaney
>     >>>>
>     >>>> Norman Wilson
>     >>>> Toronto ON
>
> -- 
> James D. (jj) Johnston
>
> Former Chief Scientist, Immersion Networks




end of thread

Thread overview: 5+ messages
     [not found] <F7093F5EDCBB735E2C7C473314D40D5A.for-standards-violators@oclsc.org>
     [not found] ` <CAEdTPBeFUcxAZWn1=mZnwTmF2a3DN-1GnXXB6WmV5gaqZHz1Lw@mail.gmail.com>
     [not found]   ` <769a9c94-055d-4bdd-a921-3e154c3b492f@infinitecactus.com>
     [not found]     ` <CAEdTPBcr2ajyAQh24LPtiQLBjfe2G2MYwoq8x_3bt6TzOT1_BA@mail.gmail.com>
2025-05-26 18:45       ` [TUHS] Wikipedia anecdotes - LLM generalizations [was On the unreliability of LLM-based search results (was: Listing of early Unix source code from the Computer History Museum) Charles H Sauer (he/him)
     [not found]     ` <87frgqequk.fsf@gmail.com>
2025-05-27  3:08       ` [TUHS] Re: On the unreliability of LLM-based search results George Michaelson
2025-05-26 20:36 ` [TUHS] Re: On the unreliability of LLM-based search results (was: Listing of early Unix source code from the Computer History Museum) Noel Hunt
     [not found]   ` <CAKzdPgzNBUiT4GeQUnBX38+3dtNY=2Gw=9mFUy2anMoO4DUECg@mail.gmail.com>
2025-05-27 18:11     ` [TUHS] Mark V Shaney (Re: Re: On the unreliability of LLM-based search results (was: Listing of early Unix source code from the Computer History Museum)) Dan Cross
     [not found] ` <DEBB648F-52A0-4E52-AC26-E2067FE7E0CD@humeweb.com>
     [not found]   ` <3e4339e9-bf9a-2b72-b47a-f20f81a153b5@makerlisp.com>
     [not found]     ` <202505312009.54VK97bQ4163488@freefriends.org>
     [not found]       ` <0adb7694-f99f-dafa-c906-d5502647aaf0@makerlisp.com>
     [not found]         ` <CAO2qRdMHAUHdPj9odydp3c9YwfaaU2pZiR6nmNS8O3r=rjKfWw@mail.gmail.com>
2025-05-31 22:47           ` [TUHS] Re: On the unreliability of LLM-based search results (was: Listing of early Unix source code from the Computer History Museum) Luther Johnson
