From: "Charles H Sauer (he/him)" <sauer@technologists.com>
To: COFF <coff@tuhs.org>
Cc: The Eunuchs Hysterical Society <tuhs@tuhs.org>
Subject: [TUHS] Wikipedia anecdotes - LLM generalizations [was On the unreliability of LLM-based search results (was: Listing of early Unix source code from the Computer History Museum)]
Date: Mon, 26 May 2025 13:45:44 -0500
Message-ID: <71936198-93c1-41bc-8a5f-41d95969da0c@technologists.com>
In-Reply-To: <CAEdTPBcr2ajyAQh24LPtiQLBjfe2G2MYwoq8x_3bt6TzOT1_BA@mail.gmail.com>
TUHS->COFF
> > It's like Wikipedia.
>
> No, Wikipedia has (at least historically) human editors who
> supposedly have some knowledge of reality and history.
>
> An LLM response is going to be a series of tokens predicted based on
> probabilities from its training data. ...
>
> Assuming the sources it cites are real works, it seems fine as a
> search engine, but the text that it outputs should absolutely not be
> thought of as something arrived at by similar means as text produced
> by supposedly knowledgeable and well-intentioned humans.
>
> An LLM can weigh sources, but it has to be taught to do that. A human
> can weigh sources, but it has to be taught to do that.
Before LLMs, Wikipedia, the World Wide Web, ... adages such as "Trust, but
verify," and "Inspect what you expect," were appropriate, and they still are.
Dabbling in editing and creating Wikipedia articles has reinforced those
notions. A few anecdotes here -- I could cite others.
1. I think my first experience was trying in 2008 to fix what is now at
https://en.wikipedia.org/wiki/Vulcan_Gas_Company_(1967%E2%80%931970),
because the article had so much erroneous content, and because I had
worked/performed at that venue 1969-70. Much of what I did in 2008 was
accepted without anyone else verifying it. But others later broke or changed
things, and even renamed the original article and replaced it with an
article about a newer club that adopted the name. A few years ago, I
tried to make corrections, citing poster images at
https://concerts.fandom.com/wiki/Vulcan_Gas_Company. Those changes were
vetoed because fandom.com was considered unreliable. I copied the images
from fandom to https://technologists.com/VGC/, and citing those
images was then accepted by the editors involved. (The article has been
changed dramatically and is still seriously deficient, IMO, but I'm not
interested in fixing it.)
2. Last year, I created https://en.wikipedia.org/wiki/Hub_City_Movers,
citing sources I considered reliable. Citations to images at discogs.com
were vetoed as unreliable, based on an analogous bias against that site.
Partly to see what was possible, I engaged with editors, found citations
they found acceptable, and ultimately produced a better article.
3. Later last year, I edited https://en.wikipedia.org/wiki/IBM_AIX to
fix obviously erroneous discussion of AIX 1/2/3. Even though I used my
own writings as references, the changes were accepted.
I still use the Web, Wikipedia, and even LLMs, but cautiously.
Charlie
--
voice: +1.512.784.7526 e-mail: sauer@technologists.com
fax: +1.512.346.5240 Web: https://technologists.com/sauer/
Facebook/Google/LinkedIn/mas.to: CharlesHSauer