The Unix Heritage Society mailing list
From: Lawrence Stewart <stewart@serissa.com>
To: Clem Cole <clemc@ccc.com>
Cc: The Eunuchs Hysterical Society <tuhs@tuhs.org>
Subject: Re: [TUHS] Archeology: AberMUD, BCPL, ec.
Date: Thu, 31 Jan 2019 14:45:10 -0500	[thread overview]
Message-ID: <AE5F90B7-8D06-46DA-9BE1-F69B8AF498A6@serissa.com> (raw)
In-Reply-To: <7C327ED7-E712-475A-8F7D-EDEBCD529255@serissa.com>

A followup from TV Raman, now at Google:

> We also did an intern project -- Tom's intern who became my intern after
> Tom left (Arjen De Vries) where we did:
> 
> 1. Converted the caption stream into an SGML document indexed by time --
> so the caption stream came down in dribs and drabs of the form "turn
> background yellow, foreground white, place this text"... that turned
> into the SGML document, with each element tagged with time.
> 
> 2. We then indexed that collection of SGML documents -- the content
> stream was Tom's ring-buffer of the CNN live feed (6 hours was what we
> stored, from memory)
> 
> 3. We then built a simple-minded search engine over the SGML documents,
> used the CRL reco engine for getting user queries -- you could also just
> type the query at a search box; did the search over the
> caption-doc-index, found the time-stamp and played the video.
> 
> Arjen may have published some of this as his final year Masters project
> out of the University Of Twente -- likely summer 1995.
> -- 
> Id: kg:/m/0285kf1
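
The three steps above can be sketched roughly as follows. This is only an illustration, not the CRL code: it assumes the caption stream has already been reduced to (timestamp, text) pairs, it uses plain tuples where the original used SGML documents, and the class name CaptionIndex and its methods are made up for the example. The speech-recognition front end and the video playback are left out; a search simply returns the timestamps one would seek to.

```python
import re
from collections import defaultdict, deque


class CaptionIndex:
    """Rolling, time-indexed store of caption text with a word -> timestamps index.

    Hypothetical sketch: names and structure are assumptions, not the original design.
    """

    def __init__(self, window_seconds=6 * 3600):
        self.window = window_seconds       # how much history to keep (6 hours by default)
        self.entries = deque()             # (timestamp, text), oldest first
        self.index = defaultdict(set)      # word -> set of timestamps containing it

    @staticmethod
    def _words(text):
        # Lowercase and split into simple word tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def add(self, timestamp, text):
        """Store one caption fragment and index its words; timestamps must not decrease."""
        self.entries.append((timestamp, text))
        for word in self._words(text):
            self.index[word].add(timestamp)
        self._expire(timestamp)

    def _expire(self, now):
        # Drop fragments that have fallen out of the rolling window.
        while self.entries and self.entries[0][0] < now - self.window:
            old_ts, old_text = self.entries.popleft()
            for word in set(self._words(old_text)):
                self.index[word].discard(old_ts)
                if not self.index[word]:
                    del self.index[word]

    def search(self, query):
        """Return sorted timestamps of fragments containing every query word."""
        words = self._words(query)
        if not words:
            return []
        hits = set(self.index.get(words[0], set()))
        for word in words[1:]:
            hits &= self.index.get(word, set())
        return sorted(hits)
```

A caller would feed fragments in as they arrive with add(timestamp, text) and pass each result of search(query) to whatever plays the stored video from that point.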

I searched for Arjen De Vries and found

https://pdfs.semanticscholar.org/fb10/b792fb209e0d347cd14430fbb446c1b178f3.pdf
“Radio and Television Information Filtering through Speech Recognition”
which in turn cites his Master’s thesis from 1995.



> On 2019, Jan 31, at 2:34 PM, Lawrence Stewart <stewart@serissa.com> wrote:
> 
> I was at CRL from 1989 to 1994.  I sent an inquiry to our informal mailing list.
> 
> We had written an audio server along the lines of the X server (http://www.hpl.hp.com/techreports/Compaq-DEC/CRL-93-8.pdf) and Tom Levergood wrote an application called Store24 to keep a rolling 24-hour history of WBUR (local NPR station).  We thought about using speech recognition to build a searchable index for it.
> 
> The next idea was to do the same thing for video, perhaps using the closed captioning feed to develop the index.  Dave Wecker (now at Microsoft Research) reports working on a system that extracted data from NPR news streams and would find the appropriate audio or video clip.  He’s not sure he published that.
> 
> Jim Gettys cites http://www.hpl.hp.com/techreports/Compaq-DEC/CRL-99-2.pdf (Indexing Multimedia for the Internet) and notes that all the DEC techreports are hidden away at http://www.hpl.hp.com/techreports/. Choose “Browse by year” and select Compaq/DEC.
> 
> -Larry
> 
>> On 2019, Jan 31, at 9:42 AM, Clem Cole <clemc@ccc.com> wrote:
>> 
>> I'm not sure if the old DEC CRL tech reports are still around.  At one time before the Compaq-tion, some folks at CRL and the folks at Boston Public Library and WGBH were working with video and trying to extract all sorts of text from it.  I do not remember how successful they were, but there might be some hints in their tech reports.  I'll ask around and see if I can turn anything up.  Part of the problem I have is that I don't remember who was doing that work, but some of my friends might.
>> 
>> Clem
>> 
>> On Thu, Jan 31, 2019 at 2:16 AM Alec Muffett <alec.muffett@gmail.com> wrote:
>> Has anyone ever attempted to OCR a video, perhaps by breaking into frames and then aggregating the results, using multiple frames to correct each other?
>> 
>> On Wed, 30 Jan 2019, 19:51 Richard Salz <rich.salz@gmail.com> wrote:
>> Some folks are trying to figure out how to get AberMUD source online and working; see https://twitter.com/larsbrinkhoff/status/1056823314272960512
>> 
>> Sample code at https://raw.githubusercontent.com/larsbrinkhoff/abermud/master/abermud1/text/timelock.b
>> 
>> 
>> 
> 
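
On Alec Muffett's question above about OCRing a video and letting multiple frames correct each other: one simple way to aggregate is a per-character majority vote over several readings of the same line of text. The sketch below is an assumption-laden illustration, not something anyone in the thread describes having built. It takes the per-frame OCR output as plain strings (the OCR step itself is not shown), the function name vote_lines is invented for the example, and it only compares readings of the most common length rather than doing real alignment.

```python
from collections import Counter


def vote_lines(readings):
    """Combine several OCR readings of one text line by per-character majority vote.

    Hypothetical helper: readings whose length differs from the most common
    length are ignored, because they cannot be compared position by position.
    """
    if not readings:
        return ""
    # Pick the length that most readings agree on.
    common_len = Counter(len(r) for r in readings).most_common(1)[0][0]
    aligned = [r for r in readings if len(r) == common_len]
    # For each character position, keep the character seen most often.
    return "".join(
        Counter(chars).most_common(1)[0][0] for chars in zip(*aligned)
    )
```

For example, if three frames read a line as "PRINT 'HELLO'", "PR1NT 'HELLO'" and "PRINT 'HELL0'", the vote restores "PRINT 'HELLO'" because each misread character is outvoted by the other two frames.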

