The Unix Heritage Society mailing list
From: Tomasz Rola <rtomek@ceti.pl>
To: tuhs@minnie.tuhs.org
Subject: Re: [TUHS] The most surprising Unix programs
Date: Thu, 19 Mar 2020 22:18:33 +0100
Message-ID: <20200319211833.GD16996@tau1.ceti.pl>
In-Reply-To: <CMM.0.95.0.1584651479.beebe@gamma.math.utah.edu>

On Thu, Mar 19, 2020 at 02:57:59PM -0600, Nelson H. F. Beebe wrote:
[...]
> 
> If you want to tackle raw HTML from arbitrary sources, then I agree with
> you: most HTML on the Web is not grammar conformant, there are
> numerous vendor extensions, and the HTML is hideously idiosyncratic
> and irregularly formatted.
> 
> The solution that I adopted 25 years ago was to write a
> grammar-recognizing, but violation-lenient, prettyprinter for HTML.  It
> has served well and I use it many times daily for my work in the BibNet
> Project and TeX Users Group bibliography archives, now approaching 1.55
> million entries.  The latest public release is available here:
> 
> 	http://www.math.utah.edu/pub/sgml/

Thank you, I will have a longer look at those archives. My plan so far
was to explore HTML files with CL and SLIME (interactive mode for CL
inside Emacs), which would let me actually find out what I want
to be looking for - well, hopefully :-).

-- 
Regards,
Tomasz Rola

--
** A C programmer asked whether computer had Buddha's nature.      **
** As the answer, master did "rm -rif" on the programmer's home    **
** directory. And then the C programmer became enlightened...      **
**                                                                 **
** Tomasz Rola          mailto:tomasz_rola@bigfoot.com             **
