The Unix Heritage Society mailing list
From: bakul@bitblocks.com (Bakul Shah)
Subject: [TUHS] [groff] The hyphenation algorithm produces wrong results
Date: Sun, 4 Mar 2018 14:36:58 -0800
Message-ID: <D39FBC5F-28BB-43F6-9B3B-7E87F816DC8A@bitblocks.com>
In-Reply-To: <20180304215023.883981F96E@orac.inputplus.co.uk>

> On Mar 4, 2018, at 1:50 PM, Ralph Corderoy <ralph at inputplus.co.uk> wrote:
> 
> Hi Doug,
> 
> Bakul wrote:
>> I remembered reading about Knuth's line-breaking algorithm in Software
>> Practice & Experience in early eighties and being quite impressed with
>> it.  So maybe that clear description of the algorithm has something
>> to do with it?  Ah, here it is:
>> 
>> “Breaking Paragraphs into Lines” by Donald Knuth & Michael Plass,
>> SP&E, Volume 11, issue 11, Nov. 1981
> 
> That's more to do with TeX looking at the whole paragraph when deciding
> where to split lines.  Hyphenation is part of that because a word might
> help out by being the ideal thing to split and have the rest of the
> lines sit easily in their length, but TeX's hyphenation algorithm is
> distinct again.
> 
> Ted Harding gives some background on the groff list back in 2001,
> https://lists.gnu.org/archive/html/groff/2001-03/msg00026.html
> but I expect groff used TeX's algorithm because it was published, could
> handle multiple languages, e.g. hyphen.us, and the data files were
> available to contort into what groff ended up using in its simplified
> TeX algorithm.
> 
>    $ cd /usr/share/groff/1.22.3/tmac
>    $ ls hyphen*
>    hyphen.den  hyphenex.cs hyphenex.us hyphen.sv   hyphen.us
>    hyphen.cs   hyphen.det  hyphenex.de hyphen.fr
>    $
> 
> They've comments explaining their content.
> 
> Werner Lemberg on the groff list probably knows for certain as he had to
> fathom all this out before becoming groff's excellent maintainer for
> many years.
> 
> -- 
> Cheers, Ralph.
> https://plus.google.com/+RalphCorderoy

Thanks, Ralph and Toby. “Because it was clearly described and published”
was the point I was trying to make, and I should’ve stopped there :). The
SP&E article made a strong impression on me, and it is what I instantly
thought of.
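
To make Ralph's point about TeX "looking at the whole paragraph" concrete,
here is a minimal Python sketch contrasting first-fit (greedy) line filling
with a whole-paragraph optimization. It is only an illustration under loud
assumptions: the function names break_lines and greedy_lines are
hypothetical, the cost is simply the squared trailing space on every line
except the last, and there is no hyphenation, glue, or penalty model, so it
is a simplified stand-in for the idea rather than the Knuth & Plass
algorithm or anything groff or TeX actually runs.

```python
def greedy_lines(words, width):
    """First-fit filling: put each word on the current line if it fits."""
    lines, current = [], ""
    for w in words:
        candidate = w if not current else current + " " + w
        if len(candidate) <= width:
            current = candidate
        else:
            if current:
                lines.append(current)
            current = w
    if current:
        lines.append(current)
    return lines


def break_lines(words, width):
    """Choose breaks for the whole paragraph at once.

    Assumed cost model (not TeX's): squared unused space on each line,
    with the last line free. Solved by dynamic programming from the end.
    """
    if any(len(w) > width for w in words):
        raise ValueError("a word is longer than the line width")
    n = len(words)
    best = [float("inf")] * (n + 1)  # best[i]: minimal cost of setting words[i:]
    nxt = [n] * (n + 1)              # nxt[i]: index where the line starting at i ends
    best[n] = 0
    for i in range(n - 1, -1, -1):
        length = -1  # length of words[i:j] joined by single spaces
        for j in range(i + 1, n + 1):
            length += len(words[j - 1]) + 1
            if length > width:
                break
            cost = 0 if j == n else (width - length) ** 2
            if cost + best[j] < best[i]:
                best[i] = cost + best[j]
                nxt[i] = j
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines


words = "aaa bb cc ddddd".split()
print(greedy_lines(words, 6))  # ['aaa bb', 'cc', 'ddddd']
print(break_lines(words, 6))   # ['aaa', 'bb cc', 'ddddd']
```

With a width of 6, the greedy filler packs the first line full and leaves
"cc" stranded on a nearly empty second line, while the whole-paragraph
version accepts a shorter first line so the lines overall are more even.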


Thread overview: 6+ messages
2018-03-04 20:23 Doug McIlroy
2018-03-04 20:42 ` Clem Cole
2018-03-04 21:00   ` Bakul Shah
2018-03-04 21:32     ` Toby Thain
2018-03-04 21:50     ` Ralph Corderoy
2018-03-04 22:36       ` Bakul Shah [this message]
