From: bakul@bitblocks.com (Bakul Shah)
Subject: [TUHS] [groff] The hyphenation algorithm produces wrong results
Date: Sun, 4 Mar 2018 14:36:58 -0800
Message-ID: <D39FBC5F-28BB-43F6-9B3B-7E87F816DC8A@bitblocks.com>
In-Reply-To: <20180304215023.883981F96E@orac.inputplus.co.uk>
> On Mar 4, 2018, at 1:50 PM, Ralph Corderoy <ralph at inputplus.co.uk> wrote:
>
> Hi Doug,
>
> Bakul wrote:
>> I remembered reading about Knuth's line-breaking algorithm in Software
>> Practice & Experience in the early eighties and being quite impressed
>> with it. So maybe that clear description of the algorithm has something
>> to do with it? Ah, here it is:
>>
>> “Breaking Paragraphs into Lines” by Donald Knuth & Michael Plass, SP&E,
>> Volume 11, Issue 11, Nov. 1981
>
> That's more to do with TeX looking at the whole paragraph when deciding
> where to split lines. Hyphenation is part of that because a word might
> help out by being the ideal thing to split and have the rest of the
> lines sit easily in their length, but TeX's hyphenation algorithm is
> distinct again.
>
> Ted Harding gives some background on the groff list back in 2001,
> https://lists.gnu.org/archive/html/groff/2001-03/msg00026.html
> but I expect groff used TeX's algorithm because it was published, could
> handle multiple languages, e.g. hyphen.us, and the data files were
> available to contort into what groff ended up using in its simplified
> TeX algorithm.
>
> $ cd /usr/share/groff/1.22.3/tmac
> $ ls hyphen*
> hyphen.den hyphenex.cs hyphenex.us hyphen.sv hyphen.us
> hyphen.cs hyphen.det hyphenex.de hyphen.fr
> $
>
> They've comments explaining their content.
>
> Werner Lemberg on the groff list probably knows for certain as he had to
> fathom all this out before becoming groff's excellent maintainer for
> many years.
>
> --
> Cheers, Ralph.
> https://plus.google.com/+RalphCorderoy
Thanks Ralph and Toby. “Because it was clearly described and published”
was the point I was trying to make, and I should’ve stopped there :). The
SP&E article had made a strong impression on me, and that is what I
instantly thought of.
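
To illustrate Ralph's point that the SP&E paper is about choosing breaks
with the whole paragraph in view, here is a minimal sketch. It is not the
Knuth-Plass algorithm itself: there is no glue, no penalties and no
hyphenation, and the cost (squared trailing space on every line but the
last) is an assumption chosen for brevity. The names break_lines and
greedy_lines are hypothetical.

```python
import math


def greedy_lines(words, width):
    """First-fit: put each word on the current line if it still fits."""
    lines, cur = [], ""
    for w in words:
        if cur and len(cur) + 1 + len(w) > width:
            lines.append(cur)
            cur = w
        else:
            cur = w if not cur else cur + " " + w
    if cur:
        lines.append(cur)
    return lines


def break_lines(words, width):
    """Whole-paragraph fit: minimize the sum of squared trailing spaces.

    The last line costs nothing. A word longer than `width` is placed
    on a line of its own (an assumption of this sketch).
    """
    n = len(words)
    best = [math.inf] * (n + 1)  # best[i]: minimal cost of setting words[i:]
    nxt = [n] * (n + 1)          # nxt[i]: index of the word starting the next line
    best[n] = 0
    for i in range(n - 1, -1, -1):
        length = -1  # cancels the space counted before the first word
        for j in range(i, n):
            length += 1 + len(words[j])
            if length > width and j > i:
                break
            cost = 0 if j == n - 1 else (width - length) ** 2
            if cost + best[j + 1] < best[i]:
                best[i] = cost + best[j + 1]
                nxt[i] = j + 1
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines
```

With the words "aaa bb cc ddddd" and a width of 6, first-fit fills the
first line and strands "cc" on the second, while the whole-paragraph
version leaves the first line short so that the second sits better.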