* [TUHS] [groff] The hyphenation algorithm produces wrong results
From: Doug McIlroy @ 2018-03-04 20:23 UTC

I hadn't realized that groff hyphenation had been taken from
TeX, not troff. Is that because TeX did a better job, or
because troff's was deemed proprietary?

* [TUHS] [groff] The hyphenation algorithm produces wrong results
From: Clem Cole @ 2018-03-04 20:42 UTC

On Sun, Mar 4, 2018 at 3:23 PM, Doug McIlroy <doug at cs.dartmouth.edu> wrote:
> I hadn't realized that groff hyphenation had been taken from
> TeX, not troff. Is that because TeX did a better job, or
> because troff's was deemed proprietary?

Given the author, I would guess the latter, as he wanted to be FOSS and
would not have looked at the ditroff source - but that guess is worth
just that ;-)

* [TUHS] [groff] The hyphenation algorithm produces wrong results
From: Bakul Shah @ 2018-03-04 21:00 UTC

> On Mar 4, 2018, at 12:42 PM, Clem Cole <clemc at ccc.com> wrote:
>
>> On Sun, Mar 4, 2018 at 3:23 PM, Doug McIlroy <doug at cs.dartmouth.edu> wrote:
>>
>> I hadn't realized that groff hyphenation had been taken from
>> TeX, not troff. Is that because TeX did a better job, or
>> because troff's was deemed proprietary?
>
> Given the author, I would guess the latter, as he wanted to be FOSS and
> would not have looked at the ditroff source - but that guess is worth
> just that ;-)

I remember reading about Knuth's line-breaking algorithm in Software
Practice & Experience in the early eighties and being quite impressed
with it. So maybe that clear description of the algorithm has something
to do with it? Ah, here it is:

“Breaking Paragraphs into Lines” by Donald E. Knuth & Michael F. Plass,
SP&E, Volume 11, Issue 11, Nov. 1981

(Download from Wiley is not free)

* [TUHS] [groff] The hyphenation algorithm produces wrong results
From: Toby Thain @ 2018-03-04 21:32 UTC

On 2018-03-04 4:00 PM, Bakul Shah wrote:
> On Mar 4, 2018, at 12:42 PM, Clem Cole <clemc at ccc.com> wrote:
>
>> On Sun, Mar 4, 2018 at 3:23 PM, Doug McIlroy <doug at cs.dartmouth.edu> wrote:
>>
>>> I hadn't realized that groff hyphenation had been taken from
>>> TeX, not troff. Is that because TeX did a better job, or
>>> because troff's was deemed proprietary?
>>
>> Given the author, I would guess the latter, as he wanted to be FOSS and
>> would not have looked at the ditroff source - but that guess is worth
>> just that ;-)
>
> I remember reading about Knuth's line-breaking algorithm in
> Software Practice & Experience in the early eighties and being quite
> impressed with it. So maybe that clear description of the algorithm
> has something to do with it? Ah, here it is:
>
> “Breaking Paragraphs into Lines” by Donald E. Knuth & Michael F. Plass,
> SP&E, Volume 11, Issue 11, Nov. 1981

That's the line breaker, which is an important contributor to the
quality of TeX output. But TeX's *hyphenation* algorithm per se was
invented by Franklin Mark Liang and was indeed considerably better than
its predecessors and competitors (including most or all commercial
typesetting software -- which was a big part of the motivation for it):

https://tug.org/docs/liang/liang-thesis.pdf

--Toby

> (Download from Wiley is not free)

* [TUHS] [groff] The hyphenation algorithm produces wrong results
From: Ralph Corderoy @ 2018-03-04 21:50 UTC

Hi Doug,

Bakul wrote:
> I remember reading about Knuth's line-breaking algorithm in Software
> Practice & Experience in the early eighties and being quite impressed
> with it. So maybe that clear description of the algorithm has
> something to do with it? Ah, here it is:
>
> “Breaking Paragraphs into Lines” by Donald E. Knuth & Michael F. Plass,
> SP&E, Volume 11, Issue 11, Nov. 1981

That's more to do with TeX looking at the whole paragraph when deciding
where to split lines. Hyphenation is part of that because a word might
help out by being the ideal thing to split and have the rest of the
lines sit easily in their length, but TeX's hyphenation algorithm is
distinct again.

Ted Harding gave some background on the groff list back in 2001,
https://lists.gnu.org/archive/html/groff/2001-03/msg00026.html
but I expect groff used TeX's algorithm because it was published, could
handle multiple languages, e.g. hyphen.us, and the data files were
available to contort into what groff ended up using in its simplified
TeX algorithm.

    $ cd /usr/share/groff/1.22.3/tmac
    $ ls hyphen*
    hyphen.den hyphenex.cs hyphenex.us hyphen.sv hyphen.us
    hyphen.cs hyphen.det hyphenex.de hyphen.fr
    $

They have comments explaining their content.

Werner Lemberg on the groff list probably knows for certain, as he had
to fathom all this out before becoming groff's excellent maintainer for
many years.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy

* [TUHS] [groff] The hyphenation algorithm produces wrong results
From: Bakul Shah @ 2018-03-04 22:36 UTC

> On Mar 4, 2018, at 1:50 PM, Ralph Corderoy <ralph at inputplus.co.uk> wrote:
>
> Hi Doug,
>
> Bakul wrote:
>> I remember reading about Knuth's line-breaking algorithm in Software
>> Practice & Experience in the early eighties and being quite impressed
>> with it. So maybe that clear description of the algorithm has
>> something to do with it? Ah, here it is:
>>
>> “Breaking Paragraphs into Lines” by Donald E. Knuth & Michael F. Plass,
>> SP&E, Volume 11, Issue 11, Nov. 1981
>
> That's more to do with TeX looking at the whole paragraph when deciding
> where to split lines. Hyphenation is part of that because a word might
> help out by being the ideal thing to split and have the rest of the
> lines sit easily in their length, but TeX's hyphenation algorithm is
> distinct again.
>
> Ted Harding gave some background on the groff list back in 2001,
> https://lists.gnu.org/archive/html/groff/2001-03/msg00026.html
> but I expect groff used TeX's algorithm because it was published, could
> handle multiple languages, e.g. hyphen.us, and the data files were
> available to contort into what groff ended up using in its simplified
> TeX algorithm.
>
>     $ cd /usr/share/groff/1.22.3/tmac
>     $ ls hyphen*
>     hyphen.den hyphenex.cs hyphenex.us hyphen.sv hyphen.us
>     hyphen.cs hyphen.det hyphenex.de hyphen.fr
>     $
>
> They have comments explaining their content.
>
> Werner Lemberg on the groff list probably knows for certain, as he had
> to fathom all this out before becoming groff's excellent maintainer for
> many years.
>
> --
> Cheers, Ralph.
> https://plus.google.com/+RalphCorderoy

Thanks Ralph and Toby.

“Because it was clearly described and published” was the point I was
trying to make, and I should’ve stopped there :). The SP&E article had
made a strong impression on me, and that is what I instantly thought of.