ntg-context - mailing list for ConTeXt users
* ConTeXt german hyphenation
@ 1999-10-17 13:04 Peter Willadt
  1999-10-17 13:39 ` Tobias Burnus
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Peter Willadt @ 1999-10-17 13:04 UTC (permalink / raw)


Hello,

when hyphenating German texts, ConTeXt seems to fail on words containing
umlauts (äöü). With LaTeX, hyphenation works. Somehow this reminds
me of old-time TeX. (I know that accented letters introduce explicit kerns
and that TeX never hyphenates at an explicit kern, but I thought that,
after all, I type an eight-bit character and an eight-bit character is
typeset.)

Here is an example (have I done something wrong in my setup?):

\starttext
\useencoding[win]
\setupoutput[pdftex]
\setupbodyfont[ber, ptm]
\mainlanguage[de]
\de
\showhyphens{Altenpflegeschülerinnen}
\stoptext

TeX then says Al-ten-pfle-geschülerinnen

With LaTeX, TeX says Al-ten-pfle-gesch[]ule-rin-nen, which is much
better: it misses only two possible breaks. Since I assume that ConTeXt uses
the same hyphenation patterns as LaTeX, this is quite astonishing to
me.

Peter Willadt



* Re: ConTeXt german hyphenation
  1999-10-17 13:04 ConTeXt german hyphenation Peter Willadt
@ 1999-10-17 13:39 ` Tobias Burnus
  1999-10-17 15:02   ` Peter Willadt
  1999-10-17 22:45 ` Hans Hagen
  1999-10-30 18:21 ` non-hyphenation Frans Goddijn
  2 siblings, 1 reply; 9+ messages in thread
From: Tobias Burnus @ 1999-10-17 13:39 UTC (permalink / raw)
  Cc: ntg-context

Hallo Peter,

> when hyphenating German texts, ConTeXt seems to fail on words containing
> umlauts (äöü). With LaTeX, hyphenation works. Somehow this reminds
> me of old-time TeX. (I know that accented letters introduce explicit kerns
> and that TeX never hyphenates at an explicit kern, but I thought that,
> after all, I type an eight-bit character and an eight-bit character is
> typeset.)
>
> With LaTeX, TeX says Al-ten-pfle-gesch[]ule-rin-nen, which is much
> better: it misses only two possible breaks. Since I assume that ConTeXt uses
> the same hyphenation patterns as LaTeX, this is quite astonishing to
> me.
(Hans, Taco, correct me if I say nonsense!)

I think TeX doesn't always hyphenate correctly in LaTeX when hyphenation is
allowed in words containing accented characters (such as ü = dieresis + u),
and here two different philosophies become apparent. LaTeX says: better to be
mostly correct; fine-tuning is possible anyway. ConTeXt says: better no
hyphenation than wrong hyphenation, so that the user deals with the resulting
overfull box correctly. I think both ideas are valid.

In order to use ä as a letter and not as an accent placed on a letter, you need
a font that contains this letter. This can be a virtual font (such as ae, which
maps ä to CM's "a) or a real one, such as the frequently used LaTeX EC fonts
(which have no Type 1 equivalent). Using CM fonts directly, it is impossible to
hyphenate accented words correctly, regardless of whether you type \"a, "a or ä.
(I don't know to what extent this is true for PostScript fonts or EC, or for ae
virtual fonts under ConTeXt.)

Regards, groetjes, Gruß,

Tobias
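
[Editorial illustration, not part of the original message.] The font point
above can be tried from the LaTeX side. This is only a minimal sketch, assuming
a LaTeX format that has German patterns loaded and the babel package installed.
With the T1 (EC) font encoding, \"u is mapped to a single glyph, so TeX sees an
ordinary letter rather than an accent construction:

\documentclass{article}
\usepackage[T1]{fontenc}   % fonts in T1 encoding contain ü as one glyph
\usepackage[german]{babel} % assumption: German patterns are in the format
\begin{document}
% \showhyphens writes the break points TeX finds to the log file
\showhyphens{Altenpflegesch\"ulerinnen}
\end{document}

Comparing the log with and without the fontenc line should show whether break
points after the umlaut are found.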



* Re: ConTeXt german hyphenation
  1999-10-17 13:39 ` Tobias Burnus
@ 1999-10-17 15:02   ` Peter Willadt
  1999-10-18  0:12     ` Hans Hagen
  1999-10-18  9:06     ` Taco Hoekwater
  0 siblings, 2 replies; 9+ messages in thread
From: Peter Willadt @ 1999-10-17 15:02 UTC (permalink / raw)
  Cc: ntg-context

Tobias Burnus wrote:
> 
> Hallo Peter,
> 
> 
> I think TeX doesn't always hyphenate correctly in LaTeX when hyphenation is
> allowed in words containing accented characters (such as ü = dieresis + u)
Hello,

Former versions of german.sty (before TeX 3) used a dirty trick: they fooled
TeX into treating every umlaut as the start of a new word. This yielded
more hyphenation points than leaving such words unbroken, but of course
they were not always correct.
With TeX 3 came 8-bit support, and the patterns as well as german.sty
were changed to conform to this new situation.

As far as I know, incorrect hyphenation comes mostly from overly long
compound words. Hyphenation of words with umlauts appears correct to
me.

> I think both ideas are valid. In order to use ä as a letter and not as an accent
> placed on a letter, you need a font that contains this letter. This can be a
> virtual font (such as ae, which maps ä to CM's "a) or a real one, such as the
> frequently used LaTeX EC fonts (which have no Type 1 equivalent). Using CM fonts
> directly, it is impossible to hyphenate accented words correctly, regardless of
> whether you type \"a, "a or ä.
> (I don't know to what extent this is true for PostScript fonts or EC, or for ae
> virtual fonts under ConTeXt.)
> 
Well, I used eight-bit Times in the example: \setupbodyfont[ber,
ptm].

To me it looks like ConTeXt uses the hyphenation patterns in the plain
old TeX 2 way. Too bad I'm not enough of an expert to really figure that
out.

Peter Willadt



* Re: ConTeXt german hyphenation
  1999-10-17 13:04 ConTeXt german hyphenation Peter Willadt
  1999-10-17 13:39 ` Tobias Burnus
@ 1999-10-17 22:45 ` Hans Hagen
  1999-10-30 18:21 ` non-hyphenation Frans Goddijn
  2 siblings, 0 replies; 9+ messages in thread
From: Hans Hagen @ 1999-10-17 22:45 UTC (permalink / raw)
  Cc: ntg-context

Peter Willadt wrote:

> when hyphenating German texts, ConTeXt seems to fail on words containing
> umlauts (äöü). With LaTeX, hyphenation works. Somehow this reminds
> me of old-time TeX. (I know that accented letters introduce explicit kerns
> and that TeX never hyphenates at an explicit kern, but I thought that,
> after all, I type an eight-bit character and an eight-bit character is
> typeset.)
> 
> Here is an example (have I done something wrong in my setup?):
> 
> \starttext
> \useencoding[win]
> \setupoutput[pdftex]
> \setupbodyfont[ber, ptm]
> \mainlanguage[de]
> \de
> \showhyphens{Altenpflegeschülerinnen}
> \stoptext
> 
> TeX then says Al-ten-pfle-geschülerinnen
> 
> With LaTeX, TeX says Al-ten-pfle-gesch[]ule-rin-nen, which is much
> better: it misses only two possible breaks. Since I assume that ConTeXt uses
> the same hyphenation patterns as LaTeX, this is quite astonishing to
> me.

Some remarks in addition to what Tobias already responded:

If I'm right, the [] in the LaTeX example shows that no eight-bit character
is used at all; it indicates that something boxed is present there.

I just ran (windows encoding, so therefore \useencoding[ibm]): 

\useencoding[ibm]
\setupbodyfont[ptm]
\mainlanguage[de]
\de
\starttext
\showhyphens{Altenpflegeschülerinnen}
\hyphenatedword{Altenpflegeschülerinnen}
\stoptext

This shows no [] but an 8-bit character. Anyway, I'll have a look at it;
maybe something goes wrong when the patterns are loaded.

Hans

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.nl
-----------------------------------------------------------------



* Re: ConTeXt german hyphenation
  1999-10-17 15:02   ` Peter Willadt
@ 1999-10-18  0:12     ` Hans Hagen
  1999-10-18  9:06     ` Taco Hoekwater
  1 sibling, 0 replies; 9+ messages in thread
From: Hans Hagen @ 1999-10-18  0:12 UTC (permalink / raw)
  Cc: Tobias Burnus, ntg-context

Peter Willadt wrote:

> To me it looks like ConTeXt uses the hyphenation patterns in the plain
> old TeX 2 way. Too bad I'm not enough of an expert to really figure that
> out.

I think I have located the problem. It has to do with the mapping. I
need to switch the mapping too, because otherwise the \lccodes are not
set. I can now get hyphenated German words with umlauts, and without
[]'s, so real 8 bit. Once it is tested, I will upload a patch.

Hans 

(this is always a bit difficult to test because I don't use ec and berry
names here). 

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.nl
-----------------------------------------------------------------
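
[Editorial illustration, not part of the original message and not the actual
patch.] TeX's hyphenation routine only treats a character as part of a word
if its \lccode is nonzero, which is why unset \lccodes stop hyphenation at
an umlaut. A minimal plain TeX sketch of such assignments, assuming an
encoding in which the umlauts sit at their Latin-1/T1 positions (the slot
numbers are an assumption about the font encoding in use):

% lowercase umlauts map to themselves; \uccode points at the capital
\lccode"E4="E4 \uccode"E4="C4 % ä
\lccode"F6="F6 \uccode"F6="D6 % ö
\lccode"FC="FC \uccode"FC="DC % ü
% capital umlauts lowercase to their small counterparts
\lccode"C4="E4 \uccode"C4="C4 % Ä
\lccode"D6="F6 \uccode"D6="D6 % Ö
\lccode"DC="FC \uccode"DC="DC % Ü

The hyphenation patterns must address the same slots, so the codes have to
match both the font encoding and the encoding the patterns were written for.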



* Re: ConTeXt german hyphenation
  1999-10-17 15:02   ` Peter Willadt
  1999-10-18  0:12     ` Hans Hagen
@ 1999-10-18  9:06     ` Taco Hoekwater
  1 sibling, 0 replies; 9+ messages in thread
From: Taco Hoekwater @ 1999-10-18  9:06 UTC (permalink / raw)
  Cc: burnus, ntg-context

The problem might be that ConTeXt's definitions of low-level commands don't
expand properly. Will investigate tomorrow.

[Hans, we might have found the first side effect of \dontleavehmode]

Taco



* non-hyphenation
  1999-10-17 13:04 ConTeXt german hyphenation Peter Willadt
  1999-10-17 13:39 ` Tobias Burnus
  1999-10-17 22:45 ` Hans Hagen
@ 1999-10-30 18:21 ` Frans Goddijn
  1999-10-31 22:08   ` non-hyphenation Hans Hagen
  1999-11-01  6:13   ` non-hyphenation Berend de Boer
  2 siblings, 2 replies; 9+ messages in thread
From: Frans Goddijn @ 1999-10-30 18:21 UTC (permalink / raw)


I see that my Dutch ConTeXt hyphenates the name A. F.Th. van der He-ijden.

How can I prevent this odd hyphenation? I will try \mbox{Heijden} but is
there a better way?

Groet!

Frans

Frans@iaf.nl                        http://www.iaf.nl/Users/Meridian/
BE tel: +32-15-348-909   fax: +32-15-230-687
NL tel/fax: 026-3211759 mobiel: 06-21 81 5881
Postbus 30196  6803 AD  Arnhem NL



* Re: non-hyphenation
  1999-10-30 18:21 ` non-hyphenation Frans Goddijn
@ 1999-10-31 22:08   ` Hans Hagen
  1999-11-01  6:13   ` non-hyphenation Berend de Boer
  1 sibling, 0 replies; 9+ messages in thread
From: Hans Hagen @ 1999-10-31 22:08 UTC (permalink / raw)
  Cc: ntg-context

Frans Goddijn wrote:

> I see that my Dutch ConTeXt hyphenates the name A. F.Th. van der He-ijden.

The next entry will do the job: 

\hyphenation{Heij-den}

BTW, when generating a format, ConTeXt loads lang-nl.hyp, so you could
add the entry there before generating the format.

Hans

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.nl
-----------------------------------------------------------------
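
[Editorial illustration, not part of the original message.] A minimal sketch
of how the exception might be used inside a document rather than in
lang-nl.hyp, assuming Dutch is the main language. The placement after
\starttext is a cautious choice so that the exception is recorded while
Dutch is the active language:

\mainlanguage[nl]
\starttext
% exceptions are stored for the language that is active at this point
\hyphenation{Heij-den}
A. F.Th. van der Heijden
\stoptext

To forbid any break in the name, wrapping it in a box, as in \hbox{Heijden},
also works, since TeX never breaks inside an \hbox.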



* RE: non-hyphenation
  1999-10-30 18:21 ` non-hyphenation Frans Goddijn
  1999-10-31 22:08   ` non-hyphenation Hans Hagen
@ 1999-11-01  6:13   ` Berend de Boer
  1 sibling, 0 replies; 9+ messages in thread
From: Berend de Boer @ 1999-11-01  6:13 UTC (permalink / raw)


> I see that my Dutch ConTeXt hyphenates the name A. F.Th. van 
> der He-ijden.
> 
> How can I prevent this odd hyphenation?

Refer to other authors instead? :-)

Groetjes,

Berend. (-:



end of thread, other threads:[~1999-11-01  6:13 UTC | newest]

Thread overview: 9+ messages
1999-10-17 13:04 ConTeXt german hyphenation Peter Willadt
1999-10-17 13:39 ` Tobias Burnus
1999-10-17 15:02   ` Peter Willadt
1999-10-18  0:12     ` Hans Hagen
1999-10-18  9:06     ` Taco Hoekwater
1999-10-17 22:45 ` Hans Hagen
1999-10-30 18:21 ` non-hyphenation Frans Goddijn
1999-10-31 22:08   ` non-hyphenation Hans Hagen
1999-11-01  6:13   ` non-hyphenation Berend de Boer
