ntg-context - mailing list for ConTeXt users
From: Hans Hagen <pragma@wxs.nl>
To: mailing list for ConTeXt users <ntg-context@ntg.nl>
Subject: Re: XeConTeXt bug report I: strange benchmark
Date: Wed, 13 May 2009 10:28:14 +0200
Message-ID: <4A0A849E.4060608@wxs.nl>
In-Reply-To: <68bfdc900905122138p440e8b0bp8dfdb847fa8a27d9@mail.gmail.com>

Yue Wang wrote:

> After debugging for half an hour in the morning, finally I know why
> this is so slow on both XeTeX and pdfTeX.
> This problem is not an operating system issue, but a ConTeXt "feature".
> 
> in font-mkii, you use the following to define an actual font
> (\definefontlocal and \definefontglobal):
>   {\expandafter\xdef\csname#1\endcsname  % ! ! ! ! not needed in luatex ! ! ! !
>      {\noexpand\csname#1:\endcsname
>       \noexpand\reactivatefont{\somefontname}{\fontfile}}%
> 
> For example, in the previous example, it will define a lot of fonts,
> such as \*myzhfont12ptmmexrm*:, which is defined via
> \reactivatefont{cmex10}{lmex10}. When that font is defined, a macro
> \*myzhfont12ptmmexrm* is also defined to select the \*myzhfont12ptmmexrm*:
> font:
> \*myzhfont12ptmmexrm* =\csname *myzhfont12ptmmexrm*:\endcsname
> \reactivatefont{cmex10}{lmex10}
> TeX then expands the \reactivatefont macro to lmex10 at 12.0pt, so the
> \*myzhfont12ptmmexrm*: font is assigned to lmex10 at 12.0pt. This is
> quite right for the definition.
> 
> However, after switching the font, the problem occurs:
> it tries to call the \*myzhfont12ptmmexrm* macro. The macro then expands to
> \csname *myzhfont12ptmmexrm*:\endcsname \reactivatefont{cmex10}{lmex10}
> then to
> \*myzhfont12ptmmexrm*:\reactivatefont{cmex10}{lmex10}
> and here TeX does the following:
> TeX selects the font \*myzhfont12ptmmexrm*:, which is lmex10 at 12pt,
> *changing* into that font environment in order to *typeset*
> \reactivatefont. It then expands the \reactivatefont macro, but finds
> nothing to typeset...
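
The pattern described in the quoted text can be sketched in plain TeX as follows. This is only an illustration, not the actual font-mkii code: \definedemofont and the name demo12 are made up for the example, \reactivatefont is stubbed out as an empty macro (in mkii it is where font related properties get synchronized), and cmr10 is used so that the file runs with plain TeX alone.

```tex
% plain TeX sketch; all names except the primitives are hypothetical
\def\reactivatefont#1#2{}% stand-in: the real macro synchronizes font state

% #1 = font name, #2 = symbolic name, #3 = font file specification
\def\definedemofont#1#2#3{%
  % define the real font selector, named "#1:"
  \expandafter\font\csname#1:\endcsname=#3\relax
  % define the macro "#1" that selects "#1:" and then reactivates it
  \expandafter\xdef\csname#1\endcsname
    {\noexpand\csname#1:\endcsname
     \noexpand\reactivatefont{#2}{#3}}}

\definedemofont{demo12}{cmex10}{cmr10 at 12pt}

% each call selects the font and then runs the (here empty) reactivation
\csname demo12\endcsname Some text in the selected font.

\bye
```

With the empty stub, the second step does nothing, which is the situation the quoted analysis describes; whether that step can be skipped depends on what the real reactivation has to set up.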

some of that activation can involve things that set up the font (some
related properties) so the mechanism is kind of generic; at that point
it is not possible to determine what is relevant and what not

changing such things might at first sight seem harmless but can lead to
unexpected side effects later on (as the context history has proven)

although part of the activation code can be disabled when defining a
sequence of fonts (as in a massive switch) it would complicate the font
code even more to do so; in mkiv we do need less of that as explained
below (hence the remark 'not needed in luatex')

for instance, we can redefine (for xetex)

\def\updatefontparameters
  {\edef\@@fontfeatures{\truefontdata\fontfile    \s!features}%
   \edef\@@fontskewchar{\truefontdata\fontfile    \s!skewchar}}

\def\setfontcharacteristics
  {%\updatefontparameters % redundant, will go away, faster too
   \the\everyfont
   \synchronizepatternswithfont}

This is faster, but then we also need to make absolutely sure that
xetex only uses opentype fonts (which in turn means that we need to
provide separate typescript files for xetex to make sure that this
happens); the gain can be significant on a 10 second job with 10K switches,
but less so in a complex document that takes 2 minutes to process or, as we
sometimes had, in 20 minute jobs

Another option is to make all sizes (\tf*) optional, which then would
force a typeface switch in heads which in turn slows down things at that
end, or you could disable bigmath (that follows the sizes) and then end
up with mails to this list asking why math does not scale in titles, etc.

> What's worse, this will occur for every real font in the definition.
> So in fact TeX will be switching dozens of fonts in as the *current* font
> for one "\switchtobodyfont" call, in order to typeset nothing. And
> that's why ConTeXt is so slow at changing typefaces.
> 
> The solution to this bug (or feature?) is quite easy: just use \font
> to define the \*myzhfont12ptmmexrm*: font, but pay attention not to
> switch the current font to \*myzhfont12ptmmexrm*:, since it is quite
> slow to switch to dozens of fonts for one \switchtobodyfont call even
> in Knuth TeX.

You make it sound as if all of the font system's features are kind of wrong
and contain oversights ... i'm not going to enter a discussion about the
how and why of the context way of doing things as it is an accumulation
of over 15 years of development and experiments and realistic typesetting
situations. We're not that stupid.

In mkii (and to a lesser extent in mkiv) we have to deal with several
situations:

Fonts have 256 chars at most so they cannot cover each language. This
means that when one mixes languages, one also might need multiple fonts.
For instance using two completely different typefaces, one with ec
encoding, another with texnansi encoding and both with different math
fonts also in a different encoding, is something that happens at our end.

As font encodings are related to hyphenation, there is a relationship
between a font switch and a language switch i.e. when we're typesetting
in czech it might be that a font switch also results in a patterns
switch due to the fact that patterns relate to fonts. The fact that CJK
does not hyphenate is no reason to remove that feature from the font
system. I know that the latex way of doing things is to replace and
redefine core code for specific purposes, but that's not the way we do
things in context. In my own usage i just accepted the fact that in
order to typeset my own stuff my jobs ran slower because other languages
needed other features to be present.

So, we have encodings, languages, upper/lowercase mappings, font
handling, some math properties, etc. so in practice several bits and
pieces need to be synchronized when we switch a font (not relevant maybe
for cjk but definitely relevant for other scripts). There can even be
weird font related issues that need to be dealt with.

Now, when one uses a typeface (i.e. fontclasses) the actual loading of
the font takes place only once as we assume that one will not redefine a
typeface mid-document. The only things that happen with each
(individual) font switch are these synchronizations.

When switching a typeface, only a few fonts are really defined (loaded)
as tracing shows but some things are set up each time (of course, as we
obey grouping). If you are sure that there are no dependencies (i.e. a
regular title with embedded bold, math, etc) you can use \definefont and
assign that font to the style property of for instance a section head.
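
As a minimal sketch of that last suggestion (the name \SectionHeadFont and the 14pt size are just picked for the example, and SerifBold is assumed to be available as a symbolic font name in the current typeface setup):

```tex
% define a single font once, using a symbolic font name
\definefont[SectionHeadFont][SerifBold at 14pt]

% use it as the style of section heads instead of a full typeface switch
\setuphead[section][style=\SectionHeadFont]

\starttext
\section{A plain title}
Some body text.
\stoptext
```

This only pays off for titles without such dependencies: a font defined this way is one fixed font, so embedded bold or math in the title will not follow it.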

The whole system is targeted at situations that we have to deal with at
pragma, and when setting up styles it's no big deal to do that
efficiently with a mix of typefaces and \definefont definitions (using
symbolic FontNames so that encoding etc is nicely dealt with). You
mentioned many font switches in your document, well believe me, some of
the documents we produce have many too and compared to other typesetting
tasks dealing with fonts is not the bottleneck.

Now, in mkiv we have a bit less complex system because we only have one
input encoding (utf i.e. unicode), one font encoding (unicode), and one
math encoding (unicode) and so it has a more optimal code base. As xetex
shares code with pdftex (both use mkii code), it has the burden of some
overhead not needed if we'd settle on one encoding (although xetex does
handle multiple font encodings). However, that would demand a partial
rewrite of the mkii font subsystem especially for xetex (and eventually
also for the math subsystem) which is currently not on the agenda
(unless i need it in well paid projects).

As i mentioned yesterday, i can disable a few aspects of mkii in xetex
mode, but the gain is not so large and as i don't use xetex on a regular
basis it does not pay off to complicate the code. Simplified math might
make a difference but as lm/gyre open type math is not there yet, i
won't even start looking into it (if at all). The problem is simply that
mkii (or derived xetex upgrades) will not keep up with mkiv anyway. I
don't see much reason for making an advanced mkiv system (which takes
time) and then derive a crippled halfway mkii-mkiv system just for fun
(which is not that much fun actually; okay, if someone comes up with a
project i might refactor bits and pieces).

So, no bugs, nor 'features' but just features, or rather functionality, that is
the result of over 15 years of supporting everything that came along.
Reading the font related docs that come with context will probably reveal
much of this anyway.

Context deals with (and is made for) mixed languages, fonts, etc etc and it
will stay that way. We will not optimize for specific usage. Also, in
practice font switching is only part of the game and other typesetting
tasks might take much more time (esp when one sets up complex page
layouts).

Hans

ps. in mkiv we gain time at the definition end but then lose it at the
processing end because there we want to do more clever things that take
time anyway

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
-----------------------------------------------------------------


