From: Hans Hagen
Newsgroups: gmane.comp.tex.context
Subject: Re: XeConTeXt bug report I: strange benchmark
Date: Wed, 13 May 2009 10:28:14 +0200
Message-ID: <4A0A849E.4060608@wxs.nl>
In-Reply-To: <68bfdc900905122138p440e8b0bp8dfdb847fa8a27d9@mail.gmail.com>
To: mailing list for ConTeXt users

Yue Wang wrote:
> After debugging for half an hour in the morning, finally I know why
> this is so slow on both XeTeX and pdfTeX.
> This problem is not an operating system issue, but a ConTeXt "feature".
>
> In font-mkii, you use the following to define an actual font
> (\definefontlocal and \definefontglobal):
> {\expandafter\xdef\csname#1\endcsname % ! ! ! ! not needed in luatex ! ! ! !
> {\noexpand\csname#1:\endcsname
> \noexpand\reactivatefont{\somefontname}{\fontfile}}%
>
> For example, in the previous example, it will define a lot of fonts,
> like \*myzhfont12ptmmexrm*:, which is defined as
> \reactivatefont{cmex10}{lmex10}. When that font is defined, a macro
> \*myzhfont12ptmmexrm* is defined to select the \*myzhfont12ptmmexrm*:
> font.
> \*myzhfont12ptmmexrm* = \csname *myzhfont12ptmmexrm*:\endcsname
> \reactivatefont{cmex10}{lmex10}
> TeX then expands your \reactivatefont macro to lmex10 at 12.0pt. So the
> \*myzhfont12ptmmexrm*: font will be assigned to lmex10 at 12.0pt. This is
> quite right for the definition.
>
> However, after switching the font, the problem occurs:
> it tries to call the \*myzhfont12ptmmexrm* macro. The macro then expands to
> \csname *myzhfont12ptmmexrm*:\endcsname \reactivatefont{cmex10}{lmex10}
> then to
> \*myzhfont12ptmmexrm*:\reactivatefont{cmex10}{lmex10}
> and here TeX tries to do the following thing:
> TeX selects the font \*myzhfont12ptmmexrm*:, which is lmex10 at 12pt,
> then *changes* into that font environment in order to *typeset*
> \reactivatefont.
> It then expands the \reactivatefont macro, but finds
> nothing to typeset...

some of that activation can involve things that set up the font (some
related properties) so the mechanism is kind of generic; at that point
it is not possible to determine what is relevant and what not

changing such things might look harmless at first sight but can lead to
unexpected side effects later on (as the context history has proven)

although part of the activation code can be disabled when defining a
sequence of fonts (as in a massive switch) it would complicate the font
code even more to do so; in mkiv we need less of that, as explained
below (hence the remark 'not needed in luatex')

for instance, we can redefine (for xetex)

    \def\updatefontparameters
      {\edef\@@fontfeatures{\truefontdata\fontfile \s!features}%
       \edef\@@fontskewchar{\truefontdata\fontfile \s!skewchar}}

    \def\setfontcharacteristics
      {%\updatefontparameters % redundant, will go away, faster too
       \the\everyfont
       \synchronizepatternswithfont}

which is faster, but then we also need to make absolutely sure that
xetex only uses opentype fonts (which in turn means that we need to
provide separate typescript files for xetex to make sure that this
happens); the gain can be significant on a 10 second job with 10K
switches, but less so in a complex document that takes 2 minutes to
process, or as we sometimes had, 20 minute jobs

Another option is to make all sizes (\tf*) optional, which then would
force a typeface switch in heads, which in turn slows down things at
that end; or you could disable bigmath (that follows the sizes) and then
end up with mails to this list asking why math does not scale in titles,
etc etc

> What's worse, this will occur for every real font in the definition.
> So in fact TeX will be switching dozens of fonts into the *current* font
> for a "\switchtobodyfont" call, in order to typeset nothing. And
> that's why ConTeXt is so slow on typeface changing.
>
> The solution to this bug (or feature?)
> is quite easy, just use \font
> to define the \*myzhfont12ptmmexrm*: font, but pay attention not to
> switch the current font to \*myzhfont12ptmmexrm*:, since it is quite
> slow to switch to dozens of fonts for one \switchtobodyfont call even
> in Knuth TeX.

You make it sound as if all of the font system's features are kind of
wrong and contain oversights ... i'm not going to enter a discussion
about the how and why of the context way of doing things as it is an
accumulation of over 15 years of development, experiments and realistic
typesetting situations. We're not that stupid.

In mkii (and to a lesser extent in mkiv) we have to deal with several
situations:

Fonts have 256 chars at most so they cannot cover each language. This
means that when one mixes languages, one also might need multiple fonts.
For instance, using two completely different typefaces, one with ec
encoding, another with texnansi encoding, and both with different math
fonts that are also in a different encoding, is happening at our end.

As font encodings are related to hyphenation, there is a relationship
between a font switch and a language switch, i.e. when we're typesetting
in czech it might be that a font switch also results in a patterns
switch due to the fact that patterns relate to fonts. The fact that CJK
does not hyphenate is no reason to remove that feature from the font
system. I know that the latex way of doing things is to replace and
redefine core code for specific purposes, but that's not the way we do
things in context. In my own usage i just accepted the fact that in
order to typeset my own stuff my jobs ran slower because other languages
needed other features to be present.

So, we have encodings, languages, upper/lowercase mappings, font
handling, some math properties, etc. so in practice several bits and
pieces need to be synchronized when we switch a font (not relevant maybe
for cjk but definitely relevant for other scripts).
There can even be weird font related issues that need to be dealt with.

Now, when one uses a typeface (i.e. fontclasses) the actual loading of
the font takes place only once as we assume that one will not redefine a
typeface mid-document. The only things that happen with each
(individual) font switch are these synchronizations. When switching a
typeface, only a few fonts are really defined (loaded), as tracing
shows, but some things are set up each time (of course, as we obey
grouping).

If you are sure that there are no dependencies (such as a regular title
with embedded bold, math, etc) you can use \definefont and assign that
font to the style property of for instance a section head.

The whole system is targeted at situations that we have to deal with at
pragma, and when setting up styles it's no big deal to do that
efficiently with a mix of typefaces and \definefont definitions (using
symbolic FontNames so that encoding etc is nicely dealt with).

You mentioned many font switches in your document; well, believe me,
some of the documents we produce have many too, and compared to other
typesetting tasks dealing with fonts is not the bottleneck.

Now, in mkiv we have a somewhat less complex system because we only have
one input encoding (utf, i.e. unicode), one font encoding (unicode), and
one math encoding (unicode) and so it has a more optimal code base. As
xetex shares code with pdftex (both use mkii code), it has the burden of
some overhead that would not be needed if we'd settle on one encoding
(although xetex does handle multiple font encodings). However, that
would demand a partial rewrite of the mkii font subsystem specifically
for xetex (and eventually also of the math subsystem) which is currently
not on the agenda (unless i need it in well paid projects).

As i mentioned yesterday, i can disable a few aspects of mkii in xetex
mode, but the gain is not so large and as i don't use xetex on a regular
basis it does not pay off to complicate the code.
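
For the \definefont route mentioned above (a head without dependencies),
a minimal sketch could look like the following. It is only an
illustration under assumptions: HeadFont is a made-up name, and
SerifBold at 14pt assumes that the typescripts in use provide that
symbolic font name.

    % sketch only: HeadFont is a hypothetical name chosen for this example
    % SerifBold is assumed to be defined by the loaded typescripts
    \definefont[HeadFont][SerifBold at 14pt]

    % assign the font to the style property of the section head
    \setuphead[section][style=\HeadFont]

    \starttext
    \section{A head without embedded bold or math}
    Some text.
    \stoptext

A head set up this way uses the one predefined font directly, which is
why it only suits titles that do not need bold or math to follow along.
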
Simplified math might make a difference but as lm/gyre open type math is
not there yet, i won't even start looking into it (if at all).

The problem is simply that mkii (or derived xetex upgrades) will not
keep up with mkiv anyway. I don't see much reason for making an advanced
mkiv system (which takes time) and then deriving a crippled halfway
mkii-mkiv system just for fun (which is not that much fun actually;
okay, if someone comes up with a project i might refactor bits and
pieces).

So, no bugs, nor 'features', but just features, i.e. functionality that
is the result of over 15 years of supporting everything that came along.
Reading the font related docs that come with context will probably
reveal much of this anyway. Context deals with (and is made for) mixed
languages, fonts, etc etc and it will stay that way. We will not
optimize for specific usage. Also, in practice font switching is only
part of the game and other typesetting tasks might take much more time
(esp when one sets up complex page layouts).

Hans

ps. in mkiv we gain time at the definition end but then lose it at the
processing end because there we want to do more clever things that take
time anyway

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
-----------------------------------------------------------------