ntg-context - mailing list for ConTeXt users
From: Stephen Gaito <stephen@perceptisys.co.uk>
To: mailing list for ConTeXt users <ntg-context@ntg.nl>
Subject: Re: Using ConTeXt-LMTX for modern Mathematically-Literate-Programming 1/2
Date: Wed, 2 Dec 2020 09:40:29 +0000
Message-ID: <20201202094029.695ba8b1@nn01>
In-Reply-To: <6f2c3611-8ae0-2dbe-759f-f57c91da20b6@xs4all.nl>

Hans,

Many thanks for your swift and helpful comments.

After some *very crude* tests using the `luametatex` and `luametafun`
documents, I find that while I *can* stop effective processing at
various points in the LuaMetaTeX pipeline, the time difference overall
is not really significant enough to bother with this approach.

The principal problem is that, as you suggested below, "stopping" the
pipeline at the PDF stage (using, for example, the `pre_output_filter`)
corrupted the `*.tuc` data, which is, for my purposes, critical.

Your comment was: 

> but keep in mind that multipass data is flushed as part of the
> shipout (because it is often location and order bound)

For the record, using the `append_to_vlist_filter` callback, I did
manage to drastically reduce the "pages" (which were all blank, not
surprisingly).
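
For reference, the hook was along these lines (a minimal, untested
sketch rather than my exact code; the callback name and signature
follow the LuaTeX manual and may differ slightly in LuaMetaTeX, and in
ConTeXt the callback may already be claimed, in which case a direct
registration like this would have to go through ConTeXt's own handler
mechanism instead):

  \startluacode
    -- discard anything that would be appended to the main vertical
    -- list, so the resulting "pages" end up essentially empty
    callback.register("append_to_vlist_filter",
        function(box, locationcode, prevdepth, mirrored)
            -- returning nil means the box is not appended
            -- (a real version should also flush the discarded box)
            return nil
        end)
  \stopluacode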

However, on my elderly desktop from 2008, both callbacks cut only
6-8 seconds out of the 18-second run for the `luametatex` document and
out of the 190-second run for the `luametafun` document.

In the case of the `luametafun` document it is, of course, the
MetaFun/MetaPost processing that takes a long time (as it should: the
graphics involve important but complex computations).

My ultimate goal is to parallelize the production of large, heavily
cross-referenced, ConTeXt documents... more on this in a future email...

Again, many thanks for your comments!

Regards,

Stephen Gaito

On Mon, 30 Nov 2020 19:59:07 +0100
Hans Hagen <j.hagen@xs4all.nl> wrote:

> On 11/30/2020 10:51 AM, Stephen Gaito wrote:
> > Hello,
> > 
> > I am slowly working on a Mathematical problem requiring underlying
> > computation.
> > 
> > As Mathematicians (myself included) are rather "conservative", I
> > need to discuss each "chunk" of code with the full set of
> > Mathematical notation.
> > 
> > A couple of years ago I started using ConTeXt-MKIV as a
> > Mathematically-Literate-Programming tool by using its excellent Lua
> > interface to capture the code and dump it to disk for external
> > compilation.
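
As an aside: the capture-and-dump step itself needs very little
machinery. A minimal sketch using standard ConTeXt buffers and helpers
(the buffer and file names are placeholders) would be:

  \startbuffer[mychunk]
  print("hello from the captured chunk")
  \stopbuffer

  % typeset the chunk, surrounded by whatever notation is needed ...
  \typebuffer[mychunk]

  % ... and dump its raw content to disk for external compilation
  \startluacode
    io.savedata("mychunk.lua", buffers.getcontent("mychunk"))
  \stopluacode

The nice thing about buffers is that the same source feeds both the
typeset discussion and the exported code.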
> > 
> > I am now revisiting my original design and want to redo my tools
> > using ConTeXt-LMTX.
> > 
> > I would *like* to be able to "stop" the ConTeXt typesetting at
> > various points for differing purposes:
> > 
> > 1. After all macro expansions (and hence after *my* calls into Lua)
> >     but before line/paragraph/page layout begins.
> 
> maybe something
> 
> \startmystuff
> 
> \stopmystuff
> 
> and then you can hook something into \startmystuff and \stopmystuff
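
A minimal sketch of such a hook, just to check that I understand the
suggestion (\startmystuff, \stopmystuff and userdata.startstuff are
placeholders here, not an existing interface):

  \startluacode
    userdata = userdata or { }
    function userdata.startstuff()
        -- e.g. start capturing, or record a timestamp
    end
    function userdata.stopstuff()
        -- e.g. dump what was captured, or report the elapsed time
    end
  \stopluacode

  \unexpanded\def\startmystuff{\ctxlua{userdata.startstuff()}}
  \unexpanded\def\stopmystuff{\ctxlua{userdata.stopstuff()}}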
> 
> > 2. After line/paragraph/page layout but before PDF generation.
> 
> pdf is generated per page, if needed one can kick in a shipout
> overload
> 
> but keep in mind that multipass data is flushed as part of the
> shipout (because it is often location and order bound)
> 
> > 3. After all PDF generated (ie. a "normal" "full" ConTeXt run).
> > 
> > Stopping after all macro expansions would allow my code generation
> > builds to proceed without the un-needed page setting or PDF
> > generation.
> 
> hm, the problem is always in the 'state' of all kinds of variables
> 
> > Stopping after the line/paragraph/page layout would allow multiple
> > "faster(?)" ConTeXt runs while the "*.tuc" file converges to a
> > complete set of page numbers and cross references (etc). Then, once
> > the "*.tuc" file has converged, a full ConTeXt run with PDF output
> > could be done.
> 
> not sure what you mean here ... what is fast? or: how slow is it now? 
> what is the bottleneck? can you cache data that didn't change?
> 
> a large document is normally split up in sections that can be
> processed independently
> 
> \starttext
>      \dorecurse{10000}{\samplefile{ward}\par}
> \stoptext
> 
> runs on my 2013 laptop at over 65 pages per second
> 
> quite often performance is hit by inefficient styling and such ..
> it's no problem to bring a tex system to a grinding halt
> 
> > I am very aware that *internally* ConTeXt is probably structured as
> > a tight pipeline with each of the "traditional" TeX stages "Mouth",
> > "Stomach", "page setting", PDF generation.... tightly "chained"...
> > This means that there is no "one" place in the code where all macro
> > expansions have completed but before the page setting "starts", or
> > similarly, after the page setting has finished but before the PDF
> > generation "starts".
> 
> yes and often something is left over for a next page so it's kind of
> fluid
> 
> > ----
> > QUESTION: Is it possible to use the new LuaMetaTeX callbacks (found
> > in chapter 10 of the "LuaMetaTEX Reference Manual") to "suppress"
> > any further computation at various points in the ConTeXt pipeline?
> > ----
> 
> sure, you can kick in handlers at various stages (assuming that you
> keep in mind where you kick them in as there is some order involved)
> 
> > For example, could I use one of the "*_linebreak_filter"s (or the
> > "append_to_vlist_filter") to "return" an empty value and hence
> > reduce further computation downstream in the pipeline?
> 
> you can but linebreak is not the most costly one, you probably want
> to intercept the list builder but when you do that you can as well do
> a \stoptext which prevents further reading of content (but i probably 
> misunderstand)
> 
> > Could I use the "pre_output_filter" to "return" an empty value and
> > hence "stop" PDF generation?
> 
> assuming a properly structured document, forcing a \stoptext should
> work in most cases
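
A sketch of the kind of early exit I have in mind (the mode name
"codeonly" is purely a placeholder): run with

  context --mode=codeonly mydocument.tex

and, at the point in the source after which nothing more is needed for
the code build, say

  \doifmode{codeonly}{\stoptext}

so that a normal run, without the mode, still processes the whole
document.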
> 
> > (I realize that these callbacks *are* a currently fast moving
> > target. I am happy to follow their changes, equally I would be
> > testing their usefulness and/or impact)
> 
> actually, the callbacks themselves hardly change, but the code
> plugged into them might occasionally (a lot of mkiv code is already
> quite old so i'm now looking at it to see if i can use some recent
> tricks)
> 
> > ALTERNATIVE QUESTION: Would it be possible to provide official
> > ConTeXt-LMTX "modes" which suppress further computation at these
> > points?
> 
> the question is: what do you want to suppress? best first identify
> the bottleneck and then figure out what can be skipped (as mentioned: 
> multipass data can be made more independent I guess but it still
> demands some calculations and analyzing and it's that bit that takes
> the time)
> 
> > This alternative, while some more work for the writing of
> > ConTeXt-LMTX, would ensure less direct external dependence on the
> > LuaMetaTeX callbacks, but would almost certainly be welcomed by the
> > ConTeXt community.
> 
> i need more info (also from others, then) about what the reason, goal
> and possible gain are
> 
> - tex: the context code is quite efficient, and tex is quite fast, so 
> there's little to gain there (but as said one can write slow macros
> that spoil that game)
> 
> - lua: on the average lua is fast but garbage collection can have an
> influence (i need to see code in order to be able to tell if there is
> a gain there); the lua code in context is quite ok but for instance 
> messing with node lists will always come at a cost (crossing the c 
> boundary and such)
> 
> - pdf: the backend code in luametatex is somewhat slower than in
> luatex but we're gaining there (because in related areas we can do
> things differently, although there is new functionality that when used
> also comes at a price); but as far as i can tell a luametatex run
> here is on the average some 20% faster than a luatex run so the pdf
> generation slowdown gets kind of obscured by it
> 
> > ----
> > QUESTION: Are the "stages" I have identified major, computationally
> > expensive, "steps" in the overall ConTeXt "computation"?
> > ----
> 
> basic typesetting (hyphenation, font handling): takes a bit of time, 
> extra features that you use add some too; some timings are reported 
> after a run so you get an idea
> 
> par building: unless hz is used, quite fast
> 
> page building: fast but, depending on what features are enabled, 
> finalizing the page can take some time
> 
> expansion: pretty fast on the average
> 
> summary: try to identify where the bottlenecks are
> 
> you can run with
> 
>    \enabletrackers[pages.timing]
> 
> (put it in cont-loc.mkxl somewhere in texmf-local) and get timings 
> per page (i have that enabled on my machine)
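
Thanks, I will try that; presumably the whole of such a cont-loc.mkxl
(placed somewhere under texmf-local and followed by `mtxrun --generate`
so the new file is found) can be as little as:

  \enabletrackers[pages.timing]

  \endinput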
> 
> Hans
> 
> 
> -----------------------------------------------------------------
>                                            Hans Hagen | PRAGMA ADE
>                Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
>         tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
> -----------------------------------------------------------------

