From: Hans Hagen <j.hagen@xs4all.nl>
To: "mailing list for ConTeXt users" <ntg-context@ntg.nl>,
"Marcel Fabian Krüger" <tex@2krueger.de>
Subject: Re: Multiple cases of unexpected behaviour in luametatex
Date: Fri, 3 Jul 2020 21:56:28 +0200
Message-ID: <7d1f8ca7-2e36-3c99-c729-407a396aa0e2@xs4all.nl>
In-Reply-To: <20200703125545.orklmoqazzgyascu@yoga>
On 7/3/2020 2:55 PM, Marcel Fabian Krüger wrote:
> Hi,
>
> I recently noticed some cases where luametatex behaved in unexpected
> ways:
>
> - The "Extra \fi" error isn't triggered, instead an extra `\fi`
> freezes luametatex. (Can be reproduced by compiling a document which
> only consists of a single \fi)
i already fixed that here (noticed it when documenting some conditionals)
> - token.new can only create some `data` tokens, but it doesn't apply
> bounds checking on its arguments:
there is no checking yet, there is an upper limit of 0x1FFFFF, so i'll
add a check for that
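
A minimal sketch of what such a range check could look like, written in plain Lua rather than in the engine's own code. The helper name `check_data_value` is hypothetical and not part of the token library, and the lower bound of 0 is an assumption; only the upper limit of 0x1FFFFF comes from the message above.

```lua
-- Hypothetical helper, not part of the luametatex token library.
-- Upper limit taken from the message above; the lower bound of 0 is assumed.
local MAX_DATA_VALUE = 0x1FFFFF

local function check_data_value(value)
    -- math.type is available in Lua 5.3 and later
    if math.type(value) ~= "integer" then
        return nil, "value must be an integer"
    end
    if value < 0 or value > MAX_DATA_VALUE then
        return nil, string.format("value %d out of range 0..%d", value, MAX_DATA_VALUE)
    end
    return value
end
```

A caller could test the first return value and report the second as an error message before handing the value on to token.new.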
> Also for all other commands LuaTeX seems to apply range-checks to
> ensure that such overflows don't happen, even if invalid values are
> passed as first argument.
indeed, but hadn't yet done that for data, it also needs a more strict
check at the tex end (i'm still not sure if i make a slightly different
implementation of it but i can add the test anyway)
> - There is token.primitives(). My assumption is that the returned
> table is meant to indicate the command id, mode and name
> corresponding to every primitive. (I think it is awesome that such a
> table is made available in luametatex) But especially the mode
> field sometimes has values which do not correspond to the mode of
> the actual primitives:
indeed.
> I tried running the following in an almost iniTeX setting where all
> primitives aside from \shipout and \Umathcodenum have their default
> definitions:
>
> ```
> \catcode`\%=12
> \catcode`\~=12
> \directlua{
>   local sorted = token.primitives()
>   table.sort(sorted, function(a,b) return a[1]<b[1] or a[1]==b[1] and a[2]<b[2] end)
>   for _,info in ipairs(sorted) do
>     local t = token.create(info[3])
>     local rc, rm = t.command, t.mode
>     if rc==info[1] and rm ~= info[2] then
>       if info[2] == 0 then
>         print(string.format('MODE MISMATCH, expected zero: \string\\%s: real: %i, command: %i', info[3], rm, rc))
>       else
>         print(string.format('MODE MISMATCH: \string\\%s: offset: %i, command: %i', info[3], rm-info[2], rc))
>       end
>     elseif rc~=info[1] then print(t.csname)
>     end
>   end
> }
> ```
>
> This indicates that there are two kinds of differences:
> For some command codes, there are multiple primitives whose second
> entry in the token.primitives table is zero even though their mode
> is not zero. This especially affects the commands `above`,
> `after_something`, `make_box`, `un_vbox`, `set_specification` and
> `car_ret`.
> E.g. for after_something, all of \atendofgrouped, \afterassigned and
> \aftergrouped have a zero as second entry in token.primitives.
some tokens are more complex in the sense that they are combinations
(have a follow up) and i'm not sure to what extent i want to block that
... all a matter of experimenting and time, so
the 'mode' field will be dropped but for now i kept it
some like after_something i need to check (i just didn't update their
ranges yet after adding some more primitives that use them) (maybe some
others need an offset added but i'll check it)
> The other difference is that all the internal_... commands have a
> fixed offset which differs between commands in their mode field.
>
> IMO the difference for the internal_... commands makes sense because
> they make for easier-to-use numbers, but having multiple primitives
> indicating mode 0 for the other commands seems to make this table
> significantly less useful because it can't be used to get a unique
> description of a primitive.
>
> (I may have completely misinterpreted the table of course, but given
> that for other primitives the values match I do not think so)
it's a bit of a work in progress as there are some exceptions that use special
chr codes (for instance in conditionals several cmd codes need to have
exclusive codes), so adapting it is a stepwise process; one decision i
need to make there is how close to stay to the original tex codes
eventually i want all to have reasonable ranges in the token interface
(not per se the same as in the engine itself but that's a black box
anyway) which involves some offsetting .. i do that stepwise in order to
keep a working engine (the token interface is not used in context that
much)
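
To make the uniqueness concern from the quoted report concrete, here is a small sketch in plain Lua that groups entries by their (command, mode) pair and returns the pairs shared by more than one primitive. It assumes entries of the form { command, mode, name }, which is the layout the quoted script relies on (info[1], info[2], info[3]). The function name `find_shared_modes` is hypothetical, and the command numbers and the last entry in the sample data are made up for illustration.

```lua
-- Group entries of the form { command, mode, name } by (command, mode)
-- and return only the groups that contain more than one name.
local function find_shared_modes(entries)
    local groups = {}
    for _, info in ipairs(entries) do
        local key = info[1] .. ":" .. info[2]
        local names = groups[key]
        if not names then
            names = {}
            groups[key] = names
        end
        names[#names + 1] = info[3]
    end
    local shared = {}
    for key, names in pairs(groups) do
        if #names > 1 then
            shared[key] = names
        end
    end
    return shared
end

-- Illustrative data only: the command numbers are invented, and
-- "someprimitive" is a placeholder name.
local sample = {
    { 40, 0, "atendofgrouped" },
    { 40, 0, "afterassigned" },
    { 40, 0, "aftergrouped" },
    { 41, 0, "someprimitive" },
}

for key, names in pairs(find_shared_modes(sample)) do
    print(key, table.concat(names, ", "))
end
```

Inside luametatex the same function could presumably be fed the result of token.primitives() instead of the sample table, as long as that table keeps the three-field layout.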
Hans
hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------