* [NTG-context] automatic character comparison
@ 2025-03-04 18:26 Pablo Rodriguez via ntg-context
2025-03-05 1:21 ` [NTG-context] " Bruce Horrocks
2025-03-05 4:23 ` Max Chernoff via ntg-context
0 siblings, 2 replies; 11+ messages in thread
From: Pablo Rodriguez via ntg-context @ 2025-03-04 18:26 UTC (permalink / raw)
To: ConTeXt users; +Cc: Pablo Rodriguez
Dear list,
I have the following sample:
\setuppapersize[A7]
\starttext
\definecolumnset
[paral]
[n=2]
\definesubcolumnset[paral][1][1]
\definesubcolumnset[paral][2][2]
\startcolumnset[paral]
\startsubcolumnset[1]
abce
\stopsubcolumnset
\startsubcolumnset[2]
abcd
\stopsubcolumnset
\flushsubcolumnsets[spread]
\startsubcolumnset[1]
abc%
\inframed
[background=color,
backgroundcolor=lightgreen,
frame=off]
{e}
\stopsubcolumnset
\startsubcolumnset[2]
abc%
\inframed
[background=color,
backgroundcolor=lightred,
frame=off]
{d}
\stopsubcolumnset
\flushsubcolumnsets[spread]
\stopcolumnset
\stoptext
What I want to achieve is automatic text comparison between versions of
the same text (in different subcolumnsets).
The first line shows different versions of the text. I wonder whether there
would be an automatic way to get the \inframed highlighting with any
character that differs from the other column (it might be different, or
just missing or being added).
I think this may be possible with ConTeXt, but I don’t know how to
achieve it automagically.
Any ideas on how to get this automatic text comparison?
Many thanks in advance,
Pablo
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!
maillist : ntg-context@ntg.nl / https://mailman.ntg.nl/mailman3/lists/ntg-context.ntg.nl
webpage : https://www.pragma-ade.nl / https://context.aanhet.net (mirror)
archive : https://github.com/contextgarden/context
wiki : https://wiki.contextgarden.net
___________________________________________________________________________________
* [NTG-context] Re: automatic character comparison
2025-03-04 18:26 [NTG-context] automatic character comparison Pablo Rodriguez via ntg-context
@ 2025-03-05 1:21 ` Bruce Horrocks
2025-03-05 1:27 ` Bruce Horrocks
2025-03-25 18:48 ` Pablo Rodriguez via ntg-context
2025-03-05 4:23 ` Max Chernoff via ntg-context
1 sibling, 2 replies; 11+ messages in thread
From: Bruce Horrocks @ 2025-03-05 1:21 UTC (permalink / raw)
To: ntg-context mailing list; +Cc: Pablo Rodriguez
> On 4 Mar 2025, at 18:26, Pablo Rodriguez via ntg-context <ntg-context@ntg.nl> wrote:
>
> The first line shows different versions text. I wonder whether there
> would be an automatic way to get the \inframed highlighting with any
> character that differs from the other column (it might be different, or
> just missing or being added).
I'm not aware of anything built-in.
One way would be to use buffers: you enter your text into two buffers ("left" & "right"); compare them using a script which then modifies the buffers to highlight the changes in red or green; and then \getbuffer the buffers from inside the columnset commands to print them.
There's a good discussion of comparison algorithms at the link below, including source code in Javascript (but not Lua, unfortunately). However, Context supports Javascript with \startJScode ... \stopJScode so you could try adapting what's there.
<https://neil.fraser.name/writing/diff/>
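A minimal sketch of the comparison step described above, in Python purely for illustration (the real thing would run in Lua inside ConTeXt). It uses the standard library's difflib rather than Fraser's algorithm, and the macro names \MarkOld and \MarkNew are hypothetical placeholders for whatever highlighting the two buffers end up using. It does not escape characters that are special to TeX.

```python
from difflib import SequenceMatcher


def mark_differences(old, new):
    """Compare two strings character by character.

    Returns (left, right): `old` with characters missing from `new`
    wrapped in \\MarkOld{...}, and `new` with characters missing from
    `old` wrapped in \\MarkNew{...}. Both macro names are hypothetical.
    """
    left, right = [], []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            left.append(old[i1:i2])
            right.append(new[j1:j2])
            continue
        # "replace", "delete" and "insert" all land here.
        if i2 > i1:
            left.append("\\MarkOld{" + old[i1:i2] + "}")
        if j2 > j1:
            right.append("\\MarkNew{" + new[j1:j2] + "}")
    return "".join(left), "".join(right)


left, right = mark_differences("abce", "abcd")
print(left)   # abc\MarkOld{e}
print(right)  # abc\MarkNew{d}
```

The two returned strings correspond to the contents of the "left" and "right" buffers after the script has modified them.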
Regards,
—
Bruce Horrocks
Hampshire, UK
* [NTG-context] Re: automatic character comparison
2025-03-05 1:21 ` [NTG-context] " Bruce Horrocks
@ 2025-03-05 1:27 ` Bruce Horrocks
2025-03-05 15:11 ` Pablo Rodriguez via ntg-context
2025-03-25 18:48 ` Pablo Rodriguez via ntg-context
1 sibling, 1 reply; 11+ messages in thread
From: Bruce Horrocks @ 2025-03-05 1:27 UTC (permalink / raw)
To: ntg-context mailing list; +Cc: Pablo Rodriguez
> On 5 Mar 2025, at 01:21, Bruce Horrocks <ntg@scorecrow.com> wrote:
>
>
>
>> On 4 Mar 2025, at 18:26, Pablo Rodriguez via ntg-context <ntg-context@ntg.nl> wrote:
>>
>> The first line shows different versions text. I wonder whether there
>> would be an automatic way to get the \inframed highlighting with any
>> character that differs from the other column (it might be different, or
>> just missing or being added).
>
> I'm not aware of anything built-in.
>
> One way would be to use buffers: you enter your text into two buffers ("left" & "right"); compare them using a script which then modifies the buffers to highlight the changes in red or green; and then \getbuffer the buffers from inside the columnset commands to print them.
>
> There's a good discussion of comparison algorithms at the link below, including source code in Javascript (but not Lua, unfortunately). However, Context supports Javascript with \startJScode ... \stopJScode so you could try adapting what's there.
>
> <https://neil.fraser.name/writing/diff/>
>
Sorry - ignore the JS bit (that's for embedding into the PDF). You'll need to translate Fraser's example code into Lua.
—
Bruce Horrocks
Hampshire, UK
* [NTG-context] Re: automatic character comparison
2025-03-04 18:26 [NTG-context] automatic character comparison Pablo Rodriguez via ntg-context
2025-03-05 1:21 ` [NTG-context] " Bruce Horrocks
@ 2025-03-05 4:23 ` Max Chernoff via ntg-context
2025-03-05 15:17 ` Pablo Rodriguez via ntg-context
1 sibling, 1 reply; 11+ messages in thread
From: Max Chernoff via ntg-context @ 2025-03-05 4:23 UTC (permalink / raw)
To: mailing list for ConTeXt users; +Cc: Pablo Rodriguez, Max Chernoff
Hi Pablo,
On Tue, 2025-03-04 at 19:26 +0100, Pablo Rodriguez via ntg-context
wrote:
> What I want to achieve is automatic text comparison between versions of
> the same text (in different subcolumnsets).
>
> The first line shows different versions text. I wonder whether there
> would be an automatic way to get the \inframed highlighting with any
> character that differs from the other column (it might be different, or
> just missing or being added).
>
> I think this may be possible with ConTeXt, but I don’t know how to
> achieve it automagically.
Not quite what you're asking for, but the "compare" script does
something fairly similar:
$ context --extra=compare <filename-1>.pdf <filename-2>.pdf
$ context --extra=compare <filename-1>.pdf <filename-2>.pdf --colors=red,blue --result=<output-name>.pdf
The source for that script is in
tex/context/base/mkiv/mtx-context-compare.tex
so maybe you can put together something similar from there?
Thanks,
-- Max
* [NTG-context] Re: automatic character comparison
2025-03-05 1:27 ` Bruce Horrocks
@ 2025-03-05 15:11 ` Pablo Rodriguez via ntg-context
0 siblings, 0 replies; 11+ messages in thread
From: Pablo Rodriguez via ntg-context @ 2025-03-05 15:11 UTC (permalink / raw)
To: ntg-context; +Cc: Pablo Rodriguez
On 3/5/25 02:27, Bruce Horrocks wrote:
>> On 5 Mar 2025, at 01:21, Bruce Horrocks <ntg@scorecrow.com> wrote:
>> [...]
>> I'm not aware of anything built-in.
>>
>> One way would be to use buffers: you enter your text into two
>> buffers ("left" & "right"); compare them using a script which then
>> modifies the buffers to highlight the changes in red or green;
>> and then \getbuffer the buffers from inside the columnset commands
>> to print them.
Many thanks for your reply, Bruce.
I think this might be a feasible approach for me.
>> There's a good discussion of comparison algorithms at the link
>> below, including source code in Javascript (but not Lua, unfortunately).
There might be a Lua version (I think) here:
https://github.com/google/diff-match-patch.
>> However, Context supports Javascript with \startJScode ... \stopJScode
>> so you could try adapting what's there.
>>
>> <https://neil.fraser.name/writing/diff/>
> Sorry - ignore the JS bit (that's for embedding into the PDF).
> You'll need to translate Fraser's example code into Lua.
We have https://www.pragma-ade.com/general/manuals/ecmascript-mkiv.pdf,
so translating JS to Lua might not be required.
Many thanks for your help,
Pablo
* [NTG-context] Re: automatic character comparison
2025-03-05 4:23 ` Max Chernoff via ntg-context
@ 2025-03-05 15:17 ` Pablo Rodriguez via ntg-context
0 siblings, 0 replies; 11+ messages in thread
From: Pablo Rodriguez via ntg-context @ 2025-03-05 15:17 UTC (permalink / raw)
To: ntg-context; +Cc: Pablo Rodriguez
On 3/5/25 05:23, Max Chernoff via ntg-context wrote:
> Hi Pablo,
Hi Max,
many thanks for your reply.
> Not quite what you're asking for, but the "compare" script does
> something fairly similar:
> [...]
> The source for that script is in
>
> tex/context/base/mkiv/mtx-context-compare.tex
>
> so maybe you can put together something similar from there?
I’m afraid not. I used "compare" in the past, but I need to mark
additions and deletions, not to see differences by superimposing one
file on the other.
Sorry, but when I need to see how a document has been modified, diffpdf
presents a clearer overview to me.
Many thanks for your help,
Pablo
* [NTG-context] Re: automatic character comparison
2025-03-05 1:21 ` [NTG-context] " Bruce Horrocks
2025-03-05 1:27 ` Bruce Horrocks
@ 2025-03-25 18:48 ` Pablo Rodriguez via ntg-context
2025-03-25 19:04 ` Hans Hagen
2025-03-26 22:57 ` Bruce Horrocks
1 sibling, 2 replies; 11+ messages in thread
From: Pablo Rodriguez via ntg-context @ 2025-03-25 18:48 UTC (permalink / raw)
To: ntg-context; +Cc: Pablo Rodriguez
On 3/5/25 02:21, Bruce Horrocks wrote:
>> On 4 Mar 2025, at 18:26, Pablo Rodriguez wrote:
>>
>> The first line shows different versions text. I wonder whether there
>> would be an automatic way to get the \inframed highlighting with any
>> character that differs from the other column (it might be different, or
>> just missing or being added).
>
> I'm not aware of anything built-in.
[In short, my previous request was about how to get an automatic
comparison of two versions of the same text.]
Replying to this message from Bruce, I want to describe what I think
might do the job.
Since I’m just an average computer user (my background is in
humanities), I would be grateful for comments on whether this makes
sense (or not at all).
Not being inclined to reinvent the wheel, after some searching I found
out that "git diff" can do a char-level comparison between two texts:
git diff -U1000 --color-words=. one.md two.md > one-two.diff
[BTW, I use Markdown sources (pandoc converts them to XHTML, which
ConTeXt then typesets).]
Since the output contains the coloring commands, I need some
substitutions with:
sed -E -f normal.sed one-two.diff > one-two_normal.diff
The contents of the sed script read:
s/(^[#]{2,3})\x1B\[m$/\1/g
s/\x1B\[(36|1).+?\x1B\[m//g
s/\x1B\[31m/\\Subs{/g
s/\x1B\[32m/\\Add{/g
s/\x1B\[m/}/g
Basically, this script removes info that ConTeXt cannot handle and
translates color codes to \Add and \Subs commands.
Take this minimal sample:
another te\Subs{x}\Add{s}t
On the left page, with the older text, the definitions might be:
\protected\def\Add#1{}
\definehighlight[Subs]
[color=red]
On the right page, with the newer version, the definitions might read:
\definehighlight[Add]
[color=green]
\protected\def\Subs#1{}
At least, this works with a minimal sample.
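For comparison, the same translation can be sketched in Python. This is an illustration only, not a replacement for the sed script. It assumes the escape sequences are exactly the ones the sed rules target (ESC[31m for removed text, ESC[32m for added text, ESC[m as reset, and bold or cyan for the diff metadata). The function name is hypothetical.

```python
import re

ESC = "\x1b"

# Metadata that git prints in bold (1) or cyan (36): file headers, hunk markers.
# Assumption: such a segment contains no further escape before its reset.
METADATA = re.compile(ESC + r"\[(?:36|1)m[^\x1b]*" + ESC + r"\[m")
# A stray reset directly after a Markdown heading marker (## or ###).
HEADING_RESET = re.compile(r"^(#{2,3})" + ESC + r"\[m$")


def ansi_to_context(line):
    """Translate one line of colored diff output into \\Subs/\\Add markup."""
    line = HEADING_RESET.sub(r"\1", line)
    line = METADATA.sub("", line)
    line = line.replace(ESC + "[31m", "\\Subs{")
    line = line.replace(ESC + "[32m", "\\Add{")
    line = line.replace(ESC + "[m", "}")
    return line


sample = "another te" + ESC + "[31mx" + ESC + "[m" + ESC + "[32ms" + ESC + "[mt"
print(ansi_to_context(sample))  # another te\Subs{x}\Add{s}t
```

As in the sed script, the metadata must be removed before the generic reset is turned into a closing brace, otherwise the resets belonging to the headers would leave unmatched braces behind.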
Is this a feasible approach? I don’t need the most efficient solution,
just one that I can handle and that just works.
Many thanks in advance for your comments,
Pablo
* [NTG-context] Re: automatic character comparison
2025-03-25 18:48 ` Pablo Rodriguez via ntg-context
@ 2025-03-25 19:04 ` Hans Hagen
2025-03-26 15:42 ` Pablo Rodriguez via ntg-context
2025-03-26 22:57 ` Bruce Horrocks
1 sibling, 1 reply; 11+ messages in thread
From: Hans Hagen @ 2025-03-25 19:04 UTC (permalink / raw)
To: ntg-context
[-- Attachment #1: Type: text/plain, Size: 2860 bytes --]
On 3/25/2025 7:48 PM, Pablo Rodriguez via ntg-context wrote:
> On 3/5/25 02:21, Bruce Horrocks wrote:
>>> On 4 Mar 2025, at 18:26, Pablo Rodriguez wrote:
>>>
>>> The first line shows different versions text. I wonder whether there
>>> would be an automatic way to get the \inframed highlighting with any
>>> character that differs from the other column (it might be different, or
>>> just missing or being added).
>>
>> I'm not aware of anything built-in.
>
> [In short, my previous request intended how to have an automatic
> comparison of two versions from the same text automatically done.]
>
> Replying to this message from Bruce, I want to describe what I think it
> might do the job.
>
> Since I’m just an average computer user (my background is in
> humanities), I thank everyone for comments about whether this make sense
> (or not at all).
>
> Not being inclined to reinvent the wheel, after some searching I found
> out that "git diff" can do a char-level comparison between two texts:
>
> git diff -U1000 --color-words=. one.md two.md > one-two.diff
>
> [BTW, I use Markdown sources (which pandoc converts to XHTML and ConTeXt
> typesets them).]
>
> Since the output contains the coloring commands, I need some
> substitutions with:
>
> sed -E -f normal.sed one-two.diff > one-two_normal.diff
>
> The contents of the sed script read:
>
> s/(^[#]{2,3})\x1B\[m$/\1/g
> s/\x1B\[(36|1).+?\x1B\[m//g
> s/\x1B\[31m/\\Subs{/g
> s/\x1B\[32m/\\Add{/g
> s/\x1B\[m/}/g
>
> Basically, this script removes info that ConTeXt cannot handle and
> translates color codes to \Add and \Subst commands.
>
> This minimal sample:
>
> another te\Subs{x}\Add{s}t
>
> On the left page with the older text, it might have the commands:
>
> \protected\def\Add#1{}
> \definehighlight[Subs]
> [color=red]
>
> On the right page with the newer version, commands might read:
>
> \definehighlight[Adds]
> [color=green]
> \protected\def\Subst#1{}
>
> At least, this works with a minimal sample.
>
> Is this a feasible approach? I don’t need the most efficient solution,
> just one that I can handle and that just works.
>
> Many thanks in advance for your comments,
Whatever works for you is okay right?
The attached is what is coming one day. The prototype (some 150 lines of
code) works ok here but we need some interface that MS and I will look
into when we pick up the columnsets track.
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
[-- Attachment #2: diff-001.pdf --]
[-- Type: application/pdf, Size: 20851 bytes --]
* [NTG-context] Re: automatic character comparison
2025-03-25 19:04 ` Hans Hagen
@ 2025-03-26 15:42 ` Pablo Rodriguez via ntg-context
0 siblings, 0 replies; 11+ messages in thread
From: Pablo Rodriguez via ntg-context @ 2025-03-26 15:42 UTC (permalink / raw)
To: ntg-context; +Cc: Pablo Rodriguez
On 3/25/25 20:04, Hans Hagen wrote:
> [...]
> Whatever works for you is okay right?
>
> The attached is what is coming one day. The prototype (some 150 lines of
> code) works ok here but we need some interface that MS and I will look
> into when we pick up the columnsets track.
Many thanks for the new implementation, Hans.
Is there any chance to become an early adopter of the new prototype for
testing purposes?
Many thanks for your help,
Pablo
* [NTG-context] Re: automatic character comparison
2025-03-25 18:48 ` Pablo Rodriguez via ntg-context
2025-03-25 19:04 ` Hans Hagen
@ 2025-03-26 22:57 ` Bruce Horrocks
2025-03-27 4:01 ` Pablo Rodriguez via ntg-context
1 sibling, 1 reply; 11+ messages in thread
From: Bruce Horrocks @ 2025-03-26 22:57 UTC (permalink / raw)
To: ntg-context mailing list; +Cc: Pablo Rodriguez
> On 25 Mar 2025, at 18:48, Pablo Rodriguez via ntg-context <ntg-context@ntg.nl> wrote:
>
> Is this a feasible approach? I don’t need the most efficient solution,
> just one that I can handle and that just works.
I think relying on diff's colouring and a 1000 line change window would work but is not robust as it might unexpectedly break - e.g. if you were to port to another machine or change your terminal settings then you might get different escape sequences for the colours.
An alternative might be to use 'wdiff' which does a word-based comparison instead of the line-based comparison of diff. It also allows you to insert your choice of marker string before and after each change, making it easy to insert Context markup.
There's a LaTeX example in section 2.2 on this page which puts deleted text in boxes, and new text in double boxes. It should be pretty simple to adapt.
<https://www.gnu.org/software/wdiff/manual/wdiff.html>
Regards,
—
Bruce Horrocks
Hampshire, UK
* [NTG-context] Re: automatic character comparison
2025-03-26 22:57 ` Bruce Horrocks
@ 2025-03-27 4:01 ` Pablo Rodriguez via ntg-context
0 siblings, 0 replies; 11+ messages in thread
From: Pablo Rodriguez via ntg-context @ 2025-03-27 4:01 UTC (permalink / raw)
To: ntg-context; +Cc: Pablo Rodriguez
On 3/26/25 23:57, Bruce Horrocks wrote:
> On 25 Mar 2025, at 18:48, Pablo Rodriguez via ntg-context <ntg-context@ntg.nl> wrote:
>>
>> Is this a feasible approach? I don’t need the most efficient solution,
>> just one that I can handle and that just works.
>
> I think relying on diff's colouring and a 1000 line change window
> would work but is not robust as it might unexpectedly break - e.g. if
> you were to port to another machine or change your terminal settings
> then you might get different escape sequences for the colours.
Many thanks for your reply, Bruce.
Besides the fact that ConTeXt will have built-in functionality for this,
I think that it would be easy for me to adapt the different escape
sequences for colors (in the rather improbable case I have to port it to
another machine or change my terminal settings).
BTW, after I sent the message, I realized that my approach was wrong in
one detail. I wanted to add TeX commands to a Markdown source (which was
going to be converted to XHTML).
The right thing to do is to convert the escape sequences for colors into
XML tags in the Markdown source.
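A sketch of that variant, again in Python and only as an illustration. The choice of the XHTML elements <del> and <ins> is an assumption (any pair of tag names would do), as is the function name. The escape sequences are assumed to be the same ones as before: ESC[31m, ESC[32m and ESC[m. The text between the tags is not XML-escaped here.

```python
import re

# Removed text: ESC[31m ... ESC[m; added text: ESC[32m ... ESC[m.
RED = re.compile(r"\x1b\[31m(.*?)\x1b\[m", re.DOTALL)
GREEN = re.compile(r"\x1b\[32m(.*?)\x1b\[m", re.DOTALL)


def ansi_to_tags(text):
    """Wrap removed text in <del> and added text in <ins> (assumed tag names)."""
    text = RED.sub(r"<del>\1</del>", text)
    text = GREEN.sub(r"<ins>\1</ins>", text)
    return text


sample = "another te\x1b[31mx\x1b[m\x1b[32ms\x1b[mt"
print(ansi_to_tags(sample))  # another te<del>x</del><ins>s</ins>t
```

Matching each color together with its own reset, instead of replacing the reset on its own, is what lets the closing tag follow the opening one.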
> An alternative might be to use 'wdiff' which does a word-based
> comparison instead of the line-based comparison of diff. It also allows
> you to insert your choice of marker string before and after each change,
> making it easy to insert Context markup.
I knew there was such an option, but it isn’t available for MSYS2 (just
in case I might need it there one day) and as far as I can remember it
compares whole words, not single characters.
Many thanks for your help,
Pablo