ntg-context - mailing list for ConTeXt users
* lpeg substitution
@ 2009-08-11  9:34 Thomas A. Schmitz
  2009-08-11 10:59 ` Hans Hagen
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas A. Schmitz @ 2009-08-11  9:34 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Hi all,

I'm working on my Greek module again and am trying to filter and  
massage the input via lpeg, but there's something I don't quite get.  
As a minimal example: suppose I want to substitute A and B in my input  
with X and leave all other letters alone. Here's my attempt:

\startluacode
do
     local replace = {
         A = "X",
         B = "X",
     }

     local dosub = (lpeg.Cs(1)) / replace

     local subs =
         (dosub)^0

     function test (string)
	tex.sprint(lpeg.match(subs,string))
     end
end
\stopluacode

\def\Substitute#1{\ctxlua{test("#1")}}

\starttext

\Substitute{ABC}

\stoptext

It substitutes all right, but the "C" is not included in the stream  
which ctxlua gives to TeX. How can I modify my lpeg pattern?

Thanks, and all best

Thomas
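
[Editorial sketch: why the "C" goes missing.] According to the LPeg manual, a pattern divided by a table (a query capture) produces no captured value when the table lacks the key. The match therefore still consumes "C" but returns nothing for it. The snippet below is a minimal standalone check, assuming a Lua interpreter with the lpeg module installed (in LuaTeX, lpeg is already a global); the variable name `results` is illustrative and not from the thread.

```lua
local lpeg = lpeg or require("lpeg") -- global in LuaTeX, a module elsewhere

-- Same table and pattern as in the question above.
local replace = { A = "X", B = "X" }
local subs = (lpeg.Cs(1) / replace)^0

-- Collect every value returned by lpeg.match into a table.
local results = { lpeg.match(subs, "ABC") }

-- Two values come back ("X" and "X"); the lookup for "C" yields none.
print(#results, table.concat(results))
```

The pattern consumes all three characters, yet only two values come back, which is why tex.sprint never sees the "C".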
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________



* Re: lpeg substitution
  2009-08-11  9:34 lpeg substitution Thomas A. Schmitz
@ 2009-08-11 10:59 ` Hans Hagen
  2009-08-11 11:21   ` Thomas A. Schmitz
  0 siblings, 1 reply; 9+ messages in thread
From: Hans Hagen @ 2009-08-11 10:59 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Thomas A. Schmitz wrote:
> Hi all,
> 
> I'm working on my Greek module again and am trying to filter and massage 
> the input via lpeg, but there's something I don't quite get. As a 
> minimal example: suppose I want to substitute A and B in my input with X 
> and leave all other letters alone. Here's my attempt:

brrr ... massaging input ... can be dangerous ... anyhow, here you go

\startluacode
     local replace = {
         A = "X",
         B = "X",
     }
     setmetatable(replace, { __index = function(t,k)
         return k
     end })

     local dosub = (lpeg.Cs(1)) / replace

     local subs =
         (dosub)^0

     function test (string)
         tex.sprint(lpeg.match(subs,string))
     end

\stopluacode

\def\Substitute#1{\ctxlua{test("#1")}}

\starttext

\Substitute{thomas ABC whatever}

\stoptext

and yes, it's slow; the next variant is faster but takes a bit more 
memory (negligible compared to what is already taken)

     setmetatable(replace, { __index = function(t,k)
         t[k] = k
         return k
     end })
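
[Editorial sketch.] The difference between the two metatables can be observed outside ConTeXt. The following is a minimal illustration in plain Lua (no lpeg needed); the helper names `new_replace` and `lookup_all` and the counter `misses` are assumptions made for this example and do not appear in the thread. It counts how often the `__index` fallback runs when the same unmapped letter is looked up repeatedly.

```lua
-- Build the replacement table from the thread, with or without memoizing.
local function new_replace(memoize)
    local misses = 0 -- how many times the __index fallback ran (illustrative)
    local replace = { A = "X", B = "X" }
    setmetatable(replace, { __index = function(t, k)
        misses = misses + 1
        if memoize then
            t[k] = k -- store the key so later lookups skip the fallback
        end
        return k
    end })
    return replace, function() return misses end
end

-- Look up every character of a string and join the results.
local function lookup_all(replace, str)
    local out = {}
    for c in str:gmatch(".") do
        out[#out + 1] = replace[c]
    end
    return table.concat(out)
end

local plain, plain_misses = new_replace(false)
local cached, cached_misses = new_replace(true)

local r1 = lookup_all(plain, "ABCCC")
local r2 = lookup_all(cached, "ABCCC")

print(r1, plain_misses())  -- fallback runs for every "C"
print(r2, cached_misses()) -- fallback runs only for the first "C"
```

Both variants return the same string; the memoizing one trades one extra table entry per distinct unmapped character for fewer function calls.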


-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: lpeg substitution
  2009-08-11 10:59 ` Hans Hagen
@ 2009-08-11 11:21   ` Thomas A. Schmitz
  2009-08-11 12:35     ` Hans Hagen
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas A. Schmitz @ 2009-08-11 11:21 UTC (permalink / raw)
  To: mailing list for ConTeXt users


On Aug 11, 2009, at 12:59 PM, Hans Hagen wrote:

> Thomas A. Schmitz wrote:
>> Hi all,
>> I'm working on my Greek module again and am trying to filter and  
>> massage the input via lpeg, but there's something I don't quite  
>> get. As a minimal example: suppose I want to substitute A and B in  
>> my input with X and leave all other letters alone. Here's my attempt:
>
> brrr ... massaging input ... can be dangerous ... anyhow, here you go

Thanks Hans! I know it's not a good thing, but I do want to find a  
method to support ASCII transliteration in mkiv. I have learnt lots of  
interesting things about fea files in the past, the most important  
being that they are not the way to go (something that Taco had told me  
very early in my attempts; I should have listened to him...) So now I  
try to transform the input via lpeg. It's just a stopgap, but maybe  
better than nothing.

Thanks, all best

Thomas

* Re: lpeg substitution
  2009-08-11 11:21   ` Thomas A. Schmitz
@ 2009-08-11 12:35     ` Hans Hagen
  2009-08-11 15:14       ` Thomas A. Schmitz
  0 siblings, 1 reply; 9+ messages in thread
From: Hans Hagen @ 2009-08-11 12:35 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Thomas A. Schmitz wrote:
> 
> On Aug 11, 2009, at 12:59 PM, Hans Hagen wrote:
> 
>> Thomas A. Schmitz wrote:
>>> Hi all,
>>> I'm working on my Greek module again and am trying to filter and 
>>> massage the input via lpeg, but there's something I don't quite get. 
>>> As a minimal example: suppose I want to substitute A and B in my 
>>> input with X and leave all other letters alone. Here's my attempt:
>>
>> brrr ... massaging input ... can be dangerous ... anyhow, here you go
> 
> Thanks Hans! I know it's not a good thing, but I do want to find a 
> method to support ASCII transliteration in mkiv. I have learnt lots of 
> interesting things about fea files in the past, the most important being 
> that they are not the way to go (something that Taco had told me very 
> early in my attempts; I should have listened to him...) So now I try to 
> transform the input via lpeg. It's just a stopgap, but maybe better than 
> nothing.

what exactly do you want to replace ?

Hans



* Re: lpeg substitution
  2009-08-11 12:35     ` Hans Hagen
@ 2009-08-11 15:14       ` Thomas A. Schmitz
  2009-08-11 22:01         ` Hans Hagen
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas A. Schmitz @ 2009-08-11 15:14 UTC (permalink / raw)
  To: mailing list for ConTeXt users


On Aug 11, 2009, at 2:35 PM, Hans Hagen wrote:

> what exactly do you want to replace ?
>
> Hans
>

I'm trying to use the lpegs you have written for mtx-babel.lua, but  
instead of rewriting the Greek ASCII stuff to a new file, I want to  
convert it to proper UTF Greek and feed that to mkiv. As I said, it's  
a stopgap, but better than nothing...

Thomas
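
[Editorial sketch.] The kind of conversion described above might look roughly like the following. This is only an illustration under stated assumptions: the three-letter `greek` table is hypothetical and is not taken from mtx-babel.lua, and the code assumes a Lua interpreter with the lpeg module available (in LuaTeX, lpeg is already a global). It wraps the whole pattern in a substitution capture so that unmapped characters are copied through unchanged.

```lua
local lpeg = lpeg or require("lpeg") -- global in LuaTeX, a module elsewhere

-- Hypothetical mapping for illustration only; the real tables in
-- mtx-babel.lua are far larger and are not reproduced here.
local greek = {
    a = "α",
    b = "β",
    g = "γ",
}

-- Build an ordered choice over all mapped letters. The iteration order of
-- pairs does not matter here because the keys are distinct single characters.
local letter = lpeg.P(false)
for ascii, utf in pairs(greek) do
    letter = letter + lpeg.P(ascii) / utf
end

-- Mapped letters are replaced; any other character matches the bare 1
-- and is kept as-is by the substitution capture Cs.
local translit = lpeg.Cs((letter + 1)^0)

local converted = lpeg.match(translit, "gab!")
print(converted)
```

Inside \startluacode, the converted string would be handed to tex.sprint in the same way as in the examples earlier in the thread.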

* Re: lpeg substitution
  2009-08-11 15:14       ` Thomas A. Schmitz
@ 2009-08-11 22:01         ` Hans Hagen
  2009-08-12  6:51           ` Taco Hoekwater
  0 siblings, 1 reply; 9+ messages in thread
From: Hans Hagen @ 2009-08-11 22:01 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Thomas A. Schmitz wrote:
> 
> On Aug 11, 2009, at 2:35 PM, Hans Hagen wrote:
> 
>> what exactly do you want to replace ?
>>
>> Hans
>>
> 
> I'm trying to use the lpegs you have written for mtx-babel.lua, but 
> instead of rewriting the greek ASCII stuff to a new file, I want to 
> convert it to proper utf Greek and feed that to mkiv. As I said, it's a 
> stopgap, but better than nothing...

if there is more demand for that i can consider making a substituter 
that operates on the node list in an early stage; that way it is 
controlled by attributes and there is no interference with macro 
definitions, reading modules and such

Hans




* Re: lpeg substitution
  2009-08-11 22:01         ` Hans Hagen
@ 2009-08-12  6:51           ` Taco Hoekwater
  2009-08-12  8:32             ` Hans Hagen
  0 siblings, 1 reply; 9+ messages in thread
From: Taco Hoekwater @ 2009-08-12  6:51 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Hans Hagen wrote:
> Thomas A. Schmitz wrote:
>>
>> On Aug 11, 2009, at 2:35 PM, Hans Hagen wrote:
>>
>>> what exactly do you want to replace ?
>>>
>>> Hans
>>>
>>
>> I'm trying to use the lpegs you have written for mtx-babel.lua, but 
>> instead of rewriting the greek ASCII stuff to a new file, I want to 
>> convert it to proper utf Greek and feed that to mkiv. As I said, it's 
>> a stopgap, but better than nothing...
> 
> if there is more demand for that i can consider making a substituter 
> that operates on the node list in an early stage; that way it is 
> controlled by attributes and there is no interference with macro 
> definitions, reading modules and such

Macro interaction may be an issue, but I believe it is still better
for transliterations to work on the actual input strings or on tokens.
For example, you may want to run macros (like \delimitedtext) on the
converted output.

If I had to do this myself, I would probably work on token lists,
even though it is quite a bit less convenient than strings. I remember
we have talked about writing extended lpegs that work directly on
token- and nodelists, that would perhaps be the nicest solution in
the long run. Anyway, I am just thinking out loud.

Best wishes,
Taco

* Re: lpeg substitution
  2009-08-12  6:51           ` Taco Hoekwater
@ 2009-08-12  8:32             ` Hans Hagen
  2009-08-12  9:04               ` Thomas A. Schmitz
  0 siblings, 1 reply; 9+ messages in thread
From: Hans Hagen @ 2009-08-12  8:32 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Taco Hoekwater wrote:
> Hans Hagen wrote:
>> Thomas A. Schmitz wrote:
>>>
>>> On Aug 11, 2009, at 2:35 PM, Hans Hagen wrote:
>>>
>>>> what exactly do you want to replace ?
>>>>
>>>> Hans
>>>>
>>>
>>> I'm trying to use the lpegs you have written for mtx-babel.lua, but 
>>> instead of rewriting the greek ASCII stuff to a new file, I want to 
>>> convert it to proper utf Greek and feed that to mkiv. As I said, it's 
>>> a stopgap, but better than nothing...
>>
>> if there is more demand for that i can consider making a substituter 
>> that operates on the node list in an early stage; that way it is 
>> controlled by attributes and there is no interference with macro 
>> definitions, reading modules and such
> 
> Macro interaction may be an issue, but I believe it is still better
> for transliterations to work on the actual input strings or on tokens.
> For example, you may want to run macros (like \delimitedtext) on the
> converted output.

my main concern with that is that one then needs to control precisely 
where to apply such translations; for instance turning a< into something 
else might also mess up math, and it means adding all kinds of extra 
checking and housekeeping (for instance when loading modules or whatever 
in the middle of such a conversion)

of course when the to be converted fragments are tagged it's trivial to 
use lpeg and avoid \cs's

btw, i think that delimitedtext would work anyway as we only replace 
"glyph a glyph<" by something else then

anyway, it all depends on the task and hopefully unicode will solve all 
our problems (and not introduce more)

Hans
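
[Editorial sketch.] The remark above about using lpeg on tagged fragments while avoiding \cs's could be realized along these lines. This is not code from the thread: the definition of `cs` is a simplified assumption (a backslash followed by ASCII letters, or by one other character), and the A/B to X mapping is reused from the original question. It assumes a Lua interpreter with the lpeg module available (in LuaTeX, lpeg is already a global).

```lua
local lpeg = lpeg or require("lpeg") -- global in LuaTeX, a module elsewhere
local P, R, S, Cs = lpeg.P, lpeg.R, lpeg.S, lpeg.Cs

-- Simplified control sequence: backslash plus letters, or plus one character.
local cs = P("\\") * (R("az", "AZ")^1 + 1)

-- Try control sequences first so they are copied verbatim, then replace
-- A and B by X, then copy any remaining character unchanged.
local subs = Cs((cs + S("AB") / "X" + 1)^0)

local result = lpeg.match(subs, "\\BAR ABC")
print(result)
```

Because `cs` comes first in the ordered choice, the letters inside \BAR are never offered to the substitution rule, while the plain text after it is still converted.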


* Re: lpeg substitution
  2009-08-12  8:32             ` Hans Hagen
@ 2009-08-12  9:04               ` Thomas A. Schmitz
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas A. Schmitz @ 2009-08-12  9:04 UTC (permalink / raw)
  To: mailing list for ConTeXt users


On Aug 12, 2009, at 10:32 AM, Hans Hagen wrote:

>>> if there is more demand for that i can consider making a  
>>> substituter that operates on the node list in an early stage; that  
>>> way it is controlled by attributes and there is no interference  
>>> with macro definitions, reading modules and such
>> Macro interaction may be an issue, but I believe it is still better
>> for transliterations to work on the actual input strings or on  
>> tokens.
>> For example, you may want to run macros (like \delimitedtext) on the
>> converted output.
>
> my main concern with that is that one then needs to control  
> precisely where to apply such translations; for instance turning a<  
> into something   else might also mess up math and adding all kind of  
> extra checking and housekeeping (for instance when loading modules  
> or whatever in the middle of such a conversion)
>
> of course when the to be converted fragments are tagged it's trivial  
> to use lpeg and avoid \cs's
>
> btw, i think that delimitedtext would work anyway as we only replace  
> "glyph a glyph<" by something else then
>
> anyway, it all depends on the task and hopefully unicode will solve  
> all our problems (and not introduce more)

Well, in my case the fragments are already delimited, so it's  
relatively easy. However, I wonder whether there are many applications  
for this. I don't see too many, but maybe I'm wrong. From my POV,  
there is no need for this in the core, but maybe others see more usage.

Thomas
