* lua questions
From: Thomas A. Schmitz @ 2009-01-22 21:24 UTC
To: mailing list for ConTeXt users
Hi all,
this is a bit OT and should probably go to a lua list, but since some
people here are very proficient in lua and I feel less embarrassed
about noob questions here... I have a half-functioning python script
to convert entries from a classics database into the bibtex format. I
want to rewrite it in lua and make it more functional. Three little
problems/questions:
1. I found a script to convert Roman numerals via lpeg here:
http://lua-users.org/wiki/LpegRecipes
but it uses the syntax lpeg.Ca, which my lpeg doesn't recognize and
which I can't find in the lpeg manual. According to a talk by Roberto
Ierusalimschy, "lpeg.Ca(patt) - "accumulates" the nested captures."
(http://www.inf.puc-rio.br/~roberto/lpeg/slides-lpeg-workshop2008.pdf)
Is this obsolete? Has it been replaced by anything?
2. How can I check if a string begins with a class of words "(Der |Die
|Das |The |An )" etc. and strip these words from the string? I do it
with a compiled regexp in python, but "Programming in Lua" has this to
say: "Unlike some other systems, in Lua a modifier can only be applied
to a character class; there is no way to group patterns under a
modifier. For instance, there is no pattern that matches an optional
word (unless the word has only one letter). Usually you can circumvent
this limitation using some of the advanced techniques that we will see
later." I haven't found these techniques yet.
3. How can I compare strings with utf8 characters? My naive approach
if string.find(record, "Résumé")
doesn't appear to work (while the same method does work if the string
has only ASCII characters).
Sorry if this is OT, and I'll be grateful for any pointers.
Thomas
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!
maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage : http://www.pragma-ade.nl / http://tex.aanhet.net
archive : https://foundry.supelec.fr/projects/contextrev/
wiki : http://contextgarden.net
___________________________________________________________________________________
* Re: lua questions
From: Hans Hagen @ 2009-01-23 11:13 UTC
To: mailing list for ConTeXt users
Thomas A. Schmitz wrote:
> Hi all,
>
> this is a bit OT and should probably go to a lua list, but since some
> people here are very proficient in lua and I feel less embarrassed about
> noob questions here... I have a half-functioning python script to
> convert entries from a classics database into the bibtex format. I want
> to rewrite it in lua and make it more functional. Three little
> problems/questions:
>
> 1. I found a script to convert Roman numerals via lpeg here:
> http://lua-users.org/wiki/LpegRecipes but it uses the syntax lpeg.Ca
> which my lpeg doesn't recognize and which I can't find in the lpeg
> manual. According to a talk by Roberto Ierusalimschy, "lpeg.Ca(patt) -
> "accumulates" the nested captures."
> (http://www.inf.puc-rio.br/~roberto/lpeg/slides-lpeg-workshop2008.pdf)
> Is this obsolete, has it been replaced by anything?
here is a variant that implements a function (and does not use the env
trick)
do
  -- Inside an lpeg.Ca accumulator, each function capture receives the
  -- running total as its first argument and returns the new total.
  local add = function (x,y) return x+y end
  local P,Ca,Cc = lpeg.P,lpeg.Ca,lpeg.Cc
  local symbols = {
    I=1,V=5,X=10,L=50,C=100,D=500,M=1000,
    IV=4,IX=9,XL=40,XC=90,CD=400,CM=900 }
  local adders = { }
  for s,n in pairs(symbols) do adders[s] = P(s)*Cc(n)/add end
  -- thousands, hundreds, tens, units (XC added so that 90 is recognized)
  local MS = adders.M^0
  local CS = (adders.D*adders.C^(-4)+adders.CD+adders.CM+adders.C^(-4))^(-1)
  local XS = (adders.L*adders.X^(-4)+adders.XL+adders.XC+adders.X^(-4))^(-1)
  local IS = (adders.V*adders.I^(-4)+adders.IX+adders.IV+adders.I^(-4))^(-1)
  local p = Ca(Cc(0)*MS*CS*XS*IS)
  function string:romantonumber()
    return p:match(self:upper())
  end
end
print(string.romantonumber("MMIX"))    -- a well-formed numeral
print(string.romantonumber("MMIIIX"))  -- malformed: only the prefix "MMIII" is consumed
just run such a script using
  mtxrun --script yourscript.lua
(as luatex (texlua) has the latest lpeg built in)
> 2. How can I check if a string begins with a class of words "(Der |Die
> |Das |The |An )" etc. and strip these words from the string? I do it
> with a compiled regexp in python, but "Programming in lua" has this to
> say: "Unlike some other systems, in Lua a modifier can only be applied
> to a character class; there is no way to group patterns under a
> modifier. For instance, there is no pattern that matches an optional
> word (unless the word has only one letter). Usually you can circumvent
> this limitation using some of the advanced techniques that we will see
> later." I haven't found these techniques yet.
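local stripped = {
  "Der", "Die", "Das"
}
-- build an ordered choice of all words, starting from a pattern that never matches
local p = lpeg.P(false)
for k, v in ipairs(stripped) do
  p = p + lpeg.P(v)
end
local w = p * " "
-- Cs substitutes: a listed word plus its space becomes "", any other character is kept
-- (this strips the words anywhere in the string, not only at the start)
local stripper = lpeg.Cs(((w/"") + lpeg.C(1))^0)
lpeg.print(stripper) -- debugging aid: dumps the compiled pattern
local str = "Germans somehow always talk about Der Thomas and Der Hans"
print(stripper:match(str))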
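The question asked about an article at the very beginning of the string only. For that narrower case plain Lua may be enough, because a literal prefix comparison needs no alternation. The following is a minimal sketch, not code from this thread: the helper name strip_article and the word list are assumptions made for illustration.

```lua
-- Hypothetical helper: strip one leading article (including its trailing space).
local articles = { "Der ", "Die ", "Das ", "The ", "An " }

local function strip_article(s)
  for _, article in ipairs(articles) do
    -- compare the prefix literally; no pattern matching is involved
    if s:sub(1, #article) == article then
      return s:sub(#article + 1), true
    end
  end
  -- no article found: return the string unchanged
  return s, false
end

print(strip_article("Der Zauberberg"))
print(strip_article("Derrida"))
```

Because each entry carries its trailing space, a word that merely starts with the same letters (such as "Derrida") is left alone. The second return value tells the caller whether anything was removed.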
> 3. How can I compare strings with utf8 characters? My naive approach
> if string.find(record, "Résumé")
> doesn't appear to work (while the same method does work if the string
> has only ASCII characters).
since lua is 8-bit clean, utf-8 should just work
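A minimal sketch of what "8-bit clean" means in practice, and of one situation in which a byte-wise search can still fail. Whether this is what went wrong in the case reported here is an assumption; the thread does not say. The strings are written with decimal byte escapes so that the example does not depend on how the script file itself is encoded.

```lua
-- "Résumé" with precomposed é (U+00E9, UTF-8 bytes 195 169)
local composed = "R\195\169sum\195\169"
-- the same word spelled as "e" plus combining acute accent (U+0301, bytes 204 129)
local decomposed = "Re\204\129sume\204\129"

local record = "Le " .. composed .. " de la th\195\168se"

-- plain search (fourth argument true) compares bytes, so the composed form is found
print(string.find(record, composed, 1, true))
-- the decomposed spelling looks the same on screen but is a different byte sequence
print(string.find(record, decomposed, 1, true))
```

If the search string and the data come from sources that use different encodings or different Unicode normalization forms, the bytes differ and string.find reports no match, even though both display as the same word.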
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
| www.pragma-pod.nl
-----------------------------------------------------------------
* Re: lua questions
From: Thomas A. Schmitz @ 2009-01-23 13:04 UTC
To: mailing list for ConTeXt users
On Jan 23, 2009, at 12:13 PM, Hans Hagen wrote:
>
> here is a variant that implements a function (and does not use the
> env trick)
>
> do
> local add = function (x,y) return x+y end
> local P,Ca,Cc= lpeg.P,lpeg.Ca,lpeg.Cc
> local symbols =
> { I=1,V=5,X=10,L=50,C=100,D=500,M=1000,IV=4,IX=9,XL=40,CD=400,CM=900}
> local adders = { }
> for s,n in pairs(symbols) do adders[s] = P(s)*Cc(n)/add end
> local MS = adders.M^0
> local CS = (adders.D*adders.C^(-4)+adders.CD+adders.CM
> +adders.C^(-4))^(-1)
> local XS = (adders.L*adders.X^(-4)+adders.XL+adders.X^(-4))^(-1)
> local IS = (adders.V*adders.I^(-4)+adders.IX+adders.IV
> +adders.I^(-4))^(-1)
> local p = Ca(Cc(0)*MS*CS*XS*IS)
> function string:romantonumber()
> return p:match(self:upper())
> end
> end
>
> print(string.romantonumber("MMIX"))
> print(string.romantonumber("MMIIIX"))
>
>
> just run such script using
>
> mtxrun --script yourscript.lua
>
> as luatex (texlua) has the latest lpeg built in)
>
Brilliant! This one does work when I use it with luatex (though not with
my system lua, even though I have the latest released version of lpeg,
0.9, installed). Bizarre...
>
>> 2. How can I check if a string begins with a class of words "(Der |
>> Die |Das |The |An )" etc. and strip these words from the string? I
>> do it with a compiled regexp in python, but "Programming in lua"
>> has this to say: "Unlike some other systems, in Lua a modifier can
>> only be applied to a character class; there is no way to group
>> patterns under a modifier. For instance, there is no pattern that
>> matches an optional word (unless the word has only one letter).
>> Usually you can circumvent this limitation using some of the
>> advanced techniques that we will see later." I haven't found these
>> techniques yet.
>
> local stripped = {
> "Der", "Die", "Das"
> }
>
> local p = lpeg.P(false)
>
> for k, v in ipairs(stripped) do
> p = p + lpeg.P(v)
> end
>
> local w = p * " "
>
> local stripper = lpeg.Cs(((w/"") + lpeg.C(1))^0)
>
> lpeg.print(stripper)
>
> str = "Germans somehow always talk about Der Thomas and Der Hans"
>
> print(stripper:match(str))
>
Brilliant again! I can run with that, looks great! And who doesn't
want a "local stripper" in his code?
>
>> 3. How can I compare strings with utf8 characters? My naive approach
>> if string.find(record, "Résumé")
>> doesn't appear to work (while the same method does work if the
>> string has only ASCII characters).
>
> since lua is 8 bit clean utf should just work
OK, then the problem must be somewhere else. I'll investigate.
Thanks a lot, and best wishes
Thomas
* Re: lua questions
From: Thomas A. Schmitz @ 2009-01-29 12:35 UTC
To: mailing list for ConTeXt users
On Jan 23, 2009, at 12:13 PM, Hans Hagen wrote:
> Thomas A. Schmitz wrote:
>> it uses the syntax lpeg.Ca which my lpeg doesn't recognize and
>> which I can't find in the lpeg manual.
[useful information snipped]
>
> just run such script using
>
> mtxrun --script yourscript.lua
>
> as luatex (texlua) has the latest lpeg built in)
>
Just one remark: my lpeg is
/*
** $Id: lpeg.c,v 1.98 2008/10/11 20:20:43 roberto Exp $
and doesn't have the lpeg.Ca pattern. The lpeg that comes with luatex is
/*
** $Id: lpeg.c,v 1.86 2008/03/07 17:20:19 roberto Exp $
so it's older, and it does have the lpeg.Ca pattern accumulator.
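This observation fits the LPeg release history: version 0.9 replaced the accumulator capture lpeg.Ca with a fold capture, lpeg.Cf, so scripts written for lpeg.Ca need small changes there. The following is a sketch of a Roman-numeral converter built on lpeg.Cf. It assumes a standalone lpeg (0.9 or later) that can be loaded with require; the function name roman_to_number is hypothetical, and the end-of-string check is an addition made for this sketch.

```lua
local lpeg = require("lpeg")
local P, Cc, Cf = lpeg.P, lpeg.Cc, lpeg.Cf

local function add(x, y) return x + y end

local symbols = {
  I = 1, V = 5, X = 10, L = 50, C = 100, D = 500, M = 1000,
  IV = 4, IX = 9, XL = 40, XC = 90, CD = 400, CM = 900,
}

-- each symbol matches its letters and produces its value as a constant capture
local value = {}
for s, n in pairs(symbols) do value[s] = P(s) * Cc(n) end

-- thousands, hundreds, tens, units
local MS = value.M^0
local CS = (value.D * value.C^-4 + value.CD + value.CM + value.C^-4)^-1
local XS = (value.L * value.X^-4 + value.XL + value.XC + value.X^-4)^-1
local IS = (value.V * value.I^-4 + value.IX + value.IV + value.I^-4)^-1

-- Cf folds all captured values with add, starting from the initial 0;
-- -P(1) (an addition for this sketch) rejects trailing characters
local roman = Cf(Cc(0) * MS * CS * XS * IS, add) * -P(1)

local function roman_to_number(s)
  return roman:match(s:upper())
end

print(roman_to_number("MMIX"))
print(roman_to_number("mcmxcix"))
print(roman_to_number("MMIIIX"))
```

The difference from the lpeg.Ca style is where the function goes: with lpeg.Ca each symbol carried its own function capture, whereas lpeg.Cf takes the folding function once and the symbols only produce values.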
And can I ask one more question about lpeg? Suppose I have the string
"{\em This string is \quotation{heavily} emphasized.}"
and want to transform that into something like
"\color[red]{This string is \quotation{heavily} emphasized.}"
How would I go about this using lpeg? I must use an lpeg.V somewhere,
but I can't figure out where and how.
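One possible approach, sketched under assumptions since the thread itself gives no answer: a small recursive grammar in which lpeg.V refers to the rule for brace-delimited text, wrapped in lpeg.Cs so that only the opening "{\em " is rewritten. The rule names text, em and group are made up for this sketch, only the exact spelling "{\em " followed by a space is recognized, and a standalone lpeg loadable with require is assumed.

```lua
local lpeg = require("lpeg")
local P, S, V, Cs = lpeg.P, lpeg.S, lpeg.V, lpeg.Cs

local converter = Cs(P{
  "text",
  -- a run of emphasis groups, other brace groups, and ordinary characters
  text  = (V"em" + V"group" + (1 - S"{}"))^0,
  -- "{\em " is rewritten; the body is scanned recursively, the closing brace is kept
  em    = (P"{\\em " / "\\color[red]{") * V"text" * P"}",
  -- any other balanced group is copied through, apart from rewrites nested inside it
  group = P"{" * V"text" * P"}",
})

local input = "{\\em This string is \\quotation{heavily} emphasized.}"
print(converter:match(input))
```

The recursion through V"text" is what lets the nested \quotation{...} group pass through intact: the closing brace that ends the emphasis is the one matching its own opener, not the first "}" encountered.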
Thanks, and all best
Thomas
* Re: lua questions
From: Taco Hoekwater @ 2009-01-29 12:42 UTC
To: mailing list for ConTeXt users
Thomas A. Schmitz wrote:
>
>
> The lpeg that comes with luatex is
lpeg in luatex is still 0.8.1
Best wishes,
Taco