ntg-context - mailing list for ConTeXt users
From: Hans Hagen <pragma@wxs.nl>
To: mailing list for ConTeXt users <ntg-context@ntg.nl>
Subject: Re: lua questions
Date: Fri, 23 Jan 2009 12:13:15 +0100
Message-ID: <4979A64B.1080906@wxs.nl>
In-Reply-To: <8FF223F7-EFDC-43BD-8765-CBB15A03DBAD@uni-bonn.de>

Thomas A. Schmitz wrote:
> Hi all,
> 
> this is a bit OT and should probably go to a lua list, but since some 
> people here are very proficient in lua and I feel less embarrassed about 
> noob questions here... I have a half-functioning python script to 
> convert entries from a classics database into the bibtex format. I want 
> to rewrite it in lua and make it more functional. Three little 
> problems/questions:
> 
> 1. I found a script to convert Roman numerals via lpeg here: 
> http://lua-users.org/wiki/LpegRecipes but it uses the syntax lpeg.Ca 
> which my lpeg doesn't recognize and which I can't find in the lpeg 
> manual. According to a talk by Roberto Ierusalimschy, "lpeg.Ca(patt) - 
> "accumulates" the nested captures." 
> (http://www.inf.puc-rio.br/~roberto/lpeg/slides-lpeg-workshop2008.pdf) 
> Is this obsolete, has it been replaced by anything?

here is a variant that implements a function (and does not use the env 
trick)

do
     local add = function (x,y) return x+y end
     local P,Ca,Cc= lpeg.P,lpeg.Ca,lpeg.Cc
     -- XC=90 added: without it "XC" is read as a plain X
     local symbols = { I=1,V=5,X=10,L=50,C=100,D=500,M=1000,IV=4,IX=9,XL=40,XC=90,CD=400,CM=900}
     -- one pattern per symbol that adds its value to the accumulator
     local adders = { }
     for s,n in pairs(symbols) do adders[s] = P(s)*Cc(n)/add end
     local MS = adders.M^0
     local CS = (adders.D*adders.C^(-4)+adders.CD+adders.CM+adders.C^(-4))^(-1)
     local XS = (adders.L*adders.X^(-4)+adders.XL+adders.XC+adders.X^(-4))^(-1)
     local IS = (adders.V*adders.I^(-4)+adders.IX+adders.IV+adders.I^(-4))^(-1)
     local p = Ca(Cc(0)*MS*CS*XS*IS)
     function string:romantonumber()
         return p:match(self:upper())
     end
end

print(string.romantonumber("MMIX"))
print(string.romantonumber("MMIIIX"))


just run such a script using

mtxrun --script yourscript.lua

as luatex (texlua) has the latest lpeg built in
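
For readers whose lpeg has no lpeg.Ca: as far as I know, later lpeg releases replaced the accumulator capture with the fold capture lpeg.Cf. The following is a minimal sketch of the same idea under that assumption, not the code from this mail. The name roman_to_number and the trailing "* -1" (which rejects input with leftover characters) are additions made for illustration.

```lua
-- lpeg is a global in texlua; plain Lua has to load the module
local lpeg = lpeg or require("lpeg")
local P, Cc, Cf = lpeg.P, lpeg.Cc, lpeg.Cf

local function add(x, y) return x + y end

local symbols = {
    I = 1, V = 5, X = 10, L = 50, C = 100, D = 500, M = 1000,
    IV = 4, IX = 9, XL = 40, XC = 90, CD = 400, CM = 900,
}

-- each symbol matches its letters and produces its value as a capture
local v = { }
for s, n in pairs(symbols) do v[s] = P(s) * Cc(n) end

-- thousands, hundreds, tens, units; two-letter forms are tried first
local MS = v.M^0
local CS = v.CM + v.CD + v.D * v.C^-3 + v.C^-3
local XS = v.XC + v.XL + v.L * v.X^-3 + v.X^-3
local IS = v.IX + v.IV + v.V * v.I^-3 + v.I^-3

-- Cf folds all captured values with add, starting from Cc(0);
-- "* -1" only succeeds at the end of the string
local p = Cf(Cc(0) * MS * CS * XS * IS, add) * -1

-- hypothetical helper name, chosen for this sketch
local function roman_to_number(s)
    return p:match(s:upper())
end

print(roman_to_number("MMIX"))   --> 2009
print(roman_to_number("MCMXC"))  --> 1990
print(roman_to_number("MMIIIX")) --> nil
```

Because the pattern is anchored at the end, a malformed numeral such as "MMIIIX" yields nil instead of the value of its longest valid prefix.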


> 2. How can I check if a string begins with a class of words "(Der |Die 
> |Das |The |An )" etc. and strip these words from the string? I do it 
> with a compiled regexp in python, but "Programming in lua" has this to 
> say: "Unlike some other systems, in Lua a modifier can only be applied 
> to a character class; there is no way to group patterns under a 
> modifier. For instance, there is no pattern that matches an optional 
> word (unless the word has only one letter). Usually you can circumvent 
> this limitation using some of the advanced techniques that we will see 
> later." I haven't found these techniques yet.

local stripped = {
     "Der", "Die", "Das"
}

local p = lpeg.P(false)

for k, v in ipairs(stripped) do
     p = p + lpeg.P(v)
end

local w = p * " "

local stripper = lpeg.Cs(((w/"") + lpeg.C(1))^0)

-- debugging aid that dumps the pattern; not every lpeg build has it
if lpeg.print then lpeg.print(stripper) end

local str = "Germans somehow always talk about Der Thomas and Der Hans"

print(stripper:match(str))
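
The stripper above removes the listed words wherever they occur. If only a leading article should go, as the question asks, a variant along these lines might do. It is a sketch: the extended word list is taken from the question, and the name strip_leading is made up for this example.

```lua
-- lpeg is a global in texlua; plain Lua has to load the module
local lpeg = lpeg or require("lpeg")
local P, Cs = lpeg.P, lpeg.Cs

local articles = { "Der", "Die", "Das", "The", "An" }

-- build the alternation Der + Die + Das + ...
local article = P(false)
for _, word in ipairs(articles) do
    article = article + P(word)
end

-- an article plus one space is replaced by nothing, at most once;
-- match() starts at position 1, so only a leading article is hit,
-- and P(1)^0 copies the rest of the string unchanged
local strip_leading = Cs((article * " " / "")^-1 * P(1)^0)

print(strip_leading:match("Der Zauberberg"))       --> Zauberberg
print(strip_leading:match("Anna Karenina"))        --> Anna Karenina
print(strip_leading:match("Wandern mit Der Hans")) --> Wandern mit Der Hans
```

Requiring the space after the article keeps names such as "Anna" or "Dieter" intact.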


> 3. How can I compare strings with utf8 characters? My naive approach
>    if string.find(record, "Résumé")
> doesn't appear to work (while the same method does work if the string 
> has only ASCII characters).

since lua is 8-bit clean, utf-8 should just work, provided the script file and the data use the same encoding
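
A small sketch of what "just work" means here, and of two ways it can still fail. These failure cases are guesses about what might have gone wrong, not something reported in the thread: the record could store "é" in decomposed form (e plus a combining accent) or in Latin-1. Byte escapes are used so the example does not depend on the encoding of the script file; the helper name contains is made up.

```lua
-- "Résumé" as UTF-8 bytes, precomposed form (é = 195 169)
local needle = "R\195\169sum\195\169"

-- the same text in three byte representations
local nfc    = "Title: R\195\169sum\195\169 of the argument"     -- UTF-8, precomposed
local nfd    = "Title: Re\204\129sume\204\129 of the argument"   -- UTF-8, e + combining acute
local latin1 = "Title: R\233sum\233 of the argument"             -- ISO 8859-1

-- plain (non-pattern) search: the fourth argument switches off magic characters
local function contains(haystack, wanted)
    return haystack:find(wanted, 1, true) ~= nil
end

print(contains(nfc, needle))    --> true
print(contains(nfd, needle))    --> false
print(contains(latin1, needle)) --> false
```

string.find compares bytes, so it succeeds only when both strings use the same byte sequence for each character.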

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

