ntg-context - mailing list for ConTeXt users
From: Philipp Gesang <gesang@stud.uni-heidelberg.de>
To: hajtmar@gyza.cz, mailing list for ConTeXt users <ntg-context@ntg.nl>
Subject: Re: Problem with Lua processing UTF8 substrings
Date: Wed, 1 Feb 2012 21:05:55 +0100
Message-ID: <20120201200555.GA1647@phlegethon>
In-Reply-To: <4F2991C9.2090602@gyza.cz>



On 2012-02-01 20:26, Jaroslav Hajtmar wrote:
> I want to use Lua to write characters (substrings) from a string,
> but I get an error message:
> 
> ! String contains an invalid utf-8 sequence.
> 
> Can someone please help?

Have you tried the unicode library? The standard string library
operates on bytes, so extracting a single byte yields an incomplete
multibyte character whenever the code point lies beyond ASCII.
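
To make the byte-versus-character difference concrete, here is a
minimal sketch in plain Lua, independent of ConTeXt. The helper
utf8_length is a hypothetical name introduced for illustration (it is
not part of any library) and it assumes its input is valid UTF-8.

```lua
-- "š" (U+0161) is stored in UTF-8 as the two bytes 0xC5 0xA1.
-- Decimal escapes keep the example independent of the file encoding.
local s = "\197\161"

-- The standard string library counts and slices bytes.
local byte_count = string.len(s)
local first = string.sub(s, 1, 1)  -- lead byte only, invalid UTF-8 by itself

-- Hypothetical helper: count characters by skipping continuation
-- bytes, which always lie in the range 0x80..0xBF.
local function utf8_length(str)
  local n = 0
  for i = 1, #str do
    local b = string.byte(str, i)
    if b < 0x80 or b > 0xBF then
      n = n + 1
    end
  end
  return n
end

print(byte_count, string.byte(first), utf8_length(s))  -- prints 2, 197, 1
```

Passing a lone lead byte such as `first` on to TeX is what triggers
the "invalid utf-8 sequence" error quoted above.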

·································································

\def\mymacro#1{%
  \startluacode
    local utf = unicode.utf8
    local target = [==[\detokenize{#1}]==]
    for i=1, utf.len(target) do
      context(utf.sub(target,i,i)..", ")
    end
  \stopluacode%
}

%% alternatively, use utfcharacters
\define[1]\myothermacro{%
  \startluacode
    local result = { }
    for i in string.utfcharacters[==[\detokenize{#1}]==] do
      result[\letterhash result+1] = i
    end
    context(table.concat(result, ", "))
  \stopluacode
}

\starttext

\mymacro{šěřěžřýčřčžáýčý}\par
\myothermacro{šěřěžřýčřčžáýčý}

\stoptext

·································································

(Lazy people would just do a “local string = unicode.utf8” at the
top of the file.)
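
Outside LuaTeX, where neither unicode.utf8 nor string.utfcharacters
exists, the character loop can be approximated with a byte pattern.
This is only a sketch: utf_characters is a hypothetical name, and the
pattern assumes well-formed UTF-8 input.

```lua
-- Hypothetical stand-in for string.utfcharacters in stock Lua:
-- each match is one non-continuation byte followed by any number
-- of continuation bytes (0x80..0xBF, i.e. decimal 128..191).
local function utf_characters(str)
  return string.gmatch(str, "[^\128-\191][\128-\191]*")
end

-- "šěa" written with decimal escapes: š = C5 A1, ě = C4 9B, a = 61.
local target = "\197\161\196\155a"

local result = { }
for ch in utf_characters(target) do
  result[#result + 1] = ch
end

print(#result)  -- prints 3
```

Each entry in result is then a complete character, so joining the
entries with table.concat(result, ", ") never splits a multibyte
sequence.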

Regards
Philipp




> 
> Thanks
> Jaroslav Hajtmar
> 
> Here is my minimal example:
> 
> \def\mymacro#1{\ctxlua{for i=1, string.len('#1') do
> context(string.sub('#1',i,i)..", ") end}}
> 
> \starttext
> 
> %\mymacro{šěřěžřýčřčžáýčý} % Here is a problem
> \mymacro{asdfghjklqwertt} % Here is all OK
> 
> \stoptext
> 
> ___________________________________________________________________________________
> If your question is of interest to others as well, please add an entry to the Wiki!
> 
> maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
> webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
> archive  : http://foundry.supelec.fr/projects/contextrev/
> wiki     : http://contextgarden.net
> ___________________________________________________________________________________

-- 
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments
