Module `database' and UTF8?

ntg-context - mailing list for ConTeXt users

* Module `database' and UTF8?
From: Michal Kvasnicka @ 2007-01-20 19:47 UTC


Good evening.

I apologize for bothering you so much with my own problems lately, but
I have to rewrite some of my macros right now.

I found the great `database' module and tried it with ConTeXt
version 2007.01.12 15:56, perl TeXExec 5.4.3, and pdfetex 1.40.1 under
SuSE Linux 10.1. I use the utf8 input regime and fonts in the ec-lm
encoding. I set it up this way:
    \defineseparatedlist[CSV]
      [separator=tab,%{,}, %quotechar={"},
       before=\bTABLE, after=\eTABLE,
       first=\bTR, last=\eTR,
       left=\bTD, right=\eTD]
And call it with \processseparatedfile[CSV][file.csv].

It works well with words without accents. If there is an accented letter
in file.csv, it fails with this error message:
! Argument of \utftwouniglph has an extra }.
<inserted text>
                \par
<to be read again>
                   }
\dodoprocessseplist #1#2        ->\edef \!!stringa {#1}
                                                \ifx \edef@relax
\!!stringa ...
<argument> PŘÍJMY       leden   únor
                                březen  duben   květen  červen 
červenec        srpe...

\doprocessseplist ...elax ->\dodoprocessseplist #1
                                                        \relax  \relax
\relax \end
\doprocessseparatedfileline ...plist \line \relax
                                                  \else \expanded
{\processq...
...

Can you help me to make it work?

Many thanks. Yours
Michal Kvasnicka


* Re: Module `database' and UTF8?
From: Mojca Miklavec @ 2007-01-21  1:55 UTC


Hello,

you seem to be the first one (besides me and Taco) to complain about
that bug, so perhaps Hans will have one more reason to try to fix it
now (or perhaps to port it to Lua, but then you'll have to wait for
some time).

My observations are (or rather were, since I'm currently not at a
machine with ConTeXt installed) as follows:
- 8-bit encodings seem to work OK (I had to convert a few files to
cp1250 just because of that bug, but then it worked OK)
- there are two completely different approaches: one imitates pure
csv (Taco's) and one parses anything "TeX-ish" and also obeys TeX
commands (Hans's).
- Taco's approach has been fixed (utf-8 worked ok after a patch), so
if you don't need TeX commands inside your tables, set "quotechar" to
whatever, which will trigger Taco's mode
- Hans's approach has a bug in utf-8 handling. But the problem only
appears if the very first character in the cell is something non-ascii

So in the case of
    březen  duben  květen  červen
it's probably only the last word (the only one that starts with a
non-ascii character) that is causing problems.

A temporary solution might be to define quotechar (which is probably
what you have already tried) or to wait for Hans to fix it.
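
For reference, a minimal sketch of the quotechar workaround, reusing the
setup from the original message. It assumes that the cells contain no
TeX commands and that a double quote is an acceptable quote character
(the value is only an example); I could not test it here:

    \usemodule[database]

    % Same definition as in the original message, except that
    % quotechar is now set instead of commented out. The double
    % quote is an assumed value; per the notes above, setting it
    % to anything should select Taco's csv-style parser.
    \defineseparatedlist[CSV]
      [separator=tab,
       quotechar={"},
       before=\bTABLE, after=\eTABLE,
       first=\bTR, last=\eTR,
       left=\bTD, right=\eTD]

    \processseparatedfile[CSV][file.csv]

The trade-off is that TeX commands inside the cells are not obeyed in
this mode, so it only helps when the file holds plain data.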

I find the module really useful, but I don't understand a bit of that
file (even Taco admitted that it was the "worst-readable" macro he has
ever written ;). I assume that you've seen the MyWay about it
(http://wiki.contextgarden.net/My_Way) - feel free to post any
comments about its unreadability ;).

Mojca

