From: "Mojca Miklavec"
Newsgroups: gmane.comp.tex.context
Subject: Re: Module `database' and UTF8?
Date: Sun, 21 Jan 2007 02:55:23 +0100
Message-ID: <6faad9f00701201755g7ac2dd67j634e8bbbb1ee4da9@mail.gmail.com>
References: <45B271C0.7020903@econ.muni.cz>
Reply-To: mailing list for ConTeXt users

Hello,

you seem to be the first one (besides me and Taco) to complain about
that bug, so perhaps Hans will now have one more reason to fix it
(or perhaps to port it to Lua, but then you'll have to wait for some
time).

My observations are (were - I'm currently not at a ConTeXt machine) as
follows:
- 8-bit encodings seem to work OK (I had to convert a few files to
  cp1250 just because of that bug, but then it worked OK)
- there are two completely different approaches: one imitates pure CSV
  (Taco's) and one parses anything "TeX-ish" and also obeys TeX commands
  (Hans's)
- Taco's approach has been fixed (UTF-8 worked OK after a patch), so if
  you don't need TeX commands inside your tables, set "quotechar" to
  whatever, which will trigger Taco's mode
- Hans's approach has a bug in UTF-8 handling.
  But the problem only appears if the very first character in the
  cell is non-ASCII.

So in the case of
    březen duben květen červen
it's probably only the last word that is causing problems.

A temporary solution might be to define quotechar (which is probably
what you have already tried) or to wait for Hans to fix it.

I find the module really useful, but I don't understand a bit of that
file (even Taco admitted that it was the "worst-readable" macro he has
ever written ;)

I assume that you've seen the MyWay about it
(http://wiki.contextgarden.net/My_Way) - feel free to post any
comments about its unreadability ;).

Mojca

On 1/20/07, Michal Kvasnicka wrote:
> Good evening.
>
> I apologize that I bother you now so much with my own problems but I
> have to rewrite some of my macros right now for some reason.
>
> I found there is a great `database' module. I tried it with ConTeXt
> version 2007.01.12 15:56, perl TeXExec 5.4.3, and pdfetex 1.40.1 under
> SuSE Linux 10.1. I use input encoding regime utf8, and fonts in ec-lm
> encoding. I set it this way:
> \defineseparatedlist[CSV]
> [separator=tab,%{,}, %quotechar={"},
> before=\bTABLE, after=\eTABLE,
> first=\bTR, last=\eTR,
> left=\bTD, right=\eTD]
> And call it with \processseparatedfile[CSV][file.csv].
>
> It works well with words without accents. If there is an accented letter
> in the file.csv, it fails with this error message:
> ! Argument of \utftwouniglph has an extra }.
>
> \par
>
> }
> \dodoprocessseplist #1#2 ->\edef \!!stringa {#1}
> \ifx \edef@relax
> \!!stringa ...
> PŘÍJMY leden únor
> březen duben květen červen
> červenec srpe...
>
> \doprocessseplist ...elax ->\dodoprocessseplist #1
> \relax \relax
> \relax \end
> \doprocessseparatedfileline ...plist \line \relax
> \else \expanded
> {\processq...
> ...
>
> Can you help me to make it work?
>
> Many thanks. Yours
> Michal Kvasnicka
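
For reference, the quotechar workaround suggested in the reply could
look roughly like the sketch below. It reuses the setup from the quoted
message and only turns the commented-out quotechar into a real setting,
which, according to the reply, should switch the module to the pure-CSV
(Taco's) parser. The \usemodule[database] line and the
\starttext/\stoptext wrapper are assumptions added to make the file
self-contained; the regime and font setup are left out, and the sketch
has not been tested against the versions listed above.

```tex
% Sketch only: enable quotechar so that the pure-CSV parser is used.
% Assumption: the module is loaded with \usemodule[database].
\usemodule[database]

\defineseparatedlist[CSV]
  [separator=tab,
   quotechar={"},% any quotechar is said to trigger the pure-CSV mode
   before=\bTABLE, after=\eTABLE,
   first=\bTR, last=\eTR,
   left=\bTD, right=\eTD]

\starttext
% file.csv is the tab-separated input file from the quoted message
\processseparatedfile[CSV][file.csv]
\stoptext
```

In this mode TeX commands inside the cells are presumably not
interpreted, so it only helps when the table contains plain text.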