From: Julian Becker
Newsgroups: gmane.comp.tex.context
Subject: Re: undefined control sequence bug with German umlaut in bibliography
Date: Sat, 4 Jun 2011 11:23:22 +0200
References: <20110604003158.GK20248@rae.vm.bytemark.co.uk>
To: mailing list for ConTeXt users

I can also add that in the first case, with the author name "Träger",
the generated bbl file looks messed up (Notepad++ doesn't recognize the
encoding as UTF-8). Changing the encoding to UTF-8 manually shows the
complete names "Träger" correctly, but the abbreviations (what should
have been "Trä06") seem to be messed up.

I'm not familiar with the intricacies and details of UTF-8 encoding, but
is it possible that a byte of the "ä" was cut off during the
abbreviation process?

The abbreviated "Trä06" seems to be incorrectly encoded (in hexadecimal)
as:
54 C3 A4 30 36,
while it should be:
54 72 C3 A4 30 36.

So the abbreviation process seems to ignore that some characters are
encoded over several bytes.

I attached the bbl files for both cases to this e-mail, since I don't
know what would happen to the encoding if I just pasted them as plain
text here.

Julian

2011/6/4 Julian Becker <becker.julian@gmail.com>

> Thank you everybody for your answers. Writing Tr{\"a}ger as Thomas
> suggested works well, but unfortunately I'm using Mendeley Desktop to
> manage my BibTeX file, and I can't seem to influence the way in which
> it encodes the special characters.
>
> @Mojca: Indeed, it also fails if I just write "Träger" in UTF-8
> encoding (however, "Schräger" works just fine; all combinations where
> the "ä" is in the third position of the word seem to fail). The error
> message is similar but slightly different now. The log file shows the
> following:
> ---------------
> system          > begin file test.tex at line 3
> publications    > loading database from test.bbl
> (test.bbl
> ! String contains an invalid utf-8 sequence.
> l.1 \setuppublicationlist[samplesize={Tr
>                                         Ã06},totalnumber=1]
> A funny symbol that I can't read has just been (re)read.
> Just continue, I'll change it to 0xFFFD.
>
> ! String contains an invalid utf-8 sequence.
> l.5 n=1,s=Tr
>             Ã06]
> A funny symbol that I can't read has just been (re)read.
> Just continue, I'll change it to 0xFFFD.
>
> ! String contains an invalid utf-8 sequence.
> \doifassignmentelse ...gnmentelse \detokenize {#1}
>                                                   =@@\@end@ \expandafter \se...
> \dostartpublication ... ->\doifassignmentelse {#1}
>                                                   {\getparameters [\??pb ][k...
> l.9 \stoppublication
>
> A funny symbol that I can't read has just been (re)read.
> Just continue, I'll change it to 0xFFFD.
>
> )
> -----------------------
>
> Julian
>
>
> 2011/6/4 Pontus Lurcock <pont@talvi.net>
>
>> On Fri 03 Jun 2011, Thomas A. Schmitz wrote:
>>
>> > But I admit it's not easy to know that, bibtex documentation is a
>> > real mess
>>
>> Patience please! ‘This document will be expanded when BibTEX version
>> 1.00 comes out’ -- BIBTEXing, February 8, 1988.
>>
>> :-)
>>
>> Pont
>>
>> ___________________________________________________________________________________
>> If your question is of interest to others as well, please add an entry to
>> the Wiki!
>>
>> maillist : ntg-context@ntg.nl /
>> http://www.ntg.nl/mailman/listinfo/ntg-context
>> webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
>> archive  : http://foundry.supelec.fr/projects/contextrev/
>> wiki     : http://contextgarden.net
>>
>> ___________________________________________________________________________________
>>
>
>
>
> --
> Julian Becker
> Institut für Angewandte Physik, R.123
> Westfälische Wilhelms-Universität Münster
> Corrensstr. 2/4
> 48149 Münster / Westfalen
> Tel. 0251 83-3 61 53
> Mob. 0151 599 848 29
> e-mail: j_beck16@uni-muenster.de
>
> "Keep thy heart with all diligence; for it is the wellspring of life."
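The suspicion raised in this message (a multi-byte character being cut in half while the label is abbreviated) can be illustrated with a small Python sketch. It is only a model under stated assumptions: the function names are hypothetical, and the idea that the label is built from the first three bytes of the name plus the last two digits of the year is inferred from the symptoms described here, not taken from the BibTeX or ConTeXt sources.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def byte_prefix_label(name: str, year: str, n: int = 3) -> bytes:
    # Hypothetical model of the suspected bug: the name is cut after
    # n *bytes* of its UTF-8 encoding, which can split a character.
    return name.encode("utf-8")[:n] + year.encode("ascii")


def char_prefix_label(name: str, year: str, n: int = 3) -> bytes:
    # Character-aware variant: cut after n code points, then encode.
    return (name[:n] + year).encode("utf-8")


def hexdump(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


if __name__ == "__main__":
    # "\u00e4" is the letter a-umlaut, written as an escape so the
    # example does not depend on the encoding of this file.
    for name in ("Tr\u00e4ger", "Schr\u00e4ger"):
        broken = byte_prefix_label(name, "06")
        fixed = char_prefix_label(name, "06")
        print(name)
        print("  byte prefix:", hexdump(broken), "valid:", is_valid_utf8(broken))
        print("  char prefix:", hexdump(fixed), "valid:", is_valid_utf8(fixed))
```

In this model "Schräger" is unaffected because its first three bytes are plain ASCII, while "Träger" loses the second byte of the "ä", which would explain why only names with the umlaut in the third position fail.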
Content-Type: application/octet-stream; name="testA.bbl"
Content-Disposition: attachment; filename="testA.bbl"
Content-Transfer-Encoding: base64

XHNldHVwcHVibGljYXRpb25saXN0W3NhbXBsZXNpemU9e1RywzA2fSx0b3RhbG51bWJlcj0xXQoK
XHN0YXJ0cHVibGljYXRpb25baz1FbnRyeTEsdD1taXNjLAphPXt7VHLDpGdlcn19LHk9MjAwNiwK
bj0xLHM9VHLDMDZdClxhdXRob3JbXXtEfVtELl17fXtUcsOkZ2VyfQpccHVieWVhcnsyMDA2fQpc
dGl0bGV7e1NvbWUgRG9jdW1lbnR9fQpcc3RvcHB1YmxpY2F0aW9uCgo=

Content-Type: application/octet-stream; name="testB.bbl"
Content-Disposition: attachment; filename="testB.bbl"
Content-Transfer-Encoding: base64

XHNldHVwcHVibGljYXRpb25saXN0W3NhbXBsZXNpemU9e1NjaDA2fSx0b3RhbG51bWJlcj0xXQoK
XHN0YXJ0cHVibGljYXRpb25baz1FbnRyeTEsdD1taXNjLAphPXt7U2NocsOkZ2VyfX0seT0yMDA2
LApuPTEscz1TY2gwNl0KXGF1dGhvcltde0R9W0QuXXt9e1NjaHLDpGdlcn0KXHB1YnllYXJ7MjAw
Nn0KXHRpdGxle3tTb21lIERvY3VtZW50fX0KXHN0b3BwdWJsaWNhdGlvbgoK
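The bytes of the abbreviated label can be checked directly against the testA.bbl attachment. The Python sketch below decodes only the first base64 line of that attachment, copied verbatim from this message, and extracts whatever sits between the braces of `samplesize={...}`. The helper names are hypothetical and exist only for this illustration.

```python
import base64

# First base64 line of the testA.bbl attachment, copied from this message.
TESTA_FIRST_LINE = (
    "XHNldHVwcHVibGljYXRpb25saXN0W3NhbXBsZXNpemU9e1RywzA2fSx0b3RhbG51bWJlcj0xXQoK"
)


def sample_label(b64_line: str) -> bytes:
    """Return the raw bytes between the first '{' and the first '}'."""
    raw = base64.b64decode(b64_line)
    start = raw.index(b"{") + 1
    end = raw.index(b"}")
    return raw[start:end]


def hexdump(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


if __name__ == "__main__":
    label = sample_label(TESTA_FIRST_LINE)
    print(hexdump(label))  # prints: 54 72 C3 30 36
    try:
        label.decode("utf-8")
    except UnicodeDecodeError as err:
        # 0xC3 starts a two-byte sequence, but 0x30 is not a continuation byte.
        print("not valid UTF-8:", err.reason)
```

If this reading of the attachment is right, the label in the file is 54 72 C3 30 36 rather than the 54 C3 A4 30 36 quoted in the message: the "r" (72) is present and it is the continuation byte A4 of the "ä" that is missing. That would agree with the stray byte the log shows right after "Tr".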