From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.io/gmane.comp.tex.context/90585 Path: news.gmane.org!not-for-mail From: luigi scarso Newsgroups: gmane.comp.tex.context Subject: Re: Unicode question Date: Thu, 12 Mar 2015 21:41:40 +0100 Message-ID: References: <20150312084827.6683072a@arcor.com> <20150312185749.0c7f677e@arcor.com> <5501E135.3040501@wxs.nl> Reply-To: mailing list for ConTeXt users NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1832139697==" X-Trace: ger.gmane.org 1426192944 16634 80.91.229.3 (12 Mar 2015 20:42:24 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Thu, 12 Mar 2015 20:42:24 +0000 (UTC) To: mailing list for ConTeXt users Original-X-From: ntg-context-bounces@ntg.nl Thu Mar 12 21:42:13 2015 Return-path: Envelope-to: gctc-ntg-context-518@m.gmane.org Original-Received: from balder.ntg.nl ([5.39.185.229]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1YW9vo-0008Lt-3V for gctc-ntg-context-518@m.gmane.org; Thu, 12 Mar 2015 21:42:12 +0100 Original-Received: from localhost (localhost [127.0.0.1]) by balder.ntg.nl (Postfix) with ESMTP id 4758A1022C for ; Thu, 12 Mar 2015 21:42:11 +0100 (CET) X-Virus-Scanned: Debian amavisd-new at balder.ntg.nl Original-Received: from balder.ntg.nl ([127.0.0.1]) by localhost (balder.ntg.nl [127.0.0.1]) (amavisd-new, port 10024) with LMTP id ASyhGX7gKxEG for ; Thu, 12 Mar 2015 21:42:10 +0100 (CET) Original-Received: from balder.ntg.nl (localhost [IPv6:::1]) by balder.ntg.nl (Postfix) with ESMTP id 5BA2010233 for ; Thu, 12 Mar 2015 21:41:46 +0100 (CET) Original-Received: from localhost (localhost [127.0.0.1]) by balder.ntg.nl (Postfix) with ESMTP id 0CC76101FB for ; Thu, 12 Mar 2015 21:41:42 +0100 (CET) X-Virus-Scanned: Debian amavisd-new at balder.ntg.nl Original-Received: from balder.ntg.nl ([127.0.0.1]) by localhost (balder.ntg.nl [127.0.0.1]) (amavisd-new, port 10024) with LMTP id oo5ljd1hho-v for ; Thu, 12 
Mar 2015 21:41:41 +0100 (CET) Original-Received: from filter4-ams.mf.surf.net (filter4-ams.mf.surf.net [192.87.102.72]) by balder.ntg.nl (Postfix) with ESMTP id 5821C101F9 for ; Thu, 12 Mar 2015 21:41:41 +0100 (CET) Original-Received: from mail-wi0-x22d.google.com (mail-wi0-x22d.google.com [IPv6:2a00:1450:400c:c05::22d]) by filter4-ams.mf.surf.net (8.14.3/8.14.3/Debian-9.4) with ESMTP id t2CKfe7x023082 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=NOT) for ; Thu, 12 Mar 2015 21:41:40 +0100 Original-Received: by wiwl15 with SMTP id l15so931305wiw.0 for ; Thu, 12 Mar 2015 13:41:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=0LCsj8T+7u8h0jhwOEmysKnWWrjzRDD4uLmYHUt5wVk=; b=LoBwS6fR+9rotupiIkrQZPZXKeAy++rWyJBsdQFpVe3p3D/hjChIIPjU0WRQmqjKQj RHIiK5S8mvA0c4mTaN3XQ25nQ4AaovrSpc8Kkz8prOHEX18GfieJS8t84JwCBiPq9ouB 4BYQ3s62LEIKVZU7KzCD3i1fswcaEhtHN7T/5gbscOsmMXZErUygT2harmAYKNFv0f5D FDE+iniJjPx1QMU5U2CJd89t8adz2ZoWuPSfc7BBzUpWpgyRxpXmn+7aoCiZi86mR3h7 DxkOrgD7fh1GroKKLWZCX3xO+FYbaqH0hIBwjQFVCTmyY0bDgw/hqdS6c1tNt7OO4j9H BooQ== X-Received: by 10.180.92.136 with SMTP id cm8mr29886479wib.41.1426192900272; Thu, 12 Mar 2015 13:41:40 -0700 (PDT) Original-Received: by 10.194.120.167 with HTTP; Thu, 12 Mar 2015 13:41:40 -0700 (PDT) In-Reply-To: <5501E135.3040501@wxs.nl> X-Bayes-Prob: 0.0001 (Score 0, tokens from: ntg-context@ntg.nl, base:default, @@RPTN) X-CanIt-Geo: ip=2a00:1450:400c:c05::22d; country=IE; latitude=53.3478; longitude=-6.2597; http://maps.google.com/maps?q=53.3478,-6.2597&z=6 X-CanItPRO-Stream: uu:ntg-context@ntg.nl (inherits from uu:default, base:default) X-Canit-Stats-ID: 01O2IFEye - 47e966b630f4 - 20150312 (trained as not-spam) Received-SPF: pass (filter4-ams.mf.surf.net: domain of luigi.scarso@gmail.com designates 2a00:1450:400c:c05::22d as permitted sender) receiver=filter4-ams.mf.surf.net; client-ip=2a00:1450:400c:c05::22d; 
envelope-from=; helo=mail-wi0-x22d.google.com; identity=mailfrom X-Scanned-By: CanIt (www . roaringpenguin . com) X-BeenThere: ntg-context@ntg.nl X-Mailman-Version: 2.1.16 Precedence: list List-Id: mailing list for ConTeXt users List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ntg-context-bounces@ntg.nl Original-Sender: "ntg-context" Xref: news.gmane.org gmane.comp.tex.context:90585 Archived-At: --===============1832139697== Content-Type: multipart/alternative; boundary=f46d043bdef433b68f05111d6941 --f46d043bdef433b68f05111d6941 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On Thu, Mar 12, 2015 at 7:55 PM, Hans Hagen wrote: > it's actually a bug ... it is ok to map an invalid character in the input > to 0xFFFD, halt and continue when permitted, but the method used in luate= x > thereby obscures a valid 0xFFFD in the input > > FFFD REPLACEMENT CHARACTER =E2=80=A2 used to replace an incoming character whose value is unknown or unrepresentable in Unicode The meaning of FFFD is not "typeset a question mark on a black box" as in = =EF=BF=BD (which depends to font in anycase so in principle it's possible to see something completely different in a new version of the font) but to signal something potentially wrong with a symbol that currently in most cases is =EF=BF=BD. Misusing the meaning is not bad di per se, but in this specific case I think luatex is correct to be conservative and ask to the user what to do= ; context --batchmode typesets the document, writes the messages on the log, and ends with -1 , so an automatic agent is also alerted. --=20 luigi --f46d043bdef433b68f05111d6941 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable


--f46d043bdef433b68f05111d6941-- --===============1832139697==--
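
The point made in the message above, that mapping invalid input to 0xFFFD can obscure a valid 0xFFFD already present in the input, can be illustrated with a small sketch. It uses Python's UTF-8 codec, not LuaTeX's reader, so it only shows the general idea; the function names `classify_input` and `lenient` are hypothetical and do not come from the thread.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def lenient(raw: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for every invalid byte sequence."""
    return raw.decode("utf-8", errors="replace")


def classify_input(raw: bytes) -> str:
    """Tell invalid input apart from a U+FFFD the author really typed.

    Returns 'invalid' for malformed UTF-8, 'literal-fffd' for well-formed
    input that contains U+FFFD, and 'clean' otherwise.
    """
    try:
        text = raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return "invalid"
    return "literal-fffd" if REPLACEMENT in text else "clean"


broken = b"a\xffb"            # 0xFF can never appear in UTF-8
genuine = b"a\xef\xbf\xbdb"   # EF BF BD is the valid encoding of U+FFFD

# After lenient decoding the two inputs can no longer be told apart.
print(lenient(broken) == lenient(genuine))
# Strict decoding keeps the distinction.
print(classify_input(broken), classify_input(genuine))
```

Checking strictly first and substituting only afterwards is one way to stay conservative in the sense argued for above: the caller still learns that something was wrong, instead of only seeing a replacement glyph in the output.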