From: Mark Szepieniec
To: ntg-context@ntg.nl
Subject: Permissible characters in ConTeXt reference labels
Date: Tue, 9 Sep 2014 00:20:45 +0200

I'm trying to fix a problem in pandoc (see https://github.com/jgm/pandoc/pull/1589) where it doesn't properly sanitize the reference labels in ConTeXt output, causing errors during compilation when a label contains '#', for example. Note that this sanitizing is needed in addition to the regular backslash escaping used for control characters: '\#' is still illegal in a label, for example.

In the sanitizer function I'm writing, I'd like to properly escape all illegal characters, but I couldn't find an explicit list of allowed or illegal characters. Based on some testing I've conducted (see attached file), I've arrived at the following set:

\#[]",{}%()|=

1) Does this look like a reasonable set? Are there other characters or sequences that should be included, or are worth testing?

2) I was told (see https://groups.google.com/forum/#!topic/pandoc-discuss/tYpXMUkmbEY) that if the characters " and , didn't work, it would count as a ConTeXt bug. Is there any truth to that? Please let me know if any further info is needed on my part.

3) Does anyone see issues with this general approach? I'm relatively new to ConTeXt, so I might be missing either a huge problem, or an obviously easier way to do this.

Thanks,
Mark
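For illustration, here is a minimal sketch of the kind of sanitizer the message describes, written in Python rather than in pandoc's own Haskell. The function name `sanitize_label` and the `ux` + hexadecimal-code-point replacement scheme are assumptions made for this example, not something the message or ConTeXt prescribes. The character set is the one listed in the message.

```python
# Characters the message reports as problematic inside ConTeXt labels.
ILLEGAL_LABEL_CHARS = set('\\#[]",{}%()|=')


def sanitize_label(label: str) -> str:
    """Replace each problematic character with 'ux' plus its hex code point.

    The 'ux<hex>' scheme is a hypothetical choice for this sketch; any
    replacement built only from characters that are safe in labels would do.
    """
    pieces = []
    for ch in label:
        if ch in ILLEGAL_LABEL_CHARS:
            pieces.append("ux%x" % ord(ch))
        else:
            pieces.append(ch)
    return "".join(pieces)


if __name__ == "__main__":
    for raw in ["intro", "a#b", "x=y", "f(x)"]:
        print(raw, "->", sanitize_label(raw))
```

The same function has to be applied both where a label is defined and where it is referenced, so that the two still match. A scheme like this is not collision-free: a label that already contains the literal text `ux23` would clash with one where `#` was replaced.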
Attachment: test.tex

\startmode[*mkii]
  \enableregime[utf-8]
  \setupcolors[state=start]
\stopmode

% Enable hyperlinks
\setupinteraction[state=start, color=middleblue]

\setuppapersize [letter][letter]
\setuplayout    [width=middle,  backspace=1.5in, cutspace=1.5in,
                 height=middle, topspace=0.75in, bottomspace=0.75in]

\setuppagenumbering[location={footer,center}]

\setupbodyfont[11pt]

\setupwhitespace[medium]

\setuphead[chapter]      [style=\tfd]
\setuphead[section]      [style=\tfc]
\setuphead[subsection]   [style=\tfb]
\setuphead[subsubsection][style=\bf]

\setuphead[chapter, section, subsection, subsubsection][number=no]

\definedescription
  [description]
  [headstyle=bold, style=normal, location=hanging, width=broad, margin=1cm]

\setupitemize[autointro]    % prevent orphan list intro
\setupitemize[indentnext=no]

\setupfloat[figure][default={here,nonumber}]
\setupfloat[table][default={here,nonumber}]

\setupthinrules[width=15em] % width of horizontal rules

\setupdelimitedtext
  [blockquote]
  [before={\blank[medium]},
   after={\blank[medium]},
   indentnext=no,
  ]


\starttext

\subsection[XXX]{This is a header with identifier}

\in{This}{}[XXX] is a link to the section header.

% The following table denotes the character(s) substituted for the
% middle X in the above labels, and the results of running the
% resulting file through ConTeXt.

% X  OK
% \X does not compile
% #  does not compile
% ]  compiles, but label gets mangled
% [  OK, oddly
% "  does not compile
% \" compiles, but mangles label
% ,  does not compile
% \, does not compile
% {  does not compile
% }  does not compile
% %  does not compile
% &  OK
% _  OK
% ^  OK
% @  OK
% !  OK
% $  OK
% )  OK, but warns "Illegal character"
% (  compiles, but warns "End of file inside array", and mangles label
% |  does not compile
% ?  OK
% /  OK
% '  OK
% ~  OK
% `  OK
% :  OK
% ;  OK
% +  OK
% <  OK
% =  compiles, but mangles the label

\stoptext
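The test procedure behind the attachment (substituting one candidate at a time for the middle X of the label XXX) could be automated roughly as follows. This is only a sketch: the names `make_test_document` and `write_test_files`, the output file naming, and the shortened preamble-free template are assumptions made for this example. The candidate list is a subset chosen for illustration, and running ConTeXt on the generated files is left out.

```python
from pathlib import Path

# Reduced version of the test document body; LABEL marks where the
# label under test goes (both the definition and the reference).
TEMPLATE = (
    "\\starttext\n\n"
    "\\subsection[LABEL]{This is a header with identifier}\n\n"
    "\\in{This}{}[LABEL] is a link to the section header.\n\n"
    "\\stoptext\n"
)

# A few candidate substitutions, chosen for illustration.
CANDIDATES = ["X", "#", "\\#", "[", "]", '"', ",", "{", "}", "%", "(", ")", "|", "="]


def make_test_document(candidate: str) -> str:
    """Return a ConTeXt source whose label is 'X' + candidate + 'X'."""
    label = "X" + candidate + "X"
    return TEMPLATE.replace("LABEL", label)


def write_test_files(directory: str) -> list:
    """Write one numbered .tex file per candidate and return the paths."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, candidate in enumerate(CANDIDATES):
        path = out_dir / ("label-test-%02d.tex" % index)
        path.write_text(make_test_document(candidate), encoding="utf-8")
        paths.append(path)
    return paths
```

Files are numbered rather than named after the candidate because several of the candidates are not safe in file names.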