From: Jose Augusto
Newsgroups: gmane.comp.tex.context
Subject: Re: Ruby 1.9.1 and non-ascii char parsing in .tui file
Date: Mon, 10 Aug 2009 18:20:31 +0100
Message-ID: <1781d7210908101020y1fd9611du529f65cc6553bc4c@mail.gmail.com>
In-Reply-To: <4A804A79.4070904@wxs.nl>
To: mailing list for ConTeXt users

Hi Hans,

The patch I proposed also works with Ruby versions older than 1.9 (e.g. Ruby 1.8.7)!
The force_encoding() method is called only if RUBY_VERSION >= 1.9.
If the scripts are run by Ruby 1.8 or an earlier version, the current line of
code (e.g. 'case line.chomp') is left unchanged.
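
A minimal sketch of such a version guard, for illustration only. The helper names chomp_as_bytes and comment_line? are hypothetical, the pattern is made up for the example, and the exact condition used in the patch is not reproduced in this mail. The sketch assumes that a plain string comparison against "1.9" is enough to tell the two interpreter families apart.

```ruby
# Hypothetical helper (not the actual patch): on Ruby >= 1.9 tag the
# line as raw bytes before it is matched; on older Rubies strings carry
# no encoding, so the line is only chomped, as before.
def chomp_as_bytes(line)
  # String comparison is an assumption of this sketch; it orders
  # "1.8.7" before "1.9" and "1.9" before "1.9.1".
  line.force_encoding("ASCII-8BIT") if RUBY_VERSION >= "1.9"
  line.chomp
end

# The guarded value can be used wherever 'case line.chomp' was used.
# The pattern below is illustrative and ASCII-only.
def comment_line?(line)
  case chomp_as_bytes(line)
  when /\A%/ then true
  else false
  end
end
```

Note that force_encoding only relabels the string in place; it does not convert any bytes, so a frozen string would have to be duplicated first.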

Also, I verified the patch with Ruby 1.8.7 and with 1.9.1, and it worked in both cases.

The patch does, however, slow down processing (the "if" is executed for each
line of the files being parsed; this could probably be optimized...)
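
One possible way to avoid repeating the test for every line is to decide once, when the script is loaded, which variant of the line handler to define. This is only a sketch: prepare_line is a hypothetical name, and a feature check (respond_to?) is swapped in for the version comparison. Whether the difference is measurable has not been benchmarked here.

```ruby
# The check runs a single time at load; afterwards every call goes
# straight to the variant that matches the running interpreter.
if "".respond_to?(:force_encoding)
  # Ruby >= 1.9: relabel the line as raw bytes, then chomp it.
  def prepare_line(line)
    line.force_encoding("ASCII-8BIT").chomp
  end
else
  # Ruby 1.8 and older: strings have no encoding, so only chomp.
  def prepare_line(line)
    line.chomp
  end
end
```

The parsing loop would then call prepare_line(line) where it previously used line.chomp.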

Meanwhile, I don't think that the magic comment
# encoding: ASCII-8BIT
solves the problem. This comment declares that the script itself is written
in ASCII-8BIT, but when Ruby 1.9.1 reads strings from the .tex or .tui files
it treats them as US-ASCII, regardless of the encoding declared in
# encoding: ...

I added "# encoding: ASCII-8BIT" to texmfstart.rb, tex.rb and texutil.rb
and the problem didn't disappear :-(
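
The distinction between the two encodings can be looked at in isolation. The sketch below assumes a current Ruby; the helper encoding_of_data_read and the sample file are hypothetical. The external encoding "US-ASCII" is passed explicitly to stand in for the default observed above, which normally comes from Encoding.default_external.

```ruby
require "tempfile"

# Hypothetical helper: the encoding Ruby attaches to data read from
# +path+ when the external encoding is +external+.
def encoding_of_data_read(path, external)
  File.open(path, "r:#{external}") { |f| f.read.encoding }
end

# A file holding one non-ASCII byte (0xE9), written as raw bytes.
sample = Tempfile.new("sample")
sample.binmode
sample.write("caf\xE9\n")
sample.close

# What the magic comment controls: the encoding of the source file.
source_encoding = __ENCODING__

# What it does not control: the encoding given to data read from a file.
data_encoding = encoding_of_data_read(sample.path, "US-ASCII")

sample.unlink
```

Here source_encoding follows the script's own declaration, while data_encoding follows whatever the I/O layer was told, independently of the magic comment.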

Of course I may be wrong, but the experiments I did make me think this way.
Also, I don't have Linux at my disposal (I mean, with ConTeXt installed), and
the behavior there may be different...

Kind regards and thank you very much.

J. Augusto






On Mon, Aug 10, 2009 at 5:27 PM, Hans Hagen <pragma@wxs.nl> wrote:
> Jose Augusto wrote:
>
>> Hi Hans,
>>
>> I just sent a mail with a possible patch, before I read this answer from
>> you :-)
>> As I say there, the patches work (at least for me) and I had updated
>> context mkii a few hours ago, so I don't know if the betas you mentioned
>> have already been installed...
>>
>> Hope the proposed patches are helpful...
>
> your patch will not work with ruby < 1.9 so if my patch (opening files in
> rb mode) works ok that's more robust;
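
For reference, a minimal sketch of what opening a file in "rb" mode looks like. The helper read_lines_binary and the sample file contents are hypothetical and are not taken from the ConTeXt scripts. On Ruby 1.9 and later the "b" flag makes the data come back tagged as ASCII-8BIT; on Ruby 1.8 the same mode string is accepted and only affects newline handling on Windows.

```ruby
require "tempfile"

# Hypothetical helper: read a file line by line as raw bytes.
def read_lines_binary(path)
  File.open(path, "rb") { |f| f.readlines }
end

# Sample file with one non-ASCII byte (0xE9), written as raw bytes.
sample_tui = Tempfile.new("sample")
sample_tui.binmode
sample_tui.write("% comment\ncaf\xE9\n")
sample_tui.close

raw_lines = read_lines_binary(sample_tui.path)

# Matching works on the raw bytes as long as the pattern is ASCII-only.
comment_lines = raw_lines.grep(/\A%/)

sample_tui.unlink
```

Because nothing is relabelled per line, this variant needs no version test in the parsing loop.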

> another option is to patch texmfstart.rb
>
> #!/usr/bin/env ruby
> #encoding: ASCII-8BIT
>
> -----------------------------------------------------------------
>                                           Hans Hagen | PRAGMA ADE
>               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
>     tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
>                                              | www.pragma-pod.nl
> -----------------------------------------------------------------
> ___________________________________________________________________________________
> If your question is of interest to others as well, please add an entry to
> the Wiki!
>
> maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
> webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
> archive  : https://foundry.supelec.fr/projects/contextrev/
> wiki     : http://contextgarden.net
> ___________________________________________________________________________________
