From: Idris Samawi Hamid
To: ntg-context@ntg.nl
Newsgroups: gmane.comp.tex.context
Subject: Re: Perl scripting (was: Arabic-utf-8)
Date: Sun, 06 Jun 2004 15:03:24 -0600
Organization: Colorado State University
In-Reply-To: <3960A2BA-B799-11D8-B99E-0030659899AA@fiee.net>

On Sun, 6 Jun 2004 11:09:32 +0200, Henning Hraban Ramm wrote:

> -----
>
> #!/usr/bin/perl -w
> use strict;
> use warnings;
>
> my ($Source, $Target) = (shift, shift); # gets 2 file names from command line
>
> my %conv = ( # enhance as needed; keys are UTF-8 byte pairs
>     "\xD8\xA7" => "A",
>     "\xD8\xA8" => "b",
>     "\xD8\xAC" => "j",
>     "\xD8\xAF" => "d"
> );
>
> # "or" rather than "||", so that die applies to a failed open
> open SOURCE, "<", $Source or die $!;
> open TARGET, ">", $Target or die $!;
> # there are ways to read a whole file in one scalar,
> # e.g. with File::Slurp, but I don't know them by heart...
> while (my $line = <SOURCE>) {
>     foreach my $key (keys %conv) {
>         $line =~ s/$key/$conv{$key}/g;
>     } # foreach
>     print TARGET $line;
> } # while
> close SOURCE;
> close TARGET;
>
> -----

Thanks; I'll play around with this as well.

BTW: is there any way to do this without the hex editor, by just entering
the full 4-digit character code (a la Thomas's original suggestion)? E.g.,

"\x0627" => "A"

While the hex editor certainly works, it is really slow and tedious work...

> BTW: ActiveState has Perl 5.8.4, at least for Windows (I use it at work).

Ok, I found it:

http://downloads.activestate.com/ActivePerl/Windows/5.8/ActivePerl-5.8.3.809-MSWin32-x86.zip

But the web site (at first glance) sure gives one the impression that their
latest release is 5.6.1.638:

http://www.activestate.com/
http://www.activestate.com/Products/ActivePerl/

Best
Idris

-- 
Professor Idris Samawi Hamid
Department of Philosophy
Colorado State University
Fort Collins, CO 80523
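
On the question above about entering the 4-digit code directly: one possible approach, sketched here under assumptions, is to let Perl decode the file as UTF-8 and then key the table on code points written with braces, "\x{0627}". Without braces, "\x0627" is read as the character \x06 followed by the literal text "27", so it would not match. This sketch assumes Perl 5.8 or later (for the :encoding layer) and UTF-8 input; the helper name transliterate and the lexical filehandles are illustrative choices, not part of the quoted script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Keys are Unicode code points (not UTF-8 byte pairs).
# Same four letters as the quoted script; enhance as needed.
my %conv = (
    "\x{0627}" => "A",   # ARABIC LETTER ALEF
    "\x{0628}" => "b",   # ARABIC LETTER BEH
    "\x{062C}" => "j",   # ARABIC LETTER JEEM
    "\x{062F}" => "d",   # ARABIC LETTER DAL
);

# Hypothetical helper: replace every mapped character in an already
# decoded string and leave all other characters untouched.
sub transliterate {
    my ($text) = @_;
    $text =~ s/(.)/exists $conv{$1} ? $conv{$1} : $1/gse;
    return $text;
}

# Act as a script only when run directly with two file names.
if (!caller && @ARGV == 2) {
    my ($source, $target) = @ARGV;
    # The :encoding(UTF-8) layer turns bytes into characters on input
    # and characters back into bytes on output.
    open my $in,  '<:encoding(UTF-8)', $source or die "Cannot read $source: $!";
    open my $out, '>:encoding(UTF-8)', $target or die "Cannot write $target: $!";
    while (my $line = <$in>) {
        print {$out} transliterate($line);
    }
    close $in;
    close $out or die "Cannot close $target: $!";
}
```

Saved under a name of your choosing, it would be called with the source and target file names as its two arguments, like the quoted script. The code points can be read straight off a Unicode chart, so no hex editor should be needed to build the table.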