ntg-context - mailing list for ConTeXt users
From: "Thomas A. Schmitz" <thomas.schmitz@uni-bonn.de>
Subject: Re: Arabic-utf-8 (plus a sample)
Date: Sat, 05 Jun 2004 23:48:18 +0200
Message-ID: <1086472098.5707.36.camel@tascomputer.home>
In-Reply-To: <opr844t8tru9mfh0@lamar.colostate.edu>

Just a quick reply (it's bedtime over here): there may be two problems.
First, the mail program put an unwanted line break after the =~ part;
just remove it, since the whole substitution should be on one line.
Second, you'll need a fairly recent version of perl for it to work. What
do you get when you run
perl --version
I guess that for utf-8 to work, it should be at least 5.8.0. Your basic
idea of the usage is right (I'm not a Windows person, but I assume it
should be the same): save the script as utf2tex.pl, make it executable,
and call it as utf2tex.pl FILENAME.txt.
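
To make the usage concrete, here is a minimal sketch of what such a utf2tex.pl could look like. It is not the script that was sent earlier in this thread: the table name %latin_for and the sub name utf2tex are made up for illustration, and the mapping contains only the seven code points listed in the quoted message below.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.008;   # the :utf8 I/O layer needs perl 5.8.0 or later

# Hypothetical mapping table, built from the code points listed in this thread.
my %latin_for = (
    "\x{0627}" => 'A',   # ALEF
    "\x{0628}" => 'b',   # BEH
    "\x{062C}" => 'j',   # JEEM
    "\x{062F}" => 'd',   # DAL
    "\x{0647}" => 'h',   # HEH
    "\x{0648}" => 'w',   # WAW
    "\x{0632}" => 'z',   # ZAIN
);

# Replace every mapped Arabic character; anything else passes through unchanged.
sub utf2tex {
    my ($text) = @_;
    $text =~ s/(.)/exists $latin_for{$1} ? $latin_for{$1} : $1/esg;
    return $text;
}

# Usage: perl utf2tex.pl FILENAME.txt > OUTPUT.txt
if (@ARGV) {
    binmode STDOUT, ':utf8';
    for my $file (@ARGV) {
        # Decode the input as UTF-8 so each Arabic letter is one character.
        open my $in, '<:utf8', $file or die "Cannot open $file: $!";
        print utf2tex($_) while <$in>;
        close $in;
    }
}
```

Here the /e modifier is deliberate, because the replacement is a Perl expression (a hash lookup). A substitution with a fixed replacement string would not need it. With `use 5.008;` at the top, an older perl stops with a version message instead of failing later on the :utf8 layer.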

I guess it would be easiest to convert the utf to ascii directly - that
would mean you could later convert it back. I have a set of scripts that
do just that -- convert babel Greek into utf-8 and back.
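
As a rough illustration of the way back, the same kind of table can be inverted. This is only a sketch under assumptions: %latin_for, %arabic_for and tex2utf are hypothetical names, the table holds just the seven letters from the quoted message below, and this is not one of the babel Greek scripts mentioned above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.008;   # Unicode strings and the :utf8 layer need perl 5.8.0 or later

# Hypothetical table, built from the code points listed in this thread.
my %latin_for = (
    "\x{0627}" => 'A', "\x{0628}" => 'b', "\x{062C}" => 'j',
    "\x{062F}" => 'd', "\x{0647}" => 'h', "\x{0648}" => 'w',
    "\x{0632}" => 'z',
);

# Inverting the hash is safe only because every Latin value is unique.
my %arabic_for = reverse %latin_for;

# Replace every mapped Latin letter by its Arabic character.
sub tex2utf {
    my ($text) = @_;
    $text =~ s/(.)/exists $arabic_for{$1} ? $arabic_for{$1} : $1/esg;
    return $text;
}

# Usage: perl tex2utf.pl FILENAME.txt > OUTPUT.txt
if (@ARGV) {
    binmode STDOUT, ':utf8';   # write the Arabic characters as UTF-8
    for my $file (@ARGV) {
        open my $in, '<', $file or die "Cannot open $file: $!";
        print tex2utf($_) while <$in>;
        close $in;
    }
}
```

The round trip is only lossless if the transcribed file contains nothing but transcription letters: any ordinary b, d, h, j, w, z or A in the file (in a TeX command, for instance) would be converted as well.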

If you need more help, I'll look into it tomorrow!

Best

Thomas

On Sat, 2004-06-05 at 23:33, Idris Samawi Hamid wrote:
> On Sat, 05 Jun 2004 22:41:39 +0200, Thomas A. Schmitz 
> <thomas.schmitz@uni-bonn.de> wrote:
> 
> > Idris,
> >
> > I know a bit of perl and would love to help. However, I fear that
> > sending us your stuff via mail will be a bit difficult because the utf-8
> > characters get transformed into gibberish.
> 
> Thnx 4 such a speedy reply! I don't think you are getting gibberish 
> though; you should be getting the extended ascii representation. So the 
> letter alif (hex 0627) should look like this:
> 
> ا
> 
> Do you get a forward-slashed circle and a section symbol? If so, that's 
> the ascii representation I'm trying to convert to the letter `A'.
> 
> Here are the codes you want:
> 
> ا [0627] => A
> 
> ب [0628] => b
> 
> ج [062C] => j
> 
> د [062F] => d
> 
> Ù‡ [0647] => h
> 
> Ùˆ [0648] => w
> 
> ز [0632] => z
> 
> Let me explain my situation more clearly:-)
> 
> I have a unicode editor, Unitype Global Writer. I save a unicode document 
> as a utf *.txt file. When I open that saved file in my TeX editor 
> (WinEdt), it comes out as extended ascii (that's the "gibberish"). So what 
> I wanted to do was convert the ascii "gibberish" to my Latin 
> transcription. It seems that what you are suggesting is to use the hex 
> representation and convert the unicode txt file into a Latin transcription 
> file directly and bypass the gibberish.
> 
> On your perl file, can you give me an example of how to use it? I tried 
> (in windows, with name
> utf2tex.pl and unicode text in unicode-utf.txt) and get
> 
> =========================
> > perl utf2tex.pl unicode-utf.txt
> Unknown discipline class ':utf8' at C:/Perl/lib/open.pm line 18.
> BEGIN failed--compilation aborted at utf2tex.pl line 4.
> =========================
> 
>  from your script I tried, e.g.
> 
> ============================
> $_ =~
> s/\x{0627}/\x{0041}/esg;
> # from alif to `A'
> ============================
> 
> Your guidance will be greatly appreciated!
> 
> Thnx a million!
> Idris

