ntg-context - mailing list for ConTeXt users
From: "Idris Samawi Hamid ادريس سماوي حامد" <ishamid@colostate.edu>
To: "mailing list for ConTeXt users" <ntg-context@ntg.nl>
Subject: Re: Non-printable Unicode control characters
Date: Sat, 16 Aug 2008 20:37:33 -0600
Message-ID: <op.uf0ewvr5fkrasx@your-b27fb1c401>
In-Reply-To: <48A68762.8080809@wxs.nl>

On Sat, 16 Aug 2008 01:53:06 -0600, Hans Hagen <pragma@wxs.nl> wrote:

> Khaled Hosny wrote:
>> Unicode has many "control characters" that only control text behaviour
>> and shouldn't be rendered visually in the text, such as Bidi_Control and
>> Join_Control chars (see
>> http://www.unicode.org/Public/5.1.0/ucd/PropList.txt and
>> http://unicode.org/Public/UNIDATA/UCD.html)
>> Currently, ConTeXt handles ZWJ and ZWNJ, but other characters get
>> rendered if the font has glyphs for them or make no effect at all if the
>> font has no glyphs for them. I think that the optimum behaviour is to
>> make those characters affect text formatting while not visually rendered
>> whether the font has glyphs for them or not.
>> It might be also useful if we can enable rendering those characters
>> manually, for drafts and such.
>
> actually we need:
>
> - ignore them (like in verbatim)

Eventually we want to be able to show them in verbatim also (provided the  
font has them).

Indeed, I suggest that -- given an appropriate teletype font -- the  
default for _verbatim text_ should be to _show_ the control chars.

> - act upon them
>
> and
>
> - show them (might somehow interfere with other things)

Showing the control chars in typeset text -- non-verbatim -- should be
rare; it is more appropriate for verbatim.

> - hide them

I suggest that the default for _typeset text_ should definitely be to  
_hide_ the control chars.
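
To make the two suggested defaults concrete (show in verbatim, hide in typeset text), here is a rough sketch in Python rather than ConTeXt. The function name render_text, the mode names, and the <U+XXXX> placeholder style are all made up for illustration; the code point sets are the Bidi_Control and Join_Control entries from the Unicode 5.1 PropList.txt linked above.

```python
# Illustration only: not ConTeXt code. Names and placeholder style are hypothetical.
# Code points as listed for Unicode 5.1 in PropList.txt.
JOIN_CONTROL = {0x200C, 0x200D}                                # ZWNJ, ZWJ
BIDI_CONTROL = {0x200E, 0x200F} | set(range(0x202A, 0x202F))   # LRM, RLM, LRE..RLO
CONTROL = JOIN_CONTROL | BIDI_CONTROL


def render_text(text, mode="typeset"):
    """Return the visible form of text.

    mode="typeset":  control chars are hidden (dropped from the visible output).
    mode="verbatim": control chars are shown as a <U+XXXX> placeholder.
    """
    if mode not in ("typeset", "verbatim"):
        raise ValueError("unknown mode: %r" % mode)
    out = []
    for ch in text:
        if ord(ch) in CONTROL:
            if mode == "verbatim":
                out.append("<U+%04X>" % ord(ch))
            # in typeset mode nothing is appended, so the char stays invisible
        else:
            out.append(ch)
    return "".join(out)


print(render_text("a\u200cb"))                   # control char hidden
print(render_text("a\u200cb", mode="verbatim"))  # control char shown
```

This only models visibility; acting upon the chars (joining behaviour, bidi) is a separate step that the sketch does not attempt.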

> if i'm right, when bidi is turned on, those chars get processed and then
> discarded from the node list, so some more than zwj and zwnj is handled,

It appears to me that zwj, zwnj, etc. should be invisible in
typeset-text output, as explained above, but should still be encoded in
the output pdf. Think of pdf text extraction, converting between Arabic and
Farsi typesetting conventions, etc.
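
A small Python sketch of what "invisible but still encoded" could mean: every source character stays in the run, and only a flag decides whether it is drawn. The Item class and the function names are hypothetical, and this says nothing about how the pdf backend would actually store the characters.

```python
# Illustration only: a toy model, not the ConTeXt node list or a PDF structure.
from dataclasses import dataclass

JOIN_CONTROL = {"\u200c", "\u200d"}  # ZWNJ, ZWJ


@dataclass
class Item:
    char: str      # the source character, always kept
    visible: bool  # whether it is drawn on the page


def to_run(text):
    """Keep every character; mark join controls as not drawn."""
    return [Item(ch, ch not in JOIN_CONTROL) for ch in text]


def visible_text(run):
    """What the reader sees on the page."""
    return "".join(item.char for item in run if item.visible)


def extracted_text(run):
    """What text extraction should return: the full source string."""
    return "".join(item.char for item in run)


# Persian word with a ZWNJ between prefix and stem, written with escapes
word = "\u0645\u06cc\u200c\u062e\u0648\u0627\u0647\u0645"
run = to_run(word)
assert extracted_text(run) == word       # ZWNJ survives extraction
assert "\u200c" not in visible_text(run)  # but it is never drawn
```

If the control chars were discarded instead of flagged, the extracted string would differ from the source and the information could not be recovered.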

> and of course others need to be handled as well

Even line separators (lsep, U+2028) and paragraph separators (psep, U+2029)
should be present in the output pdf (e.g. \par => psep). This would make
text extraction much more useful.
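
As a sketch of why this helps extraction: if paragraph and line breaks are written out as U+2029 and U+2028, the paragraph structure can be recovered from the extracted string with plain splitting. The Python below is only an illustration; encode_breaks and decode_breaks are hypothetical names, and the input is assumed to be a non-empty list of paragraphs, each a list of lines.

```python
# Illustration only: models \par => psep and a forced line break => lsep.
LSEP = "\u2028"  # LINE SEPARATOR
PSEP = "\u2029"  # PARAGRAPH SEPARATOR


def encode_breaks(paragraphs):
    """Join lines with LSEP and paragraphs with PSEP into one string."""
    return PSEP.join(LSEP.join(lines) for lines in paragraphs)


def decode_breaks(text):
    """Recover the paragraph/line structure from an extracted string."""
    return [paragraph.split(LSEP) for paragraph in text.split(PSEP)]


doc = [["first line", "second line"], ["next paragraph"]]
flat = encode_breaks(doc)
assert decode_breaks(flat) == doc  # structure survives the round trip
```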

Best wishes
Idris

-- 
Professor Idris Samawi Hamid, Editor-in-Chief
International Journal of Shi`i Studies
Department of Philosophy
Colorado State University
Fort Collins, CO 80523
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________



Thread overview: 4+ messages
2008-08-15 19:23 Khaled Hosny
2008-08-16  7:53 ` Hans Hagen
2008-08-17  2:37   ` Idris Samawi Hamid ادريس سماوي حامد [this message]
2008-08-17 12:46     ` Hans Hagen
