ntg-context - mailing list for ConTeXt users
* DocBookInContext & multi-languages (newbie)
@ 2002-11-29  7:20 Gour
  2002-11-29 19:18 ` Simon Pepping
  0 siblings, 1 reply; 38+ messages in thread
From: Gour @ 2002-11-29  7:20 UTC (permalink / raw)


Hello list!

I'm new to ConTeXt. After a few books done in LyX (LaTeX) I decided to enhance
my "publishing".

Since I need to write in DocBook, I was thrilled to find the DocBookInContext
package, which makes it possible to map from DocBook to ConTeXt.

I prepared a small article in DocBook and converted it to PDF by:
texexec --pdf file (I'm running SuSE 8.0 & teTeX).

The problem is that I wanted to include some Croatian national characters in
the DocBook file, but they are not shown in the generated file.

What should I do to be able to have both English & Croatian language in
ConTeXt?

(In LyX, I would simply use latin-2 encoding and write English & Croatian.)

Usually documents in DocBook have Unicode encoding (UTF-8).

What encoding has to be defined so that conversion DocBook -> ConTeXt will
work properly?

I also ran texexec --make --language=hr,en hr

Sincerely,
Gour


-- 
Gour
gour@mail.inet.hr
Registered Linux User #278493


* Re: DocBookInContext & multi-languages (newbie)
  2002-11-29  7:20 DocBookInContext & multi-languages (newbie) Gour
@ 2002-11-29 19:18 ` Simon Pepping
  2002-11-30 20:15   ` Gour
  2002-12-02 12:28   ` DocBookInContext & multi-languages (newbie) / utf Hans Hagen
  0 siblings, 2 replies; 38+ messages in thread
From: Simon Pepping @ 2002-11-29 19:18 UTC (permalink / raw)


On Fri, Nov 29, 2002 at 08:20:39AM +0100, Gour wrote:
> Hello list!
> 
> Since I need to write in DocBook, I was thrilled to find DocBookInContext
> package which enables to map from DocBook to ConTeXt.
> 
> I prepared a small article in DocBook and converted it to PDF by:
> texexec --pdf file (I'm running SuSE 8.0 & teTeX).
> 
> The problem is that I wanted to include some Croatian national characters in
> the DocBook file, but they are not shown in generated file.
> 
> What should I do to be able to have both English & Croatian language in
> ConTeXt?
> 
> (In LyX, I would simply use latin-2 encoding and write English & Croatian.)
> 
> Usually documents in DocBook have Unicode encoding (UTF-8).
> 
> What encoding has to be defined so that conversion DocBook -> ConTeXt will
> work properly?
> 
> I also run texexec --make --language=hr,en hr

I would like to know that too :-) I have not yet found the time to
find out how Context deals with encodings. I only have a note that
says that one should do \useXMLfilter [utf], and that I should have a
look at the xtag-utf (which is input by the above command) or enco
files.

I would hope that context develops generic input encoding support, so
that I only have to scan the encoding value in the XML declaration,
and input the appropriate encoding file.
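
Scanning the encoding value in the XML declaration could look roughly like the
sketch below. It is written in Python purely for illustration and is not
ConTeXt code. The function name declared_encoding is hypothetical, the
fallback to utf-8 follows XML's default when no encoding is declared, and a
leading byte order mark is not handled.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration such as
# <?xml version="1.0" encoding="ISO-8859-2"?>
_XML_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def declared_encoding(head: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, lower-cased."""
    match = _XML_DECL.match(head)
    if match is None:
        # No declaration, or a declaration without an encoding attribute.
        return default
    return match.group(1).decode("ascii").lower()
```

The returned name could then be used to pick the matching input encoding file.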

Regards, Simon

-- 
Simon Pepping
email: spepping@scaprea.hobby.nl


* Re: DocBookInContext & multi-languages (newbie)
  2002-11-29 19:18 ` Simon Pepping
@ 2002-11-30 20:15   ` Gour
  2002-11-30 20:55     ` Bruce D'Arcus
  2002-12-02 19:46     ` Simon Pepping
  2002-12-02 12:28   ` DocBookInContext & multi-languages (newbie) / utf Hans Hagen
  1 sibling, 2 replies; 38+ messages in thread
From: Gour @ 2002-11-30 20:15 UTC (permalink / raw)


Simon Pepping (spepping@scaprea.hobby.nl) wrote:

> I would like to know that too :-) I have not yet found the time to
> find out how Context deals with encodings. I only have a note that
> says that one should do \useXMLfilter [utf], and that I should have a
> look at the xtag-utf (which is input by the above command) or enco
> files.

As far as I can see ConTeXt does not understand utf-8 encoding.

Where did you find this note mentioning utf?

> 
> I would hope that context develops generic input encoding support, so
> that I only have to scan the encoding value in the XML declaration,
> and input the appropriate encoding file.

I just read one report about the problem to publish Unicode documents - FO &
similar converters are not mature enough, PassiveTeX isn't easy to install,
Omega has its own problems ..

Some time ago I saw a post on the DocBook list from Sebastian Rahtz, who is
considering rewriting PassiveTeX with ConTeXt support instead of LaTeX.

However, I'm wondering what is the present route for those wanting to get
their utf-8 encoded documents published in ConTeXt?

DocBook is a very valuable tool: one can author documents in DocBook and then
get high-quality output with ConTeXt & TeX.

The question remains: how to do it with a multilingual document encoded in utf-8?

Any hint?

Sincerely,
Gour

-- 
Gour
gour@mail.inet.hr
Registered Linux User #278493


* Re: DocBookInContext & multi-languages (newbie)
  2002-11-30 20:15   ` Gour
@ 2002-11-30 20:55     ` Bruce D'Arcus
  2002-12-01  6:40       ` Gour
  2002-12-02 19:46     ` Simon Pepping
  1 sibling, 1 reply; 38+ messages in thread
From: Bruce D'Arcus @ 2002-11-30 20:55 UTC (permalink / raw)



On Saturday, November 30, 2002, at 03:15 PM, Gour wrote:

> However, I'm wondering what is the present route for those wanting to get
> their utf-8 encoded documents published in ConTeXt?

I've been wondering about this myself. Given that the default encoding for 
XML is utf-8, it'd seem ConTeXt ought to support it if it's going to 
typeset XML.

I posted a note about the tbook project (http://tbookdtd.sf.net/) a 
couple of weeks ago, which includes this binary (description from 
manual):

> tbrplent is a filter program that scans for non-ASCII UTF-8s in the input
> stream and creates decent LaTeX macros or, if possible, Latin-1 characters
> for the output stream.

Does ConTeXt have an equivalent, or can this one perhaps be modified?

Bruce


* Re: DocBookInContext & multi-languages (newbie)
  2002-11-30 20:55     ` Bruce D'Arcus
@ 2002-12-01  6:40       ` Gour
  0 siblings, 0 replies; 38+ messages in thread
From: Gour @ 2002-12-01  6:40 UTC (permalink / raw)


Bruce D'Arcus (bdarcus@fastmail.fm) wrote:

> I've wondering about this myself. Given that the default encoding for 
> XML is utf-8, it'd seem ConTeXt ought to support it if its going to 
> typeset XML.

Yes, that's logical and I'm wondering whether it's true given the fact that TeX 
by itself doesn't handle utf-8.

> I posted a note about the tbook project (http://tbookdtd.sf.net/) a 
> couple of weeks ago, which includes this binary (description from 
> manual):

I saw it and it sounds good. It even has xindy for indexing which I use under
LyX (LaTeX).

It would be nice to have similar capabilities with ConTeXt; however, I need DocBook
exchangeability.

> >tbrplent is a filter program that scans for non-ASCII UTF-8s in the input
> >stream and creates decent LaTeX macros or, if possible, Latin-1 characters
> >for the output stream.
> 
> Does ConTeXt have an equivalent, or can this one perhaps be modified?

I hope some ConTeXt guru can enlighten us in this regard.

Sincerely,
Gour


* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-11-29 19:18 ` Simon Pepping
  2002-11-30 20:15   ` Gour
@ 2002-12-02 12:28   ` Hans Hagen
  2002-12-02 13:59     ` Gour
  1 sibling, 1 reply; 38+ messages in thread
From: Hans Hagen @ 2002-12-02 12:28 UTC (permalink / raw)


At 08:18 PM 11/29/2002 +0100, Simon Pepping wrote:

about utf8

>I would like to know that too :-) I have not yet found the time to
>find out how Context deals with encodings. I only have a note that
>says that one should do \useXMLfilter [utf], and that I should have a
>look at the xtag-utf (which is input by the above command) or enco
>files.
>
>I would hope that context develops generic input encoding support, so
>that I only have to scan the encoding value in the XML declaration,
>and input the appropriate encoding file.

sure, but for that i need input on what those vectors look like; xtag-utf is 
a starting point, but i also need the second 'bank', so who can

(1) provide me with proper test files (can be simple text with utf 8 text)
(2) provide the mapping list in terms of <nr><nr> => \namedglyph

later some kind of language dependency needs to be brought in

[so, the framework is already there]

Hans

-------------------------------------------------------------------------
                                   Hans Hagen | PRAGMA ADE | pragma@wxs.nl
                       Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
  tel: +31 (0)38 477 53 69 | fax: +31 (0)38 477 53 74 | www.pragma-ade.com
-------------------------------------------------------------------------
                        information: http://www.pragma-ade.com/roadmap.pdf
                     documentation: http://www.pragma-ade.com/showcase.pdf
-------------------------------------------------------------------------


* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-02 12:28   ` DocBookInContext & multi-languages (newbie) / utf Hans Hagen
@ 2002-12-02 13:59     ` Gour
  2002-12-02 14:43       ` Hans Hagen
  0 siblings, 1 reply; 38+ messages in thread
From: Gour @ 2002-12-02 13:59 UTC (permalink / raw)


Hans Hagen (pragma@wxs.nl) wrote:

> sure, but for that i need input on how those vector look like; xtag-utf is 
> a starting point, but i also need the second 'bank', so whoi can

Where can I find more info on what this xtag-utf vector should look like?

> 
> (1) provide me with proper test files (can be simple text with utf 8 text)

UTF-8_and_Unicode_FAQ has some test files and I'm sure this step is not a 
problem.

> (2) provide the mapping list in terms of <nr><nr> => \namedglyph

I need some explanation: e.g. amacron (small latin letter "a" with macron) has
Unicode code U+0101. When I look within Vim (g8 function) it shows me that it
has "c4 81" hex value in UTF-8 encoding.

In this way it's possible to get Unicode code & hex value in UTF-8 encoding.
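
As a cross-check of the relation between the Unicode code and the UTF-8 bytes,
here is a small Python sketch that encodes one code point by hand. The
function name utf8_bytes is hypothetical, and the sketch does not reject
surrogates or values above U+10FFFF.

```python
def utf8_bytes(codepoint: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (1 to 4 bytes)."""
    if codepoint < 0x80:
        return bytes([codepoint])
    if codepoint < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (codepoint >> 6), 0x80 | (codepoint & 0x3F)])
    if codepoint < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (codepoint >> 12),
                      0x80 | ((codepoint >> 6) & 0x3F),
                      0x80 | (codepoint & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (codepoint >> 18),
                  0x80 | ((codepoint >> 12) & 0x3F),
                  0x80 | ((codepoint >> 6) & 0x3F),
                  0x80 | (codepoint & 0x3F)])


print(utf8_bytes(0x0101).hex())  # c481
```

So U+0101 and "c4 81" are two views of the same character: the first is the
Unicode number, the second its UTF-8 byte sequence.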

So, I'm interested, what would be the correct entry for the above mentioned list
in the case for amacron:

a) c4 81 -> amacron
b) 0101  -> amacron

or something else?

Please give me some more info and I'd be glad to help, since I'm sure that utf-8
support for ConTeXt is the right way to go.

With a package like DocBookInConTeXt one can author directly in XML and have all
the advantages of using standard DTD, then one can map the document in ConTeXt
and take advantage of its capabilities, and get the superb TeX quality output.

With utf-8 support, getting the westernized transliteration of Sanskrit 
mentioned in the other thread is a piece of cake.
 
> later some kind of language dependency needs to be brought in

Sure.

> [so, the framework is already there]

Nice to hear that. Let's move forward :-)

Sincerely,
Gour

-- 
Gour
gour@mail.inet.hr
Registered Linux User #278493


* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-02 13:59     ` Gour
@ 2002-12-02 14:43       ` Hans Hagen
  2002-12-02 16:36         ` Taco Hoekwater
  2002-12-02 17:40         ` Gour
  0 siblings, 2 replies; 38+ messages in thread
From: Hans Hagen @ 2002-12-02 14:43 UTC (permalink / raw)


At 02:59 PM 12/2/2002 +0100, Gour wrote:
>Hans Hagen (pragma@wxs.nl) wrote:
>
> > sure, but for that i need input on how those vector look like; xtag-utf is
> > a starting point, but i also need the second 'bank', so whoi can
>
>Where can I get some more info how this xtag-utf vector should look like?

in xtag-utf.tex in .../tex/context/base (at least in my version and the beta)

> > (1) provide me with proper test files (can be simple text with utf 8 text)
>
>UTF-8_and_Unicode_FAQ has some test files and I'm sure this step is not a
>problem.

So, where can i find that doc?

> > (2) provide the mapping list in terms of <nr><nr> => \namedglyph
>
>I need some explanation: e.g. amacron (small latin letter "a" with macron) has
>Unicode code U+0101. When I look within Vim (g8 function) it shows me that it
>has "c4 81" hex value in UTF-8 encoding.
>
>In this way it's possible to get Unicode code & hex value in UTF-8 encoding.
>
>So, I'm interested, what would be the correct entry for the above mentioned list
>in the case for amacron:
>
>a) c4 81 -> amacron
>b) 0101  -> amacron
>
>or something else?

so, c4 is the trigger, and 81 the character; this means that the function 
attached to c4 has to map the 81 onto \amacron
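
The trigger idea can be sketched outside TeX as a two-level lookup: the first
byte selects a table, the second byte selects the glyph name in that table.
The Python below is only an illustration of that dispatch, not how ConTeXt
implements it; the table contents and the name named_glyphs are hypothetical,
and only two-byte sequences are handled.

```python
# Hypothetical two-level table: trigger (first) byte -> second byte -> glyph name.
TRIGGERS = {
    0xC4: {0x81: "amacron", 0x87: "cacute", 0x8D: "ccaron", 0x91: "dstroke"},
    0xC5: {0xA1: "scaron", 0xBE: "zcaron"},
}


def named_glyphs(data: bytes) -> list:
    """Turn a byte string into ASCII characters and \\glyphname tokens."""
    out = []
    stream = iter(data)
    for byte in stream:
        if byte < 0x80:
            out.append(chr(byte))
            continue
        table = TRIGGERS.get(byte)
        if table is None:
            raise ValueError(f"no handler attached to byte {byte:#04x}")
        second = next(stream, None)  # the byte the trigger acts on
        if second not in table:
            raise ValueError(f"no glyph for {byte:#04x} followed by {second!r}")
        out.append("\\" + table[second])
    return out
```

With this table, the bytes c4 81 come out as the single token \amacron.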

can you make me a file with a list like:

amacron : 01/01 : c4/81 : <utfcode>
^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^
normal ascii              real utf

Hans


* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-02 14:43       ` Hans Hagen
@ 2002-12-02 16:36         ` Taco Hoekwater
  2002-12-02 17:40         ` Gour
  1 sibling, 0 replies; 38+ messages in thread
From: Taco Hoekwater @ 2002-12-02 16:36 UTC (permalink / raw)
  Cc: pragma


UTF8 encoding is rather simple, really:

byte number:
b1           b2           b3           b4
  0 -- 127                                          = unicode 0x00 - 0x7F
192 -- 223   128 -- 191                             = unicode 0x80 - 0x7FF
224 -- 239   128 -- 191   128 -- 191                = unicode 0x800 - 0xFFFF
240 -- 247   128 -- 191   128 -- 191   128 -- 191   = unicode 0x10000 - 0x1FFFFF

There are also sequences for 5 and 6 bytes, but these are illegal for Unicode
representations at the moment:

248 -- 251   128 -- 191   128 -- 191   128 -- 191   128 -- 191
252 -- 253   128 -- 191   128 -- 191   128 -- 191   128 -- 191   128 -- 191

128 -- 191 are illegal as first chars in UTF8 (that is handy for error-recovery).

254 and 255 are completely illegal and should not appear at all (if you see them,
it's a safe bet that the document is encoded as UTF16, not UTF8).


The unicode number for a UTF8 sequence can be calculated as:

byte1                                                                   if byte1 <= 127
(byte1-192)*64 + (byte2-128)                                            if 192 <= byte1 <= 223
(byte1-224)*4096 + (byte2-128)*64 + (byte3-128)                         if 224 <= byte1 <= 239
(byte1-240)*262144 + (byte2-128)*4096 + (byte3-128)*64 + (byte4-128)    if 240 <= byte1 <= 247

Simple, eh?
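
For anyone who wants to try the arithmetic, here is a minimal Python sketch of
it. The first byte decides the length and contributes its offset, and every
following byte adds six more bits. The name unicode_number is hypothetical,
and the sketch does not reject overlong forms or values above U+10FFFF.

```python
def unicode_number(seq: bytes) -> int:
    """Compute the Unicode number of one complete UTF-8 sequence."""
    b1 = seq[0]
    if b1 <= 127:
        length, value = 1, b1
    elif 192 <= b1 <= 223:
        length, value = 2, b1 - 192
    elif 224 <= b1 <= 239:
        length, value = 3, b1 - 224
    elif 240 <= b1 <= 247:
        length, value = 4, b1 - 240
    else:
        # 128-191 cannot start a sequence; 248-255 are not valid for Unicode.
        raise ValueError(f"illegal first byte {b1}")
    if len(seq) != length:
        raise ValueError(f"expected {length} bytes, got {len(seq)}")
    for b in seq[1:]:
        if not 128 <= b <= 191:
            raise ValueError(f"illegal continuation byte {b}")
        # Multiplying by 64 each round yields the 64 / 4096 / 262144 factors.
        value = value * 64 + (b - 128)
    return value


print(hex(unicode_number(b"\xc4\x81")))  # 0x101
```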

-- 
groeten,

Taco


* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-02 14:43       ` Hans Hagen
  2002-12-02 16:36         ` Taco Hoekwater
@ 2002-12-02 17:40         ` Gour
  2002-12-02 20:16           ` Simon Pepping
  2002-12-03 19:14           ` DocBookInContext ... [CSX+, UTF-8 Roman, and Nagari Codings] Richard Mahoney
  1 sibling, 2 replies; 38+ messages in thread
From: Gour @ 2002-12-02 17:40 UTC (permalink / raw)


Hans Hagen (pragma@wxs.nl) wrote:

> in xtag-utf.tex in .../tex/context/base (at least in my version and the 
> beta)

On my SuSE 8.0 I didn't find it, but fortunately it's in the beta which I
downloaded :-)

So here I see something like:

\defineUTFcharacter amacron	1  1

which corresponds to the Unicode code of amacron, U+0101, and agrees with
the output of Vim's "ga" command, which shows:

<ā> 257, Hex 0101, Octal 401.

Now it's just a question of a little work to slowly populate this vector with the
values for different Unicode characters.
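
Populating the vector could be scripted. The Python sketch below rests on an
assumption inferred only from the single amacron line above: that the two
numbers are the Unicode number divided by 256 and the remainder, both in
decimal. The function name utf_character_line and the single-space layout are
hypothetical, so the output should be checked against xtag-utf.tex before use.

```python
def utf_character_line(name: str, codepoint: int) -> str:
    """Format one entry in the style of the amacron line quoted above.

    Assumption: the two numbers are codepoint // 256 and codepoint % 256.
    """
    if not 0 <= codepoint <= 0xFFFF:
        raise ValueError("only code points that fit in two bytes are handled")
    bank, index = divmod(codepoint, 256)
    return f"\\defineUTFcharacter {name} {bank} {index}"


print(utf_character_line("amacron", 0x0101))  # \defineUTFcharacter amacron 1 1
```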


> >UTF-8_and_Unicode_FAQ has some test files and I'm sure this step is not a
> >problem.
> 
> So, where can i find that doc?

The FAQ document is at: http://www.cl.cam.ac.uk/~mgk25/unicode.html,
and the example files are under:

http://www.cl.cam.ac.uk/~mgk25/unicode.html#examples

Please also take a look at
http://www.macchiato.com/unicode/Unicode_transcriptions.html,
which is listed under the examples.

There is also Unicode converter: http://www.macchiato.com/unicode/convert.html

> >a) c4 81 -> amacron
> >b) 0101  -> amacron
> 
> so, c4 is the trigger, and 81 the character; this means that the function 
> attached to c4 has to map the 81 onto \amacron

I'm not sure whether c4 is the trigger for the 81 character.

c4 81 is the two-byte representation in memory (that's what you'll see in a
hexadecimal editor) of the Unicode amacron character with the code U+0101, or
simply said: the utf-8 code for amacron :-)
  
> can you make me a file with a list like:
> 
> amacron : 01/01 : c4/c8 : <utfcode>
> ^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^
> normal ascii              real utf

So, the line for amacron should look like:

amacron	:	01/01	c4/81

since c4/81 is the UTF-8 code for amacron.

Is this OK?

-- 
Gour
gour@mail.inet.hr
Registered Linux User #278493


* Re: DocBookInContext & multi-languages (newbie)
  2002-11-30 20:15   ` Gour
  2002-11-30 20:55     ` Bruce D'Arcus
@ 2002-12-02 19:46     ` Simon Pepping
  2002-12-02 20:30       ` Tobias Burnus
                         ` (2 more replies)
  1 sibling, 3 replies; 38+ messages in thread
From: Simon Pepping @ 2002-12-02 19:46 UTC (permalink / raw)


On Sat, Nov 30, 2002 at 09:15:45PM +0100, Gour wrote:
> Simon Pepping (spepping@scaprea.hobby.nl) wrote:
> 
> > I would like to know that too :-) I have not yet found the time to
> > find out how Context deals with encodings. I only have a note that
> > says that one should do \useXMLfilter [utf], and that I should have a
> > look at the xtag-utf (which is input by the above command) or enco
> > files.
> 
> As far as I can see ConTeXt does not understand utf-8 encoding.
> 
> Where did you find this note mentioning utf?

On my computer :-) I collected remarks made on this list in that
document.

> Some time ago I saw a post on DocBook list from Sebastian Rahtz who is 
> considering to rewrite PassiveTex with ConTeXt support instead of LaTeX.

That would be very good; much better than just doing
docbook. Sometimes I think I would better spend my time on such an
effort, but I am afraid it is a huge task.
 
> The question remains, how to do it with multi-lingual document
> encoded in utf-8? 
>
> Any hint?

As is the case more often in open source: do it yourself. Hans has not
taken part in this discussion, so I think he does not feel like
embarking on an effort in this area.

The basic mechanism to make TeX work with encodings is to declare all
characters above 127 active, and map them to a suitable control
sequence. But that only works with single-byte encodings.

xmltex, David Carlisle's XML parser in tex, which is used by
Passivetex, can swallow and interpret utf-8 encoding. I think he
applies the utf-8 rules to the sequences of single bytes. It should be
easy to transfer this to Context, because it should not be macro
package dependent.

The other options are: use an input filter, like the program that was
mentioned in this thread. Or use NTS, the java based TeX
implementation. Currently it does not deal with multibyte encodings
because it is artificially restricted to 256 characters (if I remember
correctly) and because there are no input encoding macro packages for
higher character codes.

Sebastian's PassiveTeX has long mapping tables for unicode to latex
control sequences. These can be translated to context. (And they
could be made to work with NTS.)

While I am writing this, I am beginning to think that copying xmltex's
algorithm to context is the best way to go.

Regards, Simon

-- 
Simon Pepping
email: spepping@scaprea.hobby.nl


* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-02 17:40         ` Gour
@ 2002-12-02 20:16           ` Simon Pepping
  2002-12-02 21:57             ` Hans Hagen
  2002-12-03 19:14           ` DocBookInContext ... [CSX+, UTF-8 Roman, and Nagari Codings] Richard Mahoney
  1 sibling, 1 reply; 38+ messages in thread
From: Simon Pepping @ 2002-12-02 20:16 UTC (permalink / raw)


Hmm, I just wrote my email before fetching the emails that had been
exchanged over the weekend. Not a good idea.

On Mon, Dec 02, 2002 at 06:40:30PM +0100, Gour wrote:
> Hans Hagen (pragma@wxs.nl) wrote:
> 
> > >a) c4 81 -> amacron
> > >b) 0101  -> amacron
> > 
> > so, c4 is the trigger, and 81 the character; this means that the function 
> > attached to c4 has to map the 81 onto \amacron
> 
> I'm not sure whether c4 is the trigger for the 81 character.
> 
> c4 81 is two-byte representation in memory (that's what you'll see in some
> hexadecimal editor) of Unicode amacron character with the code U+0101, or
> simply said: utf-8 code for amacron :-)
>   
> > can you make me a file with a list like:
> > 
> > amacron : 01/01 : c4/c8 : <utfcode>
> > ^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^
> > normal ascii              real utf
> 
> So, the line for amacron should look like:
> 
> amacron	:	01/01	c4/c8
> 
> since c4/c8 is utfcode for amacron.
> 
> Is this OK?

I do not think the mapping files should touch utf-8. The input
mechanism should map utf-8 to unicode, and then the mapping should map
unicode to a macro. In that way the same mapping can be used by other
encodings, provided they have an input mapping to unicode.

Simon

-- 
Simon Pepping
email: spepping@scaprea.hobby.nl


* Re: DocBookInContext & multi-languages (newbie)
  2002-12-02 19:46     ` Simon Pepping
@ 2002-12-02 20:30       ` Tobias Burnus
  2002-12-02 21:54       ` Hans Hagen
       [not found]       ` <Pine.LNX.4.44.0212022106550.2205-100000@tom.physik.fu-berlin.de>
  2 siblings, 0 replies; 38+ messages in thread
From: Tobias Burnus @ 2002-12-02 20:30 UTC (permalink / raw)


Hi,

On Mon, 2 Dec 2002, Simon Pepping wrote:
> On Sat, Nov 30, 2002 at 09:15:45PM +0100, Gour wrote:
> > Simon Pepping (spepping@scaprea.hobby.nl) wrote:
> > > says that one should do \useXMLfilter [utf], and that I should have a
> > > look at the xtag-utf (which is input by the above command) or enco
> > > files.
> > As far as I can see ConTeXt does not understand utf-8 encoding.
Well it works with utf8 if you include xtag-utf.tex
($TEXMF/tex/context/base/xtag-utf.tex). That works for instance:
  \input xtag-utf.tex
  øãö
  \bye
(\o,\~a,\"o)

The problem is that that file doesn't contain all > 50000 characters but
only a few (basically latin1 accented characters)

> > Where did you find this note mentioning utf?
I think it went over the mailing list (look at the mailarchive).

> > Some time ago I saw a post on DocBook list from Sebastian Rahtz who is
> > considering to rewrite PassiveTex with ConTeXt support instead of LaTeX.
That would be nice!

> > The question remains, how to do it with multi-lingual document
> > encoded in utf-8?
> > Any hint?
See above. The problem is that a nice font would be needed too.

By the way, I'm looking for a nice looking serif font, which I can use as
math font and which contains at least all MES-1, better also the MES-2
characters (http://www.evertype.com/standards/iso10646/pdf/cwa13873.pdf)
and the default ligatures used by TeX.
So far I mainly found either WGL4 compatible fonts
(http://partners.adobe.com/asn/developer/opentype/appendices/wgl4.html) or
fonts which can be used for math in TeX, but not both. (At least not
within an amount of money which I can spend ;-)

Tobias


* Re: DocBookInContext & multi-languages (newbie)
  2002-12-02 19:46     ` Simon Pepping
  2002-12-02 20:30       ` Tobias Burnus
@ 2002-12-02 21:54       ` Hans Hagen
       [not found]       ` <Pine.LNX.4.44.0212022106550.2205-100000@tom.physik.fu-berlin.de>
  2 siblings, 0 replies; 38+ messages in thread
From: Hans Hagen @ 2002-12-02 21:54 UTC (permalink / raw)


At 08:46 PM 12/2/2002 +0100, you wrote:

> > Some time ago I saw a post on DocBook list from Sebastian Rahtz who is
> > considering to rewrite PassiveTex with ConTeXt support instead of LaTeX.
>
>That would be very good; much better than just doing
>docbook. Sometimes I think I would better spend my time on such an
>effort, but I am afraid it is a huge task.

since i know a bit about the context internals ... actually, the best way 
to handle it is to build on top of low level counterparts; will look into 
that later  [the only use i see for fo's in our workflows is as sub docs; 
kind of image like approach; so i'll write it anyway]

> > The question remains, how to do it with multi-lingual document
> > encoded in utf-8?
> >
> > Any hint?
>
>As is the case more often in open source: do it yourself. Hans has not
>taken part in this discussion, so I think he does not feel like
>embarking on an effort in this area.

hm, utf is on my agenda, but for chars that i don't use myself i depend on 
others -)

>The basic mechanism to make TeX work with encodings is to declare all
>characters above 127 active, and map them to a suitable control
>sequence. But that only works with single-byte encodings.

the machinery is already there for quite some time; it's the way 
chinese/korean is implemented; and for western languages the mechanism is 
even simpler.

>While I am writing this, I am beginning to think that copying xmltex's
>algorithm to context is the best way to go.

not sure about that; as said, context has the machinery already; i only 
need tables to work with (+the conversion stuff taco mailed earlier)

Hans


* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-02 20:16           ` Simon Pepping
@ 2002-12-02 21:57             ` Hans Hagen
  2002-12-03 20:03               ` Simon Pepping
  0 siblings, 1 reply; 38+ messages in thread
From: Hans Hagen @ 2002-12-02 21:57 UTC (permalink / raw)


At 09:16 PM 12/2/2002 +0100, you wrote:

>I do not think the mapping files should touch utf-8. The input
>mechanism should map utf-8 to unicode, and then the mapping should map
>unicode to a macro. In that way the same mapping can be used by other
>encodings, provided they have an input mapping to unicode.

actually, utf maps onto context internal named glyphs

Hans


* Re: DocBookInContext & multi-languages (newbie)
       [not found]       ` <Pine.LNX.4.44.0212022106550.2205-100000@tom.physik.fu-berlin.de>
@ 2002-12-02 21:59         ` Hans Hagen
  2002-12-03 12:48           ` Tobias Burnus
       [not found]           ` <Pine.LNX.4.44.0212031306170.23965-100000@warp9.physik.fu-berlin.de>
  0 siblings, 2 replies; 38+ messages in thread
From: Hans Hagen @ 2002-12-02 21:59 UTC (permalink / raw)


At 09:30 PM 12/2/2002 +0100, you wrote:

>The problem is that that file doesn't contain all > 50000 characters but
>only a few (basically latin1 accented characters)

right, and for a start we only need to build the second vector, which i 
didn't -)

>By the way, I'm looking for a nice looking serif font, which I can use as
>math font and which contains at least all MES-1, better also the MES-2
>characters (http://www.evertype.com/standards/iso10646/pdf/cwa13873.pdf)
>and the default ligatures used by TeX.
>So far I mainly found either WGL4 compatible fonts
>(http://partners.adobe.com/asn/developer/opentype/appendices/wgl4.html) or
>fonts which can be used for math in TeX, but not both. (At least not
>within a amount of money which I can spend ;-)

did you try palatino?

\usetypescript[palatino][texnansi] \setupbodyfont[palatino,10pt]

Hans


* Re: DocBookInContext & multi-languages (newbie)
  2002-12-02 21:59         ` Hans Hagen
@ 2002-12-03 12:48           ` Tobias Burnus
  2002-12-03 13:59             ` Willi Egger
       [not found]           ` <Pine.LNX.4.44.0212031306170.23965-100000@warp9.physik.fu-berlin.de>
  1 sibling, 1 reply; 38+ messages in thread
From: Tobias Burnus @ 2002-12-03 12:48 UTC (permalink / raw)


Hi,

On Mon, 2 Dec 2002, Hans Hagen wrote:
> did you try palatino?
You mean http://www.micropress-inc.com/fonts/pamath/pamain.htm + Palatino?
Otherwise I haven't found anything related to math and Palatino.

Additionally, Palatino (and/or PAMath) seem to have only the basic letters
from TeX and only little more.

Tobias


* Re: DocBookInContext & multi-languages (newbie)
       [not found]           ` <Pine.LNX.4.44.0212031306170.23965-100000@warp9.physik.fu-berlin.de>
@ 2002-12-03 13:45             ` Hans Hagen
  0 siblings, 0 replies; 38+ messages in thread
From: Hans Hagen @ 2002-12-03 13:45 UTC (permalink / raw)


At 01:48 PM 12/3/2002 +0100, you wrote:
>Hi,
>
>On Mon, 2 Dec 2002, Hans Hagen wrote:
> > did you try palatino?
>You mean http://www.micropress-inc.com/fonts/pamath/pamain.htm + Palatino?
>Otherwise I haven't found anything related to math and Palatino.
>
>Additionally, Palatino (and/or PAMath) seem to have only the basic letters
>from TeX and only little more.

Ah, take a look at tex live! search for tx/px fonts; they should work out 
of the box with the current context

[concerning utf-8: i now have a rather good converter running]

Hans


* Re: DocBookInContext & multi-languages (newbie)
  2002-12-03 12:48           ` Tobias Burnus
@ 2002-12-03 13:59             ` Willi Egger
  0 siblings, 0 replies; 38+ messages in thread
From: Willi Egger @ 2002-12-03 13:59 UTC (permalink / raw)


Hi Tobias,

I believe Hans is referring to the px fonts, which are included in the
TeX Live distribution.


Kind regards Willi

----- Original Message -----
From: "Tobias Burnus" <tobias.burnus@physik.fu-berlin.de>
To: <ntg-context@ntg.nl>
Sent: Tuesday, December 03, 2002 1:48 PM
Subject: Re: [NTG-context] DocBookInContext & multi-languages (newbie)


> Hi,
>
> On Mon, 2 Dec 2002, Hans Hagen wrote:
> > did you try palatino?
> You mean http://www.micropress-inc.com/fonts/pamath/pamain.htm + Palatino?
> Otherwise I haven't found anything related to math and Palatino.
>
> Additionally, Palatino (and/or PAMath) seem to have only the basic letters
> from TeX and only little more.
>
> Tobias

* Re: DocBookInContext ... [CSX+, UTF-8 Roman, and Nagari Codings]
  2002-12-02 17:40         ` Gour
  2002-12-02 20:16           ` Simon Pepping
@ 2002-12-03 19:14           ` Richard Mahoney
  2002-12-04 14:16             ` Gour
  1 sibling, 1 reply; 38+ messages in thread
From: Richard Mahoney @ 2002-12-03 19:14 UTC (permalink / raw)


On Mon, Dec 02, 2002 at 06:40:30PM +0100, Gour wrote:

> So here I see something like:
> 
> \defineUTFcharacter amacron	1  1
> 
> which corresponds to the Unicode code of amacron: U+0101 and it's according to
> the output of Vim's function: "ga" which shows:
> 
> <ā> 257, Hex 0101, Octal 401.
> 
> Now, it just a question of little work to slowly populate this vector with the
> values for different Unicode characters. 
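
A minimal sketch of the arithmetic behind that entry, assuming the two
numbers after the name are the high and low bytes of the code point. That
reading is consistent with "1 1" for U+0101 but is not confirmed against
the xtag-utf source, and the function names are made up for illustration:

```python
def split_code_point(code_point: int) -> tuple:
    """Split a two-byte code point into (high byte, low byte)."""
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("only code points up to U+FFFF fit in two bytes")
    return code_point >> 8, code_point & 0xFF


def describe(char: str) -> str:
    """Report decimal, hex and octal values, plus the assumed byte pair."""
    cp = ord(char)
    high, low = split_code_point(cp)
    return f"<{char}> {cp}, Hex {cp:04x}, Octal {cp:o} -> {high} {low}"


# amacron, U+0101: the numeric part matches what Vim's "ga" reports
print(describe("\u0101"))
```

Under the same assumption, Devanagari letter A (U+0905) would split into
9 and 5.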

To save yourself time you could look at two C programmes that indicate
CSXp, UTF-8 Roman, and UTF-8 Devanagari codings:

 `csxp2ur' -- converts CSXp --> UTF-8 Roman

 `ur2ud.c' -- converts UTF-8 Roman --> UTF-8 Devanagari

Both are from:

  ftp://bombay.oriental.cam.ac.uk/pub/john/software/programs/


Regards,

 Richard
 

-- 
Richard Mahoney    |  E-mail: rbm49@ext.canterbury.ac.nz
78 Jeffreys Road   |          r.mahoney@comnet.net.nz
Fendalton          |  Telephone: 0064-3-351-5831
CHRISTCHURCH 8005  |  Cellular: 0064-25-829-986
NEW ZEALAND        |  http://homepages.comnet.co.nz/~r-mahoney


* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-02 21:57             ` Hans Hagen
@ 2002-12-03 20:03               ` Simon Pepping
  2002-12-03 23:31                 ` Hans Hagen
  0 siblings, 1 reply; 38+ messages in thread
From: Simon Pepping @ 2002-12-03 20:03 UTC (permalink / raw)


On Mon, Dec 02, 2002 at 10:57:19PM +0100, Hans Hagen wrote:
> At 09:16 PM 12/2/2002 +0100, you wrote:
> 
> actually, utf maps onto context internal named glyphs

I had a brief look into xtag-utf, and the xtag-me? and xtag-mx?
modules.  I totally underestimated how much you have already done in
this area.

How does one get an internal name for say a Devanagari symbol? It
should somehow refer to a font or a font encoding that contains such a
symbol. Should the font encoding define such internal names, and map
them to the glyph indices in the font?

In another mail you refer to Chinese and Korean support; where can
that be seen?

Regards, Simon

-- 
Simon Pepping
email: spepping@scaprea.hobby.nl


* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-03 20:03               ` Simon Pepping
@ 2002-12-03 23:31                 ` Hans Hagen
  2002-12-04 14:10                   ` Gour
  0 siblings, 1 reply; 38+ messages in thread
From: Hans Hagen @ 2002-12-03 23:31 UTC (permalink / raw)


At 09:03 PM 12/3/2002 +0100, you wrote:
>On Mon, Dec 02, 2002 at 10:57:19PM +0100, Hans Hagen wrote:
> > At 09:16 PM 12/2/2002 +0100, you wrote:
> >
> > actually, utf maps onto context internal named glyphs
>
>I had a brief look into xtag-utf, and the xtag-me? and xtag-mx?
>modules.  I totally underestimated how much you have already done in
>this area.

I'll post the updated utf handler asap; documenting it now


>How does one get an internal name for say a Devanagari symbol? It
>should somehow refer to a font or a font encoding that contains such a
>symbol. Should the font encoding define such internal names, and map
>them to the glyph indices in the font?

names are best; for languages like chinese things are slightly more 
complicated because there the handler (several encodings are supported 
there) must take care of inter character breaking as well

>In another mail you refer to Chinese and Korean support; where can
>that be seen?

chinese is described in a manual at our site (follow showcase -> manuals); 
context documentation is being translated into chinese as well

korean is currently being implemented by Cho Jin-Hwan and Wang Lei (also 
supporting an extended version of dvipdfm which does unicode; quite nice to 
see chinese in widgets)

Hans


* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-03 23:31                 ` Hans Hagen
@ 2002-12-04 14:10                   ` Gour
  2002-12-04 16:31                     ` Hans Hagen
  0 siblings, 1 reply; 38+ messages in thread
From: Gour @ 2002-12-04 14:10 UTC (permalink / raw)


Hans Hagen (pragma@wxs.nl) wrote:

> names are best; for languages like chinese things are slightly more 
> complicated because there the handler (several encodings are supported 
> there) must take care of inter character breaking as well

I've just looked briefly at two Devanagari Unicode fonts; e.g. Devanagari
letter A, with the code U+0905, is named "glyph92". Other characters follow
the same pattern.
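
For comparison, a small Python sketch that looks up the standard Unicode
character name and the UTF-8 bytes for a code point. It only queries
Python's own unicodedata tables and says nothing about the glyph names
stored inside a particular font; glyph_info is a hypothetical helper:

```python
import unicodedata


def glyph_info(code_point: int) -> dict:
    """Return the U+ notation, Unicode name and UTF-8 bytes of a code point."""
    char = chr(code_point)
    return {
        "code": f"U+{code_point:04X}",
        "name": unicodedata.name(char),
        "utf8": " ".join(f"{byte:02x}" for byte in char.encode("utf-8")),
    }


for cp in (0x0101, 0x0905):
    print(glyph_info(cp))
```

A font that uses index-style names such as "glyph92" would need a separate
table to get from these standard names to its own glyphs.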

Sincerely,
Gour

-- 
Gour
gour@mail.inet.hr
Registered Linux User #278493


* Re: Re: DocBookInContext ... [CSX+, UTF-8 Roman, and Nagari Codings]
  2002-12-03 19:14           ` DocBookInContext ... [CSX+, UTF-8 Roman, and Nagari Codings] Richard Mahoney
@ 2002-12-04 14:16             ` Gour
  0 siblings, 0 replies; 38+ messages in thread
From: Gour @ 2002-12-04 14:16 UTC (permalink / raw)


Richard Mahoney (rbm49@ext.canterbury.ac.nz) wrote:

> To save yourself time you could look at two C programmes that indicate
> CSXp, UTF-8 Roman, and UTF-8 Devanagari codings:

Thank you for that, Richard.

However, since I'm not so familiar with the Devanagari script, I thought I
would just provide the part of the utf-8 vector for the Western
transliteration characters.

(it's according to the list you provided on the URL bca*.html)

There are around 30 characters, and I have defined them all for entry within
Vim as well as in X via the Compose key in epcEdit.

Providing this part of Unicode would already cover the needs of some users :-)

Sincerely,
Gour


* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-04 14:10                   ` Gour
@ 2002-12-04 16:31                     ` Hans Hagen
  2002-12-04 20:08                       ` Gour
  0 siblings, 1 reply; 38+ messages in thread
From: Hans Hagen @ 2002-12-04 16:31 UTC (permalink / raw)


At 03:10 PM 12/4/2002 +0100, you wrote:
>Hans Hagen (pragma@wxs.nl) wrote:
>
> > names are best; for languages like chinese things are slightly more
> > complicated because there the handler (several encodings are supported
> > there) must take care of inter character breaking as well
>
>I've just looked briefly on two Devanagari Unicode fonts, and e.g. Devanagari
>letter A with the code U+0905, is named "glyph92". Other characters just
>follow the pattern.

hm, in the thousands-of-glyphs test doc that i use i see that they do have 
proper names; what is a good type1 font for testing?

we need

- some demo utf input
- a font with the glyphs
- a suitable map/encoding file for pdftex

Hans

* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-04 16:31                     ` Hans Hagen
@ 2002-12-04 20:08                       ` Gour
  2002-12-05  0:10                         ` multi-languages [UTF-8 Roman and UTF-8 Nagari test files] Richard Mahoney
  2002-12-05 11:58                         ` DocBookInContext & multi-languages (newbie) / utf Hans Hagen
  0 siblings, 2 replies; 38+ messages in thread
From: Gour @ 2002-12-04 20:08 UTC (permalink / raw)


Hans Hagen (pragma@wxs.nl) wrote:

> hm, in the thousands-of-glyphs test doc that i use i see that they do have 
> proper names; what is a good type1 font for testing?

I only found a few ttf fonts. If they can be used/transformed, I can send them.

Here is the URL for the Titus font, created in cooperation with Bitstream Inc.:

http://titus.uni-frankfurt.de/unicode/unitest2.htm#TITUUT

> we need
> 
> - some demo utf input

Maybe Richard can supply something.

> - a font with the glyphs

Pls. see above mentioned font.

Sincerely,
Gour


* Re: multi-languages [UTF-8 Roman and UTF-8 Nagari test files]
  2002-12-04 20:08                       ` Gour
@ 2002-12-05  0:10                         ` Richard Mahoney
  2002-12-05 11:58                         ` DocBookInContext & multi-languages (newbie) / utf Hans Hagen
  1 sibling, 0 replies; 38+ messages in thread
From: Richard Mahoney @ 2002-12-05  0:10 UTC (permalink / raw)


On Wed, Dec 04, 2002 at 09:08:47PM +0100, Gour wrote:
> Hans Hagen (pragma@wxs.nl) wrote:

> > we need
> > 
> > - some demo utf input
> 
> Maybe Richard can supply something.

Well you asked for it ;-)

I've uploaded the following. Do with them as you see fit ...

         1385 Dec  4 23:08 Skt_CSXp.tex
         1436 Dec  4 23:08 Skt_UTF-8.tex
        18227 Dec  4 23:56 Skt_UTF-8_Nagari_BCA_1_1.png
          311 Dec  4 23:52 Skt_UTF-8_Nagari_BCA_1_1.txt
        22786 Dec  4 23:55 Skt_UTF-8_Roman_BCA_1_1.png
          161 Dec  4 23:52 Skt_UTF-8_Roman_BCA_1_1.txt

Each can be accessed at:

 http://homepages.comnet.co.nz/~r-mahoney/sundries/file_name

The PNGs show their respective files in `yudit'.

I can't comment on IE, but the Nagari and Roman translit in these text
files displays without issue with Netscape7 under FreeBSD
4.7-STABLE. Your mileage may vary.


Regards,

 Richard


P.S. Please say if you need anything else.



* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-04 20:08                       ` Gour
  2002-12-05  0:10                         ` multi-languages [UTF-8 Roman and UTF-8 Nagari test files] Richard Mahoney
@ 2002-12-05 11:58                         ` Hans Hagen
  2002-12-05 12:22                           ` Taco Hoekwater
                                             ` (2 more replies)
  1 sibling, 3 replies; 38+ messages in thread
From: Hans Hagen @ 2002-12-05 11:58 UTC (permalink / raw)


At 09:08 PM 12/4/2002 +0100, you wrote:
>Hans Hagen (pragma@wxs.nl) wrote:
>
> > hm, in the thousands-of-glyphs test doc that i use i see that they do have
> > proper names; what is a good type1 font for testing?
>
>I only found few ttf fonts. If they can be used/transformed, I can send them.
>
>Here is the url for Titus font created in cooperation with Bitstream Inc.:
>
>http://titus.uni-frankfurt.de/unicode/unitest2.htm#TITUUT
>
> > we need
> >
> > - some demo utf input
>
>Maybe Richard can supply something.
>
> > - a font with the glyphs
>
>Pls. see above mentioned font.

what we need for that font is a series of proper tfm files; do you know of 
any progress in that direction?

keep in mind that in order for utf to work, we need to switch fonts

Hans


* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-05 11:58                         ` DocBookInContext & multi-languages (newbie) / utf Hans Hagen
@ 2002-12-05 12:22                           ` Taco Hoekwater
  2002-12-05 13:25                             ` Hans Hagen
  2002-12-05 14:03                           ` Tobias Burnus
  2002-12-05 19:09                           ` Create Type 1 fonts with Indological diacritics and UTF-8 TTF Richard Mahoney
  2 siblings, 1 reply; 38+ messages in thread
From: Taco Hoekwater @ 2002-12-05 12:22 UTC (permalink / raw)


On Thu, 05 Dec 2002 12:58:59 +0100, Hans wrote:

> >Pls. see above mentioned font.
> 
> what we need for that font is a series of proper tfm files; do you know of 
> any progress in that direction
> 
> keep in mind that in order for utf to work, we need to switch fonts

So that's what happened to Cyberbit :)

What is a "proper tfm" in this context? A collection of TFMs that are dumps of
unicode hex blocks is easy to create.

-- 
groeten,

Taco


* Re: DocBookInContext & multi-languages (newbie) / utf
  2002-12-05 12:22                           ` Taco Hoekwater
@ 2002-12-05 13:25                             ` Hans Hagen
  0 siblings, 0 replies; 38+ messages in thread
From: Hans Hagen @ 2002-12-05 13:25 UTC (permalink / raw)


At 01:22 PM 12/5/2002 +0100, Taco Hoekwater wrote:
>On Thu, 05 Dec 2002 12:58:59 +0100, Hans wrote:
>
> > >Pls. see above mentioned font.
> >
> > what we need for that font is a series of proper tfm files; do you know of
> > any progress in that direction
> >
> > keep in mind that in order for utf to work, we need to switch fonts
>
>So that's what happened to Cyberbit :)
>
>What is a "proper tfm" in this context? A collection of TFMs that are dumps of
>unicode hex blocks is easy to create.

indeed, just found out how to do that; i think that we just need a series 
of enc files like:

/Unicode0x09 [
/index0x0900/index0x0901/index0x0902/index0x0903
/index0x0904/index0x0905/index0x0906/index0x0907
/index0x0908/index0x0909/index0x090A/index0x090B
/index0x090C/index0x090D/index0x090E/index0x090F
/index0x0910/index0x0911/index0x0912/index0x0913

so:

unifont.ttf ->

afm : unifont-0x09.afm
tfm : unifont-0x09.tfm
enc : range0x09.enc
map : appropriate entry

am i right?
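
A minimal sketch of how such a per-block vector and the matching file names
could be generated, following the naming pattern in this message. The
closing "] def" is the usual ending of an encoding file and is an assumption
here, as is the idea that ttf2afm or pdftex accept these /index0x.... names;
the function names are hypothetical:

```python
def encoding_vector(block: int) -> str:
    """Build the text of an encoding vector for one 256-slot Unicode block."""
    names = [f"/index0x{block:02X}{slot:02X}" for slot in range(256)]
    # Four names per row, as in the excerpt above.
    rows = ["".join(names[i:i + 4]) for i in range(0, 256, 4)]
    return f"/Unicode0x{block:02X} [\n" + "\n".join(rows) + "\n] def\n"


def file_names(font: str, block: int) -> dict:
    """Derive the per-block afm/tfm/enc names listed above."""
    tag = f"0x{block:02X}"
    return {
        "afm": f"{font}-{tag}.afm",
        "tfm": f"{font}-{tag}.tfm",
        "enc": f"range{tag}.enc",
    }


# Block 0x09 holds Devanagari (U+0900 and up)
print(encoding_vector(0x09).splitlines()[1])
print(file_names("unifont", 0x09))
```

One tfm per block is what makes the font switch necessary: each tfm covers
256 slots, so a code point selects a block (the font) and a slot within it.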

Hans


* Re: DocBookInContext & multi-languages (newbie)  / utf
  2002-12-05 11:58                         ` DocBookInContext & multi-languages (newbie) / utf Hans Hagen
  2002-12-05 12:22                           ` Taco Hoekwater
@ 2002-12-05 14:03                           ` Tobias Burnus
  2002-12-05 19:09                           ` Create Type 1 fonts with Indological diacritics and UTF-8 TTF Richard Mahoney
  2 siblings, 0 replies; 38+ messages in thread
From: Tobias Burnus @ 2002-12-05 14:03 UTC (permalink / raw)


Hi,

On Thu, 5 Dec 2002, Hans Hagen wrote:
> > > - some demo utf input
http://www.cl.cam.ac.uk/~mgk25/unicode.html#examples
also one line above.

> > > - a font with the glyphs
There is also a nice shareware font, Code2000
(http://home.att.net/~jameskass/), which contains a lot of characters, but
only in a regular style (no italic, bold, ...).

Additionally there are:
http://bibliofile.mc.duke.edu/gww/fonts/Unicode.html
http://www.hclrss.demon.co.uk/unicode/fontsbyrange.html

Maybe http://bibliofile.mc.duke.edu/gww/fonts/Unicode.html
is best, since it is free and also contains italic, small caps, etc.

Tobias


* Re: Create Type 1 fonts with Indological diacritics and UTF-8 TTF
  2002-12-05 11:58                         ` DocBookInContext & multi-languages (newbie) / utf Hans Hagen
  2002-12-05 12:22                           ` Taco Hoekwater
  2002-12-05 14:03                           ` Tobias Burnus
@ 2002-12-05 19:09                           ` Richard Mahoney
  2002-12-06 14:10                             ` Hans Hagen
  2 siblings, 1 reply; 38+ messages in thread
From: Richard Mahoney @ 2002-12-05 19:09 UTC (permalink / raw)



On Thu, Dec 05, 2002 at 12:58:59PM +0100, Hans Hagen wrote:

> >I only found few ttf fonts. If they can be used/transformed, I can send 
> >them.
> >
> >Here is the url for Titus font created in cooperation with Bitstream Inc.:
> >
> >http://titus.uni-frankfurt.de/unicode/unitest2.htm#TITUUT
> >

For a list of UTF-8 TrueType fonts with Indological diacritics see
Andrew Glass's latest post to INDOLOGY:

 http://listserv.liv.ac.uk/archives/indology.html

 Date:         Thu, 5 Dec 2002 08:04:01 -0800
 Reply-To:     Indology <INDOLOGY@LISTSERV.LIV.AC.UK>
 Sender:       Indology <INDOLOGY@LISTSERV.LIV.AC.UK>
 From:         Andrew Glass <asg@U.WASHINGTON.EDU>
 Organization: University of Washington
 Subject:      Re: Additional formats for e-texts in GRETIL

A search of the archives under `unicode' may also be worthwhile.


> what we need for that font is a series of proper tfm files; do you know of 
> any progress in that direction
> 
> keep in mind that in order for utf to work, we need to switch fonts

For creating Type 1 fonts with Indological diacritics, would
`mkt1font' and `vpl2vpl' be helpful? See:

 ftp://bombay.oriental.cam.ac.uk/pub/john/software/programs/accfonts/README


Regards,

 Richard Mahoney



* Re: Re: Create Type 1 fonts with Indological diacritics and UTF-8 TTF
  2002-12-05 19:09                           ` Create Type 1 fonts with Indological diacritics and UTF-8 TTF Richard Mahoney
@ 2002-12-06 14:10                             ` Hans Hagen
  2002-12-06 15:22                               ` Docu set Michael Hallgren
  2002-12-06 15:36                               ` Re: Create Type 1 fonts with Indological diacritics and UTF-8 TTF Gour
  0 siblings, 2 replies; 38+ messages in thread
From: Hans Hagen @ 2002-12-06 14:10 UTC (permalink / raw)


At 08:09 AM 12/6/2002 +1300, Richard Mahoney wrote:

>On Thu, Dec 05, 2002 at 12:58:59PM +0100, Hans Hagen wrote:
>
> > >I only found few ttf fonts. If they can be used/transformed, I can send
> > >them.
> > >
> > >Here is the url for Titus font created in cooperation with Bitstream Inc.:
> > >
> > >http://titus.uni-frankfurt.de/unicode/unitest2.htm#TITUUT
> > >

btw, quite painful that this font has funny glyph names

Hans

* Docu set
  2002-12-06 14:10                             ` Hans Hagen
@ 2002-12-06 15:22                               ` Michael Hallgren
  2002-12-07 14:12                                 ` Patrick Gundlach
  2002-12-06 15:36                               ` Re: Create Type 1 fonts with Indological diacritics and UTF-8 TTF Gour
  1 sibling, 1 reply; 38+ messages in thread
From: Michael Hallgren @ 2002-12-06 15:22 UTC (permalink / raw)



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

what wget knobs and file should I currently, ideally, use for polling the
full set of documentation and other material off the Pragma server?

Cheers,

mh

- --
Michael Hallgren, http://m.hallgren.free.fr/, mh2198-ripe
-----BEGIN PGP SIGNATURE-----
Version: PGP 8.0 (Build 349) Beta

iQA/AwUBPfDAmCsEKmyTmvZLEQL4NACcDf/sKtYCEimYDvyuf9iJhCqEL38Anjur
HnJ2MfktPjbLXAan/IcYP7OX
=BMP/
-----END PGP SIGNATURE-----


* Re: Re: Create Type 1 fonts with Indological diacritics and UTF-8 TTF
  2002-12-06 14:10                             ` Hans Hagen
  2002-12-06 15:22                               ` Docu set Michael Hallgren
@ 2002-12-06 15:36                               ` Gour
  2002-12-06 16:47                                 ` Hans Hagen
  1 sibling, 1 reply; 38+ messages in thread
From: Gour @ 2002-12-06 15:36 UTC (permalink / raw)


Hans Hagen (pragma@wxs.nl) wrote:

> 
> btw, quite painful that this font has funny glyph names
> 

What is the general proposal for solving the issue with the glyph names?

For the lower parts of the font there is probably some standard, but one
can expect problems with the upper parts.

Sincerely,
Gour


* Re: Re: Create Type 1 fonts with Indological diacritics and UTF-8 TTF
  2002-12-06 15:36                               ` Re: Create Type 1 fonts with Indological diacritics and UTF-8 TTF Gour
@ 2002-12-06 16:47                                 ` Hans Hagen
  0 siblings, 0 replies; 38+ messages in thread
From: Hans Hagen @ 2002-12-06 16:47 UTC (permalink / raw)


At 04:36 PM 12/6/2002 +0100, you wrote:
>Hans Hagen (pragma@wxs.nl) wrote:
>
> >
> > btw, quite painful that this font has funny glyph names
> >
>
>What is the general proposal how to solve the issue with the glyph names?
>
>For the lower parts of the font, probably there is some standard, but also one
>can expect problems with the upper parts.

i'm trying something:

unicode0x09.enc with entries like /.c0x0012 and the like, but ttf2afm and/or 
ttf2tfm somehow don't see things in the same way and/or skip ranges and/or 
mess up things [i spent a good deal of time searching for scripts and apps 
and making a few perl scripts, and am slowly getting depressed now]

Hans


* Re: Docu set
  2002-12-06 15:22                               ` Docu set Michael Hallgren
@ 2002-12-07 14:12                                 ` Patrick Gundlach
  2002-12-07 17:37                                   ` Michael Hallgren
  0 siblings, 1 reply; 38+ messages in thread
From: Patrick Gundlach @ 2002-12-07 14:12 UTC (permalink / raw)


"Michael Hallgren" <m.hallgren@free.fr> writes:

Hi,

> what wget knobs and file should I currently, ideally use for polling the
> full set of documentation and other off the Pragma server?

get the file:

http://www.pragma-ade.com/context.www

and type wget -Nxi context.www for an automatic download. It seems as
if this is not 100% up to date, but almost.


Patrick


* RE: Docu set
  2002-12-07 14:12                                 ` Patrick Gundlach
@ 2002-12-07 17:37                                   ` Michael Hallgren
  0 siblings, 0 replies; 38+ messages in thread
From: Michael Hallgren @ 2002-12-07 17:37 UTC (permalink / raw)


> Hi,
>
> > what wget knobs and file should I currently, ideally use for polling the
> > full set of documentation and other off the Pragma server?
>
> get the file:
>
> http://www.pragma-ade.com/context.www
>

Thanks, that was the file name I had forgotten.



> and type wget -Nxi context.www for an automatic download. It seems as
> if this is not 100% up to date, but almost.
>

mh

>
> Patrick

end of thread, other threads:[~2002-12-07 17:37 UTC | newest]

Thread overview: 38+ messages
2002-11-29  7:20 DocBookInContext & multi-languages (newbie) Gour
2002-11-29 19:18 ` Simon Pepping
2002-11-30 20:15   ` Gour
2002-11-30 20:55     ` Bruce D'Arcus
2002-12-01  6:40       ` Gour
2002-12-02 19:46     ` Simon Pepping
2002-12-02 20:30       ` Tobias Burnus
2002-12-02 21:54       ` Hans Hagen
     [not found]       ` <Pine.LNX.4.44.0212022106550.2205-100000@tom.physik.fu-berlin.de>
2002-12-02 21:59         ` Hans Hagen
2002-12-03 12:48           ` Tobias Burnus
2002-12-03 13:59             ` Willi Egger
     [not found]           ` <Pine.LNX.4.44.0212031306170.23965-100000@warp9.physik.fu-berlin.de>
2002-12-03 13:45             ` Hans Hagen
2002-12-02 12:28   ` DocBookInContext & multi-languages (newbie) / utf Hans Hagen
2002-12-02 13:59     ` Gour
2002-12-02 14:43       ` Hans Hagen
2002-12-02 16:36         ` Taco Hoekwater
2002-12-02 17:40         ` Gour
2002-12-02 20:16           ` Simon Pepping
2002-12-02 21:57             ` Hans Hagen
2002-12-03 20:03               ` Simon Pepping
2002-12-03 23:31                 ` Hans Hagen
2002-12-04 14:10                   ` Gour
2002-12-04 16:31                     ` Hans Hagen
2002-12-04 20:08                       ` Gour
2002-12-05  0:10                         ` multi-languages [UTF-8 Roman and UTF-8 Nagari test files] Richard Mahoney
2002-12-05 11:58                         ` DocBookInContext & multi-languages (newbie) / utf Hans Hagen
2002-12-05 12:22                           ` Taco Hoekwater
2002-12-05 13:25                             ` Hans Hagen
2002-12-05 14:03                           ` Tobias Burnus
2002-12-05 19:09                           ` Create Type 1 fonts with Indological diacritics and UTF-8 TTF Richard Mahoney
2002-12-06 14:10                             ` Hans Hagen
2002-12-06 15:22                               ` Docu set Michael Hallgren
2002-12-07 14:12                                 ` Patrick Gundlach
2002-12-07 17:37                                   ` Michael Hallgren
2002-12-06 15:36                               ` Re: Create Type 1 fonts with Indological diacritics and UTF-8 TTF Gour
2002-12-06 16:47                                 ` Hans Hagen
2002-12-03 19:14           ` DocBookInContext ... [CSX+, UTF-8 Roman, and Nagari Codings] Richard Mahoney
2002-12-04 14:16             ` Gour
