ntg-context - mailing list for ConTeXt users
* small caps
From: Thomas A.Schmitz @ 2005-02-26  9:13 UTC (permalink / raw)


Over the last few days, I have been playing with some TrueType fonts, 
preparing them for use with ConTeXt by creating tfms via the texnansi 
encoding. Some of these fonts include expert glyphs. texnansi 
automatically takes care of the "ff," "ffi" and "ffl" ligatures. To get 
at the small caps and old-style numerals, I created a modified texnansi 
encoding. Here it is:

/TeXnANSISCEncoding [
/.notdef % 0
/Euro % /Uni20AC 1
/.notdef % 2
/.notdef % 3
/fraction %	4
/dotaccent %	5
/hungarumlaut %	6
/ogonek	%	7
/fl	%	8
/.notdef % /fraction %	9	not used (see 4), backward compatibility only
/cwm	%	10	not used, except boundary char internally maybe
/ff    %	11
/fi    %	12
/.notdef % /fl    %	13	not used (see 8), backward compatibility only
/ffi   %	14
/ffl   %	15
/dotlessi %	16
/dotlessj %	17
/grave %	18
/acute %	19
/caron %	20
/breve %	21
/macron %	22
/ring  %	23
/cedilla %	24
/germandbls %	25
/AEsmall    %	26
/OEsmall    %	27
/Oslashsmall %	28
/AE    %	29
/OE    %	30
/Oslash %	31
/space %	32	% /suppress in TeX text
/exclam %	33
/quotedbl %	34	% /quotedblright in TeX text
/numbersign %	35
/dollar %	36
/percent %	37
/ampersand %	38
/quoteright %	39	% /quotesingle in ANSI
/parenleft %	40
/parenright %	41
/asterisk %	42
/plus  %	43
/comma %	44
/hyphen %	45
/period %	46
/slash %	47
/zerooldstyle  %	48
/oneoldstyle   %	49
/twooldstyle   %	50
/threeoldstyle %	51
/fouroldstyle  %	52
/fiveoldstyle  %	53
/sixoldstyle   %	54
/sevenoldstyle %	55
/eightoldstyle %	56
/nineoldstyle  %	57
/colon %	58
/semicolon %	59
/less  %	60	% /exclamdown in TeX text
/equal %	61
/greater %	62	% /questiondown in TeX text
/question %	63
/at %	64
/A %	65
/B %	66
/C %	67
/D %	68
/E %	69
/F %	70
/G %	71
/H %	72
/I %	73
/J %	74
/K %	75
/L %	76
/M %	77
/N %	78
/O %	79
/P %	80
/Q %	81
/R %	82
/S %	83
/T %	84
/U %	85
/V %	86
/W %	87
/X %	88
/Y %	89
/Z %	90
/bracketleft %	91
/backslash %	92	% /quotedblleft in TeX text
/bracketright %	93
/circumflex %	94	% /asciicircum in ASCII
/underscore %	95	% /dotaccent in TeX text
/quoteleft %	96	% /grave accent in ANSI
/Asmall %	97
/Bsmall %	98
/Csmall %	99
/Dsmall %	100
/Esmall %	101
/Fsmall %	102
/Gsmall %	103
/Hsmall %	104
/Ismall %	105
/Jsmall %	106
/Ksmall %	107
/Lsmall %	108
/Msmall %	109
/Nsmall %	110
/Osmall %	111
/Psmall %	112
/Qsmall %	113
/Rsmall %	114
/Ssmall %	115
/Tsmall %	116
/Usmall %	117
/Vsmall %	118
/Wsmall %	119
/Xsmall %	120
/Ysmall %	121
/Zsmall %	122
/braceleft %	123	% /endash in TeX text
/bar   %	124	% /emdash in TeX text
/braceright %	125	% /hungarumlaut in TeX text
/tilde %	126	% /asciitilde in ASCII
/dieresis %	127	not used (see 168), use higher up instead
/Lslash	%	128	this position is unfortunate, but now too late to fix
/quotesingle %	129
/quotesinglbase %	130
/florin %	131
/quotedblbase %	132
/ellipsis %	133
/dagger %	134
/daggerdbl %	135
/circumflex %	136
/perthousand %	137
/Scaron %	138
/guilsinglleft %	139
/OE    %	140
/Zcaron %	141
/asciicircum %	142
/minus %	143
/lslash %	144
/quoteleft %	145
/quoteright %	146
/quotedblleft %	147
/quotedblright %	148
/bullet %	149
/endash %	150
/emdash %	151
/tilde %	152
/trademark %	153
/scaron %	154
/guilsinglright %	155
/OEsmall %	156
/zcaron %	157
/asciitilde %	158
/Ydieresis %	159
/nbspace %	160	% /space (no break space)
/exclamdown %	161
/cent  %	162
/sterling %	163
/currency %	164
/yen   %	165
/brokenbar %	166
/section %	167
/dieresis %	168
/copyright %	169
/ordfeminine %	170
/guillemotleft %	171
/logicalnot %	172
/sfthyphen %	173 % /hyphen (hanging hyphen)
/registered %	174
/macron %	175
/degree %	176
/plusminus %	177
/twosuperior %	178
/threesuperior %	179
/acute %	180
/mu    %	181
/paragraph %	182
/periodcentered %	183
/cedilla %	184
/onesuperior %	185
/ordmasculine %	186
/guillemotright %	187
/onequarter %	188
/onehalf %	189
/threequarters %	190
/questiondown %	191
/Agrave %	192
/Aacute %	193
/Acircumflex %	194
/Atilde %	195
/Adieresis %	196
/Aring %	197
/AE    %	198
/Ccedilla %	199
/Egrave %	200
/Eacute %	201
/Ecircumflex %	202
/Edieresis %	203
/Igrave %	204
/Iacute %	205
/Icircumflex %	206
/Idieresis %	207
/Eth   %	208
/Ntilde %	209
/Ograve %	210
/Oacute %	211
/Ocircumflex %	212
/Otilde %	213
/Odieresis %	214
/multiply %	215	% OE in T1
/Oslash %	216
/Ugrave %	217
/Uacute %	218
/Ucircumflex %	219
/Udieresis %	220
/Yacute %	221
/Thorn %	222
/germandbls %	223
/Agravesmall %	224
/Aacutesmall %	225
/Acircumflexsmall %	226
/Atildesmall %	227
/Adieresissmall %	228
/Aringsmall %	229
/AEsmall    %	230
/Ccedillasmall %	231
/Egravesmall %	232
/Eacutesmall %	233
/Ecircumflexsmall %	234
/Edieresissmall %	235
/Igravesmall %	236
/Iacutesmall %	237
/Icircumflexsmall %	238
/Idieresissmall %	239
/eth   %	240
/Ntildesmall %	241
/Ogravesmall %	242
/Oacutesmall %	243
/Ocircumflexsmall %	244
/Otildesmall %	245
/Odieresissmall %	246
/divide %	247	% oe in T1
/Oslashsmall %	248
/Ugravesmall %	249
/Uacutesmall %	250
/Ucircumflexsmall %	251
/Udieresissmall %	252
/Yacutesmall %	253
/Thornsmall %	254
/Ydieresissmall %	255	% germandbls in T1
] def

With this encoding, I was able to create small-cap fonts with 
old-style numerals that I could then use in my typescript. So I'm 
wondering: is this a good idea, or is there a simpler way of doing 
this?
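
The substitution that the vector above applies to the base texnansi
names can be sketched roughly as follows. This is only an illustration,
not how the file above was produced: the function name to_osf_sc is
hypothetical, and it assumes the font uses the Adobe-style names seen in
the listing (Asmall, Agravesmall, oneoldstyle).

```python
import re

# Digit glyph names that get an old-style counterpart.
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]

# Lowercase glyphs whose small-cap name is not a simple "upper + small".
SPECIAL = {"ae": "AEsmall", "oe": "OEsmall",
           "oslash": "Oslashsmall", "thorn": "Thornsmall"}

# A single lowercase letter, optionally followed by an accent name.
LETTER = re.compile(
    r"([a-z])(grave|acute|circumflex|tilde|dieresis|ring|cedilla)?")

def to_osf_sc(glyph):
    """Map one base glyph name to its small-cap / old-style name.

    Glyphs with no obvious counterpart are returned unchanged.
    """
    if glyph in DIGITS:
        return glyph + "oldstyle"
    if glyph in SPECIAL:
        return SPECIAL[glyph]
    m = LETTER.fullmatch(glyph)
    if m:
        return m.group(1).upper() + (m.group(2) or "") + "small"
    return glyph

base = ["a", "eacute", "seven", "acute", "A", "fi"]
print([to_osf_sc(g) for g in base])
```

Names such as acute, fi or A fall through untouched, which is why the
accents, ligatures and capitals keep their base names.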

Best

Thomas

* Re: small caps
From: Adam Lindsay @ 2005-02-26 23:34 UTC (permalink / raw)


Thomas A.Schmitz said this at Sat, 26 Feb 2005 10:13:00 +0100:

>Over the last few days, I have been playing with some TrueType fonts, 
>preparing them for use with ConTeXt by creating tfms via the texnansi 
>encoding. 

Hello (again) Thomas,

This is good stuff. I've tried to advocate a naming convention that would
be appropriate to this. I would suggest calling this texnansi-osfsc.enc,
as baseencoding-variant.enc. This is so a modified encoding can
"masquerade" as the base encoding within ConTeXt.

Given this encoding with my suggested name, you could therefore run
texfont as follows:
 texfont --encoding=texnansi --variant=osfsc   --[other options]

Variants that select rarer features than Old Style Figures and Small Caps
may need to be given font-specific names, as rare glyph names tend to
vary wildly between fonts.
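
The naming pattern can be written down in a couple of lines. This is a
sketch only: both helper names are hypothetical, and it builds nothing
beyond the file name and the two texfont options mentioned above.

```python
def variant_enc_name(base, variant):
    """File name for a variant of a base encoding: base-variant.enc."""
    return f"{base}-{variant}.enc"

def texfont_args(base, variant):
    """The two texfont options from the example; other options omitted."""
    return [f"--encoding={base}", f"--variant={variant}"]

print(variant_enc_name("texnansi", "osfsc"))
print(" ".join(["texfont"] + texfont_args("texnansi", "osfsc")))
```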

Cheers,
adam
-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Adam T. Lindsay, Computing Dept.     atl@comp.lancs.ac.uk
 Lancaster University, InfoLab21        +44(0)1524/510.514
 Lancaster, LA1 4WA, UK             Fax:+44(0)1524/510.492
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

* Re: small caps
From: Thomas A.Schmitz @ 2005-02-27  7:57 UTC (permalink / raw)


> This is good stuff. I've tried to advocate a naming convention that 
> would
> be appropriate to this. I would suggest calling this 
> texnansi-osfsc.enc,
> as baseencoding-variant.enc. This is so a modified encoding can
> "masquerade" as the base encoding within ConTeXt.
>
> Given this encoding with my suggested name, you could therefore run
> texfont as follows:
>  texfont --encoding=texnansi --variant=osfsc   --[other options]

That is by far the most elegant solution indeed! I have renamed my 
encoding file.

>
> Variants that select rarer features than Old Style Figures and Small 
> Caps
> may need to be given font-specific names, as rare glyph names tend to
> vary wildly between fonts.
>
Sadly, you are absolutely right about this. And it's not only rare 
glyphs that get wildly different names. There was some rumor on the TeX 
on OS X list that people couldn't get the beautiful HoeflerText font to 
work with TeX; it turned out that this was true for newer versions of 
the font only. I looked into it, and it turns out that Apple (?) has 
given new names even to quite "normal" characters - eacute becomes 
e_acute etc. So if you want to produce a tfm for that font, you have to 
invent a specific encoding vector. Once you know how this works, it's 
easy enough, but really annoying.
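
A font-specific vector of this kind can be derived mechanically from the
base vector and a rename table. The sketch below is an assumption-laden
illustration: the function names, the encoding name and the three-slot
vector are made up, and the only rename taken from this thread is
eacute -> e_acute.

```python
def rename_glyphs(vector, renames):
    """Replace base glyph names with the names a particular font uses."""
    return [renames.get(name, name) for name in vector]

def format_enc(enc_name, vector):
    """Write the vector in the '/Name [ ... ] def' layout of an .enc file."""
    lines = [f"/{enc_name} ["]
    lines += [f"/{glyph} % {slot}" for slot, glyph in enumerate(vector)]
    lines.append("] def")
    return "\n".join(lines) + "\n"

# Hypothetical three-slot excerpt; a real vector has 256 entries.
base_vector = [".notdef", "eacute", "A"]
font_renames = {"eacute": "e_acute"}

print(format_enc("HoeflerSketchEncoding",
                 rename_glyphs(base_vector, font_renames)))
```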

So, the "variant" scheme in texfont is at least a convenient way to 
cope with this mess.

Best

Thomas

* Re: small caps
From: Adam Lindsay @ 2005-02-27  9:39 UTC (permalink / raw)


Thomas A.Schmitz said this at Sun, 27 Feb 2005 08:57:36 +0100:

>I looked into it, and it turns out that Apple (?) has 
>given new names even to quite "normal" characters - eacute becomes 
>e_acute etc. So if you want to produce a tfm for that font, you have to 
>invent a specific encoding vector. Once you know how this works, it's 
>easy enough, but really annoying.

Indeed, that's one of the reasons why I came up with the unicode
("symbol"[1]) scripts... there are common utilities (ttx and Apple's ftx
suite) that work well at associating canonical characters with glyph
names specific to a font.

I'm sure some enterprising XSLT hacker could take my scripts as a
starting point and make them work with specific TeXy encodings, not just
individual Unicode vectors.

>So, the "variant" scheme in texfont is at least a convenient way to 
>cope with this mess.

Well, it's the simplest of hacks to help *manage* the mess. And it's not
really new--the concept of "variant" is all over Karl Berry's Fontname
conventions. This just makes it a bit more user-friendly.

[1] Which is to say that ttx2enc.xsl got its first public airing in
<http://homepage.mac.com/atl/tex/symb-uni.zip>, in the context of
"Unicode Symbols", but there's nothing inherent to symbols there--it's
all about Unicode in general.

Cheers,
adam

* Re: small caps
From: h h extern @ 2005-02-27 17:58 UTC (permalink / raw)


Adam Lindsay wrote:

> This is good stuff. I've tried to advocate a naming convention that would
> be appropriate to this. I would suggest calling this texnansi-osfsc.enc,
> as baseencoding-variant.enc. This is so a modified encoding can
> "masquerade" as the base encoding within ConTeXt.

i'll add the encoding to the distribution (i just made the formatted file with 
the info sent) [of course users will need to generate the tfm files themselves]

once we have made the switch from map files to inline map code, we can apply 
different encodings more easily at the typescript level (no more need for map files)

another thing coming is that pdftex will provide primitives to set those 
encodings independently of other characteristics (hartmut is working on this);

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: small caps
From: Thomas A.Schmitz @ 2005-03-02  6:18 UTC (permalink / raw)


Adam, I feel like a complete idiot now. I had been so proud of this 
idea, but after re-reading your MyWay about OpenType, I see that I had 
been reinventing the wheel: this is exactly the solution you suggested 
almost two years ago. Thanks for being generous about this...

However, your post made me think: I know nothing about XSLT, but enough 
perl to shoot myself in the foot. I guess if I had a version of 
texnansi.enc with the unicode values in addition to the names, that 
would be a good starting point. I was thinking of this route:
1. use ftxdumperfuser to produce cmap.xml,
2. use perl to reduce it to two values: glyphName % UNICODE_VALUE
3. use perl to extract the lines corresponding to a given encoding and 
put them in the right order.

Sounds feasible? Do you know where I could get such a unicode-aware 
version of texnansi.enc?
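
The three steps can be sketched as below, in Python rather than perl.
Several things here are assumptions: the XML is modelled on the
<map code="..." name="..."/> elements of a ttx dump, and ftxdumperfuser's
output may be laid out differently; the function names are hypothetical;
and the slot list is a made-up excerpt standing in for the
Unicode-aware texnansi.enc asked about here. Only glyphs that have a
Unicode code point appear in a cmap, so anything else comes out as
.notdef.

```python
import xml.etree.ElementTree as ET

# Hypothetical cmap dump in ttx-like form.
SAMPLE_CMAP = """<ttFont><cmap>
<cmap_format_4 platformID="3" platEncID="1" language="0">
  <map code="0x41" name="A"/>
  <map code="0xe9" name="e_acute"/>
  <map code="0x20ac" name="Euro"/>
</cmap_format_4>
</cmap></ttFont>"""

def unicode_to_glyph(cmap_xml):
    """Steps 1-2: reduce the dump to {code point: glyph name}."""
    table = {}
    for m in ET.fromstring(cmap_xml).iter("map"):
        # Keep the first name seen if several subtables repeat a code.
        table.setdefault(int(m.get("code"), 16), m.get("name"))
    return table

def order_by_encoding(table, slots):
    """Step 3: slots[i] is the code point wanted at position i, or None."""
    return [table.get(cp, ".notdef") if cp is not None else ".notdef"
            for cp in slots]

table = unicode_to_glyph(SAMPLE_CMAP)
# Made-up four-slot excerpt: empty, Euro, A, B.
print(order_by_encoding(table, [None, 0x20AC, 0x41, 0x42]))
```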

Best

Thomas

On Feb 27, 2005, at 10:39 AM, Adam Lindsay wrote:
>
> Indeed, that's one of the reasons why I came up with the unicode
> ("symbol"[1]) scripts... there are common utilities (ttx and Apple's 
> ftx
> suite) that work well at associating canonical characters with glyph
> names specific to a font.
>
> I'm sure some enterprising XSLT hacker could take my scripts as a
> starting point and make them work with specific TeXy encodings, not 
> just
> individual Unicode vectors.

* Re: small caps
From: Adam Lindsay @ 2005-03-02  9:59 UTC (permalink / raw)


Thomas A.Schmitz said this at Wed, 2 Mar 2005 07:18:29 +0100:

>However, your post made me think: I know nothing about XSLT, but enough 
>perl to shoot myself in the foot. I guess if I had a version of 
>texnansi.enc with the unicode values in addition to the names, that 
>would be a good starting point. I was thinking of this route:
>1. use ftxdumperfuser to produce cmap.xml,
>2. use perl to reduce it to two values: glyphName % UNICODE_VALUE
>3. use perl to extract the lines corresponding to a given encoding and 
>put them in the right order.

Hold on one minute... we're talking about encodings for alternate glyphs,
right? That's orthogonal to what Unicode is about. 'a' and 'Asmall'
pretty much take up the same unicode "slot". Only 'a' appears in the
.cmap.xml file.

However, a perl-based solution would be very handy, especially as new
free fonts like FPL-Neu (Palatino clone) include OSF and SC glyphs. 
<http://home.vr-web.de/was/x/FPL/> 

The closest I got with perl was some experiments following some rough
heuristics appending "small" to glyph names from an afm file. I got a bit
discouraged, however, and didn't take it further at the time. So I
clapped my hands and giggled girlishly when XeTeX came out and gave me
easy access to the particular AAT fonts I was trying to get to work.

>Sounds feasible? Do you know where I could get such a unicode-aware 
>version of texnansi.enc?

Now that would be a useful thing, regardless. I don't know, but I'll have
a look. I suspect we'll have to create one ourselves. An idle thought
(with the corresponding devilishness) occurs to me: all that information
is in ConTeXt already. Hmm...

What form would be the best? Some simple XML? A perl-friendly list?

* Re: small caps
From: Thomas A.Schmitz @ 2005-03-02 10:33 UTC (permalink / raw)


On Mar 2, 2005, at 10:59 AM, Adam Lindsay wrote:

> Hold on one minute... we're talking about encodings for alternate 
> glyphs,
> right? That's orthogonal to what Unicode is about. 'a' and 'Asmall'
> pretty much take up the same unicode "slot". Only 'a' appears in the
> .cmap.xml file.
>
No, of course you're right. I thought that they were given a value in 
the FFxx range, but that's not right; they don't appear in the cmap, 
only in the afm. So the only thing I can think of: there are only so 
many ways to refer to small caps, Xsmall or X.small or X_small or even 
X-small. We could provide alternatives for that in perl, making 
additions as we go. It's a brute-force attack, kind of aiming with a 
machine gun, but since fonts are such moving targets...
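
The brute-force lookup could look roughly like this, again in Python
rather than perl. It is a sketch under assumptions: the function names
and the AFM excerpt are hypothetical, the suffix list contains only the
four spellings named above, and glyph names are read from the "N name"
field of the character metric lines of an afm file.

```python
import re

# The four spellings mentioned above, tried in this order.
SUFFIXES = ["small", ".small", "_small", "-small"]

def afm_glyph_names(afm_text):
    """Collect glyph names from the character metric lines of an AFM."""
    names = set()
    for line in afm_text.splitlines():
        if line.startswith(("C ", "CH ")):
            m = re.search(r";\s*N\s+([^\s;]+)", line)
            if m:
                names.add(m.group(1))
    return names

def find_small_cap(letter, names):
    """Return the font's small-cap glyph name for a letter, or None."""
    base = letter.upper()
    for suffix in SUFFIXES:
        if base + suffix in names:
            return base + suffix
    return None

# Hypothetical AFM excerpt.
SAMPLE_AFM = """StartCharMetrics 3
C 65 ; WX 722 ; N A ; B 15 0 706 674 ;
C 97 ; WX 444 ; N a ; B 37 -10 442 460 ;
C -1 ; WX 500 ; N A.small ; B 10 0 490 470 ;
EndCharMetrics"""

names = afm_glyph_names(SAMPLE_AFM)
print(find_small_cap("a", names), find_small_cap("b", names))
```

New spellings can be appended to SUFFIXES as they turn up in other
fonts.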

> Now that would be a useful thing, regardless. I don't know, but I'll 
> have
> a look. I suspect we'll have to create one ourselves. An idle thought
> (with the corresponding devilishness) occurs to me: all that 
> information
> is in ConTeXt already. Hmm...
>
> What form would be the best? Some simple XML? A perl-friendly list?

For the time being, I'm thinking of a very simple list that could just 
serve as a pattern for arranging the lines I get from processing the 
cmap.xml. I'm just thinking, not writing code yet...

Best

Thomas

* Re: small caps
From: Hans Hagen @ 2005-03-02 10:54 UTC (permalink / raw)


Thomas A.Schmitz wrote:

> For the time being, I'm thinking of a very simple list that could just 
> serve as a pattern for arranging the lines I get from processing the 
> cmap.xml. I'm just thinking, not writing code yet...

about code ... wybo dekker has cleaned up the texfont code, so that will be the 
starting point for extensions

Hans


* Re: small caps
From: Adam Lindsay @ 2005-03-02 10:58 UTC (permalink / raw)


Hans Hagen said this at Wed, 2 Mar 2005 11:54:35 +0100:

>about code ... wybo dekker has cleaned up the texfont code, so that will
>be the 
>starting point for extensions

Wow. Thanks, Wybo! 