ntg-context - mailing list for ConTeXt users
* actualtext and encoding
@ 2009-12-06 19:07 Wolfgang Schuster
  2009-12-07  7:52 ` Taco Hoekwater
  0 siblings, 1 reply; 9+ messages in thread
From: Wolfgang Schuster @ 2009-12-06 19:07 UTC
  To: mailing list for ConTeXt users

Hi Hans,

you showed a while ago how the ActualText feature of PDF works,
and I have a module where I would like to use it, but letters outside
of ASCII come out wrong when I copy the text:

\starttext
text \pdfliteral{/Span <</ActualText (Müller) >> BDC}Meier\pdfliteral{EMC} text
\stoptext

becomes

text MÃ¼ller text

when I copy the text from the PDF (Adobe Reader) and paste it into my editor.

Wolfgang

___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________



* Re: actualtext and encoding
  2009-12-06 19:07 actualtext and encoding Wolfgang Schuster
@ 2009-12-07  7:52 ` Taco Hoekwater
  2009-12-07  8:31   ` Wolfgang Schuster
  2009-12-07  8:46   ` Hans Hagen
  0 siblings, 2 replies; 9+ messages in thread
From: Taco Hoekwater @ 2009-12-07  7:52 UTC
  To: mailing list for ConTeXt users



Wolfgang Schuster wrote:
> Hi Hans,
> 
> you showed a while ago how the actualtext function of pdf works
> and i have a module where i would use it but letters outside of
> ascii appear wrong when i copy the text
> 
> \starttext
> text \pdfliteral{/Span <</ActualText (Müller) >> BDC}Meier\pdfliteral{EMC} text
> \stoptext

Is /ActualText supposed to be in PDFDoc Encoding?

Best wishes,
Taco

* Re: actualtext and encoding
  2009-12-07  7:52 ` Taco Hoekwater
@ 2009-12-07  8:31   ` Wolfgang Schuster
  2009-12-07  8:46   ` Hans Hagen
  1 sibling, 0 replies; 9+ messages in thread
From: Wolfgang Schuster @ 2009-12-07  8:31 UTC
  To: mailing list for ConTeXt users


On 07.12.2009 at 08:52, Taco Hoekwater wrote:

> Is /ActualText supposed to be in PDFDoc Encoding?

No, you could also use Unicode Encoding.
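
For comparison, a minimal sketch of the PDFDoc Encoding route, which only works for characters that exist in that encoding. It assumes the viewer reads a literal string without a byte order mark as PDFDocEncoding, where ü has code 0xFC (octal 374). Using \string to get a literal backslash through \pdfliteral is an assumption and has not been tested here:

```tex
% Sketch only: \string\374 writes the octal escape \374 (0xFC, udieresis)
% into the PDF string, so the ActualText reads "Müller" in PDFDocEncoding.
\starttext
text \pdfliteral{/Span <</ActualText (M\string\374ller) >> BDC}Meier\pdfliteral{EMC} text
\stoptext
```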

Wolfgang


* Re: actualtext and encoding
  2009-12-07  7:52 ` Taco Hoekwater
  2009-12-07  8:31   ` Wolfgang Schuster
@ 2009-12-07  8:46   ` Hans Hagen
  2009-12-07  8:57     ` Wolfgang Schuster
  1 sibling, 1 reply; 9+ messages in thread
From: Hans Hagen @ 2009-12-07  8:46 UTC
  To: mailing list for ConTeXt users

Taco Hoekwater wrote:
> 
> Wolfgang Schuster wrote:
>> Hi Hans,
>>
>> you showed a while ago how the actualtext function of pdf works
>> and i have a module where i would use it but letters outside of
>> ascii appear wrong when i copy the text
>>
>> \starttext
>> text \pdfliteral{/Span <</ActualText (Müller) >> BDC}Meier\pdfliteral{EMC} text
>> \stoptext
> 
> Is /ActualText supposed to be in PDFDoc Encoding?

\nopdfcompression

\def\pdfactualtext#1#2%
   {\pdfliteral direct{/Span <</ActualText \ctxlua{tex.write(lpdf.tosixteen("#2"))} >> BDC}#1\pdfliteral direct{EMC}}

\starttext
     text \pdfactualtext{Meier}{Müller} text
\stoptext
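
For illustration, assuming lpdf.tosixteen returns the UTF-16BE hex form of its argument with a leading byte order mark, the call above should write a literal equivalent to the hand-encoded one below. FEFF is the byte order mark, followed by M=004D, ü=00FC, l=006C, l=006C, e=0065, r=0072; hex digits in a PDF hex string may be upper or lower case:

```tex
% Sketch only: "Müller" encoded by hand as a UTF-16BE hex string.
% The exact output of lpdf.tosixteen (e.g. letter case) is an assumption.
\starttext
text \pdfliteral direct{/Span <</ActualText <FEFF004D00FC006C006C00650072> >> BDC}Meier\pdfliteral direct{EMC} text
\stoptext
```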


-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: actualtext and encoding
  2009-12-07  8:46   ` Hans Hagen
@ 2009-12-07  8:57     ` Wolfgang Schuster
  2009-12-07  9:11       ` Hans Hagen
  0 siblings, 1 reply; 9+ messages in thread
From: Wolfgang Schuster @ 2009-12-07  8:57 UTC
  To: mailing list for ConTeXt users


On 07.12.2009 at 09:46, Hans Hagen wrote:

> \def\pdfactualtext#1#2%
>  {\pdfliteral direct{/Span <</ActualText \ctxlua{tex.write(lpdf.tosixteen("#2"))} >> BDC}#1\pdfliteral direct{EMC}}
> 
> \starttext
>    text \pdfactualtext{Meier}{Müller} text
> \stoptext

Perfect, will this end up in the core?

Regards,
Wolfgang


* Re: actualtext and encoding
  2009-12-07  8:57     ` Wolfgang Schuster
@ 2009-12-07  9:11       ` Hans Hagen
  2009-12-07  9:29         ` Wolfgang Schuster
  0 siblings, 1 reply; 9+ messages in thread
From: Hans Hagen @ 2009-12-07  9:11 UTC
  To: mailing list for ConTeXt users

Wolfgang Schuster wrote:
> On 07.12.2009 at 09:46, Hans Hagen wrote:
> 
>> \def\pdfactualtext#1#2%
>>  {\pdfliteral direct{/Span <</ActualText \ctxlua{tex.write(lpdf.tosixteen("#2"))} >> BDC}#1\pdfliteral direct{EMC}}
>>
>> \starttext
>>    text \pdfactualtext{Meier}{Müller} text
>> \stoptext
> 
> Perfect, will this end in the core?

hm, doesn't that kind of functionality demand a bit more 'thinking'?
what exactly is needed? how does it relate to linebreaks? to other
content? etc .. actually, such a mechanism should be implemented a bit
differently (maybe with attributes and delayed processing), or maybe
be dictionary driven ..

i have no problem with adding this two-line hack, as it has a \pdf prefix
anyway, so it sits in the category wolfgangs-only :-)

Hans


* Re: actualtext and encoding
  2009-12-07  9:11       ` Hans Hagen
@ 2009-12-07  9:29         ` Wolfgang Schuster
  2009-12-07 10:21           ` Hans Hagen
  0 siblings, 1 reply; 9+ messages in thread
From: Wolfgang Schuster @ 2009-12-07  9:29 UTC
  To: mailing list for ConTeXt users


On 07.12.2009 at 10:11, Hans Hagen wrote:

> hm, doesn't that kind of functionality demands a bit more 'thinking'? what exactly is needed? how does it relate to linebreaks? other content? etc .. actually, such a mechanism should be implemented a bit differently (maybe attributes and delayed processing) or maybe dictionary driven ..

there is no linebreak in the text because it is boxed content; a simplified version of my macro is

\def\pdfactualtext#1#2%
 {\pdfliteral direct{/Span <</ActualText \ctxlua{tex.write(lpdf.tosixteen("#2"))} >> BDC}#1\pdfliteral direct{EMC}}

\def\ruby#1#2%
  {\dontleavehmode\bgroup
   \setbox\scratchboxone\hbox{#1}%
   \setbox\scratchboxtwo\hbox{#2}%
   \scratchdimen\wd\scratchboxone
   \setbox\scratchbox\vbox
     {\hbox to \scratchdimen{\hss\box\scratchboxtwo\hss}
      \hbox to \scratchdimen{\hss\box\scratchboxone\hss}}%
   \pdfactualtext{\box\scratchbox}{#1 (#2)}%
   \egroup}

\starttext

text \ruby{base text}{ruby text} text

\stoptext

Wolfgang


* Re: actualtext and encoding
  2009-12-07  9:29         ` Wolfgang Schuster
@ 2009-12-07 10:21           ` Hans Hagen
  2009-12-07 10:40             ` Wolfgang Schuster
  0 siblings, 1 reply; 9+ messages in thread
From: Hans Hagen @ 2009-12-07 10:21 UTC
  To: mailing list for ConTeXt users

Wolfgang Schuster wrote:
> On 07.12.2009 at 10:11, Hans Hagen wrote:
> 
>> hm, doesn't that kind of functionality demands a bit more 'thinking'? what exactly is needed? how does it relate to linebreaks? other content? etc .. actually, such a mechanism should be implemented a bit differently (maybe attributes and delayed processing) or maybe dictionary driven ..
> 
> there is no linebreak in the text because it's a boxed content, a simple version of my macro is
> 
> \def\pdfactualtext#1#2%
>  {\pdfliteral direct{/Span <</ActualText \ctxlua{tex.write(lpdf.tosixteen("#2"))} >> BDC}#1\pdfliteral direct{EMC}}
> 
> \def\ruby#1#2%
>   {\dontleavehmode\bgroup
>    \setbox\scratchboxone\hbox{#1}%
>    \setbox\scratchboxtwo\hbox{#2}%
>    \scratchdimen\wd\scratchboxone
>    \setbox\scratchbox\vbox
>      {\hbox to \scratchdimen{\hss\box\scratchboxtwo\hss}
>       \hbox to \scratchdimen{\hss\box\scratchboxone\hss}}%
>    \pdfactualtext{\box\scratchbox}{#1 (#2)}%
>    \egroup}
> 
> \starttext
> 
> text \ruby{base text}{ruby text} text
> 
> \stoptext

detail ...

\def\ruby#1#2%
   {\dontleavehmode\bgroup
    \setbox\scratchboxone\hbox{#1}%
    \setbox\scratchboxtwo\hbox{#2}%
    \scratchdimen\wd\ifdim\wd\scratchboxone>\wd\scratchboxtwo\scratchboxone\else\scratchboxtwo\fi
    \setbox\scratchbox\vbox
      {\hbox to \scratchdimen{\hss\box\scratchboxtwo\hss}
       \hbox to \scratchdimen{\hss\box\scratchboxone\hss}}%
    \pdfactualtext{\box\scratchbox}{#1 (#2)}%
    \egroup}

\starttext
     text \ruby{lua text}{ruby or perl text which is more blabla} text
\stoptext
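
The one-line width assignment works because \ifdim is expanded while TeX is still looking for the box number after \wd, so \scratchdimen ends up with the width of the wider box. A sketch of the same selection written out in full, using only the scratch registers from the macro above:

```tex
% Sketch: explicit form of the width selection in \ruby.
% \scratchdimen receives the width of whichever box is wider.
\ifdim\wd\scratchboxone>\wd\scratchboxtwo
  \scratchdimen\wd\scratchboxone
\else
  \scratchdimen\wd\scratchboxtwo
\fi
```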




* Re: actualtext and encoding
  2009-12-07 10:21           ` Hans Hagen
@ 2009-12-07 10:40             ` Wolfgang Schuster
  0 siblings, 0 replies; 9+ messages in thread
From: Wolfgang Schuster @ 2009-12-07 10:40 UTC
  To: mailing list for ConTeXt users


On 07.12.2009 at 11:21, Hans Hagen wrote:

> detail ...
> 
> \def\ruby#1#2%
>  {\dontleavehmode\bgroup
>   \setbox\scratchboxone\hbox{#1}%
>   \setbox\scratchboxtwo\hbox{#2}%
> \scratchdimen\wd\ifdim\wd\scratchboxone>\wd\scratchboxtwo\scratchboxone\else\scratchboxtwo\fi
>   \setbox\scratchbox\vbox
>     {\hbox to \scratchdimen{\hss\box\scratchboxtwo\hss}
>      \hbox to \scratchdimen{\hss\box\scratchboxone\hss}}%
>   \pdfactualtext{\box\scratchbox}{#1 (#2)}%
>   \egroup}
> 
> \starttext
>    text \ruby{lua text}{ruby or perl text which is more blabla} text
> \stoptext

this was just a simplified example; the real implementation is more complex

by default the ruby text overlaps the surrounding text (as in my example).
what you suggested can be activated, but it's not so nice because
you get unwanted whitespace in the text

Wolfgang


end of thread, other threads:[~2009-12-07 10:40 UTC | newest]

Thread overview: 9+ messages
2009-12-06 19:07 actualtext and encoding Wolfgang Schuster
2009-12-07  7:52 ` Taco Hoekwater
2009-12-07  8:31   ` Wolfgang Schuster
2009-12-07  8:46   ` Hans Hagen
2009-12-07  8:57     ` Wolfgang Schuster
2009-12-07  9:11       ` Hans Hagen
2009-12-07  9:29         ` Wolfgang Schuster
2009-12-07 10:21           ` Hans Hagen
2009-12-07 10:40             ` Wolfgang Schuster
