ntg-context - mailing list for ConTeXt users
* ConvertToConteXt 0.2 - convert special ConTeXt-characters (PHP)
From: Jan Heinen @ 2011-11-01 19:16 UTC
  To: ntg-context

First of all: I am a ConTeXt newbie, but I know PHP very well.

@Peter Münster/Aditya Mahajan: I don't think \startasci and
\stopasci solve the problem when you generate ConTeXt code
with PHP that is full of ConTeXt macro calls, because the same
characters are sometimes ConTeXt special characters and
sometimes plain text that should be printed.
@Philipp Gesang: I think LuaTeX could do the same job for me
as PHP; however, I am familiar with PHP.
@all: Of course not every character I am converting is a
ConTeXt special character. Since I don't know all the important
characters, I took every one I could think of. Surely I convert
too many, but that is not a problem:
I have tested my function "ConvertToConteXt" on 400 pages
full of text, with lots of ConTeXt special characters and
ConTeXt macro calls, and compiled a nice book with ConTeXt.

Which characters must not be converted?

There was a mistake in the function; the next version is below:

Regards, Janis

function ConvertToConteXt ( $xstring ) {
/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  *         ConvertToConteXt()
  *         Version 0.2
  *         01.11.2011
  *
  *  author: Jörg Kopp
  *          www.dr-kopp.com
  *
  *      Convert special ConTeXt characters with PHP
  *      Works with PHP5
  *
  *      Call it with the string you want to convert ...
  *         ConvertToConteXt ($xstring);
  *
  *      ... and you get back the converted string
  *
  *      e.g.:
  *      Input:
  *          $string = "My root-Directory: /home/hans";
  *          $string = ConvertToConteXt ( $string );
  *
  *      Output/Return:
  *          $string = "My root{\\char45}Directory{\\char58} {\\char47}home{\\char47}hans";
  *
  *      When you write this into a file ...
  *          file_put_contents ( "example.tex", "My root{\\char45}Directory{\\char58} {\\char47}home{\\char47}hans", FILE_APPEND );
  *
  *      ... you will find the following in example.tex:
  *          My root{\char45}Directory{\char58} {\char47}home{\char47}hans
  *
  *      And when you compile example.tex with ConTeXt
  *          context example.tex
  *
  *      you can read the following in the resulting example.pdf:
  *          My root-Directory: /home/hans
  *
  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */

   $xstring = html_entity_decode ( $xstring );   // convert HTML entities into normal characters

   // Braces and backslash must be handled first, otherwise garbage is produced:
   // the original braces are parked in placeholders so that the braces
   // introduced by "{\char92}" are not converted again.
   $xstring = str_replace ( "{",  "*$##char123##$*", $xstring );   // left curly brace
   $xstring = str_replace ( "}",  "*$##char125##$*", $xstring );   // right curly brace
   $xstring = str_replace ( "\\", "{\\char92}",  $xstring );       // backslash
   $xstring = str_replace ( "*$##char123##$*",  "{\\char123}", $xstring );   // restore the placeholders ...
   $xstring = str_replace ( "*$##char125##$*",  "{\\char125}", $xstring );   // ... as escaped braces

   $xstring = str_replace ( "!",  "{\\char33}",  $xstring ); // exclamation mark
   $xstring = str_replace ( "\"", "{\\char34}",  $xstring ); // quotation mark
   $xstring = str_replace ( "#",  "{\\char35}",  $xstring ); // number sign
   $xstring = str_replace ( "$",  "{\\char36}",  $xstring ); // dollar sign
   $xstring = str_replace ( "%",  "{\\char37}",  $xstring ); // percent sign
   $xstring = str_replace ( "&",  "{\\char38}",  $xstring ); // ampersand
   $xstring = str_replace ( "'",  "{\\char39}",  $xstring ); // apostrophe
   $xstring = str_replace ( "(",  "{\\char40}",  $xstring ); // left parenthesis
   $xstring = str_replace ( ")",  "{\\char41}",  $xstring ); // right parenthesis
   $xstring = str_replace ( "*",  "{\\char42}",  $xstring ); // asterisk
   $xstring = str_replace ( "+",  "{\\char43}",  $xstring ); // plus sign
   $xstring = str_replace ( ",",  "{\\char44}",  $xstring ); // comma
   $xstring = str_replace ( "-",  "{\\char45}",  $xstring ); // hyphen/minus
   $xstring = str_replace ( ".",  "{\\char46}",  $xstring ); // period
   $xstring = str_replace ( "/",  "{\\char47}",  $xstring ); // slash
   $xstring = str_replace ( ":",  "{\\char58}",  $xstring ); // colon
   $xstring = str_replace ( ";",  "{\\char59}",  $xstring ); // semicolon
   $xstring = str_replace ( "<",  "{\\char60}",  $xstring ); // less-than sign
   $xstring = str_replace ( "=",  "{\\char61}",  $xstring ); // equals sign
   $xstring = str_replace ( ">",  "{\\char62}",  $xstring ); // greater-than sign
   $xstring = str_replace ( "?",  "{\\char63}",  $xstring ); // question mark
   $xstring = str_replace ( "@",  "{\\char64}",  $xstring ); // at sign
   $xstring = str_replace ( "[",  "{\\char91}",  $xstring ); // left square bracket
   $xstring = str_replace ( "]",  "{\\char93}",  $xstring ); // right square bracket
   $xstring = str_replace ( "^",  "{\\char94}",  $xstring ); // caret (circumflex)
   $xstring = str_replace ( "_",  "{\\char95}",  $xstring ); // underscore
   //$xstring = str_replace ( "°",  "{\\char??}",  $xstring ); // degree sign   <------ code missing
   $xstring = str_replace ( "`",  "{\\char96}",  $xstring ); // grave accent (backtick)
   $xstring = str_replace ( "|",  "{\\char124}", $xstring ); // vertical bar (pipe)
   $xstring = str_replace ( "~",  "{\\char126}", $xstring ); // tilde
   //$xstring = str_replace ( "•",  "{\\char??}",  $xstring ); // bullet   <------ code missing
   //$xstring = str_replace ( "º",  "{\\char??}",  $xstring ); // masculine ordinal indicator   <------ code missing

   return $xstring;
}
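
A note on the placeholder trick: it is only needed because each
str_replace() call rescans the output of the previous one. Below is a
minimal sketch of a single-pass variant, assuming PHP 5.4 or later. It
uses strtr() with a replacement array, which does not search text it has
already substituted. The name convertToContextSinglePass is hypothetical
and not part of the posted function; the character list is copied from
the str_replace() calls above.

```php
<?php
// Sketch only: convertToContextSinglePass() is a hypothetical name,
// not part of the function posted in this message.
function convertToContextSinglePass($text)
{
    // The same ASCII characters that the posted function converts.
    $chars = '{}\\!"#$%&\'()*+,-./:;<=>?@[]^_`|~';

    $map = [];
    foreach (str_split($chars) as $ch) {
        // ord() gives the character code used after \char, e.g. "-" => 45.
        $map[$ch] = '{\\char' . ord($ch) . '}';
    }

    // strtr() with an array walks the input once and never re-examines
    // replaced text, so braces and backslashes need no placeholders.
    return strtr(html_entity_decode($text), $map);
}

// prints: My root{\char45}Directory{\char58} {\char47}home{\char47}hans
echo convertToContextSinglePass('My root-Directory: /home/hans'), "\n";
```

Because the map is built from the character list, dropping a character
from the conversion only means deleting it from $chars.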

___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________


* Re: ConvertToConteXt 0.2 - convert special ConTeXt-characters (PHP)
From: Philipp Gesang @ 2011-11-01 21:46 UTC
  To: mailing list for ConTeXt users


On 2011-11-01 20:16, Jan Heinen wrote:
> @all: Of course not every character I am converting is a
> ConTeXt special character. Since I don't know all the important
> characters, I took every one I could think of. Surely I convert
> too many, but that is not a problem:

> Which characters must not be converted?

··· from catc-ctx.mkiv ··········································

\startcatcodetable \ctxcatcodes
    \catcode\tabasciicode       \spacecatcode
    \catcode\endoflineasciicode \endoflinecatcode
    \catcode\formfeedasciicode  \endoflinecatcode
    \catcode\spaceasciicode     \spacecatcode
    \catcode\endoffileasciicode \ignorecatcode
  % \catcode\circumflexasciicode\superscriptcatcode
  % \catcode\underscoreasciicode\subscriptcatcode
  % \catcode\ampersandasciicode \alignmentcatcode
    \catcode\underscoreasciicode\othercatcode
    \catcode\circumflexasciicode\othercatcode
    \catcode\ampersandasciicode \othercatcode
    \catcode\backslashasciicode \escapecatcode
    \catcode\leftbraceasciicode \begingroupcatcode
    \catcode\rightbraceasciicode\endgroupcatcode
    \catcode\dollarasciicode    \mathshiftcatcode
    \catcode\hashasciicode      \parametercatcode
    \catcode\commentasciicode   \commentcatcode
    \catcode\tildeasciicode     \activecatcode
    \catcode\barasciicode       \activecatcode
\stopcatcodetable

·································································

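So, as far as I can tell, assuming standard catcodes you should be
safe escaping only »~|\{}$%#«, i.e. the printable characters that
the table above gives a catcode other than \othercatcode (the bar
was missing from the snippet I posted earlier).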
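
To make that step concrete, here is a minimal sketch in PHP (5.4 or
later assumed) that transcribes the printable characters from the table
above, keeps those whose catcode is not "other", and escapes only them.
The array name, the shortened catcode labels and the helper
escapeForContext are made up for this illustration; they are not
ConTeXt or PHP identifiers.

```php
<?php
// Catcodes transcribed from the \ctxcatcodes excerpt above. The labels
// are shortened names chosen for this sketch.
$ctxCatcodes = [
    '_'  => 'other',
    '^'  => 'other',
    '&'  => 'other',
    '\\' => 'escape',
    '{'  => 'begingroup',
    '}'  => 'endgroup',
    '$'  => 'mathshift',
    '#'  => 'parameter',
    '%'  => 'comment',
    '~'  => 'active',
    '|'  => 'active',
];

// Hypothetical helper: escape only what the table marks as special.
function escapeForContext($text, array $catcodes)
{
    $map = [];
    foreach ($catcodes as $ch => $catcode) {
        if ($catcode !== 'other') {
            $map[$ch] = '{\\char' . ord($ch) . '}';
        }
    }
    // Single pass, so the inserted braces and backslashes stay untouched.
    return strtr($text, $map);
}

// Macro calls are written literally; only the text payload is escaped.
$title = '50% off ~ today | $5 & up';
// prints: \section{50{\char37} off {\char126} today {\char124} {\char36}5 & up}
echo '\\section{' . escapeForContext($title, $ctxCatcodes) . "}\n";
```

With a different catcode regime the table, and therefore the set of
escaped characters, would have to be adjusted.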

Good luck,
Philipp



-- 
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments
