* Converting to ConTeXt from other formats
@ 2002-07-01 15:41 Christopher Cardinale
2002-07-01 17:06 ` Henning Hraban Ramm
From: Christopher Cardinale @ 2002-07-01 15:41 UTC (permalink / raw)
Are there any programs for converting RTF or HTML documents into
ConTeXt?
Searching the newsgroup archives I saw that Tobias Burnus had created a
program to convert HTML to ConTeXt, but I can't find it anywhere nor
can I get in touch with him. Is this still around?
As a LaTeX user, I've had good success with RTF2LaTeX2e and I hope
there's an equivalent out there.
Thanks,
Chris Cardinale
* Re: Converting to ConTeXt from other formats
2002-07-01 15:41 Converting to ConTeXt from other formats Christopher Cardinale
@ 2002-07-01 17:06 ` Henning Hraban Ramm
From: Henning Hraban Ramm @ 2002-07-01 17:06 UTC (permalink / raw)
[-- Attachment #1: Type: text/plain, Size: 279 bytes --]
On Monday, 1 July 2002 17:41, Christopher Cardinale wrote:
> Are there any programs for converting RTF or HTML documents into
> ConTeXt?
Attached is a crude script for HTML.
Axel Rose is working on enhancing it.
Greetings from Hraban!
--
http://www.fiee.net/texnique/
---
[-- Attachment #2: html2context.pl --]
[-- Type: text/x-perl, Size: 3739 bytes --]
#!/usr/bin/perl -w
print "\nThis is HTML2ConTeXt. Version 2002-05-15\n";
print "I'll try to convert your HTML file for ConTeXt.\n";
print "copyleft Henning Hraban Ramm, http://www.fiee.net/texnique/\n\n";
unless ($ARGV[0]) { die "You must name a file to convert!\n"; } # $! carries no useful error here
my $HTMLDatei = $ARGV[0];
unless (-T $HTMLDatei) {
print "$HTMLDatei not found!\n";
if (-T $HTMLDatei.'.htm') { $HTMLDatei .= ".htm"; }
if (-T $HTMLDatei.'.html') { $HTMLDatei .= ".html"; }
} # unless
$HTMLDatei =~ s/\\/\//g;
my $Table="n";
my $Encod="win";
open (QUELLE, $HTMLDatei) or die "Can't open $HTMLDatei! $!";
my $TeXDatei = $HTMLDatei;
$TeXDatei =~ s/\.html?$/.tex/i; # match .htm and .html; otherwise a .html input would be reopened for writing and truncated
print $TeXDatei."\n";
open (ZIEL, ">".$TeXDatei) or die "Can't make $TeXDatei! $!";
while (<QUELLE>) {
# single entities and chars
s§&(.)uml;§\\\"$1§g;
s§&(.)acute;§\\'$1§g; # TeX's acute accent command is \' (ASCII apostrophe)
s§&(.)grave;§\\`$1§g;
s§&(.)circ;§\\^$1§g;
s§&(.)ring;§\\r{$1}§g; # ring accent, assuming the \r accent command is available
if ($Encod eq "win") {
s§&szlig;§ß§g;
} else {
s§&szlig;§\\ss{}§g;
} # if Encoding
s§&(\#150|endash);§--§g; # endash
s§ - § -- §g; # endash
s§&nbsp;§~§g; # non breaking space
s§&quot;([^<>]*)&quot;§\\quotation{$1}§g;
s§&(r|l)aquo;([^<>]*)&(l|r)aquo;§\\quotation{$2}§g;
s§&\#132;([^<>]*)"§\\quotation{$1}§g;
s§&\#132;([^<>]*)$§\\quotation{$1§g; # uncompleted line
s§\s(&quot;|\")§ \\quotation{§g; # begin quote
s§&quot;\s§} §g; # end quote
s§&quot;§\"§g; # quote
s§([^\\=\s])\"§$1}§g; # end quote
s§%§|~|\\%{}§g; # percent
s§&lt;§<§g;
s§&gt;§>§g;
s§&amp;§\\&§g; # doubled backslash so that TeX's \& is written, not a bare &
s§&sup(.);§^$1§g;
s§&frac(.)(.);§\\frac{$1}{$2}§g;
s§&\#133;§ §g;
# s§§§g;
# s§§§g;
# TeX words and marks
s§T<SUB>E</SUB>X§TeX§g;
s§pdfTeX§\\pdfTeX{}§gi;
s§ppchTeX§\\PPCHTeX{}§gi; # was mapped to \pdfTeX{} by mistake
s§ConTeXt§\\ConTeXt{}§g;
s§CONTEXT§\\ConTeXt{}§g;
s§(\s)TeX§$1\\TeX{}§g;
# environments
s§<BODY[^<>]*>§\\starttext§gi;
s§</BODY>§\\stoptext§gi;
s§(<BLOCKQUOTE>|<QUOTE>)§\\startquotation§gi;
s§(<\/BLOCKQUOTE>|<\/QUOTE>)§\\stopquotation§gi;
s§</*DIV[^<>]*>§§gi; # delete all divs
s§</*FONT[^<>]*(>|$)§§gi; # delete all font tags
# Headers
s§<H1>§\\chapter{§gi;
s§<H2>§\\section{§gi;
s§<H3>§\\subsection{§gi;
s§<H4>§\\subsubsection{§gi;
s§</H.>§}§gi;
# Links
s§<A\s([^<>]*?)HREF=\"([^\"]*)\"[^<>]*>(.*?)</A>§\\goto{$3}[URL($2)]§gi; # non-greedy, so two links on one line stay separate
s§<A\s([^<>]*?)NAME=\"([^\"]*)\"[^<>]*>(.*?)</A>§\\reference[$2]{$3}§gi;
# Tables
if ($Table eq "y") {
s§<TABLE([^<>]*)>§\\bTABLE \%$1 §gi;
s§</TABLE>§\\eTABLE§gi;
s§</TD>§\\eTD §gi;
s§<TD([^<>]*)>§\\bTD §gi;
s§</TR>§\\eTR §gi;
s§<TR([^<>]*)>§\\bTR §gi;
} else {
s§</*T(ABLE|D|R|BODY)[^<>]*>§§gi; # delete all table tags
} # if Table
# Images
s§<IMG\s([^<>]*)>§\\externalfigure[$1]§gi;
s§<IMG\s([^"=]*)src=\"([^<>]*)\"([^<>]*)$§\\externalfigure[$2]\t\% $1 $3§gi;
# Lists
s§<UL>§\\startitemize\[1\]§gi;
s§<OL>§\\startitemize\[n\]§gi;
s§<DL>§\\startitemize\[1\]§gi; # ?
s§</.L>§\\stopitemize§gi;
s§<LI>§\\item §gi;
s§<DT>§\\item §gi; #
s§<DD>§\\item §gi; #
s§</LI>§§gi;
s§<P[^<>]*>§§gi;
# s§</P>§\\par§gi;
s§</P>§\n\n§gi;
s§<BR[^<>]*>§\n§gi;
s§<HR[^<>]*>§\\blank §gi;
s§<(PRE|TT|CODE)>§\\type{§gi;
s§<(STRONG|B)>§{\\bf §gi;
s§<(EM|I|U)>§{\\em §gi;
s§^</([^\s]*)>$§\\stop$1§gi;
s§^<([^\s]*)([^<>]*)>$§\\start$1\[$2\]§gi;
s§</[^<>]*>§}§gi; # all other closing tags become } (one tag at a time, not up to the last > on the line)
s§<([^\s]*)(\s)(.*)>§\\$1\[$3\]\{§gi; # all other opening tags become {
s§<([^\s]*)>§\\$1\{§gi; # all other opening tags become {
s§^\s*§§g; # remove leading whitespace
print ZIEL;
print ".";
} # while
print "\n";
close (ZIEL);
close (QUELLE);
# \goto{text}[URL(Link)]
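
The script above works line by line and relies on the order of its substitutions: entities first, then tags. A minimal sketch of that approach follows, reduced to a few rules so it can be tried without any files. The helper name convert_line and the sample strings are assumptions made for illustration; they are not part of html2context.pl, and the rule set is only a small subset of what the script handles.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of html2context.pl): applies a small
# subset of the script's substitutions to one line and returns the result.
sub convert_line {
    my ($line) = @_;

    # Accented-letter entities: &auml; becomes \"a, &eacute; becomes \'e
    $line =~ s!&(.)uml;!\\"$1!g;
    $line =~ s!&(.)acute;!\\'$1!g;

    # Named entities with fixed replacements
    $line =~ s!&szlig;!\\ss{}!g;
    $line =~ s!&nbsp;!~!g;

    # Headings: the opening tag starts the macro argument,
    # the closing tag ends it
    $line =~ s!<H1>!\\chapter{!gi;
    $line =~ s!<H2>!\\section{!gi;
    $line =~ s!</H[1-6]>!}!gi;

    # Inline markup becomes a TeX group
    $line =~ s!<(?:STRONG|B)>!{\\bf !gi;
    $line =~ s!<(?:EM|I)>!{\\em !gi;
    $line =~ s!</(?:STRONG|B|EM|I)>!}!gi;

    return $line;
}

# prints: \section{Gr\"u\ss{}e aus~K\"oln}
print convert_line("<h2>Gr&uuml;&szlig;e aus&nbsp;K&ouml;ln</h2>\n");
# prints: A {\bf bold} and {\em emphasised} caf\'e.
print convert_line("A <b>bold</b> and <em>emphasised</em> caf&eacute;.\n");
```

Keeping the rules in a function rather than in the read loop makes each one easy to check on its own. Like the original, this is plain pattern matching, so tags that span several lines or nest in unusual ways will not be handled.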