ntg-context - mailing list for ConTeXt users
* Converting to ConTeXt from other formats
@ 2002-07-01 15:41 Christopher Cardinale
  2002-07-01 17:06 ` Henning Hraban Ramm
From: Christopher Cardinale @ 2002-07-01 15:41 UTC (permalink / raw)


Are there any programs for converting RTF or HTML documents into
ConTeXt?

Searching the newsgroup archives I saw that Tobias Burnus had created a
program to convert HTML to ConTeXt, but I can't find it anywhere nor
can I get in touch with him. Is this still around?

As a LaTeX user, I've had good success with RTF2LaTeX2e and I hope
there's an equivalent out there.

Thanks,
Chris Cardinale




* Re: Converting to ConTeXt from other formats
  2002-07-01 15:41 Converting to ConTeXt from other formats Christopher Cardinale
@ 2002-07-01 17:06 ` Henning Hraban Ramm
From: Henning Hraban Ramm @ 2002-07-01 17:06 UTC (permalink / raw)


[-- Attachment #1: Type: text/plain, Size: 279 bytes --]

On Monday, 1 July 2002 17:41, Christopher Cardinale wrote:
> Are there any programs for converting RTF or HTML documents into
> ConTeXt?

Attached is a crude script for HTML.
Axel Rose is working on enhancing it.

Greetings from Hraban!
-- 
http://www.fiee.net/texnique/
---

[-- Attachment #2: html2context.pl --]
[-- Type: text/x-perl, Size: 3739 bytes --]

#!/usr/bin/perl -w

print "\nThis is HTML2ConTeXt. Version 2002-05-15\n";
print "I'll try to convert your HTML file for ConTeXt.\n";
print "copyleft Henning Hraban Ramm, http://www.fiee.net/texnique/\n\n";

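# $! carries no useful error at this point, so it is left out of the message.
unless ($ARGV[0]) { die "You must name a file to convert!\n" }
my $HTMLDatei = $ARGV[0];
unless (-T $HTMLDatei) {
	print "$HTMLDatei not found!\n";
	# Try the common extensions; elsif ensures only one of them is appended.
	if    (-T $HTMLDatei.'.htm')  { $HTMLDatei .= ".htm"; }
	elsif (-T $HTMLDatei.'.html') { $HTMLDatei .= ".html"; }
} # unless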
$HTMLDatei =~ s/\\/\//g;

my $Table="n";
my $Encod="win";

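open (QUELLE, '<', $HTMLDatei) or die "Can't open $HTMLDatei! $!";

my $TeXDatei = $HTMLDatei;
# Accept both .htm and .html; otherwise a .html input file would be overwritten.
$TeXDatei =~ s/\.html?$/.tex/i;
$TeXDatei .= ".tex" if $TeXDatei eq $HTMLDatei;	# no HTML extension: append .tex
print $TeXDatei."\n";
open (ZIEL, '>', $TeXDatei) or die "Can't make $TeXDatei! $!";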

while (<QUELLE>) {
# single entities and chars
	s§&(.)uml;§\\\"$1§g;
	s§&(.)acute;§\\'$1§g;	# TeX's acute accent is \'
	s§&(.)grave;§\\`$1§g;
	s§&(.)circ;§\\^$1§g;
	s§&(.)ring;§\\r{$1}§g;	# ring accent, written \r{a}
if ($Encod eq "win") {
	s§&szlig;§ß§g;
} else {
	s§&szlig;§\\ss{}§g;
} # if Encoding
	s§&(\#150|ndash|endash);§--§g;	# en dash (&ndash; is the standard entity name)
	s§ - § -- §g;			# endash
	s§&nbsp;§~§g;	# non breaking space
	s§&quot;([^<>]*)&quot;§\\quotation{$1}§g;
	s§&(r|l)aquo;([^<>]*)&(l|r)aquo;§\\quotation{$2}§g;
	s§&\#132;([^<>]*)&quot;§\\quotation{$1}§g;
	s§&\#132;([^<>]*)$§\\quotation{$1§g; # uncompleted line
	s§\s(&quot;|\")§ \\quotation{§g;	# begin quote
	s§&quot;\s§} §g;	# end quote
	s§&quot;§\"§g;		# quote
	s§([^\\=\s])\"§$1}§g;	# end quote
	s§%§|~|\\%{}§g;	# percent
	s§&lt;§<§g;
	s§&gt;§>§g;
	s§&amp;§\\&§g;	# emit \& (a bare & is TeX's alignment character)
	s§&sup(.);§^$1§g;
	s§&frac(.)(.);§\\frac{$1}{$2}§g;
	s§&\#133;§\\dots{}§g;	# &#133; is the Windows-1252 ellipsis
#	s§§§g;
#	s§§§g;

# TeX words and marks
	s§T<SUB>E</SUB>X§TeX§g;
	s§pdfTeX§\\pdfTeX{}§gi;
	s§ppchTeX§\\PPCHTeX{}§gi;
	s§ConTeXt§\\ConTeXt{}§g;
	s§CONTEXT§\\ConTeXt{}§g;
	s§(\s)TeX§$1\\TeX{}§g;

# environments
	s§<BODY[^<>]*>§\\starttext§gi;
	s§</BODY>§\\stoptext§gi;
	s§(<BLOCKQUOTE>|<QUOTE>)§\\startquotation§gi;
	s§(<\/BLOCKQUOTE>|<\/QUOTE>)§\\stopquotation§gi;
	s§</*DIV[^<>]*>§§gi;	# delete all divs
	s§</*FONT[^<>]*(>|$)§§gi;	# delete all font tags

# Headers
	s§<H1>§\\chapter{§gi;
	s§<H2>§\\section{§gi;
	s§<H3>§\\subsection{§gi;
	s§<H4>§\\subsubsection{§gi;
	s§</H.>§}§gi;

# Links
	# Non-greedy matches keep several links on one line from merging into one.
	s§<A\s(.*?)HREF=\"(.*?)\">(.*?)</A>§\\goto{$3}[URL($2)]§gi;
	s§<A\s(.*?)NAME=\"(.*?)\">(.*?)</A>§\\reference[$2]{$3}§gi;

# Tables
if ($Table eq "y") {
	s§<TABLE([^<>]*)>§\\bTABLE \%$1 §gi;
	s§</TABLE>§\\eTABLE§gi;
	s§</TD>§\\eTD §gi;
	s§<TD([^<>]*)>§\\bTD §gi;
	s§</TR>§\\eTR §gi;
	s§<TR([^<>]*)>§\\bTR §gi;
} else {
	s§</*T(ABLE|D|R|BODY)[^<>]*>§§gi;	# delete all table tags
} # if Table

# Images
	s§<IMG\s([^<>]*)>§\\externalfigure[$1]§gi;
	s§<IMG\s([^"=]*)src=\"([^<>]*)\"([^<>]*)$§\\externalfigure[$2]\t\% $1 $3§gi;

# Lists
	s§<UL>§\\startitemize\[1\]§gi;
	s§<OL>§\\startitemize\[n\]§gi;
	s§<DL>§\\startitemize\[1\]§gi; # ?
	s§</.L>§\\stopitemize§gi;
	s§<LI>§\\item §gi;
	s§<DT>§\\item §gi; #
	s§<DD>§\\item §gi; #
	s§</LI>§§gi;


	s§<P[^<>]*>§§gi;
#	s§</P>§\\par§gi;
	s§</P>§\n\n§gi;
	s§<BR[^<>]*>§\n§gi;
	s§<HR[^<>]*>§\\blank §gi;

	s§<(PRE|TT|CODE)>§\\type{§gi;
	s§<(STRONG|B)>§{\\bf §gi;
	s§<(EM|I|U)>§{\\em §gi;

	s§^</([^\s]*)>$§\\stop$1§gi;
	s§^<([^\s]*)([^<>]*)>$§\\start$1\[$2\]§gi;
	s§</.*>§}§gi; # all other closing tags become }
	s§<([^\s]*)(\s)(.*)>§\\$1\[$3\]\{§gi; # all other opening tags become {
	s§<([^\s]*)>§\\$1\{§gi; # all other opening tags become {

	s§^\s*§§g;	# remove leading whitespace (the pattern is anchored at line start)

	print ZIEL;
	print ".";
} # while
print "\n";

close (ZIEL);
close (QUELLE);


# \goto{text}[URL(Link)]
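
The script above works line by line, applying one substitution after another. The following is a minimal sketch of that idea, not part of the attached script: convert_line is a hypothetical helper that applies a handful of the same rules to a single string, written with ordinary / delimiters instead of the § delimiter. The rule for other closing tags is narrower here than the script's own, and that is an assumption made to keep the example predictable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in html2context.pl): applies a few of the
# script's substitution rules, in order, to one line of HTML.
sub convert_line {
    my ($line) = @_;
    $line =~ s/&(.)uml;/\\"$1/g;         # &ouml;  becomes \"o
    $line =~ s/&nbsp;/~/g;               # &nbsp;  becomes ~
    $line =~ s/<H1>/\\chapter{/gi;       # <H1>    becomes \chapter{
    $line =~ s/<H2>/\\section{/gi;       # <H2>    becomes \section{
    $line =~ s/<\/H.>/}/gi;              # </H1>, </H2>, ... become }
    $line =~ s/<(STRONG|B)>/{\\bf /gi;   # <B>     becomes {\bf
    $line =~ s/<\/[^<>]*>/}/g;           # any other closing tag becomes }
    return $line;
}

print convert_line('<H1>K&ouml;ln</H1>'), "\n";
print convert_line('<B>5&nbsp;km</B>'), "\n";
```

Run as is, this prints \chapter{K\"oln} and {\bf 5~km}. The order of the rules matters: entities are replaced before tags, and the specific tag rules run before the catch-all for closing tags.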
