From: Simon Pepping
Newsgroups: gmane.comp.tex.context
Subject: Re: XML and empty line (DocBook)
Date: Sat, 3 Aug 2002 17:22:13 +0200

On Wed, Jul 31, 2002 at 09:43:00PM +0200, Simon Pepping wrote:
> On Mon, Jul 29, 2002 at 09:58:35PM +0200, Tobias Burnus wrote:
> > Hi,
> >
> > Using 2002.7.12 I found the problem that
> >
> > Apache
> > <filename>mod_rewrite</filename>
> >
> > magic
> >
> > causes the problem with the empty line ( = \par ) any idea how to prevent
> > this problem (except by editing the XML source)?
> > ! Paragraph ended before \XMLDBdotitle was complete.
> >
> > \par
>
> Even in XML mode two blank lines generate a \par. I cannot solve this;
> perhaps Hans knows a way out.
>
> Both this and your previous problem (and Michael's answer to it) show
> that TeX has no knowledge of ignorable white space. It cannot, because
> it does not know the DTD. (Ignorable white space is all white space in
> elements that do not have mixed content.)
>
> > Additionally I get frequently a ']¿' at the beginning of my documents.
>
> I believe this is another parsing problem with the internal DTD
> set. (AFAIK you should get '¿]' from '>]' in the document.)

Perhaps it is better not to require that an XML parser in TeX handles
all these features correctly. It should be possible to rewrite the XML
file as a 'normalized' file and submit that to the TeX parser. For
example, one can write a ContentHandler for a validating SAX parser
that removes ignorable white space. Perhaps the same is possible with
an XSLT script, but I am not sure whether any XSLT processor does a
validating parse. Such a procedure would get rid of ignorable white
space and resolve entities, making the work of the TeX parser much
easier.

Regards, Simon

-- 
Simon Pepping
email: spepping@scaprea.hobby.nl
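
As a rough illustration of the ContentHandler idea above, here is a
minimal sketch in Python. It rests on an assumption that matters: the
SAX parser in Python's standard library is built on expat, which does
not validate and so cannot tell on its own which white space is
ignorable. The sketch therefore takes the DTD knowledge as input, in
the form of a set of element names with element-only content. That set
(ELEMENT_ONLY), the class name WhitespaceStripper and the function
normalize are all hypothetical names chosen for this example. With a
validating parser one would rely on its ignorableWhitespace callbacks
instead.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator

# Assumption: elements whose content model is element-only (no mixed
# content). A validating parser would derive this from the DTD; here
# it is supplied by hand for illustration.
ELEMENT_ONLY = {"article", "section"}


class WhitespaceStripper(XMLGenerator):
    """Re-serialize the SAX event stream, dropping whitespace-only
    character data directly inside element-only elements."""

    def __init__(self, out, element_only):
        super().__init__(out, encoding="utf-8")
        self._element_only = element_only
        self._stack = []  # names of the currently open elements

    def startElement(self, name, attrs):
        self._stack.append(name)
        super().startElement(name, attrs)

    def endElement(self, name):
        super().endElement(name)
        self._stack.pop()

    def characters(self, content):
        parent = self._stack[-1] if self._stack else None
        if parent in self._element_only and not content.strip():
            return  # treated as ignorable white space: not written
        super().characters(content)

    def ignorableWhitespace(self, content):
        # Only a validating parser calls this; drop what it reports.
        return


def normalize(xml_text, element_only=ELEMENT_ONLY):
    """Return xml_text re-serialized without ignorable white space."""
    out = io.StringIO()
    handler = WhitespaceStripper(out, element_only)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return out.getvalue()
```

Calling normalize() on a document in which <article> contains indented
<title> and <para> children gives back the same elements with the
white space between them removed, while white space inside the mixed
content of <title> is left alone. Note that this also means a blank
line inside a mixed-content element such as <title> is not ignorable
white space and is not removed by this sketch.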