ntg-context - mailing list for ConTeXt users
* XML and empty line (DocBook)
@ 2002-07-29 19:58 Tobias Burnus
  2002-07-31 19:43 ` Simon Pepping
  0 siblings, 1 reply; 8+ messages in thread
From: Tobias Burnus @ 2002-07-29 19:58 UTC


Hi,

First: unfortunately, ConTeXt 2002.7.26 no longer works with DocBook,
which makes testing a bit harder (2002.7.12 does work).

Using 2002.7.12 I found the problem that

     <title>Apache
     <filename>mod_rewrite</filename>

     magic</title>

causes an error because of the empty line ( = \par ). Any idea how to prevent
this problem (except by editing the XML source)?
  ! Paragraph ended before \XMLDBdotitle was complete.
  <to be read again>
                     \par

Additionally, I frequently get a ']¿' at the beginning of my documents.

The empty line also causes strange results here:
    <para>But why should
    <emphasis>you</emphasis>

    use Bugzilla?</para>

since the empty line is treated as a paragraph break :-(

Tobias



* Re: XML and empty line (DocBook)
  2002-07-29 19:58 XML and empty line (DocBook) Tobias Burnus
@ 2002-07-31 19:43 ` Simon Pepping
  2002-07-31 21:21   ` Michael Wiedmann
  2002-08-03 15:22   ` Simon Pepping
  0 siblings, 2 replies; 8+ messages in thread
From: Simon Pepping @ 2002-07-31 19:43 UTC


On Mon, Jul 29, 2002 at 09:58:35PM +0200, Tobias Burnus wrote:
> Hi,
> 
> Using 2002.7.12 I found the problem that
> 
>      <title>Apache
>      <filename>mod_rewrite</filename>
> 
>      magic</title>
> 
> causes an error because of the empty line ( = \par ). Any idea how to prevent
> this problem (except by editing the XML source)?
>   ! Paragraph ended before \XMLDBdotitle was complete.
>   <to be read again>
>                      \par

Even in XML mode a blank line (two consecutive line ends) generates a \par.
I cannot solve this; perhaps Hans knows a way out.

Both this and your previous problem (and Michael's answer to it) show
that TeX has no knowledge of ignorable white space. It cannot, because
it does not know the DTD. (Ignorable white space is all white space in
elements that do not have mixed content.)
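
To illustrate why the DTD matters, here is a minimal sketch in Python (the
language and the helper name child_summary are assumptions for illustration,
not something from this thread). A non-validating parser has no content
model, so it has to report the line breaks between child elements as
ordinary text nodes:

```python
from xml.dom import minidom

# A chapter with element-only content, laid out one element per line.
SOURCE = "<chapter>\n<title>Title of chapter</title>\n</chapter>"


def child_summary(xml_text):
    """Return (nodeName, nodeValue) for each child of the root element."""
    root = minidom.parseString(xml_text).documentElement
    return [(node.nodeName, node.nodeValue) for node in root.childNodes]


# The two line breaks show up as '#text' nodes next to the title element.
print(child_summary(SOURCE))
```

Only a parser that has read the DTD can know that <chapter> never holds
character data, and so that these two text nodes can be dropped safely.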

> Additionally, I frequently get a ']¿' at the beginning of my documents.

I believe this is another parsing problem with the internal DTD
subset. (AFAIK you should get '¿]' from '>]' in the document.)

Regards, Simon

-- 
Simon Pepping
email: spepping@scaprea.hobby.nl



* Re: XML and empty line (DocBook)
  2002-07-31 19:43 ` Simon Pepping
@ 2002-07-31 21:21   ` Michael Wiedmann
  2002-08-01  7:14     ` Hans Hagen
  2002-08-03 15:22   ` Simon Pepping
  1 sibling, 1 reply; 8+ messages in thread
From: Michael Wiedmann @ 2002-07-31 21:21 UTC


* Simon Pepping <spepping@scaprea.hobby.nl> [020731 21:43]:
> On Mon, Jul 29, 2002 at 09:58:35PM +0200, Tobias Burnus wrote:
...
> > Additionally, I frequently get a ']¿' at the beginning of my documents.
> 
> I believe this is another parsing problem with the internal DTD
> subset. (AFAIK you should get '¿]' from '>]' in the document.)

I observed this only on the first page (additional page before
the title page) of a DocBook 'article', and not for a 'book'.
In this case this has nothing to do with an internal subset.

Michael
-- 
mw@miwie.in-berlin.de                              http://www.miwie.org
mw@miwie.org



* Re: XML and empty line (DocBook)
  2002-07-31 21:21   ` Michael Wiedmann
@ 2002-08-01  7:14     ` Hans Hagen
  2002-08-01 19:07       ` Simon Pepping
  0 siblings, 1 reply; 8+ messages in thread
From: Hans Hagen @ 2002-08-01  7:14 UTC
  Cc: NTG-ConTeXt

At 11:21 PM 7/31/2002 +0200, Michael Wiedmann wrote:
>* Simon Pepping <spepping@scaprea.hobby.nl> [020731 21:43]:
> > On Mon, Jul 29, 2002 at 09:58:35PM +0200, Tobias Burnus wrote:
>...
> > > Additionally, I frequently get a ']¿' at the beginning of my documents.
> >
> > I believe this is another parsing problem with the internal DTD
> > subset. (AFAIK you should get '¿]' from '>]' in the document.)
>
>I observed this only on the first page (additional page before
>the title page) of a DocBook 'article', and not for a 'book'.
>In this case this has nothing to do with an internal subset.

if you make me small test files to play with, i will have a look

Btw, i fixed the skipped first ENTITY problem,

Hans

-------------------------------------------------------------------------
                                   Hans Hagen | PRAGMA ADE | pragma@wxs.nl
                       Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
  tel: +31 (0)38 477 53 69 | fax: +31 (0)38 477 53 74 | www.pragma-ade.com
-------------------------------------------------------------------------
                        information: http://www.pragma-ade.com/roadmap.pdf
                     documentation: http://www.pragma-ade.com/showcase.pdf
-------------------------------------------------------------------------



* Re: XML and empty line (DocBook)
  2002-08-01  7:14     ` Hans Hagen
@ 2002-08-01 19:07       ` Simon Pepping
  0 siblings, 0 replies; 8+ messages in thread
From: Simon Pepping @ 2002-08-01 19:07 UTC


On Thu, Aug 01, 2002 at 09:14:38AM +0200, Hans Hagen wrote:
> At 11:21 PM 7/31/2002 +0200, Michael Wiedmann wrote:
> >* Simon Pepping <spepping@scaprea.hobby.nl> [020731 21:43]:
> > > On Mon, Jul 29, 2002 at 09:58:35PM +0200, Tobias Burnus wrote:
> >...
> > > > Additionally, I frequently get a ']¿' at the beginning of my documents.
> > >
> > > I believe this is another parsing problem with the internal DTD
> > > subset. (AFAIK you should get '¿]' from '>]' in the document.)
> >
> >I observed this only on the first page (additional page before
> >the title page) of a DocBook 'article', and not for a 'book'.
> >In this case this has nothing to do with an internal subset.
> 
> if you make me small test files to play with, i will have a look

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
                  "/usr/local/lib/sgml/dtd/docbookx-4.1.2/docbookx.dtd" [
<!ENTITY TEX "TeX">
]>

<book>
<bookinfo>
<title>Title</title>
</bookinfo>
<chapter>
<title>Title of chapter</title>
<para>A paragraph &TEX;</para>
</chapter>
</book>

> Btw, i fixed the skipped first ENTITY problem,

Good.

Simon




* Re: XML and empty line (DocBook)
  2002-07-31 19:43 ` Simon Pepping
  2002-07-31 21:21   ` Michael Wiedmann
@ 2002-08-03 15:22   ` Simon Pepping
  2002-08-04 21:59     ` Hans Hagen
  1 sibling, 1 reply; 8+ messages in thread
From: Simon Pepping @ 2002-08-03 15:22 UTC


On Wed, Jul 31, 2002 at 09:43:00PM +0200, Simon Pepping wrote:
> On Mon, Jul 29, 2002 at 09:58:35PM +0200, Tobias Burnus wrote:
> > Hi,
> > 
> > Using 2002.7.12 I found the problem that
> > 
> >      <title>Apache
> >      <filename>mod_rewrite</filename>
> > 
> >      magic</title>
> > 
> > causes an error because of the empty line ( = \par ). Any idea how to prevent
> > this problem (except by editing the XML source)?
> >   ! Paragraph ended before \XMLDBdotitle was complete.
> >   <to be read again>
> >                      \par
> 
> Even in XML mode a blank line (two consecutive line ends) generates a \par.
> I cannot solve this; perhaps Hans knows a way out.
> 
> Both this and your previous problem (and Michael's answer to it) show
> that TeX has no knowledge of ignorable white space. It cannot, because
> it does not know the DTD. (Ignorable white space is all white space in
> elements that do not have mixed content.)
> 
> > Additionally, I frequently get a ']¿' at the beginning of my documents.
> 
> I believe this is another parsing problem with the internal DTD
> subset. (AFAIK you should get '¿]' from '>]' in the document.)

Perhaps it is better not to require that an XML parser in TeX can do
all these features right. It must be possible to rewrite the XML file
as a 'normalized' file and submit that to the TeX parser. For example,
it is possible to write a ContentHandler for a validating SAX parser
that removes ignorable white space. Perhaps the same is possible with
an XSLT script, but I am not sure if any XSLT processor does a
validating parse. Such a procedure would get rid of ignorable white
space, and it would resolve entities, thus making the work of a TeX
parser much easier.
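
A minimal sketch of such a normalizing step, in Python. Python's bundled
SAX parser (expat) does not validate, so the hand-written set ELEMENT_ONLY
below is an assumption standing in for what a validating parser would learn
from the DTD; the names Normalizer, normalize and SAMPLE are likewise only
illustrative. The handler drops white space in element-only content,
collapses white space elsewhere so that no empty line reaches TeX, and
relies on the parser to expand entities from the internal subset:

```python
import io
import re
import xml.sax
from xml.sax.saxutils import XMLGenerator

# Assumption: stand-in for DTD knowledge about element-only content models.
ELEMENT_ONLY = {"book", "bookinfo", "chapter"}

SAMPLE = """<!DOCTYPE book [<!ENTITY TEX "TeX">]>
<book>
<chapter>
<title>Apache
<filename>mod_rewrite</filename>

magic</title>
<para>A paragraph &TEX;</para>
</chapter>
</book>"""


class Normalizer(xml.sax.ContentHandler):
    """Rewrite the document without ignorable white space or blank lines."""

    def __init__(self, out):
        super().__init__()
        self._writer = XMLGenerator(out, encoding="utf-8")
        self._open = []      # names of the currently open elements
        self._pending = []   # character data collected since the last tag

    def _emit_text(self):
        text = "".join(self._pending)
        self._pending = []
        parent = self._open[-1] if self._open else None
        if parent in ELEMENT_ONLY and not text.strip():
            return  # white space between child elements: drop it
        if text:
            # Collapse white space runs; this would damage verbatim
            # content, which a real tool has to exempt.
            self._writer.characters(re.sub(r"\s+", " ", text))

    def startDocument(self):
        self._writer.startDocument()

    def endDocument(self):
        self._writer.endDocument()

    def startElement(self, name, attrs):
        self._emit_text()
        self._open.append(name)
        self._writer.startElement(name, attrs)

    def endElement(self, name):
        self._emit_text()
        self._open.pop()
        self._writer.endElement(name)

    def characters(self, content):
        # The parser may deliver text in several chunks; buffer them.
        self._pending.append(content)


def normalize(xml_text):
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), Normalizer(out))
    return out.getvalue()


print(normalize(SAMPLE))
```

Comments and processing instructions are silently dropped by this sketch;
a validating SAX parser would report the ignorable white space itself, so
the ELEMENT_ONLY table would not be needed there.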

Regards, Simon




* Re: XML and empty line (DocBook)
  2002-08-03 15:22   ` Simon Pepping
@ 2002-08-04 21:59     ` Hans Hagen
  0 siblings, 0 replies; 8+ messages in thread
From: Hans Hagen @ 2002-08-04 21:59 UTC
  Cc: NTG-ConTeXt

At 05:22 PM 8/3/2002 +0200, Simon Pepping wrote:

>Perhaps it is better not to require that an XML parser in TeX can do
>all these features right. It must be possible to rewrite the XML file
>as a 'normalized' file and submit that to the TeX parser. For example,
>it is possible to write a ContentHandler for a validating SAX parser
>that removes ignorable white space. Perhaps the same is possible with
>an XSLT script, but I am not sure if any XSLT processor does a
>validating parse. Such a procedure would get rid of ignorable white
>space, and it would resolve entities, thus making the work of a TeX
>parser much easier.

indeed, in some cases preprocessing is handy, for instance, i sometimes 
convert the 'verbatim cdata' things into code like:

<verbatim>
<line>...</line>

thereby not only gaining much more control over typography, but also 
getting cleaner source code.
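
A minimal sketch of that kind of conversion, in Python. The function name
verbatim_to_lines is hypothetical, and only the <verbatim> and <line>
element names come from the example above:

```python
from xml.sax.saxutils import escape


def verbatim_to_lines(code):
    """Wrap every line of a verbatim block in its own <line> element."""
    body = "\n".join(
        "<line>{}</line>".format(escape(line)) for line in code.splitlines()
    )
    return "<verbatim>\n{}\n</verbatim>".format(body)


print(verbatim_to_lines("if a < b:\n\n    return a"))
```

An empty source line becomes an empty <line></line> element, so the
converted block no longer contains a blank line that TeX could read as \par.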

I will provide some more cleanup, and especially when we have to deal with
language specific typesetting, it makes sense to convert all non-letter
characters into entities (: => &colon; and the like), because this permits
language dependent spacing. Some of our current projects demand this kind
of control, so you can expect some tools.
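
A minimal sketch of the character-to-entity step, in Python. Only &colon;
is mentioned above; the other names are ISO 8879 (ISOnum) entity names
picked for illustration, and whether the typesetting side defines them is
an assumption. The function name punctuation_to_entities is hypothetical,
and it expects character data, not markup:

```python
import re

# Only ':' => '&colon;' comes from the message; the rest is illustrative.
PUNCT_ENTITIES = {":": "&colon;", ";": "&semi;", "!": "&excl;", "?": "&quest;"}
MARKUP_ESCAPES = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
TABLE = {**MARKUP_ESCAPES, **PUNCT_ENTITIES}
PATTERN = re.compile("|".join(re.escape(char) for char in TABLE))


def punctuation_to_entities(text):
    """Replace punctuation in character data by entity references.

    Everything is replaced in a single pass, so the ';' that ends an
    inserted entity reference is never converted again.
    """
    return PATTERN.sub(lambda match: TABLE[match.group(0)], text)


print(punctuation_to_entities("Note: a < b; ok?"))
```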

Hans



* Re: XML and empty line (DocBook)
       [not found] <Pine.LNX.4.44.0207292143240.13525-100000@warp9.physik.fu-berlin.de>
@ 2002-07-30  9:26 ` Hans Hagen
  0 siblings, 0 replies; 8+ messages in thread
From: Hans Hagen @ 2002-07-30  9:26 UTC
  Cc: ntg-context

At 09:58 PM 7/29/2002 +0200, you wrote:
>Hi,
>
>First: unfortunately, ConTeXt 2002.7.26 no longer works with DocBook,
>which makes testing a bit harder (2002.7.12 does work).

The current DocBook style redefines a few low level macros; this has to be
adapted (I added some low level support macros needed by the authors).

Hans


