Re: An idea

From: Nikolai Weibull <mailing-lists.context-users@rawuncut.elitemail.org>
Subject: Re: An idea
Date: Mon, 5 Dec 2005 21:48:09 +0100
Message-ID: <20051205204809.GD19534@puritan.petwork>
In-Reply-To: <439353D5.4070704@wxs.nl>

Hans Hagen wrote:

> Nikolai Weibull wrote:

> > Is there (going to be) any support for this export format in ConTeXt
> > so that I can just generate this kind of XML instead so that we only
> > do things once?  I found this in mscite-p.pdf: “The exporter will be
> > descibed as soon as there are styles for processing the XML code.
> > For the moment we stick to showing the schema.”  Am I to take it
> > that there are plans for handling this, but that nothing has been
> > written yet?  I’ll gladly help out if I can.  I was hoping to make
> > an article out of this, but now it seems that a lot of groundwork
> > has already been done.  (I really need to write some articles if I’m
> > going to be accepted for a postgraduate position.)

> see attached zip   

> this is what is produced, writing an xml mapping to context code (i must 
> have an example somewhere but cannot find it now)

(I come off quite negative in this response, but I am not trying to
place blame here, just trying to come up with something better.)

That’s just horrendous.  The ‘n’ attribute on ‘line’ is useless (and why
are empty lines unnumbered?).  You can just as easily use XSLT’s
<xsl:number/> to get that effect:

  <xsl:template match="scite:line">
    <xsl:number/>. <xsl:apply-templates/>
  </xsl:template>

Furthermore, the ‘n’ attribute on ‘t’ is equally useless.  How am I to
know what “3” stands for when converting?

Finally, what’s the reason for having ‘s’?  It just complicates matters
immensely:

  <xsl:template match="scite:s">
    <xsl:call-template name="print-spaces">
      <xsl:with-param name="n" select="@n"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="print-spaces">
    <xsl:param name="n"/>
    <xsl:if test="number($n) > 0">
      <xsl:text> </xsl:text>
      <xsl:call-template name="print-spaces">
        <xsl:with-param name="n" select="number($n) - 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

The problem is that the ‘n’ attribute defaults to 1, but since plain
Relax-NG has no way to express default values for attributes (and
there’s no schema at the bogus URL http://www.scintila.org/scite.rng
anyway), the stylesheet would need yet another hack to cover a missing
‘n’.
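
For comparison, the default is easy to handle outside XSLT.  A minimal
Ruby sketch using REXML, assuming a line consists of ‘t’ elements that
hold text and empty ‘s’ elements whose optional ‘n’ counts spaces
(line_text and the sample input are made up for illustration, and
namespaces are ignored):

```ruby
require 'rexml/document'

# Hypothetical helper: flatten one <line> element into plain text.
# An <s> element becomes 'n' spaces; a missing 'n' is treated as 1,
# which is the assumed default discussed above.
def line_text(line)
  line.elements.map do |el|
    if el.name == 's'
      ' ' * (el.attributes['n'] || '1').to_i
    else
      el.text.to_s
    end
  end.join
end

# Made-up sample input in the assumed export format.
xml = '<line n="1"><t n="3">int</t><s/><t n="0">a;</t><s n="2"/></line>'
puts line_text(REXML::Document.new(xml).root).inspect
```

The fallback to '1' is the whole "hack"; in the stylesheet the same
check has to be spelled out before the recursive template is called.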

Could you enlighten me on the merits of having a tag for representing
(sequences of) whitespace?

Here’s a suggestion for a better grammar:

ID.datatype = xsd:ID
LanguageCode.datatype = xsd:language

id.attrib = attribute id { ID.datatype }?
xmlbase.attrib = attribute xml:base { text }?
Core.attrib = id.attrib, xmlbase.attrib
lang.attrib = attribute xml:lang { LanguageCode.datatype }?
I18n.attrib = lang.attrib
Common.attrib = Core.attrib, I18n.attrib

start = file | line | segment | whitespace

file = element file { file.attlist, file.content }
file.attlist = Common.attrib, attribute path { text }, attribute type { text }
file.content = line*

line = element line { line.attlist, line.content }
line.attlist = Common.attrib
line.content = (segment | whitespace)*

segment = element segment { segment.attlist, text }
segment.attlist = Common.attrib, attribute type { text }

whitespace = element whitespace { whitespace.attlist, empty }
whitespace.attlist = Common.attrib

I’m sure that there are things missing and that it can be improved
further, but it’s a good start in my opinion.

What I can’t decide is whether to write things like

<line>
  <segment type="type">int</segment><whitespace/><segment type="normal">a;</segment>
</line>

or

<line>
  <type>int</type><whitespace/><normal>a;</normal>
</line>

The first scheme is easier to extend with more types, if we don’t
validate the value of segment’s ‘type’ attribute (if we do, then both
schemes are equally “easy” to extend), but the second scheme has the
benefit that as much validation as possible can go into the schema and
can thus ease translation.

I envision that the XSLT ConTeXt-converter for this scheme would have
code in one of these two formats:

<xsl:template match="segment">
  \highlight[<xsl:value-of select="@type"/>]{<xsl:value-of select="."/>}
</xsl:template>

or

<xsl:template match="type|normal|comment|...">
  \highlight[<xsl:value-of select="local-name(.)"/>]{<xsl:value-of select="."/>}
</xsl:template>

I really don’t know what I prefer.  The first means that only the source
and destination need to worry about dealing with new types (the
stylesheet can be used unchanged).
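
To make the first variant concrete outside XSLT, here is a rough Ruby
sketch using REXML.  It assumes the segment-based scheme from the
grammar above; line_to_context is a made-up name, \highlight is the
same placeholder macro as in the templates, and TeX special characters
are not escaped:

```ruby
require 'rexml/document'

# Hypothetical converter for the segment-based scheme: every <segment>
# becomes \highlight[type]{text} and every <whitespace/> a single space.
# The 'type' value is passed through untouched, so new types need no
# change here.
def line_to_context(line)
  line.elements.map do |el|
    case el.name
    when 'segment'    then "\\highlight[#{el.attributes['type']}]{#{el.text}}"
    when 'whitespace' then ' '
    else ''
    end
  end.join
end

xml = '<line><segment type="type">int</segment><whitespace/>' \
      '<segment type="normal">a;</segment></line>'
puts line_to_context(REXML::Document.new(xml).root)
# prints \highlight[type]{int} \highlight[normal]{a;}
```

With the element-per-type scheme the case statement would instead have
to list every element name, which is the same trade-off as in the
second template.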

> about an article ... it would be interesting to combine this with a kind 
> of literate programming

Yes, that’s true.  I used a simple hack to get my code, written in a
literate-programming style, into my master’s thesis.  It worked quite
well, actually.  I’m sure that this could be combined with the method I
used there.

> (if cweb would produce better output then it may have become a  bit
> more popular)

Yes.  I see three problems with CWEB:

1.  The syntax is too complicated
2.  The output is not great
3.  The input isn’t compilable in the source programming language

What my literate programming system did was allow for embedding comments
in the source instead of the other way around (much like how ConTeXt
sources are documented).  This worked very well actually, especially in
the parts of my code that were written in Ruby where one can do things
like

# ¶ OK, so now that we have all the other methods done, it’s time to
# write our final method \Ruby{method}.  It will simply tie together the
# other methods found in \Ruby{Class}:
def Class.method
  ⋮
end

(A sequence of comments where the first comment began with a pilcrow
sign was processed as text to send to ConTeXt, and the following code
block (basically everything up to the next pilcrow-endowed comment) was
set inside a verbatim environment.)
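
The splitting step can be sketched in a few lines of Ruby.  This is not
the original tool, only an illustration of the rule described above;
split_literate and to_context are made-up names, and the choice of
\starttyping/\stoptyping as the verbatim environment is an assumption:

```ruby
# Hypothetical preprocessor: split source lines into prose chunks
# (comment runs opened by a "# ¶" line) and code chunks (everything
# else, up to the next pilcrow comment).
def split_literate(source)
  chunks = []
  in_prose = false
  source.each_line(chomp: true) do |line|
    if line =~ /\A\s*#\s*¶\s?(.*)\z/
      # A pilcrow comment always starts a new prose chunk.
      chunks << [:prose, [$1]]
      in_prose = true
    elsif in_prose && line =~ /\A\s*#\s?(.*)\z/
      # Comments directly following a pilcrow comment continue the prose.
      chunks.last[1] << $1
    else
      in_prose = false
      chunks << [:code, []] unless chunks.last && chunks.last[0] == :code
      chunks.last[1] << line
    end
  end
  chunks
end

# Prose is passed through as-is; code is wrapped in a verbatim
# environment (assumed here to be \starttyping ... \stoptyping).
def to_context(chunks)
  chunks.map do |kind, lines|
    text = lines.join("\n")
    kind == :prose ? text : "\\starttyping\n#{text}\n\\stoptyping"
  end.join("\n\n")
end

source = <<~'RUBY'
  # ¶ It will simply tie together the other methods:
  def run
    step_one
  end
RUBY
puts to_context(split_literate(source))
```

Ordinary comments inside a code block stay with the code, since only a
pilcrow comment switches back to prose.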

Anyway, I’ve attached the Relax-NG schema that I devised.  Tomorrow I’m
going to work on writing an exporter that exports to this format and a
stylesheet that transforms this to ConTeXt code.  If I have time I’ll
also try to set up some environment for this and some generic
preprocessing directives for texexec.

        nikolai

-- 
Nikolai Weibull: now available free of charge at http://bitwi.se/!
Born in Chicago, IL USA; currently residing in Gothenburg, Sweden.
main(){printf(&linux["\021%six\012\0"],(linux)["have"]+"fun"-97);}

[-- Attachment #2: syntax-export.rnc --]
[-- Type: text/plain, Size: 838 bytes --]

ID.datatype = xsd:ID
LanguageCode.datatype = xsd:language

id.attrib = attribute id { ID.datatype }?
xmlbase.attrib = attribute xml:base { text }?
Core.attrib = id.attrib, xmlbase.attrib
lang.attrib = attribute xml:lang { LanguageCode.datatype }?
I18n.attrib = lang.attrib
Common.attrib = Core.attrib, I18n.attrib

start = file | line | segment | whitespace

file = element file { file.attlist, file.content }
file.attlist = Common.attrib, attribute path { text }, attribute type { text }
file.content = line*

line = element line { line.attlist, line.content }
line.attlist = Common.attrib
line.content = (segment | whitespace)*

segment = element segment { segment.attlist, text }
segment.attlist = Common.attrib, attribute type { text }

whitespace = element whitespace { whitespace.attlist, empty }
whitespace.attlist = Common.attrib
