9fans - fans of the OS Plan 9 from Bell Labs
From: Dan Cross <cross@math.psu.edu>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] don't shoot me
Date: Fri, 18 Jul 2003 23:45:40 -0400
Message-ID: <200307190345.h6J3je727417@augusta.math.psu.edu>
In-Reply-To: Your message of "Fri, 18 Jul 2003 20:12:24 PDT." <d721bb5de9a85ccc79001f9a04c79a3a@collyer.net>

> > Yes, but with some tree structure and naming too.
>
> I haven't used XML, but tree structure seems like a simple thing to
> provide, for example with indentation as Plan 9 prototype files and
> Python do:
>
> /
> 	sys
> 		man
> 			1
> 				cat
> 		doc
> 			fonts
> 			9.ms
>
> Obviously each line could contain more than a simple file name
> component for general data representation.

I guess the one cool thing about XML that you don't get with something
like this is a way to name the data meaningfully and naturally.  With
XML, you can write something like the following:

<?xml version="1.0"?>
<address>
  <street>123 Main Street</street>
  <city>Cherry Hill</city>
  <state>New Jersey</state>
  <postcode>12345</postcode>
  <country>USA</country>
</address>

Now, looking beyond the hideous syntax for a moment, one thing jumps
out: all the data is clearly labelled.  I know that ``123 Main Street''
is a street address, and I can extract street addresses by name,
instead of relying on it being in some conventional place in the record
(ie, ``the first text field in each record is the street address'').
Is that useful?  Sometimes yes, more often no.  But when it's needed,
it's really needed and is indispensable.
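
As a rough illustration of extracting a field by name, here is a minimal
sketch in Python using the standard xml.etree.ElementTree module.  The
variable names are made up for the example, and the XML declaration is
left off the embedded record to keep the string literal simple.

```python
import xml.etree.ElementTree as ET

# The address record from above, minus the <?xml ...?> declaration.
RECORD = """
<address>
  <street>123 Main Street</street>
  <city>Cherry Hill</city>
  <state>New Jersey</state>
  <postcode>12345</postcode>
  <country>USA</country>
</address>
"""

address = ET.fromstring(RECORD)

# Look the field up by its label, not by where it sits in the record.
street = address.findtext("street")
print(street)
```

Reordering the child elements, or adding new ones, leaves the lookup
untouched, which is the whole point of the labels.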

I contend the only real advantage of XML over other representations is
that it forces data to be labelled (you can do this with sexp's, but
it's not mandatory).  Of course, 9 times out of 10, the labelling
sucks and tells you nothing.  The corresponding sexp might look something
like the following, btw:

(address
  (street "123 Main Street")
  (city "Cherry Hill")
  (state "New Jersey")
  (postcode "12345")
  (country "USA"))

But the following:

("123 Main Street" ("Cherry Hill" "New Jersey") ("12345" "USA"))

is also valid.  Unfortunately, the latter example doesn't preserve any
metainformation about the data; we as humans can look at this and say,
``oh, that looks like an address; 12345 is probably a zip code.''  But
a computer has no idea, and I have no way to tell it other than by
position.  But what if I decide to add another field between the street
address and City/State tuple?  Say, an apartment number field?  All of
a sudden, my position-based extraction logic fails.  At least with XML,
that isn't a problem (in theory, anyway; like I said, the labelling can
be totally boneheaded and meaningless).
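
The failure mode can be sketched in a few lines of Python.  Everything
here is hypothetical: the nested lists stand in for the unlabelled sexp,
the dict stands in for a labelled record, and the "Apt 4" value and the
helper names are invented for illustration.

```python
# Unlabelled record: (street (city state) (postcode country)).
old_record = ["123 Main Street", ["Cherry Hill", "New Jersey"], ["12345", "USA"]]

# The same record after an apartment field is added after the street.
new_record = ["123 Main Street", "Apt 4",
              ["Cherry Hill", "New Jersey"], ["12345", "USA"]]

# Labelled version of the extended record.
new_labelled = {
    "street": "123 Main Street",
    "apartment": "Apt 4",
    "city": "Cherry Hill",
    "state": "New Jersey",
    "postcode": "12345",
    "country": "USA",
}


def city_by_position(record):
    # Assumes the city/state tuple is always the second element.
    return record[1][0]


def city_by_name(record):
    # Only assumes a field called "city" exists somewhere in the record.
    return record["city"]


print(city_by_position(old_record))   # Cherry Hill
print(city_by_position(new_record))   # A
print(city_by_name(new_labelled))     # Cherry Hill
```

Note that the positional lookup doesn't even raise an error on the new
layout; it quietly returns the first character of the apartment field,
which is arguably worse than failing outright.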

Scott Schwartz once proposed using LaTeX syntax for describing data
in the same way one uses XML.  It was a good idea, and we'd end up
with something like:

\begin{address}
  \street{123 Main Street}
  \city{Cherry Hill}
  \state{New Jersey}
  \postcode{12345}
  \country{USA}
\end{address}

Of course, that made too much sense and thus never caught on.  It
would have been a lot cleaner and more compact than using XML
syntax, though.
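
To give a feel for how little machinery such a syntax would need, here
is a rough Python sketch that reads one flat record in that style.  It
is an assumption-laden toy, not anything Scott proposed: the
parse_record helper and the regular expression are invented here, and
it handles neither nested environments nor braces inside values.

```python
import re

RECORD = r"""
\begin{address}
  \street{123 Main Street}
  \city{Cherry Hill}
  \state{New Jersey}
  \postcode{12345}
  \country{USA}
\end{address}
"""

# Matches \name{value}; the value may not contain a closing brace.
FIELD = re.compile(r"\\(\w+)\{([^}]*)\}")


def parse_record(text):
    """Return (record kind, {field name: value}) for one flat record."""
    kind = None
    fields = {}
    for name, value in FIELD.findall(text):
        if name == "begin":
            kind = value          # \begin{address} names the record type
        elif name == "end":
            break                 # stop at the matching \end{...}
        else:
            fields[name] = value
    return kind, fields


kind, fields = parse_record(RECORD)
print(kind, fields["street"])
```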

Oh well.  Like anything else, XML has its place, but it's been
shoehorned into 80 billion different places it doesn't belong.

	- Dan C.



