From mboxrd@z Thu Jan 1 00:00:00 1970
Message-Id: <200307190345.h6J3je727417@augusta.math.psu.edu>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] don't shoot me
In-Reply-To: Your message of "Fri, 18 Jul 2003 20:12:24 PDT."
From: Dan Cross
Date: Fri, 18 Jul 2003 23:45:40 -0400
Topicbox-Message-UUID: fcc7b05c-eacb-11e9-9e20-41e7f4b1d025

> > Yes, but with some tree structure and naming too.
>
> I haven't used XML, but tree structure seems like a simple thing to
> provide, for example with indentation as Plan 9 prototype files and
> Python do:
>
> /
> sys
> man
> 1
> cat
> doc
> fonts
> 9.ms
>
> Obviously each line could contain more than a simple file name
> component for general data representation.

I guess the one cool thing about XML that you don't get with something
like this is a way to name the data meaningfully and naturally.  With
XML, you can write something like the following:

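	<address>
		<street>123 Main Street</street>
		<city>Cherry Hill</city>
		<state>New Jersey</state>
		<postcode>12345</postcode>
		<country>USA</country>
	</address>
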
Now, looking beyond the hideous syntax for a moment, one thing jumps
out: all the data is clearly labelled.  I know that ``123 Main Street''
is a street address, and I can extract street addresses by name, instead
of relying on it being in some conventional place in the record (i.e.,
``the first text field in each record is the street address'').

Is that useful?  Sometimes yes, more often no.  But when it's needed,
it's really needed and is indispensable.  I contend the only real
advantage of XML over other representations is that it forces data to
be labelled (you can do this with sexps, but it's not mandatory).  Of
course, 9 times out of 10, the labelling sucks and tells you nothing.

The corresponding sexp might look something like the following, btw:

	(address
		(street "123 Main Street")
		(city "Cherry Hill")
		(state "New Jersey")
		(postcode "12345")
		(country "USA"))

But the following:

	("123 Main Street" ("Cherry Hill" "New Jersey") ("12345" "USA"))

is also valid.  Unfortunately, the latter example doesn't preserve any
metainformation about the data; we as humans can look at this and say,
``oh, that looks like an address; 12345 is probably a zip code.''  But a
computer has no idea, and I have no way to tell it other than by
position.  But what if I decide to add another field between the street
address and the city/state tuple?  Say, an apartment number field?  All
of a sudden, my position-based extraction logic fails.  At least with
XML, that isn't a problem (in theory, anyway; like I said, the labelling
can be totally boneheaded and meaningless).

Scott Schwartz once proposed using LaTeX syntax for describing data in
the same way one uses XML.  It was a good idea, and we'd end up with
something like:

	\begin{address}
		\street{123 Main Street}
		\city{Cherry Hill}
		\state{New Jersey}
		\postcode{12345}
		\country{USA}
	\end{address}

Of course, that made too much sense and thus never caught on.  It would
have been a lot cleaner and more compact than using XML syntax, though.
Oh well.

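To make the position-versus-label point above concrete, here is a
minimal Python sketch.  It assumes the labelled record is XML using the
same field names as the sexp, and that the unlabelled record is a nested
list shaped like the second sexp.  The apartment value ``4B'', the
<apartment> tag, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Labelled record, using the field names from the sexp example.
XML_V1 = """<address>
  <street>123 Main Street</street>
  <city>Cherry Hill</city>
  <state>New Jersey</state>
  <postcode>12345</postcode>
  <country>USA</country>
</address>"""

# The same record after adding a hypothetical <apartment> field
# right after the street address.
XML_V2 = XML_V1.replace("</street>", "</street>\n  <apartment>4B</apartment>")


def city_by_name(xml_text):
    """Look the city up by its label; its position does not matter."""
    return ET.fromstring(xml_text).findtext("city")


# Unlabelled records, shaped like the second sexp, as nested lists.
POS_V1 = ["123 Main Street", ["Cherry Hill", "New Jersey"], ["12345", "USA"]]
# Hypothetical apartment field inserted after the street address.
POS_V2 = ["123 Main Street", "4B", ["Cherry Hill", "New Jersey"], ["12345", "USA"]]


def city_by_position(record):
    """Assume the city is the first item of the second field."""
    return record[1][0]


if __name__ == "__main__":
    print(city_by_name(XML_V1))      # Cherry Hill
    print(city_by_name(XML_V2))      # Cherry Hill
    print(city_by_position(POS_V1))  # Cherry Hill
    print(city_by_position(POS_V2))  # 4 (first character of "4B")
```

The positional lookup doesn't even fail loudly on the new layout: it
quietly returns a piece of the apartment number.  The lookup by name
doesn't notice the extra field at all.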
Like anything else, XML has its place, but it's been shoehorned into 80
billion different places it doesn't belong.

	- Dan C.