From mboxrd@z Thu Jan 1 00:00:00 1970
Message-Id: <200101120031.AAA22053@whitecrow.demon.co.uk>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] Typesetting
In-reply-to: Your message of "Wed, 10 Jan 2001 18:32:55 EST." <200101102332.SAA28475@augusta.math.psu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
From: Steve Kilbane
Date: Fri, 12 Jan 2001 00:31:03 +0000
Topicbox-Message-UUID: 4bf47b7c-eac9-11e9-9e20-41e7f4b1d025

> In article <20010110201239.3ECB619A40@mail.cse.psu.edu> Dan wrote:
> The following facts about goal of the web are relevant:
>
>	+ The web is about information sharing.

Alas, no longer. The web is now about selling something, or looking
"cool". If it was just about sharing information, most of the current
problems wouldn't be nearly so bad. However, that's a sociological
problem, and a technical fix won't help.

>	+ Most web pages are specific to a given topic area.
>	+ The data is often meant to be *used* (by that, I mean
>	  not just telling me where the muffler on my car is,
>	  but manipulated by me).

In other words, it's got internal structure that's more than just
pixels on the screen, so it should be handled as such.

> So, some things that really don't make sense are:
>
>	+ The distribution protocol is based on file transfer, not sharing.

You're linking the use of the data with the retrieval of the data.
I don't think the two are particularly related. The key, I feel, is
how you identify what data is to be fetched next. How you actually
get it isn't really interesting.

>	+ The markup language doesn't preserve the semantics of the content.
>	+ The Markup language doesn't provide a good way to present the data.

Both true.

>	+ There is no built in ordering to the data.

That's arguable, so I think you'd better clarify it.

>	+ The browser model is all wrong; it doesn't integrate cleanly into
>	  the rest of the environment, and effectively prevents me from
>	  manipulating the content.

But that's never going to change, at least until Microsoft (or any
other company) achieve their goal of being the universal platform.
While there is heterogeneity, you'll need some way to insulate the
data from the destination platform's weirdness.

> So these are the problems that I think need to be addressed first. You
> can probably see where I'm going with this, but, here goes:
>
>	+ Replace the distribution protocol with a distributed
>	  filesystem; something similar to AFS.
[...]
>	  This simplifies a lot of stuff.

It does, but unfortunately, it simplifies the stuff that happens to
be simple to begin with. Filesystems are well understood, and so is
data transfer in general. HTTP was a bad start, and we've only still
got it now because it had a solid foothold.

A side point, though: DNS contains much less information than a web
server. Heavily-accessed sites will still need big systems because
the intermediate nodes on the net can only cache so much, so many
accesses will still make it back to the source machine.

>	  The hierarchial organization of the filesystem namespace
>	  allows me to easily categorize content.

I'm sorry to say this, but no chance. Absolutely none. For any
arbitrary information storage system, you can't come up with a
hierarchy that makes sense to more than one segment of the user-base
(unless you count /everything). Different users see things in
different ways, and so need a different hierarchy. Worse, it changes
depending on what they're looking for.

As a simple example, the ubiquitous FAQ: a document ideally written
by an expert, for a non-expert reader. The author and the target
reader have different views of the same information, and would
probably like it presented differently. A tutorial is structured
differently from a reference guide, and that's just the tip of the
iceberg.

>	+ The next major problem is content markup and presentation.
>	  I haven't figured out too much about that yet.

You and most webmasters.
:-)

>	  Most
>	  content needs something a little more, umm, attractive
>	  than plain text to be popular,

Which takes me back to the original point about what the focus of
the web is, nowadays. As it happens, I agree with the masses here,
albeit for different reasons. The commercial sites want pages that
look snazzy, whereas I want pages that get the information into my
brain in the fastest possible way, in a manner I understand. If this
means images, animations, etc, then fine - but only if that's the
best way. It also probably needs an expert in visual aids to pull it
off, and that's a rare talent.

>	  but it's also important
>	  to preserve information about content. For instance,
>	  ``this is a telephone number.'' XML tries to do this,
>	  and I think does okay, but it imposes a rigid structure
>	  on the data. That's kind of unfortunate, since it doesn't
>	  integrate well with text processing tools like grep et al.

But then, how do you grep a 3d model? If you want your information
to be more than unstructured text, you need different data
manipulation tools.

I realise I've been purely negative here, but I don't have any
constructive comments to make. I worked on the sort of thing you're
after (a "knowledge management system" - ick), and I didn't gain
much in the way of answers. Mainly, I got an appreciation of how
hard the generic problem is.

steve