From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dan Cross
Message-Id: <200101102332.SAA28475@augusta.math.psu.edu>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] Typesetting
In-Reply-To: <20010110201239.3ECB619A40@mail.cse.psu.edu>
Cc:
Date: Wed, 10 Jan 2001 18:32:55 -0500
Topicbox-Message-UUID: 4b603f34-eac9-11e9-9e20-41e7f4b1d025

In article <20010110201239.3ECB619A40@mail.cse.psu.edu> you write:
>I agree that the web has become ugly. How would you
>change it? I think I would reduce it a bit to something
>more content oriented and less driven to "appeal".

I've used a lot of emails to folks as sounding boards for some random
ideas, but unfortunately, it all sounds like kind of incoherent
rambling, as I'm sure this note does. :-)

In general, I think that the distribution mechanism is all wrong, as is
the fact that there's no decent way to categorize or present the data.
The following facts about the goals of the web are relevant:

+ The web is about information sharing.
+ The web is content driven.
+ Most web pages are specific to a given topic area.
+ The data is often meant to be *used* (by that, I mean not just
  telling me where the muffler on my car is, but manipulated by me).

So, some things that really don't make sense are:

+ The distribution protocol is based on file transfer, not sharing.
+ The markup language doesn't preserve the semantics of the content.
+ The markup language doesn't provide a good way to present the data.
+ There is no built-in ordering to the data.
+ The browser model is all wrong; it doesn't integrate cleanly into the
  rest of the environment, and effectively prevents me from
  manipulating the content.

So these are the problems that I think need to be addressed first. You
can probably see where I'm going with this, but, here goes:

+ Replace the distribution protocol with a distributed filesystem;
  something similar to AFS.

  - Instead of having web servers, have file servers.
  - I should never have to talk to more than one file server to get
    anywhere on the web; kinda like DNS.
  - File servers should cache data using a mechanism similar to that of
    DNS; that is, each file should have as part of its metadata a
    ``time to live'' detailing how long another server may cache the
    file. The server could use an LRU mechanism to keep the cache size
    reasonable. Whole file caching is fine.
  - File servers should provide a network-enabled ``named pipe'' like
    mechanism to provide interactive services.

  This simplifies a lot of stuff. First of all, the scalability
  problems of current-generation web servers go away. I don't need a
  farm of high powered boxes to serve out content; I just need a few
  file servers. This is kind of like what Akamai et al. attempt to do.

  Second, the ``session handling'' problem goes away for interactive
  services. A session is active as long as I have one of these ``named
  pipe'' like files open. It goes away when I close the file.

  Third, the design provides a mechanism for built-in proxies, since a
  client only ever talks to a local file server. If I need a proxy for
  some reason, I can just interject a file server between my clients
  and the rest of the ``web.'' I don't have to worry about configuring
  my firewall to allow everyone's desktop machine to access every web
  server in the world. Instead, I just have a single caching file
  server in my DMZ or outside my firewall that the desktops talk to.

  The hierarchical organization of the filesystem namespace allows me
  to easily categorize content. I can also use this to restrict access
  to information; if I authenticate to the filesystem, then I can say
  things like, ``if user is not in acl, don't let him/her see
  /foo....'' This could be useful for blocking the rest of the world
  from my internal information, or for blocking the kids from things
  they shouldn't see.

+ The next major problem is content markup and presentation. I haven't
  figured out too much about that yet.
  Most content needs something a little more, umm, attractive than
  plain text to be popular, but it's also important to preserve
  information about content. For instance, ``this is a telephone
  number.'' XML tries to do this, and I think does okay, but it imposes
  a rigid structure on the data. That's kind of unfortunate, since it
  doesn't integrate well with text processing tools like grep et al.

  Perhaps structured regular expressions and some kind of metalanguage
  could help out the content structure part, but the markup part is
  still unsolved. Perhaps another metalanguage derived from structured
  regular expressions could help here. Either that, or treat everything
  as an object with a ``render'' method, which is almost what XML does.

Well, that's basically it, sorry it's rather rambling. There are a lot
of open issues, like authentication and privacy (both of which are
afterthoughts on the web, but must be integral from the beginning of a
new system), etc., but I'd rather solve them at the file server level
than at the application level.

	- Dan C.
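
For what it's worth, the caching bullet above (a per-file ``time to
live'' plus an LRU bound, whole files only) can be sketched roughly as
follows. This is only an illustration under assumptions, not a design:
the names FileCache, fetch, max_files and clock are all made up here,
and fetch(path) merely stands in for "ask the next file server
upstream", returning the file's data along with its TTL in seconds.

```python
import time
from collections import OrderedDict


class FileCache:
    """Hypothetical whole-file cache: per-file TTL plus LRU eviction."""

    def __init__(self, fetch, max_files=1000, clock=time.monotonic):
        # fetch(path) -> (data, ttl_seconds); an assumed stand-in for
        # talking to the upstream file server.
        self.fetch = fetch
        self.max_files = max_files
        self.clock = clock
        # path -> (data, expires_at), ordered least recently used first.
        self.entries = OrderedDict()

    def read(self, path):
        now = self.clock()
        entry = self.entries.get(path)
        if entry is not None and entry[1] > now:
            # Fresh copy: mark it most recently used and serve it.
            self.entries.move_to_end(path)
            return entry[0]
        # Missing or expired: refetch the whole file and restart its TTL.
        data, ttl = self.fetch(path)
        self.entries[path] = (data, now + ttl)
        self.entries.move_to_end(path)
        # Keep the cache size reasonable by dropping the LRU files.
        while len(self.entries) > self.max_files:
            self.entries.popitem(last=False)
        return data
```

A desktop client would only ever call read() on its local server's
cache; chaining one FileCache's read as another's fetch would give the
"interject a file server" proxy arrangement, though the inner TTL would
have to be passed along for that to work, which this sketch omits.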