9fans - fans of the OS Plan 9 from Bell Labs
From: cross@sdgm.net
To: 9fans@cse.psu.edu
Subject: [9fans] webls.c - synthesize directories on the fly.
Date: Mon, 29 Sep 2003 02:14:37 -0400
Message-ID: <3b6b67c0806b02b89afd64dcc39fe25b@sdgm.net>

[-- Attachment #1: Type: text/plain, Size: 2831 bytes --]

ip/httpd/httpd doesn't dynamically create web pages indexing
directories the way some other httpd's can be configured to (apache is
an obvious example here).  This is probably to the good; it prevents
people from snooping around your directory structure unnecessarily.
However, it can also be annoying, for instance when you want to expose
the contents of a directory hierarchy to the world.  A simple solution
is to write a script that creates directory listings, use it to
generate static HTML pages, and create links to those.  But that has
its own drawbacks: (a) you have to maintain the pages, and (b) they
clutter up your namespace with things you ordinarily wouldn't have
there.

But nothing is stopping us from implementing a tool to do that
automatically, so I did just that.  Here is webls, a server in the
same vein as man2html (indeed, I started with the man2html source when
building webls), to be run via the /magic, er, magic, and which
generates ls-like output for a directory on the fly.
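
For a sense of what a request looks like: webls.c links each
subdirectory as /magic/webls?dir=<path>.  Below is a minimal sketch, in
portable C rather than Plan 9 C, of building such a URL with
percent-escaping.  The helper name weblsurl and the choice of which
characters to leave unescaped are assumptions made for illustration;
neither is part of httpd or webls.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: write "/magic/webls?dir=" followed by the
 * percent-escaped dir into buf (n bytes).  Letters, digits and
 * "/-._~" are copied as-is; everything else becomes %XX.
 * Returns 0 on success, -1 if buf is too small.
 */
int
weblsurl(char *buf, size_t n, const char *dir)
{
	static const char prefix[] = "/magic/webls?dir=";
	static const char hex[] = "0123456789ABCDEF";
	size_t o;
	unsigned char c;

	o = sizeof(prefix) - 1;
	if (n < o + 1)
		return -1;
	memcpy(buf, prefix, o);
	for (; (c = (unsigned char)*dir) != '\0'; dir++) {
		if (isalnum(c) || strchr("/-._~", c) != NULL) {
			if (o + 1 >= n)
				return -1;
			buf[o++] = c;
		} else {
			if (o + 3 >= n)
				return -1;
			buf[o++] = '%';
			buf[o++] = hex[c >> 4];
			buf[o++] = hex[c & 15];
		}
	}
	buf[o] = '\0';
	return 0;
}
```

With this sketch, a directory named "/pub/my docs" would be requested
as /magic/webls?dir=/pub/my%20docs.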

So far, it seems to work rather well.  I've only tested it myself, I
guess, and rather lightly, but it's simple enough to not be a big
deal.  It also supports two files, /sys/lib/webls.allowed and
/sys/lib/webls.denied, for restricting which parts of the web space it
will create listings for (the two files fit together in a pretty
obvious way, and the exact rules are spelled out in the source).  In a
nutshell, add lines to webls.allowed with regular expressions for what
you want to be visible, and lines to webls.denied with regular
expressions for what you don't want to be visible.  Webls.allowed
overrides webls.denied.  If webls.allowed exists and webls.denied does
not, access is granted only for directories that match a regular
expression listed in webls.allowed.  If webls.denied exists and
webls.allowed does not, access is denied only for directories that
match a regular expression listed in webls.denied.  If neither exists,
access is granted everywhere; the rationale here is that only things
in your publicly viewable web namespace are visible anyhow.  A safe
thing to do is simply ``echo '.*' > /sys/lib/webls.denied'' and then
explicitly list in webls.allowed those things you want to be
listable.
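
The decision rules can be sketched on their own, apart from the file
handling.  What follows is a rough illustration in portable C, not the
code from webls.c: it uses POSIX <regex.h> in place of Plan 9's
regexp(2), whose syntax differs in places, and takes the patterns as
arrays instead of reading the two files.  The names listable and
matchany are made up for this sketch; a count below zero stands for "the
file does not exist".

```c
#include <regex.h>
#include <stddef.h>

/*
 * Return 1 if any pattern in pats[0..n-1] matches dir, else 0.
 * Patterns that fail to compile are skipped.
 */
static int
matchany(const char *dir, const char **pats, int n)
{
	regex_t re;
	int i, hit;

	hit = 0;
	for (i = 0; i < n && !hit; i++) {
		if (regcomp(&re, pats[i], REG_EXTENDED | REG_NOSUB) != 0)
			continue;
		if (regexec(&re, dir, 0, NULL, 0) == 0)
			hit = 1;
		regfree(&re);
	}
	return hit;
}

/*
 * Hypothetical stand-in for the access check.
 * ndenied < 0 or nallowed < 0 means that file is absent.
 */
int
listable(const char *dir, const char **denied, int ndenied,
	const char **allowed, int nallowed)
{
	if (ndenied < 0 && nallowed < 0)
		return 1;	/* neither file: everything is listable */
	if (ndenied < 0)
		return matchany(dir, allowed, nallowed);	/* allowed only */
	if (!matchany(dir, denied, ndenied))
		return 1;	/* not denied: listable */
	if (nallowed < 0)
		return 0;	/* denied, and nothing can override */
	return matchany(dir, allowed, nallowed);	/* allowed overrides denied */
}
```

With a denied list of '.*' and an allowed list of '^/pub(/|$)', this
sketch lists /pub and /pub/src but refuses /private.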

Anyway, a patch to /sys/src/cmd/ip/httpd/mkfile follows, as well
as the default `turn everything off' webls.denied and webls.c itself.
This is, I think, useful enough to go in the base distribution.

	- Dan C.

brahma% ape/diff -c /n/sources/plan9/sys/src/cmd/ip/httpd/mkfile mkfile
*** /n/sources/plan9/sys/src/cmd/ip/httpd/mkfile	Wed Nov 27 19:23:25 2002
--- mkfile	Mon Sep 29 01:44:15 2003
***************
*** 10,15 ****
--- 10,16 ----
  	man2html\
  	save\
  	wikipost\
+ 	webls\

  LIB=libhttps.a.$O

brahma%

[-- Attachment #2.1: Type: text/plain, Size: 312 bytes --]

The following attachment had content that we can't
prove to be harmless.  To avoid possible automatic
execution, we changed the content headers.
The original header was:

	Content-Disposition: attachment; filename=webls.denied
	Content-Type: text/plain; charset="US-ASCII"
	Content-Transfer-Encoding: 7bit

[-- Attachment #2.2: webls.denied.suspect --]
[-- Type: application/octet-stream, Size: 2 bytes --]

.*

[-- Attachment #3: webls.c --]
[-- Type: text/plain, Size: 6854 bytes --]

#include <u.h>
#include <libc.h>
#include <ctype.h>
#include <bio.h>
#include <regexp.h>
#include <fcall.h>
#include "httpd.h"
#include "httpsrv.h"

static	Hio		*hout;
static	Hio		houtb;
static	HConnect	*connect;
static	int		vermaj, gidwidth, uidwidth, lenwidth, devwidth;
static	int		okfd = -1, notokfd = -1;

void
error(char *title, char *fmt, ...)
{
	va_list arg;
	char buf[1024], *out;

	va_start(arg, fmt);
	out = vseprint(buf, buf+sizeof(buf), fmt, arg);
	va_end(arg);
	*out = 0;

	hprint(hout, "%s 404 %s\r\n", hversion, title);
	hprint(hout, "Date: %D\r\n", time(nil));
	hprint(hout, "Server: Plan9\r\n");
	hprint(hout, "Content-type: text/html\r\n");
	hprint(hout, "\r\n");
	hprint(hout, "<html>\n");
	hprint(hout, "<head><title>%s</title></head>\n", title);
	hprint(hout, "<body>\n");
	hprint(hout, "<h1>%s</h1>\n", title);
	hprint(hout, "%s\n", buf);
	hprint(hout, "</body>\n");
	hprint(hout, "</html>\n");
	hflush(hout);
	writelog(connect, "Reply: 404\nReason: %s\n", title);
	exits(nil);
}

/*
 * Are we actually allowed to look in here?
 *
 * Rules:
 *	1) If neither allowed nor denied files exist, access is granted.
 *	2) If allowed exists and denied does not, dir *must* be in allowed
 *	   for access to be granted, otherwise, access is denied.
 *	3) If denied exists and allowed does not, dir *must not* be in
 *	   denied for access to be granted, otherwise, access is denied.
 *	4) If both exist, okay if either (a) file is not in denied, or
 *	   (b) in denied and in allowed.  Otherwise, access is denied.
 */
static int
allowed(char *dir)
{
	char	*p, *t;
	Reprog	*re;
	int	okay;
	Resub	match;
	Biobuf	aio, dio;

	if (okfd < 0 && notokfd < 0)
		return(1);
	Binit(&aio, okfd, OREAD);
	Binit(&dio, notokfd, OREAD);

	okay = !(notokfd < 0);
	while (okay && (p = Brdstr(&dio, '\n', 0)) != nil) {
		t = strchr(p, '#');
		if (t != nil)
			*t = '\0';
		t = p + strlen(p);
		/* trim trailing white space, including a lone first character */
		while(t > p && isspace(t[-1]))
			*--t = '\0';
		if (strlen(p) == 0) {
			free(p);
			continue;
		}
		re = regcomp(p);
		if (re == nil) {
			free(p);
			continue;
		}
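		/* regexec honors nonzero sp/ep in match[0] on entry, so clear it */
		memset(&match, 0, sizeof match);
		if (regexec(re, dir, &match, 1) == 1)
			okay = 0;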
		free(re);
		free(p);
	}
	if (okfd < 0)
		return(okay);
	while (!okay && (p = Brdstr(&aio, '\n', 0)) != nil) {
		t = strchr(p, '#');
		if (t != nil)
			*t = '\0';
		t = p + strlen(p);
		/* trim trailing white space, including a lone first character */
		while(t > p && isspace(t[-1]))
			*--t = '\0';
		if (strlen(p) == 0) {
			free(p);
			continue;
		}
		re = regcomp(p);
		if (re == nil) {
			free(p);
			continue;
		}
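		/* regexec honors nonzero sp/ep in match[0] on entry, so clear it */
		memset(&match, 0, sizeof match);
		/* compare against 1: regexec returns -1 on overflow, not a match */
		if (regexec(re, dir, &match, 1) == 1)
			okay = 1;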
		free(re);
		free(p);
	}
	return(okay);
}

/*
 * Comparison routine for sorting the directory.
 */
static int
compar(Dir *a, Dir *b)
{
	return(strcmp(a->name, b->name));
}

/*
 * These are for formatting; how wide are variable-length
 * fields?
 */
static void
maxwidths(Dir *dp, long n)
{
	long	i;
	char	scratch[64];

	for (i = 0; i < n; i++) {
		if (snprint(scratch, sizeof scratch, "%ud", dp[i].dev) > devwidth)
			devwidth = strlen(scratch);
		if (strlen(dp[i].uid) > uidwidth)
			uidwidth = strlen(dp[i].uid);
		if (strlen(dp[i].gid) > gidwidth)
			gidwidth = strlen(dp[i].gid);
		if (snprint(scratch, sizeof scratch, "%lld", dp[i].length) > lenwidth)
			lenwidth = strlen(scratch);
	}
}

/*
 * Do an actual directory listing.
 * asciitime is lifted directly out of ls.
 */
char *
asciitime(long l)
{
	ulong clk;
	static char buf[32];
	char *t;

	clk = time(nil);
	t = ctime(l);
	/* 6 months in the past or a day in the future */
	if(l<clk-180L*24*60*60 || clk+24L*60*60<l){
		memmove(buf, t+4, 7);		/* month and day */
		memmove(buf+7, t+23, 5);		/* year */
	}else
		memmove(buf, t+4, 12);		/* skip day of week */
	buf[12] = 0;
	return buf;
}

static void
dols(char *dir)
{
	Dir	*d;
	char	*f, *p;
	long	i, n;
	int	fd;

	cleanname(dir);
	if (!allowed(dir)) {
		error("Permission denied", "Cannot list directory %H: Access prohibited", dir);
		return;
	}
	fd = open(dir, OREAD);
	if (fd < 0) {
		error("Cannot read directory",
		    "<p>Cannot read directory %H: %r</p>\n", dir);
		return;
	}
	if (vermaj) {
		hokheaders(connect);
		hprint(hout, "Content-type: text/html\r\n");
		hprint(hout, "\r\n");
	}
	hprint(hout, "<html>\n");
	hprint(hout, "<head><title>Index of %H</title></head>\n", dir);
	hprint(hout, "<body>\n");
	hprint(hout, "<h1>Index of %H</h1>\n", dir);
	n = dirreadall(fd, &d);
	close(fd);
	maxwidths(d, n);
	qsort(d, n, sizeof(Dir), (int (*)(void *, void *))compar);
	hprint(hout, "<pre>\n");
	for (i = 0; i < n; i++) {
		f = smprint("%s/%s", dir, d[i].name);
		cleanname(f);
		if (d[i].mode & DMDIR) {
			p = smprint("/magic/webls?dir=%H", f);
			free(f);
			f = p;
		}
		hprint(hout, "%M %C %*ud %-*s %-*s %*lld %s <a href=\"%s\">%s</a>\n",
		    d[i].mode, d[i].type,
		    devwidth, d[i].dev,
		    uidwidth, d[i].uid,
		    gidwidth, d[i].gid,
		    lenwidth, d[i].length,
		    asciitime(d[i].mtime), f, d[i].name);
		free(f);
	}
	f = smprint("%s/..", dir);
	cleanname(f);
	hprint(hout, "\nGo to <a href=\"/magic/webls?dir=%H\">parent</a> directory\n", f);
	free(f);
	hprint(hout, "</pre>\n</body>\n</html>\n");
	hflush(hout);
	free(d);
}

/*
 * Handle unpacking the request in the URI and
 * invoking the actual handler.
 */
static void
dosearch(char *search)
{
	if (strncmp(search, "dir=", 4) == 0){
		search = hurlunesc(connect, search+4);
		dols(search);
		return;
	}

	/*
	 * Otherwise, we've gotten an illegal request.
	 * spit out a non-apologetic error.
	 */
	search = hurlunesc(connect, search);
	error("Bad directory listing request",
	    "<p>Illegally formatted directory listing request:</p>\n"
	    "<p>%H</p>\n", search);
}

void
main(int argc, char **argv)
{
	fmtinstall('H', httpfmt);
	fmtinstall('U', hurlfmt);
	fmtinstall('M', dirmodefmt);

	if(argc == 2){
		hinit(&houtb, 1, Hwrite);
		hout = &houtb;
		dols(argv[1]);
		exits(nil);
	}
	close(2);

	connect = init(argc, argv);
	hout = &connect->hout;
	vermaj = connect->req.vermaj;
	if(hparseheaders(connect, HSTIMEOUT) < 0)
		exits("failed");

	if(strcmp(connect->req.meth, "GET") != 0 && strcmp(connect->req.meth, "HEAD") != 0){
		hunallowed(connect, "GET, HEAD");
		exits("not allowed");
	}
	if(connect->head.expectother || connect->head.expectcont){
		hfail(connect, HExpectFail, nil);
		exits("failed");
	}

	okfd = open("/sys/lib/webls.allowed", OREAD);
	notokfd = open("/sys/lib/webls.denied", OREAD);

	bind("/usr/web", "/", MREPL);

	if(connect->req.search != nil)
		dosearch(connect->req.search);
	else
		error("Bad argument", "Need a search argument");
	hflush(hout);
	writelog(connect, "200 webls %ld %ld\n", hout->seek, hout->seek);
	exits(nil);
}
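
Both loops in allowed() clean each line the same way: cut at the first
'#', then drop trailing white space.  As a possible refactoring, here is
a sketch of that step as a stand-alone helper in portable C.  The name
cleanline is hypothetical.  This version trims all the way back to the
first character, so a line holding only a newline or only a comment
comes out empty and can be skipped.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: strip a '#' comment and trailing white
 * space from line, in place.  Returns line, which may end up
 * empty.
 */
char *
cleanline(char *line)
{
	char *t;

	t = strchr(line, '#');
	if (t != NULL)
		*t = '\0';
	t = line + strlen(line);
	while (t > line && isspace((unsigned char)t[-1]))
		*--t = '\0';
	return line;
}
```

A caller would then test the result for an empty string before handing
it to the regular expression compiler.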

