From mboxrd@z Thu Jan 1 00:00:00 1970 From: arisawa Content-Type: multipart/mixed; boundary="Apple-Mail=_71C15674-63FA-47AE-83CB-99F2B8D17CFB" Message-Id: <0DF998AF-3534-47C6-83A9-9973FAB55ECF@gmail.com> Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Date: Thu, 13 Apr 2017 12:04:21 +0900 References: <167A2E66-33F8-4909-8FDB-06F92E930D8B@gmail.com> <795C88A6-B6FE-4B29-AD74-50A0E4D80D97@orthanc.ca> <6C339BCD-FF90-4C03-9AC9-D4098E63EDD8@gmail.com> <789A602E-FC47-40AE-96F4-64E2D2A5B375@orthanc.ca> <4B168609-E25A-4D3C-9A80-1457BA0EE830@gmail.com> <20170407234654.40b4e23976d220cea4ab3daf@eigenstate.org> <1491636459.3063398.938223400.7B35C566@webmail.messagingengine.com> <1491642434.3078770.938277904.69229AAA@webmail.messagingengine.com> <01A91DF6-C9BC-46FD-85EE-A978410FDAB1@gmail.com> <7D62156B-8EFC-4AB4-9FC9-F8315D4706C0@gmail.com> <1491980708.58243.942088184.4DEC0B7B@webmail.messagingengine.com> To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net> In-Reply-To: <1491980708.58243.942088184.4DEC0B7B@webmail.messagingengine.com> Subject: Re: [9fans] p9p: 9 ls /dev Topicbox-Message-UUID: b9e3b714-ead9-11e9-9d60-3106f5b1d025 --Apple-Mail=_71C15674-63FA-47AE-83CB-99F2B8D17CFB Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii Hello David, thanks for the response. > it may not be worth it I agree. I think it is wise to give up to simulate dirread() rigorously. plan9 is = not unix. > Ori's suggestion to use Getdirentries64 on OSX might be better I cannot find getdirentries64() on my OSX. I examined david code and found the code have still some problems. (1) some directory entries are lost. (2) uid and gid is wrong. the test program is attached here. --Apple-Mail=_71C15674-63FA-47AE-83CB-99F2B8D17CFB Content-Disposition: attachment; filename=test_dirread.c Content-Type: application/octet-stream; name="test_dirread.c" Content-Transfer-Encoding: 7bit /* compile on p9p * 9c dirread.c * 9c test_dirread.c && 9l -o o.test_dirread test_dirread.o dirread.o */ #include #include #define DBG if(0) #define fatal sysfatal void usage(void) { fprint(2,"usage: %s [-a] dir\n",argv0); exits(nil); } void test2(char *dirname,int flag) { int fd; Dir *db; int i,n; fd = open(dirname,OREAD); if(fd < 0) fatal("%s open error",dirname); if(flag){ n = dirreadall(fd, &db); if(n < 0) fatal("dirread error"); for(i = 0; i < n; i++) print("%s %s %s %uo %uld\n",\ db[i].name, db[i].uid, db[i].gid, db[i].mode,db[i].length); //free(db); } else{ while((n = dirread(fd, &db)) > 0){ DBG print("#DBG main: n=%d\n",n); for(i = 0; i < n; i++) print("%s %s %s %uo %uld\n",\ db[i].name, db[i].uid, db[i].gid, db[i].mode,db[i].length); } //free(db); } close(fd); } int aflag; char *ofile; void main(int argc, char *argv[]) { char *dirname; ARGBEGIN{ case 'a': aflag = 1; break; case 'o': ofile = EARGF(usage()); break; case '?': default: usage(); } ARGEND; if(ofile) print("%s\n", ofile); if(argc == 0) usage(); dirname = argv[0]; test2(dirname,aflag); exits(nil); } --Apple-Mail=_71C15674-63FA-47AE-83CB-99F2B8D17CFB Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii the result is shown below. # # ls (official ls in osx) # -bash$ ls /usr/bin|wc 1108 1108 9719 -bash$ ls /dev|wc 352 352 2428 # # david # -bash$ mk -f mkfile_david 9c -D__DARWIN_64_BIT_INO_T test_dirread.c 9c -D__DARWIN_64_BIT_INO_T dirread_david2.c 9l -o o.test_dirread test_dirread.o dirread_david2.o -bash$ o.test_dirread . .DS_Store root 501 416 0 _mp9dir.c root 501 200000320 0 _p9dir.c root 501 20000000106 0 a.out root 501 710 0 a1.c root 501 572 0 b1.c root 501 243 0 dirread.c.orig root 501 743 0 dirread1.c root 501 137 0 dirread_david.c root 501 473 0 ... -bash$ o.test_dirread /usr/bin |wc 770 3850 20936 -bash$ o.test_dirread -a /usr/bin |wc 770 3850 20936 -bash$ o.test_dirread /dev |wc 320 1600 7739 -bash$ o.test_dirread -a /dev |wc 320 1600 7739 -bash$ # # mine # -bash$ mk 9c -D__DARWIN_64_BIT_INO_T test_dirread.c 9c -D__DARWIN_64_BIT_INO_T mdirread2.c 9c -D__DARWIN_64_BIT_INO_T _mp9dir.c 9l -o o.test_dirread test_dirread.o mdirread2.o _mp9dir.o -bash$ o.test_dirread . .DS_Store arisawa staff 644 6148 _mp9dir.c arisawa staff 644 4948 _mp9dir.o arisawa staff 644 10284 _p9dir.c arisawa staff 444 4733 a.out arisawa staff 755 9552 a1.c arisawa staff 644 741 b1.c arisawa staff 644 2390 dirread.c.orig arisawa staff 444 3699 dirread1.c arisawa staff 644 3790 dirread_david.c arisawa staff 644 3768 ... -bash$ o.test_dirread /usr/bin |wc 1108 5540 33034 -bash$ o.test_dirread -a /usr/bin |wc 1108 5540 33034 -bash$ o.test_dirread /dev |wc 352 1760 10308 -bash$ o.test_dirread -a /dev |wc 352 1760 10308 -bash$ David idea is nice. I borrowed the idea: dir = fdopendir(dup(fd)); in mdirread2.c --Apple-Mail=_71C15674-63FA-47AE-83CB-99F2B8D17CFB Content-Disposition: attachment; filename=mkfile Content-Type: application/octet-stream; name="mkfile" Content-Transfer-Encoding: 7bit <$PLAN9/src/mkhdr TARG=test_dirread test_dirent OFILES=\ test_dirread.$O\ mdirread2.$O\ _mp9dir.$O\ #HFILES=a.h %.$O: %.c 9c -o $target $stem.c $O.test_dirread: test_dirread.$O mdirread2.$O _mp9dir.$O 9l -o $target $stem.$O <$PLAN9/src/mkone #<$PLAN9/src/mkmany CFLAGS=-D__DARWIN_64_BIT_INO_T --Apple-Mail=_71C15674-63FA-47AE-83CB-99F2B8D17CFB Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii --Apple-Mail=_71C15674-63FA-47AE-83CB-99F2B8D17CFB Content-Disposition: attachment; filename=mdirread2.c Content-Type: application/octet-stream; name="mdirread2.c" Content-Transfer-Encoding: 7bit #include #define NOPLAN9DEFINES #include #include #include #include #define DBG if(0) #define fatal sysfatal extern int _p9dir(struct stat*, struct stat*, char*, Dir*, char**, char*); #if defined(__DragonFly__) static inline int d_reclen(struct dirent *de) { return _DIRENT_DIRSIZ(de); } #else static inline int d_reclen(struct dirent *de) { return de->d_reclen; } #endif static int countde(char *p, int n) { char *e; int m; struct dirent *de; e = p+n; m = 0; while(p < e){ de = (struct dirent*)p; if(d_reclen(de) <= 4+2+2+1 || p+d_reclen(de) > e) break; if(de->d_name[0]=='.' && de->d_name[1]==0) de->d_name[0] = 0; else if(de->d_name[0]=='.' && de->d_name[1]=='.' && de->d_name[2]==0) de->d_name[0] = 0; m++; p += d_reclen(de); } return m; } static int dirpackage(int fd, char *buf, int n, Dir **dp) { int oldwd; char *p, *str, *estr; int i, nstr, m; struct dirent *de; struct stat st, lst; Dir *d; n = countde(buf, n); if(n <= 0) return n; if((oldwd = open(".", O_RDONLY)) < 0) return -1; if(fchdir(fd) < 0) return -1; p = buf; nstr = 0; for(i=0; id_name[0] == 0) /* nothing */ {} else if(lstat(de->d_name, &lst) < 0) de->d_name[0] = 0; else{ st = lst; if(S_ISLNK(lst.st_mode)) stat(de->d_name, &st); nstr += _p9dir(&lst, &st, de->d_name, nil, nil, nil); } p += d_reclen(de); } d = malloc(sizeof(Dir)*n+nstr); if(d == nil){ fchdir(oldwd); close(oldwd); return -1; } str = (char*)&d[n]; estr = str+nstr; p = buf; m = 0; for(i=0; id_name[0] != 0 && lstat(de->d_name, &lst) >= 0){ st = lst; if((lst.st_mode&S_IFMT) == S_IFLNK) stat(de->d_name, &st); _p9dir(&lst, &st, de->d_name, &d[m++], &str, estr); } p += d_reclen(de); } fchdir(oldwd); close(oldwd); *dp = d; return m; } /* return value * -1: on error * 0: on eof * others: filled bytes */ int mreaddir(DIR *dirp, char *buf, int m) { /* m is sizeof buf. must be > sizeof(struct dirent) */ struct dirent *ep; int n,status; n = 0; for(;;){ ep = (struct dirent *)(buf+n); status = readdir_r(dirp, ep, &ep); if(status || (ep == NULL)) break; if(strcmp(ep->d_name,".") == 0 || strcmp(ep->d_name,"..") == 0) continue; DBG print("R %d %d %s\n",ep->d_namlen,ep->d_reclen,ep->d_name); n += ep->d_reclen; if(m -n < sizeof(struct dirent)) break; } if(status) return -1; return n; } long dirreadall(int fd, Dir **dp) { static DIR *dir; struct dirent *ep; char *p,*buf,*bufend,*nbuf; int m, n, bufsz; int dentsz = sizeof(struct dirent); int allcsz = 1024;/* allocation unit */ bufsz = allcsz + dentsz;/* what is the best value? */ DBG print("#DBG mdirread begin\n"); buf = malloc(bufsz); if(buf == nil) fatal("# malloc"); dir = fdopendir(dup(fd)); if(dir == nil) fatal("fdopendir"); n = 0; for(;;){ m = mreaddir(dir, buf+n, bufsz-n); if(m == 0) break; n += m; if(bufsz - n < dentsz){ bufsz += allcsz; nbuf = realloc(buf, bufsz); if(nbuf == nil){ free(buf); closedir(dir); return -1; } buf = nbuf; } } /* then n is num of effective bytes in buf */ bufend = buf + n; for(p = buf; p < bufend;){ ep = (struct dirent *)p; DBG print("C %d %d %s\n",ep->d_namlen,ep->d_reclen,ep->d_name); p += ep->d_reclen; } n = dirpackage(fd, buf, n, dp); free(buf); closedir(dir); return n; } long dirread(int fd, Dir **dp) { return dirreadall(fd,dp); } --Apple-Mail=_71C15674-63FA-47AE-83CB-99F2B8D17CFB Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii --Apple-Mail=_71C15674-63FA-47AE-83CB-99F2B8D17CFB Content-Disposition: attachment; filename=_mp9dir.c Content-Type: application/octet-stream; name="_mp9dir.c" Content-Transfer-Encoding: 7bit #include #define NOPLAN9DEFINES #include #include #include #include #include #include #define DMBLOCK 0x00800000 /* same as DMDEVICE */ #define DMCHAR 0x00400000 /* we have room for new definition */ #if defined(__APPLE__) #define _HAVESTGEN #include static vlong disksize(int fd, struct stat *st) { u64int bc; u32int bs; bs = 0; bc = 0; ioctl(fd, DKIOCGETBLOCKSIZE, &bs); ioctl(fd, DKIOCGETBLOCKCOUNT, &bc); if(bs >0 && bc > 0) return bc*bs; return 0; } #elif defined(__FreeBSD__) #define _HAVESTGEN #include #include #include static vlong disksize(int fd, struct stat *st) { off_t mediasize; if(ioctl(fd, DIOCGMEDIASIZE, &mediasize) >= 0) return mediasize; return 0; } #elif defined(__OpenBSD__) #define _HAVESTGEN #include #include #include static vlong disksize(int fd, struct stat *st) { struct disklabel lab; int n; if(!S_ISCHR(st->st_mode)) return 0; if(ioctl(fd, DIOCGDINFO, &lab) < 0) return 0; n = minor(st->st_rdev)&7; if(n >= lab.d_npartitions) return 0; return (vlong)lab.d_partitions[n].p_size * lab.d_secsize; } #elif defined(__linux__) #include #include #include #undef major #define major(dev) ((int)(((dev) >> 8) & 0xff)) static vlong disksize(int fd, struct stat *st) { u64int u64; long l; struct hd_geometry geo; memset(&geo, 0, sizeof geo); l = 0; u64 = 0; #ifdef BLKGETSIZE64 if(ioctl(fd, BLKGETSIZE64, &u64) >= 0) return u64; #endif if(ioctl(fd, BLKGETSIZE, &l) >= 0) return l*512; if(ioctl(fd, HDIO_GETGEO, &geo) >= 0) return (vlong)geo.heads*geo.sectors*geo.cylinders*512; return 0; } #else static vlong disksize(int fd, struct stat *st) { return 0; } #endif int _p9usepwlibrary = 1; /* * Caching the last group and passwd looked up is * a significant win (stupidly enough) on most systems. * It's not safe for threaded programs, but neither is using * getpwnam in the first place, so I'm not too worried. */ int _p9dir(struct stat *lst, struct stat *st, char *name, Dir *d, char **str, char *estr) { char *s; char tmp[20]; static struct group *g; static struct passwd *p; int sz, fd; fd = -1; USED(fd); sz = 0; if(d) memset(d, 0, sizeof *d); /* name */ s = strrchr(name, '/'); if(s) s++; if(!s || !*s) s = name; if(*s == '/') s++; if(*s == 0) s = "/"; if(d){ if(*str + strlen(s)+1 > estr) d->name = "oops"; else{ strcpy(*str, s); d->name = *str; *str += strlen(*str)+1; } } sz += strlen(s)+1; /* user */ p = getpwuid(st->st_uid); if(p == nil){ snprint(tmp, sizeof tmp, "%d", (int)st->st_uid); s = tmp; } else s = p->pw_name; sz += strlen(s)+1; if(d){ if(*str+strlen(s)+1 > estr) d->uid = "oops"; else{ strcpy(*str, s); d->uid = *str; *str += strlen(*str)+1; } } /* group */ g = getgrgid(st->st_gid); if(g == nil){ snprint(tmp, sizeof tmp, "%d", (int)st->st_gid); s = tmp; } else s = g->gr_name; sz += strlen(s)+1; if(d){ if(*str + strlen(s)+1 > estr) d->gid = "oops"; else{ strcpy(*str, s); d->gid = *str; *str += strlen(*str)+1; } } if(d){ d->type = 'M'; d->muid = ""; d->qid.path = st->st_ino; /* * do not include st->st_dev in path, because * automounters give the same file system different * st_dev values for successive mounts, causing * spurious write warnings in acme and sam. d->qid.path |= (uvlong)st->st_dev<<32; */ #ifdef _HAVESTGEN d->qid.vers = st->st_gen; #endif if(d->qid.vers == 0) d->qid.vers = st->st_mtime + st->st_ctime; d->mode = st->st_mode&0777; d->atime = st->st_atime; d->mtime = st->st_mtime; d->length = st->st_size; if(S_ISLNK(lst->st_mode)){ /* yes, lst not st */ d->mode |= DMSYMLINK; d->length = lst->st_size; } else if(S_ISDIR(st->st_mode)){ d->length = 0; d->mode |= DMDIR; d->qid.type = QTDIR; } else if(S_ISFIFO(st->st_mode)) d->mode |= DMNAMEDPIPE; else if(S_ISSOCK(st->st_mode)) d->mode |= DMSOCKET; #ifdef P9P_CMODE else if(S_ISBLK(st->st_mode)){ d->mode |= DMDEVICE; d->qid.path = ('b'<<16)|st->st_rdev; } else if(S_ISCHR(st->st_mode)){ d->mode |= DMDEVICE; d->qid.path = ('c'<<16)|st->st_rdev; } #else else if(S_ISBLK(st->st_mode)){ d->mode |= DMBLOCK; d->qid.path = ('b'<<16)|st->st_rdev; } else if(S_ISCHR(st->st_mode)){ d->mode |= DMCHAR; d->qid.path = ('c'<<16)|st->st_rdev; } #endif /* fetch real size for disks * we should avoid opening char device * because it can make trouble */ if(S_ISBLK(lst->st_mode)){ if((fd = open(name, O_RDONLY)) >= 0){ d->length = disksize(fd, st); close(fd); } } else if(S_ISCHR(lst->st_mode)) d->length = 0; } return sz; } --Apple-Mail=_71C15674-63FA-47AE-83CB-99F2B8D17CFB Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 NOTE on _p9dir.c (A) d->length I observed opening character device can make a problem. I think length of those device is allowed to assign 0. if(S_ISBLK(lst->st_mode) || S_ISCHR(lst->st_mode)){ if((fd =3D open(name, O_RDONLY)) >=3D 0){ d->length =3D disksize(fd, st); close(fd); } } (B) DMDEVICE you will find the code in _p9dir.c else if(S_ISBLK(st->st_mode)){ d->mode |=3D DMDEVICE; d->qid.path =3D ('b'<<16)|st->st_rdev; } else if(S_ISCHR(st->st_mode)){ d->mode |=3D DMDEVICE; d->qid.path =3D ('c'<<16)|st->st_rdev; } but I prefer else if(S_ISBLK(st->st_mode)){ d->mode |=3D DMBLOCK; d->qid.path =3D ('b'<<16)|st->st_rdev; } else if(S_ISCHR(st->st_mode)){ d->mode |=3D DMCHAR; d->qid.path =3D ('c'<<16)|st->st_rdev; } (C) I fixed some codes that produce wrong uid and gid. comments welcome > 2017/04/12 16:05=E3=80=81David Arroyo = =E3=81=AE=E3=83=A1=E3=83=BC=E3=83=AB=EF=BC=9A >=20 > The problem is twofold; >=20 > * The function exits early if it can't read a max-length directory > entry, so that entry is "skipped" in subsequent calls to mygetdents. > * As you said, we also lose any buffered dirents that haven't been > returned from readdir yet. >=20 > I think these are both fixable problems, but it may not be worth it. > Ori's suggestion to use Getdirentries64 on OSX might be better. >=20 > David >=20 > On Tue, Apr 11, 2017, at 10:33 PM, arisawa wrote: >> I did more test on david code and found a problem. >>=20 >> -bash$ mk -f mkfile_david >> -bash$ o.test_dirread -a /usr/bin |wc >> 1084 4336 27266 >> -bash$ o.test_dirread /usr/bin |wc >> 1084 4336 27266 >> -bash$ ls /usr/bin |wc >> 1108 1108 9719 >>=20 >> option -a is for dirreadall. >> 1108 - 1084 entries are lost. >>=20 >> they are: >> o.test_dirread /usr/bin | awk '{print $1}' | sort >/tmp/a >> ls /usr/bin | sort >/tmp/b >> diff /tmp/a /tmp/b >>=20 >> -bash$ diff /tmp/a /tmp/b >> 21a22 >>> SplitForks >> 240a242,246 >>> easy_install-2.6 >>> easy_install-2.7 >>> efax >>> efix >>> egrep >> 461a468,473 >>> kcc >>> kdestroy >>> kextutil >>> keytool >>> kgetcred >>> kill.d >> 675a688,694 >>> piconv >>> piconv5.16 >>> piconv5.18 >>> pidpersec.d >>> pkgbuild >>> pkill >>> pl >> 880a900,904 >>> spfquery5.18 >>> splain >>> splain5.16 >>> splain5.18 >>> split >>=20 >> sorry if I make a mistake, but I suspect readdir_r() has a buffer, >> which can make a problem in using dup(). >>=20 >> Kenji Arisawa. >>=20 >>> 2017/04/09 13:18=E3=80=81arisawa = =E3=81=AE=E3=83=A1=E3=83=BC=E3=83=AB=EF=BC=9A >>>=20 >>> thanks david. >>>=20 >>> using dup() is very nice idea! >>> your code works with >>> CFLAGS=3D-D__DARWIN_64_BIT_INO_T # manual is wrong >>> and a fix: >>> // buf =3D ((char*)buf) + d_reclen(buf); >>> buf =3D (struct dirent *)(((char*)buf) + d_reclen(buf)); >>> and adding >>> #define NAME_MAX 256 >>> in somewhere. >>>=20 >>> now /dev is readable. >>>=20 >>> one problem is left. >>>=20 >>> my test code: >>> fd =3D open(dirname,OREAD); >>> if(fd < 0) >>> fatal("%s open error",dirname); >>> while((n =3D dirread(fd, &db)) > 0){ >>> print("#DBG n=3D%d\n",n); >>> for(i =3D 0; i < n; i++) >>> print("%s %s %s \n", db[i].name, db[i].uid, = db[i].gid); >>> } >>> close(fd); >>> shows for dirname=3D$HOME >>> ... >>> arch root 501=20 >>> bin root 501=20 >>> ... >>> but they should be >>> arch arisawa staff=20 >>> bin arisawa staff >>> this problem comes from _p9dir() that is used in dirpackage(). >>>=20 >>> Kenji Arisawa >>>=20 >>>=20 >>>> 2017/04/08 18:07=E3=80=81David Arroyo = =E3=81=AE=E3=83=A1=E3=83=BC=E3=83=AB=EF=BC=9A >>>>=20 >>>> Ignore my previous post, I was tired and forgot about dup(). How = about >>>> something like this? (attached) >>>>=20 >>>> I only tested this on Ubuntu, I don't have an OS X machine. I still = went >>>> with readdir_r because the AIX and Solaris man pages for readdir = were >>>> vague about its behavior when called from multiple threads (glibc, = musl, >>>> FreeBSD look pretty safe). >>>>=20 >>>> Cheers, >>>> David >>>>=20 >>>> On Sat, Apr 8, 2017, at 03:27 AM, David Arroyo wrote: >>>>> This should be doable with some combination of fdopendir(3) and >>>>> readdir(3). I'm not sure how to avoid leaking memory through the >>>>> returned DIR pointer and any memory allocated with by readdir(3). >>>>> This is usually free'd by closedir(3), which we can't use without >>>>> closing the underlying file. >>>>>=20 >>>>> It should be OK to use free() on the return value of fdopendir, = and >>>>> stick to the uglier readdir_r(3) interface. I can definitely see = why >>>>> Russ went with the simpler system-specific interfaces on this. >>>>>=20 >>>>> David >>>>>=20 >>>>> On Sat, Apr 8, 2017, at 02:46 AM, Ori Bernstein wrote: >>>>>> On Sat, 8 Apr 2017 15:21:47 +0900, arisawa = wrote: >>>>>>=20 >>>>>>> but how to? >>>>>>>=20 >>>>>>> unix doesn=E2=80=99t have something like fdreaddir(int fd). >>>>>>> my guess: russ unwillingly used a low level function such as >>>>>>> int getdirentries(int fd, char *buf, int nbytes, long *basep). >>>>>>>=20 >>>>>>> readdirall() might be OK in regular usage. >>>>>>=20 >>>>>> I don't use OSX regularly, although I do maintain the syscall >>>>>> layer for Myrddin on it. >>>>>>=20 >>>>>> Getdirentries64 exists, and rudimentary testing doesn't show >>>>>> any difficulties with using it. >>>>>>=20 >>>>>> --=20 >>>>>> Ori Bernstein >>>>>>=20 >>>>>=20 >>>> >>>=20 >>=20 >>=20 > --Apple-Mail=_71C15674-63FA-47AE-83CB-99F2B8D17CFB--