mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Szabolcs Nagy <nsz@port70.net>
To: musl@lists.openwall.com
Subject: Re: Completeness status of musl
Date: Sun, 26 Jun 2011 12:30:58 +0200	[thread overview]
Message-ID: <20110626103057.GW27421@port70.net> (raw)
In-Reply-To: <20110625183106.GV27421@port70.net>

[-- Attachment #1: Type: text/plain, Size: 1198 bytes --]

* Szabolcs Nagy <nsz@port70.net> [2011-06-25 20:31:06 +0200]:
> tsearch:
>     i provided two different implementations: plain binary
>     tree (as used in bsd) and a self balancing avl tree
>     (glibc seems to use rb-tree)
> 
>     the avl tree implementation uses recursive functions
>     (max depth of an avl tree with 2^42 nodes is 60)

i was not sure about the performance of this
so i did some benchmarks now
it turns out my avl tree is a little bit faster at word
counting than glibc rb-tree

i had to change scanf("%s", buf) to custom getword(buf)
as musl scanf is slower than glibc's and it took
comparable amount of time to the actual tree operations

on a large text file (gcc info page ~ 30K different words)
musl vs glibc tsearch (best run out of several tries)

$ time ./words <gcc.info 
30617

real	0m0.261s
user	0m0.240s
sys	0m0.020s


$ time ./words-gnu <gcc.info 
30617

real	0m0.291s
user	0m0.284s
sys	0m0.008s


by comparision hsearch version gives

$ time ./words <gcc.info 
30617

real	0m0.116s
user	0m0.100s
sys	0m0.016s

(i could not test the glibc version of
hsearch as it segfaults after a while
even when allocate memory for every string
so i think it's a glibc bug)

[-- Attachment #2: words.c --]
[-- Type: text/x-csrc, Size: 869 bytes --]

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <search.h>

static void *root;

static int cmp(const void *a, const void *b) {
	return strcmp(a, b);
}

static char *sdup(char *s) {
	size_t n = strlen(s)+1;
	return memcpy(malloc(n), s, n);
}

static int add(char *s) {
	char **r = tsearch(s, &root, cmp);
	if (*r == s) {
		/* only do the string copy if s is a new string */
		*r = sdup(s);
		return 1;
	}
	return 0;
}

static int getword(char *s) {
	int c;

	for (;;) {
		c = getchar();
		if (c == EOF)
			return 0;
		if (!(c == ' ' || c == '\n' || c == '\r' || c == '\t'))
			break;
	}
	for (;;) {
		*s++ = c;
		c = getchar();
		if (c == EOF || c == ' ' || c == '\n' || c == '\r' || c == '\t')
			break;
	}
	*s = 0;
	return 1;
}

int main(void) {
	char buf[4096];
	int c = 0;

	while (getword(buf))
		c += add(buf);
	printf("%d\n", c);
	return 0;
}

[-- Attachment #3: words-hsearch.c --]
[-- Type: text/x-csrc, Size: 881 bytes --]

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <search.h>

static char *sdup(char *s) {
	size_t n = strlen(s)+1;
	return memcpy(malloc(n), s, n);
}

static int add(char *s) {
	ENTRY *e;
	e = hsearch((ENTRY){.key = s}, ENTER);
	if (e->key == s) {
		/* only do the string copy if s is a new string */
		e->key = sdup(s);
		return 1;
	}
	return 0;
}

static int getword(char *s) {
	int c;

	for (;;) {
		c = getchar();
		if (c == EOF)
			return 0;
		if (!(c == ' ' || c == '\n' || c == '\r' || c == '\t'))
			break;
	}
	for (;;) {
		*s++ = c;
		c = getchar();
		if (c == EOF || c == ' ' || c == '\n' || c == '\r' || c == '\t')
			break;
	}
	*s = 0;
	return 1;
}

int main(void) {
	char buf[4096];
	int c = 0;

	/* start with a small table so resize is tested */
	hcreate(64);
	while (getword(buf))
		c += add(buf);
//	hdestroy();
	printf("%d\n", c);
	return 0;
}

  reply	other threads:[~2011-06-26 10:30 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-28 23:41 Help assessing " Rich Felker
2011-05-29 16:19 ` gs
2011-05-29 16:44   ` gs
2011-05-29 17:32     ` gs
2011-05-29 18:08       ` Szabolcs Nagy
2011-05-30 10:30         ` Szabolcs Nagy
2011-05-30 15:47           ` Rich Felker
2011-05-30 17:27             ` Szabolcs Nagy
2011-06-08 23:58               ` Completeness " Rich Felker
2011-06-21  1:46                 ` Szabolcs Nagy
2011-06-23  6:54                   ` Rich Felker
2011-06-23 13:25                     ` Szabolcs Nagy
2011-06-23 14:16                       ` Szabolcs Nagy
2011-06-23 18:50                         ` Szabolcs Nagy
2011-06-24 12:12                           ` Szabolcs Nagy
2011-06-25 18:31                 ` Szabolcs Nagy
2011-06-26 10:30                   ` Szabolcs Nagy [this message]
2011-06-29 21:14                 ` Rich Felker
2011-07-05 15:40                   ` Hiltjo Posthuma
2011-06-18 21:32               ` Help assessing " Szabolcs Nagy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110626103057.GW27421@port70.net \
    --to=nsz@port70.net \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).