caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Brian Hurt <brian.hurt@qlogic.com>
To: Ville-Pertti Keinonen <will@exomi.com>
Cc: Brian Hurt <brian.hurt@qlogic.com>, <caml-list@inria.fr>
Subject: Re: [Caml-list] Bug?  Printf, %X and negative numbers
Date: Tue, 1 Apr 2003 13:56:11 -0600 (CST)	[thread overview]
Message-ID: <Pine.LNX.4.33.0304011328530.2225-100000@eagle.ancor.com> (raw)
In-Reply-To: <6A2DCA04-6476-11D7-848D-000393863F70@exomi.com>

On Tue, 1 Apr 2003, Ville-Pertti Keinonen wrote:

> 
> > I was thinking of just a bitmask.  You are always allocating into the
> > minor heap.  You just define the low 1/32nd of the minor heap a 
> > bitmask of
> > what is a pointer and what isn't.  This would probably slow down
> > allocation- not by much, would be my prediction, but it would slow down
> > allocation.  You may gain some of that cost back by not needing to do 
> > the
> > shifts and ors we currently do to deal with the low bit.  I can't 
> > predict
> > what the overall performance delta would be- not even if it will be
> > positive or negative.
> 
> I suspect that would be quite slow, since you have to do one bit 
> operation for each word of each allocation (optimizing them is 
> difficult because you don't know the alignment).

For structures 32 words or less, you can do it like (using x86 assembly 
language here):
	movl	heap_top, %eax
	movl	%eax, %ecx
	andl	$31, %ecx
	movl	%eax, %ebx
	shrw	$5, %bx	; /* partial register stall- oh well */
	movl	$bit_pattern, %esi
	xorl	%edi, %edi
	shldl	%cl, %esi, %edi
	shll	%cl, %esi
	orl	%esi, (%ebx)
	orl	%edi, 4(%ebx)

10 instructions, no branchs, only one partial register stall (which can
probably be removed at the cost of four more instructions), only one
'complex' instruction (the shld) which might take more than a clock cycle.  
Some rearrangement could probably give us 2, maybe 3 instructions issued
per clock.  We're probably looking at single digit clock cycles here.  
Note that I'm assuming the minor heap is 64K (removing the partial stall 
would also solve this).

On the other side of this equation, we have the overhead of dealing with 
the low bit being set.  This does have a performance cost.  How much of 
one depends upon what you're doing with the ints.  But we're not talking 
about more than an instruction or two per access.  With few accesses per 
allocation it's probably a loss- with more access per allocation, it's 
probably a win.  But either way, we're talking about single digit clock 
cycles.

By way of comparison, on the P4 a branch mispredict costs 20-28 clock 
cycles, a cache miss costs 100-300 clock cycles, and a page fault millions 
of clock cycles.

Brian


-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


  parent reply	other threads:[~2003-04-01 19:54 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-03-28 21:19 Brian Hurt
2003-03-28 22:21 ` Yutaka OIWA
2003-03-30  9:51 ` Xavier Leroy
2003-03-31 15:44   ` Brian Hurt
2003-03-31 17:13     ` Ville-Pertti Keinonen
2003-04-01  8:19     ` Xavier Leroy
2003-04-01 16:09       ` David Brown
2003-04-01 16:45       ` Tim Freeman
2003-04-01 18:59         ` Brian Hurt
2003-04-01 19:16           ` Ville-Pertti Keinonen
2003-04-01 19:23             ` Tim Freeman
2003-04-01 21:00               ` Ville-Pertti Keinonen
2003-04-01 19:56             ` Brian Hurt [this message]
2003-04-01 20:45               ` Ville-Pertti Keinonen
2003-04-01 21:03                 ` Brian Hurt
2003-04-02  8:55             ` Andreas Rossberg
2003-04-02  9:20               ` Ville-Pertti Keinonen
2003-04-01 18:34       ` Ville-Pertti Keinonen
2003-04-02 11:44 ` Claude Marche
2003-04-02 18:42 Gregory Morrisett
2003-04-02 21:12 ` Ville-Pertti Keinonen
2003-04-02 21:46   ` Lauri Alanko
2003-04-03 17:40     ` Ville-Pertti Keinonen
2003-04-04 16:14   ` Brian Hurt
2003-04-04 17:14     ` Ville-Pertti Keinonen
2003-04-04 17:27     ` Falk Hueffner
2003-04-03  0:52 ` brogoff
2003-04-03  9:29 Fabrice Le Fessant

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.33.0304011328530.2225-100000@eagle.ancor.com \
    --to=brian.hurt@qlogic.com \
    --cc=caml-list@inria.fr \
    --cc=will@exomi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).