From mboxrd@z Thu Jan 1 00:00:00 1970
From: dexen deVries
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Date: Mon, 7 May 2012 12:01:55 +0200
Message-ID: <1663922.YALRc7vFOO@coil>
User-Agent: KMail/4.8.2 (Linux/3.4.0-rc3-l44; KDE/4.8.2; x86_64; ; )
In-Reply-To: <3E93FE94-76BC-4B38-9FB2-DEDC5C3CEF9E@quintile.net>
References: <3E93FE94-76BC-4B38-9FB2-DEDC5C3CEF9E@quintile.net>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"
Subject: Re: [9fans] integer width on AMD64 (was: Re: AMD64 system)
Topicbox-Message-UUID: 85276a7c-ead7-11e9-9d60-3106f5b1d025

On Monday 07 of May 2012 09:53:01 steve wrote:
> sorry for being vague.
>
> treating pixels as 64bit on amd64 as that is the natural size for the
> machine, vs using 32bits per pixel - 10 bits of r, g, and b or y, u, and v
> plus 2 spare leads to a significant speedup; where significant is a number
> lost in the mists of time.
>
> i believe this speedup is due to the reduction in the rate of cache line
> refills, as forsyth described.

on RISC, there's usually a significant penalty for accessing data units
smaller than the machine word (sub-word or `unaligned' access), but it ain't
so on the benevolent x86 CISC.

both handling pixel graphics and transferring to the graphics card are
special cases.

the speedup may be due to better prefetch during sequential memory access,
but a larger data size should not help much here. more data causes FSB and
PCIe contention, and cache thrashing. oops?

when i asked about int and long size on amd64, i was more concerned with
the ability to cast between pointer and integer, and with handling offsets
into large files.

-- 
dexen deVries

[[[↓][→]]]

Weightless and alone
you speed through the eerie nothingness of space
you circle 'round the Moon
and journey back to face
the punishing torment of re-entry

	-- LUNA-C, ``Supaset8 (full release)'', #24m52s
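
for reference, a minimal sketch of the 32-bit layout steve mentions (10 bits each of r, g and b, plus 2 spare). the bit order (r in the high bits, spare bits on top), the helper names and the 64-byte cache line are all assumptions made for illustration; none of them come from the thread or from any particular graphics library.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (hypothetical): [31:30] spare, [29:20] r, [19:10] g, [9:0] b. */
enum {
	CHAN_BITS = 10,
	CHAN_MASK = (1 << CHAN_BITS) - 1,	/* 0x3FF */
	CACHE_LINE = 64,			/* assumed cache line size in bytes */
	PIXELS_PER_LINE_32 = CACHE_LINE / sizeof(uint32_t),	/* 16 packed pixels */
	PIXELS_PER_LINE_64 = CACHE_LINE / sizeof(uint64_t)	/* 8 word-sized pixels */
};

/* Pack three 10-bit components into one 32-bit pixel; the 2 spare bits stay 0. */
static uint32_t
pack10(uint32_t r, uint32_t g, uint32_t b)
{
	/* Precondition: every component fits in 10 bits. */
	assert(r <= CHAN_MASK && g <= CHAN_MASK && b <= CHAN_MASK);
	return (r << (2 * CHAN_BITS)) | (g << CHAN_BITS) | b;
}

/* Extract the individual components again. */
static uint32_t
red10(uint32_t p)
{
	return (p >> (2 * CHAN_BITS)) & CHAN_MASK;
}

static uint32_t
green10(uint32_t p)
{
	return (p >> CHAN_BITS) & CHAN_MASK;
}

static uint32_t
blue10(uint32_t p)
{
	return p & CHAN_MASK;
}
```

under these assumptions one cache line holds twice as many packed pixels as 64-bit ones, which is the cache and bus traffic argument in the message above; whether the extra shifts and masks cost more than the saved refills is something only a measurement on the actual hardware can settle.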