From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Fri, 5 Dec 2008 14:40:34 -0500
To: 9fans@9fans.net
Message-ID:
In-Reply-To: <4939815F.9020509@telus.net>
References: <13426df10812042239pde2100dw696049def0160c4a@mail.gmail.com> <39cb2be32e592403f7336c6200cf56a3@quanstro.net> <13426df10812051049j40b40b78u4ae74a3fc7df07a3@mail.gmail.com> <49397F3E.9070801@telus.net> <57cb40901c57600ac592ec15ccb1a687@coraid.com> <4939815F.9020509@telus.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] image/memimage speed
Topicbox-Message-UUID: 5b4262dc-ead4-11e9-9d60-3106f5b1d025

On Fri Dec 5 14:32:56 EST 2008, plalonde@telus.net wrote:
> But random access patterns suck at being speculatively cached.
> Linear access patterns still require reasonably careful work for the
> caching to do the right thing.
> Expecting your entire frame buffer to be cached in L2 isn't particularly
> reasonable.
>
> Paul

i'm just not convinced that nvidia's poor performance has
anything to do with pcie latency or processor stalls.
a 500x500 window takes ~1sec to uncover. that's like
2 billion instructions. since a cacheline is ~128 bytes (close enough)
that's ~8000 stall opportunities. if it takes all of them,
that's only 8 million instructions. on the order of 1/1000th
of the actual delay.

if WC were the issue, i should see 100x improvement in
reading from the card.

- erik
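
The back-of-envelope estimate above can be reproduced with a short sketch. Several inputs are assumptions that the message does not state: 4 bytes per pixel, roughly 1000 instructions lost per stall (the figure implied by going from ~8000 stalls to ~8 million instructions), and a machine that retires about 2 billion instructions per second. The function name `stall_budget` and its parameters are hypothetical, chosen only for illustration.

```python
import math


def stall_budget(width=500, height=500,
                 bytes_per_pixel=4,            # assumption: 32-bit pixels
                 cacheline_bytes=128,          # "close enough" figure from the message
                 instr_per_stall=1000,         # assumption: cost of one stall
                 instr_per_sec=2_000_000_000,  # "2 billion instructions" in ~1 sec
                 seconds=1.0):
    """Estimate how much of the observed delay cacheline stalls could explain."""
    window_bytes = width * height * bytes_per_pixel
    # One potential stall per cacheline touched while uncovering the window.
    stalls = math.ceil(window_bytes / cacheline_bytes)
    stall_instr = stalls * instr_per_stall
    # Instructions the processor could have retired during the observed delay.
    budget = instr_per_sec * seconds
    return stalls, stall_instr, stall_instr / budget


if __name__ == "__main__":
    stalls, stall_instr, fraction = stall_budget()
    print(f"stall opportunities: {stalls}")
    print(f"instructions lost to stalls: {stall_instr}")
    print(f"fraction of the delay explained: {fraction:.4f}")
```

Under these assumptions the stalls account for well under 1% of the one-second delay, which is the point of the argument: even if every cacheline access stalls, the stalls alone fall far short of explaining the slowdown. Changing `bytes_per_pixel` or `instr_per_stall` shifts the result by a small constant factor, not by the orders of magnitude needed.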