From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Sun, 19 Nov 1995 10:31:35 -0500
From: Dan Hildebrand <danh@qnx.com>
Subject: Graphics issues
Topicbox-Message-UUID: 351badea-eac8-11e9-9e20-41e7f4b1d025
Message-ID: <19951119153135.EUFZ-ZyAajux20o9bRaBC1cRlMNZrh2nuW0fv693_x8@z>

In article <95Nov16.124635est.78461@colossus.cse.psu.edu>, <9fans@cse.psu.edu> wrote:

>In Brazil, we have demos of software doing 30 frame video in a 640x480
>window; actually, several such windows simultaneously.
>
>We reject methods that map the display because they cannot be
>implemented in a way that is portable to the application. Display
>addressing, byte order issues, pixel packing and so on are just too
>messy. You need one indirection to hide the details. Remember, we
>have many machines that are not PC's.

Exactly - in our implementation of the graphics drivers for Photon (a
microkernel GUI for QNX), the graphics drivers export a 24-bit color
model into which applications draw, relying on the graphics driver to
map the requested colors (i.e., color select, dither, etc.) onto the
capabilities of the display hardware. This is especially important as
windows are dragged from desktop to desktop across a network, since the
color capabilities of each machine could be very different.

In the case of Doom, each frame submitted to the driver contains a
palette, and rather than hitting the hardware palette, the driver does a
closest match for all the colors in the frame. It turns out that this
calculation can be done more quickly than the hardware palette can be
reprogrammed, and as a result we actually get frame rates under Photon
that rival the raw-VGA frame rates.

>Somewhat closer to this home, Carmack's observation that the bitmap
>read/write protocol should use rectangles rather than whole scan lines
>is right, and already part of Brazil. Our solution was to require the
>scan lines of the rectangle to start and end at byte boundaries, not
>pixels, so that a general bitblt is not called for.
>Using memmove and a little care, the performance can be good. On many
>machines, but not SPARC, unaligned memmove can be as fast as aligned
>because of special instructions or silicon, so the quantization to
>bytes is good enough.

Exactly. In addition, with display hardware appearing that allows one or
more clip rectangles to be "pushed" into the hardware before rendering
the image, we can maximize the use of the display hardware's
capabilities. It's up to the graphics driver either to use the hardware
or, if the hardware is lacking, to make up the difference in software.

Another capability we want graphics drivers to be able to take advantage
of is color space conversion. It becomes possible to place the
application's representation of the frame directly into off-screen
memory on the video card, and then let the video card do the "stretch
blit" as necessary to convert the off-screen representation into one
that agrees with the current display mode. As a result, it becomes
unnecessary to export the hardware representation of the video memory.

>As for events, I side with dhog. The Plan 9 model is general and easy
>to get right. If you need special support - and there is no doubt
>that games do - I would suggest providing a connection to the raw
>devices underneath, and synthesizing whatever else is needed at user
>level. That is, rather than build special devices that fold together
>multiple devices and demand non-universal hardware features (e.g. key
>up), it is better to build devices in the kernel that export the
>hardware interface as directly as is practical, and encapsulate the
>special properties and desires in adaptable, user-level code. With
>some care, even the window system could pass such devices through
>cleanly.

My impression is that Plan 9's approach has always been that rather than
implement new facilities, it is better to fix the ones that exist so
that new services aren't necessary.
With process-to-process IPC sufficiently fast, for example, threads
aren't needed to solve the slow-IPC problem that some OSes demonstrate.
Threads may be useful for other reasons, but achieving fast context
switches doesn't need to be one of them.

>Performance is not critical here: with human-driven input,
>the extra context switch and system call required would be
>insignificant on modern machines, especially when compared to the
>generation of a 30Hz image.

We still have to be careful here, because "human-driven" input will
increase in bandwidth as user-interface expectations increase. For
example, handwriting recognition and voice input are certainly
high-bandwidth. If we want to be able to deal with these input devices
within the GUI, the GUI must provide fast and efficient event mechanisms
to pass this data between the services processing it.

--
Dan Hildebrand (danh@qnx.com)        QNX Software Systems, Ltd.
http://www.qnx.com/~danh             175 Terence Matthews
phone: (613) 591-0931 (voice)        Kanata, Ontario, Canada
       (613) 591-3579 (fax)          K2M 1W8