9front - general discussion about 9front
* I'm giving up... hwcursor
@ 2016-05-21  2:04 kokamoto
  2016-05-22 11:56 ` [9front] " cinap_lenrek
  0 siblings, 1 reply; 30+ messages in thread
From: kokamoto @ 2016-05-21  2:04 UTC (permalink / raw)
  To: 9front

I changed the /sys/src/9/pc/vgaigfx.c like:

in preallocsize() function
	switch(p->did){
	...
	case 0x2a42:	/* X200 */
+	case 0x2a02:	/* CF-R7 */
	...

in igfxcurregs() function
	case 0x2a42:	/* X200 */
+	case 0x2a02:	/* CF-R7 */

After reading the programming manual and reference manual of GM965,
I believe the other parts of cinap's code are very nice and right.
He treats the cursor as 64x64x32bpp mode.
(cinap's code is too clever and compact for me to understand
in a short time)

Then I got a mouse cursor, however, its shape is strange, like
X
X
X
X
where X indicates a reasonable cursor shape. The four vertical
cursors always behave as one cursor.
Four 16 dot cursors in a 64 dot cursor... hmmm...

Is this a bug in the manual or the chip?

Kenji



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [9front] I'm giving up... hwcursor
  2016-05-21  2:04 I'm giving up... hwcursor kokamoto
@ 2016-05-22 11:56 ` cinap_lenrek
  2016-05-23  0:21   ` kokamoto
  2016-05-23  0:39   ` kokamoto
  0 siblings, 2 replies; 30+ messages in thread
From: cinap_lenrek @ 2016-05-22 11:56 UTC (permalink / raw)
  To: 9front

that looks like the cursor control register was not updated.

the plan9 cursors are 2x 16x16 1bpp, while we program the hw cursor
to 64x64 32bpp as you said (set bits 2:0 to 111 and bit 5 to 0 in
CURCTL -> 64x64 32bpp AND/XOR mode).

can you put a coherence() call after r[CURCTL] is written? the
cursor registers are double buffered and armed by a write to
the base register. so this might prevent a reordering issue.

static void
igfxcurenable(VGAscr* scr)
{
	u32int *r;
	int i;

	igfxenable(scr);
	igfxcurload(scr, &arrow);
	igfxcurmove(scr, ZP);

	for(i=0; i<NPIPE; i++){
		if((r = igfxcurregs(scr, i)) != nil){
			r[CURCTL] = (r[CURCTL] & ~(3<<28 | 1<<5)) | (i<<28) | 7;
			coherence();	// <--- make sure CURCTL is written first
			r[CURBASE] = scr->storage;
		}
	}
}
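
The CURCTL expression in the loop above can be checked on its own. What follows is a minimal sketch in portable C, assuming only what the snippet shows (pipe select in bits 29:28, mode in bit 5 and bits 2:0). The name `curctl_value` is a hypothetical helper, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the expression in igfxcurenable():
 * clear the pipe-select field (bits 29:28) and mode bit 5, then
 * select the pipe and set mode bits 2:0 to 111.
 */
uint32_t
curctl_value(uint32_t old, unsigned pipe)
{
	uint32_t clear = (UINT32_C(3) << 28) | (UINT32_C(1) << 5);

	assert(pipe <= 3);	/* the field is only two bits wide */
	return (old & ~clear) | ((uint32_t)pipe << 28) | 7;
}
```

Starting from a zeroed register, pipe index 1 (pipe B) yields 0x10000007, and any previously set bit 5 is cleared while unrelated bits are preserved.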

--
cinap



* Re: [9front] I'm giving up... hwcursor
  2016-05-22 11:56 ` [9front] " cinap_lenrek
@ 2016-05-23  0:21   ` kokamoto
  2016-05-23  0:39   ` kokamoto
  1 sibling, 0 replies; 30+ messages in thread
From: kokamoto @ 2016-05-23  0:21 UTC (permalink / raw)
  To: 9front

No change is visible.
I'm an old man, and not familiar with the pipes on this chip.
In this case, we always see pipe B, not pipe A.
Is this normal?

Kenji

> static void
> igfxcurenable(VGAscr* scr)
> {
> 	u32int *r;
> 	int i;
> 
> 	igfxenable(scr);
> 	igfxcurload(scr, &arrow);
> 	igfxcurmove(scr, ZP);
> 
> 	for(i=0; i<NPIPE; i++){
> 		if((r = igfxcurregs(scr, i)) != nil){
> 			r[CURCTL] = (r[CURCTL] & ~(3<<28 | 1<<5)) | (i<<28) | 7;
> 			coherence();	// <--- make sure CURCTL is written first
> 			r[CURBASE] = scr->storage;
> 		}
> 	}
> }
> 
> --
> cinap




* Re: [9front] I'm giving up... hwcursor
  2016-05-22 11:56 ` [9front] " cinap_lenrek
  2016-05-23  0:21   ` kokamoto
@ 2016-05-23  0:39   ` kokamoto
  2016-05-23  0:57     ` cinap_lenrek
                       ` (2 more replies)
  1 sibling, 3 replies; 30+ messages in thread
From: kokamoto @ 2016-05-23  0:39 UTC (permalink / raw)
  To: 9front

The r[CURCTL] is 0x10000007, and r[CURBASE] is 0x7FC00,
which have not changed, i.e., they are the same as before coherence()
was inserted.

r[CURCTL] value looks right to me.
However it indicates pipe B not A.

Kenji




* Re: [9front] I'm giving up... hwcursor
  2016-05-23  0:39   ` kokamoto
@ 2016-05-23  0:57     ` cinap_lenrek
  2016-05-23  2:48       ` kokamoto
  2016-05-23  5:09       ` kokamoto
  2016-05-23  8:46     ` cinap_lenrek
  2016-05-23  8:52     ` cinap_lenrek
  2 siblings, 2 replies; 30+ messages in thread
From: cinap_lenrek @ 2016-05-23  0:57 UTC (permalink / raw)
  To: 9front

the idea of inserting a coherence() call was to make sure
the base register is written last. modern cpu's can reorder
writes. so it could've happened in theory that the cpu writes
the base first, which would be a no-op, and then we write
the control register, which would just put it in the hold
register.

when you read the control register, you're accessing the
holding register's contents, not the effective values.
when we write the control register, it is really written
in a holding register. and only after we've written to the
on the next vertical blank. see section 2.10.2.1 in
965_g35_vol3_display_registers_updated.pdf
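
The ordering argument above can be illustrated with a toy model. This is only a sketch of the behaviour as described in this message, not of the hardware itself: it assumes the base write snapshots the current holding values and that the snapshot becomes effective at the next vertical blank. All names (`CurModel`, `writectl`, `writebase`, `vblank`) are hypothetical.

```c
#include <stdint.h>

/* Toy model of one double-buffered cursor register pair. */
typedef struct CurModel CurModel;
struct CurModel {
	uint32_t holdctl, holdbase;	/* holding registers: what the cpu reads and writes */
	uint32_t armctl, armbase;	/* snapshot taken when the base is written */
	uint32_t livectl, livebase;	/* effective values used by the display */
	int armed;
};

/* A control write only lands in the holding register. */
void
writectl(CurModel *c, uint32_t v)
{
	c->holdctl = v;
}

/* A base write arms the update with whatever is held at that moment. */
void
writebase(CurModel *c, uint32_t v)
{
	c->holdbase = v;
	c->armctl = c->holdctl;
	c->armbase = c->holdbase;
	c->armed = 1;
}

/* At vertical blank an armed update becomes effective. */
void
vblank(CurModel *c)
{
	if(!c->armed)
		return;
	c->livectl = c->armctl;
	c->livebase = c->armbase;
	c->armed = 0;
}
```

In this model, writing the base before the control register leaves the old control value in effect after the vertical blank, which is the failure that the coherence() call is meant to rule out.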

the pipe assignments should be 1:1, that is cursor a
should be assigned to pipe a and cursor b should be
assigned to pipe b. when a display pipe is disabled,
we skip writing the register. enabling/disabling
display pipes is the job of aux/vga.

igfxcurregs() returns the virtual address of the register
bank for a specific cursor *if* the associated display
pipe was enabled (assuming the 1:1 mapping).

--
cinap



* Re: [9front] I'm giving up... hwcursor
  2016-05-23  0:57     ` cinap_lenrek
@ 2016-05-23  2:48       ` kokamoto
  2016-05-23  3:01         ` kokamoto
  2016-05-23  5:09       ` kokamoto
  1 sibling, 1 reply; 30+ messages in thread
From: kokamoto @ 2016-05-23  2:48 UTC (permalink / raw)
  To: 9front

> on the next vertical blank. see section 2.10.2.1 in
> 965_g35_vol3_display_registers_updated.pdf

Thanks, I'll check it.
Though we have too many related manuals, and it is
very difficult to guess which one applies.

> igfxcurregs() returns the virtual address of the register
> bank for a specific cursor *if* the associated display
> pipe was enabled (assuming the 1:1 mapping).

In my case, only pipe B is enabled.

Using rio, I see 4 vertical + shaped cursors, too.
Cursor change is working.

Kenji




* Re: [9front] I'm giving up... hwcursor
  2016-05-23  2:48       ` kokamoto
@ 2016-05-23  3:01         ` kokamoto
  0 siblings, 0 replies; 30+ messages in thread
From: kokamoto @ 2016-05-23  3:01 UTC (permalink / raw)
  To: 9front

>> on the next vertical blank. see section 2.10.2.1 in
>> 965_g35_vol3_display_registers_updated.pdf
> 
> Thanks, I'll check it.

Ahaha, it is just the one I'm reading..
Sorry

Kenji




* Re: [9front] I'm giving up... hwcursor
  2016-05-23  0:57     ` cinap_lenrek
  2016-05-23  2:48       ` kokamoto
@ 2016-05-23  5:09       ` kokamoto
  2016-05-23  5:23         ` kokamoto
  2016-05-23 15:07         ` cinap_lenrek
  1 sibling, 2 replies; 30+ messages in thread
From: kokamoto @ 2016-05-23  5:09 UTC (permalink / raw)
  To: 9front

> enabling/disabling
> display pipes is the job of aux/vga.

in /sys/src/cmd/aux/vga/igfx.c
snarftrans() function
	switch(igfx->type){
	case TypeG45:
		if(t == &igfx->pipe[0]){			/* PIPEA */
			t->dm[0] = snarfreg(igfx, 0x70050);	/* GMCHDataM */
			t->dn[0] = snarfreg(igfx, 0x70054);	/* GMCHDataN */
			t->lm[0] = snarfreg(igfx, 0x70060);	/* DPLinkM */
			t->ln[0] = snarfreg(igfx, 0x70064);	/* DPLinkN */

and
after the comment line of
/* cursor plane */

snarfpipe() function
	case TypeG45:
		p->dsp->pos	= snarfreg(igfx, 0x7018C + x*0x1000);
		p->dsp->size	= snarfreg(igfx, 0x70190 + x*0x1000);

The above addresses are in a reserved area, and we have no
information about their functions.

Are those related?

Kenji




* Re: [9front] I'm giving up... hwcursor
  2016-05-23  5:09       ` kokamoto
@ 2016-05-23  5:23         ` kokamoto
  2016-05-23 11:01           ` cinap_lenrek
  2016-05-23 15:07         ` cinap_lenrek
  1 sibling, 1 reply; 30+ messages in thread
From: kokamoto @ 2016-05-23  5:23 UTC (permalink / raw)
  To: 9front

Another point:

when I try to use a new cursor, e.g. to make a new window in rio,
the igfxcurload() function in the kernel is called three times.
One (original) + 3 times = 4 vertically aligned cursors, which may explain it.
Then, the question should be why that function is called three times.

Kenji




* Re: [9front] I'm giving up... hwcursor
  2016-05-23  0:39   ` kokamoto
  2016-05-23  0:57     ` cinap_lenrek
@ 2016-05-23  8:46     ` cinap_lenrek
  2016-05-23  8:52     ` cinap_lenrek
  2 siblings, 0 replies; 30+ messages in thread
From: cinap_lenrek @ 2016-05-23  8:46 UTC (permalink / raw)
  To: 9front

i just noticed...

16*64*4 = 4K, that is 16 rows, 64 columns, 4 byte per pixel. so our plan9
cursor image (16x16) perfectly fits a full page. but the full 64x64
image covers 4 pages.
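
The page arithmetic above can be spelled out as a small sketch. It assumes 4K pages and 4 bytes per pixel as stated; the helper names `imagebytes` and `imagepages` are made up for illustration.

```c
#include <stddef.h>

enum {
	PageSize = 4096,	/* 4K pages */
	BytesPerPixel = 4,	/* 32bpp */
};

/* Bytes occupied by rows x cols pixels at 32bpp. */
size_t
imagebytes(size_t rows, size_t cols)
{
	return rows * cols * BytesPerPixel;
}

/* Number of 4K pages needed to hold that image, rounded up. */
size_t
imagepages(size_t rows, size_t cols)
{
	return (imagebytes(rows, cols) + PageSize - 1) / PageSize;
}
```

With these, 16 rows of 64 pixels come to 4096 bytes (one page), and the full 64x64 image comes to 16384 bytes (four pages).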

maybe the graphics memory maps all the pages to the same physical memory,
this would explain the 4 vertical cursor images.

as a quick test, you can try comparing the memory contents *after* we've
written the cursor like:

if(memcmp((uchar*)scr->vaddr + scr->storage, (uchar*)scr->vaddr + scr->storage + 0x1000, 0x1000) == 0)
	print("mirror!\n");

normally, this shouldnt trigger as the pages >0 should be all zero
and only the first page contains the 16x16 image for our plan9 cursor.
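
The same test can be generalised to all four pages of the cursor image. This is a sketch on ordinary memory, with a hypothetical helper `mirrorcount`; in the kernel the base pointer would be the cursor storage address used in the snippet above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count how many of the pages after the first are byte-for-byte
 * copies of the first page.
 */
int
mirrorcount(const unsigned char *base, size_t pagesz, int npages)
{
	int i, n;

	n = 0;
	for(i = 1; i < npages; i++)
		if(memcmp(base, base + (size_t)i*pagesz, pagesz) == 0)
			n++;
	return n;
}
```

A result of 3 for a four-page image would mean every page shows the same contents. The check is only meaningful when the first page holds a non-blank cursor, since an all-zero first page compares equal to the zeroed pages after it.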

we put the cursor image at the very end of the graphics memory. maybe this
is not linear memory anymore or it is strangely mapped.

we could try to program the cursor in popup mode and write a physical
address instead of a graphics memory address to the base register, then we
are more independent of the graphics memory layout.

--
cinap



* Re: [9front] I'm giving up... hwcursor
  2016-05-23  0:39   ` kokamoto
  2016-05-23  0:57     ` cinap_lenrek
  2016-05-23  8:46     ` cinap_lenrek
@ 2016-05-23  8:52     ` cinap_lenrek
  2 siblings, 0 replies; 30+ messages in thread
From: cinap_lenrek @ 2016-05-23  8:52 UTC (permalink / raw)
  To: 9front

or we use a different mode... where the whole 64x64 cursor will fit into
a single page... :)

--
cinap



* Re: [9front] I'm giving up... hwcursor
  2016-05-23  5:23         ` kokamoto
@ 2016-05-23 11:01           ` cinap_lenrek
  2016-05-23 11:29             ` kokamoto
  0 siblings, 1 reply; 30+ messages in thread
From: cinap_lenrek @ 2016-05-23 11:01 UTC (permalink / raw)
  To: 9front

hm.... thinking about it... if curbase 0x7FC00 is wrong... can you
put a debug print into igfxenable() and print the values for apsize
and storage as returned by preallocsize()?

preallocsize() determines the amount of memory that was reserved by
the bios for the framebuffer. and we assume this memory will be linearly
mapped in the GTT in the framebuffer aperture. we put the cursor at
the end of this allocation (so basically preallocsize() - 64*64*4).
but i doubt your framebuffer is just 512K.

--
cinap



* Re: [9front] I'm giving up... hwcursor
  2016-05-23 11:01           ` cinap_lenrek
@ 2016-05-23 11:29             ` kokamoto
  2016-05-23 11:42               ` kokamoto
  0 siblings, 1 reply; 30+ messages in thread
From: kokamoto @ 2016-05-23 11:29 UTC (permalink / raw)
  To: 9front

> but i doubt your framebuffer is just 512K.

Sorry, with my debugging method I can only catch a glance
of the printed value.   I looked at the value again, and
found it is 0x7FC00.
I'll check your advice tomorrow.

Kenji




* Re: [9front] I'm giving up... hwcursor
  2016-05-23 11:29             ` kokamoto
@ 2016-05-23 11:42               ` kokamoto
  2016-05-23 12:05                 ` cinap_lenrek
  0 siblings, 1 reply; 30+ messages in thread
From: kokamoto @ 2016-05-23 11:42 UTC (permalink / raw)
  To: 9front

> found it is 0x7FC00. <=== same!!
> I'll check your advice tomorrow.

Sorry, my noise.




* Re: [9front] I'm giving up... hwcursor
  2016-05-23 11:42               ` kokamoto
@ 2016-05-23 12:05                 ` cinap_lenrek
  2016-05-24  1:42                   ` kokamoto
  0 siblings, 1 reply; 30+ messages in thread
From: cinap_lenrek @ 2016-05-23 12:05 UTC (permalink / raw)
  To: 9front

this value makes no sense, because thats around 500KB.
for a framebuffer for say 1024x768x32 you need 3MB, and
the cursor is supposed to be at the very end of that.

further, preallocsize() will return at minimum a 1MB
allocation size, so 1024*1024 - 64*64*4 = 0xfc000 is
the minimum. it could be that reading that register
back will yield strange stuff, so i'm curious what
the value we store there is.
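
The expected cursor offset is easy to compute for a given allocation size. A minimal sketch, assuming the cursor occupies the last 64*64*4 bytes of the preallocated region as described; `cursoroffset` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

enum {
	CursorBytes = 64*64*4,	/* 64x64 pixels at 32bpp = 16K */
};

/* Offset of the cursor image when it sits at the end of the allocation. */
uint32_t
cursoroffset(uint32_t prealloc)
{
	assert(prealloc >= CursorBytes);
	return prealloc - CursorBytes;
}
```

A 1MB allocation gives 0xfc000, and an 8MB allocation gives 0x7fc000.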

you might be able to adjust the preallocation size
in the bios. theres usually a setting where one can
specify the amount of memory one wants to reserve for
the graphics card. this is what preallocsize() is supposed
to return.

--
cinap



* Re: [9front] I'm giving up... hwcursor
  2016-05-23  5:09       ` kokamoto
  2016-05-23  5:23         ` kokamoto
@ 2016-05-23 15:07         ` cinap_lenrek
  1 sibling, 0 replies; 30+ messages in thread
From: cinap_lenrek @ 2016-05-23 15:07 UTC (permalink / raw)
  To: 9front

			t->dm[0] = snarfreg(igfx, 0x70050);	/* GMCHDataM */
			t->dn[0] = snarfreg(igfx, 0x70054);	/* GMCHDataN */
			t->lm[0] = snarfreg(igfx, 0x70060);	/* DPLinkM */
			t->ln[0] = snarfreg(igfx, 0x70064);	/* DPLinkN */

These are registers for the G45, described in:

https://01.org/sites/default/files/documentation/g45_vol_3_register_0_0.pdf

section 2.10.1.17.

It is not a bug, but if your card is not a G45 but a 965 that
lacks these registers, then we have to introduce a type for
it and avoid hammering registers that do not exist.

--
cinap



* Re: [9front] I'm giving up... hwcursor
  2016-05-23 12:05                 ` cinap_lenrek
@ 2016-05-24  1:42                   ` kokamoto
  2016-05-24  8:57                     ` cinap_lenrek
  2016-05-24  9:10                     ` cinap_lenrek
  0 siblings, 2 replies; 30+ messages in thread
From: kokamoto @ 2016-05-24  1:42 UTC (permalink / raw)
  To: 9front

1) I printed the scr->storage value at the end of igfxenable() function,
and the result is 0x7FC000.
I misread the value as 0x7FC00 before.
Then, this is not our problem, I think.

2) the memory comparison test you advised printed "mirror!".
if(memcmp((uchar*)scr->vaddr + scr->storage, (uchar*)scr->vaddr + scr->storage + 0x1000, 0x1000) == 0)
	print("mirror!\n");

I put the above line at the end of igfxcurenable() function.
This may be the cause of our problem.

3) thanks for introducing the G45 programmer's manual, which is new to me.
Yes, it was written there.
However, this would not relate to our problem, because the screen
is normal apart from the hardware mouse cursor.

Kenji

PS. 
cinap, thank you very much.
My brain is getting worse with age, so I may bother you
too much.




* Re: [9front] I'm giving up... hwcursor
  2016-05-24  1:42                   ` kokamoto
@ 2016-05-24  8:57                     ` cinap_lenrek
  2016-05-24 10:12                       ` kokamoto
  2016-05-24 10:14                       ` kokamoto
  2016-05-24  9:10                     ` cinap_lenrek
  1 sibling, 2 replies; 30+ messages in thread
From: cinap_lenrek @ 2016-05-24  8:57 UTC (permalink / raw)
  To: 9front

> I put the above line at the end of igfxcurenable() function.
> This may be the cause of our problem.

put it at the end of igfxcurload(). igfxenable() doesnt
write anything to the graphics memory.

the idea of the test was to see if the second page is a memory
mirror with the first page. as igfxcurload() clears all 4
pages and then initializes the cursor in the first page (16x64x4).
so first page (4k) contains the pixels of our cursor, and second to
4th page should be all zero.

--
cinap



* Re: [9front] I'm giving up... hwcursor
  2016-05-24  1:42                   ` kokamoto
  2016-05-24  8:57                     ` cinap_lenrek
@ 2016-05-24  9:10                     ` cinap_lenrek
  1 sibling, 0 replies; 30+ messages in thread
From: cinap_lenrek @ 2016-05-24  9:10 UTC (permalink / raw)
  To: 9front


> 1) I printed the scr->storage value at the end of igfxenable() function,
> and the result is 0x7FC000.
> I misread the value as 0x7FC00 before.

ok. that makes sense.

> Then, this is not our problem, I think.

if the pages are mirrored this would exactly explain the vertical cursor
doubling, and it makes sense as the graphics memory is mapped on a page
by page basis to physical memory (look for GTT in the manual).

--
cinap



* Re: [9front] I'm giving up... hwcursor
  2016-05-24  8:57                     ` cinap_lenrek
@ 2016-05-24 10:12                       ` kokamoto
  2016-05-24 10:14                       ` kokamoto
  1 sibling, 0 replies; 30+ messages in thread
From: kokamoto @ 2016-05-24 10:12 UTC (permalink / raw)
  To: 9front

>> I put the above line at the end of igfxcurenable() function.
>> This may be the cause of our problem.
> 
> put it at the end of igfxcurload(). igfxenable() doesnt
> write anything to the graphics memory.

Please read my last mail again.
I put it at the end of igfxcurenable() not igfxenable().
I also checked it at the end of igfxcurload(), with the same result.
Then, this is our problem.

Kenji




* Re: [9front] I'm giving up... hwcursor
  2016-05-24  8:57                     ` cinap_lenrek
  2016-05-24 10:12                       ` kokamoto
@ 2016-05-24 10:14                       ` kokamoto
  2016-05-24 14:17                         ` cinap_lenrek
  2016-05-24 19:57                         ` cinap_lenrek
  1 sibling, 2 replies; 30+ messages in thread
From: kokamoto @ 2016-05-24 10:14 UTC (permalink / raw)
  To: 9front

>> I put the above line at the end of igfxcurenable() function.
>> This may be the cause of our problem.
> 
> put it at the end of igfxcurload(). igfxenable() doesnt
> write anything to the graphics memory.

Please read my message again.
I put it at the end of igfxcurenable() not igfxenable().
I also put it at the end of igfxcurload(), and the result is the same.

Kenji




* Re: [9front] I'm giving up... hwcursor
  2016-05-24 10:14                       ` kokamoto
@ 2016-05-24 14:17                         ` cinap_lenrek
  2016-05-24 19:57                         ` cinap_lenrek
  1 sibling, 0 replies; 30+ messages in thread
From: cinap_lenrek @ 2016-05-24 14:17 UTC (permalink / raw)
  To: 9front

> Please read my message again.
> I put it at the end of igfxcurenable() not igfxenable().
> I also put it at the end of igfxcurload(), and the result is the same.

ah! ok.

see:
"7.2.11 GTTMMADR — Graphics Translation Table Range Address"

it says the first bar is evenly split into mmio and gtt, so
halving the bar size should yield the mmio offset of the first
gtt entry. theres one gtt page table entry for every 4K page.

the format of a PTE is described in:
"8.2.1.4 GTT Page Table Entries (PTEs)"

lowest 3 bits are flags/mapping type, and bits 12:31 is the
physical page number... rest is pretty much reserved.

u32int pte, *gtt = (u32int*)((uchar*)scr->mmio + (scr->pci->mem[0].size>>1));
...
print("addr %.8ux pte=%.8ux\n", addr, gtt[addr>>12]);

with that we could probe the gtt and read back the page table entries
that make up graphics memory.
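
The address arithmetic involved can be written out as a sketch in portable C. It relies only on the layout described above (first BAR split evenly into mmio and gtt, one PTE per 4K page, flags in the lowest 3 bits, physical page in bits 31:12). The helper names are hypothetical.

```c
#include <stdint.h>

enum {
	PageShift = 12,		/* one PTE per 4K page */
};

/* Byte offset of the first GTT entry inside the first BAR. */
uint32_t
gttoffset(uint32_t barsize)
{
	return barsize >> 1;
}

/* Index of the PTE that covers graphics address addr. */
uint32_t
pteindex(uint32_t addr)
{
	return addr >> PageShift;
}

/* Flags/mapping type: the lowest 3 bits of the PTE. */
uint32_t
pteflags(uint32_t pte)
{
	return pte & 7;
}

/* Physical page address: bits 31:12 of the PTE. */
uint32_t
ptephys(uint32_t pte)
{
	return pte & ~UINT32_C(0xFFF);
}
```

For example, graphics address 0x7fc000 is covered by PTE index 0x7fc, and a made-up PTE value of 0x12345007 decodes to physical page 0x12345000 with flags 7.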

i'll write some little dumper program when i get home, so we can figure
out how the bios setup the gtt. it appears the end of the graphics memory
range is just mapped to dummy pages, and we need to find a continuous
range of 4 pages.
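
One way to look for such a range in a dumped table is sketched below. It rests on an assumption that is not confirmed in this thread: that dummy mappings show up as several entries pointing at the same physical page, so a usable run is one whose 4 entries all map different physical pages. `findrun` is a hypothetical name and searches from the end of the table backwards.

```c
#include <stdint.h>

enum {
	Need = 4,	/* a 64x64 32bpp cursor spans 4 pages */
};

/*
 * Return the highest index i such that gtt[i..i+Need-1] all map
 * different physical pages, or -1 if there is no such run.
 */
int
findrun(const uint32_t *gtt, int n)
{
	int i, j, k, ok;

	for(i = n - Need; i >= 0; i--){
		ok = 1;
		for(j = 0; j < Need && ok; j++)
			for(k = j+1; k < Need; k++)
				if((gtt[i+j] & ~UINT32_C(0xFFF)) == (gtt[i+k] & ~UINT32_C(0xFFF))){
					ok = 0;
					break;
				}
		if(ok)
			return i;
	}
	return -1;
}
```

The returned index shifted left by 12 would be the candidate graphics address for the cursor image. A real check would also need to look at the flag bits of each entry, which this sketch ignores.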

--
cinap



* Re: [9front] I'm giving up... hwcursor
  2016-05-24 10:14                       ` kokamoto
  2016-05-24 14:17                         ` cinap_lenrek
@ 2016-05-24 19:57                         ` cinap_lenrek
  2016-05-24 23:55                           ` kokamoto
  1 sibling, 1 reply; 30+ messages in thread
From: cinap_lenrek @ 2016-05-24 19:57 UTC (permalink / raw)
  To: 9front

[-- Attachment #1: Type: text/plain, Size: 186 bytes --]

attached a hacked igfx.c to be copied to /sys/src/cmd/aux/vga/igfx.c, will
dump the gtt when you run: aux/vga -m ... -p

alternative: http://felloff.net/usr/cinap_lenrek/igfx.c

--
cinap

[-- Attachment #2: igfx.c --]
[-- Type: text/plain, Size: 40930 bytes --]

#include <u.h>
#include <libc.h>
#include <bio.h>

#include "pci.h"
#include "vga.h"

typedef struct Reg Reg;
typedef struct Dpll Dpll;
typedef struct Hdmi Hdmi;
typedef struct Dp Dp;
typedef struct Fdi Fdi;
typedef struct Pfit Pfit;
typedef struct Curs Curs;
typedef struct Plane Plane;
typedef struct Trans Trans;
typedef struct Pipe Pipe;
typedef struct Igfx Igfx;

enum {
	MHz = 1000000,
};

enum {
	TypeG45,
	TypeIVB,		/* Ivy Bridge */
	TypeSNB,		/* Sandy Bridge (unfinished) */
};

enum {
	PortVGA	= 0,		/* adpa */
	PortLCD	= 1,		/* lvds */
	PortDPA	= 2,
	PortDPB	= 3,
	PortDPC	= 4,
	PortDPD	= 5,
};

struct Reg {
	u32int	a;		/* address or 0 when invalid */
	u32int	v;		/* value */
};

struct Dpll {
	Reg	ctrl;		/* DPLLx_CTRL */
	Reg	fp0;		/* FPx0 */
	Reg	fp1;		/* FPx1 */
};

struct Trans {
	Reg	dm[2];		/* pipe/trans DATAM */
	Reg	dn[2];		/* pipe/trans DATAN */
	Reg	lm[2];		/* pipe/trans LINKM */
	Reg	ln[2];		/* pipe/trans LINKN */

	Reg	ht;		/* pipe/trans HTOTAL_x */
	Reg	hb;		/* pipe/trans HBLANK_x */
	Reg	hs;		/* pipe/trans HSYNC_x */
	Reg	vt;		/* pipe/trans VTOTAL_x */
	Reg	vb;		/* pipe/trans VBLANK_x */
	Reg	vs;		/* pipe/trans VSYNC_x */
	Reg	vss;		/* pipe/trans VSYNCSHIFT_x */

	Reg	conf;		/* pipe/trans CONF_x */
	Reg	chicken;	/* workaround register */

	Reg	dpctl;		/* TRANS_DP_CTL_x */

	Dpll	*dpll;		/* this transcoders dpll */
};

struct Hdmi {
	Reg	ctl;
	Reg	bufctl[4];
};

struct Dp {
	Reg	ctl;
	Reg	auxctl;
	Reg	auxdat[5];

	uchar	dpcd[256];
};

struct Fdi {
	Trans;

	Reg	txctl;		/* FDI_TX_CTL */

	Reg	rxctl;		/* FDI_RX_CTL */
	Reg	rxmisc;		/* FDI_RX_MISC */
	Reg	rxtu[2];	/* FDI_RX_TUSIZE */
};

struct Pfit {
	Reg	ctrl;
	Reg	winpos;
	Reg	winsize;
	Reg	pwrgate;
};

struct Plane {
	Reg	cntr;		/* DSPxCNTR */
	Reg	linoff;		/* DSPxLINOFF */
	Reg	stride;		/* DSPxSTRIDE */
	Reg	surf;		/* DSPxSURF */
	Reg	tileoff;	/* DSPxTILEOFF */

	Reg	pos;
	Reg	size;
};

struct Curs {
	Reg	cntr;
	Reg	base;
	Reg	pos;
};

struct Pipe {
	Trans;

	Reg	src;		/* PIPExSRC */

	Fdi	fdi[1];		/* fdi/dp transcoder */

	Plane	dsp[1];		/* display plane */
	Curs	cur[1];		/* hardware cursor */

	Pfit	*pfit;		/* selected panel fitter */
};

struct Igfx {
	Ctlr	*ctlr;
	Pcidev	*pci;

	u32int	pio;
	u32int	*mmio;

	int	type;
	int	cdclk;		/* core display clock in mhz */

	int	npipe;
	Pipe	pipe[4];

	Dpll	dpll[2];
	Pfit	pfit[3];

	/* IVB */
	Reg	dpllsel;	/* DPLL_SEL */
	Reg	drefctl;	/* DREF_CTL */
	Reg	rawclkfreq;	/* RAWCLK_FREQ */
	Reg	ssc4params;	/* SSC4_PARAMS */

	Dp	dp[4];
	Hdmi	hdmi[4];

	Reg	ppcontrol;
	Reg	ppstatus;

	/* G45 */
	Reg	gmbus[6];	/* GMBUSx */

	Reg	sdvoc;
	Reg	sdvob;

	/* common */
	Reg	adpa;
	Reg	lvds;

	Reg	vgacntrl;
};

static u32int
rr(Igfx *igfx, u32int a)
{
	if(a == 0)
		return 0;
	assert((a & 3) == 0);
	if(igfx->mmio != nil)
		return igfx->mmio[a/4];
	outportl(igfx->pio, a);
	return inportl(igfx->pio + 4);
}
static void
wr(Igfx *igfx, u32int a, u32int v)
{
	if(a == 0)	/* invalid */
		return;
	assert((a & 3) == 0);
	if(igfx->mmio != nil){
		igfx->mmio[a/4] = v;
		return;
	}
	outportl(igfx->pio, a);
	outportl(igfx->pio + 4, v);
}
static void
csr(Igfx *igfx, u32int reg, u32int clr, u32int set)
{
	wr(igfx, reg, (rr(igfx, reg) & ~clr) | set);
}

static void
loadreg(Igfx *igfx, Reg r)
{
	wr(igfx, r.a, r.v);
}

static Reg
snarfreg(Igfx *igfx, u32int a)
{
	Reg r;

	r.a = a;
	r.v = rr(igfx, a);
	return r;
}

static void
snarftrans(Igfx *igfx, Trans *t, u32int o)
{
	/* pipe timing */
	t->ht	= snarfreg(igfx, o + 0x00000);
	t->hb	= snarfreg(igfx, o + 0x00004);
	t->hs	= snarfreg(igfx, o + 0x00008);
	t->vt	= snarfreg(igfx, o + 0x0000C);
	t->vb	= snarfreg(igfx, o + 0x00010);
	t->vs	= snarfreg(igfx, o + 0x00014);
	t->vss	= snarfreg(igfx, o + 0x00028);

	t->conf	= snarfreg(igfx, o + 0x10008);

	switch(igfx->type){
	case TypeG45:
		if(t == &igfx->pipe[0]){			/* PIPEA */
			t->dm[0] = snarfreg(igfx, 0x70050);	/* GMCHDataM */
			t->dn[0] = snarfreg(igfx, 0x70054);	/* GMCHDataN */
			t->lm[0] = snarfreg(igfx, 0x70060);	/* DPLinkM */
			t->ln[0] = snarfreg(igfx, 0x70064);	/* DPLinkN */
		}
		break;
	case TypeIVB:
	case TypeSNB:
		t->dm[0] = snarfreg(igfx, o + 0x30);
		t->dn[0] = snarfreg(igfx, o + 0x34);
		t->dm[1] = snarfreg(igfx, o + 0x38);
		t->dn[1] = snarfreg(igfx, o + 0x3c);
		t->lm[0] = snarfreg(igfx, o + 0x40);
		t->ln[0] = snarfreg(igfx, o + 0x44);
		t->lm[1] = snarfreg(igfx, o + 0x48);
		t->ln[1] = snarfreg(igfx, o + 0x4c);
		break;
	}
}

static void
snarfpipe(Igfx *igfx, int x)
{
	u32int o;
	Pipe *p;

	p = &igfx->pipe[x];

	o = 0x60000 + x*0x1000;
	snarftrans(igfx, p, o);

	p->src = snarfreg(igfx, o + 0x0001C);

	if(igfx->type == TypeIVB || igfx->type == TypeSNB) {
		p->fdi->txctl = snarfreg(igfx, o + 0x100);

		o = 0xE0000 | x*0x1000;
		snarftrans(igfx, p->fdi, o);

		p->fdi->dpctl = snarfreg(igfx, o + 0x300);

		p->fdi->rxctl = snarfreg(igfx, o + 0x1000c);
		p->fdi->rxmisc = snarfreg(igfx, o + 0x10010);
		p->fdi->rxtu[0] = snarfreg(igfx, o + 0x10030);
		p->fdi->rxtu[1] = snarfreg(igfx, o + 0x10038);

		p->fdi->chicken = snarfreg(igfx, o + 0x10064);

		p->fdi->dpll = &igfx->dpll[(igfx->dpllsel.v>>(x*4)) & 1];
		p->dpll = nil;
	} else {
		p->dpll = &igfx->dpll[x & 1];
	}

	/* display plane */
	p->dsp->cntr		= snarfreg(igfx, 0x70180 + x*0x1000);
	p->dsp->linoff		= snarfreg(igfx, 0x70184 + x*0x1000);
	p->dsp->stride		= snarfreg(igfx, 0x70188 + x*0x1000);
	p->dsp->tileoff		= snarfreg(igfx, 0x701A4 + x*0x1000);
	p->dsp->surf		= snarfreg(igfx, 0x7019C + x*0x1000);

	/* cursor plane */
	switch(igfx->type){
	case TypeIVB:
	case TypeSNB:
		p->cur->cntr	= snarfreg(igfx, 0x70080 + x*0x1000);
		p->cur->base	= snarfreg(igfx, 0x70084 + x*0x1000);
		p->cur->pos	= snarfreg(igfx, 0x70088 + x*0x1000);
		break;
	case TypeG45:
		p->dsp->pos	= snarfreg(igfx, 0x7018C + x*0x1000);
		p->dsp->size	= snarfreg(igfx, 0x70190 + x*0x1000);

		p->cur->cntr	= snarfreg(igfx, 0x70080 + x*0x40);
		p->cur->base	= snarfreg(igfx, 0x70084 + x*0x40);
		p->cur->pos	= snarfreg(igfx, 0x70088 + x*0x40);
		break;
	}
}

static int
devtype(Igfx *igfx)
{
	if(igfx->pci->vid != 0x8086)
		return -1;
	switch(igfx->pci->did){
	case 0x0166:	/* 3rd Gen Core - ThinkPad X230 */
		return TypeIVB;
	case 0x0126:	/* Thinkpad X220 */
		return TypeSNB;
	case 0x27a2:	/* GM945/82940GML - ThinkPad X60 Tablet */
	case 0x2a02:	/* GM965/GL960/X3100 - ThinkPad X61 Tablet */
	case 0x2a42:	/* 4 Series Mobile - ThinkPad X200 */
		return TypeG45;
	}
	return -1;
}

static Edid* snarfgmedid(Igfx*, int port, int addr);
static Edid* snarfdpedid(Igfx*, Dp *dp, int addr);

static int enabledp(Igfx*, Dp*);

static void
snarf(Vga* vga, Ctlr* ctlr)
{
	Igfx *igfx;
	int x, y;

	igfx = vga->private;
	if(igfx == nil) {
		igfx = alloc(sizeof(Igfx));
		igfx->ctlr = ctlr;
		igfx->pci = vga->pci;
		if(igfx->pci == nil){
			error("%s: no pci device\n", ctlr->name);
			return;
		}
		igfx->type = devtype(igfx);
		if(igfx->type < 0){
			error("%s: unrecognized device\n", ctlr->name);
			return;
		}
		vgactlpci(igfx->pci);
		if(1){
			vgactlw("type", ctlr->name);
			igfx->mmio = segattach(0, "igfxmmio", 0, igfx->pci->mem[0].size);
			if(igfx->mmio == (u32int*)-1)
				error("%s: attaching mmio: %r\n", ctlr->name);
		} else {
			if((igfx->pci->mem[4].bar & 1) == 0)
				error("%s: no pio bar\n", ctlr->name);
			igfx->pio = igfx->pci->mem[4].bar & ~1;
		}
		vga->private = igfx;
	}

	switch(igfx->type){
	case TypeG45:
		igfx->npipe = 2;	/* A,B */
		igfx->cdclk = 200;	/* MHz */

		igfx->dpll[0].ctrl	= snarfreg(igfx, 0x06014);
		igfx->dpll[0].fp0	= snarfreg(igfx, 0x06040);
		igfx->dpll[0].fp1	= snarfreg(igfx, 0x06044);
		igfx->dpll[1].ctrl	= snarfreg(igfx, 0x06018);
		igfx->dpll[1].fp0	= snarfreg(igfx, 0x06048);
		igfx->dpll[1].fp1	= snarfreg(igfx, 0x0604c);

		igfx->adpa		= snarfreg(igfx, 0x061100);
		igfx->lvds		= snarfreg(igfx, 0x061180);
		igfx->sdvob		= snarfreg(igfx, 0x061140);
		igfx->sdvoc		= snarfreg(igfx, 0x061160);

		for(x=0; x<5; x++)
			igfx->gmbus[x]	= snarfreg(igfx, 0x5100 + x*4);
		igfx->gmbus[x]	= snarfreg(igfx, 0x5120);

		igfx->pfit[0].ctrl	= snarfreg(igfx, 0x061230);
		y = (igfx->pfit[0].ctrl.v >> 29) & 3;
		if(igfx->pipe[y].pfit == nil)
			igfx->pipe[y].pfit = &igfx->pfit[0];

		igfx->ppstatus		= snarfreg(igfx, 0x61200);
		igfx->ppcontrol		= snarfreg(igfx, 0x61204);

		igfx->vgacntrl		= snarfreg(igfx, 0x071400);
		break;

	case TypeSNB:
		igfx->npipe = 2;	/* A,B */
		igfx->cdclk = 300;	/* MHz */
		goto PCHcommon;

	case TypeIVB:
		igfx->npipe = 3;	/* A,B,C */
		igfx->cdclk = 400;	/* MHz */

	PCHcommon:
		igfx->dpll[0].ctrl	= snarfreg(igfx, 0xC6014);
		igfx->dpll[0].fp0	= snarfreg(igfx, 0xC6040);
		igfx->dpll[0].fp1	= snarfreg(igfx, 0xC6044);
		igfx->dpll[1].ctrl	= snarfreg(igfx, 0xC6018);
		igfx->dpll[1].fp0	= snarfreg(igfx, 0xC6048);
		igfx->dpll[1].fp1	= snarfreg(igfx, 0xC604c);

		igfx->dpllsel		= snarfreg(igfx, 0xC7000);

		igfx->drefctl		= snarfreg(igfx, 0xC6200);
		igfx->rawclkfreq	= snarfreg(igfx, 0xC6204);
		igfx->ssc4params	= snarfreg(igfx, 0xC6210);

		/* cpu displayport A */
		igfx->dp[0].ctl		= snarfreg(igfx, 0x64000);
		igfx->dp[0].auxctl	= snarfreg(igfx, 0x64010);
		igfx->dp[0].auxdat[0]	= snarfreg(igfx, 0x64014);
		igfx->dp[0].auxdat[1]	= snarfreg(igfx, 0x64018);
		igfx->dp[0].auxdat[2]	= snarfreg(igfx, 0x6401C);
		igfx->dp[0].auxdat[3]	= snarfreg(igfx, 0x64020);
		igfx->dp[0].auxdat[4]	= snarfreg(igfx, 0x64024);

		/* pch displayport B,C,D */
		for(x=1; x<4; x++){
			igfx->dp[x].ctl		= snarfreg(igfx, 0xE4000 + 0x100*x);
			igfx->dp[x].auxctl	= snarfreg(igfx, 0xE4010 + 0x100*x);
			igfx->dp[x].auxdat[0]	= snarfreg(igfx, 0xE4014 + 0x100*x);
			igfx->dp[x].auxdat[1]	= snarfreg(igfx, 0xE4018 + 0x100*x);
			igfx->dp[x].auxdat[2]	= snarfreg(igfx, 0xE401C + 0x100*x);
			igfx->dp[x].auxdat[3]	= snarfreg(igfx, 0xE4020 + 0x100*x);
			igfx->dp[x].auxdat[4]	= snarfreg(igfx, 0xE4024 + 0x100*x);
		}

		for(x=0; x<igfx->npipe; x++){
			igfx->pfit[x].pwrgate	= snarfreg(igfx, 0x68060 + 0x800*x);
			igfx->pfit[x].winpos	= snarfreg(igfx, 0x68070 + 0x800*x);
			igfx->pfit[x].winsize	= snarfreg(igfx, 0x68074 + 0x800*x);
			igfx->pfit[x].ctrl	= snarfreg(igfx, 0x68080 + 0x800*x);

			y = (igfx->pfit[x].ctrl.v >> 29) & 3;
			if(igfx->pipe[y].pfit == nil)
				igfx->pipe[y].pfit = &igfx->pfit[x];
		}
		igfx->ppstatus		= snarfreg(igfx, 0xC7200);
		igfx->ppcontrol		= snarfreg(igfx, 0xC7204);

		igfx->hdmi[1].ctl	= snarfreg(igfx, 0x0E1140);	/* HDMI_CTL_B */
		igfx->hdmi[1].bufctl[0]	= snarfreg(igfx, 0x0FC810);	/* HDMI_BUF_CTL_0 */
		igfx->hdmi[1].bufctl[1]	= snarfreg(igfx, 0x0FC81C);	/* HDMI_BUF_CTL_1 */
		igfx->hdmi[1].bufctl[2]	= snarfreg(igfx, 0x0FC828);	/* HDMI_BUF_CTL_2 */
		igfx->hdmi[1].bufctl[3]	= snarfreg(igfx, 0x0FC834);	/* HDMI_BUF_CTL_3 */

		igfx->hdmi[2].ctl	= snarfreg(igfx, 0x0E1150);	/* HDMI_CTL_C */
		igfx->hdmi[2].bufctl[0]	= snarfreg(igfx, 0x0FCC00);	/* HDMI_BUF_CTL_4 */
		igfx->hdmi[2].bufctl[1]	= snarfreg(igfx, 0x0FCC0C);	/* HDMI_BUF_CTL_5 */
		igfx->hdmi[2].bufctl[2]	= snarfreg(igfx, 0x0FCC18);	/* HDMI_BUF_CTL_6 */
		igfx->hdmi[2].bufctl[3]	= snarfreg(igfx, 0x0FCC24);	/* HDMI_BUF_CTL_7 */

		igfx->hdmi[3].ctl	= snarfreg(igfx, 0x0E1160);	/* HDMI_CTL_D */
		igfx->hdmi[3].bufctl[0]	= snarfreg(igfx, 0x0FD000);	/* HDMI_BUF_CTL_8 */
		igfx->hdmi[3].bufctl[1]	= snarfreg(igfx, 0x0FD00C);	/* HDMI_BUF_CTL_9 */
		igfx->hdmi[3].bufctl[2]	= snarfreg(igfx, 0x0FD018);	/* HDMI_BUF_CTL_10 */
		igfx->hdmi[3].bufctl[3]	= snarfreg(igfx, 0x0FD024);	/* HDMI_BUF_CTL_11 */

		for(x=0; x<5; x++)
			igfx->gmbus[x]	= snarfreg(igfx, 0xC5100 + x*4);
		igfx->gmbus[x]	= snarfreg(igfx, 0xC5120);

		igfx->adpa		= snarfreg(igfx, 0x0E1100);	/* DAC_CTL */
		igfx->lvds		= snarfreg(igfx, 0x0E1180);	/* LVDS_CTL */

		igfx->vgacntrl		= snarfreg(igfx, 0x041000);
		break;
	}

	for(x=0; x<igfx->npipe; x++)
		snarfpipe(igfx, x);

	for(x=0; x<nelem(vga->edid); x++){
		Modelist *l;

		switch(x){
		case PortVGA:
			vga->edid[x] = snarfgmedid(igfx, 2, 0x50);
			break;
		case PortLCD:
			vga->edid[x] = snarfgmedid(igfx, 3, 0x50);
			if(vga->edid[x] == nil)
				continue;
			for(l = vga->edid[x]->modelist; l != nil; l = l->next)
				l->attr = mkattr(l->attr, "lcd", "1");
			break;
		case PortDPA:
		case PortDPB:
		case PortDPC:
		case PortDPD:
			vga->edid[x] = snarfdpedid(igfx, &igfx->dp[x-PortDPA], 0x50);
			break;
		}
		if(vga->edid[x] == nil)
			continue;
		for(l = vga->edid[x]->modelist; l != nil; l = l->next)
			l->attr = mkattr(l->attr, "display", "%d", x+1);
	}

	ctlr->flag |= Fsnarf;
}

static void
options(Vga* vga, Ctlr* ctlr)
{
	USED(vga);
	ctlr->flag |= Hlinear|Ulinear|Foptions;
}

static int
genpll(int freq, int cref, int P2, int *m1, int *m2, int *n, int *p1)
{
	int M1, M2, M, N, P, P1;
	int best, error;
	vlong a;

	best = -1;
	for(N=3; N<=8; N++)
	for(M2=5; M2<=9; M2++)
//	for(M1=10; M1<=20; M1++){
	for(M1=12; M1<=22; M1++){
		M = 5*(M1+2) + (M2+2);
		if(M < 79 || M > 127)
//		if(M < 70 || M > 120)
			continue;
		for(P1=1; P1<=8; P1++){
			P = P1 * P2;
			if(P < 5 || P > 98)
//			if(P < 4 || P > 98)
				continue;
			a = cref;
			a *= M;
			a /= N+2;
			a /= P;
			if(a < 20*MHz || a > 400*MHz)
				continue;
			error = a;
			error -= freq;
			if(error < 0)
				error = -error;
			if(best < 0 || error < best){
				best = error;
				*m1 = M1;
				*m2 = M2;
				*n = N;
				*p1 = P1;
			}
		}
	}
	return best;
}

static int
getcref(Igfx *igfx, int x)
{
	Dpll *dpll;

	dpll = &igfx->dpll[x];
	if(igfx->type == TypeG45){
		if(((dpll->ctrl.v >> 13) & 3) == 3)
			return 100*MHz;
		return 96*MHz;
	}
	return 120*MHz;
}

static int
initdpll(Igfx *igfx, int x, int freq, int port)
{
	int cref, m1, m2, n, p1, p2;
	Dpll *dpll;

	switch(igfx->type){
	case TypeG45:
		/* PLL Reference Input Select */
		dpll = igfx->pipe[x].dpll;
		dpll->ctrl.v &= ~(3<<13);
		dpll->ctrl.v |= (port == PortLCD ? 3 : 0) << 13;
		break;
	case TypeSNB:
	case TypeIVB:
		/* transcoder dpll enable */
		igfx->dpllsel.v |= 8<<(x*4);
		/* program rawclock to 125MHz */
		igfx->rawclkfreq.v = 125;

		igfx->drefctl.v &= ~(3<<13);
		igfx->drefctl.v &= ~(3<<11);
		igfx->drefctl.v &= ~(3<<9);
		igfx->drefctl.v &= ~(3<<7);
		igfx->drefctl.v &= ~3;

		if(port == PortLCD){
			igfx->drefctl.v |= 2<<11;
			igfx->drefctl.v |= 1;
		} else {
			igfx->drefctl.v |= 2<<9;
		}

		/*
		 * PLL Reference Input Select:
		 * 000	DREFCLK		(default is 120 MHz) for DAC/HDMI/DVI/DP
		 * 001	Super SSC	120MHz super-spread clock
		 * 011	SSC		Spread spectrum input clock (120MHz default) for LVDS/DP
		 */
		dpll = igfx->pipe[x].fdi->dpll;
		dpll->ctrl.v &= ~(7<<13);
		dpll->ctrl.v |= (port == PortLCD ? 3 : 0) << 13;
		break;
	default:
		return -1;
	}
	cref = getcref(igfx, x);

	/* Dpll Mode select */
	dpll->ctrl.v &= ~(3<<26);
	dpll->ctrl.v |= (port == PortLCD ? 2 : 1)<<26;

	/* P2 Clock Divide */
	dpll->ctrl.v &= ~(3<<24);
	if(port == PortLCD){
		p2 = 14;
		if(genpll(freq, cref, p2, &m1, &m2, &n, &p1) < 0)
			return -1;
	} else {
		/* generate 270MHz clock for displayport */
		if(port >= PortDPA)
			freq = 270*MHz;

		p2 = 10;
		if(freq > 270*MHz){
			p2 >>= 1;
			dpll->ctrl.v |= (1<<24);
		}
		if(genpll(freq, cref, p2, &m1, &m2, &n, &p1) < 0)
			return -1;
	}

	/* Dpll VCO Enable */
	dpll->ctrl.v |= (1<<31);

	/* Dpll Serial DVO High Speed IO clock Enable */
	if(port >= PortDPA)
		dpll->ctrl.v |= (1<<30);
	else
		dpll->ctrl.v &= ~(1<<30);

	/* VGA Mode Disable */
	dpll->ctrl.v |= (1<<28);

	dpll->fp0.v &= ~(0x3f<<16);
	dpll->fp0.v |= n << 16;
	dpll->fp0.v &= ~(0x3f<<8);
	dpll->fp0.v |= m1 << 8;
	dpll->fp0.v &= ~(0x3f<<0);
	dpll->fp0.v |= m2 << 0;

	/* FP0 P1 Post Divisor */
	dpll->ctrl.v &= ~0xFF0000;
	dpll->ctrl.v |=  0x010000<<(p1-1);

	/* FP1 P1 Post divisor */
	if(igfx->pci->did != 0x27a2){
		dpll->ctrl.v &= ~0xFF;
		dpll->ctrl.v |=  0x01<<(p1-1);
		dpll->fp1.v = dpll->fp0.v;
	}

	return 0;
}

static void
initdatalinkmn(Trans *t, int freq, int lsclk, int lanes, int tu, int bpp)
{
	uvlong m, n;

	n = 0x800000;
	m = (n * ((freq * bpp)/8)) / (lsclk * lanes);

	t->dm[0].v = (tu-1)<<25 | m;
	t->dn[0].v = n;

	n = 0x80000;
	m = (n * freq) / lsclk;

	t->lm[0].v = m;
	t->ln[0].v = n;

	t->dm[1].v = t->dm[0].v;
	t->dn[1].v = t->dn[0].v;
	t->lm[1].v = t->lm[0].v;
	t->ln[1].v = t->ln[0].v;
}

static void
inittrans(Trans *t, Mode *m)
{
	/* clear all but 27:28 frame start delay (initialized by bios) */
	t->conf.v &= 3<<27;

	/* trans/pipe enable */
	t->conf.v |= 1<<31;

	/* trans/pipe timing */
	t->ht.v = (m->ht - 1)<<16 | (m->x - 1);
	t->hs.v = (m->ehs - 1)<<16 | (m->shs - 1);
	t->vt.v = (m->vt - 1)<<16 | (m->y - 1);
	t->vs.v = (m->vre - 1)<<16 | (m->vrs - 1);

	t->hb.v = t->ht.v;
	t->vb.v = t->vt.v;

	t->vss.v = 0;
}

static void
initpipe(Pipe *p, Mode *m)
{
	static uchar bpctab[4] = { 8, 10, 6, 12 };
	int i, tu, bpc, lanes;
	Fdi *fdi;

	/* source image size */
	p->src.v = (m->x - 1)<<16 | (m->y - 1);

	if(p->pfit != nil){
		/* panel fitter off */
		p->pfit->ctrl.v &= ~(1<<31);
		p->pfit->winpos.v = 0;
		p->pfit->winsize.v = 0;
	}

	/* enable and set monitor timings for cpu pipe */
	inittrans(p, m);

	/* default for displayport */
	tu = 64;
	bpc = 6;	/* why */
	lanes = 1;

	fdi = p->fdi;
	if(fdi->rxctl.a != 0){
		/* enable and set monitor timings for transcoder */
		inittrans(fdi, m);

		/* tx port width selection */
		fdi->txctl.v &= ~(7<<19);
		fdi->txctl.v |= (lanes-1)<<19;

		/* rx port width selection */
		fdi->rxctl.v &= ~(7<<19);
		fdi->rxctl.v |= (lanes-1)<<19;
		/* bits per color for transcoder */
		for(i=0; i<nelem(bpctab); i++){
			if(bpctab[i] == bpc){
				fdi->rxctl.v &= ~(7<<16);
				fdi->rxctl.v |= i<<16;
				fdi->dpctl.v &= ~(7<<9);
				fdi->dpctl.v |= i<<9;
				break;
			}
		}

		/* enhanced framing on */
		fdi->rxctl.v |= (1<<6);
		fdi->txctl.v |= (1<<18);

		/* tusize 1 and 2 */
		fdi->rxtu[0].v = (tu-1)<<25;
		fdi->rxtu[1].v = (tu-1)<<25;
		initdatalinkmn(fdi, m->frequency, 270*MHz, lanes, tu, 3*bpc);
	}

	/* bits per color for cpu pipe */
	for(i=0; i<nelem(bpctab); i++){
		if(bpctab[i] == bpc){
			p->conf.v &= ~(7<<5);
			p->conf.v |= i<<5;
			break;
		}
	}
	initdatalinkmn(p, m->frequency, 270*MHz, lanes, tu, 3*bpc);
}

static void
init(Vga* vga, Ctlr* ctlr)
{
	int x, port;
	char *val;
	Igfx *igfx;
	Pipe *p;
	Mode *m;
	Reg *r;

	m = vga->mode;
	if(m->z != 32)
		error("%s: unsupported color depth %d\n", ctlr->name, m->z);

	igfx = vga->private;

	/* disable vga */
	igfx->vgacntrl.v |= (1<<31);

	/* disable all pipes and ports */
	igfx->ppcontrol.v &= 0xFFFF;
	igfx->ppcontrol.v &= ~5;
	igfx->lvds.v &= ~(1<<31);
	igfx->adpa.v &= ~(1<<31);
	if(igfx->type == TypeG45)
		igfx->adpa.v |= (3<<10);	/* Monitor DPMS: off */
	for(x=0; x<nelem(igfx->dp); x++)
		igfx->dp[x].ctl.v &= ~(1<<31);
	for(x=0; x<nelem(igfx->hdmi); x++)
		igfx->hdmi[x].ctl.v &= ~(1<<31);
	for(x=0; x<igfx->npipe; x++){
		/* disable displayport transcoders */
		igfx->pipe[x].dpctl.v &= ~(1<<31);
		igfx->pipe[x].dpctl.v |= (3<<29);
		igfx->pipe[x].fdi->dpctl.v &= ~(1<<31);
		igfx->pipe[x].fdi->dpctl.v |= (3<<29);
		/* disable transcoder/pipe */
		igfx->pipe[x].conf.v &= ~(1<<31);
	}

	if((val = dbattr(m->attr, "display")) != nil)
		port = atoi(val)-1;
	else if(dbattr(m->attr, "lcd") != nil)
		port = PortLCD;
	else
		port = PortVGA;

	trace("%s: display #%d\n", ctlr->name, port+1);

	switch(port){
	default:
	Badport:
		error("%s: display #%d not supported\n", ctlr->name, port+1);
		break;

	case PortVGA:
		if(igfx->npipe > 2)
			x = (igfx->adpa.v >> 29) & 3;
		else
			x = (igfx->adpa.v >> 30) & 1;
		igfx->adpa.v |= (1<<31);
		if(igfx->type == TypeG45){
			igfx->adpa.v &= ~(3<<10);	/* Monitor DPMS: on */

			igfx->adpa.v &= ~(1<<15);	/* ADPA Polarity Select */
			igfx->adpa.v |= 3<<3;
			if(m->hsync == '-')
				igfx->adpa.v ^= 1<<3;
			if(m->vsync == '-')
				igfx->adpa.v ^= 1<<4;
		}
		break;

	case PortLCD:
		if(igfx->npipe > 2)
			x = (igfx->lvds.v >> 29) & 3;
		else
			x = (igfx->lvds.v >> 30) & 1;
		igfx->lvds.v |= (1<<31);
		igfx->ppcontrol.v |= 5;

		if(igfx->type == TypeG45){
			igfx->lvds.v &= ~(1<<24);	/* data format select 18/24bpc */

			igfx->lvds.v &= ~(3<<20);
			if(m->hsync == '-')
				igfx->lvds.v ^= 1<<20;
			if(m->vsync == '-')
				igfx->lvds.v ^= 1<<21;

			igfx->lvds.v |= (1<<15);	/* border enable */
		}
		break;

	case PortDPA:
	case PortDPB:
	case PortDPC:
	case PortDPD:
		r = &igfx->dp[port - PortDPA].ctl;
		if(r->a == 0)
			goto Badport;
		/* port enable */
		r->v |= 1<<31;
		/* port width selection: x1 Mode */
		r->v &= ~(7<<19);

		/* port reversal: off */
		r->v &= ~(1<<15);
		/* reserved MBZ */
		r->v &= ~(15<<11);
		/* use PIPE_A for displayport */
		x = 0;
		/* displayport transcoder */
		if(port == PortDPA){
			/* pll frequency: 270mhz */
			r->v &= ~(3<<16);
			/* pll enable */
			r->v |= 1<<14;
			/* pipe select */
			r->v &= ~(3<<29);
			r->v |= x<<29;
		} else if(igfx->pipe[x].fdi->dpctl.a != 0){
			/* audio output: disable */
			r->v &= ~(1<<6);
			/* transcoder displayport configuration */
			r = &igfx->pipe[x].fdi->dpctl;
			/* transcoder enable */
			r->v |= 1<<31;
			/* port select: B,C,D */
			r->v &= ~(3<<29);
			r->v |= (port-PortDPB)<<29;
		}
		/* sync polarity */
		r->v |= 3<<3;
		if(m->hsync == '-')
			r->v ^= 1<<3;
		if(m->vsync == '-')
			r->v ^= 1<<4;
		break;
	}
	p = &igfx->pipe[x];

	/* plane enable, 32bpp */
	p->dsp->cntr.v = (1<<31) | (6<<26);
	if(igfx->type == TypeG45)
		p->dsp->cntr.v |= x<<24;	/* pipe assign */

	/* stride must be 64 byte aligned */
	p->dsp->stride.v = m->x * (m->z / 8);
	p->dsp->stride.v += 63;
	p->dsp->stride.v &= ~63;

	/* virtual width in pixels */
	vga->virtx = p->dsp->stride.v / (m->z / 8);

	/* plane position and size */
	p->dsp->pos.v = 0;
	p->dsp->size.v = (m->y - 1)<<16 | (m->x - 1);	/* sic */

	p->dsp->surf.v = 0;
	p->dsp->linoff.v = 0;
	p->dsp->tileoff.v = 0;

	/* cursor plane off */
	p->cur->cntr.v = 0;
	if(igfx->type == TypeG45)
		p->cur->cntr.v |= x<<28;	/* pipe assign */
	p->cur->pos.v = 0;
	p->cur->base.v = 0;

	if(initdpll(igfx, x, m->frequency, port) < 0)
		error("%s: frequency %d out of range\n", ctlr->name, m->frequency);

	initpipe(p, m);

	ctlr->flag |= Finit;
}

static void
loadtrans(Igfx *igfx, Trans *t)
{
	int i;

	/* program trans/pipe timings */
	loadreg(igfx, t->ht);
	loadreg(igfx, t->hb);
	loadreg(igfx, t->hs);
	loadreg(igfx, t->vt);
	loadreg(igfx, t->vb);
	loadreg(igfx, t->vs);
	loadreg(igfx, t->vss);

	loadreg(igfx, t->dm[0]);
	loadreg(igfx, t->dn[0]);
	loadreg(igfx, t->lm[0]);
	loadreg(igfx, t->ln[0]);
	loadreg(igfx, t->dm[1]);
	loadreg(igfx, t->dn[1]);
	loadreg(igfx, t->lm[1]);
	loadreg(igfx, t->ln[1]);

	if(t->dpll != nil){
		/* program dpll */
		t->dpll->ctrl.v &= ~(1<<31);
		loadreg(igfx, t->dpll->ctrl);
		loadreg(igfx, t->dpll->fp0);
		loadreg(igfx, t->dpll->fp1);

		/* enable dpll */
		t->dpll->ctrl.v |= (1<<31);
		loadreg(igfx, t->dpll->ctrl);
		sleep(10);
	}

	/* workaround: set timing override bit */
	csr(igfx, t->chicken.a, 0, 1<<31);

	/* enable displayport transcoder */
	loadreg(igfx, t->dpctl);

	/* enable trans/pipe */
	t->conf.v |= (1<<31);
	t->conf.v &= ~(1<<30);
	loadreg(igfx, t->conf);
	for(i=0; i<100; i++){
		sleep(10);
		if(rr(igfx, t->conf.a) & (1<<30))
			break;
	}
}

static void
enablepipe(Igfx *igfx, int x)
{
	int i;
	Pipe *p;

	p = &igfx->pipe[x];
	if((p->conf.v & (1<<31)) == 0)
		return;	/* pipe is disabled, done */

	if(p->fdi->rxctl.a != 0){
		p->fdi->rxctl.v &= ~(1<<31);
		p->fdi->rxctl.v &= ~(1<<4);	/* rawclk */
		p->fdi->rxctl.v |= (1<<13);	/* enable pll */
		loadreg(igfx, p->fdi->rxctl);
		sleep(5);
		p->fdi->rxctl.v |= (1<<4);	/* pcdclk */
		loadreg(igfx, p->fdi->rxctl);
		sleep(5);
		p->fdi->txctl.v &= ~(7<<8 | 1);	/* clear auto training bits */
		p->fdi->txctl.v &= ~(1<<31);
		p->fdi->txctl.v |= (1<<14);	/* enable pll */
		loadreg(igfx, p->fdi->txctl);
		sleep(5);
	}

	/* image size (vga needs to be off) */
	loadreg(igfx, p->src);

	/* set panel fitter as needed */
	if(p->pfit != nil){
		loadreg(igfx, p->pfit->ctrl);
		loadreg(igfx, p->pfit->winpos);
		loadreg(igfx, p->pfit->winsize);	/* arm */
	}

	/* keep planes disabled while pipe comes up */
	if(igfx->type == TypeG45)
		p->conf.v |= 3<<18;

	/* enable cpu pipe */
	loadtrans(igfx, p);

	/* program plane */
	loadreg(igfx, p->dsp->cntr);
	loadreg(igfx, p->dsp->linoff);
	loadreg(igfx, p->dsp->stride);
	loadreg(igfx, p->dsp->tileoff);
	loadreg(igfx, p->dsp->size);
	loadreg(igfx, p->dsp->pos);
	loadreg(igfx, p->dsp->surf);	/* arm */

	/* program cursor */
	loadreg(igfx, p->cur->cntr);
	loadreg(igfx, p->cur->pos);
	loadreg(igfx, p->cur->base);	/* arm */

	/* enable planes */
	if(igfx->type == TypeG45) {
		p->conf.v &= ~(3<<18);
		loadreg(igfx, p->conf);
	}

	if(p->fdi->rxctl.a != 0){
		/* enable fdi */
		loadreg(igfx, p->fdi->rxtu[1]);
		loadreg(igfx, p->fdi->rxtu[0]);
		loadreg(igfx, p->fdi->rxmisc);

		p->fdi->rxctl.v &= ~(3<<8);	/* link train pattern 00 */
		p->fdi->rxctl.v |= 1<<10;	/* auto train enable */
		p->fdi->rxctl.v |= 1<<31;	/* enable */
		loadreg(igfx, p->fdi->rxctl);

		p->fdi->txctl.v &= ~(3<<8);	/* link train pattern 00 */
		p->fdi->txctl.v |= 1<<10;	/* auto train enable */
		p->fdi->txctl.v |= 1<<31;	/* enable */
		loadreg(igfx, p->fdi->txctl);

		/* wait for link training done */
		for(i=0; i<200; i++){
			sleep(5);
			if(rr(igfx, p->fdi->txctl.a) & 2)
				break;
		}
	}

	/* enable the transcoder */
	loadtrans(igfx, p->fdi);
}

static void
disabletrans(Igfx *igfx, Trans *t)
{
	int i;

	/* disable displayport transcoder */
	csr(igfx, t->dpctl.a, 1<<31, 3<<29);

	/* disable transcoder / pipe */
	csr(igfx, t->conf.a, 1<<31, 0);
	for(i=0; i<100; i++){
		sleep(10);
		if((rr(igfx, t->conf.a) & (1<<30)) == 0)
			break;
	}
	/* workaround: clear timing override bit */
	csr(igfx, t->chicken.a, 1<<31, 0);

	/* disable dpll  */
	if(t->dpll != nil)
		csr(igfx, t->dpll->ctrl.a, 1<<31, 0);
}

static void
disablepipe(Igfx *igfx, int x)
{
	Pipe *p;

	p = &igfx->pipe[x];

	/* planes off */
	csr(igfx, p->dsp->cntr.a, 1<<31, 0);
	wr(igfx, p->dsp->surf.a, 0);	/* arm */
	/* cursor off */
	csr(igfx, p->cur->cntr.a, 1<<5 | 7, 0);
	wr(igfx, p->cur->base.a, 0);	/* arm */

	/* display/overlay/cursor planes off */
	if(igfx->type == TypeG45)
		csr(igfx, p->conf.a, 0, 3<<18);

	/* disable cpu pipe */
	disabletrans(igfx, p);

	/* disable panel fitter */
	if(p->pfit != nil)
		csr(igfx, p->pfit->ctrl.a, 1<<31, 0);

	/* disable fdi transmitter and receiver */
	csr(igfx, p->fdi->txctl.a, 1<<31 | 1<<10, 0);
	csr(igfx, p->fdi->rxctl.a, 1<<31 | 1<<10, 0);

	/* disable pch transcoder */
	disabletrans(igfx, p->fdi);

	/* disable pch dpll enable bit */
	csr(igfx, igfx->dpllsel.a, 8<<(x*4), 0);
}

static void
load(Vga* vga, Ctlr* ctlr)
{
	Igfx *igfx;
	int x;

	igfx = vga->private;

	/* power lcd off */
	if(igfx->ppcontrol.a != 0){
		csr(igfx, igfx->ppcontrol.a, 0xFFFF0005, 0xABCD0000);
		for(x=0; x<5000; x++){
			sleep(10);
			if((rr(igfx, igfx->ppstatus.a) & (1<<31)) == 0)
				break;
		}
	}

	/* disable ports */
	csr(igfx, igfx->sdvob.a, (1<<29) | (1<<31), 0);
	csr(igfx, igfx->sdvoc.a, (1<<29) | (1<<31), 0);
	csr(igfx, igfx->adpa.a, 1<<31, 0);
	csr(igfx, igfx->lvds.a, 1<<31, 0);
	for(x = 0; x < nelem(igfx->dp); x++)
		csr(igfx, igfx->dp[x].ctl.a, 1<<31, 0);
	for(x = 0; x < nelem(igfx->hdmi); x++)
		csr(igfx, igfx->hdmi[x].ctl.a, 1<<31, 0);

	/* disable vga plane */
	csr(igfx, igfx->vgacntrl.a, 0, 1<<31);

	/* turn off all pipes */
	for(x = 0; x < igfx->npipe; x++)
		disablepipe(igfx, x);

	if(igfx->type == TypeG45){
		/* toggle dsp a on and off (from enable sequence) */
		csr(igfx, igfx->pipe[0].conf.a, 3<<18, 0);
		csr(igfx, igfx->pipe[0].dsp->cntr.a, 0, 1<<31);
		wr(igfx, igfx->pipe[0].dsp->surf.a, 0);		/* arm */
		csr(igfx, igfx->pipe[0].dsp->cntr.a, 1<<31, 0);
		wr(igfx, igfx->pipe[0].dsp->surf.a, 0);		/* arm */
		csr(igfx, igfx->pipe[0].conf.a, 0, 3<<18);
	}

	/* program new clock sources */
	loadreg(igfx, igfx->rawclkfreq);
	loadreg(igfx, igfx->drefctl);
	sleep(10);

	/* set lvds before enabling dpll */
	loadreg(igfx, igfx->lvds);

	/* new dpll setting */
	loadreg(igfx, igfx->dpllsel);

	/* program all pipes */
	for(x = 0; x < igfx->npipe; x++)
		enablepipe(igfx, x);

	/* program vga plane */
	loadreg(igfx, igfx->vgacntrl);

	/* program ports */
	loadreg(igfx, igfx->adpa);
	loadreg(igfx, igfx->sdvob);
	loadreg(igfx, igfx->sdvoc);
	for(x = 0; x < nelem(igfx->dp); x++){
		if(enabledp(igfx, &igfx->dp[x]) < 0)
			ctlr->flag |= Ferror;
	}

	/* program lcd power */
	loadreg(igfx, igfx->ppcontrol);

	ctlr->flag |= Fload;
}

static void
dumpreg(char *name, char *item, Reg r)
{
	if(r.a == 0)
		return;

	printitem(name, item);
	Bprint(&stdout, " [%.8ux] = %.8ux\n", r.a, r.v);
}

static void
dumphex(char *name, char *item, uchar *data, int len)
{
	int i;

	for(i=0; i<len; i++){
		if((i & 15) == 0){
			if(i > 0)
				Bprint(&stdout, "\n");
			printitem(name, item);
			Bprint(&stdout, " [%.2x] =", i);
		}
		Bprint(&stdout, " %.2X", data[i]);
	}
	Bprint(&stdout, "\n");
}

static void
dumptiming(char *name, Trans *t)
{
	int tu, m, n;

	if(t->dm[0].a != 0 && t->dm[0].v != 0){
		tu = 1+((t->dm[0].v >> 25) & 0x3f);
		printitem(name, "dm1 tu");
		Bprint(&stdout, " %d\n", tu);

		m = t->dm[0].v & 0xffffff;
		n = t->dn[0].v;
		if(n > 0){
			printitem(name, "dm1/dn1");
			Bprint(&stdout, " %f\n", (double)m / (double)n);
		}

		m = t->lm[0].v;
		n = t->ln[0].v;
		if(n > 0){
			printitem(name, "lm1/ln1");
			Bprint(&stdout, " %f\n", (double)m / (double)n);
		}
	}
}

static void
dumptrans(char *name, Trans *t)
{
	dumpreg(name, "conf", t->conf);

	dumpreg(name, "dm1", t->dm[0]);
	dumpreg(name, "dn1", t->dn[0]);
	dumpreg(name, "lm1", t->lm[0]);
	dumpreg(name, "ln1", t->ln[0]);
	dumpreg(name, "dm2", t->dm[1]);
	dumpreg(name, "dn2", t->dn[1]);
	dumpreg(name, "lm2", t->lm[1]);
	dumpreg(name, "ln2", t->ln[1]);

	dumptiming(name, t);

	dumpreg(name, "ht", t->ht);
	dumpreg(name, "hb", t->hb);
	dumpreg(name, "hs", t->hs);

	dumpreg(name, "vt", t->vt);
	dumpreg(name, "vb", t->vb);
	dumpreg(name, "vs", t->vs);
	dumpreg(name, "vss", t->vss);

	dumpreg(name, "dpctl", t->dpctl);
}

static void
dumppipe(Igfx *igfx, int x)
{
	char name[32];
	Pipe *p;

	p = &igfx->pipe[x];

	snprint(name, sizeof(name), "%s pipe %c", igfx->ctlr->name, 'a'+x);
	dumpreg(name, "src", p->src);
	dumptrans(name, p);

	snprint(name, sizeof(name), "%s fdi %c", igfx->ctlr->name, 'a'+x);
	dumptrans(name, p->fdi);
	dumpreg(name, "txctl", p->fdi->txctl);
	dumpreg(name, "rxctl", p->fdi->rxctl);
	dumpreg(name, "rxmisc", p->fdi->rxmisc);
	dumpreg(name, "rxtu1", p->fdi->rxtu[0]);
	dumpreg(name, "rxtu2", p->fdi->rxtu[1]);

	snprint(name, sizeof(name), "%s dsp %c", igfx->ctlr->name, 'a'+x);
	dumpreg(name, "cntr", p->dsp->cntr);
	dumpreg(name, "linoff", p->dsp->linoff);
	dumpreg(name, "stride", p->dsp->stride);
	dumpreg(name, "surf", p->dsp->surf);
	dumpreg(name, "tileoff", p->dsp->tileoff);
	dumpreg(name, "pos", p->dsp->pos);
	dumpreg(name, "size", p->dsp->size);

	snprint(name, sizeof(name), "%s cur %c", igfx->ctlr->name, 'a'+x);
	dumpreg(name, "cntr", p->cur->cntr);
	dumpreg(name, "base", p->cur->base);
	dumpreg(name, "pos", p->cur->pos);
}

static void
dumpdpll(Igfx *igfx, int x)
{
	int cref, m1, m2, n, p1, p2;
	uvlong freq;
	char name[32];
	Dpll *dpll;
	u32int m;

	dpll = &igfx->dpll[x];
	snprint(name, sizeof(name), "%s dpll %c", igfx->ctlr->name, 'a'+x);

	dumpreg(name, "ctrl", dpll->ctrl);
	dumpreg(name, "fp0", dpll->fp0);
	dumpreg(name, "fp1", dpll->fp1);

	p2 = ((dpll->ctrl.v >> 13) & 3) == 3 ? 14 : 10;
	if(((dpll->ctrl.v >> 24) & 3) == 1)
		p2 >>= 1;
	m = (dpll->ctrl.v >> 16) & 0xFF;
	for(p1 = 1; p1 <= 8; p1++)
		if(m & (1<<(p1-1)))
			break;
	printitem(name, "ctrl p1");
	Bprint(&stdout, " %d\n", p1);
	printitem(name, "ctrl p2");
	Bprint(&stdout, " %d\n", p2);

	n = (dpll->fp0.v >> 16) & 0x3f;
	m1 = (dpll->fp0.v >> 8) & 0x3f;
	m2 = (dpll->fp0.v >> 0) & 0x3f;

	cref = getcref(igfx, x);
	freq = ((uvlong)cref * (5*(m1+2) + (m2+2)) / (n+2)) / (p1 * p2);

	printitem(name, "fp0 m1");
	Bprint(&stdout, " %d\n", m1);
	printitem(name, "fp0 m2");
	Bprint(&stdout, " %d\n", m2);
	printitem(name, "fp0 n");
	Bprint(&stdout, " %d\n", n);

	printitem(name, "cref");
	Bprint(&stdout, " %d\n", cref);
	printitem(name, "fp0 freq");
	Bprint(&stdout, " %lld\n", freq);
}

static void
dumpgtt(Igfx *igfx)
{
	u32int *gtt, ngtt, stolen, i;

	if(igfx->mmio == nil)
		return;
	stolen = 0;
	switch(igfx->type){
	case TypeG45:
		switch((pcicfgr16(igfx->pci, 0x52)>>4)&7){
		case 1: stolen =  1*1024*1024; break;
		case 2: stolen =  4*1024*1024; break;
		case 3: stolen =  8*1024*1024; break;
		case 4: stolen = 16*1024*1024; break;
		case 5: stolen = 32*1024*1024; break;
		case 6: stolen = 48*1024*1024; break;
		case 7: stolen = 64*1024*1024; break;
		}
		break;
	}
	Bprint(&stdout, "%s stolen\t%ud\n", igfx->ctlr->name, stolen);
	ngtt = stolen >> 12;
	if(ngtt == 0)
		return;
	gtt = (u32int*)((uchar*)igfx->mmio + igfx->pci->mem[0].size/2);
	for(i=0; i<ngtt; i++)
		Bprint(&stdout, "%s gtt\t[%.8ux] = %.8ux\n", igfx->ctlr->name, i<<12, gtt[i]);
}

static void
dump(Vga* vga, Ctlr* ctlr)
{
	char name[32];
	Igfx *igfx;
	int x;

	if((igfx = vga->private) == nil)
		return;

	for(x=0; x<igfx->npipe; x++)
		dumppipe(igfx, x);

	for(x=0; x<nelem(igfx->dpll); x++)
		dumpdpll(igfx, x);

	dumpreg(ctlr->name, "dpllsel", igfx->dpllsel);

	dumpreg(ctlr->name, "drefctl", igfx->drefctl);
	dumpreg(ctlr->name, "rawclkfreq", igfx->rawclkfreq);
	dumpreg(ctlr->name, "ssc4params", igfx->ssc4params);

	for(x=0; x<nelem(igfx->dp); x++){
		if(igfx->dp[x].ctl.a == 0)
			continue;
		snprint(name, sizeof(name), "%s dp %c", ctlr->name, 'a'+x);
		dumpreg(name, "ctl", igfx->dp[x].ctl);
		dumphex(name, "dpcd", igfx->dp[x].dpcd, sizeof(igfx->dp[x].dpcd));
	}
	for(x=0; x<nelem(igfx->hdmi); x++){
		snprint(name, sizeof(name), "%s hdmi %c", ctlr->name, 'a'+x);
		dumpreg(name, "ctl", igfx->hdmi[x].ctl);
	}

	for(x=0; x<nelem(igfx->pfit); x++){
		snprint(name, sizeof(name), "%s pfit %c", ctlr->name, 'a'+x);
		dumpreg(name, "ctrl", igfx->pfit[x].ctrl);
		dumpreg(name, "winpos", igfx->pfit[x].winpos);
		dumpreg(name, "winsize", igfx->pfit[x].winsize);
		dumpreg(name, "pwrgate", igfx->pfit[x].pwrgate);
	}

	dumpreg(ctlr->name, "ppcontrol", igfx->ppcontrol);
	dumpreg(ctlr->name, "ppstatus", igfx->ppstatus);

	dumpreg(ctlr->name, "adpa", igfx->adpa);
	dumpreg(ctlr->name, "lvds", igfx->lvds);
	dumpreg(ctlr->name, "sdvob", igfx->sdvob);
	dumpreg(ctlr->name, "sdvoc", igfx->sdvoc);

	dumpreg(ctlr->name, "vgacntrl", igfx->vgacntrl);
	dumpgtt(igfx);
}

static int
dpauxio(Igfx *igfx, Dp *dp, uchar buf[20], int len)
{
	int t, i;
	u32int w;

	if(dp->auxctl.a == 0){
		werrstr("not present");
		return -1;
	}

	t = 0;
	while(rr(igfx, dp->auxctl.a) & (1<<31)){
		if(++t >= 10){
			werrstr("busy");
			return -1;
		}
		sleep(5);
	}

	/* clear sticky bits */
	wr(igfx, dp->auxctl.a, (1<<28) | (1<<25) | (1<<30));

	for(i=0; i<nelem(dp->auxdat); i++){
		w  = buf[i*4+0]<<24;
		w |= buf[i*4+1]<<16;
		w |= buf[i*4+2]<<8;
		w |= buf[i*4+3];
		wr(igfx, dp->auxdat[i].a, w);
	}

	/* 2X Bit Clock divider */
	w = ((dp == &igfx->dp[0]) ? igfx->cdclk : (igfx->rawclkfreq.v & 0x3ff)) >> 1;
	if(w < 1 || w > 0x3fd){
		werrstr("bad clock");
		return -1;
	}

	/* hack: slow down a bit */
	w += 2;

	w |= 1<<31;	/* SendBusy */
	w |= 1<<29;	/* interrupt disabled */
	w |= 3<<26;	/* timeout 1600µs */
	w |= len<<20;	/* send bytes */
	w |= 5<<16;	/* precharge time (5*2 = 10µs) */
	wr(igfx, dp->auxctl.a, w);

	t = 0;
	for(;;){
		w = rr(igfx, dp->auxctl.a);
		if((w & (1<<30)) != 0)
			break;
		if(++t >= 10){
			werrstr("busy");
			return -1;
		}
		sleep(5);
	}
	if(w & (1<<28)){
		werrstr("receive timeout");
		return -1;
	}
	if(w & (1<<25)){
		werrstr("receive error");
		return -1;
	}

	len = (w >> 20) & 0x1f;
	for(i=0; i<nelem(dp->auxdat); i++){
		w = rr(igfx, dp->auxdat[i].a);
		buf[i*4+0] = w>>24;
		buf[i*4+1] = w>>16;
		buf[i*4+2] = w>>8;
		buf[i*4+3] = w;
	}

	return len;
}

enum {
	CmdNative	= 8,
	CmdMot		= 4,
	CmdRead		= 1,
	CmdWrite	= 0,
};

static int
dpauxtra(Igfx *igfx, Dp *dp, int cmd, int addr, uchar *data, int len)
{
	uchar buf[20];
	int r;

	assert(len <= 16);

	memset(buf, 0, sizeof(buf));
	buf[0] = (cmd << 4) | ((addr >> 16) & 0xF);
	buf[1] = addr >> 8;
	buf[2] = addr;
	buf[3] = len-1;
	r = 3;	
	if(data != nil && len > 0){
		if((cmd & CmdRead) == 0)
			memmove(buf+4, data, len);
		r = 4 + len;
	}
	if((r = dpauxio(igfx, dp, buf, r)) < 0){
		trace("%s: dpauxio: dp %c, cmd %x, addr %x, len %d: %r\n",
			igfx->ctlr->name, 'a'+(int)(dp - &igfx->dp[0]), cmd, addr, len);
		return -1;
	}
	if(r == 0 || data == nil || len == 0)
		return 0;
	if((cmd & CmdRead) != 0){
		if(--r < len)
			len = r;
		memmove(data, buf+1, len);
	}
	return len;
}

static int
rdpaux(Igfx *igfx, Dp *dp, int addr)
{
	uchar buf[1];
	if(dpauxtra(igfx, dp, CmdNative|CmdRead, addr, buf, 1) != 1)
		return -1;
	return buf[0];
}
static int
wdpaux(Igfx *igfx, Dp *dp, int addr, uchar val)
{
	if(dpauxtra(igfx, dp, CmdNative|CmdWrite, addr, &val, 1) != 1)
		return -1;
	return 0;
}

static int
enabledp(Igfx *igfx, Dp *dp)
{
	int try, r;

	if(dp->ctl.a == 0)
		return 0;
	if((dp->ctl.v & (1<<31)) == 0)
		return 0;

	/* Link configuration */
	wdpaux(igfx, dp, 0x100, 0x0A);	/* 270MHz */
	wdpaux(igfx, dp, 0x101, 0x01);	/* one lane */

	r = 0;

	/* Link training pattern 1 */
	dp->ctl.v &= ~(7<<8);
	loadreg(igfx, dp->ctl);
	for(try = 0;;try++){
		if(try > 5)
			goto Fail;
		/* Link training pattern 1 */
		wdpaux(igfx, dp, 0x102, 0x01);
		sleep(100);
		if((r = rdpaux(igfx, dp, 0x202)) < 0)
			goto Fail;
		if(r & 1)	/* LANE0_CR_DONE */
			break;
	}
	trace("pattern1 finished: %x\n", r);

	/* Link training pattern 2 */
	dp->ctl.v &= ~(7<<8);
	dp->ctl.v |= 1<<8;
	loadreg(igfx, dp->ctl);
	for(try = 0;;try++){
		if(try > 5)
			goto Fail;
		/* Link training pattern 2 */
		wdpaux(igfx, dp, 0x102, 0x02);
		sleep(100);
		if((r = rdpaux(igfx, dp, 0x202)) < 0)
			goto Fail;
		if((r & 7) == 7)
			break;
	}
	trace("pattern2 finished: %x\n", r);

	/* stop training */
	dp->ctl.v &= ~(7<<8);
	dp->ctl.v |= 3<<8;
	loadreg(igfx, dp->ctl);
	wdpaux(igfx, dp, 0x102, 0x00);
	return 1;

Fail:
	trace("training failed: %x\n", r);

	/* disable port */
	dp->ctl.v &= ~(1<<31);
	loadreg(igfx, dp->ctl);
	wdpaux(igfx, dp, 0x102, 0x00);
	return -1;
}

static uchar*
edidshift(uchar buf[256])
{
	uchar tmp[256];
	int i;

	/* shift if necessary so edid block is at the start */
	for(i=0; i<256-8; i++){
		if(buf[i+0] == 0x00 && buf[i+1] == 0xFF && buf[i+2] == 0xFF && buf[i+3] == 0xFF
		&& buf[i+4] == 0xFF && buf[i+5] == 0xFF && buf[i+6] == 0xFF && buf[i+7] == 0x00){
			memmove(tmp, buf, i);
			memmove(buf, buf + i, 256 - i);
			memmove(buf + (256 - i), tmp, i);
			break;
		}
	}
	return buf;
}

static Edid*
snarfdpedid(Igfx *igfx, Dp *dp, int addr)
{
	uchar buf[256];
	int i;

	for(i=0; i<sizeof(dp->dpcd); i+=16)
		if(dpauxtra(igfx, dp, CmdNative|CmdRead, i, dp->dpcd+i, 16) != 16)
			return nil;

	if(dp->dpcd[0] == 0)	/* nothing there, don't try to get edid */
		return nil;

	if(dpauxtra(igfx, dp, CmdMot|CmdRead, addr, nil, 0) < 0)
		return nil;

	for(i=0; i<sizeof(buf); i+=16){
		if(dpauxtra(igfx, dp, CmdMot|CmdRead, addr, buf+i, 16) == 16)
			continue;
		if(dpauxtra(igfx, dp, CmdMot|CmdRead, addr, buf+i, 16) == 16)
			continue;
		if(dpauxtra(igfx, dp, CmdMot|CmdRead, addr, buf+i, 16) == 16)
			continue;
		if(dpauxtra(igfx, dp, CmdMot|CmdRead, addr, buf+i, 16) == 16)
			continue;
		if(dpauxtra(igfx, dp, CmdMot|CmdRead, addr, buf+i, 16) == 16)
			continue;
		return nil;
	}

	dpauxtra(igfx, dp, CmdRead, addr, nil, 0);

	return parseedid128(edidshift(buf));
}

enum {
	GMBUSCP = 0,	/* Clock/Port selection */
	GMBUSCS = 1,	/* Command/Status */
	GMBUSST = 2,	/* Status Register */
	GMBUSDB	= 3,	/* Data Buffer Register */
	GMBUSIM = 4,	/* Interrupt Mask */
	GMBUSIX = 5,	/* Index Register */
};
	
static int
gmbusread(Igfx *igfx, int port, int addr, uchar *data, int len)
{
	u32int x, y;
	int n, t;

	if(igfx->gmbus[GMBUSCP].a == 0)
		return -1;

	wr(igfx, igfx->gmbus[GMBUSCP].a, port);
	wr(igfx, igfx->gmbus[GMBUSIX].a, 0);

	/* bus cycle without index and stop, byte count, slave address, read */
	wr(igfx, igfx->gmbus[GMBUSCS].a, 1<<30 | 5<<25 | len<<16 | addr<<1 | 1);

	n = 0;
	while(len > 0){
		x = 0;
		for(t=0; t<100; t++){
			x = rr(igfx, igfx->gmbus[GMBUSST].a);
			if(x & (1<<11))
				break;
			sleep(5);
		}
		if((x & (1<<11)) == 0)
			return -1;

		t = 4 - (x & 3);
		if(t > len)
			t = len;
		len -= t;

		y = rr(igfx, igfx->gmbus[GMBUSDB].a);
		switch(t){
		case 4:
			data[n++] = y & 0xff, y >>= 8;
		case 3:
			data[n++] = y & 0xff, y >>= 8;
		case 2:
			data[n++] = y & 0xff, y >>= 8;
		case 1:
			data[n++] = y & 0xff;
		}
	}

	return n;
}

static Edid*
snarfgmedid(Igfx *igfx, int port, int addr)
{
	uchar buf[256];

	/* read twice */
	if(gmbusread(igfx, port, addr, buf, 128) != 128)
		return nil;
	if(gmbusread(igfx, port, addr, buf + 128, 128) != 128)
		return nil;

	return parseedid128(edidshift(buf));
}

Ctlr igfx = {
	"igfx",			/* name */
	snarf,			/* snarf */
	options,		/* options */
	init,			/* init */
	load,			/* load */
	dump,			/* dump */
};

Ctlr igfxhwgc = {
	"igfxhwgc",
};


* Re: [9front] I'm giving up... hwcursor
  2016-05-24 19:57                         ` cinap_lenrek
@ 2016-05-24 23:55                           ` kokamoto
  2016-05-25  9:08                             ` cinap_lenrek
  2016-05-25 12:20                             ` cinap_lenrek
  0 siblings, 2 replies; 30+ messages in thread
From: kokamoto @ 2016-05-24 23:55 UTC (permalink / raw)
  To: 9front

> attached a hacked igfx.c to be copied to /sys/src/cmd/aux/vga/igfx.c, will
> dump the gtt when you run: aux/vga -m ... -p

Thank you very much!
The result is:
igfx gtt [00000000] = 7f800001
  each entry advances by one 4k area
           .....
igfx gtt [007ff000] = 7ff7e001

By the way, we seem to be reading different manuals.
I can't find sections such as
"7.2.11 GTTMMADR — Graphics Translation Table Range Address"
or
"8.2.1.4 GTT Page Table Entries (PTEs)"

Kenji

PS. My manual says we must wait for VBLANK when we change
the cursor contents. I'm looking into this now.




* Re: [9front] I'm giving up... hwcursor
  2016-05-24 23:55                           ` kokamoto
@ 2016-05-25  9:08                             ` cinap_lenrek
  2016-05-25 12:20                             ` cinap_lenrek
  1 sibling, 0 replies; 30+ messages in thread
From: cinap_lenrek @ 2016-05-25  9:08 UTC (permalink / raw)
  To: 9front

>> attached a hacked igfx.c to be copied to /sys/src/cmd/aux/vga/igfx.c, will
>> dump the gtt when you run: aux/vga -m ... -p

> Thank you very much!
> The result is:
> igfx gtt [00000000] = 7f800001
>  each entry advances by one 4k area
>           .....
> igfx gtt [007ff000] = 7ff7e001

can you send me the full output? it's not just identity mapped:

7ff7e000 - 7f800000 = 0077e000 < 007ff000

the exact mapping is precisely the subject of our interest
here :-)

> By the way, we seem to be reading different manuals.
> I can't find sections such as
> "7.2.11 GTTMMADR — Graphics Translation Table Range Address"
> or
> "8.2.1.4 GTT Page Table Entries (PTEs)"

it's in volume 1:

https://01.org/sites/default/files/documentation/965_g35_vol_1_graphics_core_0.pdf

> PS. My manual says we must wait for VBLANK when we change
> the cursor contents. I'm looking into this now.

no. the update is just "postponed" until the next vblank.

basically, the hardware will just look at the values on each
vblank and then schedule the dma... it does not mean we
need to wait for the hardware vblank. that's how i interpret
the documentation.

--
cinap


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [9front] I'm giving up... hwcursor
  2016-05-24 23:55                           ` kokamoto
  2016-05-25  9:08                             ` cinap_lenrek
@ 2016-05-25 12:20                             ` cinap_lenrek
  2016-05-25 12:45                               ` cinap_lenrek
  2016-05-25 12:49                               ` kokamoto
  1 sibling, 2 replies; 30+ messages in thread
From: cinap_lenrek @ 2016-05-25 12:20 UTC (permalink / raw)
  To: 9front


basically, in the bios, you configure how much DRAM you
want to give to the graphics card. and the bios will
then take that amount of memory and subtract it from
the top of physical DRAM space and not include that in
the e820 memory map.

bios will also setup the GTT to map the framebuffer
using that "stolen" memory block.

my assumption was that all of the stolen memory reported
will be fully mapped in the GTT, but this appears not to be
the case with your bios, because 512k + 4k are not mapped
at the end. this sounds suspiciously like the bios might
have located the GTT table itself in the stolen memory
area.
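
a rough size check for that suspicion, as a sketch in portable C. it
assumes 4-byte PTEs and 4k pages; gttsize and gttcovers are illustrative
names, not driver functions.

```c
#include <assert.h>
#include <stdint.h>

enum {
	PGSZ = 4096,	/* bytes mapped by one PTE */
	PTESZ = 4,	/* assumption: 32-bit PTEs on this chipset */
};

/* bytes of table needed to map an aperture of apsize bytes */
uint64_t
gttsize(uint64_t apsize)
{
	assert(apsize % PGSZ == 0);
	return apsize / PGSZ * PTESZ;
}

/* aperture bytes that a table of tblsize bytes can map */
uint64_t
gttcovers(uint64_t tblsize)
{
	assert(tblsize % PTESZ == 0);
	return tblsize / PTESZ * PGSZ;
}
```

under those assumptions a 512k table is what mapping a 512MB aperture
would take, while the 8MB of stolen memory alone needs only 8k of table.
that fits a 512k table sitting in stolen memory, but it does not explain
the extra 4k.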

there should be a page table control register where the
size and base of the GTT table are stored. i'd
like to get a readout of that... i'll send instructions
when i get home...

--
cinap


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [9front] I'm giving up... hwcursor
  2016-05-25 12:20                             ` cinap_lenrek
@ 2016-05-25 12:45                               ` cinap_lenrek
  2016-05-25 12:49                               ` kokamoto
  1 sibling, 0 replies; 30+ messages in thread
From: cinap_lenrek @ 2016-05-25 12:45 UTC (permalink / raw)
  To: 9front

PGTBL_CTL register is at MMIO offset 0x2020, so try:

seg -Lr igfxmmio 0x4000 0x2020

--
cinap


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [9front] I'm giving up... hwcursor
  2016-05-25 12:20                             ` cinap_lenrek
  2016-05-25 12:45                               ` cinap_lenrek
@ 2016-05-25 12:49                               ` kokamoto
  2016-05-25 13:32                                 ` cinap_lenrek
  1 sibling, 1 reply; 30+ messages in thread
From: kokamoto @ 2016-05-25 12:49 UTC (permalink / raw)
  To: 9front

> basically, in the bios, you configure how much DRAM you
> want to give to the graphics card. and the bios will
> then take that amount of memory and subtract it from
> the top of physical DRAM space and not include that in
> the e820 memory map.

Strangely, my machine's BIOS doesn't have an option to configure
the graphics memory.

Kenji



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [9front] I'm giving up... hwcursor
  2016-05-25 12:49                               ` kokamoto
@ 2016-05-25 13:32                                 ` cinap_lenrek
  2016-05-26  0:09                                   ` kokamoto
  0 siblings, 1 reply; 30+ messages in thread
From: cinap_lenrek @ 2016-05-25 13:32 UTC (permalink / raw)
  To: 9front

> Strangely, my machine's BIOS doesn't have an option to configure
> the graphics memory.

that might be, but it doesn't change the fact that the bios sets up
these data structures before the OS even runs, otherwise you would
not see anything on the screen.

until now, the igfx driver never cared about this and just assumed
that when the card reports 8MB as stolen, we could have up to 8MB
of framebuffer in the graphics aperture window.

in theory, we could do all the memory management ourselves and program
the GTT tables ourselves, but that results in more code that needs to
be maintained...

one idea might be to calculate the storage location differently,
locating the cursor right after the framebuffer (which means the
cursor needs to be reprogrammed after a resolution change), or we
at least check the GTT and try to determine the largest contiguous
physical memory range and then use the end of it as the storage
location for the cursor.
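
the GTT scan could look roughly like the sketch below. it is portable C
working on a copy of the table, not code from the driver. it assumes bit 0
of a PTE is the valid bit and bits 31:12 the physical page address; Run
and longestrun are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
	PGSZ = 0x1000,	/* one PTE maps one 4k page */
	PTEVALID = 1,	/* assumption: bit 0 marks a valid PTE */
};

typedef struct Run Run;
struct Run {
	size_t start;	/* index of the first PTE in the run */
	size_t n;	/* number of pages in the run */
};

/* longest run of valid PTEs whose physical pages are consecutive;
 * on a tie the earliest run is returned */
Run
longestrun(const uint32_t *gtt, size_t npte)
{
	Run best = {0, 0}, cur = {0, 0};
	uint32_t mask = ~(uint32_t)(PGSZ-1);
	size_t i;

	assert(gtt != NULL || npte == 0);
	for(i = 0; i < npte; i++){
		if((gtt[i] & PTEVALID) == 0){
			cur.n = 0;	/* hole: any run ends here */
			continue;
		}
		/* cur.n > 0 implies gtt[i-1] exists and is valid */
		if(cur.n > 0 && (gtt[i] & mask) == (gtt[i-1] & mask) + PGSZ)
			cur.n++;
		else{
			cur.start = i;
			cur.n = 1;
		}
		if(cur.n > best.n)
			best = cur;
	}
	return best;
}
```

a 64x64 32bpp cursor takes 16k, that is four pages, so with a result r
the cursor would go at aperture offset (r.start + r.n - 4)*PGSZ, provided
r.n is at least 4 and the framebuffer does not reach that far.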

we could also avoid the graphics memory completely and set the
cursor in popup mode, storing the cursor image in kernel memory.

but having full knowledge about graphics memory layout might be
the better option, we will need it later if we want to do 3d
acceleration.

--
cinap


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [9front] I'm giving up... hwcursor
  2016-05-25 13:32                                 ` cinap_lenrek
@ 2016-05-26  0:09                                   ` kokamoto
  0 siblings, 0 replies; 30+ messages in thread
From: kokamoto @ 2016-05-26  0:09 UTC (permalink / raw)
  To: 9front

> but having full knowledge about graphics memory layout might be
> the better option, we will need it later if we want to do 3d
> acceleration.

That's the main reason I'm interested in native vga drivers.
If we continue to use vesa mode only, we have no future, I believe.

Kenji



^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2016-05-26  0:09 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-21  2:04 I'm giving up... hwcursor kokamoto
2016-05-22 11:56 ` [9front] " cinap_lenrek
2016-05-23  0:21   ` kokamoto
2016-05-23  0:39   ` kokamoto
2016-05-23  0:57     ` cinap_lenrek
2016-05-23  2:48       ` kokamoto
2016-05-23  3:01         ` kokamoto
2016-05-23  5:09       ` kokamoto
2016-05-23  5:23         ` kokamoto
2016-05-23 11:01           ` cinap_lenrek
2016-05-23 11:29             ` kokamoto
2016-05-23 11:42               ` kokamoto
2016-05-23 12:05                 ` cinap_lenrek
2016-05-24  1:42                   ` kokamoto
2016-05-24  8:57                     ` cinap_lenrek
2016-05-24 10:12                       ` kokamoto
2016-05-24 10:14                       ` kokamoto
2016-05-24 14:17                         ` cinap_lenrek
2016-05-24 19:57                         ` cinap_lenrek
2016-05-24 23:55                           ` kokamoto
2016-05-25  9:08                             ` cinap_lenrek
2016-05-25 12:20                             ` cinap_lenrek
2016-05-25 12:45                               ` cinap_lenrek
2016-05-25 12:49                               ` kokamoto
2016-05-25 13:32                                 ` cinap_lenrek
2016-05-26  0:09                                   ` kokamoto
2016-05-24  9:10                     ` cinap_lenrek
2016-05-23 15:07         ` cinap_lenrek
2016-05-23  8:46     ` cinap_lenrek
2016-05-23  8:52     ` cinap_lenrek
