9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] Random SATA errors with SMP on a dual core machine.
@ 2009-06-02 20:22 Dan Cross
  2009-06-02 21:01 ` Steve Simon
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Dan Cross @ 2009-06-02 20:22 UTC
  To: Fans of the OS Plan 9 from Bell Labs

Has anyone else seen this?  I am experiencing random SATA errors when
I turn on SMP on a dual core machine.

After a several-year hiatus, I just got some new hardware to build a
Plan 9 network at home.  My file server is a 1U rackmount machine with
the following hardware:

1. SuperMicro PDSML-LN2+ motherboard
   - builtin ICH7R SATA controller
   - builtin Intel 82573L Gigabit Ethernet adapter
2. 1.8GHz Dual-core Intel Core2 Duo processor
3. 2GB RAM
4. 2 x 750GB SATA drives
5. 1 x 2GB Compact Flash removable disk.

Note that this machine has neither a CD nor DVD drive.  This is
because I misread the vendor's quote: they could not fit a slimline CD
or DVD drive into the 1U chassis along with two hard drives but I
didn't realize that until I pulled the machine out of the box.  I got
around this by installing Plan 9 onto the compact flash card on
another machine that did have a CD drive, then bringing it up on this
machine.

The first problem I had was using the SATA drives; the SATA drivers in
the distributed kernel had problems, so I updated them to the latest
from Erik's directory on sources.  Specifically:

% 9fs sources
% cd /n/sources/contrib/quanstro/root/sys/src/9/pc
% cp sdata.c sdiahci.c ahci.h /sys/src/9/pc
% cd ../port
% cp devsd.c sd.h sdloop.c /sys/src/9/port
% cd ../../libfis
% mkdir /sys/src/libfis
% cp fis.h mkfile /sys/src/libfis
% cd /sys/src/libfis
% mk install

I then edited the appropriate mkfile to refer to /386/lib/libfis.a and
built the 'pcf' kernel, copied it to 9fat (on the CF card) and
rebooted.  I'm not sure that I didn't miss any steps, but I was able
to fdisk, prep and flfmt the SATA drives and load the operating system
by running the (slightly edited) installation scripts from
/sys/lib/dist/pc/inst, choosing a fossil+venti configuration.  Up to
this point, I'd only been using one core, as '*nomp=1' was set in
plan9.ini.  For now, everything is still running as a terminal.

Now the problem that I am seeing is that, if I boot the machine up
with both cores enabled, I get some relatively small amount of use out
of the SATA drives, then I get a (seemingly) random i/o error and then
all further access to the drives fails.  I am still booting from the
CF disk, but using the fossil on the SATA drives as the root.  I was
also having problems with rio; on further investigation, I see that
there are known issues with VESA and MP.  But even if I don't load
the VGA registers and stay in CGA mode, things still behave strangely
(for instance, my venti got corrupted and all of /sys/include
disappeared).  However, if I set '*nomp=1' in plan9.ini, everything
works fine.
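
For reference, the single-core workaround amounts to one line in
plan9.ini.  This is only a sketch: the rest of the file is omitted,
and the comment line is an illustration, not copied from this machine.

	# restrict the kernel to one processor; comment out to try both cores
	*nomp=1

As I understand plan9.ini(8), all processors are enabled by default,
so removing or commenting out that line is what turns SMP back on.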

Has anyone seen this before?
Is this a known issue?
Even better, is there a fix?

Btw: my long term intention is to use the fs driver to mirror fossil
and venti across both of the SATA drives, keep a small fossil on the
CF card for emergencies, and keep a partition there for secstore data.
 But I haven't gotten to that stage yet.

        - Dan C.





Thread overview: 6+ messages
2009-06-02 20:22 [9fans] Random SATA errors with SMP on a dual core machine Dan Cross
2009-06-02 21:01 ` Steve Simon
2009-06-03  5:15 ` erik quanstrom
2009-06-04 16:03   ` Dan Cross
2009-06-04 17:54     ` erik quanstrom
2009-06-04 19:33 ` Venkatesh Srinivas
