9fans - fans of the OS Plan 9 from Bell Labs
 help / color / mirror / Atom feed
* [9fans] syscall spikes
@ 2007-04-17 14:15 Devon H. O'Dell
  2007-04-17 14:36 ` Uriel
  2007-04-18 16:53 ` Russ Cox
  0 siblings, 2 replies; 11+ messages in thread
From: Devon H. O'Dell @ 2007-04-17 14:15 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Hi,

I've been having syscall spikes. It's really weird. There will be
>40,000 in 1 second, (according to stats(8)) and it appears at a
fairly regular interval. I've tried running kprof, but I'm not sure
that it's running a global profile. Namely, when I come out of it, I
have tons of time spent in ``halt'', with very, very few (<100)
syscalls. Any ideas what I can do to find the culprit? It's gapping
sound playback, which is really annoying.

--dho


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] syscall spikes
  2007-04-17 14:15 [9fans] syscall spikes Devon H. O'Dell
@ 2007-04-17 14:36 ` Uriel
  2007-04-17 15:22   ` Devon H. O'Dell
  2007-04-18 16:53 ` Russ Cox
  1 sibling, 1 reply; 11+ messages in thread
From: Uriel @ 2007-04-17 14:36 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

fossil

On 4/17/07, Devon H. O'Dell <devon.odell@gmail.com> wrote:
> Hi,
>
> I've been having syscall spikes. It's really weird. There will be
> >40,000 in 1 second, (according to stats(8)) and it appears at a
> fairly regular interval. I've tried running kprof, but I'm not sure
> that it's running a global profile. Namely, when I come out of it, I
> have tons of time spent in ``halt'', with very, very few (<100)
> syscalls. Any ideas what I can do to find the culprit? It's gapping
> sound playback, which is really annoying.
>
> --dho
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] syscall spikes
  2007-04-17 14:36 ` Uriel
@ 2007-04-17 15:22   ` Devon H. O'Dell
  2007-04-17 15:38     ` Devon H. O'Dell
  0 siblings, 1 reply; 11+ messages in thread
From: Devon H. O'Dell @ 2007-04-17 15:22 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

2007/4/17, Uriel <uriel99@gmail.com>:
> fossil

I attempted to tprof fossil and I don't get any useful output. I even did:

for (i in `{ps a | grep fossil | awk '{ print $2 }'})
    echo profile > /proc/$i/ctl

[wait a while for spike]

for (i in `{ps a | grep fossil | awk '{ print $2 }'}) {
    echo $i
    tprof $i
}

All I get is a bunch of lines saying ``total: 0''.

What'm I missing? :(

--dho

> On 4/17/07, Devon H. O'Dell <devon.odell@gmail.com> wrote:
> > Hi,
> >
> > I've been having syscall spikes. It's really weird. There will be
> > >40,000 in 1 second, (according to stats(8)) and it appears at a
> > fairly regular interval. I've tried running kprof, but I'm not sure
> > that it's running a global profile. Namely, when I come out of it, I
> > have tons of time spent in ``halt'', with very, very few (<100)
> > syscalls. Any ideas what I can do to find the culprit? It's gapping
> > sound playback, which is really annoying.
> >
> > --dho
> >
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] syscall spikes
  2007-04-17 15:22   ` Devon H. O'Dell
@ 2007-04-17 15:38     ` Devon H. O'Dell
  2007-04-17 15:45       ` erik quanstrom
  2007-04-17 15:49       ` Charles Forsyth
  0 siblings, 2 replies; 11+ messages in thread
From: Devon H. O'Dell @ 2007-04-17 15:38 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

2007/4/17, Devon H. O'Dell <devon.odell@gmail.com>:
> 2007/4/17, Uriel <uriel99@gmail.com>:
> > fossil
>
> I attempted to tprof fossil and I don't get any useful output. I even did:
>
> for (i in `{ps a | grep fossil | awk '{ print $2 }'})
>     echo profile > /proc/$i/ctl
>
> [wait a while for spike]
>
> for (i in `{ps a | grep fossil | awk '{ print $2 }'}) {
>     echo $i
>     tprof $i
> }
>
> All I get is a bunch of lines saying ``total: 0''.
>
> What'm I missing? :(

I found the [undocumented] startsyscall argument to the process ctl
file, but I cannot get it to work. If I echo startsyscall >
/proc/N/ctl, I get ``echo: write error: bad process or channel control
request''.

What gives?

--dho

> --dho
>
> > On 4/17/07, Devon H. O'Dell <devon.odell@gmail.com> wrote:
> > > Hi,
> > >
> > > I've been having syscall spikes. It's really weird. There will be
> > > >40,000 in 1 second, (according to stats(8)) and it appears at a
> > > fairly regular interval. I've tried running kprof, but I'm not sure
> > > that it's running a global profile. Namely, when I come out of it, I
> > > have tons of time spent in ``halt'', with very, very few (<100)
> > > syscalls. Any ideas what I can do to find the culprit? It's gapping
> > > sound playback, which is really annoying.
> > >
> > > --dho
> > >
> >
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] syscall spikes
  2007-04-17 15:38     ` Devon H. O'Dell
@ 2007-04-17 15:45       ` erik quanstrom
  2007-04-17 15:56         ` Devon H. O'Dell
  2007-04-17 15:49       ` Charles Forsyth
  1 sibling, 1 reply; 11+ messages in thread
From: erik quanstrom @ 2007-04-17 15:45 UTC (permalink / raw)
  To: 9fans

are you running anything that faces the outside world that
might be doing auth?  if you don't have a /sys/log/auth, you
might want to create it and watch it for any external attacks.

i've found that to be a source of big syscall loads on some of
our systems.

- erik

On Tue Apr 17 11:38:33 EDT 2007, devon.odell@gmail.com wrote:
> 2007/4/17, Devon H. O'Dell <devon.odell@gmail.com>:
> > 2007/4/17, Uriel <uriel99@gmail.com>:
> > > fossil
> >
> > I attempted to tprof fossil and I don't get any useful output. I even did:
> >
> > for (i in `{ps a | grep fossil | awk '{ print $2 }'})
> >     echo profile > /proc/$i/ctl
> >
> > [wait a while for spike]
> >
> > for (i in `{ps a | grep fossil | awk '{ print $2 }'}) {
> >     echo $i
> >     tprof $i
> > }
> >
> > All I get is a bunch of lines saying ``total: 0''.
> >
> > What'm I missing? :(
> 
> I found the [undocumented] startsyscall argument to the process ctl
> file, but I cannot get it to work. If I echo startsyscall >
> /proc/N/ctl, I get ``echo: write error: bad process or channel control
> request''.
> 
> What gives?
> 
> --dho
> 
> > --dho
> >
> > > On 4/17/07, Devon H. O'Dell <devon.odell@gmail.com> wrote:
> > > > Hi,
> > > >
> > > > I've been having syscall spikes. It's really weird. There will be
> > > > >40,000 in 1 second, (according to stats(8)) and it appears at a


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] syscall spikes
  2007-04-17 15:38     ` Devon H. O'Dell
  2007-04-17 15:45       ` erik quanstrom
@ 2007-04-17 15:49       ` Charles Forsyth
  1 sibling, 0 replies; 11+ messages in thread
From: Charles Forsyth @ 2007-04-17 15:49 UTC (permalink / raw)
  To: 9fans

it looks as though the process must be stopped first,
which you possibly don't want to do to your primary fossil.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] syscall spikes
  2007-04-17 15:45       ` erik quanstrom
@ 2007-04-17 15:56         ` Devon H. O'Dell
  2007-04-17 16:29           ` Devon H. O'Dell
  0 siblings, 1 reply; 11+ messages in thread
From: Devon H. O'Dell @ 2007-04-17 15:56 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

2007/4/17, erik quanstrom <quanstro@coraid.com>:
> are you running anything that faces the outside world that
> might be doing auth?  if you don't have a /sys/log/auth, you
> might want to create it and watch it for any external attacks.
>
> i've found that to be a source of big syscall loads on some of
> our systems.

Nope, it's just a laptop on a network, and the spikes are very short
(1s bursts), and happen possibly once per minute or so, but they're
fairly evenly spaced, though not always.

Charles: I dug around in the code and it appears that it actually
stops the process and does nothing else. I can't see where any syscall
trace information is actually generated from it. So it doesn't seem
like a big problem that I can't get it to work, since it's not going
to anyway (or what am I missing?).

I'm considering creating a trace buffer of my own and doing some
heuristics with the timer, where if there have been more than 10,000
syscalls since the last tick, it starts logging them all. Because
honestly, I can't find a better way to do it. Does anybody have any
suggestions before I try this?

> - erik

--dho

> On Tue Apr 17 11:38:33 EDT 2007, devon.odell@gmail.com wrote:
> > 2007/4/17, Devon H. O'Dell <devon.odell@gmail.com>:
> > > 2007/4/17, Uriel <uriel99@gmail.com>:
> > > > fossil
> > >
> > > I attempted to tprof fossil and I don't get any useful output. I even did:
> > >
> > > for (i in `{ps a | grep fossil | awk '{ print $2 }'})
> > >     echo profile > /proc/$i/ctl
> > >
> > > [wait a while for spike]
> > >
> > > for (i in `{ps a | grep fossil | awk '{ print $2 }'}) {
> > >     echo $i
> > >     tprof $i
> > > }
> > >
> > > All I get is a bunch of lines saying ``total: 0''.
> > >
> > > What'm I missing? :(
> >
> > I found the [undocumented] startsyscall argument to the process ctl
> > file, but I cannot get it to work. If I echo startsyscall >
> > /proc/N/ctl, I get ``echo: write error: bad process or channel control
> > request''.
> >
> > What gives?
> >
> > --dho
> >
> > > --dho
> > >
> > > > On 4/17/07, Devon H. O'Dell <devon.odell@gmail.com> wrote:
> > > > > Hi,
> > > > >
> > > > > I've been having syscall spikes. It's really weird. There will be
> > > > > >40,000 in 1 second, (according to stats(8)) and it appears at a
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] syscall spikes
  2007-04-17 15:56         ` Devon H. O'Dell
@ 2007-04-17 16:29           ` Devon H. O'Dell
  2007-04-17 16:39             ` Devon H. O'Dell
  0 siblings, 1 reply; 11+ messages in thread
From: Devon H. O'Dell @ 2007-04-17 16:29 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Steve Simon suggested the problem was timesync. He seems to have been
very correct in this suggestion. Though I wonder why it needs to do >
60k syscalls :\.

Additionally, using dd to play my cd instead of cat seems to work
pretty well. dd -if a000 -of /dev/audio -bs 4192 seems to be enough.
Additionally, using madplay like:

madplay -o RAW:'02 - Birdhouse In Your Soul.mp3' | dd -of /dev/audio -bs 4192

seems to work well.

--dho


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] syscall spikes
  2007-04-17 16:29           ` Devon H. O'Dell
@ 2007-04-17 16:39             ` Devon H. O'Dell
  0 siblings, 0 replies; 11+ messages in thread
From: Devon H. O'Dell @ 2007-04-17 16:39 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

2007/4/17, Devon H. O'Dell <devon.odell@gmail.com>:
> Steve Simon suggested the problem was timesync. He seems to have been
> very correct in this suggestion. Though I wonder why it needs to do >
> 60k syscalls :\.

And my apologies for continuing additions to this thread and flooding
the inboxes of you fine subscribers :\. It seems that somehow I ended
up with 2 timesyncs running. This causes the spikes I was seeing -- a
single timesync doesn't spike terribly badly.

Should there be some check in timesync to prevent two from running?
There should be no reason for this to ever occur.

--dho


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] syscall spikes
  2007-04-17 14:15 [9fans] syscall spikes Devon H. O'Dell
  2007-04-17 14:36 ` Uriel
@ 2007-04-18 16:53 ` Russ Cox
  2007-04-18 16:59   ` Devon H. O'Dell
  1 sibling, 1 reply; 11+ messages in thread
From: Russ Cox @ 2007-04-18 16:53 UTC (permalink / raw)
  To: 9fans

> I've been having syscall spikes. It's really weird. There will be
>>40,000 in 1 second, (according to stats(8)) and it appears at a
> fairly regular interval. I've tried running kprof, but I'm not sure
> that it's running a global profile. Namely, when I come out of it, I
> have tons of time spent in ``halt'', with very, very few (<100)
> syscalls. Any ideas what I can do to find the culprit? It's gapping
> sound playback, which is really annoying.

You figured out that this is caused by two timesyncs.

However, it might not be caused by the timesyncs actually
issuing that many system calls as much as fighting over
what time means.  So probably you're not having 40,000
system calls but in fact the clock is being slowed/stopped
in the kernel, which makes the devaudio sleep take too long.

Russ



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] syscall spikes
  2007-04-18 16:53 ` Russ Cox
@ 2007-04-18 16:59   ` Devon H. O'Dell
  0 siblings, 0 replies; 11+ messages in thread
From: Devon H. O'Dell @ 2007-04-18 16:59 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

2007/4/18, Russ Cox <rsc@swtch.com>:
> > I've been having syscall spikes. It's really weird. There will be
> >>40,000 in 1 second, (according to stats(8)) and it appears at a
> > fairly regular interval. I've tried running kprof, but I'm not sure
> > that it's running a global profile. Namely, when I come out of it, I
> > have tons of time spent in ``halt'', with very, very few (<100)
> > syscalls. Any ideas what I can do to find the culprit? It's gapping
> > sound playback, which is really annoying.
>
> You figured out that this is caused by two timesyncs.
>
> However, it might not be caused by the timesyncs actually
> issuing that many system calls as much as fighting over
> what time means.  So probably you're not having 40,000
> system calls but in fact the clock is being slowed/stopped
> in the kernel, which makes the devaudio sleep take too long.

Why would stats(8) say that there's anywhere between 40000 and (the
highest I've seen) 80000 syscalls in that second? Is its notion of the
second screwed up that much due to the contention?

> Russ

--dho


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2007-04-18 16:59 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-04-17 14:15 [9fans] syscall spikes Devon H. O'Dell
2007-04-17 14:36 ` Uriel
2007-04-17 15:22   ` Devon H. O'Dell
2007-04-17 15:38     ` Devon H. O'Dell
2007-04-17 15:45       ` erik quanstrom
2007-04-17 15:56         ` Devon H. O'Dell
2007-04-17 16:29           ` Devon H. O'Dell
2007-04-17 16:39             ` Devon H. O'Dell
2007-04-17 15:49       ` Charles Forsyth
2007-04-18 16:53 ` Russ Cox
2007-04-18 16:59   ` Devon H. O'Dell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).