* [9fans] No regression tests
@ 2014-03-25 5:50 Adriano Verardo
2014-03-25 12:33 ` erik quanstrom
0 siblings, 1 reply; 9+ messages in thread
From: Adriano Verardo @ 2014-03-25 5:50 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
A few weeks ago I wrote about an unkillable manager of USB barcode readers.
That code worked perfectly for 5+ years, with absolutely no changes.
IMHO the problem seems to be a change in the Bell kernel sources, as under 9Atom
everything works as expected.
Unfortunately I can't say which was the last working release: the problem
was first noticed a few weeks ago, but the kernel is rebuilt frequently
and the sources are updated irregularly, 3 or 4 times a year.
adriano
* Re: [9fans] No regression tests
2014-03-25 5:50 [9fans] No regression tests Adriano Verardo
@ 2014-03-25 12:33 ` erik quanstrom
2014-03-25 23:11 ` Adriano Verardo
0 siblings, 1 reply; 9+ messages in thread
From: erik quanstrom @ 2014-03-25 12:33 UTC (permalink / raw)
To: 9fans
On Tue Mar 25 01:51:36 EDT 2014, adriano.verardo@mail.com wrote:
> A few weeks ago I wrote about an unkillable manager of USB barcode
> readers. That code worked perfectly for 5+ years, with absolutely no
> changes.
>
> IMHO the problem seems to be a change in the Bell kernel sources, as
> under 9Atom everything works as expected.
>
> Unfortunately I can't say which was the last working release: the
> problem was first noticed a few weeks ago, but the kernel is rebuilt
> frequently and the sources are updated irregularly, 3 or 4 times a
> year.
that's interesting. what state are these processes in, and what are
the backtraces?
- erik
* Re: [9fans] No regression tests
2014-03-25 12:33 ` erik quanstrom
@ 2014-03-25 23:11 ` Adriano Verardo
2014-03-26 14:46 ` erik quanstrom
0 siblings, 1 reply; 9+ messages in thread
From: Adriano Verardo @ 2014-03-25 23:11 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
erik quanstrom wrote:
> On Tue Mar 25 01:51:36 EDT 2014, adriano.verardo@mail.com wrote:
>> A few weeks ago I wrote about an unkillable manager of USB barcode
>> readers. That code worked perfectly for 5+ years, with absolutely no
>> changes.
>>
>> IMHO the problem seems to be a change in the Bell kernel sources, as
>> under 9Atom everything works as expected.
>>
>> Unfortunately I can't say which was the last working release: the
>> problem was first noticed a few weeks ago, but the kernel is rebuilt
>> frequently and the sources are updated irregularly, 3 or 4 times a
>> year.
> that's interesting. what state are these processes in, and what are
> the backtraces?
The task is basically a customized keyboard manager which
opens a channel in /srv. While it is running, ps shows 4 instances, as it
is started by usbd and forks 3 times.
When the reader is unplugged, all four processes should terminate.
On Bell, for some time now, only three die. Then, when the reader is plugged
in again, a stale process is left over which prevents the 4 new ones from working.
Neither kill nor slay works; the only solution is a reboot.
Internal debug prints (#ifdef, no code changes) show exactly the same
output under Bell and Atom. In both cases, when the reader is unplugged, the
manager reports the condition and reports that it is terminating, but under
Bell this doesn't actually happen.
I regret not having more detailed info. I suspect something
changed in the detach primitives or thereabouts, but that's only my personal
opinion.
adriano
* Re: [9fans] No regression tests
2014-03-25 23:11 ` Adriano Verardo
@ 2014-03-26 14:46 ` erik quanstrom
2014-03-26 18:26 ` Adriano Verardo
0 siblings, 1 reply; 9+ messages in thread
From: erik quanstrom @ 2014-03-26 14:46 UTC (permalink / raw)
To: 9fans
> I regret not having more detailed info. I suspect something
> changed in the detach primitives or thereabouts, but that's only my personal
> opinion.
hmm. would it be too much to ask for a ps of the processes that
failed to exit? i really would just like to know what state they're in.
i think this may have been a latent bug that just came out.
- erik
* Re: [9fans] No regression tests
2014-03-26 14:46 ` erik quanstrom
@ 2014-03-26 18:26 ` Adriano Verardo
2014-03-26 19:13 ` erik quanstrom
0 siblings, 1 reply; 9+ messages in thread
From: Adriano Verardo @ 2014-03-26 18:26 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
erik quanstrom wrote:
>> I regret not having more detailed info. I suspect something
>> changed in the detach primitives or thereabouts, but that's only my personal
>> opinion.
> hmm. would it be too much to ask for a ps of the processes that
> failed to exit? i really would just like to know what state they're in.
> i think this may have been a latent bug that just came out.
>
> - erik
I'm working on a Bell system I have at home, downloaded a few weeks ago.
The kernel is built with the same config used in the field,
where the faulty behaviour was observed.
The modified usbd and the bcscan process are embedded.
After booting with the reader plugged in (normal condition):
bootes 12 0:00 0:00 336K Pread bcscan
bootes 13 0:00 0:00 336K Rendez bcscan
bootes 14 0:00 0:00 336K Rendez bcscan
bootes 19 0:00 0:00 336K Pread bcscan
Here mount /srv/bcscan /n/bc gives a readable /n/bc/bcU0/data.
Then the reader is unplugged
bootes 12 0:00 0:00 336K Pread bcscan
bootes 13 0:00 0:00 336K Rendez bcscan
bootes 14 0:00 0:00 336K Rendez bcscan
Please note that here we see a different case. There are three
spurious processes. At the plant (same test) there is only one.
Then the reader is plugged in again
bootes 13 0:00 0:00 336K Rendez bcscan
bootes 14 0:00 0:00 336K Rendez bcscan
bootes 432 0:00 0:00 336K Rendez bcscan
bootes 434 0:00 0:00 336K Pread bcscan
bootes 435 0:00 0:00 336K Rendez bcscan
bootes 436 0:00 0:00 336K Rendez bcscan
bootes 437 0:00 0:00 336K Open bcscan
Here mount /srv/bcscan /n/bc gives an empty /n/bc but doesn't
complain.
adriano
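
One way to read the three listings above is to compare the pid sets between snapshots. The following is a minimal Python sketch and not part of the original report: the helper names `parse_ps` and `survivors` are hypothetical, and the parser assumes the ps line layout shown above (user, pid, user time, system time, size, state, command).

```python
def parse_ps(text):
    """Map pid -> state for each ps line (user pid utime stime size state cmd)."""
    procs = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue  # skip anything that is not a full ps line
        procs[int(fields[1])] = fields[5]
    return procs


def survivors(before, after):
    """Pids present in both snapshots, with their state in the later one."""
    return {pid: after[pid] for pid in sorted(before.keys() & after.keys())}


# The three listings from the message, copied verbatim.
BOOT = """
bootes 12 0:00 0:00 336K Pread bcscan
bootes 13 0:00 0:00 336K Rendez bcscan
bootes 14 0:00 0:00 336K Rendez bcscan
bootes 19 0:00 0:00 336K Pread bcscan
"""

UNPLUGGED = """
bootes 12 0:00 0:00 336K Pread bcscan
bootes 13 0:00 0:00 336K Rendez bcscan
bootes 14 0:00 0:00 336K Rendez bcscan
"""

REPLUGGED = """
bootes 13 0:00 0:00 336K Rendez bcscan
bootes 14 0:00 0:00 336K Rendez bcscan
bootes 432 0:00 0:00 336K Rendez bcscan
bootes 434 0:00 0:00 336K Pread bcscan
bootes 435 0:00 0:00 336K Rendez bcscan
bootes 436 0:00 0:00 336K Rendez bcscan
bootes 437 0:00 0:00 336K Open bcscan
"""

boot = parse_ps(BOOT)
print(survivors(boot, parse_ps(UNPLUGGED)))  # original pids left after the unplug
print(survivors(boot, parse_ps(REPLUGGED)))  # original pids left after the replug
```

For these listings the comparison shows pids 12, 13 and 14 outliving the unplug, and only 13 and 14 (both in Rendez) still present once the reader is plugged in again.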
* Re: [9fans] No regression tests
2014-03-26 18:26 ` Adriano Verardo
@ 2014-03-26 19:13 ` erik quanstrom
2014-03-26 19:45 ` Adriano Verardo
0 siblings, 1 reply; 9+ messages in thread
From: erik quanstrom @ 2014-03-26 19:13 UTC (permalink / raw)
To: 9fans
> Here mount /srv/bcscan /n/bc gives a readable /n/bc/bcU0/data.
>
> Then the reader is unplugged
>
> bootes 12 0:00 0:00 336K Pread bcscan
> bootes 13 0:00 0:00 336K Rendez bcscan
> bootes 14 0:00 0:00 336K Rendez bcscan
>
> Please note that here we see a different case. There are three
> spurious processes. At the plant (same test) there is only one.
>
> Then the reader is plugged in again
>
> bootes 13 0:00 0:00 336K Rendez bcscan
> bootes 14 0:00 0:00 336K Rendez bcscan
> bootes 432 0:00 0:00 336K Rendez bcscan
> bootes 434 0:00 0:00 336K Pread bcscan
> bootes 435 0:00 0:00 336K Rendez bcscan
> bootes 436 0:00 0:00 336K Rendez bcscan
> bootes 437 0:00 0:00 336K Open bcscan
i should learn chess so i don't ask questions in serial.
with acid, you can get a backtrace of process 12 and get the fd
it is reading. /proc/12/fd should have the file descriptor bcscan
thinks is open. if it
also, since processes 13 and 14 did not wake from rendezvous,
there is a second issue. maybe you can see how 12 could exit
and leave 13 and 14 hanging.
- erik
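
The second issue mentioned above, helpers left sleeping after the process they depend on has gone, can be illustrated with a small analog. This is only a sketch in Python threads, not Plan 9 code, and it makes no claim about how bcscan is actually structured: the names `run`, `helper`, `reader` and `mailbox` are hypothetical, and a blocking `queue.Queue.get()` stands in for a process sleeping in rendezvous.

```python
import queue
import threading


def run(wake_helpers):
    """Start one reader and two helpers; return how many helpers exited."""
    mailbox = queue.Queue()

    def helper():
        # Blocks until someone hands over a value, loosely like a
        # process sleeping in rendezvous. None is the exit token.
        while mailbox.get() is not None:
            pass

    def reader():
        # Stands in for the process that notices the unplug and exits.
        if wake_helpers:
            for _ in helpers:
                mailbox.put(None)  # hand each helper an exit token

    # Daemon threads, so stuck helpers do not keep the interpreter alive.
    helpers = [threading.Thread(target=helper, daemon=True) for _ in range(2)]
    for t in helpers:
        t.start()

    r = threading.Thread(target=reader)
    r.start()
    r.join()

    for t in helpers:
        t.join(timeout=0.5)  # give woken helpers time to finish
    return sum(not t.is_alive() for t in helpers)


print(run(wake_helpers=True))   # helpers that exited when the reader woke them
print(run(wake_helpers=False))  # helpers that exited when the reader just left
```

In this analog, a reader that exits without handing over the exit tokens leaves both helpers blocked indefinitely, which is the kind of exit path worth looking for in the real code.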
* Re: [9fans] No regression tests
2014-03-26 19:13 ` erik quanstrom
@ 2014-03-26 19:45 ` Adriano Verardo
2014-03-26 19:48 ` erik quanstrom
0 siblings, 1 reply; 9+ messages in thread
From: Adriano Verardo @ 2014-03-26 19:45 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs
> i should learn chess so i don't ask questions in serial.
Sorry, I don't understand the meaning of this sentence.
The word-for-word translation into Italian makes no sense.
>
> with acid, you can get a backtrace of process 12 and get the fd
> it is reading. /proc/12/fd should have the file descriptor bcscan
> thinks is open. if it
>
> also, since processes 13 and 14 did not wake from rendezvous,
> there is a second issue. maybe you can see how 12 could exit
> and leave 13 and 14 hanging.
>
>
I'll try, even though I don't know acid very well.
How do I get the backtrace of a process? With lstk()?
adriano
[parent not found: <214392e810a42ad8a4958929ca150ed5@proxima.alt.za>]
* Re: [9fans] No regression tests
[not found] <214392e810a42ad8a4958929ca150ed5@proxima.alt.za>
@ 2014-03-25 21:38 ` Adriano Verardo
0 siblings, 0 replies; 9+ messages in thread
From: Adriano Verardo @ 2014-03-25 21:38 UTC (permalink / raw)
To: 9fan >> Fans of the OS Plan 9 from Bell Labs
lucio@proxima.alt.za wrote:
>> but the kernel is rebuilt frequently and the sources are updated
>> irregularly, 3 or 4 times a year.
>
> You could bisect the kernel from the history and try to locate the
> change that way. There have been recent changes to USB, so that's
> where you should look first.
Yes, but perhaps I would first do a diff between the Bell and Atom usb sources,
unless they are organized so differently as to be not comparable at
all, of course.
Anyway, I observe that the latest Atom release works and the latest Bell one
does not.
I have to maintain industrial systems in service. From my personal point
of view this usb problem is a negligible flaw, as the devices must always stay
firmly plugged in.
But the customer thinks differently and I must solve it asap. I'll install
Atom instead of Bell.
adriano
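
The bisection suggested above amounts to a binary search over an ordered list of source snapshots. The following is a minimal Python sketch under stated assumptions: `first_bad` and `is_good` are hypothetical names, the snapshot labels are invented placeholders rather than real release dates, and in practice `is_good` would mean building the kernel from that snapshot and running the unplug test by hand. It also assumes the oldest snapshot works, the newest fails, and the behaviour changes only once.

```python
def first_bad(snapshots, is_good):
    """Return the first snapshot for which is_good() is false.

    Assumes snapshots are ordered oldest to newest, the oldest is good,
    the newest is bad, and a good snapshot never follows a bad one.
    """
    lo, hi = 0, len(snapshots) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(snapshots[mid]):
            lo = mid
        else:
            hi = mid
    return snapshots[hi]


# Hypothetical snapshot labels, oldest first.
snapshots = ["s0", "s1", "s2", "s3", "s4", "s5", "s6", "s7"]
tested = []  # records which snapshots had to be built and tested


def is_good(snap):
    tested.append(snap)
    return snap < "s5"  # pretend the regression landed in s5


print(first_bad(snapshots, is_good), tested)
```

With eight snapshots, the search needs only three build-and-test rounds, which matters when each round means a kernel rebuild and a manual unplug test.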
Thread overview: 9+ messages
2014-03-25 5:50 [9fans] No regression tests Adriano Verardo
2014-03-25 12:33 ` erik quanstrom
2014-03-25 23:11 ` Adriano Verardo
2014-03-26 14:46 ` erik quanstrom
2014-03-26 18:26 ` Adriano Verardo
2014-03-26 19:13 ` erik quanstrom
2014-03-26 19:45 ` Adriano Verardo
2014-03-26 19:48 ` erik quanstrom
[not found] <214392e810a42ad8a4958929ca150ed5@proxima.alt.za>
2014-03-25 21:38 ` Adriano Verardo