Re: [9front] time

9front - general discussion about 9front
 help / color / mirror / Atom feed

* Re: [9front] time
@ 2018-08-13  7:45 cinap_lenrek
  0 siblings, 0 replies; 5+ messages in thread
From: cinap_lenrek @ 2018-08-13  7:45 UTC (permalink / raw)
  To: 9front

>> internally, user/kernel time accounting works by hzclock()
>> just incrementing a per process counter of ticks. which is
>> not very precise and processes can evade time being accounted
>> by yielding so they'd miss the timer interrupt.
>
> Are there any other reasons for the real time count to become erratic
> or inconsistent?

the realtime count is about the easist to get right. thats done by
just recording start time on fork and then take the difference on
exit.

> How does this work on a multi-core machine?

the same as on a single core.

--
cinap


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [9front] time
  2018-08-12 21:07 cinap_lenrek
@ 2018-08-12 22:28 ` BurnZeZ
  0 siblings, 0 replies; 5+ messages in thread
From: BurnZeZ @ 2018-08-12 22:28 UTC (permalink / raw)
  To: 9front

On Sun Aug 12 21:08:00 GMT 2018, cinap_lenrek@felloff.net wrote:
> of course it does. theres nothing special about exec. the accounting
> starts when the *process* is created, not after exec(). also, it is
> hard to know what you consider time spend execing. the exec itself
> just reads the programs a.out header and creates the segments.
> the pages get faulted in during the runtime of the program *after*
> exec did its job.

I’ve been tricked.


> internally, user/kernel time accounting works by hzclock()
> just incrementing a per process counter of ticks. which is
> not very precise and processes can evade time being accounted
> by yielding so they'd miss the timer interrupt.

Are there any other reasons for the real time count to become erratic
or inconsistent?
How does this work on a multi-core machine?


> so there is an opportunity here that when going to 64 bit,
> we could improve the time accounting and change the unit
> to nanoseconds as well...

Might be better than staying with milli, since nsec() sort of
promotes nano anyway.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [9front] time
@ 2018-08-12 21:07 cinap_lenrek
  2018-08-12 22:28 ` BurnZeZ
  0 siblings, 1 reply; 5+ messages in thread
From: cinap_lenrek @ 2018-08-12 21:07 UTC (permalink / raw)
  To: 9front

>On Sun Aug 12 19:11:59 GMT 2018, cinap_lenrek@felloff.net wrote:
>> this sucks. burnzez talked to me explaining the lack of
>> resolution of kernel process time accounting but no context
>> has been given in this patch.
>
> This patch is unrelated to the accounting resolution issues.

no. it is very much related, because the times returned by
wait() are supposed to provide the same information you want,
just lacking resolution at the moment.

> Process-level accounting will never consider the time it takes to
> exec(), which ends up being important when the latency itself can
> unexpectedly (even sporadically) be an order of magnitude higher than
> the accounted execution time of the program.

of course it does. theres nothing special about exec. the accounting
starts when the *process* is created, not after exec(). also, it is
hard to know what you consider time spend execing. the exec itself
just reads the programs a.out header and creates the segments.
the pages get faulted in during the runtime of the program *after*
exec did its job.

>> as for the patch... no. i dont like it. its a kludgy work
>> arround, not addressing the problem. and i dont see why you
>> need segattach() at all.
>
> Yeah, segattach isn’t necessary, but seemed the most straightforward
> thing.  The child process also makes use of that ‘output’ array.
>	sprint(output, "/bin/%s", argv[1]);
>
> If RFMEM is used when rfork is called, then would it make sense to:
>	∙ define another array
>	∙ use smprint() instead of sprint
>	∙ set output[0] to NUL and proceed
>
> The last one makes the most sense to me, but that doesn’t mean it’s
> not retarded for some reason.

all of this doesnt matter. you'r adding your own nsec() calls because
the resolution of the wait times sucks. i think we should rather consider
improving the kernels process time accounting so that this code doesnt
need to exist.

--
cinap

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [9front] time
  2018-08-12 19:11 cinap_lenrek
@ 2018-08-12 20:49 ` BurnZeZ
  0 siblings, 0 replies; 5+ messages in thread
From: BurnZeZ @ 2018-08-12 20:49 UTC (permalink / raw)
  To: 9front

[-- Attachment #1: Type: text/plain, Size: 1344 bytes --]

On Sun Aug 12 19:11:59 GMT 2018, cinap_lenrek@felloff.net wrote:
> this sucks. burnzez talked to me explaining the lack of
> resolution of kernel process time accounting but no context
> has been given in this patch.

This patch is unrelated to the accounting resolution issues.
Process-level accounting will never consider the time it takes to
exec(), which ends up being important when the latency itself can
unexpectedly (even sporadically) be an order of magnitude higher than
the accounted execution time of the program.


> as for the patch... no. i dont like it. its a kludgy work
> arround, not addressing the problem. and i dont see why you
> need segattach() at all.

Yeah, segattach isn’t necessary, but seemed the most straightforward
thing.  The child process also makes use of that ‘output’ array.
	sprint(output, "/bin/%s", argv[1]);

If RFMEM is used when rfork is called, then would it make sense to:
	∙ define another array
	∙ use smprint() instead of sprint
	∙ set output[0] to NUL and proceed

The last one makes the most sense to me, but that doesn’t mean it’s
not retarded for some reason.

Attached diff shows how I would do it without segattach(), but I’m
not confident it’s correct.
(also removed the 2nd t0 assignment, as I realized that a latency
spike during the first exec() would be overlooked)

[-- Attachment #2: Type: text/plain, Size: 1092 bytes --]

diff -r 5c5acc9ab7a5 sys/src/cmd/time.c
--- a/sys/src/cmd/time.c	Sat Aug 11 16:19:32 2018 +0200
+++ b/sys/src/cmd/time.c	Sun Aug 12 20:40:19 2018 +0000
@@ -2,6 +2,7 @@
 #include <libc.h>
 
 char	output[4096];
+vlong	t0, t1;
 void	add(char*, ...);
 void	error(char*);
 void	notifyf(void*, char*);
@@ -14,16 +15,18 @@
 	long l;
 	char *p;
 	char err[ERRMAX];
+	enum { SEC = 1000000000ULL };
 
 	if(argc <= 1){
 		fprint(2, "usage: time command\n");
 		exits("usage");
 	}
 
-	switch(fork()){
+	switch(rfork(RFFDG|RFREND|RFPROC|RFMEM)){
 	case -1:
 		error("fork");
 	case 0:
+		t0 = nsec();
 		exec(argv[1], &argv[1]);
 		if(argv[1][0] != '/' && strncmp(argv[1], "./", 2) &&
 		   strncmp(argv[1], "../", 3)){
@@ -43,12 +46,15 @@
 			goto loop;
 		error("wait");
 	}
+	t1 = nsec();
+	output[0] = '\0';
 	l = w->time[0];
 	add("%ld.%.2ldu", l/1000, (l%1000)/10);
 	l = w->time[1];
 	add("%ld.%.2lds", l/1000, (l%1000)/10);
 	l = w->time[2];
 	add("%ld.%.2ldr", l/1000, (l%1000)/10);
+	add("%lld.%.9lldt", (t1-t0)/SEC, (t1-t0)%SEC);
 	add("\t");
 	for(i=1; i<argc; i++){
 		add("%s", argv[i], 0);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [9front] time
@ 2018-08-12 19:11 cinap_lenrek
  2018-08-12 20:49 ` BurnZeZ
  0 siblings, 1 reply; 5+ messages in thread
From: cinap_lenrek @ 2018-08-12 19:11 UTC (permalink / raw)
  To: 9front

this sucks. burnzez talked to me explaining the lack of
resolution of kernel process time accounting but no context
has been given in this patch.

the issue is the kernel tracks time in user / kernel and
realtime for each process. this is exposed in /proc/n/status
and the wait() syscall as 32 bit unsigned integers with
unit in milliseconds. which gives a maximum resolution of
arround 50 days.

internally, user/kernel time accounting works by hzclock()
just incrementing a per process counter of ticks. which is
not very precise and processes can evade time being accounted
by yielding so they'd miss the timer interrupt.

so there is an opportunity here that when going to 64 bit,
we could improve the time accounting and change the unit
to nanoseconds as well...

there are a bunch of things that can be done, but carefull
preparation is needed to not break backwards compatibility.

keeping the unit in milliseconds but increasing the bits
to 64 bit is trivial for the wait() syscall. as its already
parsed as text... and not many programs use wait() and look
at the process times. so there should be minimal breakage.

for /proc/n/status we have to see... the kernel makes an
effort to align all the fields in a fixed format. but ps(1)
doesnt really mind and tokenizes the whole thing. so we
need to check which programs still rely on the fixed format.

as for the patch... no. i dont like it. its a kludgy work
arround, not addressing the problem. and i dont see why you
need segattach() at all.

--
cinap

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-08-13  7:45 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-13  7:45 [9front] time cinap_lenrek
  -- strict thread matches above, loose matches on Subject: below --
2018-08-12 21:07 cinap_lenrek
2018-08-12 22:28 ` BurnZeZ
2018-08-12 19:11 cinap_lenrek
2018-08-12 20:49 ` BurnZeZ

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).