9front - general discussion about 9front
 help / color / mirror / Atom feed
* Re: [9front] fix ref822
@ 2019-11-22 21:24 ori
  0 siblings, 0 replies; 5+ messages in thread
From: ori @ 2019-11-22 21:24 UTC (permalink / raw)
  To: cinap_lenrek, 9front


> you can see that when there are no duplicates, that is, when
> you remove the uniq() functions, they'r both identical
> expressions.

Yeah. There's potential for being some elements short
in the old code because the last uniq may remove something.
So, this change is better.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [9front] fix ref822
@ 2019-11-22 21:14 cinap_lenrek
  0 siblings, 0 replies; 5+ messages in thread
From: cinap_lenrek @ 2019-11-22 21:14 UTC (permalink / raw)
  To: 9front

what do you mean with "different results"? what do i miss? i was
thinking that as we deduplicate the end result anyway, the only
difference would be that we retain more unique values in the case
of duplicates. but i might be missing something. the assumption was
that Nref was an arbitrary choosen value and the differences only
kick in when we reach that number of refernces in the array.

old code:

a = uniq(trunc-left(cat(a, uniq(trunc-right(tokenize(f))))))

new code:

a = trunc-left(uniq(cat(a, trunc-right(tokenize(f)))))

you can see that when there are no duplicates, that is, when
you remove the uniq() functions, they'r both identical
expressions.

--
cinap


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [9front] fix ref822
@ 2019-11-22 18:39 ori
  0 siblings, 0 replies; 5+ messages in thread
From: ori @ 2019-11-22 18:39 UTC (permalink / raw)
  To: cinap_lenrek, 9front

> i think this is also wrong:
> 
> +	if(i + n > Nref){
> +		for(i = 0; i < n; i++)
> +			free(a[i]);
> +		for(i = n; i < Nref; i++)
> +			a[i - n] = a[i];
>  	}
> 
> i is number of entries in a[], n is number of entries in f[]. if we got
> like i = 1 and n = Nref, we'd want to remove 1 item from a because:
> 
> i + n - Nref == 1 + Nref - Nref == 1
> 
> also, i think the uniqarray() approach is wrong. we probably want to
> deduplicate while we insert into a. we should never populate m->references
> with duplicate items in the first place.
> 
> delete uniquearray() and instead just do it like this:
> 
> a = m->references;
> n = tokenize(f, ....);
> 
> for(i=0; i<n; i++){
> 	for(j=0; j<Nref; j++){
> 		if(a[j] == nil || strcmp(a[j], f[i]) == 0)
> 			break;
> 	}
> 	if(j == Nref){
> 		// handle full case, shift a by one
> 	}
> 
> 	// now a[j] can be: nil = slot free, 
> 	a[j] = f[i];	// put into array
> }
> 
> --
> cinap

Actually, ignore that last message.  Yeah, this is better, though it
has produces marginally different results.

Updated patch:

diff -r 8b2040ba4785 sys/src/cmd/upas/fs/mbox.c
--- a/sys/src/cmd/upas/fs/mbox.c	Fri Nov 22 17:29:35 2019 +1030
+++ b/sys/src/cmd/upas/fs/mbox.c	Fri Nov 22 10:38:48 2019 -0800
@@ -857,25 +857,6 @@
 	return rtrim(strdup(skipwhite(p + h->len)));
 }
 
-/*
- * firefox, e.g. doesn't keep references unique
- */
-static int
-uniqarray(char **a, int n, int allocd)
-{
-	int i, j;
-
-	for(i = 0; i < n; i++)
-		for(j = i + 1; j < n; j++)
-			if(strcmp(a[i], a[j]) == 0){
-				if(allocd)
-					free(a[j]);
-				memmove(a + j, a + j + 1, sizeof *a*(n - (j + 1)));
-				a[--n] = 0;
-			}
-	return n;
-}
-
 static char*
 ref822(Message *m, Header *h, char*, char *p)
 {
@@ -886,26 +867,26 @@
 	n = getfields(s, f, nelem(f), 1, "<> \n\t\r,");
 	if(n > Nref)
 		n = Nref;
-	n = uniqarray(f, n, 0);
 	a = m->references;
 	for(i = 0; i < Nref; i++)
 		if(a[i] == nil)
 			break;
 	/*
 	 * if there are too many references, drop from the beginning
-	 * of the list.
+	 * of the list. If someone else has a duplicate, we keep the
+	 * old duplicate.
 	 */
-	if(i + n > Nref){
-		for(i = 0; i < n; i++)
-			free(a[i]);
-		for(i = n; i < Nref; i++)
-			a[i - n] = a[i];
+	for(i = 0; i < n; i++){
+		for(j = 0; j < Nref; j++)
+			if(a[j] == nil || strcmp(a[j], f[i]) == 0)
+				break;
+		if(j == Nref){
+			free(a[0]);
+			memmove(&a[0], &a[1], (Nref - 1) * sizeof(a[0]));
+			j--;
+		}
+		a[j] = strdup(f[i]);
 	}
-	for(j = 0; j < n;)
-		a[i++] = strdup(f[j++]);
-	n = uniqarray(a, i, 1);
-	if(n < Nref)
-		a[n] = nil;
 	free(s);
 	return (char*)~0;
 }



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [9front] fix ref822
@ 2019-11-22 17:59 ori
  0 siblings, 0 replies; 5+ messages in thread
From: ori @ 2019-11-22 17:59 UTC (permalink / raw)
  To: cinap_lenrek, 9front

> also, i think the uniqarray() approach is wrong. we probably want to
> deduplicate while we insert into a. we should never populate m->references
> with duplicate items in the first place.

The comment implied that other mail clients will mistakenly
give us duplicate refrences -- I don't know if we're guaranteed
that they'll all be together, so this may not be quite right
either.
 
> delete uniquearray() and instead just do it like this:
> 
> a = m->references;
> n = tokenize(f, ....);
> 
> for(i=0; i<n; i++){
> 	for(j=0; j<Nref; j++){
> 		if(a[j] == nil || strcmp(a[j], f[i]) == 0)
> 			break;
> 	}
> 	if(j == Nref){
> 		// handle full case, shift a by one
> 	}
> 
> 	// now a[j] can be: nil = slot free, 
> 	a[j] = f[i];	// put into array
> }
> 
> --
> cinap



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [9front] fix ref822
@ 2019-11-22  7:21 cinap_lenrek
  0 siblings, 0 replies; 5+ messages in thread
From: cinap_lenrek @ 2019-11-22  7:21 UTC (permalink / raw)
  To: 9front

i think this is also wrong:

+	if(i + n > Nref){
+		for(i = 0; i < n; i++)
+			free(a[i]);
+		for(i = n; i < Nref; i++)
+			a[i - n] = a[i];
 	}

i is number of entries in a[], n is number of entries in f[]. if we got
like i = 1 and n = Nref, we'd want to remove 1 item from a because:

i + n - Nref == 1 + Nref - Nref == 1

also, i think the uniqarray() approach is wrong. we probably want to
deduplicate while we insert into a. we should never populate m->references
with duplicate items in the first place.

delete uniquearray() and instead just do it like this:

a = m->references;
n = tokenize(f, ....);

for(i=0; i<n; i++){
	for(j=0; j<Nref; j++){
		if(a[j] == nil || strcmp(a[j], f[i]) == 0)
			break;
	}
	if(j == Nref){
		// handle full case, shift a by one
	}

	// now a[j] can be: nil = slot free, 
	a[j] = f[i];	// put into array
}

--
cinap


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-11-22 21:24 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-22 21:24 [9front] fix ref822 ori
  -- strict thread matches above, loose matches on Subject: below --
2019-11-22 21:14 cinap_lenrek
2019-11-22 18:39 ori
2019-11-22 17:59 ori
2019-11-22  7:21 cinap_lenrek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).