From: "Johann 'Myrkraverk' Oskarsson"
Newsgroups: gmane.emacs.gnus.general
Subject: Repost: Time to revisit the message id generation algorithm?
Date: Wed, 01 Feb 2012 18:18:20 +0000
To: Gnus Ding
User-Agent: Gnus/5.130001 (Ma Gnus v0.1) XEmacs/21.4.22 (usg-unix-v)

Hi all,

Reposting this here from gnu.emacs.gnus:

There are two issues I have with the message id generation algorithm, the (message-unique-id) function.

1) It bleeds information. This is an issue for those who use TOR or other anonymizers.

2) It may not be unique (anymore).

Let's start with issue 1).
==========================

The first two hash characters are from the unix user id - or simply the user name for those using MS-DOS, VMS or OS/2 - though I was unable to find anything more recent than Emacs 20.6 for that. There are GNU Emacs > 21 out there for MS-DOS and VMS.

The last four characters are .fsf. This is unique to Gnus.

Let's consider a person creating an anonymous email account, say with TOR, though any other such service will do. For the sake of argument, I do not consider Yahoo, Gmail, etc. anonymous since they include the originating IP address when using SMTP. She even has the forethought to set gnus-user-agent to nil and of course mail-host-address to the domain of the email service.

Now, said person writes an email to the CEO about unethical practices in the corporation where she's working. It just so happens the CEO is in on the practices, she's known to be the only one using Gnus for email, and the message id exposes her. Ergo, she is in trouble.

Even if she is not the only one using Gnus, the first two hash characters expose her Unix user id. Even though most people are using their own workstation with the default user id for the first account, that number tends to be different between distros/unix versions[*]. This may be enough to track the email to a specific person.
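To make the leak concrete, here is a minimal sketch in Python (not Emacs Lisp, and not the actual Gnus code). It assumes only what the text above describes: the user id is written in base 36 at the front of the message id. The digit order (0-9, then a-z) and the names `to_base36` and `recover_uid` are assumptions made for illustration; the real encoding may order its digits differently, but any fixed, reversible mapping leaks in the same way.

```python
import string

# Assumed digit order 0-9 then a-z; the real Gnus encoding may differ,
# but any fixed, reversible mapping exposes the uid just the same.
ALPHABET = string.digits + string.ascii_lowercase


def to_base36(number: int) -> str:
    """Encode a non-negative integer as a base-36 string."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, 36)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def recover_uid(prefix: str) -> int:
    """Decode the leading characters of a message id back into a uid."""
    return int(prefix, 36)


if __name__ == "__main__":
    # Typical first-account uids mentioned in the footnote.
    for uid in (101, 500, 1000):
        prefix = to_base36(uid)
        print(uid, "->", prefix, "->", recover_uid(prefix))
```

Under these assumptions every uid below 1296 (36 squared) fits in two characters, and anyone who sees the message id can turn those two characters back into the number.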
Depending on the seriousness of said unethical practices, that person may have just lost her life to (message-unique-id).

And now to issue 2).
====================

As said before, the first two hash characters are the unix user id. As many people are using their own workstations with the default user id, this is not very unique anymore.

The rest of the hash is calculated from a counter, message-unique-id-char, and the current unix timestamp in seconds. It is very probable that at any given point in time two people have the same value of the counter[**]. Now it is just a matter of those two persons pressing C-c C-c at the same time. This is not so far fetched, as the workstations' clocks may not be in sync.

This is unlikely to be a problem except for people setting the mail-host-address to their email provider. Say Google or Yahoo.

Final words.
============

I am not proposing any specific change at the moment.

As more and more people are using anonymizers like TOR, bleeding information is not a good idea anymore.

As many people are using public news servers and email providers, and probably setting their mail-host-address accordingly, the chance of id clashes is growing.

[*] For example, mine is 101 while regular Linux users will often have it 500 or 1000, or maybe 501 or 1001.

[**] The calculations seem to be a bit more involved than just a 1+ counter, but that is what has been the case in my experiments.
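The clash described under issue 2) can be modelled with a short Python sketch. This is a hypothetical stand-in, not the Gnus implementation: `model_message_id` and the way it joins its fields are invented for illustration. The only property taken from the text is that the id depends on nothing but the user id, the counter, the timestamp in seconds, and the host part.

```python
import string

ALPHABET = string.digits + string.ascii_lowercase  # assumed digit order


def to_base36(number: int) -> str:
    """Encode a non-negative integer as a base-36 string."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, 36)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def model_message_id(uid: int, counter: int, unix_seconds: int, host: str) -> str:
    """Hypothetical model: the id is a pure function of these four inputs.

    The real algorithm mixes the counter and timestamp differently; plain
    concatenation is used here only to keep the model readable.
    """
    left = to_base36(uid) + to_base36(unix_seconds) + to_base36(counter) + ".fsf"
    return f"<{left}@{host}>"


if __name__ == "__main__":
    # Two unrelated users, both with the default uid 1000, the same counter
    # value, and mail-host-address set to the same provider.
    alice = model_message_id(1000, 42, 1328120300, "gmail.com")

    # Bob sends five seconds later in real time, but his clock runs five
    # seconds slow, so his workstation reports the same timestamp.
    real_time, clock_skew = 1328120305, -5
    bob = model_message_id(1000, 42, real_time + clock_skew, "gmail.com")

    print(alice)
    print(bob)
    print("clash:", alice == bob)
```

With each user's own host name on the right-hand side the two ids would still differ, which is why the clash only matters once many people share one provider's domain.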