From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <Thomas.Fischbacher@Physik.Uni-Muenchen.DE>
Delivered-To: caml-list@yquem.inria.fr
Date: Sat, 12 Feb 2005 22:58:36 +0100 (CET)
From: Thomas Fischbacher <Thomas.Fischbacher@Physik.Uni-Muenchen.DE>
To: Brian Hurt <bhurt@spnz.org>
Cc: Michael Walter <michael.walter@gmail.com>,
	skaller <skaller@users.sourceforge.net>, Jon <jdh30@cam.ac.uk>,
	caml-list@yquem.inria.fr
Subject: Re: [Caml-list] The boon of static type checking
In-Reply-To: <Pine.LNX.4.44.0502121233300.5563-100000@localhost.localdomain>
Message-ID: <Pine.LNX.4.58.0502122211110.22145@eiger.cip.physik.uni-muenchen.de>
References: <Pine.LNX.4.44.0502121233300.5563-100000@localhost.localdomain>
X-BOFH: Daemons did it
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII


On Sat, 12 Feb 2005, Brian Hurt wrote:

> Note that with N=1.5, a 100,000 line program takes 1,000 man-days, or 50 
> man-months (4 man-years), and a 1,000,000 line program takes man 
> centuries.
> 
> The important point is the scaling is exponential, not linear.

Sorry, that's power-law scaling.

Exponential would mean that it's of the form a*exp(x/b), where the 
dimensionful constant b sets a characteristic scale.
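
To make the distinction concrete, here is a short worked comparison. The symbols c and lambda are introduced only for this illustration; N is the exponent from the quoted text, and a, b are the constants of the exponential form above.

```latex
\begin{align*}
f(x) &= c\,x^{N}
  &&\Rightarrow\quad \frac{f(\lambda x)}{f(x)} = \lambda^{N}
  \quad\text{(independent of $x$)},\\
g(x) &= a\,e^{x/b}
  &&\Rightarrow\quad \frac{g(\lambda x)}{g(x)} = e^{(\lambda-1)\,x/b}
  \quad\text{(depends on $x/b$)}.
\end{align*}
% With N = 1.5 and lambda = 10: 10^{1.5} \approx 31.6, whatever the starting size.
```

So with N=1.5, ten times as many lines costs roughly 32 times the effort at every size, which is what the quoted figures describe. For the exponential form, the cost of a tenfold increase would itself depend on how large x already is compared to b.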

Power-law scaling is the only behaviour that can be expected from 
scale-invariant systems. That is, take some complex system which may well 
be determined by staggeringly complicated short-distance effects (say, 
intra-orbital quantum interaction between magnetic atoms - can it get 
any worse than that?).

Then, look at this system at larger and larger scales. Try to find a way 
to describe it which allows you to perform a scale transformation, that 
is, look at interactions/interrelations between clusters, in such a way 
that the scale-transformed system can be described in a formally 
equivalent way to the original system, only with other interaction 
parameters.

Then, study the flow of your system parameters under many, many 
re-scalings. There are different types of behaviour; the simplest cases 
one can expect are:

(i) Some parameters blow up.

(ii) Some parameters vanish.

(iii) Some parameters reach non-trivial fixed points.

If you have (i), this means - more or less - your system doesn't scale. We 
will not consider this any further.

(ii) means that at very large scales, a lot of the special characteristics 
drop out; only those that can exhibit proper scaling behaviour survive, 
namely those of (iii). Interestingly, one frequently finds that one can 
easily enumerate and fully classify those few effects that survive 
scaling. This leads to the interesting observation that quite a few systems 
that may be determined by wildly different micro-effects show precisely 
the same macroscopic behaviour.
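
The three kinds of flow described above can be seen in a toy example. The sketch below iterates a single coupling K under the recursion K' = 0.5*log(cosh(4K)), which is of the kind that shows up in Migdal-Kadanoff-style approximations of the 2D Ising model. The recursion and all function names are assumptions chosen for illustration; they are not taken from this message. Here one parameter shows all three outcomes depending on where it starts, whereas the text talks about different parameters of one system.

```python
import math


def log_cosh(x):
    """Numerically stable log(cosh(x)); avoids overflow for large |x|."""
    ax = abs(x)
    value = ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)
    return max(0.0, value)  # log(cosh) is never negative; clamp rounding noise


def rescale(k):
    """One rescaling step of the toy recursion K' = 0.5 * log(cosh(4K))."""
    return 0.5 * log_cosh(4.0 * k)


def flow(k0, steps):
    """Return the list [K0, K1, ..., K_steps] under repeated rescaling."""
    ks = [k0]
    for _ in range(steps):
        ks.append(rescale(ks[-1]))
    return ks


def nontrivial_fixed_point(lo=0.1, hi=1.0, tol=1e-12):
    """Find K* > 0 with rescale(K*) == K* by bisection on rescale(k) - k."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rescale(mid) < mid:
            lo = mid  # below the fixed point the coupling shrinks
        else:
            hi = mid  # above it the coupling grows
    return 0.5 * (lo + hi)


k_star = nontrivial_fixed_point()
shrinking = flow(0.2, 10)  # starts below K*: the parameter vanishes, case (ii)
growing = flow(0.4, 10)    # starts above K*: the parameter blows up, case (i)
```

Starting exactly at `k_star` the coupling stays put, which is case (iii). Any start below it flows to zero and any start above it grows without bound, so the large-scale behaviour is fixed by which side of the fixed point the system starts on, not by the microscopic details.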

All in all, power-law behaviour is what should normally be expected for 
questions that do not have an obvious characteristic scale. What's the 
typical size of a software project? Here, I'd expect power-law behaviour, 
as there is no natural answer.

Indeed:

grep Installed-Size /var/lib/dpkg/available |
  perl -pe 'm/: (\d+)/;$h[$1/100]++}{$n=1;$_=join "\n",map{$n++." ".(0+$_)}@h[1..100]' >/tmp/a
echo -e 'set logscale x\nset logscale y\nplot "/tmp/a" with impulses, 2600*x**-1.25,1000*exp(-x/20)\npause 20' |
  gnuplot /proc/self/fd/0

See what I mean? Exponential behaviour *does* have a characteristic scale: 
the size increase over which the number of projects is halved.
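
For anyone who would rather not decode the one-liner, here is a minimal Python sketch of the same idea: bin the Installed-Size values into bins of 100 and estimate the slope of the histogram on log-log axes. All function names are hypothetical, and the least-squares fit is my own addition; the command above only overlays hand-picked curves.

```python
import math
import re
from collections import Counter

SIZE_LINE = re.compile(r"^Installed-Size:\s*(\d+)")


def read_installed_sizes(path="/var/lib/dpkg/available"):
    """Collect the Installed-Size values from a dpkg package list."""
    sizes = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            match = SIZE_LINE.match(line)
            if match:
                sizes.append(int(match.group(1)))
    return sizes


def bin_counts(sizes, bin_width=100, first_bin=1, last_bin=100):
    """Count sizes per bin; bin k holds [k*bin_width, (k+1)*bin_width).

    This mirrors the integer index $h[$1/100] in the Perl one-liner.
    """
    counts = Counter(int(s) // bin_width for s in sizes)
    return [(k, counts.get(k, 0)) for k in range(first_bin, last_bin + 1)]


def loglog_slope(points):
    """Least-squares slope of log(count) against log(bin index).

    Empty bins are skipped because log(0) is undefined.
    """
    pairs = [(math.log(k), math.log(c)) for k, c in points if k > 0 and c > 0]
    if len(pairs) < 2:
        raise ValueError("need at least two non-empty bins")
    mean_x = sum(x for x, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / sxx
```

On a Debian system, `loglog_slope(bin_counts(read_installed_sizes()))` gives the fitted exponent. A roughly straight line on log-log axes points to a power law, whereas an exponential would bend downwards more and more steeply.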

-- 
regards,               tf@cip.physik.uni-muenchen.de              (o_
 Thomas Fischbacher -  http://www.cip.physik.uni-muenchen.de/~tf  //\
(lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y)           V_/_
(if (= x 0) y (g g (- x 1) (* x y)))) n 1))                  (Debian GNU)