From: Markus Mottl
To: Viktor Tron
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] entropy etc. in OCaml
Date: Wed, 11 Aug 2004 12:22:30 +0200

On Wed, 11 Aug 2004, Viktor Tron wrote:
> Does anyone know of an OCaml library implementing or binding
> Information Theoretical concepts like data entropy?
> (e.g. gsl does not provide these).

I don't have a separate library for that, but you might want to take a
look at AIFAD:

  http://www.oefai.at/~markus/aifad

It implements several functions for computing the entropy of discrete
data, including structured values.  Its purpose is decision tree
learning on structured data (represented by algebraic datatypes).

One function for computing entropy from histograms is the following
(taken from src/entropy_utils.ml in the distribution):

---------------------------------------------------------------------------
let calc_entropy histo n =
  if n = 0 then 0.0
  else
    let rec loop sum ix =
      if ix < 0 then sum
      else
        let freq = histo.(ix) in
        if freq = 0 then loop sum (ix - 1)
        else
          let ffreq = float freq in
          loop (sum +. ffreq *. log ffreq) (ix - 1)
    in
    let sum = loop 0.0 (Array.length histo - 1) in
    let f_n = float n in
    log2 f_n -. sum /. f_n /. log_2
---------------------------------------------------------------------------

If you pass it an array of integers in "histo" (a histogram counting the
frequency of each class value) and the number of observations in "n"
(which must equal the sum of the frequencies in the histogram), the
function returns the entropy in bits.

Regards,
Markus

--
Markus Mottl          http://www.oefai.at/~markus          markus@oefai.at

-------------------
To unsubscribe, mail caml-list-request@inria.fr
Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs
FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
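
A minimal usage sketch for the calc_entropy function above. The two
helpers it relies on, log_2 and log2, are not shown in the snippet; the
definitions below (natural logarithm of 2 and base-2 logarithm) are
assumptions consistent with the formula, not code taken from AIFAD, and
the histogram values are made up for illustration:

---------------------------------------------------------------------------
(* Assumed helpers: log_2 = ln 2, log2 = base-2 logarithm.  These names
   come from the snippet above; their definitions here are guesses. *)
let log_2 = log 2.
let log2 x = log x /. log_2

(* calc_entropy as given in the message above, then: *)
let () =
  (* hypothetical histogram: 8 observations of one class, 4 each of two others *)
  let histo = [| 8; 4; 4 |] in
  let n = Array.fold_left (+) 0 histo in     (* 16 observations in total *)
  Printf.printf "entropy = %g bits\n" (calc_entropy histo n)
  (* should print 1.5 (up to floating-point rounding), i.e.
     -(0.5 * log2 0.5 + 0.25 * log2 0.25 + 0.25 * log2 0.25) *)
---------------------------------------------------------------------------

Passing a histogram with a single nonzero entry would give 0.0 bits, and
a uniform histogram over k classes would give log2 k bits, which matches
the usual definition of entropy over the empirical class distribution.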