caml-list - the Caml user's mailing list
From: Markus Mottl <markus@oefai.at>
To: Viktor Tron <v.tron@ed.ac.uk>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] entropy etc. in OCaml
Date: Wed, 11 Aug 2004 12:22:30 +0200
Message-ID: <20040811102230.GA23569@fichte.ai.univie.ac.at>
In-Reply-To: <opsckajhcp65a1wk@postbox.inf.ed.ac.uk>

On Wed, 11 Aug 2004, Viktor Tron wrote:
> Does anyone know of an OCaml library implementing or binding
> Information Theoretical concepts like data entropy?
> (e.g. gsl does not provide these).

I don't have a separate library for that, but you might want to take a
look at AIFAD:

  http://www.oefai.at/~markus/aifad

It implements several functions for computing the entropy of discrete
data including structured values.  Its purpose is decision tree learning
on structured data (represented by algebraic datatypes).

One function for computing entropy from histograms is the following
(taken from src/entropy_utils.ml in the distribution):

---------------------------------------------------------------------------
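(* The excerpt uses log_2 and log2 without showing them; these
   definitions are assumed, not copied from the distribution. *)
let log_2 = log 2.0
let log2 x = log x /. log_2

let calc_entropy histo n =
  if n = 0 then 0.0
  else
    (* Accumulate sum_i f_i * ln f_i, skipping empty bins. *)
    let rec loop sum ix =
      if ix < 0 then sum
      else
        let freq = histo.(ix) in
        if freq = 0 then loop sum (ix - 1)
        else
          let ffreq = float freq in
          loop (sum +. ffreq *. log ffreq) (ix - 1) in
    let sum = loop 0.0 (Array.length histo - 1) in
    let f_n = float n in
    log2 f_n -. sum /. f_n /. log_2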
---------------------------------------------------------------------------
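
The last line of the function can be checked against the usual definition
of Shannon entropy. A short derivation, where f_i stands for histo.(i) and
H for the result (both symbols are notation introduced here, not taken from
the source), and assuming n is the sum of all f_i:

```latex
\begin{align*}
H &= -\sum_{i} \frac{f_i}{n} \log_2 \frac{f_i}{n} \\
  &= -\frac{1}{n} \sum_{i} f_i \log_2 f_i + \frac{\log_2 n}{n} \sum_{i} f_i \\
  &= \log_2 n - \frac{1}{n \ln 2} \sum_{i} f_i \ln f_i
\end{align*}
```

Bins with f_i = 0 contribute nothing to the sum, which is why the loop
skips them instead of evaluating log 0.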

Pass it an array of integers ("histo") holding the frequency of each class
value, together with the number of observations "n" (which must equal the
sum of the frequencies in the histogram), and the function returns the
entropy in bits.
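
A minimal usage sketch, in case it helps. It repeats the function so that
it runs on its own. The definitions of log_2 and log2 are assumptions (the
excerpt does not show them), and histogram_of_labels and entropy_of_labels
are hypothetical helpers that are not part of AIFAD. Class labels are
assumed to be integers in the range 0 .. n_classes - 1.

```ocaml
(* Assumed helper definitions; the excerpt uses them without showing them. *)
let log_2 = log 2.0
let log2 x = log x /. log_2

(* calc_entropy as in the excerpt. *)
let calc_entropy histo n =
  if n = 0 then 0.0
  else
    let rec loop sum ix =
      if ix < 0 then sum
      else
        let freq = histo.(ix) in
        if freq = 0 then loop sum (ix - 1)
        else
          let ffreq = float freq in
          loop (sum +. ffreq *. log ffreq) (ix - 1) in
    let sum = loop 0.0 (Array.length histo - 1) in
    let f_n = float n in
    log2 f_n -. sum /. f_n /. log_2

(* Hypothetical helper: count how often each class label occurs.
   Labels must lie in 0 .. n_classes - 1. *)
let histogram_of_labels n_classes labels =
  let histo = Array.make n_classes 0 in
  List.iter (fun c -> histo.(c) <- histo.(c) + 1) labels;
  histo

(* Hypothetical wrapper: derive n from the histogram itself, so that it
   always equals the sum of the frequencies. *)
let entropy_of_labels n_classes labels =
  let histo = histogram_of_labels n_classes labels in
  calc_entropy histo (Array.fold_left (+) 0 histo)

let () =
  (* Two equally frequent classes. *)
  Printf.printf "%.4f\n" (entropy_of_labels 2 [0; 1; 0; 1]);
  (* A single class only. *)
  Printf.printf "%.4f\n" (entropy_of_labels 3 [0; 0; 0; 0])
```

Two equally frequent classes should give about 1 bit, and a sample with
only one class should give 0 bits. Computing n from the histogram keeps the
two arguments from getting out of sync.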

Regards,
Markus

-- 
Markus Mottl          http://www.oefai.at/~markus          markus@oefai.at
