caml-list - the Caml user's mailing list
* porter's stemmer implementation
@ 2010-01-17  7:01 Grégoire Seux
  2010-01-17 19:12 ` [Caml-list] " Richard Jones
  2010-01-17 19:33 ` Yoann Padioleau
  0 siblings, 2 replies; 5+ messages in thread
From: Grégoire Seux @ 2010-01-17  7:01 UTC
  To: caml-list

Hello,

I am looking for an implementation of Porter's stemmer in OCaml. A few years
ago Erik Arneson published a link to his implementation, but the file is no
longer available. Does anyone have a copy of it, or another implementation?
More generally, I am interested in any OCaml library that could be used for
index construction.

Thanks in advance,

Gregoire Seux


* Re: [Caml-list] porter's stemmer implementation
From: Richard Jones @ 2010-01-17 19:12 UTC
  To: Grégoire Seux; +Cc: caml-list

On Sun, Jan 17, 2010 at 12:31:49PM +0530, Grégoire Seux wrote:
> Hello,
> 
> I am looking for an implementation of Porter's stemmer in OCaml. A few years
> ago Erik Arneson published a link to his implementation, but the file is no
> longer available. Does anyone have a copy of it, or another implementation?

The Internet Archive has it:

http://web.archive.org/web/*/http://www.aarg.net/~erik/ocaml/*

or:

http://www.annexia.org/tmp/stemmer-0.1.0.tar.gz

Sadly this code has no license information.  You would have to contact
the original author about that.

Rich.

-- 
Richard Jones
Red Hat



* Re: [Caml-list] porter's stemmer implementation
From: Yoann Padioleau @ 2010-01-17 19:33 UTC
  To: Grégoire Seux; +Cc: caml-list


On Jan 16, 2010, at 11:01 PM, Grégoire Seux wrote:

> Hello,
> 
> I am looking for an implementation of Porter's stemmer in ocaml.

There is one in NLTK, a very complete Python library for NLP, and there is
ocamlpython to link OCaml and Python code. As the stemmer interface is very
simple (string -> string), it is easy to call it from OCaml that way.

Basically, you can put this in a Python file, nltk_ocaml.py:

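import nltk

# One shared stemmer instance, reused across calls.
stemmer = nltk.PorterStemmer()

# The function called from OCaml: string -> string.
def stem(s):
    return stemmer.stem(s)

# Optional smoke test (Python 2 print syntax); runs NLTK's tree-drawing demo.
def test_nltk_ocaml():
    print "test_nltk_ocaml"
    nltk.draw.tree.demo()

and in an OCaml file, nltk.ml: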
open Pycaml

module Py = Python

(* Import nltk_ocaml.py and fetch its namespace once, at load time. *)
let modul = Py.cpr (Pycaml.pyimport_importmodule "nltk_ocaml")
let dict = Py.cpr (Pycaml.pymodule_getdict modul)

(* stem : string -> string, delegating to the Python function "stem". *)
let stem s =
  let py_str = Pycaml.pystring_fromstring s in
  let f = Py.cpr (Pycaml.pydict_getitemstring (dict, "stem")) in
  let args = Py.cpr (Pycaml.pytuple_fromsingle py_str) in
  let res = Py.cpr (Pycaml.pyeval_callobject (f, args)) in
  Pycaml.guarded_pystring_asstring res
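
Because the stemmer is just a function of type string -> string, indexing
code can take it as an ordinary argument and stay independent of where the
stemming actually happens. The following is a minimal sketch, not code from
this thread: tokenize, build_index and placeholder_stem are hypothetical
names, and placeholder_stem only strips a trailing "s" so that the example
runs without Pycaml or NLTK. It assumes OCaml 4.08 or later (for
Option.value).

```ocaml
(* Sketch only: the names below are hypothetical and belong to no library
   mentioned in this thread. *)

module StringMap = Map.Make (String)

(* Split a text into lowercase ASCII alphabetic tokens. *)
let tokenize (text : string) : string list =
  let buf = Buffer.create 16 in
  let tokens = ref [] in
  let flush_token () =
    if Buffer.length buf > 0 then begin
      tokens := Buffer.contents buf :: !tokens;
      Buffer.clear buf
    end
  in
  String.iter
    (fun c ->
      match Char.lowercase_ascii c with
      | 'a' .. 'z' as lc -> Buffer.add_char buf lc
      | _ -> flush_token ())
    text;
  flush_token ();
  List.rev !tokens

(* Build an inverted index: stemmed term -> sorted list of document ids.
   The stemmer is passed in, so any string -> string function will do. *)
let build_index ~(stem : string -> string) (docs : (int * string) list) =
  List.fold_left
    (fun index (doc_id, text) ->
      List.fold_left
        (fun index token ->
          let term = stem token in
          let ids = Option.value (StringMap.find_opt term index) ~default:[] in
          if List.mem doc_id ids then index
          else StringMap.add term (doc_id :: ids) index)
        index (tokenize text))
    StringMap.empty docs
  |> StringMap.map (List.sort compare)

(* Placeholder stemmer: strips one trailing "s". This is NOT Porter's
   algorithm; it only stands in for a real stemmer. *)
let placeholder_stem (w : string) : string =
  let n = String.length w in
  if n > 3 && w.[n - 1] = 's' then String.sub w 0 (n - 1) else w

let () =
  let index =
    build_index ~stem:placeholder_stem
      [ (1, "Cats chase birds"); (2, "A bird watches the cat") ]
  in
  (* Print each term followed by the ids of the documents containing it. *)
  StringMap.iter
    (fun term ids ->
      Printf.printf "%s: %s\n" term
        (String.concat ", " (List.map string_of_int ids)))
    index
```

With the binding above compiled as nltk.ml, passing ~stem:Nltk.stem instead
of the placeholder should presumably work unchanged, since only the type of
the stem argument matters.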








* Re: [Caml-list] porter's stemmer implementation
From: Guillaume Yziquel @ 2010-01-18  0:49 UTC
  To: Yoann Padioleau; +Cc: Grégoire Seux, caml-list

Yoann Padioleau wrote:
> On Jan 16, 2010, at 11:01 PM, Grégoire Seux wrote:
> 
> There is one in NLTK, a very complete Python library for NLP, and there is
> ocamlpython to link OCaml and Python code. As the stemmer interface is very
> simple (string -> string), it is easy to call it from OCaml that way.
> 
> open Pycaml
> 
> module Py = Python 
> let modul = Py.cpr (Pycaml.pyimport_importmodule "nltk_ocaml")
> let dict = Py.cpr (Pycaml.pymodule_getdict modul)
> 
> let stem s = 
>   let py_str = Pycaml.pystring_fromstring s in
>   let f = Py.cpr (Pycaml.pydict_getitemstring(dict, "stem")) in
>   let args = Py.cpr (Pycaml.pytuple_fromsingle py_str) in
>   let res = Py.cpr (Pycaml.pyeval_callobject (f,args)) in
>   Pycaml.guarded_pystring_asstring res

I'm afraid I do not understand where your Python module and your Py.cpr
value come from. Here is what I get with the Debian pycaml package:

# #require "pycaml";;
/usr/lib/ocaml/unix.cma: loaded
/usr/lib/ocaml/pycaml: added to search path
/usr/lib/ocaml/pycaml/pycaml.cma: loaded
# module X = Python;;
Error: Unbound module Python
# module X = Pycaml.Python;;
Error: Unbound module Pycaml.Python
# module X = Pycaml;;
module X :
   sig
     type funcptr = Pycaml.funcptr
     type pyobject = Pycaml.pyobject
     type funcent = funcptr * int * int
     type pymodule_func =
       Pycaml.pymodule_func = {
       pyml_name : string;
       pyml_func : pyobject -> pyobject;
       pyml_flags : int;
       pyml_doc : string;
     }
     type pyobject_type =
       Pycaml.pyobject_type =
         TupleType
       | StringType
       | IntType
       | FloatType
       | ListType
       | NoneType
       | CallableType

Where did you get your pycaml from?

All the best,

-- 
      Guillaume Yziquel
http://yziquel.homelinux.org/



* Re: [Caml-list] porter's stemmer implementation
From: Yoann Padioleau @ 2010-01-18  2:33 UTC
  To: guillaume.yziquel; +Cc: Grégoire Seux, caml-list


On Jan 17, 2010, at 4:49 PM, Guillaume Yziquel wrote:

> I'm afraid I do not understand where your Python module and your Py.cpr value come from. Here is what I get with the Debian pycaml package:

Oops. I have an extra module called python.ml containing small
helper functions:

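exception PythonError

(* Henrik has written similar helpers. *)
(* Return v unchanged, or print the pending Python error and raise
   PythonError if v is the null object. *)
let check_python_return v =
  if v = Pycaml.pynull ()
  then begin
    Pycaml.pyerr_print ();
    raise PythonError
  end
  else v

(* Short alias, used as Py.cpr in nltk.ml. *)
let cpr = check_python_return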


> Where did you get your pycaml from?

A mix of:
 - Art Yerkes's original pycaml 0.82,
 - Henrik Stuart's port to Python 2.5,
 - Thomas Fischbacher's heavy extension.

