caml-list - the Caml user's mailing list
* Re: camlex/camlyacc + threads problem
@ 1997-09-16 14:25 Thierry Bravier
From: Thierry Bravier @ 1997-09-16 14:25 UTC
  To: caml-list

Subject: Re: camlex/camlyacc + threads problem

Olivier Bouyssou (mailto:bouyssou@didac-mip.fr) wrote:
> 
> I have a problem when I use a parsing function in
> several threads
> 

#==============================================================================
French summary (translated):

Dear ocamlers,

Following O. Bouyssou's message, I tried the modified version of
ocamlyacc that I had proposed, which is already available in the
archives of this mailing list at
http://pauillac.inria.fr/caml/caml-list/0947.html
(Re: re-entrance of ocamlyacc parsers)

On his example, the limitations of the current version of ocamlyacc
no longer show up with the modified version.

I think the problem comes from the combined use of ocaml threads
and a global variable in the current stdlib/parsing.

I hope these modifications will seem useful to you
and/or that some of you will want to respond to this message.

The English version is more detailed ...

Thierry Bravier         Dassault Aviation - DGT / DTN / ELO / EAV
78, Quai Marcel Dassault       F-92214 Saint-Cloud Cedex - France
Telephone : (33) 01 47 11 53 07   Telecopie : (33) 01 47 11 52 83
E-Mail :              mailto:thierry.bravier@dassault-aviation.fr

#==============================================================================
English version:

Dear ocamlers,

I have tried to compile and execute O. Bouyssou's example
(ocamlyacc + threads). Unexpected 'Parse_error' exceptions are raised,
as he said.

In my opinion, the current implementation of stdlib/parsing.ml[i]
suffers from a limitation: in the current version, a global variable
is shared among all ocamlyacc-generated parsers.

In a recent mail about this topic, I stated that this could be a
limitation to parser re-entrance but did not provide a convincing
example. See http://pauillac.inria.fr/caml/caml-list/0947.html
(Re: re-entrance of ocamlyacc parsers)

In the case of threads, my understanding is that the ocaml runtime
library switches between threads at allocation points.

This, I think, is (likely to be) incompatible with a global variable
subject to side effects.

Anyway, I tried to compile the example with the modified versions of
stdlib/parsing.ml[i] and of ocamlyacc I recently sent to this list.
The executable works correctly (without mutex of course).

The modified ocamlyacc is just a perl patch on top of the standard
ocamlyacc. I agree that it would be much better to incorporate
its modifications into the regular ocamlyacc; I just needed to check
first that this would not be wasted effort.

The modified stdlib/parsing module provides only functions instead of
shared global variables, which I think solves the problem.
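
To illustrate the difference between the two designs, here is a minimal
sketch. It is not the code of stdlib/parsing or of the modified module:
the toy "parser" (which just sums the digits of a string) and every name
in it are hypothetical. Two parses are interleaved by hand, the way a
thread switch might interleave them. With one global accumulator the
parses clobber each other; with per-parse state they do not.

```ocaml
(* Hypothetical toy, not the real Parsing module: a "parser" that sums
   the digits of a two-character string, one step at a time. *)

let digit c = Char.code c - Char.code '0'

(* Variant 1: the state is a global shared by every parse. *)
let global_acc = ref 0
let global_start () = global_acc := 0
let global_step c = global_acc := !global_acc + digit c
let global_result () = !global_acc

(* Variant 2: each parse owns its state and passes it explicitly. *)
type state = { mutable acc : int }
let start () = { acc = 0 }
let step st c = st.acc <- st.acc + digit c
let result st = st.acc

(* Interleave two parses by hand, as a scheduler might. *)
let interleaved_global a b =
  global_start ();
  global_step a.[0];
  global_start ();            (* parse b starts and resets the shared state *)
  global_step b.[0];
  global_step a.[1];          (* parse a resumes on clobbered state *)
  let ra = global_result () in
  global_step b.[1];
  let rb = global_result () in
  (ra, rb)

let interleaved_local a b =
  let sa = start () in
  step sa a.[0];
  let sb = start () in        (* parse b gets its own state *)
  step sb b.[0];
  step sa a.[1];
  let ra = result sa in
  step sb b.[1];
  let rb = result sb in
  (ra, rb)
```

For the inputs "12" and "34" the correct sums are 3 and 7. Only
interleaved_local returns them; interleaved_global returns values
polluted by the other parse.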

The code of the modified version of ocamlyacc is not included here
(it is already available on this list archive).

The example below is compiled by the script mail.build with the
standard or modified version of ocamlyacc.

During execution, a mutex is used iff $MUTEX is set.

Please note that this example does not demonstrate the additional
features of the modified version of ocamlyacc: recursive parser calls
and, most importantly, the use of a parsing environment (see the %env
directive). These were already presented in my last mail.

In the hope that some of you will find this useful
or will feel like answering this message.

Thierry Bravier         Dassault Aviation - DGT / DTN / ELO / EAV
78, Quai Marcel Dassault       F-92214 Saint-Cloud Cedex - France
Telephone : (33) 01 47 11 53 07   Telecopie : (33) 01 47 11 52 83
E-Mail :              mailto:thierry.bravier@dassault-aviation.fr

# Listing of files:
#==============================================================================
$ ls ../parsing # modified version of stdlib/parsing.ml{,i}
parsing.cmi     parsing.cmo     parsing.ml      parsing.mli

$ ls -ls mail_parser.mly mail_lexer.mll mail_main.ml mail.build
   4 -rwxr-xr-x  1 tb           1290 Sep 16 13:24 mail.build
   4 -rw-r--r--  1 tb            669 Sep 16 12:43 mail_lexer.mll
   4 -rw-r--r--  1 tb            502 Sep 16 12:41 mail_main.ml
   4 -rw-r--r--  1 tb           1220 Sep 16 12:52 mail_parser.mly
 
$ head -9999 mail_parser.mly mail_lexer.mll mail_main.ml mail.build
==> mail_parser.mly <==
/*
============================================================================
 *        File: mail_parser.mly
 *    Language: camlyacc
 *      Author: Thierry Bravier
 *  Time-stamp: <97/09/16 12:53:53 tb>
 *     Created:  97/09/16 12:44:35 tb
 *
=========================================================================
*/
%{
let below max n = if (n > max) then failwith "below"; n
%}

/*
=========================================================================
*/
%token <int>            INT     /* Int Literal           */
%token COLON            /* ":"    */
%token HYPHEN           /* "-"    */
%token EOF              /* eof    */

/*
=========================================================================
*/
%start mail_yacc
%type <(int * int) list> mail_yacc

%%
/*
=========================================================================
*/
mail_yacc:
| intervals EOF { $1 }
;

intervals:
| interval intervals    { $1 :: $2 }
| /* empty */           { [] }
;

interval:
| hour HYPHEN hour      { $1, $3 }
;

hour:
| INT           { (below 23 $1) * 60 }
| INT COLON INT { (below 23 $1) * 60 + (below 59 $3) }
;

/*
=========================================================================
*/
%%

(*
=========================================================================
*)

==> mail_lexer.mll <==
(*
============================================================================
 *        File: mail_lexer.mll
 *    Language: camllex
 *      Author: Thierry Bravier
 *  Time-stamp: <97/09/16 12:44:27 tb>
 *     Created:  97/09/16 12:44:08 tb
 *
=========================================================================
*)
{
open Mail_parser
}

(*
=========================================================================
*)
rule mail_lex = parse
|['0'-'9'] ['0'-'9'] ? { INT (int_of_string (Lexing.lexeme lexbuf)) }
| ":"           { COLON       }
| "-"           { HYPHEN      }
| eof           { EOF         }

(*
=========================================================================
*)

==> mail_main.ml <==
let with_mutex =
  try Sys.getenv "MUTEX"; true
  with Not_found -> false

let mutex = Mutex.create ()

let task () =
  while true do
    let lexbuf = Lexing.from_string "0-23:59" in
    if with_mutex then Mutex.lock mutex;
    Mail_parser.mail_yacc Mail_lexer.mail_lex lexbuf;
    if with_mutex then Mutex.unlock mutex;
    if (Random.int 100) = 10 then ThreadUnix.sleep 1;
  done

let main () =
  for i = 0 to 100 do Thread.create task () done;
  while true do ThreadUnix.sleep 1 done
;;

main ()
;;

==> mail.build <==
#!/bin/sh -x
#==============================================================================
#         File: mail.build
#     Language: sh
#       Author: Thierry Bravier
#   Time-stamp: <97/09/16 13:25:34 tb>
#      Created:  97/09/16 13:24:17 tb
#==============================================================================
STDLIB=/logiciel/ML/ocaml/stdlib
MODIFIED_YACC=/home/asf/tb/ESTEREL/ATLEAST/Root/OCAMLYACC/Ocamlyacc

#==============================================================================
if [ "$1" != 'standard' ]; then
   WHICH='modified';
   OCAMLYACC=${MODIFIED_YACC};
   PARSING_I='-I ../parsing';
   PARSING_F=parsing.cmo;
else
   WHICH='standard';
   OCAMLYACC=ocamlyacc;
   PARSING_I='';
   PARSING_F='';
fi

OCAMLC="ocamlc ${PARSING_I} -c"

#==============================================================================
${OCAMLYACC} mail_parser.mly
${OCAMLC} mail_parser.mli
${OCAMLC} mail_parser.ml

ocamllex mail_lexer.mll
${OCAMLC} mail_lexer.ml

${OCAMLC} -I ${STDLIB}/threads mail_main.ml

ocamlc -o main.${WHICH} \
   -thread -custom unix.cma threads.cma \
   ${PARSING_I} ${PARSING_F} \
   mail_parser.cmo mail_lexer.cmo mail_main.cmo \
   -cclib -lunix -cclib -lthreads

#==============================================================================
$ 

# Demo with the standard ocamlyacc:
#==============================================================================

# Compilation
#------------------------------------------------------------------------------
$ ./mail.build standard
STDLIB=/logiciel/ML/ocaml/stdlib
MODIFIED_YACC=/home/asf/tb/ESTEREL/ATLEAST/Root/OCAMLYACC/Ocamlyacc
+ [ standard != standard ] 
WHICH=standard
OCAMLYACC=ocamlyacc
PARSING_I=
PARSING_F=
OCAMLC=ocamlc  -c
+ ocamlyacc mail_parser.mly 
+ ocamlc -c mail_parser.mli 
+ ocamlc -c mail_parser.ml 
+ ocamllex mail_lexer.mll 
6 states, 267 transitions, table size 1104 bytes
+ ocamlc -c mail_lexer.ml 
+ ocamlc -c -I /logiciel/ML/ocaml/stdlib/threads mail_main.ml 
+ ocamlc -o main.standard -thread -custom unix.cma threads.cma mail_parser.cmo mail_lexer.cmo mail_main.cmo -cclib -lunix -cclib -lthreads 

# Execution : without mutex: Parse_error as said in O. Bouyssou's message
#------------------------------------------------------------------------------
$ ./main.standard
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Failure("below")
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Failure("below")
Uncaught exception: Parse_error
Uncaught exception: Failure("below")
Uncaught exception: Parse_error
Uncaught exception: Parse_error
Uncaught exception: Parse_error

Ctrl-C

# Execution : with a mutex: no Parse_error as said in O. Bouyssou's message
#------------------------------------------------------------------------------
$ MUTEX=1 ./main.standard

Ctrl-C

# Demo with the modified ocamlyacc:
#==============================================================================

# Compilation
#------------------------------------------------------------------------------
$ ./mail.build modified
STDLIB=/logiciel/ML/ocaml/stdlib
MODIFIED_YACC=/home/asf/tb/ESTEREL/ATLEAST/Root/OCAMLYACC/Ocamlyacc
+ [ modified != standard ] 
WHICH=modified
OCAMLYACC=/home/asf/tb/ESTEREL/ATLEAST/Root/OCAMLYACC/Ocamlyacc
PARSING_I=-I ../parsing
PARSING_F=parsing.cmo
OCAMLC=ocamlc -I ../parsing -c
+ /home/asf/tb/ESTEREL/ATLEAST/Root/OCAMLYACC/Ocamlyacc mail_parser.mly 
+ ocamlc -I ../parsing -c mail_parser.mli 
+ ocamlc -I ../parsing -c mail_parser.ml 
+ ocamllex mail_lexer.mll 
6 states, 267 transitions, table size 1104 bytes
+ ocamlc -I ../parsing -c mail_lexer.ml 
+ ocamlc -I ../parsing -c -I /logiciel/ML/ocaml/stdlib/threads mail_main.ml 
+ ocamlc -o main.modified -thread -custom unix.cma threads.cma -I ../parsing parsing.cmo mail_parser.cmo mail_lexer.cmo mail_main.cmo -cclib -lunix -cclib -lthreads 

# Execution : without mutex: no Parse_error any more
#------------------------------------------------------------------------------
$ ./main.modified
Ctrl-C

$






* Re: camlex/camlyacc + threads problem
       [not found] <199709231254.OAA20307@yeti2.didac-mip.fr>
@ 1997-09-25 12:19 ` Xavier Leroy
From: Xavier Leroy @ 1997-09-25 12:19 UTC
  To: Jean-Claude Laffitte; +Cc: caml-list

> I make intensive use of threads and I have a strong need for safety. So
> the question is: what does *thread-safe* mean?

To quote Dave Butenhof, "Programming with Posix threads":

        ``Thread-safe'' means that the code can be called from
        multiple threads without destructive results.  It does not
        require that the code run efficiently in multiple threads,
        only that it can operate safely in multiple threads.

> Is it just a problem with global values (critical section), or do I only 
> use the pervasive library in my threads?
> For example, is this code safe ? :
> 
> let crazy name = 
>     let counter = 0 in 
>     while ( true ) do
>       let str = String.create 30                       (* safe ? *)
>       and arr = Array.create 10 a in                   (* safe ? *)
>         Printf.sprintf str "%s : %d" name counter      (* safe ? *)
>    done
>   
> let main  =
>   begin
>     let t1 =  Thread.create crazy "First"
>     and t2 =  Thread.create crazy "second" 
>     and t3 =  Thread.create crazy "Third" in
>     Thread.join (t3); 
>   end

Yes, it is. (Though it does not typecheck.)

> Do I need mutexes for all the Arrays, Lists, Strings, Stacks ... I use  ??

There are essentially four kinds of data structures / library functions:

1- Purely functional data structures and functions (no side effects):
   these are always thread-safe.  The module List is a good example.
   No need for mutexes.

2- Basic functions over mutable data structures, e.g. reading and writing
   references, or elements of arrays or strings, or Array.create,
   or String.create.  These are atomic operations, meaning that if two
   threads A and B assign the same array element, then either
   A will assign it, then B, or B, then A,
   but no weird behavior will occur that might cause the array element
   to hold a value other than that stored by A or that stored by B.
   I/O functions from Pervasives also fall in this class.
   You don't have to protect these structures with a mutex, though
   there are many cases where you will want to, e.g. for guaranteeing
   that a sequence of assignments over a shared array
   is performed atomically.

3- More complicated functions over mutable data structures, e.g.
   Array.copy, or the functions in Hashtbl, Stack, Lexing:
   modifications on these data structures are not atomic, so if two threads
   modify the same structure concurrently, internal invariants may be broken
   and unexpected results ensue.  You should always associate these data
   structures with a mutex, or make sure they are used in only one thread.

4- Functions with internal global state.  The only example in the whole
   standard library is the Parsing module.  Here, it's unsafe to call
   one of these functions from two threads simultaneously, even on
   different arguments (e.g. different Lexing.lexbuf arguments for
   a parser).  Use a global mutex or make sure only one thread does
   parsing.

Of all these functions, class 4 is the most troublesome, and I expect
to get rid of it entirely for the next release of O'Caml (e.g. by putting
the parsing state inside the Lexing.lexbuf argument).

Class 3 could be made thread-safe by adding mutexes inside the library
modules, but this is problematic.  For instance, it is impractical to
associate a mutex with each array or string.  It also imposes
significant overhead on the standard library, especially in the
non-threaded case.  Finally, it often makes more sense to use a mutex in
the user code to protect a group of related standard library data
structures (e.g. a hashtable and an array), rather than rely on
fine-grained locking in the library itself.
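
As a hedged illustration of that last point, the sketch below guards a
hash table and an array with a single user-level mutex, so that both are
always updated together. The word-count scenario and all names (stats,
record, count, total, with_lock) are hypothetical. It assumes a recent
OCaml (Fun.protect needs 4.08 or later) and, on OCaml 4.x, linking with
the threads library to get Mutex.

```ocaml
(* Run [f] with [m] held, releasing [m] even if [f] raises. *)
let with_lock m f =
  Mutex.lock m;
  Fun.protect ~finally:(fun () -> Mutex.unlock m) f

(* Hypothetical shared structure: per-word counts plus a grand total.
   The two must stay consistent, so one mutex protects both. *)
type stats = {
  lock : Mutex.t;
  counts : (string, int) Hashtbl.t;
  totals : int array;           (* totals.(0) = number of recorded words *)
}

let create () =
  { lock = Mutex.create (); counts = Hashtbl.create 16; totals = [| 0 |] }

(* Update the table and the array as one critical section. *)
let record s word =
  with_lock s.lock (fun () ->
    let n = try Hashtbl.find s.counts word with Not_found -> 0 in
    Hashtbl.replace s.counts word (n + 1);
    s.totals.(0) <- s.totals.(0) + 1)

let count s word =
  with_lock s.lock (fun () ->
    try Hashtbl.find s.counts word with Not_found -> 0)

let total s = with_lock s.lock (fun () -> s.totals.(0))
```

Callers never touch the table or the array directly; they go through
record, count and total, which is what keeps the locking discipline in
one place.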

Hope this clarifies the issues.  Nobody said multithreaded programming
was easy...

- Xavier Leroy






* Re: camlex/camlyacc + threads problem
  1997-09-10  8:17 Olivier Bouyssou
@ 1997-09-18 14:18 ` Xavier Leroy
From: Xavier Leroy @ 1997-09-18 14:18 UTC
  To: bouyssou; +Cc: caml-list

> I've a problem when I use a parser in multiple threads. If the calls to the 
> parser are not enclosed by a mutex, there are parse errors. ( See example in 
> french version )

Of course.  Most of the O'Caml standard library is *not* thread-safe
yet.  So far, everything in Pervasives is thread-safe, in particular
I/O, but other stateful libraries are not.  This includes in
particular Hashtbl, Queue, and Lexing.

There are no easy ways to make a library thread-safe and still have it
working in the non-threaded case.  So, I'm not promising these
problems will be fixed soon.

In the meantime, use mutexes liberally, or make sure that only one
thread is doing e.g. parsing.
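
A minimal sketch of the "global mutex around parsing" approach follows.
The generated parser is not reproduced here; unsafe_parse is a
hypothetical stand-in that keeps its intermediate result in a global
cell, and safe_parse, parser_lock and run_threads are illustrative names
as well. Unlike a plain lock/unlock pair around the call, the wrapper
releases the mutex even when the parser raises, so one failed parse
cannot block every other thread. It assumes OCaml 4.08 or later and the
threads library (with ocamlfind, something like
`ocamlfind ocamlopt -package threads.posix -linkpkg`).

```ocaml
(* Hypothetical stand-in for a parser with hidden global state:
   parses "H:M" into minutes, keeping its partial result in [scratch]. *)
let scratch = ref 0

let unsafe_parse s =
  match String.split_on_char ':' s with
  | [h; m] ->
    scratch := int_of_string h * 60;
    Thread.yield ();                 (* invite a thread switch mid-parse *)
    scratch := !scratch + int_of_string m;
    !scratch
  | _ -> failwith "unsafe_parse"

(* One global mutex for all parsing. *)
let parser_lock = Mutex.create ()

(* Hold the lock for the whole parse; release it even on exceptions. *)
let safe_parse s =
  Mutex.lock parser_lock;
  Fun.protect
    ~finally:(fun () -> Mutex.unlock parser_lock)
    (fun () -> unsafe_parse s)

(* Parse the same input from [n] threads and collect the results. *)
let run_threads n =
  let results = Array.make n 0 in
  let worker i = results.(i) <- safe_parse "23:59" in
  let threads = List.init n (fun i -> Thread.create worker i) in
  List.iter Thread.join threads;
  results
```

Every slot of the array returned by run_threads should hold 1439
(23 * 60 + 59), because no thread can enter unsafe_parse while another
is still inside it.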

- Xavier Leroy






* camlex/camlyacc + threads problem
@ 1997-09-10  8:17 Olivier Bouyssou
  1997-09-18 14:18 ` Xavier Leroy
From: Olivier Bouyssou @ 1997-09-10  8:17 UTC
  To: caml-list

English translation below.

I have a problem when I use a parsing function in several threads.

Here is the example, which parses a list of time slots:

The lexer:

--
{
open Horloge_parser
} 
rule token = parse
   [' ' '\t' '\n']          { token lexbuf }
|  ['0'-'9']['0'-'9']       { INT(int_of_string(Lexing.lexeme lexbuf)) }
|  ['0'-'9']                { INT(int_of_string(Lexing.lexeme lexbuf)) }
|  ':'                      { PP }
|  '-'                      { TIRET }
|  eof                      { EOF }
--

The parser:

--
%{
exception Incorrect_minute
exception Incorrect_hour
let test_minute m = if m < 0 || m > 59 then raise Incorrect_minute
let test_heure  h = if h < 0 || h > 23 then raise Incorrect_hour
%}
%token <int> INT
%token PP
%token TIRET
%token EOF
%start main
%type <(int * int) list> main
%%
main:
  EOF { [] }
| liste_intervalle EOF { $1 };

liste_intervalle:
  intervalle { [$1] }
| liste_intervalle intervalle { $2::$1 };

intervalle:
  heure TIRET heure { ($1,$3) };

heure:
  INT { test_heure $1; $1*60 }
|  INT PP INT { test_heure $1; test_minute $3; $1*60+$3};
--  


And the program that uses the parser:

--
let mutex = Mutex.create () 
  
let task () =
  while true 
  do
    let lexbuf = Lexing.from_string "0-23:59" in
    Mutex.lock mutex ;
    Horloge_parser.main Horloge_lexer.token lexbuf ;
    Mutex.unlock mutex ;
    if (Random.int 100) = 10 then ThreadUnix.sleep 1 ;
  done 
    
let main () =
  for i = 0 to 100 do Thread.create task () done ; 
  while true do ThreadUnix.sleep 1 done ;;

main () ;;
--

The problem is that without the mutex I get Parse_error exceptions.

-----------------------------------------------------------------------------
English summary :

I have a problem when I use a parser in multiple threads. If the calls to the 
parser are not enclosed by a mutex, there are parse errors. (See the example 
above.)


-----------------------------------------------------------------------------
-- 
Olivier Bouyssou (F1NXH), Didactheque Regionale          bouyssou@didac-mip.fr
Universite Paul Sabatier 118, route de Narbonne           31062 TOULOUSE Cedex
Tel : +33.5.61.55.65.74                                Fax : +33.5.61.55.65.71








