caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] Re: OcamlDBI
       [not found] ` <20040229210015.GA31888@redhat.com>
@ 2004-03-01  2:10   ` Christophe TROESTLER
       [not found]     ` <20040301092842.GC7705@redhat.com>
  0 siblings, 1 reply; 3+ messages in thread
From: Christophe TROESTLER @ 2004-03-01  2:10 UTC (permalink / raw)
  To: Richard Jones; +Cc: caml-list

Dear Rich,

I am sorry, I realized after the fact that the following is a bit long.

On Sun, 29 Feb 2004, Richard Jones <<rich@annexia.org> wrote:
> 
> What I would _really_ like to do is to use some Obj.magic to build
> tuples of the correct type, but that risks a segmentation fault at
> run time if the programmer makes a mistake. [...] I was thinking
> something along the lines of
>
> let data = sth#map (fun (i, s) ->
>                      do something with int i and string s) ...
>
> The tuple of type int * string could be generated magically?

That would be nice but is not quite possible.  Indeed what would be
the type of #map ?  So this is only feasible with Camlp4 and some
annotations to indicate to Camlp4 the type of the tuple.  The problem
is that it would be nice to add the types to the prepare[_cached]
statement but then they must be transported to the appropriate #map,
#iter,...  places.  As Camlp4 is just about syntax, I do not see how
to do that -- since, at worst, the #map, #iter,... may be in a
different file!  But since Camlp4 can only do what Caml can do (only
with a more terse and convenient syntax), this still begs the question
of what pure Caml should do.

So my point is first to try to understand what Caml should do for
DYNAMIC queries (where nothing is [supposed to be] known at compile
time: neither the types, nor the length, nor even the DB,...) and only
later see whether one can add syntactic sugar.  I'd like, as much as
you do, to find an efficient and convenient manner to address that
problem.

The code you send me is good food for thought.  Thanks for sharing it
(as I said I am only a casual user of DB).  I got why you want lists
on the input.  But this code also shows shortcomings that one should
deal with.  For example

  clickstream_is_null = row.(3) = ""

suggest you identify a NULL string to an empty string!!!

Because of the dynamic nature of the queries, the result cannot be
[fully] typed.  Hence strings.  But, I am sorry to insist, strings
tend to make the code UNPORTABLE because encodings are different from
one DB engine to another.  This goes against the purpose if DBI --
database independence!  The solution, then, I beg, is to create a new
type [sql_t] (see db.ml attached to the previous mail and the P.S.)
that "mimics" SQL types and can then be used to make automatic
conversions between the two.  In fact, we do not need to care so much
about SQL types but rather that we have enough types on the Caml side
to make the translations.  (1) I think it is not convenient for the
user to have

    [`Tinyint of int | `Int of int | ...]

as all we want to know on the Caml side is that we got an int.  Maybe
another type should be defined to report the column types more finely,
e.g. [`TinyInt | `SmallInt | `Int | ...] ?  (2) Also, another decision
that must be made is how to handle NULL values.  I think it is better
to have one global `Null instead of duplicating every type into [`X of
x] and [`Xopt of x option].  This is closer to the way SQL works.
Moreover, if the programmer knows that a column is not null, he can
match the type with [`X x] instead of having to worry whether the
column has been declared NOT NULL (in which case the matching should
be [`X x]) or not (in which case the matching should be [`Xopt (Some
x)]).  Finally, polymorphic variants make it convenient (see below).

Let me change some of your code to illustrate how translating to
[sqt_t array] works (see at the bottom of this mail for the types used).

>          (* Convert out, ready for grouping by days. *)
>          let hits = sth#map (fun row ->
>                                let t = row.(0) in
>                                let day = printable_day t in
>                                let time = printable_time t in
>                                let addr = row.(1) in
>                                let url = row.(2) in
>                                let clickstream = row.(3) in
>                                let clickstream_is_null = row.(3) = "" in
>                                let sessionid = row.(4) in
>                                let sessionid_is_null = row.(4) = "" in
>                                (day, (time, addr, url,
>                                       clickstream, clickstream_is_null,
>                                       sessionid, sessionid_is_null))) in

<CODE>
(* I am not sure what you do with "hits" -- whether you process them
   after of print them -- so I decided to keep the structures
   (believing formatting the data for output should happen in a single
   place -- or since you may want different back-ends: HTML, TeX,...). *)
let hits =
  sth#map (function
           | [|`Timestamp t; `String addr; `String url;
               (`Null | `String _) as clickstream;
               (`Null | `Int _) as sessionid |] ->
               (t.day, (t.time, addr, url, clickstream, sessionid))
           | _ -> assert(false))
    (* REMARK: The assert() line is supposed not to happen.  Assert
       was chosen because it will report line no if triggered. *)
</CODE>

Note that clickstream and sessionid do not have the same type as
before.  They now contain their nullity:

  clickstream : [> `Null | `String of string ]
  sessionid   : [> `Int of int | `Null ]

This is good as trying to access the value without caring about its
possible nullity will result into a type error.  This should be as
convenient to use as the couples (clickstream, clickstream_is_null) :
string * bool.  At least so I think.  If possible, I will be glad to
see the code using clickstream and sessionid to indeed be able to
judge.

Note that empty strings are now different from NULL values.

>   let sessions = sth#map (fun row ->
>                             let sessionid = int_of_string row.(0) in
>                             let url = row.(1) in
>                             let t = int_of_string row.(2) in
>                             sessionid, (url, t)) in
> 

let sessions =
  sth#map (function
           | [| `Int sessionid; `String url; `Int t |] -> (sessionid, (url, t))
           | _ -> assert(false))

> > * I feel one could use polymorphic variants in a more terse way to
> >   bind variables. [...]
> I'm not sure I understand this point.  Can you explain some more?

I just did not understand why you use `IntRef instead of simply `Int
to bind variables.  I was trying to explain that I think this is safe.

> > * method finish : unit
>
> Hmm ... #finish is not very well-defined. [...] Not clear if it's
> better or worse to do this using a finalizer instead.

Well, I thought that's what finalizers are for.  But maybe somebody on
the ML will raise points I have not thought about.

Cheers,
ChriS


P.S.  Feature request: it would be nice to have a standard way to work
with time and dates.  A module in the standard library would be nice.

---
P.S.2.  Here is the part of db.ml I sent Rich that allows to understand
the above.

type date = {
  year : int;
  month : int;
  day : int;
}

type time = {
  hour : int;
  min : int;
  sec : int;
}

type datetime = {
  date : date;
  time : time;
  microsecs : int;
  utc_offset : int;
}

module type Decimal =
sig
  type t
  val to_string : t -> string
  val to_float : t -> float
  val of_string : t -> t
    (* Operations on Decimal values ??? *)
end

module type Blob =
sig
  type t
  val length : t -> int64
  val get_chunk : pos:int64 -> len:int -> string
  val to_file : t -> string -> unit
  val of_file : string -> t
  val of_string : string -> t
end

type sql_t = [ `Null
             | `Int of int
             | `Float of float
             | `String of string
             | `Byte of char
             | `Bool of bool
             | `Bigint of Big_int.big_int
             | `Decimal of Decimal.t
             | `Date of date
             | `Time of time
             | `Timestamp of datetime
             | `Interval of datetime (* utc_offset irrelevant *)
             | `Blob of Blob.t
             ]

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Caml-list] Re: OcamlDBI
       [not found]       ` <20040301.195138.25223321.Christophe.Troestler@umh.ac.be>
@ 2004-03-01 19:20         ` Richard Jones
  2004-03-01 19:32           ` Jeremy Chatfield
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Jones @ 2004-03-01 19:20 UTC (permalink / raw)
  To: Christophe TROESTLER; +Cc: caml-list, Jeremy Chatfield

On Mon, Mar 01, 2004 at 07:51:38PM +0100, Christophe TROESTLER wrote:
> Well, you know much more about databases than I do but it seems to me
> that the following functions do the job: for PostgreSQL,
[...]

OK, this is interesting and I think we should do it.  It requires
coordination because we have a metric ton of code here which uses the
current access method, and that would have to all be changed.  I'll
see if I can schedule some time to do this with my manager.  I also
want to be more sure about exactly how the code is going to work, so
I'll probably want to build a few prototypes and share them with you -
I'm still not 100% sure I understand how the `Null stuff works.

> Also it would be nice to be able to declare in the code new conversion
> functions (e.g. to use `Point(x,y)) -- but the decoding part is tricky
> and probably impossible without Camlp4.  Have you had a need for such
> thing in your applications?  In any case, one should maybe add a
> [`Unknown of string] to deal with new types...???

We use lots of odd PostgreSQL types, eg. IP addresses ... but
generally insert and select these as strings and do any conversion
necessary ourselves.  Getting the basics working is most important.

Jeremy:

It looks like we have a chance to get ocamldbi accepted as the
standard DBI interface for OCaml, if we schedule some time to make a
few changes.  I guess we're looking at 1-2 days work tops.  What do
you think?

Rich.

-- 
Richard Jones. http://www.annexia.org/ http://www.j-london.com/
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
http://www.YouUnlimited.co.uk/ - management courses

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Caml-list] Re: OcamlDBI
  2004-03-01 19:20         ` Richard Jones
@ 2004-03-01 19:32           ` Jeremy Chatfield
  0 siblings, 0 replies; 3+ messages in thread
From: Jeremy Chatfield @ 2004-03-01 19:32 UTC (permalink / raw)
  To: Richard William Jones; +Cc: Christophe TROESTLER, caml-list

On Mon, 2004-03-01 at 19:20, Richard Jones wrote:
...
> It looks like we have a chance to get ocamldbi accepted as the
> standard DBI interface for OCaml, if we schedule some time to make a
> few changes.  I guess we're looking at 1-2 days work tops.  What do
> you think?

Sounds as though it's worth doing. Should make a few more interesting
things possible. Yes, please go ahead! 

Cheers, JeremyC.
-- 
Jeremy Chatfield, Merjis Ltd - improving web site return on investment
office: 01234 822551                mobile: 07780 607613

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2004-03-01 19:33 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20040229.202452.53575435.Christophe.Troestler@umh.ac.be>
     [not found] ` <20040229210015.GA31888@redhat.com>
2004-03-01  2:10   ` [Caml-list] Re: OcamlDBI Christophe TROESTLER
     [not found]     ` <20040301092842.GC7705@redhat.com>
     [not found]       ` <20040301.195138.25223321.Christophe.Troestler@umh.ac.be>
2004-03-01 19:20         ` Richard Jones
2004-03-01 19:32           ` Jeremy Chatfield

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).