From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id 522277FA5C for ; Tue, 21 Jun 2016 21:12:03 +0200 (CEST) IronPort-PHdr: 9a23:nDF+wByyAxmh2r3XCy+O+j09IxM/srCxBDY+r6Qd0OoRIJqq85mqBkHD//Il1AaPBtSDraocwLqI+4nbGkU+or+5+EgYd5JNUxJXwe43pCcHRPC/NEvgMfTxZDY7FskRHHVs/nW8LFQHUJ2mPw6anHS+4HYoFwnlMkItf6KuS9aU15z8h7760qaQSjsLrQL1Wal1IhSyoFeZnegtqqwmFJwMzADUqGBDYeVcyDAgD1uSmxHh+pX4p8Y7oGx48sgs/M9YUKj8Y79wDfkBVGxnYFYO+dbzuBLfYQyK73oaGiVKw1sbSzTCuT3zQJb9+g7XkNZH9WHOMcj7S6wuVBym7qF2WFrmki4BPDN/93vY3J9elqVe9T2orQZ+zoqcW4qVOeBzZOuJctoQX2tMWoBKXCxMGI6mR4QKBusFe+1fqt+u9BM1sRKiCFz0V6vUwThSiyqzgPQ3 Authentication-Results: mail3-smtp-sop.national.inria.fr; spf=None smtp.pra=gabriel.scherer@gmail.com; spf=Pass smtp.mailfrom=gabriel.scherer@gmail.com; spf=None smtp.helo=postmaster@mail-io0-f182.google.com Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of gabriel.scherer@gmail.com) identity=pra; client-ip=209.85.223.182; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="gabriel.scherer@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail3-smtp-sop.national.inria.fr: domain of gabriel.scherer@gmail.com designates 209.85.223.182 as permitted sender) identity=mailfrom; client-ip=209.85.223.182; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="gabriel.scherer@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-io0-f182.google.com) identity=helo; client-ip=209.85.223.182; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="postmaster@mail-io0-f182.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A0DhAACPkGlXhrbfVdFdg1w4fQaoAIFThXWLJ4F6IoV1AoExBzgUAQEBAQEBAQERAQEBCAsLCSEvgjGCGwEBAwESEQQZARsSBQYBAwELBgULGhYHAgIiAREBBQEKEgYTCAoJB4dzAQMPCA6kQIExPjGLO4FqglkFhzUKGScDClKDIwEBAQEBBQEBAQEBAQEBFwIGEIYXhE2EDGGCVIJaBYZLDJIigViEMIJ4hSyCN4xsjjkSHoEPDw+CMQ0RCxeBUSAyikgBAQE X-IPAS-Result: A0DhAACPkGlXhrbfVdFdg1w4fQaoAIFThXWLJ4F6IoV1AoExBzgUAQEBAQEBAQERAQEBCAsLCSEvgjGCGwEBAwESEQQZARsSBQYBAwELBgULGhYHAgIiAREBBQEKEgYTCAoJB4dzAQMPCA6kQIExPjGLO4FqglkFhzUKGScDClKDIwEBAQEBBQEBAQEBAQEBFwIGEIYXhE2EDGGCVIJaBYZLDJIigViEMIJ4hSyCN4xsjjkSHoEPDw+CMQ0RCxeBUSAyikgBAQE X-IronPort-AV: E=Sophos;i="5.26,505,1459807200"; d="scan'208,217";a="182051338" Received: from mail-io0-f182.google.com ([209.85.223.182]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/AES128-GCM-SHA256; 21 Jun 2016 21:12:01 +0200 Received: by mail-io0-f182.google.com with SMTP id g13so18540020ioj.1 for ; Tue, 21 Jun 2016 12:12:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=jLT3HPVflW+ZGpgJHjH7bYmYaeAQwZEZKOVSZQjrZic=; b=vJ9A95ginufEb0aoBivaECieeG3D51LEE79hiHpLYY1vdRVOikj48dLNjhBXdJ+dyR n8lLotPIvJY2aCb+g/pD3FQZOKAL2mpimupZ2YXo4UpJ0IXq2X5rMyODIuFFOGWxRYur Kd4vxtR/NfGAoTzGex12oGKqN8S1aZpR0QU8NeKF1GFvrcRtONPWOusYfjj0kTy240Rn IPeABPcZ/M+9Qm3lWxhtpmawB+UaPBwflPVvZm4cNb4Rq0znYURuZOMJLTQ91u5VylHC V1zHs7STfehulKV31BJrG94n8JUgxYNafrQ8udRj2J4i0ZwZklXkQAN5fiEcKxmvay+Q nBFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=jLT3HPVflW+ZGpgJHjH7bYmYaeAQwZEZKOVSZQjrZic=; b=MZRTqegN+cIGfy8Ncf9SJgx+QL0wfKBvi70mIJqsGOMkNMzW8gpNwHHSL0bnMM9bTb nspgTZAVNIgfFifts3FghTt9g8ME+2QLRSz/SjfBqdJUcQS1JayzWVOK/awySeYR6uxI 6qlApqC6SCerrPfHSaMVnzavfiP/cn9Cbrbm3XuzO02bWyOSflAVLOAGASXiL+Vt7Ydb 6cFE3Ha5krqOofZhE65M7WqU72MwxkUzmKisTcUPjdrHnRZ9Wo8uRPecuq/0xgKcqi/2 ZEzNLWS/0xId5uKNvX6W/Gp5wEMpbxOXrQpoiK5LNL6OiSDBXap8fToqy7xwkk+AYBAt pf4Q== X-Gm-Message-State: ALyK8tJNdwfUHeCizzLK6PA+dWHnwl3sIykDmsE8jkRyX8C3H8azRBPi0x/l0yDI8CSGSg4q9u/4s8C5aDoJnQ== X-Received: by 10.107.169.67 with SMTP id s64mr33092821ioe.19.1466536319982; Tue, 21 Jun 2016 12:11:59 -0700 (PDT) MIME-Version: 1.0 Received: by 10.79.136.70 with HTTP; Tue, 21 Jun 2016 12:11:20 -0700 (PDT) In-Reply-To: <25b304ab-c9ea-ff68-2059-68c93683b1a2@linux-france.org> References: <5E818FB5-6908-4E29-838E-C6A2836F60CE@inria.fr> <25b304ab-c9ea-ff68-2059-68c93683b1a2@linux-france.org> From: Gabriel Scherer Date: Tue, 21 Jun 2016 15:11:20 -0400 Message-ID: To: David MENTRE Cc: caml users Content-Type: multipart/alternative; boundary=001a11421a44673ac40535ce98f6 Subject: Re: [Caml-list] About "precise (formal) things that can be said about properties of certain interfaces" --001a11421a44673ac40535ce98f6 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable TL;DR: there is no *generic* technique to improve any design, it's a methodology that consists in finding invariants that your functions respect, and tweaking their behavior so that they respect nicer, simpler invariants. The trick, in my experience, is that this practice naturally arises when you are careful about testing and have a random-testing library easily available. Long version: Quickcheck-style property testing ("generate randoms x, y, z and check that they verify this property") encourage users to formulate the invariants/properties of the function(s) they are testing as first-order formulas (usually forall-only). In my experience, this is an excellent mindset in which to put code authors at the moment they are designing and implementing the function (so these tests should come simultaneously, not after the implementation effort), because it makes you think about the properties the function should have, and this is a very effective way to make the right choices on corner cases: most choices will *not* respect nice properties, and those that do are the right ones. In Batteries we use the qtest library ( https://github.com/vincent-hugot/iTeML ) to write inline random tests, but they are less common than unit tests (they take more effort to write). I just looked (git grep -A2 "\$Q") and they seem to fit in three big categories: 1. Round-trip tests ((decode (encode s) =3D s) (eq li (li |> enum |> of_enum)). 2. Equivalence of the function implementation with a naive/simpler implementation, (eq (filter p v) =3D (to_list v |> List.filter p |> of_list)) (popcount x =3D popcount_sparse x) (to_list (List.fold_left insert empty l) =3D List.sort Pervasives.compa= re l) 3. A bunch of more diverse and less easy/generic to formulate tests. The most benefits come from (3) of course, but (2) can also be a way to solve corner cases -- although usually you don't have to think this way to "know" what the right behavior is. Some examples of (3) are the following val edit_distance : string -> string -> int (edit_distance s1 s2 =3D edit_distance s2 s1) val nsplit : ('a -> bool) -> 'a list -> 'a list list (xs =3D join sep (nsplit ((=3D) sep) xs)) (nsplit ((=3D) sep) la @ nsplit ((=3D) sep) lb =3D nsplit ((=3D) eq) (l= a @ [sep] @ lb) In the case discussed in GPR#10, val split : ~sep:string -> string -> string list I found it rather hard to get a good design for the corner cases, for example what the value of (split ~sep sep) should be, or (split ~sep ""), or (split ~sep:"" s), or (split ~sep:"" ""). My solution to get a meaningful behavior on all those cases was to ask: what is a function (concat_splits) such that split ~sep (sa ^ sb) =3D concat_splits (split ~sep sa) (split ~sep sb) ? In more formal terms that corresponds to asking for split to transport the monoid structure of strings. Formally one would also need to specify (split ~sep "") separately, but in fact finding any reasonable concat_splits function also answers this question. Daniel B=C3=BCnzli independently designed this function on his end, thinking about different invariants, and we got the exact same behavior on both ends. That's anecdotal evidence that this approach can lead to more objective design choices. (In fact I would say that property-testing is *more* useful for API guidance than actual testing -- at implementation-time or against regression; in my experience you always also write small unit tests to specifically exercise the corner cases you think about, and those tend to suffice to catch the implementation bugs or regressions that are easily found by testing.) On Tue, Jun 21, 2016 at 11:54 AM, David MENTRE wrote: > Hello, > > Le 21/06/2016 =C3=A0 17:48, Gabriel Scherer a =C3=A9crit : > > (1) Discussing function names or seemingly-minor API details is not > > necessarily an exercise in subjectivity. There are precise (formal) > > things that can be said about properties of certain interfaces compared > > to others, as we discussed with Daniel B=C3=BCnzli in a memorable past > > discussion in GPR#10. Taking time to make decisions can result in > > measurably better designs, and the importance of unit testsuits *and* > > property testing to help and structure API design cannot be > under-estimated. > > Could you elaborate a bit that point? Could you give example of such > "precise (formal) things that can be said about properties of certain > interfaces"? > > I am especially interested in such formal properties that could much > improve the design. > > Sincerely yours, > D. Mentr=C3=A9 > > > -- > Caml-list mailing list. Subscription management and archives: > https://sympa.inria.fr/sympa/arc/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs > --001a11421a44673ac40535ce98f6 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
TL;DR: there is no *generic* technique to improve any= design, it's a methodology that consists in finding invariants that yo= ur functions respect, and tweaking their behavior so that they respect nice= r, simpler invariants. The trick, in my experience, is that this practice n= aturally arises when you are careful about testing and have a random-testin= g library easily available.

Long version:
<= /div>

Quickcheck-style property testing ("generate randoms x, = y, z and check that they verify this property") encourage users to for= mulate the invariants/properties of the function(s) they are testing as fir= st-order formulas (usually forall-only). In my experience, this is an excel= lent mindset in which to put code authors at the moment they are designing = and implementing the function (so these tests should come simultaneously, n= ot after the implementation effort), because it makes you think about the p= roperties the function should have, and this is a very effective way to mak= e the right choices on corner cases: most choices will *not* respect nice p= roperties, and those that do are the right ones.

In Batte= ries we use the qtest library ( https://github.com/vincent-hugot/iTeML ) to write inline random= tests, but they are less common than unit tests (they take more effort to = write). I just looked (git grep -A2 "\$Q") and they seem to fit i= n three big categories:
1. Round-trip tests
=C2=A0=C2=A0= =C2=A0 ((decode (encode s) =3D s)
=C2=A0=C2=A0=C2=A0 (eq li (li |> en= um |> of_enum)).
2. Equivalence of the function implementa= tion with a naive/simpler implementation,
=C2=A0=C2=A0=C2=A0 = (eq (filter p v) =3D (to_list v |> List.filter p |> of_list))
=C2=A0=C2=A0=C2=A0 (popcount x =3D popcount_sparse x)
= =C2=A0=C2=A0=C2=A0 (to_list (List.fold_left insert empty l) =3D List.sort P= ervasives.compare l)
3. A bunch of more diverse and less easy= /generic to formulate tests.

The most benefits= come from (3) of course, but (2) can also be a way to solve corner cases -= - although usually you don't have to think this way to "know"= what the right behavior is.

Some examples of (3) are the= following
=C2=A0 val edit_distance : string -> string -&g= t; int
=C2=A0=C2=A0=C2=A0 (edit_distance s1 s2 =3D edit_dista= nce s2 s1)

=C2=A0 val nsplit : ('a -> b= ool) -> 'a list -> 'a list list
=C2=A0=C2=A0=C2= =A0 (xs =3D join sep (nsplit ((=3D) sep) xs))
=C2=A0=C2=A0=C2=A0 (nsplit= ((=3D) sep) la @ nsplit ((=3D) sep) lb =3D nsplit ((=3D) eq) (la @ [sep] @= lb)

In the case discussed in GPR#10,
=C2= =A0 val split : ~sep:string -> string -> string list

I found it rather hard to get a good design for the corner cases, for exa= mple what the value of (split ~sep sep) should be, or (split ~sep "&qu= ot;), or (split ~sep:"" s), or (split ~sep:"" "&qu= ot;). My solution to get a meaningful behavior on all those cases was to as= k: what is a function (concat_splits) such that

=C2=A0 sp= lit ~sep (sa ^ sb) =3D concat_splits (split ~sep sa) (split ~sep sb)
?

In more formal terms that corresponds to asking for sp= lit to transport the monoid structure of strings. Formally one would also n= eed to specify (split ~sep "") separately, but in fact finding an= y reasonable concat_splits function also answers this question.

Daniel B=C3=BCnzli independently designed this function on his end, = thinking about different invariants, and we got the exact same behavior on = both ends. That's anecdotal evidence that this approach can lead to mor= e objective design choices.

(In fact I would say th= at property-testing is *more* useful for API guidance than actual testing -= - at implementation-time or against regression; in my experience you always= also write small unit tests to specifically exercise the corner cases you = think about, and those tend to suffice to catch the implementation bugs or= regressions that are easily found by testing.)



On Tue, Jun 21, 2016 at 11:= 54 AM, David MENTRE <dmentre@linux-france.org> wrote:=
Hello,

Le 21/06/2016 =C3=A0 17:48, Gabriel Scherer a =C3=A9crit :
> (1) Discussing function names or seemingly-minor API details is not
> necessarily an exercise in subjectivity. There are precise (formal)
> things that can be said about properties of certain interfaces compare= d
> to others, as we discussed with Daniel B=C3=BCnzli in a memorable past=
> discussion in GPR#10. Taking time to make decisions can result in
> measurably better designs, and the importance of unit testsuits *and*<= br> > property testing to help and structure API design cannot be under-esti= mated.

Could you elaborate a bit that point? Could you give example of such
"precise (formal) things that can be said about properties of certain<= br> interfaces"?

I am especially interested in such formal properties that could much
improve the design.

Sincerely yours,
D. Mentr=C3=A9


--
Caml-list mailing list.=C2=A0 Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocam= l_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

--001a11421a44673ac40535ce98f6--