caml-list - the Caml user's mailing list
* Re: [Caml-list] Justifying a breaking 4.03 change, strong dependency on modules with "external" declarations (was: )
@ 2016-05-16 11:52 Hongbo Zhang (BLOOMBERG/ 731 LEX)
  2016-05-16 12:38 ` David Allsopp
  0 siblings, 1 reply; 3+ messages in thread
From: Hongbo Zhang (BLOOMBERG/ 731 LEX) @ 2016-05-16 11:52 UTC (permalink / raw)
  To: gabriel.scherer; +Cc: caml-list


My 2 cents: I did not like this change. A module that has side effects should always
be listed explicitly on the link line, instead of relying on such fragile implicit behavior.





* RE: [Caml-list] Justifying a breaking 4.03 change, strong dependency on modules with "external" declarations (was: )
  2016-05-16 11:52 [Caml-list] Justifying a breaking 4.03 change, strong dependency on modules with "external" declarations (was: ) Hongbo Zhang (BLOOMBERG/ 731 LEX)
@ 2016-05-16 12:38 ` David Allsopp
  0 siblings, 0 replies; 3+ messages in thread
From: David Allsopp @ 2016-05-16 12:38 UTC (permalink / raw)
  To: Hongbo Zhang, gabriel.scherer; +Cc: caml-list

Hongbo Zhang wrote:

> My 2 cents: I did not like this change. A module that has side effects should always
> be listed explicitly on the link line, instead of relying on such fragile implicit behavior.

I think Gabriel's explanation has missed an important subtlety of the original problem. One reason you might "omit" the .cmx file when linking is that you believed you had supplied it: the .cmx file was inside a .cmxa file which you had linked (or, more likely, one which ocamlfind was adding thanks to -package and -linkpkg instructions...). A module inside a library archive is only linked in if something references it, and before 4.03 a use of an external did not count as a reference.

Worse, even if you did specify the module's .cmx file explicitly (say Bar.cmx), the linker did nothing to ensure that you had placed it *before* the module making the first call to an external value, meaning that the primitive could be called before the required initialisation code had run.
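The ordering problem can be simulated in a single file. This is only a sketch: nested modules stand in for separately compiled units (whose top-level code runs in link-line order), and `stub_ready`, `primitive` and `Client` are hypothetical names that do not come from the thread.

```ocaml
(* Simulation only: nested modules in one file stand in for separately
   compiled units, which are initialised in link-line order. *)

(* Stands in for state owned by the C stub. *)
let stub_ready = ref false

(* Stands in for the C primitive, which misbehaves if called too early. *)
let primitive () =
  if !stub_ready then "ok" else "called before initialisation"

(* Hypothetical client, placed first: its top-level code runs first. *)
module Client = struct
  let result = primitive ()
end

(* Hypothetical module Bar, placed too late: its initialisation code
   runs only after Client has already called the primitive. *)
module Bar = struct
  let () = stub_ready := true
end

let () = print_endline Client.result
```

Swapping the two module definitions makes `Client.result` come out as "ok", which mirrors putting Bar.cmx earlier on the link line.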

The only workarounds were to use val instead of external (which both incurs a slight performance penalty and forces a .mli file to be written - although many, me included, think .mli files should be mandatory anyway!) or to use -linkall, which is not necessarily feasible.
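A minimal sketch of the val workaround, written as one file: the signature plays the role of bar.mli and the structure that of bar.ml. The runtime primitive `caml_int_of_string` is used purely as a stand-in so that the example links without a custom C stub, and the name `parse` is hypothetical.

```ocaml
module Bar : sig
  (* Exposed as an ordinary value rather than as an external, so every
     caller references Bar itself and the linker has to include it. *)
  val parse : string -> int
end = struct
  (* Stand-in primitive from the OCaml runtime; a real project would
     name a function from its own C stub here. *)
  external parse : string -> int = "caml_int_of_string"
end

let () = Printf.printf "%d\n" (Bar.parse "7")
```

The cost mentioned above comes from callers no longer seeing the primitive directly: they go through the value exported by Bar.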

I remember wasting a lot of time the first time I got stung by that... even though it is in the manual. In fact, that means the first paragraph of the manual needs updating!


David


PS I noticed that the old behaviour still happens with normal primitives (e.g. %identity), but I guess that's intentional? 


* [Caml-list] Justifying a breaking 4.03 change, strong dependency on modules with "external" declarations (was: )
@ 2016-05-16  1:09 Gabriel Scherer
  0 siblings, 0 replies; 3+ messages in thread
From: Gabriel Scherer @ 2016-05-16  1:09 UTC (permalink / raw)
  To: Roberto Di Cosmo; +Cc: Leo White, caml users

As Roberto Di Cosmo points out in the thread "parmap package broken in
opam switch 4.03.0", using an "external" declaration provided by a
module A now counts as a dependency on A -- in particular, A has to be
linked in the final executable.

This breaks some user code (such as Parmap, but I have heard reports of
other breakages) in the case where A was used solely to provide such
"external" declarations, for C functions implemented in a C stub that
was linked into the application. In that case it was neither necessary
nor useful to link A, and projects did not do it. They now need to do
so explicitly, and fail at link time otherwise.

I realize that users may not understand why this change, which looks
like a regression, happened in 4.03. It is actually a bugfix, see
PR#4166 ( http://caml.inria.fr/mantis/view.php?id=4166 ): if A does
not only provide those external declarations, but also initializes
some state that the C functions rely on (such as exception
declarations), then forgetting to link A may result in a crash at
runtime.
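To make the failure mode concrete, here is a rough sketch of what such a module A might look like. The exception name, the registration key "a_stub_error" and the function `parse` are hypothetical, and the runtime primitive `caml_int_of_string` stands in for a project-specific C stub so that the file is self-contained.

```ocaml
(* Hypothetical module A, sketched as a nested module so that the file is
   self-contained. *)
module A = struct
  exception Stub_error of string

  (* Initialisation code: it runs only if A is linked into the program.
     A real C stub would look this exception up with
     caml_named_value("a_stub_error") before raising it. *)
  let () = Callback.register_exception "a_stub_error" (Stub_error "")

  (* Stand-in primitive from the OCaml runtime, used here only so that
     the sketch links without a custom C stub. *)
  external parse : string -> int = "caml_int_of_string"
end

(* A client that uses only the external. Before 4.03 this use alone did
   not force A to be linked, so the registration above could be skipped. *)
let () = Printf.printf "%d\n" (A.parse "42")
```

If A were a separate compilation unit left out of the link, the C side would find no registered exception when it needed to raise one.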

In other words, the pre-4.03 behaviour could, in rare cases of user
error, produce a very subtle bug that is very difficult to understand.
Under 4.03, such errors cannot happen anymore, but the transition has
broken some existing patterns that are unusual but less rare than the
bug itself. In the long term, this is a win (obvious errors at link
time are better than subtle errors at runtime), but the transition
period is of course painful -- and maybe it could have been managed
better, e.g. with a warning instead of an error for one release, but
that is a lot of work.

## Recovering the justification

In case anyone wonders, here is how I obtained this explanation: the
Changes file ( https://github.com/ocaml/ocaml/blob/trunk/Changes )
lists the breaking change as

* PR#4166, PR#6956: force linking when calling external C primitives
(Jacques Garrigue, reports by Markus Mottl and Christophe Troestler)

(Note the bullet point "*" instead of "-", which indicates that this
change may break user programs. If you maintain large or old OCaml
codebases, it is a good idea to review each starred item of the
changelog after each release. If a program breaks after a new release,
looking at the starred items may give indications of what is
happening.)
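Reviewing the starred items can be partly automated. The sketch below assumes that each entry starts with its bullet in the first column, as in the excerpt above; the function names and the "-" entry in the sample are made up for illustration.

```ocaml
(* An entry is treated as breaking when its line starts with "*".
   Assumption: the bullet is the first character of the line. *)
let is_breaking line = String.length line > 0 && line.[0] = '*'

(* Keep only the first line of each starred entry. *)
let breaking_items lines = List.filter is_breaking lines

let sample = [
  "* PR#4166, PR#6956: force linking when calling external C primitives";
  "  (Jacques Garrigue, reports by Markus Mottl and Christophe Troestler)";
  "- a hypothetical non-breaking entry";
]

let () = List.iter print_endline (breaking_items sample)
```

Continuation lines of an entry are dropped by this filter, so it gives a list of headlines to read rather than the full text of each change.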

The first of the two issue reports listed in the Changes is an example
of the subtle bugs that could happen with the pre-4.03 semantics,
reported by Markus Mottl. The second is another instance, reported by
Christophe Troestler; it contains the discussion around the change and
its implementation, mainly by Jacques Garrigue, who did the
implementation work.


On Wed, Apr 27, 2016 at 5:49 AM, Roberto Di Cosmo <roberto@dicosmo.org> wrote:
> Indeed, after some more investigation (thanks to Francois Berenger), it
> seems that in 4.03 we can no longer just use a bare .mli file with the
> interface to some external code, as it was possible before.
>
> Now we also need to provide a .ml file, in any case.
>
> The fix in parmap is underway, and it was a simple matter of moving
> setcore.mli to setcore.ml, without touching anything else.
>
> For the curious, the content of setcore.ml (ex setcore.mli) is the
> following:
>
> (* uses the native affinity interface to
>   declare that the current process should be
>   attached to core number n *)
>
> external numcores: unit -> int = "numcores"
> external setcore: int -> unit = "setcore"
>
> If you have similar patterns in your projects, take due notice :-)
>
> --
> Roberto
>
>
> 2016-04-26 19:16 GMT+02:00 Leo White <leo@lpw25.net>:
>>
>> > It seems that in 4.03 one needs to add the -opaque flag when compiling
>> > such stubs, otherwise things go astray, and it seems ocamlbuild does not
>> > detect automatically such situations, so one needs to explicitly pass
>> > the -opaque option when compiling setcore.mli (and only it).
>>
>> I would not have thought that adding `-opaque` would be sufficient. It
>> should get you past the compilation of modules which depend on
>> `setcore.mli`, but I would expect linking to fail still. If that is not
>> the case I guess it should be considered a bug because in 4.03
>> referencing an `external` is supposed to force linking of the containing
>> module. The change was made because the existing behaviour was said to
>> confuse people -- using a normal value from a module caused it to get
>> linked whilst using an external value didn't.
>>
>> Regards,
>>
>> Leo
>>
>> --
>> Caml-list mailing list.  Subscription management and archives:
>> https://sympa.inria.fr/sympa/arc/caml-list
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
