mailing list of musl libc
 help / color / mirror / code / Atom feed
* freeaddrinfo() comments and questions
@ 2019-11-23 15:46 gilles
  2019-11-23 16:05 ` Florian Weimer
  2019-11-23 16:31 ` gilles
  0 siblings, 2 replies; 6+ messages in thread
From: gilles @ 2019-11-23 15:46 UTC (permalink / raw)
  To: musl

Hello,

I'm not subscribed to the list, please do keep me cc-ed.

This is just a mail to express my opinion with regard to a somewhat recent commit,
feel free to disregard :-)


According to:

https://pubs.opengroup.org/onlinepubs/009695399/functions/freeaddrinfo.html

The freeaddrinfo() function shall free one or more addrinfo structures returned by getaddrinfo(), along with any additional storage associated with those structures. If the ai_next field of the structure is not null, the entire list of structures shall be freed. The freeaddrinfo() function shall support the freeing of arbitrary sublists of an addrinfo list originally returned by getaddrinfo().


In this commit:

https://git.musl-libc.org/cgit/musl/commit/src/network/freeaddrinfo.c?id=d1395c43c019aec6b855cf3c656bf47c8a719e7f

Rich Felker states:

the specification for freeaddrinfo allows it to be used to free
"arbitrary sublists" of the list returned by getaddrinfo. it's not
clearly stated how such sublists come into existence, but the
interpretation seems to be that the application can edit the ai_next
pointers to cut off a portion of the list and then free it.


I think that the reason why it's not clearly stated how such sublists come into existence...
is because of the mere definition of a sublist:

    sublist. Noun. (plural sublists) A list that makes up part of a larger list.


Given that struct addrinfo is a linked list this doesn't give much room for interpretation,
a sublist of the struct addrinfo is just any point after the first node,
because that's the only way you can have a list that makes up part of a larger list.


This leads to my second point, the musl interpretation of the standard leads to an implementation which is not only at odds with what is found in pretty much every other libc, ranging from BSD to glibc, but also including Android, Solaris and others, but which also prevents the writing of portable code which crafts a struct addrinfo. I understand that you may not have a goal to do exactly what all the others do, but if pretty much every one has the same understanding, it should at least raise some concern if you're diverging in my opinion.

In these other implementations, it is possible to write a custom struct addrinfo allocator and use freeaddrinfo() on it, just like it is possible to use getaddrinfo() and use a custom release function on it. This is not a very common use-case, granted, but it is one nonetheless, and one that works and has worked in a portable way for a long time across a wide variety of systems.

With musl, the way struct addrinfo is handled not only prevents this but also makes the struct addrinfo kind of a weird structure because it is public, as it should, but relies on an underlying structure, struct aibuf, which is not something a user of struct addrinfo would see. Furthermore, the fact that freeaddrinfo() relies on pointer arithmetic makes it really clear that struct addrinfo isn't really a linked list but that it's an array of linked elements, and while this may make it much simpler for musl to handle thread-safety and memory releasing, this is really something that breaks user expectations big time if they assume struct addrinfo to be a list and to work as a list.

I personally have a work-around but I really think that this commit was not right and wished to contribute with my 2 cents if it can provide a different point of view. I don't think any implementation relying on contiguous memory and a single release can possibly be right as it means the struct addrinfo is not a list, can't have sublists, can't be self-crafted or self-released.

Cheers,
Gilles


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: freeaddrinfo() comments and questions
  2019-11-23 15:46 freeaddrinfo() comments and questions gilles
@ 2019-11-23 16:05 ` Florian Weimer
  2019-11-23 17:41   ` Rich Felker
  2019-11-23 16:31 ` gilles
  1 sibling, 1 reply; 6+ messages in thread
From: Florian Weimer @ 2019-11-23 16:05 UTC (permalink / raw)
  To: gilles; +Cc: musl

* gilles:

> In these other implementations, it is possible to write a custom
> struct addrinfo allocator and use freeaddrinfo() on it, just like it
> is possible to use getaddrinfo() and use a custom release function on
> it. This is not a very common use-case, granted, but it is one
> nonetheless, and one that works and has worked in a portable way for a
> long time across a wide variety of systems.

I think this is clearly undefined.  There is no way to know how
storage for ai_addr and ai_canonname is managed.  These pointers could
point to separate allocations, made with malloc.  They could be
interior pointers to the same top-level allocation at which start the
struct addrinfo object is allocated.  Nothing even needs to use
malloc, including the outer struct addrinfo object.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: freeaddrinfo() comments and questions
  2019-11-23 15:46 freeaddrinfo() comments and questions gilles
  2019-11-23 16:05 ` Florian Weimer
@ 2019-11-23 16:31 ` gilles
  2019-11-23 16:50   ` Florian Weimer
  1 sibling, 1 reply; 6+ messages in thread
From: gilles @ 2019-11-23 16:31 UTC (permalink / raw)
  To: Florian Weimer; +Cc: musl

November 23, 2019 5:05 PM, "Florian Weimer" <fw@deneb.enyo.de> wrote:

> * gilles:
> 
>> In these other implementations, it is possible to write a custom
>> struct addrinfo allocator and use freeaddrinfo() on it, just like it
>> is possible to use getaddrinfo() and use a custom release function on
>> it. This is not a very common use-case, granted, but it is one
>> nonetheless, and one that works and has worked in a portable way for a
>> long time across a wide variety of systems.
> 
> I think this is clearly undefined. There is no way to know how
> storage for ai_addr and ai_canonname is managed. These pointers could
> point to separate allocations, made with malloc. They could be
> interior pointers to the same top-level allocation at which start the
> struct addrinfo object is allocated. Nothing even needs to use
> malloc, including the outer struct addrinfo object.

Fair enough for this use-case, I think you are right and it works by accident.

What is your opinion on the other comments ?


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: freeaddrinfo() comments and questions
  2019-11-23 16:31 ` gilles
@ 2019-11-23 16:50   ` Florian Weimer
  2019-11-23 17:48     ` Rich Felker
  0 siblings, 1 reply; 6+ messages in thread
From: Florian Weimer @ 2019-11-23 16:50 UTC (permalink / raw)
  To: gilles; +Cc: musl

* gilles:

> November 23, 2019 5:05 PM, "Florian Weimer" <fw@deneb.enyo.de> wrote:
>
>> * gilles:
>> 
>>> In these other implementations, it is possible to write a custom
>>> struct addrinfo allocator and use freeaddrinfo() on it, just like it
>>> is possible to use getaddrinfo() and use a custom release function on
>>> it. This is not a very common use-case, granted, but it is one
>>> nonetheless, and one that works and has worked in a portable way for a
>>> long time across a wide variety of systems.
>> 
>> I think this is clearly undefined. There is no way to know how
>> storage for ai_addr and ai_canonname is managed. These pointers could
>> point to separate allocations, made with malloc. They could be
>> interior pointers to the same top-level allocation at which start the
>> struct addrinfo object is allocated. Nothing even needs to use
>> malloc, including the outer struct addrinfo object.
>
> Fair enough for this use-case, I think you are right and it works by accident.
>
> What is your opinion on the other comments ?

The most obvious interpretation is that callers can tweak the ai_next
member before calling freeaddrinfo, and that freeaddrinfo performs the
usual iteration over this single-linked list, freeing each list
element individually.

In general, relying on this does not seem particularly useful to me.
Applications should probably call freeaddrinfo only on the pointer
provided by getaddrinfo, and refrain from writing to any struct
members.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: freeaddrinfo() comments and questions
  2019-11-23 16:05 ` Florian Weimer
@ 2019-11-23 17:41   ` Rich Felker
  0 siblings, 0 replies; 6+ messages in thread
From: Rich Felker @ 2019-11-23 17:41 UTC (permalink / raw)
  To: Florian Weimer; +Cc: gilles, musl

On Sat, Nov 23, 2019 at 05:05:17PM +0100, Florian Weimer wrote:
> * gilles:
> 
> > In these other implementations, it is possible to write a custom
> > struct addrinfo allocator and use freeaddrinfo() on it, just like it
> > is possible to use getaddrinfo() and use a custom release function on
> > it. This is not a very common use-case, granted, but it is one
> > nonetheless, and one that works and has worked in a portable way for a
> > long time across a wide variety of systems.
> 
> I think this is clearly undefined.  There is no way to know how
> storage for ai_addr and ai_canonname is managed.  These pointers could
> point to separate allocations, made with malloc.  They could be
> interior pointers to the same top-level allocation at which start the
> struct addrinfo object is allocated.  Nothing even needs to use
> malloc, including the outer struct addrinfo object.

Indeed, this is absolutely undefined. Even the way the storage for the
addrinfo structures themselves is allocated is unspecified. It need
not be by malloc. For example if libc allowed malloc interposition,
but only used the interposed functions itself for functions specified
to return something allocated as if by malloc, and otherwise always
used its internal malloc, then you would end up with a mix of
structures allocated by different allocators, and passing one to the
other's free would blow up. And of course they could be allocated by
direct mmap, or from a static pool, or any of an unlimited number of
other possible ways.

Rich


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: freeaddrinfo() comments and questions
  2019-11-23 16:50   ` Florian Weimer
@ 2019-11-23 17:48     ` Rich Felker
  0 siblings, 0 replies; 6+ messages in thread
From: Rich Felker @ 2019-11-23 17:48 UTC (permalink / raw)
  To: Florian Weimer; +Cc: gilles, musl

On Sat, Nov 23, 2019 at 05:50:08PM +0100, Florian Weimer wrote:
> * gilles:
> 
> > November 23, 2019 5:05 PM, "Florian Weimer" <fw@deneb.enyo.de> wrote:
> >
> >> * gilles:
> >> 
> >>> In these other implementations, it is possible to write a custom
> >>> struct addrinfo allocator and use freeaddrinfo() on it, just like it
> >>> is possible to use getaddrinfo() and use a custom release function on
> >>> it. This is not a very common use-case, granted, but it is one
> >>> nonetheless, and one that works and has worked in a portable way for a
> >>> long time across a wide variety of systems.
> >> 
> >> I think this is clearly undefined. There is no way to know how
> >> storage for ai_addr and ai_canonname is managed. These pointers could
> >> point to separate allocations, made with malloc. They could be
> >> interior pointers to the same top-level allocation at which start the
> >> struct addrinfo object is allocated. Nothing even needs to use
> >> malloc, including the outer struct addrinfo object.
> >
> > Fair enough for this use-case, I think you are right and it works by accident.
> >
> > What is your opinion on the other comments ?
> 
> The most obvious interpretation is that callers can tweak the ai_next
> member before calling freeaddrinfo, and that freeaddrinfo performs the
> usual iteration over this single-linked list, freeing each list
> element individually.
> 
> In general, relying on this does not seem particularly useful to me.
> Applications should probably call freeaddrinfo only on the pointer
> provided by getaddrinfo, and refrain from writing to any struct
> members.

The strictest interpretation is that you can't modify the list at all,
and call freeaddrinfo only on the original (entire) list or some tail
of it. However I think the spec is ambiguous and it's worth supporting
the case where the list has been split by nulling an ai_next pointer.
Perhaps removing segments by updating ai_next to point to some later
tail of the list is also intended to be ok. Anything beyond that seems
pretty dubious to me.

To me, the intended use of this functionality seems to be that you
might want to remove unwanted entries from the returned list, without
having to make your own container structure to store the ones you want
to keep. Freeing parts of it is not useful to save memory; rather, the
idea of freeing parts is that, if the implementation itself can't
reach the list members you remove, it would have no way to free them
later when you free the list. Thus it's expected that you notify the
implementation you're removing them by calling freeaddrinfo on
sublists.

Unfortunately all of this is underspecified.

Rich


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-11-23 17:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-23 15:46 freeaddrinfo() comments and questions gilles
2019-11-23 16:05 ` Florian Weimer
2019-11-23 17:41   ` Rich Felker
2019-11-23 16:31 ` gilles
2019-11-23 16:50   ` Florian Weimer
2019-11-23 17:48     ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).