mailing list of musl libc
* Maybe not a bug but a possible omission?
@ 2018-03-28 10:31 Jon Scobie
  2018-03-28 12:52 ` CodingMarkus
  0 siblings, 1 reply; 7+ messages in thread
From: Jon Scobie @ 2018-03-28 10:31 UTC (permalink / raw)
  To: musl; +Cc: Tom Cosgrove, Jason Page, Sam Carroll


Hi. We are currently using musl in the context of Alpine 3.7 in a docker
container.
The application in question I am writing is in golang (1.9.4) but I have to
interface to some in house cryptographic libraries we write in C.
To help us achieve this with the minimum of fuss and also to generate
bindings to other languages, I am using SWIG 3.0.12.

I have come across a couple of issues which do not appear when using glibc
and they both centre on stdint.h

If I include <stdint.h> as part of the swig interface and it tries to wrap
it, it fails on the wrapper compilation of #defines for anything using
INT64_MIN, INT64_MAX etc.

Initially, I thought this was to do with the swig definitions for 64 bit
but on looking at the code and comparing what glibc defines for this, it
boiled down to the rvalue definitions for these macros.
Basically, glibc wraps these with another macro so where we have the
definition in musl as :-

#define INT64_MIN  (-1-0x7fffffffffffffff)

the equivalent glibc definition is the equivalent of

#define INT64_MIN  (-1-0x7fffffffffffffffL)

The issue is that swig has no idea what type INT64_MAX is if you don't
specifically state what it is so it treats it as a goint - which is not a
long (or long long).

Is there any value in changing the musl definitions of these so they are
precise in their definition or have I missed something?


A slightly different issue (and one which might be more a swig issue than
musl, although it works on glibc) is the definitions for WCHAR_MAX and
WCHAR_MIN.
On glibc, these are defined as whole values and not as 0xffffffffu+L'\0',
for example.

The wrappers end up mangling these because they escape the quote and
backslash characters in L'\0', which is not correct. As I said, this is
possibly a swig issue, exposed by a definition style that glibc does not use.

Interested in hearing any opinion on this.

Regards,

Jon Scobie

-- 

----
The information contained in this communication is private and confidential 
and may contain privileged material. It is intended solely for use by the 
recipient(s). Copying, distributing, disclosing or using any of the 
information in it or any attachments is strictly prohibited and may be 
unlawful.


* Re: Maybe not a bug but a possible omission?
  2018-03-28 10:31 Maybe not a bug but a possible omission? Jon Scobie
@ 2018-03-28 12:52 ` CodingMarkus
  2018-03-28 13:33   ` Jon Scobie
  0 siblings, 1 reply; 7+ messages in thread
From: CodingMarkus @ 2018-03-28 12:52 UTC (permalink / raw)
  To: musl



> On 2018-03-28, at 12:31, Jon Scobie <jon.scobie@callsign.com> wrote:
> 
> #define INT64_MIN  (-1-0x7fffffffffffffff)
> 
> the equivalent glibc definition is the equivalent of
> 
> #define INT64_MIN  (-1-0x7fffffffffffffffL)

According to the ISO C 2011 standard (§6.4.4.1, page 63):

“The type of an integer constant is the first of the corresponding list in which its value can be represented.”

And for constants that have no type suffix (no “L”, “LL”, “U”, “UL”, “ULL”), the “list” mentioned above contains the following types for decimal values:

int, long int, long long int

and the following types for octal/hex values:

int, unsigned int, long int, unsigned long int, long long int, unsigned long long int

Thus the type is the first one in the applicable list that is big enough to represent the constant's value.

By using the suffix “L”, all that happens is that the type list is further limited down to:

long int, long long int

or for octal/hex values:

long int, unsigned long int, long long int, unsigned long long int

On most systems, only long int or long long int can represent 0x7fffffffffffffff, so one of those two will be the deduced type, and the expression “(-1-0x7fffffffffffffff)” then has that same type. Appending the suffix “L” would thus not make any difference, as it will not result in a different type. It only eliminates the types “int” and “unsigned int” from the list, but I don’t know of any system where int could represent such a big number, so int is never chosen to begin with.

Personally I wonder about that definition in both libraries, I had expected it to be:

#define INT64_MIN  INT64_C(...)

because that assures the type is pinned to whatever type int64_t maps to on the system. After all, the C standard also says about INT64_MIN and the other limit macros:

“Each instance of any defined macro shall be replaced by a constant expression suitable for use in #if preprocessing directives, and this expression shall have the same type as would an expression that is an object of the corresponding type converted according to the integer promotions.”

So INT64_MIN should have the same type as int64_t (that’s how I interpret the last part of that sentence), and that is not always guaranteed when it is defined as above: on a 64 bit system a long int is enough to represent the value (long is typically 64 bit there already), whereas int64_t may just as well be a long long int on such a system. At least PRId64 typically expands to “lld” AFAIK and not “ld”, though I have not checked what musl or glibc use for PRId64 on 64 bit systems.

> The issue is that swig has no idea what type INT64_MAX is if you don't specifically state what it is so it treats it as a goint - which is not a long (or long long).

If swig fails to correctly process valid C code, then this sounds like a bug that the swig developers should really fix. The type of a numeric constant is always exactly defined by the C standard, it’s never required to give an explicit type unless you want a “wider” type than would have been chosen automatically otherwise or you want to force a signed value to be an unsigned type. Every tool parsing, compiling or processing C code should know the type rules and should apply them the same way a C compiler would do it, shouldn’t it?

Regards,
Markus




* Re: Maybe not a bug but a possible omission?
  2018-03-28 12:52 ` CodingMarkus
@ 2018-03-28 13:33   ` Jon Scobie
  2018-03-28 17:19     ` Szabolcs Nagy
  0 siblings, 1 reply; 7+ messages in thread
From: Jon Scobie @ 2018-03-28 13:33 UTC (permalink / raw)
  To: musl



Well, I definitely agree that instead of definitions like

#define INT64_MIN  (-1-0x7fffffffffffffff)

we should have

#define INT64_MIN  (-1 - INT64_C(0x7fffffffffffffff))


As for the WCHAR_MAX|MIN definitions, yes, swig should handle those. I'll
check the source and see about a patch for submission.

If the definitions above are what should be used, how do I go about getting
those updated?

Regards,
Jon.


*JON SCOBIE*
SENIOR SECURITY ENGINEER

+44 7894253988


www.callsign.com




* Re: Maybe not a bug but a possible omission?
  2018-03-28 13:33   ` Jon Scobie
@ 2018-03-28 17:19     ` Szabolcs Nagy
  2018-03-28 17:54       ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Szabolcs Nagy @ 2018-03-28 17:19 UTC (permalink / raw)
  To: musl

* Jon Scobie <jon.scobie@callsign.com> [2018-03-28 14:33:23 +0100]:
> Well, I definitely agree that instead of definitions like
> 
> #define INT64_MIN  (-1-0x7fffffffffffffff)
> 
> we should have
> 
> #define INT64_MIN  (-1 - INT64_C(0x7fffffffffffffff))
> 

why?

"The macro INTN_C(value) shall expand to an integer constant expression corresponding to the type int_leastN_t"

i dont think it is necessary or appropriate: the c rules
already handles this portably: the const has the lowest
rank 64bit signed int type, any additional complication
can just get the type wrong.



* Re: Maybe not a bug but a possible omission?
  2018-03-28 17:19     ` Szabolcs Nagy
@ 2018-03-28 17:54       ` Rich Felker
  2018-03-29  0:30         ` Jon Scobie
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2018-03-28 17:54 UTC (permalink / raw)
  To: musl

On Wed, Mar 28, 2018 at 07:19:49PM +0200, Szabolcs Nagy wrote:
> * Jon Scobie <jon.scobie@callsign.com> [2018-03-28 14:33:23 +0100]:
> > Well, I definitely agree that instead of definitions like
> > 
> > #define INT64_MIN  (-1-0x7fffffffffffffff)
> > 
> > we should have
> > 
> > #define INT64_MIN  (-1 - INT64_C(0x7fffffffffffffff))
> > 
> 
> why?
> 
> "The macro INTN_C(value) shall expand to an integer constant expression corresponding to the type int_leastN_t"
> 
> i dont think it is necessary or appropriate: the c rules
> already handles this portably: the const has the lowest
> rank 64bit signed int type, any additional complication
> can just get the type wrong.

Yes. If a tool is misinterpreting the expressions here, the tool
should be fixed. They all have the intended types already when
evaluated as C expressions. Making random edits to headers to make
buggy tools happy is not a direction I want to take.

Rich



* Re: Maybe not a bug but a possible omission?
  2018-03-28 17:54       ` Rich Felker
@ 2018-03-29  0:30         ` Jon Scobie
  2018-03-29  4:53           ` Markus Wichmann
  0 siblings, 1 reply; 7+ messages in thread
From: Jon Scobie @ 2018-03-29  0:30 UTC (permalink / raw)
  To: musl


Intended types? It's a macro. Where is the type definition? It is inferred.
Why not just make it implicit when the language allows for it?



* Re: Maybe not a bug but a possible omission?
  2018-03-29  0:30         ` Jon Scobie
@ 2018-03-29  4:53           ` Markus Wichmann
  0 siblings, 0 replies; 7+ messages in thread
From: Markus Wichmann @ 2018-03-29  4:53 UTC (permalink / raw)
  To: musl

On Thu, Mar 29, 2018 at 12:30:52AM +0000, Jon Scobie wrote:
> Intended types? It's a macro. Where is the type definition? It is inferred.

Inferred by a standardised algorithm. So where's the problem?

> Why not just make it implicit when the language allows for it?
> 

But... the type _is_ implicit here. Also, Rich explained why. Do not
attempt to fix swig by changing musl.

The type inference rules haven't changed since 1989, except to add long
long int to the list.

Also, please add your answers at the bottom on this list.

> ----
> The information contained in this communication is private and confidential 
> and may contain privileged material. It is intended solely for use by the 
> recipient(s).

Well then you probably should not have sent it to a mailing list with an
online archive.

> Copying, distributing, disclosing or using any of the 
> information in it or any attachments is strictly prohibited and may be 
> unlawful.

In that generality, that is only true in the US. However, this list is
international.

Ciao,
Markus

