zsh-workers
 help / color / mirror / code / Atom feed
* Re: type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile())
@ 2016-10-16  1:14 ` Bart Schaefer
  2016-10-16 17:03   ` Daniel Shahaf
  2016-10-18 11:36   ` Peter Stephenson
  0 siblings, 2 replies; 12+ messages in thread
From: Bart Schaefer @ 2016-10-16  1:14 UTC (permalink / raw)
  To: zsh-workers

[I originally sent this 12 hours ago but it still hasn't shown up in the
mailing list archives.  Seems to be problems with mail from the gmail web
interface again.  Sorry if this eventually appears multiple times.]


On Sat, Oct 15, 2016 at 1:30 AM, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> Does anyone have a way to get type checking that doesn't involve
> adding .v everywhere?

Wouldn't it suffice to use
  typedef char *unmeta_t;
  typedef unsigned char *meta_t;

(or the reverse if we believe we're mostly going to be working on
meta_t).  If we do that and also enable the GCC warnings -Wtype-limits
-Wconversion -Wformat-signedness would we not get adequate verbosity
without changing pointer arithmetic and dereferencing?

I hadn't even considered all the places where using a struct would
change memory allocation strategy.  Oof.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile())
  2016-10-16  1:14 ` type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile()) Bart Schaefer
@ 2016-10-16 17:03   ` Daniel Shahaf
  2016-10-18 11:36   ` Peter Stephenson
  1 sibling, 0 replies; 12+ messages in thread
From: Daniel Shahaf @ 2016-10-16 17:03 UTC (permalink / raw)
  To: zsh-workers

Bart Schaefer wrote on Sat, Oct 15, 2016 at 18:14:44 -0700:
> On Sat, Oct 15, 2016 at 1:30 AM, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> >
> > Does anyone have a way to get type checking that doesn't involve
> > adding .v everywhere?
> 
> Wouldn't it suffice to use
>   typedef char *unmeta_t;
>   typedef unsigned char *meta_t;
> 

Another option: use incomplete struct types:
.
    typedef char *unmeta_t;
    typedef struct incomplete_meta_t *meta_t;

> (or the reverse if we believe we're mostly going to be working on
> meta_t).  If we do that and also enable the GCC warnings -Wtype-limits
> -Wconversion -Wformat-signedness would we not get adequate verbosity
> without changing pointer arithmetic and dereferencing?
> 
> I hadn't even considered all the places where using a struct would
> change memory allocation strategy.  Oof.
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile())
  2016-10-16  1:14 ` type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile()) Bart Schaefer
  2016-10-16 17:03   ` Daniel Shahaf
@ 2016-10-18 11:36   ` Peter Stephenson
  1 sibling, 0 replies; 12+ messages in thread
From: Peter Stephenson @ 2016-10-18 11:36 UTC (permalink / raw)
  To: zsh-workers

On Sat, 15 Oct 2016 18:14:44 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> Wouldn't it suffice to use
>   typedef char *unmeta_t;
>   typedef unsigned char *meta_t;
> 
> (or the reverse if we believe we're mostly going to be working on
> meta_t).  If we do that and also enable the GCC warnings -Wtype-limits
> -Wconversion -Wformat-signedness would we not get adequate verbosity
> without changing pointer arithmetic and dereferencing?

-Wformat-signedness didn't exist when I tried (old version, 4.5.1), but
adding -Wtype-limits and -Wconversion produced 2348 type conversion
errors and 63 implicit conversion of negative integer errors.

But we don't need new warnings --- compilers, including gcc, are much
more finnicky about converting pointer types than the base types, and
assigning individual characters from strings around isn't part of our
problems (except where we alrady have the STOUC() hack).

The only possible subtlety is on some systems char is already unsigned.
However, it seems gcc considers all three of "unsigned char *", "signed
char *", and "char *" as distinct pointer types and making meta_t either
of the first two throws up errors like:

ptr.c:11:9: warning: comparison of distinct pointer types lacks a cast

So this looks workable.  Obviously this means continuing to keep
unmetafied lengths separate.

pws


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile())
  2016-10-14 23:04                   ` Oliver Kiddle
@ 2016-10-15  8:30                     ` Daniel Shahaf
  0 siblings, 0 replies; 12+ messages in thread
From: Daniel Shahaf @ 2016-10-15  8:30 UTC (permalink / raw)
  To: zsh-workers

Oliver Kiddle wrote on Sat, Oct 15, 2016 at 01:04:58 +0200:
> Daniel Shahaf wrote:
> > > > This probably is a good idea, although the intermediate state where this
> > > > has been only partly done is likely to be a bit of a mess.
> 
> Starting out with a simple typedef would at least add documentation and
> allow an incremental approach. Would still need to add .v everywhere en
> masse when going to a struct, though.

Does anyone have a way to get type checking that doesn't involve
adding .v everywhere?  The solution doesn't _have_ to be portable; we
can use compiler-specific magic of gcc or clang, for example, so long as
some developers use that compiler routinely.

If we do the typedefs first, then adding .v later should be easy with
something like coccinelle's patching tool.  (It allows doing
search/replace in terms of the C syntax tree.)

> I did some work on a code base that had a convention of uint8_t*
> for UTF-8. Seemed to work well as far as I could tell.
> 

What's the advantage over chars or unsigned chars?

> For the history file we should perhaps consider not using locale
> dependent metafied strings. UTF-8 with an overlong encoding for a null
> perhaps? grep etc might still not like the overlong null but they should
> be rarer.

+1 to making the history file locale-agnostic.

> > allocated relative to each other, e.g., if the code assumes that
> > «ary[1] - ary[0] == 1+strlen(ary[0])» or otherwise uses the values of
> 
> Even with char**, that condition is only likely if the array was
> created with a literal assignment. And it is easy to test for
> problems by forcing the size of the struct and running the test
> cases.

I was thinking of code that allocates multiple elements in a single
malloc() chunk:

    char **array, *storage;
    for (i = 0; i < LENGTH_OF_LIST; i++)
        sum_of_lengths += strlen(list[i]) + 1;
    storage = malloc(sum_of_lengths);
    for (i = 0; i < LENGTH_OF_LIST; i++) {
        array[i] = strcpy(storage, list[i]);
        storage += strlen(array[i]) + 1;
    }

> Oliver
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile())
  2016-10-14  6:36                 ` Daniel Shahaf
  2016-10-14 16:53                   ` Peter Stephenson
@ 2016-10-14 23:04                   ` Oliver Kiddle
  2016-10-15  8:30                     ` Daniel Shahaf
  1 sibling, 1 reply; 12+ messages in thread
From: Oliver Kiddle @ 2016-10-14 23:04 UTC (permalink / raw)
  To: Zsh hackers list

Daniel Shahaf wrote:
> > > This probably is a good idea, although the intermediate state where this
> > > has been only partly done is likely to be a bit of a mess.

Starting out with a simple typedef would at least add documentation and
allow an incremental approach. Would still need to add .v everywhere en
masse when going to a struct, though.

I did some work on a code base that had a convention of uint8_t*
for UTF-8. Seemed to work well as far as I could tell.

For the history file we should perhaps consider not using locale
dependent metafied strings. UTF-8 with an overlong encoding for a null
perhaps? grep etc might still not like the overlong null but they should
be rarer.

> allocated relative to each other, e.g., if the code assumes that
> «ary[1] - ary[0] == 1+strlen(ary[0])» or otherwise uses the values of

Even with char**, that condition is only likely if the array was
created with a literal assignment. And it is easy to test for
problems by forcing the size of the struct and running the test
cases.

Oliver


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile())
  2016-10-14 16:53                   ` Peter Stephenson
  2016-10-14 17:28                     ` Bart Schaefer
@ 2016-10-14 20:08                     ` Daniel Shahaf
  1 sibling, 0 replies; 12+ messages in thread
From: Daniel Shahaf @ 2016-10-14 20:08 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh hackers list

Peter Stephenson wrote on Fri, Oct 14, 2016 at 17:53:17 +0100:
> I don't think arrays of unmetafied strings are that common, and in at
> least one case (the print builtin) having the length as part of the
> unmetafied_t structure, avoiding a parallel length structure, is
> beneficial.

Ack.  I saw you say this in your previous mail; I didn't respond since
I don't have an opinion one way or the other as to how often the length
would be needed.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile())
  2016-10-14 16:53                   ` Peter Stephenson
@ 2016-10-14 17:28                     ` Bart Schaefer
  2016-10-14 20:08                     ` Daniel Shahaf
  1 sibling, 0 replies; 12+ messages in thread
From: Bart Schaefer @ 2016-10-14 17:28 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh hackers list

On Fri, Oct 14, 2016 at 9:53 AM, Peter Stephenson
<p.stephenson@samsung.com> wrote:
> On Fri, 14 Oct 2016 06:36:25 +0000
> Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>> A 'char **' would become a 'struct metafied_t *', i.e., an array of
>> structs.
>
> I don't think arrays of unmetafied strings are that common, and in at
> least one case (the print builtin) having the length as part of the
> unmetafied_t structure, avoiding a parallel length structure, is
> beneficial.

I am more concerned about the large number of places where s[0][1]
would have to become s[0].v[1] than about whether the strings are
metafied or not.  Similarly *s++ and so on.

Including the even larger number of places where we'd have to convert
e.g. "if (!s)"  to  "if (!s.v)" because it's not possible to pass a
null struct by value.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile())
  2016-10-14  6:36                 ` Daniel Shahaf
@ 2016-10-14 16:53                   ` Peter Stephenson
  2016-10-14 17:28                     ` Bart Schaefer
  2016-10-14 20:08                     ` Daniel Shahaf
  2016-10-14 23:04                   ` Oliver Kiddle
  1 sibling, 2 replies; 12+ messages in thread
From: Peter Stephenson @ 2016-10-14 16:53 UTC (permalink / raw)
  To: Zsh hackers list

On Fri, 14 Oct 2016 06:36:25 +0000
Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> A 'char **' would become a 'struct metafied_t *', i.e., an array of
> structs.  This is not equivalent: the array stride of the latter may be
> larger than the array stride of the former.¹  However, this would only
> be a problem if the code makes assumptions about where elements are
> allocated relative to each other, e.g., if the code assumes that
> «ary[1] - ary[0] == 1+strlen(ary[0])» or otherwise uses the values of
> ary[N] and ary[M] in a single expression.
> 
> Does the code make such assumptions?

No, char ** is used fairly generically.

I don't think arrays of unmetafied strings are that common, and in at
least one case (the print builtin) having the length as part of the
unmetafied_t structure, avoiding a parallel length structure, is
beneficial.

pws


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile())
  2016-10-13 13:46               ` Bart Schaefer
@ 2016-10-14  6:36                 ` Daniel Shahaf
  2016-10-14 16:53                   ` Peter Stephenson
  2016-10-14 23:04                   ` Oliver Kiddle
  0 siblings, 2 replies; 12+ messages in thread
From: Daniel Shahaf @ 2016-10-14  6:36 UTC (permalink / raw)
  To: Zsh hackers list

Bart Schaefer wrote on Thu, Oct 13, 2016 at 06:46:38 -0700:
> On Thu, Oct 13, 2016 at 3:22 AM, Peter Stephenson
> <p.stephenson@samsung.com> wrote:
> > On Tue, 11 Oct 2016 06:51:05 +0000
> > Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> >> Pardon me if this has been discussed before, but why don't we introduce
> >> the type definitions
> >> and start incrementally replacing uses of "char *" with uses of these
> >> types?
> >
> > This probably is a good idea, although the intermediate state where this
> > has been only partly done is likely to be a bit of a mess.
> 
> I'd expect to encounter a lot of problems with (char **) arrays where
> every element of the array is (or not) metafied.  (Hopefully there
> aren't any that are mixed).  As was noted in the discussion of passing
> around lengths of arrays when manipulating array parameters, the older
> code makes a lot of assumptions about pointer arithmetic and s[1][1]
> subscripting and so on.

A 'char **' would become a 'struct metafied_t *', i.e., an array of
structs.  This is not equivalent: the array stride of the latter may be
larger than the array stride of the former.¹  However, this would only
be a problem if the code makes assumptions about where elements are
allocated relative to each other, e.g., if the code assumes that
«ary[1] - ary[0] == 1+strlen(ary[0])» or otherwise uses the values of
ary[N] and ary[M] in a single expression.

Does the code make such assumptions?

Cheers,

Daniel

¹ The array stride is sizeof() of the array element's type.  The
sizeof() of the struct type may be larger than sizeof() of its member
because structs are allowed to contain padding after the last element.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile())
  2016-10-13 10:22             ` Peter Stephenson
@ 2016-10-13 13:46               ` Bart Schaefer
  2016-10-14  6:36                 ` Daniel Shahaf
  0 siblings, 1 reply; 12+ messages in thread
From: Bart Schaefer @ 2016-10-13 13:46 UTC (permalink / raw)
  To: Zsh hackers list

On Thu, Oct 13, 2016 at 3:22 AM, Peter Stephenson
<p.stephenson@samsung.com> wrote:
> On Tue, 11 Oct 2016 06:51:05 +0000
> Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>> Pardon me if this has been discussed before, but why don't we introduce
>> the type definitions
>> and start incrementally replacing uses of "char *" with uses of these
>> types?
>
> This probably is a good idea, although the intermediate state where this
> has been only partly done is likely to be a bit of a mess.

I'd expect to encounter a lot of problems with (char **) arrays where
every element of the array is (or not) metafied.  (Hopefully there
aren't any that are mixed).  As was noted in the discussion of passing
around lengths of arrays when manipulating array parameters, the older
code makes a lot of assumptions about pointer arithmetic and s[1][1]
subscripting and so on.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: type-checking for metafiedness?  (was: Re: Cores almost on demand in patcompile())
  2016-10-11  6:51           ` type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile()) Daniel Shahaf
@ 2016-10-13 10:22             ` Peter Stephenson
  2016-10-13 13:46               ` Bart Schaefer
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Stephenson @ 2016-10-13 10:22 UTC (permalink / raw)
  To: Zsh hackers list

On Tue, 11 Oct 2016 06:51:05 +0000
Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> Pardon me if this has been discussed before, but why don't we introduce
> the type definitions
> .
>     struct metafied_t { char *v; };
>     struct unmetafied_t { char *v; };
> .
> and start incrementally replacing uses of "char *" with uses of these
> types?  This will gain type checking for metafied v. unmetafied strings,
> as these two types do not implicitly convert into each other (attempts
> to do so generate a compiler warning).

This probably is a good idea, although the intermediate state where this
has been only partly done is likely to be a bit of a mess.  Modern
compilers should be able to keep this working efficiently.

Some thought could be given as to whether the unmetafied structure should
contain the length, too.  The only uses of unmetafied strings I can
think of that wouldn't need it are at the library / system interface,
which requires unmetafied null-terminated strings.

pws


^ permalink raw reply	[flat|nested] 12+ messages in thread

* type-checking for metafiedness?  (was: Re: Cores almost on demand in patcompile())
  2016-10-11  2:46         ` Bart Schaefer
@ 2016-10-11  6:51           ` Daniel Shahaf
  2016-10-13 10:22             ` Peter Stephenson
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Shahaf @ 2016-10-11  6:51 UTC (permalink / raw)
  To: Zsh hackers list; +Cc: Sebastian Gniazdowski

Bart Schaefer wrote on Mon, Oct 10, 2016 at 19:46:18 -0700:
> The comments say that this argument is expected to be metafied
> (pattern.c 522), but as best I can tell it's passed down from
> paramsubst() tokenized but not metafied.

Pardon me if this has been discussed before, but why don't we introduce
the type definitions
.
    struct metafied_t { char *v; };
    struct unmetafied_t { char *v; };
.
and start incrementally replacing uses of "char *" with uses of these
types?  This will gain type checking for metafied v. unmetafied strings,
as these two types do not implicitly convert into each other (attempts
to do so generate a compiler warning).

Cheers,

Daniel


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-10-18 11:47 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20161016014741eucas1p189be6c2fa19aaf9d31733cae5f716178@eucas1p1.samsung.com>
2016-10-16  1:14 ` type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile()) Bart Schaefer
2016-10-16 17:03   ` Daniel Shahaf
2016-10-18 11:36   ` Peter Stephenson
2016-10-09  9:00 Cores almost on demand in patcompile() Sebastian Gniazdowski
2016-10-09  9:16 ` Sebastian Gniazdowski
2016-10-09 10:15   ` Sebastian Gniazdowski
2016-10-09 16:09     ` Sebastian Gniazdowski
2016-10-10 15:31       ` Sebastian Gniazdowski
2016-10-11  2:46         ` Bart Schaefer
2016-10-11  6:51           ` type-checking for metafiedness? (was: Re: Cores almost on demand in patcompile()) Daniel Shahaf
2016-10-13 10:22             ` Peter Stephenson
2016-10-13 13:46               ` Bart Schaefer
2016-10-14  6:36                 ` Daniel Shahaf
2016-10-14 16:53                   ` Peter Stephenson
2016-10-14 17:28                     ` Bart Schaefer
2016-10-14 20:08                     ` Daniel Shahaf
2016-10-14 23:04                   ` Oliver Kiddle
2016-10-15  8:30                     ` Daniel Shahaf

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).