* Removing forced casts between signed/unsigned char pointers @ 2005-11-15 9:27 Wayne Davison 2005-11-15 10:00 ` DervishD 0 siblings, 1 reply; 4+ messages in thread From: Wayne Davison @ 2005-11-15 9:27 UTC (permalink / raw) To: zsh-workers There were a few compiler warnings about pointers and differ in signedness in the non-multibyte code that I wanted to fix. I first looked at the easy fix of adding a few new forced casts, but I didn't like how many casts to and from "unsigned char *" and "char *" we were getting in the code. Instead, I decided to change some of the functions and global pointers that were taking/returning unsigned char pointers to make them take/return normal char pointers. This allowed me to remove a bunch of forced casts. I was careful in my changes to preserve any unsigned features of the old code (for instance, ztrcmp() still compares the character values in an unsigned manner even though it takes normal "char *" strings). In two spots in the code there were unsigned char values being assigned and restored to/from a "char save" value, so this potentially fixed a problem for those broken systems that have trouble converting signed<->unsigned char values without extra casts. If anyone would like to see all the changes I just checked in, you'll find a unified diff of them here: http://opencoder.net/zsh-less-unsigned-char.patch ..wayne.. ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Removing forced casts between signed/unsigned char pointers 2005-11-15 9:27 Removing forced casts between signed/unsigned char pointers Wayne Davison @ 2005-11-15 10:00 ` DervishD 2005-11-15 11:03 ` Wayne Davison 0 siblings, 1 reply; 4+ messages in thread From: DervishD @ 2005-11-15 10:00 UTC (permalink / raw) To: Wayne Davison; +Cc: zsh-workers Hi Wayne :) * Wayne Davison <wayned@users.sourceforge.net> dixit: > I first looked at the easy fix of adding a few new forced casts, > but I didn't like how many casts to and from "unsigned char *" and > "char *" we were getting in the code. Instead, I decided to change > some of the functions and global pointers that were > taking/returning unsigned char pointers to make them take/return > normal char pointers. This allowed me to remove a bunch of forced > casts. But that's not a good idea. If you go for portability, you shouldn't assume that a plain char is equivalent to "unsigned char". If you don't mind me suggesting, you should use "unsigned char" wherever possible, and cast to plain "char" when needed. In fact, plain char has been maintained in the C99 standard because making it dissappear or making it an alias to "unsigned char" will break some existing code. > I was careful in my changes to preserve any unsigned features of > the old code (for instance, ztrcmp() still compares the character > values in an unsigned manner even though it takes normal "char *" > strings). If you really want to make sure you're preserving unsigned features, avoid the use of "char" and "char *". I know you're using autoconf and that anyway you can check if "char" is equivalent to "unsigned char" or "signed char" using SCHAR_MAX, CHAR_MAX and UCHAR_MAX macros, but I think that using "unsigned char" where possible is safer. I haven't seen any potential problem in your patch, but I don't think it's a good idea to change some prototypes from using "unsigned char *" to "char *". I have had a similar problem with one of my projects and I think that the extra casts are a better solution, especially if you can afford to modify prototypes. You can run into trouble by changing from "unsigned char" to "char" (this applies to pointers to these types, too), because you can potentially restrict the range, but unless your code relies on "char" being signed at some point, you can safely go from "char" to "unsigned char". Currently I don't know of any system nor C implementation that won't copy eight bits from any kind of char type to any other kind of char type (even through pointers) if CHAR_BIT is 8, and the real danger is probably in comparisons, but anyway, and to be on the safe side, I would go for "unsigned char" and will do any needed casts. Given that you can change some prototypes, I think that the number of casts will be pretty low. In my project, going to "unsigned char" and "unsigned char *" will need casts only for a couple uses of <string.h> functions (which should be avoided whenever possible) and a couple of Xlib functions. The other way around (using "char" & "char *" instead) will need lots of casts and a careful auditing to see if we are breaking unsigned features, since the code must be 8bit clean. And thanks a lot for fixing the warnings :) I know that sometimes GCC warnings are nonsenses or misleading, but this time they're right: the C99 standard makes pretty clear that, no matter if "char" is equivalent to "unsigned char" or "signed char", the three types are distinct, so their pointers are different and non compatible, so you need explicit casts. You can see a discussion in the GCC bugzilla, and if you want I can point you to the relevant C99 standard paragraphs. Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 | http://www.dervishd.net http://www.pleyades.net & http://www.gotesdelluna.net It's my PC and I'll cry if I want to... ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Removing forced casts between signed/unsigned char pointers 2005-11-15 10:00 ` DervishD @ 2005-11-15 11:03 ` Wayne Davison 2005-11-15 12:15 ` DervishD 0 siblings, 1 reply; 4+ messages in thread From: Wayne Davison @ 2005-11-15 11:03 UTC (permalink / raw) To: zsh-workers On Tue, Nov 15, 2005 at 11:00:09AM +0100, DervishD wrote: > If you go for portability, you shouldn't assume that a plain char is > equivalent to "unsigned char". I'm not -- I'm assuming that "char" is probably signed while making sure that the code works irrespective of it being signed or not. What I want to avoid is a bunch of switching back and forth between the two pointer types. I like to avoid pointer casts where possible because they can hide bugs. For instance, it's easy to think that we're casting one pointer type to another when we're really forcibly casting a character value or an int into a pointer -- the casts are coded the same way, and C will not complain about it if it's an explicit cast. (Yes, we could use inline functions under a compiler like gcc and code up type-safe conversions, but I think it's better to pick a single char type and stick with it than to do a bunch of converting back and forth.) > If you don't mind me suggesting, you should use "unsigned char" > wherever possible, and cast to plain "char" when needed. I certainly wouldn't mind if all of zsh were switched over to using unsigned chars for its strings, but that's a MUCH larger change than what I did (my changes mainly affected the ZLE code). Switching all of zsh to unsigned chars would also involve some kind of a wrapper scheme for standard C functions that we use (e.g. strcpy(), strcmp(), strchr(), etc.), which might be a little annoying, but probably wouldn't be too bad. The main thing zsh needs is consistency of type and good coding practices to avoid any problems with signed characters. Yes, such coding practices are slightly easier to get right when using unsigned chars, but zsh already has several idioms in place that are geared towards making sure that signed characters are handled correctly. That's my opinion, at least. ..wayne.. ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Removing forced casts between signed/unsigned char pointers 2005-11-15 11:03 ` Wayne Davison @ 2005-11-15 12:15 ` DervishD 0 siblings, 0 replies; 4+ messages in thread From: DervishD @ 2005-11-15 12:15 UTC (permalink / raw) To: Wayne Davison; +Cc: zsh-workers Hi Wayne :) * Wayne Davison <wayned@users.sourceforge.net> dixit: > On Tue, Nov 15, 2005 at 11:00:09AM +0100, DervishD wrote: > > If you go for portability, you shouldn't assume that a plain char is > > equivalent to "unsigned char". > > I'm not -- I'm assuming that "char" is probably signed while making > sure that the code works irrespective of it being signed or not. That's cool :) (and of course I wasn't saying that you was assuming that, it was just a suggestion). > What I want to avoid is a bunch of switching back and forth between > the two pointer types. I like to avoid pointer casts where > possible because they can hide bugs. You're completely right, arbitrary casts can hide very nasty bugs. In that sense you're right when sticking to a particular type of char. Anyway and IMHO (since I'm not as familiar as you are with zsh source code) going to "unsigned char" will remove a whole bunch of casts (especially if you can change some prototypes) and it is safer. But obviously you know LOTS more than me about zsh source code. For example, I didn't know that zsh code can deal with signed chars. > > If you don't mind me suggesting, you should use "unsigned char" > > wherever possible, and cast to plain "char" when needed. > > I certainly wouldn't mind if all of zsh were switched over to using > unsigned chars for its strings, but that's a MUCH larger change > than what I did (my changes mainly affected the ZLE code). OK, I understand. I didn't think about the extents of such modification. > Switching all of zsh to unsigned chars would also involve some kind > of a wrapper scheme for standard C functions that we use (e.g. > strcpy(), strcmp(), strchr(), etc.), which might be a little > annoying, but probably wouldn't be too bad. But zsh is already using such wrappers, isn't it? This way you only have to change the wrapper prototypes. As I've already told, I'm not familiar with the code to know such details. It was just a suggestion. > The main thing zsh needs is consistency of type and good coding > practices to avoid any problems with signed characters. This is obvious in any project. Good coding practices can avoid more bugs that just trying to write "proper" code (for example, using "unsigned char" instead of plain "char"). Here you're completely right, and as far as I know, zsh code fits your description. > Yes, such coding practices are slightly easier to get right when > using unsigned chars, but zsh already has several idioms in place > that are geared towards making sure that signed characters are > handled correctly. I didn't know that, thanks for pointing :) > That's my opinion, at least. And you're right. If switching to "unsigned char" can lead to significant problems and the change is very large (in number of lines of code affected) I don't think it is worth the effort. Anyway, if the change can be done progressively (for example, changing the wrappers first, after that some static variables, etc.) then it maybe worth a try. Anyway this is just a humble suggestion since I'm not able to do that change myself and I don't want (nor like) to say others how they should do their work. Thanks for considering and discussing the idea :) And good luck with the changes, whatever they are ;) Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 | http://www.dervishd.net http://www.pleyades.net & http://www.gotesdelluna.net It's my PC and I'll cry if I want to... ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2005-11-15 12:13 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2005-11-15 9:27 Removing forced casts between signed/unsigned char pointers Wayne Davison 2005-11-15 10:00 ` DervishD 2005-11-15 11:03 ` Wayne Davison 2005-11-15 12:15 ` DervishD
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/zsh/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).