From: ibid.ag@gmail.com
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Left-shift of negative number
Date: Sat, 18 Jul 2015 13:01:43 -0700
Message-ID: <20150718200142.GB1999@Caracal>
References: <1437159779.30461.1.camel@inria.fr> <20150717213521.GD1173@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com
In-Reply-To: <20150717213521.GD1173@brightrain.aerifal.cx>

On Fri, Jul 17, 2015 at 05:35:22PM -0400, Rich Felker wrote:
> > What worries me more than the shift of a negative value, is that this
> > code is erroneous if `int` is only 16 bit wide. Whereas we can
> > reasonably assume that a shift of a negative value in two's complement
> > is the same as an unsigned shift, compilers tend to produce just crap
> > if the shift exceeds the width.
> >
> > So I would feel much more comfortable if we'd use UINT32_C(0x40)
> > inside the R macro.
>
> The entire internal API here uses the type unsigned for character
> codes and state, so like the rest of musl there is an assumption
> (guaranteed by POSIX) that int is at least 32-bit. Since the
> UTF-8/multibyte code is written to be largely self-contained and
> independent of musl, we could look into enhancing the code to be
> portable to systems with 16-bit int, but I suspect this would be
> rather useless in practice. If we did that, we would need to use
> something ugly like uint_least32_t rather than uint32_t to gain any
> portability since the latter need not even exist.

As far as I know, 16-bit int is applicable to the following platforms:
-Some ports of certain RTOSes to 8 or 16 bit microcontrollers (i.e.,
 FreeRTOS and perhaps eCos)
-DOS, when *not* using GCC (DJGPP uses 32-bit int); this boils down to
 OpenWatcom, the old C89 compilers, and even older K&R-ish compilers.
-FUZIX
-ELKS
-Minix (8086 version)
-Xenix and other old commercial 16-bit *nixes

FUZIX uses sdcc, which is an incomplete C89-ish compiler.
ELKS uses a K&R compiler that can be used with a preprocessor to compile
some C89 code.
Old *nixes use K&R C.
8086 Minix uses ACK, which is C89; there's an experimental port of PCC to
the 8086, but that's a long way from being usable right now. (Alan Cox is
working on it part time so he can port FUZIX to the PC.)

In short, the possible compilers are OpenWatcom, or various bits that are
C89 at best (can't rely on uint* being available at all, short of a custom
"stdint.h").
I'm not sure if OpenWatcom uses 16-bit int when building in 32-bit mode;
the compatibility with HXRT would suggest that it doesn't.

So to make it meaningful, you would have to make it work with segmented
memory and probably C89.

Odd as it may sound, there are people using UTF on DOS (the Blocek text
editor comes to mind); but I'm not aware of interest in UTF on 16-bit DOS.

Thanks,
Isaac Dunham
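
For illustration only, here is a rough sketch of what "at least 32 bits without uint*" could look like on a C89-only compiler. This is not musl code; the names u32_least, shift_hi and shift_hi_selfcheck are made up for the example. It assumes only <limits.h> and the C89 guarantee that unsigned long has at least 32 bits, and it requires the shift count to be below 32.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical name, not part of musl: an unsigned type of at least
 * 32 bits chosen without <stdint.h>, so it also works with C89-only
 * compilers. */
#if UINT_MAX >= 0xFFFFFFFFUL
typedef unsigned int u32_least;   /* int is already 32 bits or wider */
#else
typedef unsigned long u32_least;  /* C89 guarantees long has >= 32 bits */
#endif

/* Shift a small value into the upper bits of a 32-bit quantity.
 * The cast happens before the shift, so the shift is never performed
 * in a 16-bit int. The mask keeps the result to 32 bits when
 * u32_least is wider than that. count must be less than 32. */
static u32_least shift_hi(unsigned value, unsigned count)
{
    return (u32_least)(((u32_least)value << count) & 0xFFFFFFFFUL);
}

/* Small self-check; returns 1 when the results match the expected
 * values. */
int shift_hi_selfcheck(void)
{
    assert(sizeof(u32_least) * CHAR_BIT >= 32);
    return shift_hi(0x40, 23) == 0x20000000UL
        && shift_hi(0x01, 31) == 0x80000000UL;
}
```

With a 16-bit int, an uncast expression such as 0x40 << 23 would shift by more than the width of the operand, which is undefined; casting first is what avoids that. Segmented memory and K&R-only compilers are separate problems that this sketch does not touch.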