The Unix Heritage Society mailing list
 help / color / mirror / Atom feed
From: random832@fastmail.com (Random832)
Subject: [TUHS] Character sets
Date: Mon, 28 Mar 2016 01:12:58 -0400	[thread overview]
Message-ID: <1459141978.2176234.561172346.36763B9D@webmail.messagingengine.com> (raw)
In-Reply-To: <20160328015814.GC18027@mercury.ccil.org>

On Sun, Mar 27, 2016, at 21:58, John Cowan wrote:
> Random832 scripsit:
> 
> > Sure it does, but replace that != " " with !isblank(*c), and it doesn't
> > work anymore since it ignores multibyte characters. 
> 
> In which locales does isblank() actually return true on characters other
> than space and tab?  (This is a straight question.)

See, no, that's a trick question. None of the other blank class
characters are single-byte, so of course isblank doesn't. The following
characters return true on is*w*blank for me: U+00a0 U+1680 U+2000 U+2001
U+2002 U+2003 U+2004 U+2005 U+2006 U+2007 U+2008 U+2009 U+200a U+200b
U+202f U+205f U+3000 (Oddly enough, isblank(0xA0) is true even in the
UTF-8 locale, though of course U+00a0 is actually a multibyte character
"\xc2\xa0".) So, if what you _want_ is to find the next blank character,
doing this loop with isblank won't work. If what you want is to find
space or tab, sure. But that's why grep for patterns containing \s are
so slow.


      reply	other threads:[~2016-03-28  5:12 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <mailman.169.1459059516.15972.tuhs@minnie.tuhs.org>
2016-03-27 10:09 ` [TUHS] Character sets (was: Command-line options) Johnny Billquist
2016-03-27 11:29   ` John Cowan
2016-03-27 11:47     ` [TUHS] Character sets Johnny Billquist
2016-03-27 21:49       ` Greg 'groggy' Lehey
2016-03-27 21:53         ` Johnny Billquist
2016-03-27 21:59           ` Greg 'groggy' Lehey
2016-03-27 22:19             ` Johnny Billquist
2016-03-27 22:21             ` Charles Anthony
2016-03-27 23:23               ` Dave Horsfall
2016-03-28  0:20                 ` John Cowan
2016-03-28  1:02                   ` Dave Horsfall
2016-03-28  0:18               ` Johnny Billquist
2016-03-27 23:30           ` John Cowan
2016-03-27 23:56             ` Johnny Billquist
2016-03-28  1:54               ` John Cowan
2016-03-28  3:27               ` Steve Nickolas
2016-03-28  1:20             ` Random832
2016-03-28  1:58               ` John Cowan
2016-03-28  5:12                 ` Random832 [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1459141978.2176234.561172346.36763B9D@webmail.messagingengine.com \
    --to=random832@fastmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).