From mboxrd@z Thu Jan 1 00:00:00 1970 From: michael@kjorling.se (Michael =?utf-8?B?S2rDtnJsaW5n?=) Date: Sun, 7 May 2017 13:54:07 +0000 Subject: [TUHS] Discuss of style and design of computer programs from a user stand point In-Reply-To: References: <201705060202.v4622L1J013430@coolidge.cs.Dartmouth.EDU> <5a2d6cc957c2efcd968f35aa5557c7a0e309dd27@webmail.yaccman.com> <20170506091857.GE12539@yeono.kjorling.se> <20170506144011.GF28787@mcvoy.com> <20170506195049.8FBEA124AEA4@mail.bitblocks.com> Message-ID: <20170507135407.GS12539@yeono.kjorling.se> On 7 May 2017 11:42 +1000, from noel.hunt at gmail.com (Noel Hunt): > I don't imagine it would be hard to re-write [uniq] to > handle utf-8. It does look like at least GNU coreutils 8.13 uniq is broken in that regard, which frankly surprised me. That version isn't _that_ old. $ uniq --version uniq (GNU coreutils) 8.13 Copyright (C) 2011 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later . This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Written by Richard M. Stallman and David MacKenzie. $ ( echo $'\u1234' ; echo $'\u2345' ; echo $'\u1234' ) ሴ ⍅ ሴ $ ( echo $'\u1234' ; echo $'\u2345' ; echo $'\u1234' ) | uniq ሴ $ -- Michael Kjörling • https://michael.kjorling.se • michael at kjorling.se “People who think they know everything really annoy those of us who know we don’t.” (Bjarne Stroustrup)