From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [9fans] awk
Content-Type: text/plain; charset=ISO-2022-JP; format=flowed
Mime-Version: 1.0 (Apple Message framework v546)
From: Kenji Arisawa
To: 9fans@cse.psu.edu
Content-Transfer-Encoding: 7bit
In-Reply-To:
Message-Id: <4789922C-FC97-11D6-8A66-000393A941BC@ar.aichi-u.ac.jp>
Date: Wed, 20 Nov 2002 23:49:32 +0900
Topicbox-Message-UUID: 24962b3c-eacb-11e9-9e20-41e7f4b1d025

Hello,

I said:
> I tested some awk string functions to examine whether
> they can handle UTF-8 code well.
> Below is my test code:
> #!/bin/rc
> #
> # Can awk functions handle UTF strings?
> #
> echo 'ベル:研究所' | awk '{
>     print $0                                 # ベル:研究所
>     print length($0)                         # 6
>     print index($0,":")                      # 3
>     print match($0,":.*"),RSTART, RLENGTH    # 7 7 4
>     print substr($0,3)                       # :研究所
>     a=$0; sub(":.+", "alice", a); print a    # ベルalice
> }'
>
> Output is commented after `#' in each line.
> Function `match' returns a byte position, which is inconsistent
> with the others. I believe this is a bug.
>

It seems this bug has been fixed in a recent update.
Thanks.

Kenji Arisawa
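
For reference, the 7 reported by `match' in the test is what one gets by counting UTF-8 bytes instead of characters: ベ and ル take three bytes each, so `:' is the 3rd character but the 7th byte. Below is a minimal Python sketch of that arithmetic. It is not awk itself and says nothing about how any awk implements these functions; the variable names are only illustrative.

```python
# Illustration only: compare 1-based character and byte offsets of ':'
# in the test string, the two ways an awk could count positions.
text = "ベル:研究所"
encoded = text.encode("utf-8")

char_len = len(text)                  # length in characters
char_pos = text.index(":") + 1        # 1-based character position of ':'
byte_pos = encoded.index(b":") + 1    # 1-based byte position of ':'

# Length of what ":.*" matches (':' through end of string),
# counted in characters and in bytes.
match_chars = len(text[char_pos - 1:])
match_bytes = len(encoded[byte_pos - 1:])

print(char_len, char_pos, byte_pos, match_chars, match_bytes)
```

The character position is 3 and the byte position is 7, while the match is 4 characters or 10 bytes long. So the reported `7 7 4' pairs a byte-based RSTART with a character-based RLENGTH.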