From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <44f445f990c9d739081500c810f292d5@plan9.bell-labs.com>
From: David Presotto
To: 9fans@cse.psu.edu
Subject: Re: [9fans] fmt and unicode text
In-Reply-To: <6a245736ff4fa6f1dcf7d9b2303482bd@plan9.ucalgary.ca>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="upas-wdhjvmgwdtsoiqxfwqbcxlpstm"
Date: Sun, 16 Nov 2003 16:28:45 -0500

This is a multi-part message in MIME format.
--upas-wdhjvmgwdtsoiqxfwqbcxlpstm
Content-Disposition: inline
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit

Odd: of the two examples, the old fmt output looks reasonable on my
screen, and the other has an oddly long and unbalanced line. What is
the advantage here?

--upas-wdhjvmgwdtsoiqxfwqbcxlpstm
Content-Type: message/rfc822
Content-Disposition: inline

Message-ID: <6a245736ff4fa6f1dcf7d9b2303482bd@plan9.ucalgary.ca>
To: 9fans@cse.psu.edu
From: mirtchov@cpsc.ucalgary.ca
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Subject: [9fans] fmt and unicode text
Reply-To: 9fans@cse.psu.edu
List-Id: Fans of the OS Plan 9 from Bell Labs <9fans.cse.psu.edu>
Date: Sun, 16 Nov 2003 12:29:24 -0700
Content-Transfer-Encoding: 8bit

This diff makes 'fmt' understand UTF text without cutting it short:

home% diff fmt.c /sys/src/cmd/fmt.c
197c197
< col += utflen(w[i]->text);
---
> col += strlen(w[i]->text);
203c203
< if(col+nsp+utflen(w[i]->text) > extraindent+length)
---
> if(col+nsp+strlen(w[i]->text) > extraindent+length)
home%

Here's an example:

This is UTF text formatted with the stock fmt:

Зарегистрируйтесь сейчас на Десятую
Международную Конференцию по Unicode,
которая состоится 10-12 марта 1997 года в
Майнце в Германии.

This is UTF text formatted with the new fmt:

Зарегистрируйтесь сейчас на Десятую Международную Конференцию по
Unicode, которая состоится 10-12 марта 1997 года в Майнце в Германии.

Note that the Russian text above appears to be formatted longer than
this English text. That is because the default Cyrillic fonts used in
acme render slightly larger, even the fixed-size ones. If you use
something completely fixed-size, such as
/lib/font/bit/10646/7x13/7x13.font, you'll find that the texts are of
proper length :)

cheers, andrey

ps: the patch(1) system still isn't operational.

pps: a similar change may be required for cb(1), but since most code
is written in plain English I doubt it's that necessary.

--upas-wdhjvmgwdtsoiqxfwqbcxlpstm--
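
For readers without a Plan 9 system at hand, the difference behind the diff above can be sketched in portable C. strlen counts bytes, while fmt's column arithmetic wants characters, and each Cyrillic letter takes two bytes in UTF-8. The names rune_count and fits below are hypothetical stand-ins, not the Plan 9 library routines or the actual fmt.c code, and the sketch assumes well-formed UTF-8 input.

```c
#include <stddef.h>
#include <string.h>

/*
 * rune_count: hypothetical portable stand-in for Plan 9's utflen.
 * Counts code points in a NUL-terminated UTF-8 string by skipping
 * continuation bytes (bit pattern 10xxxxxx).
 * Assumption: the input is well-formed UTF-8; bad bytes are not diagnosed.
 */
size_t
rune_count(const char *s)
{
	size_t n = 0;

	for(; *s != '\0'; s++)
		if(((unsigned char)*s & 0xC0) != 0x80)
			n++;
	return n;
}

/*
 * fits: hypothetical helper with the same shape as the test at line 203
 * of the diff. Reports whether word, preceded by nsp spaces, still fits
 * on a line that is already col columns wide and may hold limit columns.
 */
int
fits(size_t col, size_t nsp, const char *word, size_t limit)
{
	return col + nsp + rune_count(word) <= limit;
}
```

With this sketch, the two-letter word "по" measures 2 columns under rune_count but 4 under strlen, which is why the stock fmt breaks Cyrillic lines at roughly half the intended width.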