From: Aditya Mahajan
Newsgroups: gmane.comp.tex.context
Subject: Re: counting the words in a TeX document
Date: Mon, 7 Aug 2006 20:49:27 -0400 (EDT)

On Mon, 7 Aug 2006, Mojca Miklavec wrote:

> On 8/7/06, Hans Hagen wrote:
>>
>>> (I'll spare you the fun with sections for some other time,) but since
>>> you reminded me that I might have some questions left, here you have
>>> another one: how do I replace hyphens, en-dashes and em-dashes with
>>> "spaces/line breaks"?
>>>    \catcode`~=13\let~=\space
>>> does what I want, but none of the following works:
>>>    \def\-{\space}
>>>    \def-{\space}
>>>    \let\-=\space
>>>
>> \catcode`-=\active \def-{ }
>
> I tried that one already, but it didn't work. Now I figured out that
> it was because of nesting the definitions (perhaps even some
> interference with negative numbers?), not because of wrong definition
> on itself.
>
> I'm sorry.
>
> Mojca
>
> (But my fear is that the whole problem is too complex anyway (tables,
> ...) to be solved elegantly.)

You should not be writing tables in abstracts!

Here is my attempt. Seems to work correctly for simple text,
references, simple markup etc. Try anything too fancy and you are in
trouble. I changed the name to start stop stats, as I was mistyping
startstatistics :-).

\starttext

\bgroup
\catcode`~=\active
\catcode`-=\active
\gdef\ignorestats%
  {% treat non-breakable space as a normal one
   \catcode`~=\active \let~=\space
   % treat endash, emdash and - as normal space
   \catcode`-=\active \def-{ }
   %\setupframed[align=normal]%Frames do not work correctly
  }

\gdef\startdostats%
  {\bgroup
   \setbox0\vbox\bgroup % \tracingall -)
   \forgetall
   \nohyphens
   \hsize1mm}

\gdef\stopdostats%
  {\egroup
   \newcounter\NOfLines
   \dontcomplain %Why do I still get overfull \hbox warnings
   \beginshapebox
     \unvcopy0
   \endshapebox
   \reshapebox{\doglobal\increment\NOfLines}
   \getnoflines{\ht0}
   \unvbox0 %Uncomment for debug
   \par lines: \the\noflines\space words: \NOfLines\par\egroup}

\long\gdef\startstats#1\stopstats%
  {\bgroup\ignorestats
   \startdostats\scantokens{#1}\stopdostats\egroup}
\egroup

\def\ShowStats#1{\hairline#1\par\startstats#1\stopstats}

\ShowStats{abc~def ghi-jkl -- mno --- prs}
\ShowStats{abc-def -- ghi --- jkl}
\ShowStats{a, b}

\section[a]{one}
\ShowStats{We do some great things in \in{section}[a]}
% I do not know the internals, but section 1 seems unbreakable

\ShowStats{$a=b$}
%What did you expect? It may be possible to treat
%each math token as mathord and allow it to break
%but that will not give any better results.

\startbuffer
This is a test
\stopbuffer

\ShowStats{\getbuffer}
\ShowStats{\startformula a = b + c \stopformula}
\ShowStats{\framed{This is a test}}
\ShowStats{\starthiding Another test \stophiding Does this work?}
% Buffers do not work and fail silently.
\ShowStats{This is {\bf Bold} and {\it Italic}}
\ShowStats{\input tufte}
\stoptext

Aditya