From: Bob Kerstetter
Newsgroups: gmane.comp.tex.context
Subject: Re: Ugly hack for multiple MSWord docs.
Date: Thu, 15 Jun 2006 11:45:35 -0500

On Jun 13, 2006, at 5:29 PM, John R. Culleton wrote:

> Frequently I find myself in the position of needing to combine
> several MSWord and/or rtf documents into a single file for either
> pdftex or Context. I have settled on this strategy.
>
> Someday there will be an elegant solution to the MSWord to
> Context problem. For now there is my ugly hack as described here.

MEMORY DISCLAIMER: In these examples, none of the function names are
the real names used in Word or VB for Word. The functions are
available in VB for Word, but it's been some time since I've done
this. I don't have the macros these days and no longer remember the
real names, so the names below are only representative of the
functions available.

STYLE COMMENT: These methods should work even if styles are not being
used. For example, the primary heading may be Arial, 18pt, bold and
not the Heading 1 style.
That's okay because you can search for font attributes in Word. If
the document is not consistent, well, convert to text and mark it up
manually. :)

MORE OR LESS CURRENT EXAMPLE

It's not particularly elegant, but I used to convert from MSWord to
whatever by writing VB find/replace macros based on styles and
formatting. In newer versions of Word (at least on OS X), Replace has
a function that inserts the text it found, plus you can add other
text around it.

Example:

   Find: %find stuff formatted with heading 1 style
   Replace: \subject{WhatItFound} %replaces what it found and wraps \subject{} around it.

Because Word stores its paragraph formatting in the line
feed/carriage return, for paragraph styles the match includes that
character and you end up with something like this:

   \subject{Some TeX
   }

So my last VB find/replace removes those carriage returns globally:

   Find: ^p}
   Replace: }

When done with all find/replace functions, save as text. That's it.

Not being much of a script writer, I record the first find/replace,
then edit the macro and duplicate the find/replace as needed. The VB
find/replace function has options for starting at the top of the
file, replacing globally, continuing if nothing is found and that
sort of thing. The macro looks something like this:

   Find: %find stuff formatted with heading 1 style
   Replace: \subject{WhatItFound} %replaces what it found and wraps \subject{} around it.

   Find: %find stuff formatted with heading 2 style
   Replace: \subsubject{WhatItFound} %replaces what it found and wraps \subsubject{} around it.

   Find: %find stuff formatted with heading 3 style
   Replace: \subsubsubject{WhatItFound} %replaces what it found and wraps \subsubsubject{} around it.

The above method uses global replacement and it's pretty zippy, for
Word.

ANOTHER OLDER METHOD

Another method I used, before Find/Replace could insert the found
text, was to put the found string into a variable, then use that
variable for the replacement text, plus any TeX control sequences
wrapped around it. In summary:

1. Put your finds and replaces in an array:

   ArrayFind(0) Heading 1; ArrayReplace(0) \subject{
   ArrayFind(1) Heading 2; ArrayReplace(1) \subsubject{
   ArrayFind(2) Heading 3; ArrayReplace(2) \subsubsubject{

   Note the closing } is missing. It is hardcoded in the replacement
   code.

2. Find the first array item starting from the top of the document.
   This highlights the text in Word:

   Find = $ArrayFind(n)

3. Put the highlighted text into a variable. Maybe you can even strip
   the CRs from formatted paragraphs:

   $FoundThisStuff = stripCarriageReturns(CurrentSelection)

4. Put the variable and the first replace item in the Word Replace
   function. Note the hard-coded closing brace, and the CR that is
   added back, assuming you stripped the CR in step 3:

   Replace = $ArrayReplace(n)+$FoundThisStuff+"}"+CR

5. Repeatedly use Replace and Find Next until nothing else is found.

   Replace and Find Next
   . . .

6. Repeatedly find the next array item to the end of the array.

   n = n + 1
   Find = $ArrayFind(n)
   . . .

7. Save the file as text.

   FilesSaveAs using the text option

Hum. After thinking about this and typing it in, maybe I should still
use the OLD method. It appears to be a little easier to manage. Maybe
a lot easier. Oh well, not a real programmer.
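
For anyone who wants to see the logic of the older method without
Word in front of them, here is a minimal sketch in Python. It is not
Word VB and does not touch Word's object model. It assumes a
hypothetical stand-in for the document: a list of (style, text)
paragraphs. The names FIND_REPLACE, strip_carriage_returns and
convert are made up for illustration; only the steps follow the
summary above.

```python
# Hypothetical model of the "older method". A document is a list of
# (style, text) paragraphs; this is an assumption, not Word's API.

# Step 1: find/replace pairs. The closing brace is deliberately left
# out here and added in the replacement code, as in the post.
FIND_REPLACE = [
    ("Heading 1", "\\subject{"),
    ("Heading 2", "\\subsubject{"),
    ("Heading 3", "\\subsubsubject{"),
]


def strip_carriage_returns(text):
    """Step 3 helper: drop the trailing paragraph mark (CR and/or LF)."""
    return text.rstrip("\r\n")


def convert(paragraphs):
    """Wrap every styled heading in its TeX command, return plain text."""
    # Work on a mutable copy so the caller's list is left alone.
    doc = [[style, text] for style, text in paragraphs]

    for find_style, opener in FIND_REPLACE:      # step 6: next array item
        for para in doc:                         # steps 2 and 5: find, find next
            if para[0] != find_style:
                continue
            found = strip_carriage_returns(para[1])   # step 3
            para[1] = opener + found + "}" + "\n"     # step 4: add "}" and CR

    # Step 7: "save as text" is modelled as joining the paragraph texts.
    return "".join(text for _, text in doc)


if __name__ == "__main__":
    sample = [
        ("Heading 1", "Introduction\n"),
        ("Normal", "Some body text.\n"),
        ("Heading 2", "Details\n"),
    ]
    print(convert(sample), end="")
```

Paragraphs whose style is not in the array pass through untouched,
which matches the "continue if nothing is found" behaviour described
above. Matching on font attributes instead of style names would only
change the comparison in the inner loop.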