public inbox archive for pandoc-discuss@googlegroups.com
* Output HTML to DOCx file Tables Not Getting Formatted
@ 2022-07-24 16:32 Michael Becker
From: Michael Becker @ 2022-07-24 16:32 UTC (permalink / raw)
  To: pandoc-discuss


I'm really struggling to get formatted tables in MS Word output from
pandoc. I'm exporting HTML from a tool called Tinderbox, then converting the
clean HTML to a docx file with pandoc. The tables always come through with
NO styling, even after I've modified the default table style in the
reference.docx file.

I've read tons of posts on this, but for the life of me I can't figure it out.
Some suggest editing the Word XML directly, but I'm not sure how to
automate this. I'm on macOS (Apple Silicon). Does anyone have any idea how to
make this work?

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/b92cf565-a834-4ed0-9a2a-edd4d75710b9n%40googlegroups.com.


* Re: Output HTML to DOCx file Tables Not Getting Formatted
@ 2022-07-24 18:46   ` Michael Becker
From: Michael Becker @ 2022-07-24 18:46 UTC (permalink / raw)
  To: pandoc-discuss


Hi there, I have found a way to do this manually with BBEdit. After
exporting the file, I open the docx file with BBEdit (no zip or unzip
required). I can then do a global replace to swap in the table style I
want Word to use, e.g. replace "Table" with "MyTable".

Here is my question: does anyone know of a list of terminal commands that I
could write that would automatically go into the docx file and make the
changes I want? Tinderbox can trigger runCommands. It is not clear to me
what the steps would be to get into the document.xml file, globally replace
the style val, and save it.

Manually, this works every time. Now I'd like to automate the process.

[image: 2022-07-24_11-42-52.png]

[-- Attachment #2: 2022-07-24_11-42-52.png --]
[-- Type: image/png, Size: 289228 bytes --]


* Re: Output HTML to DOCx file Tables Not Getting Formatted
@ 2022-07-24 18:59       ` Leonard Rosenthol
From: Leonard Rosenthol @ 2022-07-24 18:59 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


BBEdit has a command-line interface, so you could potentially automate
the manual steps you performed in it.


* Re: Output HTML to DOCx file Tables Not Getting Formatted
@ 2022-07-24 19:11           ` Michael Becker
From: Michael Becker @ 2022-07-24 19:11 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Thanks. I’m working my way through this. I’m not a coder, so it’s not clear to me yet.


Michael J. Becker
CEO
Identity Praxis, Inc.
michael-QF1XyMwE1Uwv0rdu9s6TydBPR1lH4CV8@public.gmane.org
M: +1-408-242-5733
https://www.linkedin.com/in/privacyshaman/

Check my availability & schedule a meeting with me: 30 MIN or 60 MIN


* Re: Output HTML to DOCx file Tables Not Getting Formatted
@ 2022-07-24 22:20       ` Anton Sharonov
From: Anton Sharonov @ 2022-07-24 22:20 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Michael Becker <michael-QF1XyMwE1Uwv0rdu9s6TydBPR1lH4CV8@public.gmane.org> wrote on Sun, 24 July 2022,
20:46:

> Hi there, I have found a way to do this manually with BBEdit. After
> exporting the file, I open the docx file with BBEdit (no zip or unzip
> required). I can then do a global replace to swap in the table style I
> want Word to use, e.g. replace "Table" with "MyTable".
>
> Here is my question: does anyone know of a list of terminal commands that
> I could write that would automatically go into the docx file and make the
> changes I want?
>

Something like this should do the trick (untested):

unzip -x ~/storage/file-sample_100kB.docx word/document.xml

sed -i -e 's!<w:tblStyle w:val="Table"/>!<w:tblStyle w:val="MyTable"/>!g' word/document.xml

zip -u ~/storage/file-sample_100kB.docx word/document.xml

Best regards, Anton



* Re: Output HTML to DOCx file Tables Not Getting Formatted
@ 2022-07-24 23:05           ` Michael Becker
From: Michael Becker @ 2022-07-24 23:05 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Fantastic. Thanks, I finally got something working. I will report back once I have the complete workflow up and running. Pretty cool!



