public inbox archive for pandoc-discuss@googlegroups.com
* Strip metadata in docx content
@ 2018-06-19 11:38 Jose Costa Teixeira
       [not found] ` <0185bd97-6167-4b60-b2fa-77101454476f-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Jose Costa Teixeira @ 2018-06-19 11:38 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 938 bytes --]

Hi

I am starting to use pandoc, so please forgive me if this is a dumb 
question. 

When I convert an (X)HTML file to docx, the docx writer puts the title in the 
file. How can I avoid that?
I can use the command line to give the title another value (making it 
""), but this still creates an empty line.

Can we strip the metadata from the docx altogether?

Thanks



-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/0185bd97-6167-4b60-b2fa-77101454476f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: Strip metadata in docx content
       [not found] ` <0185bd97-6167-4b60-b2fa-77101454476f-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2018-06-19 12:41   ` mb21
  2018-06-19 14:32   ` Francesco Occhipinti
  1 sibling, 0 replies; 7+ messages in thread
From: mb21 @ 2018-06-19 12:41 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 1125 bytes --]

Setting the metadata variable to false seems to work:

    pandoc -M title=false
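
For a complete invocation, here is a minimal Python sketch of the same idea.
The file names and the helper names `build_command` and `convert` are
placeholders made up for illustration; only `-o` and `-M title=false` are
pandoc's own.

```python
import subprocess


def build_command(source, target):
    """Build a pandoc argument list that sets the title metadata to false."""
    # -M title=false is the setting suggested above; -o names the output file.
    return ["pandoc", source, "-o", target, "-M", "title=false"]


def convert(source, target):
    """Run pandoc; raises CalledProcessError if pandoc exits with an error."""
    subprocess.run(build_command(source, target), check=True)
```

Calling `convert("input.html", "output.docx")` would then run the conversion,
assuming pandoc is on the PATH.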



* Re: Strip metadata in docx content
       [not found] ` <0185bd97-6167-4b60-b2fa-77101454476f-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2018-06-19 12:41   ` mb21
@ 2018-06-19 14:32   ` Francesco Occhipinti
       [not found]     ` <b0ea598a-ec4c-4f4a-b083-59ac31e75b83-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  1 sibling, 1 reply; 7+ messages in thread
From: Francesco Occhipinti @ 2018-06-19 14:32 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 1180 bytes --]


You can also remove the metadata via a filter. It might not be visible in 
some native representations of the document, but it is still there.


* Re: Strip metadata in docx content
       [not found]     ` <b0ea598a-ec4c-4f4a-b083-59ac31e75b83-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2018-06-19 15:01       ` BP Jonsson
       [not found]         ` <CAFC_yuRHzEfXBHCRppLuM2oYzyqMhMZGMK8uX1xkBpX9DjnMEw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: BP Jonsson @ 2018-06-19 15:01 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 3221 bytes --]

I'm AFK right now and don't want to post code that might not work, but
a Lua filter that intercepts the metadata and replaces it with an empty
table will probably work.
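
For reference, here is a minimal sketch of the same idea as a JSON filter in
Python rather than Lua (so it is not the Lua filter described above). It
relies on pandoc handing the document to a filter as JSON on stdin, with the
metadata under the top-level "meta" key, and reading the result back from
stdout. The names `strip_metadata` and `filter_stream` are made up for this
example.

```python
import json
import sys


def strip_metadata(doc):
    """Return a shallow copy of a pandoc JSON document with empty metadata."""
    stripped = dict(doc)
    stripped["meta"] = {}  # drop title, author, date and any other fields
    return stripped


def filter_stream(source, sink):
    """Read a pandoc JSON document from source, write the stripped one to sink."""
    raw = source.read()
    if not raw.strip():
        return  # nothing to do on empty input
    json.dump(strip_metadata(json.loads(raw)), sink)


if __name__ == "__main__":
    filter_stream(sys.stdin, sys.stdout)
```

Saved as strip_meta.py (a hypothetical file name), it could be tried with
something like `pandoc input.html -o output.docx --filter strip_meta.py`.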


* Re: Strip metadata in docx content
       [not found]         ` <CAFC_yuRHzEfXBHCRppLuM2oYzyqMhMZGMK8uX1xkBpX9DjnMEw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-06-20  5:06           ` Jose Costa Teixeira
       [not found]             ` <7123e8e6-0c97-4b3d-b6d4-5920bf7589d5-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Jose Costa Teixeira @ 2018-06-20  5:06 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 3509 bytes --]

Thanks, setting the metadata to false is sufficient. I am not familiar with 
filters, so I will catch up on them when my needs become more advanced.


* Transfer metadata in docx content
       [not found]             ` <7123e8e6-0c97-4b3d-b6d4-5920bf7589d5-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2018-06-23 11:35               ` Giacomo Lanza
       [not found]                 ` <289a35ff-954e-4eb9-9a7d-f6df77f3770e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Giacomo Lanza @ 2018-06-23 11:35 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 1055 bytes --]

Hello,

I have the opposite problem. I _need_ to transfer the metadata in the HTML --> DOCX conversion. According to the manual, this is not implemented. (It only works for the title, because in HTML the title is coded as a <title> element rather than as an attribute of a <meta> element.) Reading your answers has given me back some hope. Does anybody know a way to write the metadata in HTML so that it is transferred (separately from the text)?

Thanks,

Giacomo


* Re: Transfer metadata in docx content
       [not found]                 ` <289a35ff-954e-4eb9-9a7d-f6df77f3770e-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2018-06-29  9:01                   ` John MacFarlane
  0 siblings, 0 replies; 7+ messages in thread
From: John MacFarlane @ 2018-06-29  9:01 UTC (permalink / raw)
  To: Giacomo Lanza, pandoc-discuss


There's an existing issue for this somewhere on the
bug tracker, asking to include more metadata in docx
properties.  You can search for it and comment there.
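
In the meantime, one possible workaround is to read the <meta> elements
yourself and pass them to pandoc as -M options. The sketch below uses
Python's standard html.parser module; `MetaCollector` and `metadata_args`
are names invented for this example, and whether the docx writer does
anything with a given field depends on the pandoc version, so treat that
part as an assumption to check.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta name="..." content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        content = attr_map.get("content")
        # Skip tags such as <meta charset="..."> that carry no name/content pair.
        if name and content is not None:
            self.fields[name] = content


def metadata_args(html_text):
    """Turn the collected <meta> fields into a list of pandoc -M arguments."""
    collector = MetaCollector()
    collector.feed(html_text)
    args = []
    for name, content in collector.fields.items():
        args += ["-M", f"{name}={content}"]
    return args
```

The returned list can be appended to an argument list such as
["pandoc", "input.html", "-o", "output.docx"] before running pandoc.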

