public inbox archive for pandoc-discuss@googlegroups.com
* Reading font information from docx
@ 2021-01-10 20:31 Zev Spitz
       [not found] ` <f7c9c263-8792-44d0-94cb-ddd7c93a2f63n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Zev Spitz @ 2021-01-10 20:31 UTC (permalink / raw)
  To: pandoc-discuss



Is it possible to read the font information from a docx source document in 
a filter?

The source document uses a specific font to identify code blocks.

Alternatively, how does Pandoc identify code blocks within a docx file?

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/f7c9c263-8792-44d0-94cb-ddd7c93a2f63n%40googlegroups.com.


* Re: Reading font information from docx
       [not found] ` <f7c9c263-8792-44d0-94cb-ddd7c93a2f63n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2021-01-11  2:55   ` John MacFarlane
       [not found]     ` <m2zh1g6v34.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: John MacFarlane @ 2021-01-11  2:55 UTC (permalink / raw)
  To: Zev Spitz, pandoc-discuss

Zev Spitz <spitzzev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Is it possible to read the font information from a docx source document in 
> a filter?
>
> The source document uses a specific font to identify code blocks.
>
> Alternatively, how does Pandoc identify code blocks within a docx file?

It looks for the Source Code style on a paragraph.
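The thread does not say whether font information reaches a filter. For anyone who wants to inspect a document directly, here is a minimal sketch in Python (standard library only) that lists each paragraph's style ID and any fonts set directly on its runs. The function name `paragraph_styles_and_fonts` is hypothetical, and the sketch assumes only direct formatting matters: it reads the `w:ascii` attribute of `w:rFonts` on runs and does not resolve fonts inherited from styles or themes.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used in word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def paragraph_styles_and_fonts(docx):
    """Yield (style_id, fonts, text) for each paragraph in a .docx.

    `docx` is a path or a binary file-like object. Only fonts set
    directly on runs are reported (an assumption of this sketch).
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    for para in root.iter(f"{{{W}}}p"):
        # Paragraph style reference, e.g. <w:pStyle w:val="SourceCode"/>
        style = para.find("w:pPr/w:pStyle", NS)
        style_id = style.get(f"{{{W}}}val") if style is not None else None
        # Fonts named on the paragraph's runs, de-duplicated and sorted
        fonts = sorted({
            rfonts.get(f"{{{W}}}ascii")
            for rfonts in para.iterfind("w:r/w:rPr/w:rFonts", NS)
            if rfonts.get(f"{{{W}}}ascii")
        })
        text = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
        yield style_id, fonts, text
```

Calling `list(paragraph_styles_and_fonts("input.docx"))` on a real file shows which paragraphs carry a code-like font but no code style, which are the ones that would need restyling before conversion.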


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Reading font information from docx
       [not found]     ` <m2zh1g6v34.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>
@ 2021-01-13 15:12       ` Zev Spitz
       [not found]         ` <a363d4e0-580f-4661-a82d-22ed5ba53c3en-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Zev Spitz @ 2021-01-13 15:12 UTC (permalink / raw)
  To: pandoc-discuss



Thanks, that does the trick.
Where is this information -- how a given Word style is interpreted as a
Pandoc type -- documented?

On Monday, January 11, 2021 at 4:55:44 AM UTC+2 John MacFarlane wrote:

> Zev Spitz <spit...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>
> > Is it possible to read the font information from a docx source document in
> > a filter?
> >
> > The source document uses a specific font to identify code blocks.
> >
> > Alternatively, how does Pandoc identify code blocks within a docx file?
>
> It looks for the Source Code style on a paragraph.
>


* Re: Reading font information from docx
       [not found]         ` <a363d4e0-580f-4661-a82d-22ed5ba53c3en-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2021-01-13 18:23           ` John MacFarlane
       [not found]             ` <m2ft34wvam.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: John MacFarlane @ 2021-01-13 18:23 UTC (permalink / raw)
  To: Zev Spitz, pandoc-discuss

Zev Spitz <spitzzev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Thanks, that does the trick.
> Where is this information -- how a given Word style is interpreted as a
> Pandoc type -- documented?

Nowhere, I guess! I looked in the source code.
In some cases the docx reader uses some pretty fancy heuristics.



* Re: Reading font information from docx
       [not found]             ` <m2ft34wvam.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>
@ 2021-01-13 23:43               ` Zev Spitz
       [not found]                 ` <e49ac8ec-8a95-4af8-9cfe-92581015d9a8n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Zev Spitz @ 2021-01-13 23:43 UTC (permalink / raw)
  To: pandoc-discuss



Is there logic for identifying inline code as well? Or only code blocks?

On Wednesday, January 13, 2021 at 8:23:28 PM UTC+2 John MacFarlane wrote:

> Zev Spitz <spit...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>
> > Thanks, that does the trick.
> > Where is this information -- how a given Word style is interpreted as a
> > Pandoc type -- documented?
>
> Nowhere, I guess! I looked in the source code.
> In some cases the docx reader uses some pretty fancy heuristics.
>
>


* Re: Reading font information from docx
       [not found]                 ` <e49ac8ec-8a95-4af8-9cfe-92581015d9a8n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2021-01-14 18:55                   ` John MacFarlane
       [not found]                     ` <m2bldrgxg1.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: John MacFarlane @ 2021-01-14 18:55 UTC (permalink / raw)
  To: Zev Spitz, pandoc-discuss


Styled with VerbatimChar
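Once the styles are recognized, inline code and code blocks arrive in the filter as `Code` and `CodeBlock` elements. A minimal sketch of how a filter might find them, written against pandoc's JSON AST with the Python standard library only; the function name `code_elements` is hypothetical, and the element layout assumed here is `{"t": ..., "c": [[id, classes, key-values], text]}`.

```python
def code_elements(doc):
    """Collect (kind, classes, text) for every Code and CodeBlock element.

    `doc` is a pandoc JSON document already parsed into Python objects,
    e.g. json.load(sys.stdin) inside a JSON filter.
    """
    found = []

    def walk(node):
        if isinstance(node, list):
            for item in node:
                walk(item)
        elif isinstance(node, dict):
            if node.get("t") in ("Code", "CodeBlock"):
                # Both element types carry [attr, text]; attr is [id, classes, kvs]
                attr, text = node["c"]
                found.append((node["t"], attr[1], text))
            else:
                for value in node.values():
                    walk(value)

    walk(doc["blocks"])
    return found
```

To try it on a real document, the JSON can be produced with `pandoc -f docx -t json input.docx` and loaded with `json.load`.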

Zev Spitz <spitzzev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Is there logic for identifying inline code as well? Or only code blocks?
>
> On Wednesday, January 13, 2021 at 8:23:28 PM UTC+2 John MacFarlane wrote:
>
>> Zev Spitz <spit...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>
>> > Thanks, that does the trick.
>> > Where is this information -- how a given Word style is interpreted as a
>> > Pandoc type -- documented?
>>
>> Nowhere, I guess! I looked in the source code.
>> In some cases the docx reader uses some pretty fancy heuristics.
>>
>>



* Re: Reading font information from docx
       [not found]                     ` <m2bldrgxg1.fsf-jF64zX8BO08an7k8zZ43ob9bIa4KchGshsV+eolpW18@public.gmane.org>
@ 2021-01-14 20:43                       ` Zev Spitz
  0 siblings, 0 replies; 7+ messages in thread
From: Zev Spitz @ 2021-01-14 20:43 UTC (permalink / raw)
  To: pandoc-discuss



I know nothing of Haskell, but it would seem from the source 
<https://github.com/jgm/pandoc/blob/527346cc7e2bc874092be2f6793001860e10a719/src/Text/Pandoc/Readers/Docx.hs#L214> 
that the character style being searched for is "Verbatim Char". This 
clashes with the comment at line 47 
<https://github.com/jgm/pandoc/blob/527346cc7e2bc874092be2f6793001860e10a719/src/Text/Pandoc/Readers/Docx.hs#L47>.
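One possible explanation, which is an assumption and not confirmed in this thread: a Word style has both a style ID (`w:styleId`, no spaces) and a display name (`w:name`), so "VerbatimChar" and "Verbatim Char" may be the ID and the name of the same style. The following Python sketch (standard library only, hypothetical function name `style_names`) lists both for every style in a document so the two can be compared.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used in word/styles.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def style_names(docx):
    """Map each style ID in a .docx to its (type, display name).

    `docx` is a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/styles.xml"))
    result = {}
    for style in root.iter(f"{{{W}}}style"):
        style_id = style.get(f"{{{W}}}styleId")
        name = style.find("w:name", NS)
        if style_id is not None and name is not None:
            # type is "paragraph", "character", "table" or "numbering"
            result[style_id] = (style.get(f"{{{W}}}type"), name.get(f"{{{W}}}val"))
    return result
```

Running `style_names("input.docx")` on the source document shows which ID and name pair the code styles actually use there.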

On Thursday, January 14, 2021 at 8:55:58 PM UTC+2 John MacFarlane wrote:

>
> Styled with VerbatimChar
>
> Zev Spitz <spit...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>
> > Is there logic for identifying inline code as well? Or only code blocks?
> >
> > On Wednesday, January 13, 2021 at 8:23:28 PM UTC+2 John MacFarlane wrote:
> >
> >> Zev Spitz <spit...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> >>
> >> > Thanks, that does the trick.
> >> > Where is this information -- how a given Word style is interpreted as a
> >> > Pandoc type -- documented?
> >>
> >> Nowhere, I guess! I looked in the source code.
> >> In some cases the docx reader uses some pretty fancy heuristics.
> >>
> >>
> >
>

