public inbox archive for pandoc-discuss@googlegroups.com
* Docx reader ; style picking algorithm
@ 2014-12-30 21:48 Kjetil Flovild-Midtlie
       [not found] ` <4b2f3216-fc6c-4552-8bf0-3cd263ebc143-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Kjetil Flovild-Midtlie @ 2014-12-30 21:48 UTC
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Sometimes authors rename styles to localized variants or other non-standard names.

When I tested, pandoc skipped these styles when sectioning the output.

Could the docx reader check the underlying style/class name if an element has a non-standard style name?

Does Word even keep this 'parent class' info in the docx elements?

(I had a quick look at the docx reader source file. I have also been reading up on Haskell again like mad this Christmas, and sadly I realize I need more time there, and more time reading some raw docx files...)

Kjetil  


* Re: Docx reader ; style picking algorithm
       [not found] ` <4b2f3216-fc6c-4552-8bf0-3cd263ebc143-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2014-12-31 11:20   ` Kjetil Flovild-Midtlie
       [not found]     ` <1bb6000f-1dc7-4a1f-9050-188f03b8d4d9-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2015-01-01 14:50   ` Ghlen Livid
  1 sibling, 1 reply; 5+ messages in thread
From: Kjetil Flovild-Midtlie @ 2014-12-31 11:20 UTC
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


If the docx XML doesn't keep the relation between custom styles and the
defaults, is a "styleMap" the way to go?
And does pandoc support this in any way?

(Kind of like this project does: https://github.com/mwilliamson/mammoth.js)
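
A style map of the kind asked about here could be sketched roughly as below. This is only an illustration in Python: `STYLE_MAP`, `classify` and `to_markdown` are hypothetical names, the style names are invented examples, and none of it is pandoc's or mammoth.js's actual API.

```python
# Hypothetical style map: custom or localized paragraph style names -> semantic role.
# The names on the left are invented examples, not taken from a real document.
STYLE_MAP = {
    "Overskrift 1": ("heading", 1),
    "Overskrift 2": ("heading", 2),
    "Sitat": ("blockquote", None),
}


def classify(style_name, style_map=STYLE_MAP):
    """Return (role, level) for a paragraph style, defaulting to a plain paragraph."""
    return style_map.get(style_name, ("paragraph", None))


def to_markdown(paragraphs, style_map=STYLE_MAP):
    """Render (style_name, text) pairs as Markdown using the style map."""
    blocks = []
    for style_name, text in paragraphs:
        role, level = classify(style_name, style_map)
        if role == "heading":
            blocks.append("#" * level + " " + text)
        elif role == "blockquote":
            blocks.append("> " + text)
        else:
            blocks.append(text)
    return "\n\n".join(blocks)
```

The point of such a map is that the user, not the converter, states which custom style carries which meaning, so nothing has to be guessed from the formatting.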




-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/1bb6000f-1dc7-4a1f-9050-188f03b8d4d9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: Docx reader ; style picking algorithm
       [not found]     ` <1bb6000f-1dc7-4a1f-9050-188f03b8d4d9-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2014-12-31 15:55       ` Mark Szepieniec
  0 siblings, 0 replies; 5+ messages in thread
From: Mark Szepieniec @ 2014-12-31 15:55 UTC
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

No, the pandoc document model does not support style information. You can
always add styling instructions to the template of whatever output format
you are using, for example CSS for html and ebook, LaTeX commands for
LaTeX, etc.



* Docx reader ; style picking algorithm
       [not found] ` <4b2f3216-fc6c-4552-8bf0-3cd263ebc143-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2014-12-31 11:20   ` Kjetil Flovild-Midtlie
@ 2015-01-01 14:50   ` Ghlen Livid
       [not found]     ` <dc85a2ba-1117-4ff3-9b53-edb5b36a9f73-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  1 sibling, 1 reply; 5+ messages in thread
From: Ghlen Livid @ 2015-01-01 14:50 UTC
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

I think what you are referring to is related to https://github.com/jgm/pandoc/issues/1607 and https://github.com/jgm/pandoc/issues/1692; please see the relevant discussions there.
Basically, docx is a mess.


* Re: Docx reader ; style picking algorithm
       [not found]     ` <dc85a2ba-1117-4ff3-9b53-edb5b36a9f73-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2015-01-01 16:05       ` Jesse Rosenthal
  0 siblings, 0 replies; 5+ messages in thread
From: Jesse Rosenthal @ 2015-01-01 16:05 UTC
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Yep -- a few major styles (headers, block quotes) have been addressed in 
the reader to make sure they work in different languages.
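
One way a language-independent check could look, sketched in Python rather than taken from the reader's source: it assumes Word writes a canonical English `w:name` (such as "heading 1") for built-in styles even when the `w:styleId` is localized. The XML fragment and the helper names `style_names` and `heading_level` are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Made-up fragment in the shape of word/styles.xml from a localized Word install.
STYLES_XML = f"""
<w:styles xmlns:w="{W}">
  <w:style w:type="paragraph" w:styleId="Overskrift1">
    <w:name w:val="heading 1"/>
  </w:style>
  <w:style w:type="paragraph" w:styleId="Sitat">
    <w:name w:val="Quote"/>
  </w:style>
</w:styles>
"""


def style_names(styles_xml):
    """Map each paragraph styleId to its w:name value."""
    root = ET.fromstring(styles_xml)
    names = {}
    for style in root.iter(f"{{{W}}}style"):
        if style.get(f"{{{W}}}type") != "paragraph":
            continue
        name = style.find(f"{{{W}}}name")
        if name is not None:
            names[style.get(f"{{{W}}}styleId")] = name.get(f"{{{W}}}val")
    return names


def heading_level(style_id, names):
    """Return the heading level for a styleId, or None if it is not a heading."""
    match = re.fullmatch(r"heading (\d)", names.get(style_id, "").lower())
    return int(match.group(1)) if match else None
```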

As far as the question of user-defined paragraph styles goes -- the main 
to-do now is parsing the style file for paragraph styles similarly to how 
we already do it for character styles. There are additional details and 
complications which make it a bit harder, but it's doable, and just waiting 
on developer time. Eventually, user-defined styles that inherit from some 
set of base semantic styles will work. (This is all in the reader.)

What will probably never work is being able to guess that your user defined 
style for headers which is just big and bold is really for headers. In 
other words, paragraph styles will have to inherit from something semantic, 
and not just be a collection of visual character styles.
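
To make the inheritance idea concrete, here is a rough Python sketch, not the reader's code. In a docx, word/styles.xml records a style's parent in a `w:basedOn` child element, so a user-defined style can be traced back to a base style. The XML fragment, the `SEMANTIC` table and the function names are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Made-up fragment in the shape of word/styles.xml.
STYLES_XML = f"""
<w:styles xmlns:w="{W}">
  <w:style w:type="paragraph" w:styleId="Heading1">
    <w:name w:val="heading 1"/>
  </w:style>
  <w:style w:type="paragraph" w:styleId="ChapterTitle">
    <w:name w:val="Chapter Title"/>
    <w:basedOn w:val="Heading1"/>
  </w:style>
  <w:style w:type="paragraph" w:styleId="BigBold">
    <w:name w:val="Big Bold"/>
    <w:basedOn w:val="Normal"/>
  </w:style>
  <w:style w:type="paragraph" w:styleId="Normal">
    <w:name w:val="Normal"/>
  </w:style>
</w:styles>
"""

# Hypothetical set of base styles that are treated as semantic.
SEMANTIC = {"Heading1": "heading 1", "Quote": "block quote"}


def parents(styles_xml):
    """Map each styleId to the styleId named in its w:basedOn child, if any."""
    root = ET.fromstring(styles_xml)
    result = {}
    for style in root.iter(f"{{{W}}}style"):
        based_on = style.find(f"{{{W}}}basedOn")
        result[style.get(f"{{{W}}}styleId")] = (
            based_on.get(f"{{{W}}}val") if based_on is not None else None
        )
    return result


def semantic_role(style_id, parent_of, semantic=SEMANTIC):
    """Walk up the basedOn chain until a semantic base style is found."""
    seen = set()
    while style_id is not None and style_id not in seen:
        if style_id in semantic:
            return semantic[style_id]
        seen.add(style_id)  # guard against cyclic basedOn chains
        style_id = parent_of.get(style_id)
    return None
```

Here "ChapterTitle" resolves to a heading because it is based on "Heading1", while "BigBold" is based only on "Normal" and resolves to nothing, which is the case described above as impossible to guess.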


