* map p+class and span+class to para and char style names in html to docx, odt, icml and vice versa
@ 2015-09-22 12:40 massifrg
From: massifrg @ 2015-09-22 12:40 UTC (permalink / raw)
To: pandoc-discuss
Hello,
I'm working on documents marked up in XHTML.
I wrote some utilities to convert them to docx, PDF (through PrinceXML or
ConTeXt) and ICML.
Those utilities are far from complete, and I'd like to use Pandoc instead.
It would be great to convert <p class=...> and <span class=...> elements to
the corresponding paragraph and character styles in docx, odt and ICML.
Paragraph styles and character styles are concepts shared by Word,
OpenOffice/LibreOffice Writer and InDesign (among others).
They map well to HTML's p+class and span+class.
In Pandoc, paragraphs lack attributes (see Text.Pandoc.Definition
<http://hackage.haskell.org/package/pandoc-types-1.12.4.4/docs/Text-Pandoc-Definition.html>),
although there is a workaround (see here
<https://groups.google.com/forum/#!searchin/pandoc-discuss/paragraph$20attributes/pandoc-discuss/hmcT7edsHd8/SH-l8AWYiqoJ>).
It would be really useful if Pandoc mapped p+class and span+class elements
to para and char styles with the same name in docx, odt, icml.
What do you think?
I think it should be an option that you can toggle (e.g. "--map-styles"),
something like (or working with) --reference-odt and --reference-docx (and
maybe --reference-icml or --reference-idml in the future), but not limited
to a fixed set of styles.
I don't know how this should be marked up in Markdown, but since it would
be specific to those formats, the Markdown writer could simply ignore the
feature.
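
To make the proposed mapping concrete, here is a minimal Python sketch that
reads an XHTML fragment and collects the paragraph and character style names
implied by p+class and span+class. It is only an illustration, not part of
Pandoc: the class name StyleCollector, the rule "first class wins" and the
sample class "quote" are assumptions made for this example.

```python
from html.parser import HTMLParser


class StyleCollector(HTMLParser):
    """Collect style names implied by <p class=...> and <span class=...>."""

    def __init__(self):
        super().__init__()
        self.para_styles = []  # candidate paragraph style names
        self.char_styles = []  # candidate character style names

    def handle_starttag(self, tag, attrs):
        # Assumption: the first class on the element names the style.
        classes = (dict(attrs).get("class") or "").split()
        if not classes:
            return
        if tag == "p":
            self.para_styles.append(classes[0])
        elif tag == "span":
            self.char_styles.append(classes[0])


collector = StyleCollector()
collector.feed('<p class="quote">A <span class="myStyle">word</span>.</p>')
print(collector.para_styles, collector.char_styles)
```

A writer for docx, odt or ICML would then look these names up among the
styles defined in the reference document.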
--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/6f4a2ed7-3eb3-4f09-8fc2-07c823e62ff2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
* Re: map p+class and span+class to para and char style names in html to docx, odt, icml and vice versa
@ 2015-12-29 15:07 massifrg
From: massifrg @ 2015-12-29 15:07 UTC (permalink / raw)
To: pandoc-discuss
Let me try to rephrase and simplify the question.
Example:
A <span class="myStyle">word</span> with a custom style.
Convert it from markdown to HTML (pandoc -f markdown -t html) and you get:
<p>A <span class="myStyle">word</span> with a custom style.</p>
Convert it from markdown to ICML (pandoc -f markdown -t icml) and you get:
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/Paragraph">
<CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">
<Content>A </Content>
</CharacterStyleRange>
<CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">
<Content>word</Content>
</CharacterStyleRange>
<CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">
<Content> with a custom style.</Content>
</CharacterStyleRange><Br />
</ParagraphStyleRange>
The styled word is put in a CharacterStyleRange of its own, but there's no
trace of the class attribute.
Is there a way to get this:
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/Paragraph">
<CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">
<Content>A </Content>
</CharacterStyleRange>
<CharacterStyleRange AppliedCharacterStyle="$ID/myStyle">
<Content>word</Content>
</CharacterStyleRange>
<CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">
<Content> with a custom style.</Content>
</CharacterStyleRange><Br />
</ParagraphStyleRange>
This way, when you import the ICML into an InDesign document in which
myStyle is already defined as a character style, you get the right formatting.
The same could be done for DOCX and ODT, with reference documents that
contain the styles you need.
I have used the class attribute to carry the style name, but another
attribute could be used: the choice is only a convention.
I think this "style mapping" should be disabled by default and enabled by
a command line option.
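
The requested output can be sketched in a few lines of Python. This is only
an illustration of the mapping, not Pandoc's ICML writer: the function name
runs_to_icml and the (text, style) run representation are hypothetical, and
the "$ID/" and "ParagraphStyle/" prefixes simply mirror the example above.

```python
from xml.sax.saxutils import escape, quoteattr

DEFAULT_CHAR_STYLE = "NormalCharacterStyle"


def runs_to_icml(runs, para_style="Paragraph"):
    """Render (text, style) runs as one ICML ParagraphStyleRange.

    `runs` is a list of (text, style_name_or_None) pairs; None means
    "use the default character style".
    """
    lines = [
        "<ParagraphStyleRange AppliedParagraphStyle=%s>"
        % quoteattr("ParagraphStyle/" + para_style)
    ]
    for text, style in runs:
        applied = "$ID/" + (style or DEFAULT_CHAR_STYLE)
        lines.append(
            "<CharacterStyleRange AppliedCharacterStyle=%s>" % quoteattr(applied)
        )
        lines.append("<Content>%s</Content>" % escape(text))
        lines.append("</CharacterStyleRange>")
    lines[-1] += "<Br />"  # paragraph break, as in the example above
    lines.append("</ParagraphStyleRange>")
    return "\n".join(lines)


print(runs_to_icml([("A ", None), ("word", "myStyle"),
                    (" with a custom style.", None)]))
```

Only the run that carries a style name gets a non-default
AppliedCharacterStyle; everything else stays as Pandoc emits it today.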
* Re: map p+class and span+class to para and char style names in html to docx, odt, icml and vice versa
@ 2015-12-31 17:23 mb21
From: mb21 @ 2015-12-31 17:23 UTC (permalink / raw)
To: pandoc-discuss
So what you're proposing is to extend the functionality described in
https://github.com/jgm/pandoc/issues/2542 to:
- cover not only DOCX, but also ODT and ICML
- cover not only Inlines but also Blocks (i.e. not only "character styles"
but also "paragraph styles")
You're welcome to add your comments to that issue!
You suggest using a Span for Inlines, so using a Div for Blocks would be
consistent. Also, as you mentioned, Para unfortunately doesn't currently
support attributes in Pandoc's AST anyway.
Btw, you can always write your own filter (see
http://pandoc.org/scripting.html) to modify Pandoc's AST and insert, for
example, raw ICML, like:
[RawBlock (Format "icml") "<ParagraphStyleRange ... </ParagraphStyleRange>"]
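
As a rough sketch of such a filter, the Python below walks a JSON-decoded
Pandoc AST and replaces every Span that carries a class with a RawInline of
format "icml". It is a sketch under assumptions, not a tested filter: the
function names are made up for this example, only Str and Space inlines inside
the Span are handled, the first class is taken as the style name, and whether
the ICML writer passes such raw inlines through unchanged should be checked
against your Pandoc version.

```python
import json
from xml.sax.saxutils import escape, quoteattr


def plain_text(inlines):
    """Flatten Str/Space inlines to text; other inline types are ignored."""
    parts = []
    for il in inlines:
        if il["t"] == "Str":
            parts.append(il["c"])
        elif il["t"] == "Space":
            parts.append(" ")
    return "".join(parts)


def span_to_raw_icml(node):
    """Turn a classed Span into a RawInline holding a CharacterStyleRange."""
    (_ident, classes, _kvs), inlines = node["c"]
    style = quoteattr("$ID/" + classes[0])  # assumption: first class wins
    xml = ("<CharacterStyleRange AppliedCharacterStyle=%s>"
           "<Content>%s</Content></CharacterStyleRange>"
           % (style, escape(plain_text(inlines))))
    return {"t": "RawInline", "c": ["icml", xml]}


def walk(node):
    """Recursively rewrite every classed Span in a JSON-decoded AST."""
    if isinstance(node, list):
        return [walk(n) for n in node]
    if isinstance(node, dict):
        if node.get("t") == "Span" and node["c"][0][1]:
            return span_to_raw_icml(node)
        return {k: walk(v) for k, v in node.items()}
    return node


para = {"t": "Para", "c": [
    {"t": "Str", "c": "A"}, {"t": "Space"},
    {"t": "Span", "c": [["", ["myStyle"], []], [{"t": "Str", "c": "word"}]]},
]}
print(json.dumps(walk(para)))
```

Used as a real filter, the same walk would be applied to the JSON document
read from standard input and the result written to standard output.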
* Re: map p+class and span+class to para and char style names in html to docx, odt, icml and vice versa
@ 2016-01-02 14:28 massifrg
From: massifrg @ 2016-01-02 14:28 UTC (permalink / raw)
To: pandoc-discuss
Thank you for the answer and the links, mb21.
When I have something to contribute, I'll add it to issue 2542.
I'll follow these guidelines:
- use the current AST (even if p+attrs would map better than div+attrs to
paragraph styles)
- follow jgm's comments on issue 2542 (map only to existing styles and
"style-" prefix)
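
One possible reading of those two guidelines, sketched in Python. The details
of issue 2542 are not reproduced in this thread, so the rule below is an
assumption: a class such as "style-myStyle" selects the style "myStyle", and
only if that style already exists in the reference document. The function
name resolve_style and the known_styles set are hypothetical.

```python
STYLE_PREFIX = "style-"


def resolve_style(classes, known_styles, default=None):
    """Return the first existing style named by a "style-"-prefixed class."""
    for cls in classes:
        if cls.startswith(STYLE_PREFIX):
            name = cls[len(STYLE_PREFIX):]
            if name in known_styles:  # map only to existing styles
                return name
    return default


known = {"myStyle", "Emphasis"}
print(resolve_style(["style-myStyle"], known))  # myStyle
print(resolve_style(["style-unknown"], known))  # None
print(resolve_style(["myStyle"], known))        # None
```

Classes without the prefix, or naming a style that is not defined, fall back
to the default, so existing documents would convert exactly as before.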