* Normalizing spaces in italics
From: r.d.go...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org @ 2022-07-01 16:36 UTC
To: pandoc-discuss
I am a bit sloppy when typing italics in my word processor, and generally only
turn off the italics after I hit the space at the end of the word, so I end
up with Markdown output that looks like this (when I convert from RTF to
Markdown):
Strictly speaking the qualities that are imposed by the *logos *of a
certain thing are the *activities *of the *logos*
This looks ugly when I open it in Emacs and similar editors. I can fix these with a
regex replace in Emacs, but I thought pandoc had normalization by default now,
which is supposed to fix these kinds of stylistic errors? I tried passing
the Markdown through pandoc again, to generate Markdown, but it made no
difference.
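For readers who want to script the regex replacement mentioned above rather than run it by hand in an editor, here is a minimal Python sketch. It is not from the original message: the pattern and the name `fix_trailing_space` are illustrative assumptions, and it only handles single-asterisk emphasis whose closing `*` is immediately followed by the next word, as in the sample text.

```python
import re

# Single-asterisk emphasis whose closing "*" is preceded by stray spaces
# and immediately followed by the next word, e.g. "*logos *of".
# The lookbehind and the [^\s*] classes keep "**bold**" markers out of scope.
TRAILING_SPACE_EMPH = re.compile(r"(?<!\*)\*([^\s*][^*]*?) +\*(?=[^\s*])")


def fix_trailing_space(text: str) -> str:
    """Move spaces from just inside a closing '*' to just outside it."""
    return TRAILING_SPACE_EMPH.sub(r"*\1* ", text)


sample = (
    "Strictly speaking the qualities that are imposed by the *logos *of a\n"
    "certain thing are the *activities *of the *logos*"
)
print(fix_trailing_space(sample))
```

A regex like this works on the Markdown text only and knows nothing about code spans, escaped asterisks, or other emphasis syntax, so it is best treated as a quick cleanup for simple prose.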
--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/bd84993b-b1cd-4128-aab2-ce1eff2c9768n%40googlegroups.com.
* Re: Normalizing spaces in italics
From: BPJ @ 2022-07-02 8:49 UTC
To: pandoc-discuss
I use this Lua filter to clean up when I convert from DOCX.
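``````lua
-- Move a trailing space out of an inline element,
-- so that `*logos *of` becomes `*logos* of`.
local function handler (elem)
  -- Get the length of the content
  -- (declared local so it does not leak into the global environment)
  local len = #elem.content
  -- Check that the content isn't empty
  if 0 < len then
    -- Is the last child a space?
    if 'Space' == elem.content[len].tag then
      -- Remove the space (last child)
      elem.content:remove()
      -- Return a space *after* the element
      return { elem, pandoc.Space() }
    end
  end
  -- Returning nil leaves the element unchanged
  return nil
end

-- Apply the same handler to every inline type that wraps other inlines
return {
  {
    Emph = handler,
    Strong = handler,
    Strikeout = handler,
    SmallCaps = handler,
    Underline = handler,
    Span = handler,
    Link = handler,
  }
}
``````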
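What the filter does to the document tree can be illustrated outside pandoc. The following is a minimal Python sketch, not part of the original message, that applies the same idea to a fragment of pandoc's JSON representation, where each inline is an object with a `t` (type) field and, for most types, a `c` (content) field. The names `SIMPLE_WRAPPERS` and `move_trailing_space` are hypothetical, and the sketch only covers wrapper types whose `c` is directly the list of children; `Span` and `Link` also carry attributes and are left out here.

```python
# Inline types whose JSON "c" field is directly a list of child inlines.
SIMPLE_WRAPPERS = {"Emph", "Strong", "Strikeout", "SmallCaps", "Underline"}


def move_trailing_space(inlines):
    """Return a new inline list with a trailing Space moved out of each wrapper."""
    result = []
    for node in inlines:
        if node.get("t") in SIMPLE_WRAPPERS:
            # Handle nested wrappers first, so a space can bubble outwards.
            children = move_trailing_space(node["c"])
            if children and children[-1].get("t") == "Space":
                # Drop the last child and emit a Space after the wrapper instead.
                result.append({"t": node["t"], "c": children[:-1]})
                result.append({"t": "Space"})
                continue
            node = {"t": node["t"], "c": children}
        result.append(node)
    return result


# "the *logos *of" as a list of inlines
before = [
    {"t": "Str", "c": "the"},
    {"t": "Space"},
    {"t": "Emph", "c": [{"t": "Str", "c": "logos"}, {"t": "Space"}]},
    {"t": "Str", "c": "of"},
]
after = move_trailing_space(before)
print(after)
```

In the result the `Emph` node holds only `logos`, and the `Space` sits between it and `of`, which is why the Markdown writer can then emit `*logos* of`.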
* Re: Normalizing spaces in italics
From: r.d.go...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org @ 2022-07-02 21:13 UTC
To: pandoc-discuss
It works perfectly! Thanks, that saved me a lot of manual fixing of files.
* Re: Normalizing spaces in italics
From: John MacFarlane @ 2022-07-05 8:26 UTC
To: BPJ, pandoc-discuss
Might be good to build this into the docx reader.