* convert Word docx file containing Chinese and English to Markdown
@ 2018-01-10 6:17 Philip Lee
From: Philip Lee @ 2018-01-10 6:17 UTC (permalink / raw)
To: pandoc-discuss
I want to convert a docx file to Markdown.
The *a.docx* file contains the following content:
这条性质是由德国数学家戴德金(Richard Dedekind
> )提出的,他认为这条性质是一个明显的事实,无需也无法被证明,它能够刻画直线的连续性,它是直线之所以连续的本质表现,应将其看作一条公理,
> 可称其为直线连续性公理(line continuity axiom)。
After conversion, I got the following Markdown source:
这条性质是由德国数学家戴德金(Richard
>
> Dedekind)提出的,他认为这条性质是一个明显的事实,无需也无法被证明,它能够刻画直线的连续性,它是直线之所以连续的本质表现,应将其看作一条公理,可称其为直线连续性公理(line
> continuity axiom)。
However, I don't want line breaks between English words. That is, I
expect the following result:
这条性质是由德国数学家戴德金(Richard
> Dedekind)提出的,他认为这条性质是一个明显的事实,无需也无法被证明,它能够刻画直线的连续性,它是直线之所以连续的本质表现,应将其看作一条公理,可称其为直线连续性公理(line continuity
> axiom)。
*Can anyone help resolve this issue?*
I know that the extensions ignore_line_breaks and east_asian_line_breaks
might help, but I cannot figure out how to use them. I tried
pandoc -s a.docx -t markdown+east_asian_line_breaks -o a.md
but it didn't work.
--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe@googlegroups.com.
To post to this group, send email to pandoc-discuss@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/f3990958-7004-4c1c-809e-7076c65aaaee%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
* Re: convert Word docx file containing Chinese and English to Markdown
@ 2018-01-10 12:23 ` Jesse Rosenthal
From: Jesse Rosenthal @ 2018-01-10 12:23 UTC (permalink / raw)
To: Philip Lee, pandoc-discuss
Can you post the a.docx file you were using as input? I could try to
take a look at this.
* Re: convert Word docx file containing Chinese and English to Markdown
@ 2018-01-10 13:09 ` Philip Lee
From: Philip Lee @ 2018-01-10 13:09 UTC (permalink / raw)
To: pandoc-discuss
Thanks. Please download it
from https://drive.google.com/file/d/1L0zd1xlw9Vk8JL46feYqZ4x28sHhW8kF/view?usp=sharing
* Re: convert Word docx file containing Chinese and English to Markdown
@ 2018-01-10 13:29 ` Jesse Rosenthal
From: Jesse Rosenthal @ 2018-01-10 13:29 UTC (permalink / raw)
To: Philip Lee, pandoc-discuss
Try it without line-wrapping:
pandoc a.docx -t markdown --wrap=none
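Some context on why this helps: by default pandoc wraps its output at a fixed width (`--wrap=auto`, 72 columns unless `--columns` is given), and the break falls on a space, which in mixed Chinese/English text usually means between two English words. `--wrap=none` turns the wrapping off, so each paragraph is written as one line. As far as I understand the manual, `east_asian_line_breaks` changes how the Markdown *reader* treats newlines between East Asian wide characters, so adding it to the output format with `-t` would not be expected to change the wrapping. A minimal sketch that writes the result to a file follows. The file names `a.docx` and `a.md` come from the original post; the guard around the call is my own addition, there only so the snippet is safe to run where pandoc or the input file is missing.

```shell
#!/bin/sh
# File names taken from the original post; adjust as needed.
input="a.docx"
output="a.md"

# --wrap=none: do not re-wrap paragraphs, so no newline is inserted
# between English words. -o writes to a file instead of stdout.
cmd="pandoc $input -t markdown --wrap=none -o $output"

# Assumption: only run pandoc if it is installed and the input exists;
# otherwise just show the command. $cmd is left unquoted on purpose so
# that it splits into arguments (fine for file names without spaces).
if command -v pandoc >/dev/null 2>&1 && [ -f "$input" ]; then
    $cmd
else
    echo "would run: $cmd"
fi
```

If the source line breaks should be kept where the input format has them, `--wrap=preserve` is the other documented value, though for docx input `--wrap=none` is probably the one you want.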
* Re: convert Word docx file containing Chinese and English to Markdown
@ 2018-01-10 17:08 ` Philip Lee
From: Philip Lee @ 2018-01-10 17:08 UTC (permalink / raw)
To: pandoc-discuss
Awesome! Thank you very much!