public inbox archive for pandoc-discuss@googlegroups.com
* Question/Feature-request: preserve tabs in normal text
@ 2019-04-18 11:52 Jérémie Wenger
       [not found] ` <ba9f5cda-a3b3-49b4-9408-e982975a160d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: Jérémie Wenger @ 2019-04-18 11:52 UTC (permalink / raw)
  To: pandoc-discuss



Dear all,

I have a slightly unusual request: I have been wondering whether it is 
possible to make Pandoc preserve tabs in regular text (that is, not simply 
erase them; converting them to spaces would be fine), e.g. when reading 
.odt files. I know this feature exists for code blocks, but in my case I 
have a rather large number of experimental texts in .odt format that use tabs 
to lay out text in a specific way, and I would like to be able to convert them 
to other formats (plain text or Markdown would be a good start). So far I 
could not find any such feature.

What I could find is this: the LibreOffice CLI preserves tabs when exporting to 
txt, but discards all other information (e.g. what would be *italic* and 
**bold** in Markdown), whereas ideally I would keep *both* 
this type of information and the tabs (which I could then batch-convert to 
something else, such as non-breaking spaces). I have been 
working on a script that uses LibreOffice for the tabs and pandoc for 
the rest, and automates a merge between the two, but this has proved 
fairly tricky and still does not work in all cases.

Does anyone know if there is a way of preserving tabs and multiple 
consecutive spaces using Pandoc?

Many thanks,
Jeremie

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/ba9f5cda-a3b3-49b4-9408-e982975a160d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: Question/Feature-request: preserve tabs in normal text
       [not found] ` <ba9f5cda-a3b3-49b4-9408-e982975a160d-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2019-04-18 16:55   ` John MacFarlane
       [not found]     ` <yh480kv9zbdzir.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: John MacFarlane @ 2019-04-18 16:55 UTC (permalink / raw)
  To: Jérémie Wenger, pandoc-discuss


It isn't possible to retain this whitespace in ODT -> anything
conversions; the ODT reader collapses it into a
Pandoc Space element.

Your best bet, I think, would be to do a batch
search-and-replace using LibreOffice, replacing
literal tabs with some Unicode character that doesn't occur
anywhere else in the text (like an arrow).  Then you'll get this
as a literal character in the pandoc AST, and you can
run a Lua filter to convert arrows in Str elements
into tabs.
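
The filter step described above could be sketched roughly as follows.
Note the swap: this is a Python JSON filter rather than the Lua filter
suggested, built on the same idea of rewriting Str elements. The arrow
U+2192 as placeholder, the function names, and the script name
`arrows_to_tabs.py` are assumptions made for illustration; how a given
output format renders a tab inside a Str has not been checked here.

```python
#!/usr/bin/env python3
"""Sketch of a pandoc JSON filter: placeholder character -> tab in Str elements."""
import json
import sys

PLACEHOLDER = "\u2192"  # assumption: RIGHTWARDS ARROW was substituted for tabs
REPLACEMENT = "\t"


def replace_placeholder(node):
    """Walk the pandoc JSON AST in place and rewrite the text of Str elements.

    Only nodes of the form {"t": "Str", "c": "<text>"} are changed, so the
    text of Code or CodeBlock elements is left untouched.
    """
    if isinstance(node, list):
        for item in node:
            replace_placeholder(item)
    elif isinstance(node, dict):
        if node.get("t") == "Str" and isinstance(node.get("c"), str):
            node["c"] = node["c"].replace(PLACEHOLDER, REPLACEMENT)
        else:
            for value in node.values():
                replace_placeholder(value)
    return node


def main():
    """Read a JSON AST from stdin and write the modified AST to stdout."""
    doc = json.load(sys.stdin.buffer)  # pandoc emits UTF-8 encoded JSON
    json.dump(replace_placeholder(doc), sys.stdout)


# When saving this as a standalone filter script, end the file with: main()
```

Assuming the file is saved as `arrows_to_tabs.py` and ends with a call
to `main()`, it could be used in a pipeline such as
`pandoc input.odt -t json | python3 arrows_to_tabs.py | pandoc -f json -t plain -o output.txt`.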




* Re: Question/Feature-request: preserve tabs in normal text
       [not found]     ` <yh480kv9zbdzir.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
@ 2019-04-18 20:34       ` Jérémie Wenger
  0 siblings, 0 replies; 3+ messages in thread
From: Jérémie Wenger @ 2019-04-18 20:34 UTC (permalink / raw)
  To: pandoc-discuss



Thanks a lot for this, very helpful, that makes sense! I will try this 
then. Best, J

