9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] MS office XML to txt/troff
@ 2013-02-26 11:26 Steve Simon
  2013-03-02 20:53 ` erik quanstrom
  0 siblings, 1 reply; 2+ messages in thread
From: Steve Simon @ 2013-02-26 11:26 UTC (permalink / raw)
  To: 9fans

New toys in my contrib to convert modern
microsoft office XML files to text or troff/tbl source.

these live in a directory opc as the standard is known as Open
Packaging Conventions and there may be more tools to come.

docx2troff works pretty well, the formatting is imperfect but
looks OK, embedded drawings are ignored (sorry, too hard).
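
(illustration only, not taken from opc.tgz: a minimal Python sketch of
the underlying idea. it assumes the usual package layout, where a .docx
is a zip archive whose main part is word/document.xml; strictly the part
should be located through the package relationships. the function name
docx_paragraphs is made up for this example. formatting, tables and
drawings are all ignored.)

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the w:p (paragraph) and w:t (text) elements.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(source):
    """Return the plain text of each paragraph in a .docx package.

    `source` is a path or a binary file-like object. Assumes the main
    document part is stored at word/document.xml (the conventional name).
    """
    with zipfile.ZipFile(source) as pkg:
        root = ET.fromstring(pkg.read("word/document.xml"))

    paragraphs = []
    for p in root.iter(W + "p"):
        # A paragraph's text is split across runs; join every w:t inside it.
        paragraphs.append("".join(t.text or "" for t in p.iter(W + "t")))
    return paragraphs
```

calling docx_paragraphs("report.docx") returns one string per paragraph,
which a real converter would then wrap in troff requests.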

xlsx2txt works fine for text output, but custom number formats are
not handled, which is disappointing - this means it works fine
for most documents but "clever" spreadsheets can cause problems.
This may get fixed one day - feel free if you want to try.
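
(illustration only, not part of xlsx2txt: a rough Python sketch of what
handling a number format string involves. it covers just a tiny subset,
namely 0 or #,##0 with optional fixed decimals and an optional trailing
percent sign, and falls back to the raw value for everything else
(dates, multiple ;-separated sections, colours, conditions). the name
format_number is made up for this example.)

```python
import re

# Supported subset: "0", "0.00", "#,##0", "#,##0.00", each optionally followed by "%".
_SIMPLE = re.compile(r"^(#,##0|0)(\.(0+))?(%?)$")


def format_number(value, fmt):
    """Render `value` using a small subset of spreadsheet number formats.

    Unsupported format strings fall back to str(value).
    Note: rounding follows Python's format(), which can differ from the
    spreadsheet's own rounding on exact halves.
    """
    m = _SIMPLE.match(fmt)
    if m is None:
        return str(value)

    integer, _, zeros, percent = m.groups()
    if percent:
        value *= 100  # a percent format displays the value scaled by 100
    grouping = "," if integer == "#,##0" else ""
    decimals = len(zeros or "")
    return format(value, f"{grouping}.{decimals}f") + percent
```

for example format_number(1234.5, "#,##0.00") gives "1,234.50" and
format_number(0.125, "0.0%") gives "12.5%", while a date format such as
"yyyy-mm-dd" is left alone and the raw value is printed.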

code in /n/sources/contrib/steve/opc.tgz and depends on
/n/sources/contrib/steve/libxml.tgz

fixes and extensions gratefully received. please don't reformat
the code without contacting me first.

-Steve




* Re: [9fans] MS office XML to txt/troff
  2013-02-26 11:26 [9fans] MS office XML to txt/troff Steve Simon
@ 2013-03-02 20:53 ` erik quanstrom
  0 siblings, 0 replies; 2+ messages in thread
From: erik quanstrom @ 2013-03-02 20:53 UTC (permalink / raw)
  To: 9fans

On Tue Feb 26 06:26:55 EST 2013, steve@quintile.net wrote:
> New toys in my contrib to convert modern
> microsoft office XML files to text or troff/tbl source.
>
> these live in a directory opc as the standard is known as Open
> Packaging Conventions and there may be more tools to come.
>
> docx2troff works pretty well, the formatting is imperfect but
> looks OK, embedded drawings are ignored (sorry, too hard).
>
> xlsx2txt works fine for text output, but custom number formats are
> not handled, which is disappointing - this means it works fine
> for most documents but "clever" spreadsheets can cause problems.
> This may get fixed one day - feel free if you want to try.
>
> code in /n/sources/contrib/steve/opc.tgz and depends on
> /n/sources/contrib/steve/libxml.tgz
>
> fixes and extensions gratefully received. please don't reformat
> the code without contacting me first.

this has been included in 9atom.

it works pretty well for my purposes, and has reduced the need
to switch to google docs for much of anything.

it would be nice to have a troff2docx as well.  i've recommended
parsing excel format strings as a gsoc project.

- erik


