ntg-context - mailing list for ConTeXt users
* wikimedia2context: any existing solutions?
@ 2011-03-30 14:47 Mojca Miklavec
  2011-03-30 15:16 ` Khaled Hosny
  0 siblings, 1 reply; 7+ messages in thread
From: Mojca Miklavec @ 2011-03-30 14:47 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Hello,

Before I start reinventing the wheel ... I have a feeling that some
people were already doing some basic wikimedia2context syntax
conversion.

I would like to create a PDF from some wiki pages that use a very
limited set of markup commands. I have written a simple Ruby script that
fetches all the content that I want in the final PDF; all that is
left to be done is the conversion from wiki to TeX syntax:
- replace =...= with \section{...}, ==...== with \subsection{...},
===...=== with \subsubsection{...}, ...
- replace ''...'' with {\it ...}, '''...''' with {\bf ...},
'''''...''''' with {\bi ...}
- all lines starting with a space should be printed verbatim
- lines starting with * should be bulleted itemize
- lines starting with # should be numbered itemize
- some trivial replacements like >
- some links: [[abc def]] should become links to the beginning of
the sections with that title
- [[Image:chap1-f2.jpg|frame|Figure 1.2: Cylindrical scanner]] should
become \placefigure{Cylindrical
scanner}{\externalfigure[chap1-f2.jpg]}
- a few tables

Maybe there is more, but I think that this covers the majority of contents.
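
The heading, emphasis, link and figure rules listed above can be sketched
with a few regular expressions. The Python below is only a minimal
illustration, not an existing tool: the function names, the `sec:` label
scheme and the choice of \about[...] for internal links are assumptions
made for this example, and the quote handling assumes the usual MediaWiki
convention ('' for italic, ''' for bold).

```python
import re

# Number of '=' signs mapped to the ConTeXt sectioning command.
HEADINGS = {1: "section", 2: "subsection", 3: "subsubsection"}


def label(title: str) -> str:
    """Build a reference label such as 'sec:abc-def' (hypothetical scheme)."""
    return "sec:" + re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def convert_heading(line: str) -> str:
    """Turn '== Title ==' into '\\subsection[sec:title]{Title}'."""
    match = re.fullmatch(r"(={1,3})\s*([^=\s].*?)\s*\1\s*", line)
    if not match:
        return line  # not a heading (or deeper than three levels)
    command = HEADINGS[len(match.group(1))]
    title = match.group(2)
    return "\\%s[%s]{%s}" % (command, label(title), title)


def convert_image(match: "re.Match[str]") -> str:
    """[[Image:file|options|Figure 1.2: Caption]] -> \\placefigure{...}{...}."""
    parts = match.group(1).split("|")
    filename = parts[0]
    caption = parts[-1] if len(parts) > 1 else ""
    # Drop a leading 'Figure 1.2: ' because ConTeXt numbers figures itself.
    caption = re.sub(r"^Figure\s+[\d.]+:\s*", "", caption)
    return "\\placefigure{%s}{\\externalfigure[%s]}" % (caption, filename)


def convert_inline(text: str) -> str:
    """Convert images, internal links and quote-based emphasis in one line."""
    text = re.sub(r"\[\[Image:([^\]]+)\]\]", convert_image, text)
    text = re.sub(
        r"\[\[([^\]|]+)\]\]",
        lambda m: "\\about[%s]" % label(m.group(1)),
        text,
    )
    # Longest quote runs first, so ''''' is not consumed as ''' plus ''.
    text = re.sub(r"'''''(.+?)'''''", r"{\\bi \1}", text)
    text = re.sub(r"'''(.+?)'''", r"{\\bf \1}", text)
    text = re.sub(r"''(.+?)''", r"{\\it \1}", text)
    return text


if __name__ == "__main__":
    print(convert_heading("== Getting started =="))
    print(convert_inline("see [[abc def]] for ''details''"))
```

Labelling every heading lets a [[...]] link resolve by title alone, which
is why the heading and link rules share the same `label` helper.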

The solution doesn't have to be too robust and I don't care what
language it is written in (I just need a printed manual and I have no
problem manually fixing the pitfalls after the conversion if needed).
I can start writing regular expressions, but if somebody has
an almost-ready-to-use solution, that would be much better than doing
everything from scratch. (A Lua function that would simply read in a
plain wiki file would be nice, but I have never tried to gain a deep
understanding of "parsing" in Lua.)
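
For the line-oriented rules (verbatim lines, bulleted and numbered lists),
a small state machine that reads a plain wiki file line by line may be
enough. The sketch below is in Python rather than Lua and handles flat
lists only; the names `classify` and `convert_blocks` are made up for this
example, and nested markers such as `**` are not treated specially.

```python
from typing import Iterable, List

# Block kind mapped to the ConTeXt environment that wraps it.
OPENERS = {
    "bullet": "\\startitemize",
    "number": "\\startitemize[n]",
    "verbatim": "\\starttyping",
}
CLOSERS = {
    "bullet": "\\stopitemize",
    "number": "\\stopitemize",
    "verbatim": "\\stoptyping",
}


def classify(line: str) -> str:
    """Decide which kind of block a wiki line belongs to."""
    if line.startswith("*"):
        return "bullet"
    if line.startswith("#"):
        return "number"
    if line.startswith(" "):
        return "verbatim"
    return "text"


def convert_blocks(lines: Iterable[str]) -> List[str]:
    """Wrap runs of list or verbatim lines in ConTeXt start/stop pairs."""
    out: List[str] = []
    current = "text"
    for raw in lines:
        line = raw.rstrip("\n")
        kind = classify(line)
        if kind != current:
            # Close the block we are leaving, then open the new one.
            if current != "text":
                out.append(CLOSERS[current])
            if kind != "text":
                out.append(OPENERS[kind])
            current = kind
        if kind in ("bullet", "number"):
            out.append("\\item " + line[1:].strip())
        elif kind == "verbatim":
            out.append(line[1:])  # drop only the marker space
        else:
            out.append(line)
    if current != "text":
        out.append(CLOSERS[current])  # input ended inside a block
    return out


if __name__ == "__main__":
    sample = ["intro", "* a", "* b", "# one", " code", "end"]
    print("\n".join(convert_blocks(sample)))
```

Because `convert_blocks` accepts any iterable of lines, an open file
object can be passed to it directly, for example
`convert_blocks(open("page.wiki", encoding="utf-8"))`, where `page.wiki`
is a hypothetical file name.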

Thanks a lot,
    Mojca
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________



* Re: wikimedia2context: any existing solutions?
  2011-03-30 14:47 wikimedia2context: any existing solutions? Mojca Miklavec
@ 2011-03-30 15:16 ` Khaled Hosny
  2011-03-30 17:32   ` Mojca Miklavec
  0 siblings, 1 reply; 7+ messages in thread
From: Khaled Hosny @ 2011-03-30 15:16 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Wed, Mar 30, 2011 at 04:47:07PM +0200, Mojca Miklavec wrote:
> Hello,
> 
> Before I start reinventing the wheel ... I have a feeling that some
> people were already doing some basic wikimedia2context syntax
> conversion.
> 
> I would like to create a PDF from some wiki pages that use a very
> limited set of markup commands. I have written a simple Ruby script that
> fetches all the content that I want in the final PDF; all that is
> left to be done is the conversion from wiki to TeX syntax:
> - replace =...= with \section{...}, ==...== with \subsection{...},
> ===...=== with \subsubsection{...}, ...
> - replace ''...'' with {\it ...}, '''...''' with {\bf ...},
> '''''...''''' with {\bi ...}
> - all lines starting with a space should be printed verbatim
> - lines starting with * should be bulleted itemize
> - lines starting with # should be numbered itemize
> - some trivial replacements like >
> - some links: [[abc def]] should become links to the beginning of
> the sections with that title
> - [[Image:chap1-f2.jpg|frame|Figure 1.2: Cylindrical scanner]] should
> become \placefigure{Cylindrical
> scanner}{\externalfigure[chap1-f2.jpg]}
> - a few tables

If you are comfortable with writing a PEG grammar (I'm not), writing a
mediawiki parser for lunamark[1] might be a good choice; it already has
a ConTeXt writer (and a markdown parser).

I bet pandoc has mediawiki support as well, so you may try it.

[1] https://github.com/jgm/lunamark

Regards,
 Khaled

-- 
 Khaled Hosny
 Egyptian
 Arab

* Re: wikimedia2context: any existing solutions?
  2011-03-30 15:16 ` Khaled Hosny
@ 2011-03-30 17:32   ` Mojca Miklavec
  2011-03-30 17:38     ` Aditya Mahajan
                       ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Mojca Miklavec @ 2011-03-30 17:32 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Wed, Mar 30, 2011 at 17:16, Khaled Hosny wrote:
> On Wed, Mar 30, 2011 at 04:47:07PM +0200, Mojca Miklavec wrote:
>
> If you are comfortable with writing a PEG grammar (I'm not), writing a
> mediawiki parser for lunamark[1] might be a good choice; it already has
> a ConTeXt writer (and a markdown parser).

This seems like a very reasonable solution; however, it will take too
long before I understand LPEG well enough to write some useful code.

> I bet pandoc has mediawiki support as well, so you may try it.

I started installing it, but then realized that it only supports
output to mediawiki, no input.

It seems that writing my own parser (a few regular expressions in
a language other than Lua) will probably be the fastest solution after
all.

Mojca

* Re: wikimedia2context: any existing solutions?
  2011-03-30 17:32   ` Mojca Miklavec
@ 2011-03-30 17:38     ` Aditya Mahajan
  2011-03-30 18:44     ` Khaled Hosny
  2011-04-03 11:48     ` R. Ermers
  2 siblings, 0 replies; 7+ messages in thread
From: Aditya Mahajan @ 2011-03-30 17:38 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Wed, 30 Mar 2011, Mojca Miklavec wrote:

> On Wed, Mar 30, 2011 at 17:16, Khaled Hosny wrote:
>> On Wed, Mar 30, 2011 at 04:47:07PM +0200, Mojca Miklavec wrote:
>>
>> If you are comfortable with writing a PEG grammar (I'm not), writing a
>> mediawiki parser for lunamark[1] might be a good choice; it already has
>> a ConTeXt writer (and a markdown parser).
>
> This seems like a very reasonable solution; however, it will take too
> long before I understand LPEG well enough to write some useful code.
>
>> I bet pandoc has mediawiki support as well, so you may try it.
>
> I started installing it, but then realized that it only supports
> output to mediawiki, no input.
>
> It seems that writing my own parser (a few regular expressions in
> a language other than Lua) will probably be the fastest solution after
> all.

Why not work with the HTML output instead? It is easier to convert HTML to
ConTeXt (using either the built-in XML parser or pandoc).
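
The pandoc route could be driven from a script roughly as follows. This is
a sketch under assumptions: it expects a `pandoc` executable on the PATH,
and the helper names `pandoc_command` and `to_context` are invented for
this example. Only pandoc's `--from` and `--to` options and its `html`
and `context` format names are relied upon.

```python
import subprocess
from typing import List


def pandoc_command(source_format: str = "html") -> List[str]:
    """Build the pandoc invocation that reads stdin and writes ConTeXt."""
    return ["pandoc", "--from=" + source_format, "--to=context"]


def to_context(markup: str, source_format: str = "html") -> str:
    """Pipe markup through pandoc and return the ConTeXt source it emits.

    Raises subprocess.CalledProcessError if pandoc exits with an error.
    """
    result = subprocess.run(
        pandoc_command(source_format),
        input=markup,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Requires pandoc to be installed; prints a ConTeXt fragment.
    print(to_context("<h1>Title</h1><p>Some <b>bold</b> text.</p>"))
```

In pandoc versions that ship a `mediawiki` reader (not the version
discussed in this thread), passing `source_format="mediawiki"` would skip
the HTML detour.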

Aditya

* Re: wikimedia2context: any existing solutions?
  2011-03-30 17:32   ` Mojca Miklavec
  2011-03-30 17:38     ` Aditya Mahajan
@ 2011-03-30 18:44     ` Khaled Hosny
  2011-03-30 20:41       ` Mojca Miklavec
  2011-04-03 11:48     ` R. Ermers
  2 siblings, 1 reply; 7+ messages in thread
From: Khaled Hosny @ 2011-03-30 18:44 UTC (permalink / raw)
  To: Mojca Miklavec; +Cc: mailing list for ConTeXt users

On Wed, Mar 30, 2011 at 07:32:35PM +0200, Mojca Miklavec wrote:
> On Wed, Mar 30, 2011 at 17:16, Khaled Hosny wrote:
> > On Wed, Mar 30, 2011 at 04:47:07PM +0200, Mojca Miklavec wrote:
> >
> > If you are comfortable with writing a PEG grammar (I'm not), writing a
> > mediawiki parser for lunamark[1] might be a good choice; it already has
> > a ConTeXt writer (and a markdown parser).
> 
> This seems like a very reasonable solution; however, it will take too
> long before I understand LPEG well enough to write some useful code.
> 
> > I bet pandoc has mediawiki support as well, so you may try it.
> 
> I started installing it, but then realized that it only supports
> output to mediawiki, no input.
> 
> It seems that writing my own parser (a few regular expressions in
> a language other than Lua) will probably be the fastest solution after
> all.

There is also http://sourceforge.net/projects/wiki2tex/ but it generates
LaTeX; tweaking it to generate ConTeXt should not be hard (as long as
you can build it: it is written in C++ and requires cmake, Qt and whatnot;
luckily it built here just fine).

Regards,
 Khaled

-- 
 Khaled Hosny
 Egyptian
 Arab

* Re: wikimedia2context: any existing solutions?
  2011-03-30 18:44     ` Khaled Hosny
@ 2011-03-30 20:41       ` Mojca Miklavec
  0 siblings, 0 replies; 7+ messages in thread
From: Mojca Miklavec @ 2011-03-30 20:41 UTC (permalink / raw)
  To: Khaled Hosny; +Cc: mailing list for ConTeXt users

On Wed, Mar 30, 2011 at 20:44, Khaled Hosny wrote:
>
> There is also http://sourceforge.net/projects/wiki2tex/ but it generates
> LaTeX; tweaking it to generate ConTeXt should not be hard (as long as
> you can build it: it is written in C++ and requires cmake, Qt and whatnot;
> luckily it built here just fine).

This one works out pretty nicely and compiles out of the box (the TeX
code is not perfect, but all the examples compiled).

The parser needs some tweaking for special cases that were not handled
by the author and some LaTeX needs to be converted to ConTeXt (minor
issues), but it indeed seems nice. (The major problem is that it lacks
any documentation, but that can be circumvented.)
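
Converting the remaining LaTeX to ConTeXt could be done as a
post-processing pass over the generated file. The sketch below is an
assumption-laden illustration: the exact LaTeX that wiki2tex emits is not
shown in this thread, so the rule table covers only a few common
constructs, and `RULES` and `latex_to_context` are names invented here.

```python
import re

# (pattern, replacement) pairs for a few common LaTeX constructs.
# Which constructs wiki2tex really produces is an assumption.
RULES = [
    (r"\\begin\{document\}", r"\\starttext"),
    (r"\\end\{document\}", r"\\stoptext"),
    (r"\\begin\{itemize\}", r"\\startitemize"),
    (r"\\end\{itemize\}", r"\\stopitemize"),
    (r"\\begin\{enumerate\}", r"\\startitemize[n]"),
    (r"\\end\{enumerate\}", r"\\stopitemize"),
    (r"\\begin\{verbatim\}", r"\\starttyping"),
    (r"\\end\{verbatim\}", r"\\stoptyping"),
    # Font commands: only arguments without nested braces are handled.
    (r"\\textbf\{([^{}]*)\}", r"{\\bf \1}"),
    (r"\\textit\{([^{}]*)\}", r"{\\it \1}"),
    (r"\\emph\{([^{}]*)\}", r"{\\em \1}"),
]


def latex_to_context(source: str) -> str:
    """Apply each rewrite rule in order to the whole LaTeX source."""
    for pattern, replacement in RULES:
        source = re.sub(pattern, replacement, source)
    return source


if __name__ == "__main__":
    latex = "\\begin{itemize}\n\\item \\textbf{a}\n\\end{itemize}"
    print(latex_to_context(latex))
```

\section, \subsection and \item are spelled the same in both systems, so
they need no rule; anything the table does not cover is passed through
unchanged and would still need manual attention.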

Thanks a lot.

As for why I prefer wiki to HTML as the main source: wiki markup is
somewhat more basic and has a bit more structure. Even if I start from
HTML, I hardly have any less work.

Mojca

PS: now I only need to figure out how to compile the program (for
which I'm trying to prepare an acceptable/printable/readable version
of the manual) without crashing ... :)

* Re: wikimedia2context: any existing solutions?
  2011-03-30 17:32   ` Mojca Miklavec
  2011-03-30 17:38     ` Aditya Mahajan
  2011-03-30 18:44     ` Khaled Hosny
@ 2011-04-03 11:48     ` R. Ermers
  2 siblings, 0 replies; 7+ messages in thread
From: R. Ermers @ 2011-04-03 11:48 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Well, that seems like a great idea. But beware: as far as I know, it is
impossible for ConTeXt to process CALS tables in an HTML or XML document.
It is possible, though, to process CALS tables in a separate document and
insert the resulting PDF.

Regards,

Robert





On 30 Mar 2011, at 19:32, Mojca Miklavec wrote:

> On Wed, Mar 30, 2011 at 17:16, Khaled Hosny wrote:
>> On Wed, Mar 30, 2011 at 04:47:07PM +0200, Mojca Miklavec wrote:
>> 
>> If you are comfortable with writing a PEG grammar (I'm not), writing a
>> mediawiki parser for lunamark[1] might be a good choice; it already has
>> a ConTeXt writer (and a markdown parser).
> 
> This seems like a very reasonable solution; however, it will take too
> long before I understand LPEG well enough to write some useful code.
> 
>> I bet pandoc has mediawiki support as well, so you may try it.
> 
> I started installing it, but then realized that it only supports
> output to mediawiki, no input.
> 
> It seems that writing my own parser (a few regular expressions in
> a language other than Lua) will probably be the fastest solution after
> all.
> 
> Mojca

Thread overview: 7+ messages
2011-03-30 14:47 wikimedia2context: any existing solutions? Mojca Miklavec
2011-03-30 15:16 ` Khaled Hosny
2011-03-30 17:32   ` Mojca Miklavec
2011-03-30 17:38     ` Aditya Mahajan
2011-03-30 18:44     ` Khaled Hosny
2011-03-30 20:41       ` Mojca Miklavec
2011-04-03 11:48     ` R. Ermers
