public inbox archive for pandoc-discuss@googlegroups.com
* Getting Citations in Wikipedia page to convert over to HTML, Docx, LaTeX.
@ 2020-05-07 19:34 John McCorkle
       [not found] ` <52683ae4-6dc6-45cd-8e2f-66b1226d6b08-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: John McCorkle @ 2020-05-07 19:34 UTC (permalink / raw)
  To: pandoc-discuss



I need to convert a Wikipedia page I wrote to HTML and to either LaTeX or
Docx.
I go to the page here 
(https://en.wikipedia.org/wiki/User:JohnM7190/John%27s_Noise_Figure_Page), 
click on the "edit source" tab, select and copy the source text, and then 
click on the "read" tab so I don't risk actually editing anything. I paste 
that text into Notepad++ and use several regular expression search/replace 
operations that eliminate the <NumBlk blah blah /NumBlk> styles (since Pandoc
does not recognize them) but keep the equation and the equation reference
number they contain, and fix the {{EquationNote|x}} references to those
equations. That gets saved, UTF-8 encoded, as my source.wiki file. Pandoc
converts my source.wiki file to all three output formats pretty well except 
the citations don't come across.

Can someone please tell me how to modify the citations in my source.wiki 
file so the citations get converted properly (i.e. both first use of the 
citation, and additional references to the same citation), and end up 
listed at the end of the article the same way they do on the Wikipedia page?

For example, on first use, one of my citations is:
<ref name="Peebles457">{{Cite 
book|url=https://cds.cern.ch/record/105963|title=Communication system 
principles|last=Peebles|first=Peyton 
Z.|date=1976|publisher=Addison-Wesley|year=|isbn=|location=Reading, 
MA|pages=457}}</ref>

and then other references to it are:
<ref name="Peebles457" />

There are several types of references, like

<ref name=":2">{{Cite journal|last=Friis|first=H. T.|date=July 
1944|title=Noise Figures of Radio Receivers|url=|journal=Proceedings of the 
IRE|volume=32|issue=7|pages=419–422|doi=10.1109/JRPROC.1944.232049|issn=0096-8390|via=}}[https://ieeexplore.ieee.org/abstract/document/1695024]</ref>

<ref name="IEC_Spot_NF">{{Cite 
web|url=http://www.electropedia.org/iev/iev.nsf/display?openform&ievref=702-08-57|title=IEC 
60050 - International Electrotechnical Vocabulary - IEV number 702-08-57: 
"spot noise factor (of a linear two-port device); spot noise figure (of a 
linear two-port device)"|last=|first=|date=September 
2018|website=|url-status=live|archive-url=|archive-date=|accessdate=2019-12-29}}</ref>

<ref name="Fisk">{{Cite journal|last=Fisk|first=James R.|date=Oct 
1975|title=Receiver Noise Figure Sensitivity and Dynamic Range - What The 
Numbers 
Mean|url=http://www.electronicsandbooks.com/eab3/manual/Magazine/H/Ham%20Radio%20Magazine%20US/Ham%20Radio%20Magazine%201975/10%20October%201975.pdf|journal=Ham 
Radio|volume=|pages=8-25, pg. 12|via=}}</ref>

Then MediaWiki automatically numbers these and puts them all at the end of
the article with the command:
{{Reflist}}

Is there some format I could convert these citations to, e.g. using regular 
expressions, so that Pandoc would convert them properly? And is there 
something I can use to replace the {{Reflist}} command?

Thanks in advance for any help! 


-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/52683ae4-6dc6-45cd-8e2f-66b1226d6b08%40googlegroups.com.


* Re: Getting Citations in Wikipedia page to convert over to HTML, Docx, LaTeX.
       [not found] ` <52683ae4-6dc6-45cd-8e2f-66b1226d6b08-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2020-05-07 21:41   ` John MacFarlane
       [not found]     ` <m2eerv1kz2.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
  2020-05-29 14:53   ` Joseph Reagle
  1 sibling, 1 reply; 6+ messages in thread
From: John MacFarlane @ 2020-05-07 21:41 UTC (permalink / raw)
  To: John McCorkle, pandoc-discuss


You might have better luck converting the HTML version of the
Wikipedia page.  See
https://groups.google.com/d/msg/pandoc-discuss/ptiLha5vJ2I/bPJvyLw0BAAJ
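
A minimal sketch of that suggestion, driving pandoc from Python. It
assumes pandoc is installed and that the rendered page (or a saved copy
of it) is readable as HTML; the output file names and helper names are
hypothetical, and whether the citations survive the conversion still
depends on the page.

```python
import shutil
import subprocess

# The rendered page from the original message.
PAGE_URL = "https://en.wikipedia.org/wiki/User:JohnM7190/John%27s_Noise_Figure_Page"


def pandoc_command(source, output):
    """Build a pandoc call that reads HTML; pandoc infers the output
    format from the extension of `output` (.docx, .tex, ...)."""
    return ["pandoc", "--from", "html", "--standalone",
            "--output", output, source]


def convert(source, output):
    """Run pandoc if it is on PATH; return True when it exits cleanly."""
    if shutil.which("pandoc") is None:
        return False
    return subprocess.run(pandoc_command(source, output)).returncode == 0


if __name__ == "__main__":
    # Only prints the commands; call convert() to actually run them.
    for target in ("page.docx", "page.tex"):
        print(" ".join(pandoc_command(PAGE_URL, target)))
```

`source` can be a URL or a local .html file saved from the browser.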



* Re: Getting Citations in Wikipedia page to convert over to HTML, Docx, LaTeX.
       [not found]     ` <m2eerv1kz2.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>
@ 2020-05-29 13:47       ` John McCorkle
  0 siblings, 0 replies; 6+ messages in thread
From: John McCorkle @ 2020-05-29 13:47 UTC (permalink / raw)
  To: pandoc-discuss



I never thought about grabbing the HTML. It works great for my HTML needs.
Many thanks!!
Unfortunately, besides not working on the MediaWiki source file, Pandoc
does not convert the HTML to TeX or Docx successfully either.


* Re: Getting Citations in Wikipedia page to convert over to HTML, Docx, LaTeX.
       [not found] ` <52683ae4-6dc6-45cd-8e2f-66b1226d6b08-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2020-05-07 21:41   ` John MacFarlane
@ 2020-05-29 14:53   ` Joseph Reagle
       [not found]     ` <6ac2c977-59b8-159c-93e2-c0a8bf9599fe-T1oY19WcHSwdnm+yROfE0A@public.gmane.org>
  1 sibling, 1 reply; 6+ messages in thread
From: Joseph Reagle @ 2020-05-29 14:53 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Hello John, as someone who authors a lot of citation-heavy content in Markdown and wikitext, I know it'd be nice if there were an easy way to convert between the two.

However, on Wikipedia, citations are templates (appearing between '{{' and '}}'). A specific template is not actually part of wikitext; it is instead a dynamic and arbitrarily customizable extension. Pandoc, obviously, doesn't support that. I suppose someone could write a filter to do some of the work, but they'd need to decide which templates to support: {{cite}}, {{citation}}, {{sfn}}, ... And then, when it comes to the bibliography, there's <references/>, {{reflist}}, ... And then they'd have to deal with all of the parameters, converting their semantics, and bugs.
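
To make the template structure concrete, here is a minimal sketch that splits one flat citation template into its name and its key=value parameters. It assumes no nested templates and no '|' inside parameter values; the function name is hypothetical, and this is not how MediaWiki or pandoc parses templates.

```python
import re

# The first citation from the original message, line breaks included.
SAMPLE = """{{Cite
book|url=https://cds.cern.ch/record/105963|title=Communication system
principles|last=Peebles|first=Peyton
Z.|date=1976|publisher=Addison-Wesley|year=|isbn=|location=Reading,
MA|pages=457}}"""


def parse_cite_template(wikitext):
    """Return (template_name, params) for one flat {{...}} template."""
    match = re.fullmatch(r"\{\{\s*(.*?)\s*\}\}", wikitext.strip(), re.DOTALL)
    if match is None:
        raise ValueError("not a single {{...}} template")
    parts = match.group(1).split("|")
    # Collapse the hard line breaks that mail clients and editors insert.
    kind = " ".join(parts[0].split()).lower()
    params = {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        value = " ".join(value.split())
        if sep and value:  # skip positional and empty parameters
            params[key.strip().lower()] = value
    return kind, params


if __name__ == "__main__":
    print(parse_cite_template(SAMPLE))
```

Empty parameters such as `year=` and `isbn=` are dropped, which is one of the small semantic decisions a real filter would have to make for every template it supports.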

Wikitext, and especially templates, is a god-awful mess; it's often not even well-formed. I tried running a citation bot on your article and it found many errors, which would make conversion difficult. (Feel free to revert that edit.)

  https://en.wikipedia.org/w/index.php?title=User:JohnM7190/John%27s_Noise_Figure_Page&action=history

If you do actually want to do a proper semantic conversion of your citations, I think the thing to do would be:

1. Convert your article into List-defined style, so that each citation is a short reference (<REF NAME=FOO/>) to a longer one (<REF NAME=FOO>{{citation ...}}</REF>) at the bottom of your page.

	https://en.wikipedia.org/wiki/Help:List-defined_references

This is how LaTeX and pandoc-markdown structure things.

2. You'll then need to turn your references (in the prose) and citations (at the bottom) into the appropriate pandoc/YAML -- you could use BibTeX for the latter. Some regexes might get you part of the way, but given the sloppiness in the citations, it would be a very manual process. For some of them, perhaps you could use a DOI or ISSN to get BibTeX-formatted citations from an API, which you could use with pandoc.
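
As a rough sketch of the regex part of step 2: the code below replaces named <ref> tags with pandoc-style [@key] citations and collects a BibTeX entry for each definition. Everything in it is an assumption for illustration: the key-sanitizing rule, the template-to-entry-type table, and the parameter-to-field table are hypothetical, and nested templates are not handled. Also note that [@key] is pandoc Markdown syntax; as far as I know pandoc's MediaWiki reader will not treat it as a citation, so this is meant for text that ends up as pandoc Markdown.

```python
import re

# A named <ref> holding one flat {{Cite ...}} template (no nesting assumed).
FULL_REF = re.compile(
    r'<ref\s+name\s*=\s*"?([^">/]+?)"?\s*>\s*\{\{(.*?)\}\}.*?</ref>',
    re.DOTALL | re.IGNORECASE,
)
# A reuse such as <ref name="Peebles457" />.
SHORT_REF = re.compile(r'<ref\s+name\s*=\s*"?([^">/]+?)"?\s*/>', re.IGNORECASE)

# Hypothetical lookup tables; extend them for the templates actually used.
ENTRY_TYPES = {"cite book": "book", "cite journal": "article"}
FIELDS = {"title": "title", "journal": "journal", "publisher": "publisher",
          "location": "address", "volume": "volume", "issue": "number",
          "pages": "pages", "doi": "doi", "url": "url"}


def cite_key(ref_name):
    """Turn a ref name such as ':2' into a key that starts with a letter."""
    key = re.sub(r"\W+", "_", ref_name).strip("_")
    return key if key[:1].isalpha() else "ref_" + key


def to_bibtex(key, template_body):
    """Render the inside of a {{Cite ...}} template as one BibTeX entry."""
    parts = template_body.split("|")
    kind = " ".join(parts[0].split()).lower()
    params = {}
    for part in parts[1:]:
        name, sep, value = part.partition("=")
        value = " ".join(value.split())
        if sep and value:
            params[name.strip().lower()] = value
    fields = {}
    if "last" in params:
        names = (params["last"], params.get("first"))
        fields["author"] = ", ".join(n for n in names if n)
    for wiki_name, bib_name in FIELDS.items():
        if wiki_name in params:
            fields[bib_name] = params[wiki_name]
    year = re.search(r"\d{4}", params.get("date", ""))
    if year:
        fields["year"] = year.group()
    lines = [f"  {name} = {{{value}}}" for name, value in fields.items()]
    return "@%s{%s,\n%s\n}" % (ENTRY_TYPES.get(kind, "misc"), key, ",\n".join(lines))


def convert(wikitext):
    """Return (text with [@key] citations, BibTeX for all definitions)."""
    entries = {}

    def full(match):
        key = cite_key(match.group(1))
        entries[key] = to_bibtex(key, match.group(2))
        return f"[@{key}]"

    text = FULL_REF.sub(full, wikitext)
    text = SHORT_REF.sub(lambda m: f"[@{cite_key(m.group(1))}]", text)
    return text, "\n\n".join(entries.values())
```

Anything after the template inside the <ref>, such as a trailing bare link, is silently dropped here, which is exactly the kind of sloppiness that would need manual review.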

There are tools that can output Wikipedia citations given a well-formed and defined input (bibtex or YAML), but I'm not aware of anything that goes the other way.
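
For the DOI lookup mentioned in step 2, one option that I believe works is content negotiation on the doi.org resolver: request the DOI URL with an Accept header asking for BibTeX. A minimal sketch, assuming network access and that the registration agency behind the DOI supports the application/x-bibtex media type; the function names are hypothetical.

```python
import urllib.request


def bibtex_request(doi):
    """Build a request asking the DOI resolver for a BibTeX record."""
    return urllib.request.Request(
        "https://doi.org/" + doi,
        headers={"Accept": "application/x-bibtex"},
    )


def fetch_bibtex(doi, timeout=10):
    """Fetch the BibTeX text for `doi`; raises on network or HTTP errors."""
    with urllib.request.urlopen(bibtex_request(doi), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example (needs network access), using the DOI from the Friis citation:
# print(fetch_bibtex("10.1109/JRPROC.1944.232049"))
```

The returned entry could be appended to a .bib file and cited from pandoc by whatever key the service assigns.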

Good luck!



* Re: Getting Citations in Wikipedia page to convert over to HTML, Docx, LaTeX.
       [not found]     ` <6ac2c977-59b8-159c-93e2-c0a8bf9599fe-T1oY19WcHSwdnm+yROfE0A@public.gmane.org>
@ 2020-06-03 13:05       ` John McCorkle
       [not found]         ` <ad675bd9-ffc9-42b3-abb8-b78713b1b2e5-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: John McCorkle @ 2020-06-03 13:05 UTC (permalink / raw)
  To: pandoc-discuss



Joseph, Thank you very much. I was not aware of the List-defined style in
MediaWiki.
I use Zotero and all the references in the article are in my Zotero
library. I could easily put them all in a Zotero collection so that
exporting the whole batch to BibTeX, MediaWiki, or other formats Zotero
supports is easy. BibTeX or BibLaTeX would be nice since these
automatically create the short REF-Name. The whole list can also be
exported as MediaWiki citations, but that does not create the <ref></ref>
container, so there is no automatically generated short REF-Name.
Is there a way to put the whole BibTeX list at the end of my MediaWiki
source file and then reference those in the article text? Is that what you
are suggesting?
It does seem that if the source were set up that way, converting it to
LaTeX or docx format would go better.
I'm also thinking the HTML that I grab from my browser, looking at the
Wikipedia page, would also be cleaner and perhaps the HTML would convert to
LaTeX or docx better.
If you have never used Zotero, you might check it out. It is an absolutely
fabulous tool. Great grabber and great database.
Thanks again.


* Re: Getting Citations in Wikipedia page to convert over to HTML, Docx, LaTeX.
       [not found]         ` <ad675bd9-ffc9-42b3-abb8-b78713b1b2e5-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2020-06-03 13:53           ` Joseph Reagle
  0 siblings, 0 replies; 6+ messages in thread
From: Joseph Reagle @ 2020-06-03 13:53 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


On 6/3/20 9:05 AM, John McCorkle wrote:
> Joseph, Thank you very much. I was not aware of the List-defined style in MediaWiki.

As an example, I recently did some work on the Richard Stallman article. Given that I usually edit Wikipedia in Sublime Text, I converted it to List-defined style before editing. (I find it impossible to substantively edit with all those verbose citations in the prose. Fortunately, I can also easily fold them.)

https://en.wikipedia.org/w/index.php?title=Richard_Stallman&action=edit
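
A minimal sketch of what such a conversion to List-defined style could look like if scripted. It is not the tool or method I used on that article; the function name is hypothetical, and it assumes every inline definition is a named <ref> and that the page uses a plain {{Reflist}} with no other parameters.

```python
import re

# A named <ref>...</ref> definition; self-closing reuses do not match.
FULL_REF = re.compile(
    r'<ref\s+name\s*=\s*"?([^">/]+?)"?\s*>(.*?)</ref>',
    re.DOTALL | re.IGNORECASE,
)
REFLIST = re.compile(r"\{\{\s*reflist\s*\}\}", re.IGNORECASE)


def to_list_defined(wikitext):
    """Move inline ref definitions into a {{Reflist|refs=...}} block."""
    definitions = {}

    def shorten(match):
        name = match.group(1)
        # Keep the first definition seen for each name.
        definitions.setdefault(name, match.group(2).strip())
        return f'<ref name="{name}" />'

    text = FULL_REF.sub(shorten, wikitext)
    block = "\n".join(
        f'<ref name="{name}">{body}</ref>' for name, body in definitions.items()
    )
    refs = "{{Reflist|refs=\n" + block + "\n}}"
    # Replace the existing {{Reflist}}, or append the block if there is none.
    text, count = REFLIST.subn(lambda m: refs, text, count=1)
    return text if count else text + "\n" + refs
```

After this pass the prose contains only short `<ref name="..." />` markers, which are much easier to read past and to map to citation keys later.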

> Is there a way to put the whole BibTeX list at the end of my MediaWiki source file and then reference those in the article text? Is that what you are suggesting?

You can't put BibTeX in wikitext; it wouldn't be understood. There are occasional proposals to support something like that, but nothing has happened.

https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Wikicite/purpose#Bibtex_format

You'd have to convert the BibTeX to `<ref name=foo>{{cite ...}}</ref>`.  There might be tools for that.

https://en.wikipedia.org/wiki/Help:Citation_tools
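
To show the shape of that conversion, here is a minimal sketch that renders one already-parsed BibTeX entry (a key, an entry type, and a dict of fields) as a named ref. It does not parse .bib files, it is not one of the tools on that page, and the two lookup tables are hypothetical and far from complete.

```python
# Hypothetical mappings from BibTeX entry types and fields to the
# Wikipedia citation templates and their parameters.
TEMPLATES = {"article": "cite journal", "book": "cite book"}
PARAMS = {"title": "title", "journal": "journal", "publisher": "publisher",
          "year": "date", "volume": "volume", "number": "issue",
          "pages": "pages", "doi": "doi", "url": "url"}


def to_wikipedia_ref(key, entry_type, fields):
    """Render one entry as <ref name="key">{{cite ...}}</ref>."""
    params = []
    author = fields.get("author")
    if author:
        # Assumes a single author written as "Last, First".
        last, _, first = author.partition(",")
        params.append("last=" + last.strip())
        if first.strip():
            params.append("first=" + first.strip())
    for bib_name, wiki_name in PARAMS.items():
        if fields.get(bib_name):
            params.append(f"{wiki_name}={fields[bib_name]}")
    template = TEMPLATES.get(entry_type.lower(), "citation")
    body = "{{" + template + "|" + "|".join(params) + "}}"
    return f'<ref name="{key}">{body}</ref>'


if __name__ == "__main__":
    print(to_wikipedia_ref("Friis1944", "article", {
        "author": "Friis, H. T.",
        "title": "Noise Figures of Radio Receivers",
        "journal": "Proceedings of the IRE",
        "year": "1944",
    }))
```

Using the BibTeX key as the ref name is what gives you the short REF-Name automatically.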

> I'm also thinking the HTML that I grab from my browser, looking at the Wikipedia page, would also be cleaner and perhaps the HTML would convert to LaTeX or docx better.

That might suffice, but the citations will be plain text only. You couldn't easily change their formatting from one style to another, for example. But perhaps that isn't needed.

> If you have never used Zotero, you might check it out. It is an absolutely fabulous tool. Great grabber and great database.

Zotero is great! However, I started work on my own bib manager scripts before Zotero and continue to use them. Originally, I grabbed URLs and stuck them in Freemind mindmaps, and then could output for use with OpenOffice and bibtex. Today, I grab DOIs, ISBNs, and URLs -- some with site-specific heuristics -- and manage them in Freeplane mindmaps. I can output to biblatex, pandoc-citeproc YAML, and Wikipedia ref/cite.

https://github.com/reagle/thunderdell
