public inbox archive for pandoc-discuss@googlegroups.com
From: John McCorkle <jmco67-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: Getting Citations in Wikipedia page to convert over to HTML, Docx, LaTeX.
Date: Fri, 29 May 2020 06:47:44 -0700 (PDT)
Message-ID: <f57971df-00de-48d2-b4ad-3c3e0dc6a629@googlegroups.com>
In-Reply-To: <m2eerv1kz2.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>



I never thought of grabbing the HTML. It works great for my HTML needs. Many
thanks!
Unfortunately, besides not working on the MediaWiki source file, Pandoc
does not convert the HTML version to TeX or Docx successfully either.

On Thursday, May 7, 2020 at 5:41:54 PM UTC-4, John MacFarlane wrote:
>
>
> You might have better luck converting the HTML version of the
> Wikipedia page.  See
> https://groups.google.com/d/msg/pandoc-discuss/ptiLha5vJ2I/bPJvyLw0BAAJ 
>
>
> John McCorkle <jmc...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>
> > I need to convert a Wikipedia page I wrote to HTML and to either LaTeX
> > or Docx.
> > I go to the page here
> > (https://en.wikipedia.org/wiki/User:JohnM7190/John%27s_Noise_Figure_Page),
> > click on the "edit source" tab, select and copy the source text, and
> > then click on the "read" tab so I don't risk actually editing anything.
> > I paste that text into Notepad++ and use several regular expression
> > search/replace operations to eliminate the <NumBlk blah blah /NumBlk>
> > styles (since Pandoc does not recognize them) while keeping the equation
> > and the equation reference number they contain, and to fix the
> > {{EquationNote|x}} references to those equations. That gets saved,
> > UTF-8 encoded, as my source.wiki file. Pandoc converts my source.wiki
> > file to all three output formats pretty well, except that the citations
> > don't come across.
> > 
> > Can someone please tell me how to modify the citations in my source.wiki
> > file so the citations get converted properly (i.e. both first use of the
> > citation, and additional references to the same citation), and end up
> > listed at the end of the article the same way they do on the Wikipedia
> > page?
> > 
> > For example, on first use, one of my citations is: 
> > <ref name="Peebles457">{{Cite book|url=https://cds.cern.ch/record/105963|title=Communication system principles|last=Peebles|first=Peyton Z.|date=1976|publisher=Addison-Wesley|year=|isbn=|location=Reading, MA|pages=457}}</ref>
> > 
> > and then other references to it are: 
> > <ref name="Peebles457" /> 
> > 
> > There are several types of references, like 
> > 
> > <ref name=":2">{{Cite journal|last=Friis|first=H. T.|date=July 1944|title=Noise Figures of Radio Receivers|url=|journal=Proceedings of the IRE|volume=32|issue=7|pages=419–422|doi=10.1109/JRPROC.1944.232049|issn=0096-8390|via=}}[https://ieeexplore.ieee.org/abstract/document/1695024]</ref>
> > 
> > <ref name="IEC_Spot_NF">{{Cite web|url=http://www.electropedia.org/iev/iev.nsf/display?openform&ievref=702-08-57|title=IEC 60050 - International Electrotechnical Vocabulary - IEV number 702-08-57: "spot noise factor (of a linear two-port device); spot noise figure (of a linear two-port device)"|last=|first=|date=September 2018|website=|url-status=live|archive-url=|archive-date=|accessdate=2019-12-29}}</ref>
> > 
> > <ref name="Fisk">{{Cite journal|last=Fisk|first=James R.|date=Oct 1975|title=Receiver Noise Figure Sensitivity and Dynamic Range - What The Numbers Mean|url=http://www.electronicsandbooks.com/eab3/manual/Magazine/H/Ham%20Radio%20Magazine%20US/Ham%20Radio%20Magazine%201975/10%20October%201975.pdf|journal=Ham Radio|volume=|pages=8-25, pg. 12|via=}}</ref>
> > 
> > Then MediaWiki automatically numbers these and puts them all at the end
> > of the article with the command:
> > {{Reflist}}
> > 
> > Is there some format I could convert these citations to, e.g. using
> > regular expressions, so that Pandoc would convert them properly? And is
> > there something I can use to replace the {{Reflist}} command?
> > 
> > Thanks in advance for any help! 
> > 
> > 
> > -- 
> > You received this message because you are subscribed to the Google
> > Groups "pandoc-discuss" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> > an email to pandoc-...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/pandoc-discuss/52683ae4-6dc6-45cd-8e2f-66b1226d6b08%40googlegroups.com.
>
>
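
For the regular-expression route asked about in the quoted message, here is a
minimal Python sketch of one possible preprocessing pass over source.wiki. It
is not a method confirmed anywhere in this thread. It assumes (worth checking
against your Pandoc version) that Pandoc's mediawiki reader turns a plain
<ref>...</ref> into a footnote but does not expand {{Cite ...}} templates or
resolve reused <ref name="..." /> tags. The function names expand_refs and
flatten_cite, and the plain-text citation layout, are made up for illustration;
only a handful of template fields are handled.

```python
import re

# Full definition: <ref name="X">body</ref>
REF_DEF = re.compile(r'<ref\s+name="([^"]+)"\s*>(.*?)</ref>', re.DOTALL)
# Reuse of an earlier definition: <ref name="X" />
REF_USE = re.compile(r'<ref\s+name="([^"]+)"\s*/>')
# Citation template: {{Cite book|key=value|...}}
CITE = re.compile(r'\{\{\s*Cite\s+\w+\s*\|(.*?)\}\}', re.DOTALL | re.IGNORECASE)
# The {{Reflist}} marker that MediaWiki replaces with the reference list
REFLIST = re.compile(r'\{\{\s*Reflist\s*\}\}', re.IGNORECASE)


def flatten_cite(match):
    """Turn one {{Cite ...}} template into a plain-text citation (hypothetical layout)."""
    fields = {}
    for part in match.group(1).split("|"):
        # Split on the first "=" only, so "=" inside a URL stays in the value.
        key, sep, value = part.partition("=")
        if sep and value.strip():
            # Collapse internal whitespace, including line breaks left by wrapping.
            fields[key.strip().lower()] = " ".join(value.split())
    author = ", ".join(f for f in (fields.get("last"), fields.get("first")) if f)
    pages = "pages " + fields["pages"] if "pages" in fields else ""
    parts = [author, fields.get("date"), fields.get("title"),
             fields.get("journal"), fields.get("publisher"), pages,
             fields.get("url")]
    # Drop empty fields; strip trailing dots so "Z." + ". " does not double up.
    return ". ".join(p.rstrip(".") for p in parts if p) + "."


def expand_refs(wikitext):
    """Flatten templates, copy each named ref body into its reuses, drop names."""
    text = CITE.sub(flatten_cite, wikitext)
    text = REFLIST.sub("", text)
    bodies = {name: body for name, body in REF_DEF.findall(text)}

    def reuse(match):
        body = bodies.get(match.group(1))
        # Leave the tag untouched when no definition with that name exists.
        return f"<ref>{body}</ref>" if body is not None else match.group(0)

    text = REF_USE.sub(reuse, text)
    return REF_DEF.sub(lambda m: f"<ref>{m.group(2)}</ref>", text)


if __name__ == "__main__":
    sample = ('First use.<ref name="Peebles457">{{Cite book|title=Communication '
              'system principles|last=Peebles|first=Peyton Z.|date=1976|'
              'publisher=Addison-Wesley|year=|pages=457}}</ref> '
              'Second use.<ref name="Peebles457" />\n{{Reflist}}')
    print(expand_refs(sample))
```

The rewritten text could then be saved back to source.wiki and converted with
something like "pandoc -f mediawiki source.wiki -o out.docx". Because every
reuse becomes its own footnote, a source cited twice appears twice in the
notes, unlike the shared numbering on the Wikipedia page.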



