public inbox archive for pandoc-discuss@googlegroups.com
* I want to extract bibliographic data from Amazon pages
@ 2022-12-10  9:05 Trevor Jenkins
From: Trevor Jenkins @ 2022-12-10  9:05 UTC (permalink / raw)
  To: pandoc-discuss

My current workflow for getting bibliographic data from Amazon’s book listings is failing. I use BibDesk as my primary citation manager, but it does not extract data from Amazon listings, so for that I use a lashed-up scheme involving Zotero. Zotero has a browser add-on which extracts the bibliographic information from these pages. Then, in Zotero, I have a third-party script that sends that data to BibDesk. This has worked well for a year or more.

However, there are two problems with my method. The first is that the third-party script for extraction from Zotero does not work with the current version of the program. I downgraded Zotero to an earlier version and that restored my workflow. Unfortunately, it now appears that changes to the browser add-on are not compatible with that older version, and my workflow is now dammed, as it may or may not add the data to Zotero.

As pandoc can process both HTML and BibTeX formats, I wonder whether and how I could harness that capability to finally drop Zotero altogether, as it was only ever meant to be a stopgap anyway. A simplistic

pandoc -f html -t bib text …

using the specific URL for the book I want to add does not work; I did not expect it to. That leaves me wondering whether a Lua script might be required to do the job. I am not conversant with Lua at all, so my idea is on hold.

Is it possible to get pandoc to do the required extraction and if so what might a Lua script look like?
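
To make the question concrete, here is a minimal sketch of the extraction step, written in Python rather than Lua. It assumes the page carries the title, author and ISBN in <meta name="..."> tags; that is an assumption for illustration only and has not been checked against real Amazon listings. The names MetaCollector, html_to_bibtex and SAMPLE are hypothetical, and the sample values are made up.

```python
from html.parser import HTMLParser

# Made-up sample page; real listings will almost certainly be structured differently.
SAMPLE = """<html><head>
<title>An Example Book</title>
<meta name="author" content="A. N. Author">
<meta name="isbn" content="0000000000000">
</head><body></body></html>"""


class MetaCollector(HTMLParser):
    """Collect <meta name=... content=...> pairs and the <title> text."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name, content = attrs.get("name"), attrs.get("content")
        if tag == "meta" and name and content:
            self.meta[name.lower()] = content
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def html_to_bibtex(html, key):
    """Build a BibTeX @book entry from whatever metadata was found."""
    parser = MetaCollector()
    parser.feed(html)
    fields = {
        "title": parser.meta.get("title") or parser.title.strip(),
        "author": parser.meta.get("author", ""),
        "isbn": parser.meta.get("isbn", ""),
    }
    lines = [f"@book{{{key},"]
    # Skip fields that were not present on the page.
    lines += [f"  {name} = {{{value}}}," for name, value in fields.items() if value]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(html_to_bibtex(SAMPLE, "example"))
```

The resulting entry could be saved to a .bib file for BibDesk, or handed to pandoc with -f bibtex if a conversion to another bibliography format such as CSL JSON is wanted.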

Regards, Trevor.

<>< Re: deemed!



Thread overview: 5+ messages
2022-12-10  9:05 I want to extract bibliographic data from Amazon pages Trevor Jenkins
2022-12-10 11:44   ` AW: " denis.maier-NSENcxR/0n0
2022-12-10 12:39       ` denis.maier-NSENcxR/0n0
2022-12-10 15:08           ` Trevor Jenkins
2022-12-14  9:17               ` AW: " denis.maier-NSENcxR/0n0
