I want to extract bibliographic data from Amazon pages

public inbox archive for pandoc-discuss@googlegroups.com

From: Trevor Jenkins <bslwannabe-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: I want to extract bibliographic data from Amazon pages
Date: Sat, 10 Dec 2022 09:05:36 +0000
Message-ID: <C57B5FA0-9810-4234-A8A8-C828D6CF27F6@gmail.com>

My current workflow for getting bibliographic data from Amazon’s book listings is failing. I use BibDesk as my primary citation manager, but it does not extract data from Amazon listings, so for that I use a lashed-up scheme built on Zotero. Zotero has a browser add-on which extracts the bibliographic information from these pages. Then in Zotero I have a third-party script that sends that data to BibDesk. This has worked well for a year or more.

However, there are two problems with my method. The first is that the third-party script for extraction from Zotero does not work with the current version of the program. I downgraded Zotero to an earlier version and that restored my workflow. Unfortunately, it now appears that changes to the browser add-on are not compatible with that older version, so my workflow is now blocked: the add-on may or may not add the data to Zotero.

As pandoc can process both HTML and BibTeX formats, I wonder if and how I could harness that capability to finally drop Zotero altogether, as it was only ever meant to be a stopgap anyway. A simplistic

pandoc -f html -t bibtex …

using the specific URL for the book I want to add does not work; I did not expect it to. That leaves me wondering whether a Lua script might be required to do the job. I am not conversant with Lua at all, so my idea is on hold.

Is it possible to get pandoc to do the required extraction and if so what might a Lua script look like?
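
To make the question concrete, here is a rough sketch of one shape the pipeline might take. It is in Python rather than Lua and is not a tested solution. As far as I understand it, pandoc's bibtex writer emits the entries held in a document's `references` metadata, which an arbitrary HTML page does not carry, so the book details would first have to be put into a bibliographic format such as CSL JSON. The sketch assumes the fields have already been pulled from the page by some other means (that scraping step is not shown and depends on Amazon's markup), and that the installed pandoc is recent enough to include the csljson reader and bibtex writer. The names `to_csl_item`, `csl_to_bibtex` and `SAMPLE`, and the record itself, are hypothetical placeholders.

```python
import json
import subprocess


def to_csl_item(book: dict) -> dict:
    """Map already-extracted book fields to a single CSL JSON item."""
    item = {
        "id": book["key"],
        "type": "book",
        "title": book["title"],
        # CSL stores each name as separate family/given parts.
        "author": [
            {"family": family, "given": given}
            for family, given in book["authors"]
        ],
        "publisher": book["publisher"],
        # CSL dates are nested lists, e.g. [[year]] or [[year, month, day]].
        "issued": {"date-parts": [[book["year"]]]},
    }
    if book.get("isbn"):
        item["ISBN"] = book["isbn"]
    return item


def csl_to_bibtex(items: list) -> str:
    """Pipe a list of CSL JSON items through pandoc (must be on PATH)."""
    result = subprocess.run(
        ["pandoc", "-f", "csljson", "-t", "bibtex"],
        input=json.dumps(items),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical placeholder record, standing in for whatever a scraper returns.
SAMPLE = {
    "key": "example2020",
    "title": "Example Title",
    "authors": [("Author", "Ann")],
    "publisher": "Example Press",
    "year": 2020,
    "isbn": "0000000000",
}
```

Calling `csl_to_bibtex([to_csl_item(SAMPLE)])` would then hand the record to pandoc and return whatever BibTeX it produces, which could be appended to the .bib file that BibDesk manages.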

Regards, Trevor.

<>< Re: deemed!


