From: Trevor Jenkins
Newsgroups: gmane.text.pandoc
Subject: I want to extract bibliographic data from Amazon pages
Date: Sat, 10 Dec 2022 09:05:36 +0000
To: pandoc-discuss

My current workflow for getting bibliographic data from Amazon’s book listings is failing.
I use BibDesk as my primary citation manager, but it does not extract data from Amazon listings, so for that I use a lashed-up scheme using Zotero. Zotero has a browser add-on which extracts the bibliographic information from these pages. Then in Zotero I have a third-party script that sends that data to BibDesk. This has worked well for a year or more.

However, there are two problems with my method. The first is that the third-party script for extraction from Zotero does not work with the current version of the program. I downgraded Zotero to an earlier version and that restored my workflow. Unfortunately, it now appears that changes to the browser add-on are not compatible with that older version, and my workflow is now dammed, as it may or may not add the data to Zotero.

As pandoc can process both HTML and BibTeX formats, I wonder if and how I could harness that capability to finally drop Zotero altogether, as it was only ever meant to be a stopgap anyway. A simplistic

    pandoc -f html -t bibtex …

using the specific URL for the book I want to add does not work; I did not expect it to. This leaves me wondering whether a Lua script might be required to do the job. I am not conversant with Lua at all, so my idea is on hold.

Is it possible to get pandoc to do the required extraction and, if so, what might a Lua script look like?

Regards, Trevor.

<>< Re: deemed!
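
To make the question concrete, the kind of extraction being asked about could be sketched roughly as below. This is Python rather than a pandoc Lua script, and it is only an illustration: it assumes the page exposes the book's details in `<title>` and `<meta name="…" content="…">` tags, which may well not be true of a real Amazon listing. The sample HTML, the field names, the citation key and the helper names (`MetaCollector`, `extract`, `to_bibtex`) are all hypothetical.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect the <title> text and any <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta[attrs["name"].lower()] = attrs["content"].strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract(html):
    """Return a dict of bibliographic fields found in the HTML text."""
    parser = MetaCollector()
    parser.feed(html)
    fields = dict(parser.meta)
    # Fall back to the page title if no explicit title field was found.
    fields.setdefault("title", parser.title.strip())
    return fields


def to_bibtex(key, fields):
    """Format the known fields as a BibTeX @book entry."""
    lines = [f"@book{{{key},"]
    for name in ("author", "title", "publisher", "year", "isbn"):
        if fields.get(name):
            lines.append(f"  {name} = {{{fields[name]}}},")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical page; a real listing will be structured differently.
SAMPLE = """<html><head>
<title>An Example Book</title>
<meta name="author" content="A. N. Author">
<meta name="publisher" content="Example Press">
<meta name="year" content="2020">
</head><body></body></html>"""

if __name__ == "__main__":
    print(to_bibtex("example2020", extract(SAMPLE)))
```

The point of the sketch is that the hard part is locating the fields in the page, not writing BibTeX. Once the fields are in hand, the resulting entry could be imported into BibDesk directly, with no need for Zotero in between.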