From: Albert Krewinkel
Newsgroups: gmane.text.pandoc
Subject: Re: pandoc as a linkchecker?
Date: Sat, 12 Sep 2020 22:19:22 +0200
Message-ID: <87a6xulpdh.fsf@zeitkraut.de>

Joseph Reagle writes:

> It's time to check which links in my syllabi are broken, and I'm again
> cursing under my breath that there's no multi-format linkchecker out
> there that can report line numbers. Then I thought, what about my
> favorite tool!?

Well, here's an anchor-checking Lua filter that will tell you when a
link points to a nonexistent anchor. It shouldn't be too hard to extend
it to check external links as well. You won't get line numbers, though.

    -- Identifiers of all elements seen in the document.
    local identifiers = {}

    -- Record the identifier of any element that has a non-empty one.
    function collect_ids (x)
      if x.identifier and x.identifier ~= '' then
        identifiers[x.identifier] = true
      end
    end

    -- Warn on stderr when an internal link points to an unknown identifier.
    function check_link (link)
      -- check internal links
      if link.target:sub(1,1) == '#' then
        local target_exists = identifiers[link.target:sub(2)]
        if not target_exists then
          io.stderr:write(
            table.concat {
              'Invalid target: ', link.target,
              ' (link text is "', pandoc.utils.stringify(link), '")\n'
            }
          )
        end
      end
    end

    -- The first pass collects identifiers, the second pass checks the links.
    return {
      {Block = collect_ids, Inline = collect_ids},
      {Link = check_link}
    }

-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124