public inbox archive for pandoc-discuss@googlegroups.com
From: Albert Krewinkel <albert+pandoc-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Subject: Re: pandoc as a linkchecker?
Date: Sat, 12 Sep 2020 22:19:22 +0200
Message-ID: <87a6xulpdh.fsf@zeitkraut.de>
In-Reply-To: <f87a3346-3243-0cd4-a101-107e5ffe4902-T1oY19WcHSwdnm+yROfE0A@public.gmane.org>


Joseph Reagle writes:

> It's time to check which links in my syllabi are broken, and I'm again
> cursing under my breath that there's no multi-format linkchecker out
> there that can report line numbers. Then I thought, what about my
> favorite tool!?

Well, here's an anchor-checking Lua filter that reports any link
pointing to a nonexistent anchor. It shouldn't be too hard to extend it
to check external links as well. You won't get line numbers, though.

    -- Set of all identifiers defined in the document.
    local identifiers = {}

    -- Record the identifier of any element that has a non-empty one.
    function collect_ids (x)
      if x.identifier and x.identifier ~= '' then
        identifiers[x.identifier] = true
      end
    end

    -- Warn on stderr about internal links whose anchor is not defined.
    function check_link (link)
      -- check internal links
      if link.target:sub(1,1) == '#' then
        local target_exists = identifiers[link.target:sub(2)]
        if not target_exists then
          io.stderr:write(
            table.concat {'Invalid target: ', link.target,
              ' (link text is "', pandoc.utils.stringify(link), '")\n'
            }
          )
        end
      end
    end

    -- Two passes: collect all identifiers first, then check the links.
    return {
      {Block = collect_ids, Inline = collect_ids},
      {Link = check_link}
    }
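
A rough sketch of what the external-link extension might look like, as
a separate filter file. The helper names (`is_external`,
`check_external`, `fetch_with_pandoc`) are made up for illustration.
The sketch assumes that `pandoc.mediabag.fetch` raises an error when a
URL cannot be retrieved; that may differ between pandoc versions, so
treat it as a starting point rather than a finished checker. Because the
script returns nothing, pandoc should pick up the global `Link`
function as the filter.

```lua
-- Hypothetical helper: true if the target looks like an http(s) URL.
function is_external (target)
  return target:match('^https?://') ~= nil
end

-- Hypothetical helper: returns nil for non-external targets, otherwise
-- true/false depending on whether `fetch` reports success.
function check_external (target, fetch)
  if not is_external(target) then
    return nil
  end
  return fetch(target) and true or false
end

-- Assumption: pandoc.mediabag.fetch raises an error on failure, so it
-- is wrapped in pcall. This downloads the whole resource.
function fetch_with_pandoc (url)
  local ok, _, contents = pcall(pandoc.mediabag.fetch, url)
  return ok and contents ~= nil
end

-- Filter function: warn on stderr about unreachable external links.
function Link (link)
  if check_external(link.target, fetch_with_pandoc) == false then
    io.stderr:write(
      table.concat {'Unreachable target: ', link.target,
        ' (link text is "', pandoc.utils.stringify(link), '")\n'
      }
    )
  end
end
```

Passing the fetcher in as an argument keeps the decision logic testable
without network access. Fetching every target in full is slow for long
documents, so a real checker would probably want something lighter.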


--
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124

