I'd like to extend the Lua wordcount filter to tell me more about specific
parts of my text: how many words are in the footnotes, and how many words
are in "original quotations," which I mark off with an HTML tag in my
markdown (and which I then strip later via another filter for certain
versions). I got the footnote part to work but can't figure out the
RawInline HTML part. Any guidance would be appreciated.
Here's my filter, followed by a simple markdown document and the results:
```
-- counts words in a document
words = 0
notewords = 0
quotewords = 0
notenoquotewords = 0
noquotewords = 0

wordcount = {
  Note = function(el)
    pandoc.walk_inline(el, {
      Str = function(el)
        if el.text:match("%P") then
          notewords = notewords + 1
        end
      end })
  end,

  RawInline = function(el)
    if el.text == '' then
      pandoc.walk_inline(el, {
        Str = function(el)
          if el.text:match("%P") then
            quotewords = quotewords + 1
          end
        end })
    end
  end,

  Str = function(el)
    -- we don't count a word if it's entirely punctuation:
    if el.text:match("%P") then
      words = words + 1
    end
  end,

  Code = function(el)
    local _, n = el.text:gsub("%S+", "")
    words = words + n
  end,

  CodeBlock = function(el)
    local _, n = el.text:gsub("%S+", "")
    words = words + n
  end
}

function Pandoc(el)
  -- skip metadata, just count body:
  pandoc.walk_block(pandoc.Div(el.blocks), wordcount)
  mainwords = words - notewords
  notenoquotewords = notewords - quotewords
  noquotewords = words - quotewords
  print(words .. " total words")
  print(mainwords .. " words in main text")
  print(notewords .. " words in notes")
  print(noquotewords .. " total words minus original quotes")
  print(quotewords .. " words in original quotes")
  print(notenoquotewords .. " words in notes minus original quotes")
  os.exit(0)
end
```
`test.md`, a minimal working example:
```
Suspendisse malesuada venenatis mauris. Curabitur ornare mollis velit. Sed
vitae metus.

"Morbi posuere mi id odio."[^1]

[^1]: Citation. ("Original quotation here.")
```
`pandoc --lua-filter wordcount.lua test.md`
> 20 total words
> 16 words in main text
> 4 words in notes
> 20 total words minus original quotes
> 0 words in original quotes
> 4 words in notes minus original quotes
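
A possible explanation for the zero, offered as a sketch rather than a confirmed answer: pandoc passes an inline HTML opening tag and its closing tag to the filter as two separate `RawInline` elements, and the quoted words are siblings that sit between them, not children of either one. Walking into a single `RawInline` therefore finds no `Str` elements to count. The plain-Lua sketch below counts the words between the two markers in a flat list of inlines. The tag name `<oq>`, the function `count_quoted`, and the hand-built `sample` list are all hypothetical stand-ins, and the list is deliberately simplified: real pandoc output can nest elements (smart quotes, for instance, become a `Quoted` element), which this flat loop would not descend into.

```lua
-- Hypothetical marker tags; substitute the tag actually used in the markdown.
local OPEN_TAG, CLOSE_TAG = "<oq>", "</oq>"

-- Count the Str elements that sit between the opening and closing
-- RawInline markers in a flat list of inline elements.
local function count_quoted(inlines)
  local inside, count = false, 0
  for _, el in ipairs(inlines) do
    if el.t == "RawInline" and el.text == OPEN_TAG then
      inside = true
    elseif el.t == "RawInline" and el.text == CLOSE_TAG then
      inside = false
    elseif inside and el.t == "Str" and el.text:match("%P") then
      -- same rule as the main filter: skip pure punctuation
      count = count + 1
    end
  end
  return count
end

-- Hand-built, simplified stand-in for a footnote's inlines, written as
-- plain tables so the sketch runs without pandoc.
local sample = {
  { t = "Str", text = "Citation." }, { t = "Space" },
  { t = "Str", text = "(" },
  { t = "RawInline", format = "html", text = OPEN_TAG },
  { t = "Str", text = "Original" }, { t = "Space" },
  { t = "Str", text = "quotation" }, { t = "Space" },
  { t = "Str", text = "here." },
  { t = "RawInline", format = "html", text = CLOSE_TAG },
  { t = "Str", text = ")" },
}

print(count_quoted(sample) .. " words in original quotes")
```

Inside an actual filter, the same loop could presumably be run over the `content` list of a `Para` or `Plain` element, since pandoc elements expose the same `t` and `text` fields used here.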