* Skinny-list detector script: 'list-columns.hs'
@ 2021-05-22 18:18 Gwern Branwen
0 siblings, 0 replies; only message in thread
From: Gwern Branwen @ 2021-05-22 18:18 UTC (permalink / raw)
To: pandoc-discuss
On Gwern.net, I use multiple column layouts for some lists like
https://www.gwern.net/DNM-archives#overall-coverage which are very
'skinny': thin + tall. They would take up an inordinate amount of
vertical space if simply left as normal 1-column lists, but there is
also no semantically-sensible way to collapse them.
To detect opportunities for using multiple-column layouts, I have
written a Pandoc API script which looks for lists of a certain minimum
length whose sub-list entries are no more than certain lengths when
rendered as plain text (to avoid penalizing elements like links, which
may be long to write but look much smaller):
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
-- dependencies: libghc-pandoc-dev
-- usage: 'lost-columns.hs [file]'; reads a Pandoc Markdown file
and looks for 'skinny tall' lists which are better rendered
-- as multiple columns (supported on gwern.net by special CSS
triggered by '<div class="columns"></div>' wrappers)
-- A skinny tall list is defined as a list which is at least 8
items long (so you get at least 2×4 columns—a 2×2 square or 2×3
rectangle looks dumb),
-- and where the individual lines are all <75 characters wide
(>half the width of a gwern.net line at the utmost).
module Main where
import Text.Pandoc (def, nullMeta, queryWith, readerExtensions,
readMarkdown, runPure,
pandocExtensions, writePlain,
Block(BulletList, OrderedList), Pandoc(Pandoc))
import qualified Data.Text as T (length, unlines, Text)
import qualified Data.Text.IO as TIO (readFile, putStrLn)
import System.Environment (getArgs)
import Control.Monad (when, unless)
-- | Map over the filenames
main :: IO ()
main = do
fs <- getArgs
let printfilenamep = head fs == "--print-filenames"
let fs' = if printfilenamep then Prelude.drop 1 fs else fs
mapM_ (printLists printfilenamep) fs'
printLists :: Bool -> FilePath -> IO ()
printLists printfilenamep file = do
input <- TIO.readFile file
let long = getLongLists input
unless (null long) $ do
when printfilenamep $ putStrLn $ file ++ ":"
TIO.putStrLn $ T.unlines $ map simplified long
listLengthMax, sublistsLengthMin :: Int
listLengthMax = 75
sublistsLengthMin = 8
getLongLists :: T.Text -> [Block]
getLongLists txt = let parsedEither = runPure $ readMarkdown
def{readerExtensions = pandocExtensions } txt
-- if we don't explicitly enable
footnotes, Pandoc interprets the footnotes as broken links, which
throws many spurious warnings to stdout
in case parsedEither of
Left _ -> []
Right pnd -> let lists = extractLists pnd in
filter (\x ->
listLength x < listLengthMax) lists
extractLists :: Pandoc -> [Block]
extractLists = queryWith extractList
where
extractList :: Block -> [Block]
extractList l@(OrderedList _ _) = [l]
extractList l@(BulletList _) = [l]
extractList _ = []
-- > listLength $ BulletList [[Para [Str "test"]],[Para [Str
"test2"],Para [Str "Continuation"]],[Para [Link ("",[],[]) [Str "WP"]
("https://en.wikipedia.org/wiki/Foo","")]],[Para [Str
"Final",Space,Str "line"]]]
-- → 7
listLength :: Block -> Int
listLength (OrderedList _ list) = listLengthAvg list
listLength (BulletList list) = listLengthAvg list
listLength _ = maxBound
listLengthAvg :: [[Block]] -> Int
listLengthAvg list = if length list < sublistsLengthMin then maxBound else
let lengths = map listItemLength list in
maximum lengths
-- > listItemLength $ [Para [Str "Foo", Link nullAttr [Str "bar"]
("https://en.wikipedia.org/wiki/Bar", "Wikipedia link")], Para [Str
"Continued Line"]]
-- → 15
-- > listItemLength $ [Para [Str "Foo", Link nullAttr [Str "bar"]
("https://en.wikipedia.org/wiki/Bar", "Wikipedia link")]]
-- → 7
-- > listItemLength $ [Para [Str "Continued Line"]]
-- → 15
listItemLength :: [Block] -> Int
listItemLength is = let lengths = map listSubItemLength is in
maximum lengths
-- > listSubItemLength $ Para [Str "Foo"]
-- → 4
-- > listSubItemLength $ Para [Str "Foo", Link nullAttr [Str
"bar"] ("https://en.wikipedia.org/wiki/Bar", "Wikipedia link")]
-- → 7
listSubItemLength :: Block -> Int
listSubItemLength i = T.length $ simplified i
simplified :: Block -> T.Text
simplified i = let md = runPure $ writePlain def (Pandoc nullMeta [i]) in
case md of
Left _ -> error $ "Failed to render: " ++ show md
Right md' -> md'
Could perhaps be improved by doing some sort of max/length monoid over
the AST tree, but Pandoc makes it hard to update trees in place or
ascend systematically or get lengths, so I didn't bother. Any list
which needs such tricks is probably not a good candidate for
multiple-columns anyway.
--
gwern
https://www.gwern.net
--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CAMwO0gyaLj3Bdrs-jo%2BLPaquhD2LKogE63T-N0O1nSPfVyLDWQ%40mail.gmail.com.
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2021-05-22 18:18 UTC | newest]
Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-22 18:18 Skinny-list detector script: 'list-columns.hs' Gwern Branwen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).