ntg-context - mailing list for ConTeXt users
* [NTG-context] Tracker for hyphens at the end of lines
@ 2023-08-01 14:54 Keith McKay
  2023-08-01 17:10 ` [NTG-context] " Hans Hagen via ntg-context
  0 siblings, 1 reply; 5+ messages in thread
From: Keith McKay @ 2023-08-01 14:54 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Hi colleagues,

Is there a tracker for highlighting hyphens at the end of lines, similar
to the way underfull and overfull boxes can be displayed with a coloured
bar at the end of the offending line?

I have looked at the wiki page "Reviewing hyphenation" and it has a
solution for MkII from 2009 which, I would think, won't be suitable for
present-day ConTeXt. I have tried searching for hyphens using the Skim and
Adobe Acrobat viewers, but although they find hyphens within a line, they
don't recognise hyphens at the end of lines.

Any help would be appreciated.

Keith McKay

___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / https://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : https://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki     : https://contextgarden.net
___________________________________________________________________________________


* [NTG-context] Re: Tracker for hyphens at the end of lines
  2023-08-01 14:54 [NTG-context] Tracker for hyphens at the end of lines Keith McKay
@ 2023-08-01 17:10 ` Hans Hagen via ntg-context
  2023-08-01 18:22   ` Keith McKay
  0 siblings, 1 reply; 5+ messages in thread
From: Hans Hagen via ntg-context @ 2023-08-01 17:10 UTC (permalink / raw)
  To: ntg-context; +Cc: Hans Hagen

On 8/1/2023 4:54 PM, Keith McKay wrote:
> Hi colleagues,
> 
> Is there a tracker for highlighting hyphens at the end of lines, similar
> to the way underfull and overfull boxes can be displayed with a coloured
> bar at the end of the offending line?
> 
> I have looked at the wiki page "Reviewing hyphenation" and it has a
> solution for MkII from 2009 which, I would think, won't be suitable for
> present-day ConTeXt. I have tried searching for hyphens using the Skim and
> Adobe Acrobat viewers, but although they find hyphens within a line, they
> don't recognise hyphens at the end of lines.
> 
> Any help would be appreciated.
I suppose you would be disappointed if there were no tracker ...

\enabletrackers[hyphenation.applied.console]
\enabletrackers[hyphenation.applied.visualize]

You even get a file with the hyphenated words.

You could see all of them with

\disabledirectives[backend.cleanup.flatten]
\bitwiseflip \normalizelinemode -\flattendiscretionariesnormalizecode
\showmakeup[discretionary]

were it not that I had to provide the directive (to disable flattening)
for this to work well, so for that you have to wait till we update.

Hans


-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------


* [NTG-context] Re: Tracker for hyphens at the end of lines
  2023-08-01 17:10 ` [NTG-context] " Hans Hagen via ntg-context
@ 2023-08-01 18:22   ` Keith McKay
  2023-08-09 10:10     ` denis.maier
  0 siblings, 1 reply; 5+ messages in thread
From: Keith McKay @ 2023-08-01 18:22 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Thanks Hans!

I'm never disappointed, always amazed with ConTeXt!

This is just what I was looking for.

Best Wishes

Keith McKay


* [NTG-context] Re: Tracker for hyphens at the end of lines
  2023-08-01 18:22   ` Keith McKay
@ 2023-08-09 10:10     ` denis.maier
  2023-08-09 12:17       ` Hans Hagen via ntg-context
  0 siblings, 1 reply; 5+ messages in thread
From: denis.maier @ 2023-08-09 10:10 UTC (permalink / raw)
  To: ntg-context; +Cc: mckaymeister

Keith, you can also check hyphenations using a script:

-- check-hyphens.lua
--[[ 
    Analyze hyphenations based on a ConTeXt log file.
    Enable hyphenation tracking in the ConTeXt file with
    \enabletrackers[hyphenation.applied]
    then run this script with
    lua check-hyphens.lua input_file whitelist.ext
    For the input_file we assume .log, so there is no need to add the extension.
    For the whitelist the file extension has to be supplied.
    The whitelist is optional.
]] 

-- local lines = string.splitlines(io.loaddata("oeps.tex")or "") or { }

-- local pprint = require('pprint')

function main (input_file, whitelist_file)
    local lines = lines_from(input_file .. ".log")
    local whitelist = {}
    if whitelist_file == nil then
        whitelist = {}
    else 
        whitelist = lines_from(whitelist_file)
    end
    --pprint (lines)
    --pprint (whitelist)
    local filteredWordlist = filterHyphenationsWordlist
                (cleanLines
                    (getHyphenationLines(lines)), 
                    whitelist)
    -- pprint(filteredWordlist)
    saveResultsToFile(filteredWordlist, 'check-hyphens.log')
end

-- see if the file exists
-- (see http://lua-users.org/wiki/FileInputOutput)
function file_exists(file)
    local f = io.open(file, "rb")
    if f then f:close() end
    return f ~= nil
end
  
-- get all lines from a file, returns an empty 
-- list/table if the file does not exist
function lines_from(file)
    if not file_exists(file) then return {} end
    local lines = {}
    for line in io.lines(file) do 
        lines[#lines + 1] = line
    end
    return lines
end

-- String testing
function starts_with(str, start)
    return str:sub(1, #start) == start
end

-- get relevant lines
function getHyphenationLines(lines)
    local lines_with_hyphenations = {}
    for k,v in pairs(lines) do
        if 
            (starts_with(v, "hyphenated") 
            and not string.find(v, "start hyphenated words") 
            and not string.find(v, "stop hyphenated words"))
        then table.insert(lines_with_hyphenations, v) end
    end
    return lines_with_hyphenations
end

-- String cleaning
-- wrapper functions

function cleanLines (xs)
    local cleanedLines = {}
    for k,v in pairs(xs) do
        table.insert(cleanedLines, cleanLine(v))
    end
    return cleanedLines
end

function cleanLine (x)
    return removeTrailingPunctuation(getWord(x))
end

-- 1. Extract the word, which follows the colon in the log line
function getWord(x)
    -- in practice we simply read from character 26 onwards
    return string.sub(x,26)
end

-- 2. Remove trailing punctuation (a comma at the end of the word)
function removeTrailingPunctuation (x)
    -- only strip the last character if it really is a comma
    if x:sub(-1) == ',' then
        return x:sub(1, -2)
    else
        return x
    end
end

-- test if word is in second list
function inList (x, list)
    for k,v in ipairs(list) do
        if v == x then
            return true
        end
    end
    return nil
end

-- Filter hyphenated words based on second list (whitelist)
function filterHyphenationsWordlist (xs, list)
    local result = {}
    for k,v in ipairs(xs) do
        if not inList(v, list) then table.insert (result, v) end
    end
    return result
end

function saveResultsToFile(results, output_file)
    -- open the requested output file in write mode
    local f = assert(io.open(output_file, "w"))
    -- write one hyphenated word per line
    for k,v in ipairs(results) do
        f:write(v, '\n')
    end
    -- close the file
    f:close()
end

-- Run
main(arg[1], arg[2])
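
Reading from a fixed column works only as long as the log layout stays exactly the same. A minimal sketch of a pattern-based alternative follows. The helper name parse_hyphenation_line is hypothetical, and the line shape in the comment is an assumption inferred from the fixed offset of 26 used above, not a documented format.

```lua
-- Hypothetical helper: pull the count and the word out of one tracker line.
-- Assumed line shape: "hyphenated      >    1 : dis-tinguish"
local function parse_hyphenation_line(line)
    -- digits, optional spaces, a colon, optional spaces, then the word
    local count, word = line:match("(%d+)%s*:%s*(%S+)")
    if not count then
        return nil
    end
    return tonumber(count), word
end

print(parse_hyphenation_line("hyphenated      >    1 : dis-tinguish"))
print(parse_hyphenation_line("hyphenated      > start hyphenated words"))
```

Lines without a "number : word" part, such as the start and stop markers, yield nil, so no separate filtering step is needed for them.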


* [NTG-context] Re: Tracker for hyphens at the end of lines
  2023-08-09 10:10     ` denis.maier
@ 2023-08-09 12:17       ` Hans Hagen via ntg-context
  0 siblings, 0 replies; 5+ messages in thread
From: Hans Hagen via ntg-context @ 2023-08-09 12:17 UTC (permalink / raw)
  To: denis.maier, mailing list for ConTeXt users; +Cc: Hans Hagen

On 8/9/2023 12:10 PM, denis.maier@unibe.ch wrote:
> Keith, you can also check hyphenations using a script:
> 
> [...]

OK, a little Lua lesson, if you don't mind.

---- xxx.tex ----

\enabletrackers[hyphenation.applied]

\starttext
     \input tufte
\stoptext

---- xxx.tmp ----

re-fine

---- xxx.lua ----

local function check(logname,whitename)
     if not logname then
         return
     end
     local data = io.loaddata(logname) or ""
     if data == "" then
         return
     end
     local blob  = string.match(data,"start hyphenated words(.-)stop hyphenated words")
     if not blob then
         return
     end
     local white = table.tohash(string.splitlines(whitename and io.loaddata(whitename) or ""))
     for n, s in string.gmatch(blob,"(%d+) *: (%S+)") do
         if white[s] then
             -- we're good
         else
             print(n,s)
         end
     end
end

check(environment.files[1],environment.files[2])

-- print("TEST 1")
-- check("xxx.log")
-- print("TEST 2")
-- check("xxx.log","xxx.tmp")

-------------------

 >mtxrun --script xxx xxx.log
1       dis-tinguish
1       harmo-nize
1       re-fine

 >mtxrun --script xxx xxx.log xxx.tmp
1       dis-tinguish
1       harmo-nize

That said, I wonder if we should add the filename, just in case one
includes 20 files; a whitelist could also be an option to the tracker.

Now the good news is that the tracker is actually already a bit more 
clever. After a run you will see

   xxx-hyphenation-new.lua

that has the hyphenated words (not the numbers)

and you can make a whitelist

   xxx-hyphenation-old.lua

in which case you only get the new ones.

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

