zsh-workers
From: Sebastian Gniazdowski <psprint@fastmail.com>
To: zsh-workers@zsh.org
Subject: Re: Callgrind run
Date: Thu, 10 Nov 2016 06:07:04 -0800
Message-ID: <1478786824.2412727.783512001.7853F82E@webmail.messagingengine.com>
In-Reply-To: <20161110123156.1d1699ec@pwslap01u.europe.root.pri>

[-- Attachment #1: Type: text/plain, Size: 1388 bytes --]

On Thu, Nov 10, 2016, at 04:31 AM, Peter Stephenson wrote:
> To do a good job optimising here, we really need state information
> outside the functions --- in an experiment with my start up files, only
> 16% of calls to untokenize() actually had any effect.  But recording the
> state generally is a very big change.
> 
> Some possible optimisations are along the following lines, although a
> bit of care is needed, as it's not the case on all architectures
> that the bit test used by itok() is necessarily faster
> than the range test the following replaces it with.  It did seem faster
> on this fairly standard Intel CPU.

I tested this and saw no big change in running time: maybe 14 ms
(2135 ms vs 2149 ms), which could just be measurement noise. However,
callgrind reports 851,712,174 instructions for untokenize instead of
1,177,994,701 (about 28% fewer), while the other instruction counts
stay essentially the same, so the test seems valid.
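
For context, the trade-off mentioned in the quoted text (a table-driven
bit test versus a range comparison) can be sketched roughly as below.
This is only an illustration, not the patch under test: the names
type_table, FLAG_TOK, TOK_FIRST and TOK_LAST are made up here, and the
byte range is a placeholder that may not match the real values in zsh.h.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder token byte range (an assumption, not taken from zsh.h). */
enum { TOK_FIRST = 0x84, TOK_LAST = 0xa2 };
#define FLAG_TOK 0x01   /* hypothetical "is a token" bit in the type table */

static unsigned char type_table[256];

/* Fill the lookup table once, marking every byte in the token range. */
static void init_type_table(void)
{
    for (int c = 0; c < 256; c++)
        type_table[c] = (c >= TOK_FIRST && c <= TOK_LAST) ? FLAG_TOK : 0;
}

/* Table-driven bit test: one memory load plus a mask. */
static int is_token_bit(unsigned char c)
{
    return (type_table[c] & FLAG_TOK) != 0;
}

/* Range test: one subtraction and one unsigned comparison, no table load. */
static int is_token_range(unsigned char c)
{
    return (unsigned)(c - TOK_FIRST) <= (unsigned)(TOK_LAST - TOK_FIRST);
}

/* Count token bytes in a buffer using whichever predicate is passed in. */
static size_t count_tokens(const unsigned char *s, size_t len,
                           int (*pred)(unsigned char))
{
    size_t n = 0;
    assert(s != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        n += (size_t)pred(s[i]);
    return n;
}
```

Both predicates give the same answer for every byte once
init_type_table() has run; which one is cheaper depends on the CPU and
on whether the table stays in cache, which is the caveat raised in the
quoted message.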

My motivation is parsing long Zsh code. It would be great to iterate
over long (z)-split input in, say, 400 ms instead of 2 seconds. That is
a dream result and may actually be impossible, since disabling
multibyte yields 1560 ms. State recording might seem like a bad option,
but at least the room for improvement is apparently concentrated in a
few places, which is better than counting cycles across the whole Zsh
code base.
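
To make the "state recording" idea concrete, here is a minimal sketch
of what it could look like: the string carries a flag saying whether it
may contain token bytes, so an untokenize-style pass can return
immediately when the flag is clear. Everything here is hypothetical
(struct tracked_str, the flag, the byte range, and the '?' replacement
are stand-ins); it is not how zsh stores strings today.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder token byte range (an assumption, not taken from zsh.h). */
enum { TOK_FIRST = 0x84, TOK_LAST = 0xa2 };

/* Hypothetical string wrapper that remembers whether tokens may be present. */
struct tracked_str {
    unsigned char *buf;
    size_t len;
    int may_have_tokens;   /* recorded state: 0 means known token-free */
};

static int is_token(unsigned char c)
{
    return c >= TOK_FIRST && c <= TOK_LAST;
}

/* Record the state once.  In a real design the producer (e.g. the lexer)
 * would set the flag as it emits tokens, instead of scanning here. */
static void tracked_init(struct tracked_str *s, unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    s->buf = buf;
    s->len = len;
    s->may_have_tokens = 0;
    for (size_t i = 0; i < len; i++) {
        if (is_token(buf[i])) {
            s->may_have_tokens = 1;
            break;
        }
    }
}

/* Replace token bytes with '?' as a stand-in for the real mapping.
 * Returns 1 if the buffer had to be walked, 0 if the call was skipped. */
static int untokenize_tracked(struct tracked_str *s)
{
    if (!s->may_have_tokens)
        return 0;               /* fast path: nothing to do */
    for (size_t i = 0; i < s->len; i++)
        if (is_token(s->buf[i]))
            s->buf[i] = '?';
    s->may_have_tokens = 0;     /* the string is now known to be clean */
    return 1;
}
```

With only about 16% of untokenize() calls having any effect (the figure
from the quoted message), most calls would take the early return. The
cost is keeping the flag correct everywhere the string is modified,
which is why this is a much bigger change than tuning one loop.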

-- 
  Sebastian Gniazdowski
  psprint@fastmail.com

[-- Attachment #2: callgrind_annotate3.txt --]
[-- Type: text/plain, Size: 7316 bytes --]

--------------------------------------------------------------------------------
Profile data file 'callgrind.out.37869' (creator: callgrind-3.12.0)
--------------------------------------------------------------------------------
I1 cache: 
D1 cache: 
LL cache: 
Timerange: Basic block 0 - 3061023589
Trigger: Program termination
Profiled target:  zsh-ps-debug-opt -f -c source "./testparse.zsh" "./to-parse.zsh" "changes.out" "" (PID 37869, part 1)
Events recorded:  Ir
Events shown:     Ir
Event sort order: Ir
Thresholds:       99
Include dirs:     
User annotated:   
Auto-annotation:  off

--------------------------------------------------------------------------------
            Ir 
--------------------------------------------------------------------------------
16,408,775,049  PROGRAM TOTALS

--------------------------------------------------------------------------------
           Ir  file:function
--------------------------------------------------------------------------------
2,269,560,047  ???:mb_metacharlenconv_r [/usr/local/bin/zsh-ps-debug-opt]
1,697,840,717  ???:remnulargs [/usr/local/bin/zsh-ps-debug-opt]
1,677,804,272  ???:_UTF8_mbrtowc [/usr/lib/system/libsystem_c.dylib]
1,425,973,736  ???:mbrtowc [/usr/lib/system/libsystem_c.dylib]
1,048,181,974  ???:mb_metacharlenconv [/usr/local/bin/zsh-ps-debug-opt]
1,036,055,574  ???:getindex'2 [/usr/local/bin/zsh-ps-debug-opt]
  851,712,174  ???:untokenize [/usr/local/bin/zsh-ps-debug-opt]
  793,202,632  ???:haswilds [/usr/local/bin/zsh-ps-debug-opt]
  578,630,988  ???:mb_metastrlenend [/usr/local/bin/zsh-ps-debug-opt]
  482,828,373  ???:szone_free_definite_size [/usr/lib/system/libsystem_malloc.dylib]
  436,411,797  ???:ztrsub [/usr/local/bin/zsh-ps-debug-opt]
  363,212,196  ???:tiny_malloc_from_free_list [/usr/lib/system/libsystem_malloc.dylib]
  353,826,375  ???:pattrylen'2 [/usr/local/bin/zsh-ps-debug-opt]
  282,357,130  ???:tiny_free_list_add_ptr [/usr/lib/system/libsystem_malloc.dylib]
  258,502,798  ???:strlen [/usr/lib/dyld]
  234,273,918  ???:pattrylen [/usr/local/bin/zsh-ps-debug-opt]
  209,831,892  ???:szone_size [/usr/lib/system/libsystem_malloc.dylib]
  193,951,431  ???:tiny_free_list_remove_ptr [/usr/lib/system/libsystem_malloc.dylib]
  169,581,080  ???:szone_malloc_should_clear [/usr/lib/system/libsystem_malloc.dylib]
  143,108,999  ???:_platform_memmove$VARIANT$Nehalem [/usr/lib/system/libsystem_platform.dylib]
   97,432,800  ???:free [/usr/lib/system/libsystem_malloc.dylib]
   97,335,179  ???:itype_end [/usr/local/bin/zsh-ps-debug-opt]
   95,268,036  ???:get_tiny_free_size [/usr/lib/system/libsystem_malloc.dylib]
   83,934,500  ???:pthread_getspecific [/usr/lib/system/libsystem_pthread.dylib]
   81,015,036  ???:filesub [/usr/local/bin/zsh-ps-debug-opt]
   68,739,019  ???:__strcpy_chk [/usr/lib/system/libsystem_c.dylib]
   60,928,000  ???:malloc_zone_malloc [/usr/lib/system/libsystem_malloc.dylib]
   57,698,433  ???:zalloc [/usr/local/bin/zsh-ps-debug-opt]
   55,196,334  ???:bin_log [/usr/local/bin/zsh-ps-debug-opt]
   54,517,153  ???:stpcpy [/usr/lib/system/libsystem_c.dylib]
   51,545,105  ???:setarrvalue [/usr/local/bin/zsh-ps-debug-opt]
   49,058,372  ???:get_tiny_previous_free_msize [/usr/lib/system/libsystem_malloc.dylib]
   48,122,383  ???:ztrdup [/usr/local/bin/zsh-ps-debug-opt]
   45,371,076  ???:mathevalarg'2 [/usr/local/bin/zsh-ps-debug-opt]
   44,923,221  ???:arrlen [/usr/local/bin/zsh-ps-debug-opt]
   44,888,797  ???:__vsnprintf_chk [/usr/lib/system/libsystem_c.dylib]
   43,521,421  ???:malloc [/usr/lib/system/libsystem_malloc.dylib]
   33,548,396  ???:__chk_overlap [/usr/lib/system/libsystem_c.dylib]
   33,378,378  ???:execlist'2 [/usr/local/bin/zsh-ps-debug-opt]
   32,027,396  ???:_platform_memset$VARIANT$Merom [/usr/lib/system/libsystem_platform.dylib]
   29,584,698  ???:_platform_strchr$VARIANT$Generic [/usr/lib/system/libsystem_platform.dylib]
   28,788,128  ???:hasher [/usr/local/bin/zsh-ps-debug-opt]
   25,459,319  ???:zhalloc [/usr/local/bin/zsh-ps-debug-opt]
   25,436,057  ???:modify [/usr/local/bin/zsh-ps-debug-opt]
   23,233,085  ???:patcompile'2 [/usr/local/bin/zsh-ps-debug-opt]
   23,114,835  ???:zsfree [/usr/local/bin/zsh-ps-debug-opt]
   21,720,950  ???:_os_lock_spin_lock [/usr/lib/system/libsystem_platform.dylib]
   21,033,364  ???:execrestore'2 [/usr/local/bin/zsh-ps-debug-opt]
   21,029,575  ???:ingetc [/usr/local/bin/zsh-ps-debug-opt]
   20,619,575  ???:freearray [/usr/local/bin/zsh-ps-debug-opt]
   18,246,076  ???:fetchvalue [/usr/local/bin/zsh-ps-debug-opt]
   17,288,068  ???:isascii [/usr/lib/system/libsystem_c.dylib]
   16,274,888  ???:filesub'2 [/usr/local/bin/zsh-ps-debug-opt]
   15,162,279  ???:haswilds'2 [/usr/local/bin/zsh-ps-debug-opt]
   12,590,562  ???:parsestrnoerr [/usr/local/bin/zsh-ps-debug-opt]
   10,881,555  ???:szone_malloc [/usr/lib/system/libsystem_malloc.dylib]
   10,206,971  ???:zstrtol_underscore [/usr/local/bin/zsh-ps-debug-opt]
    9,997,820  ???:_pthread_mutex_unlock_slow [/usr/lib/system/libsystem_pthread.dylib]
    9,639,594  ???:_platform_strcmp [/usr/lib/system/libsystem_platform.dylib]
    9,404,226  ???:modify'2 [/usr/local/bin/zsh-ps-debug-opt]
    8,688,580  ???:os_lock_unlock [/usr/lib/system/libsystem_platform.dylib]
    8,688,580  ???:os_lock_lock [/usr/lib/system/libsystem_platform.dylib]
    8,688,380  ???:_os_lock_spin_unlock [/usr/lib/system/libsystem_platform.dylib]
    8,497,692  ???:op [/usr/local/bin/zsh-ps-debug-opt]
    8,390,890  ???:prefork [/usr/local/bin/zsh-ps-debug-opt]
    8,223,809  ???:patcompstart [/usr/local/bin/zsh-ps-debug-opt]
    7,963,097  ???:gethashnode2 [/usr/local/bin/zsh-ps-debug-opt]
    7,766,705  ???:scanmatchtable [/usr/local/bin/zsh-ps-debug-opt]
    7,013,693  ???:parsestrnoerr'2 [/usr/local/bin/zsh-ps-debug-opt]
    6,917,568  ???:op'2 [/usr/local/bin/zsh-ps-debug-opt]
    6,909,521  ???:getindex [/usr/local/bin/zsh-ps-debug-opt]
    6,827,386  ???:_pthread_mutex_lock_slow [/usr/lib/system/libsystem_pthread.dylib]
    6,691,178  ???:hasbraces [/usr/local/bin/zsh-ps-debug-opt]
    6,601,957  ???:mathevalarg [/usr/local/bin/zsh-ps-debug-opt]
    6,523,105  ???:get_node_from_uniquing_table [/usr/lib/system/libsystem_malloc.dylib]
    6,465,840  ???:ecgetstr [/usr/local/bin/zsh-ps-debug-opt]
    6,193,899  ???:getstrvalue [/usr/local/bin/zsh-ps-debug-opt]
    6,178,975  ???:matheval'2 [/usr/local/bin/zsh-ps-debug-opt]
    6,012,923  ???:patcompile [/usr/local/bin/zsh-ps-debug-opt]
    5,721,631  ???:ImageLoaderMachOCompressed::trieWalk(unsigned char const*, unsigned char const*, char const*) [/usr/lib/dyld]
    5,084,013  ???:add [/usr/local/bin/zsh-ps-debug-opt]
    4,864,913  ???:__vsnprintf_chk'2 [/usr/lib/system/libsystem_c.dylib]
    4,728,297  ???:dupstring [/usr/local/bin/zsh-ps-debug-opt]
    4,331,738  ???:fetchvalue'2 [/usr/local/bin/zsh-ps-debug-opt]
    4,212,315  ???:newparamtable [/usr/local/bin/zsh-ps-debug-opt]
    4,034,794  ???:__vfprintf [/usr/lib/system/libsystem_c.dylib]
    3,977,804  ???:pattryrefs [/usr/local/bin/zsh-ps-debug-opt]
    3,604,588  ???:assignstrvalue [/usr/local/bin/zsh-ps-debug-opt]
    3,520,423  ???:matheval [/usr/local/bin/zsh-ps-debug-opt]
    3,518,560  ???:mb_charinit [/usr/local/bin/zsh-ps-debug-opt]
    3,487,929  ???:freeheap [/usr/local/bin/zsh-ps-debug-opt]


[-- Attachment #3: callgrind_annotate.txt --]
[-- Type: text/plain, Size: 7036 bytes --]

--------------------------------------------------------------------------------
Profile data file 'callgrind.out.11879' (creator: callgrind-3.12.0)
--------------------------------------------------------------------------------
I1 cache: 
D1 cache: 
LL cache: 
Timerange: Basic block 0 - 2995164135
Trigger: Program termination
Profiled target:  zsh-debug-opt -f -c source "./testparse.zsh" "./to-parse.zsh" "changes.out" "" (PID 11879, part 1)
Events recorded:  Ir
Events shown:     Ir
Event sort order: Ir
Thresholds:       99
Include dirs:     
User annotated:   
Auto-annotation:  off

--------------------------------------------------------------------------------
            Ir 
--------------------------------------------------------------------------------
16,735,388,538  PROGRAM TOTALS

--------------------------------------------------------------------------------
           Ir  file:function
--------------------------------------------------------------------------------
2,269,560,047  ???:mb_metacharlenconv_r [/usr/local/bin/zsh-debug-opt]
1,698,947,505  ???:remnulargs [/usr/local/bin/zsh-debug-opt]
1,677,804,272  ???:_UTF8_mbrtowc [/usr/lib/system/libsystem_c.dylib]
1,425,973,736  ???:mbrtowc [/usr/lib/system/libsystem_c.dylib]
1,177,994,701  ???:untokenize [/usr/local/bin/zsh-debug-opt]
1,048,181,974  ???:mb_metacharlenconv [/usr/local/bin/zsh-debug-opt]
1,036,055,574  ???:getindex'2 [/usr/local/bin/zsh-debug-opt]
  793,202,632  ???:haswilds [/usr/local/bin/zsh-debug-opt]
  578,630,988  ???:mb_metastrlenend [/usr/local/bin/zsh-debug-opt]
  483,051,992  ???:szone_free_definite_size [/usr/lib/system/libsystem_malloc.dylib]
  436,411,797  ???:ztrsub [/usr/local/bin/zsh-debug-opt]
  364,444,476  ???:tiny_malloc_from_free_list [/usr/lib/system/libsystem_malloc.dylib]
  353,826,375  ???:pattrylen'2 [/usr/local/bin/zsh-debug-opt]
  280,090,072  ???:tiny_free_list_add_ptr [/usr/lib/system/libsystem_malloc.dylib]
  258,502,596  ???:strlen [/usr/lib/dyld]
  234,273,918  ???:pattrylen [/usr/local/bin/zsh-debug-opt]
  209,835,520  ???:szone_size [/usr/lib/system/libsystem_malloc.dylib]
  193,985,837  ???:tiny_free_list_remove_ptr [/usr/lib/system/libsystem_malloc.dylib]
  169,580,182  ???:szone_malloc_should_clear [/usr/lib/system/libsystem_malloc.dylib]
  143,109,122  ???:_platform_memmove$VARIANT$Nehalem [/usr/lib/system/libsystem_platform.dylib]
   97,432,800  ???:free [/usr/lib/dyld]
   97,335,179  ???:itype_end [/usr/local/bin/zsh-debug-opt]
   95,353,820  ???:get_tiny_free_size [/usr/lib/system/libsystem_malloc.dylib]
   83,934,500  ???:pthread_getspecific [/usr/lib/system/libsystem_pthread.dylib]
   81,015,036  ???:filesub [/usr/local/bin/zsh-debug-opt]
   68,738,845  ???:__strcpy_chk [/usr/lib/system/libsystem_c.dylib]
   60,927,832  ???:malloc_zone_malloc [/usr/lib/system/libsystem_malloc.dylib]
   57,698,352  ???:zalloc [/usr/local/bin/zsh-debug-opt]
   55,196,289  ???:bin_log [/usr/local/bin/zsh-debug-opt]
   54,517,015  ???:stpcpy [/usr/lib/system/libsystem_c.dylib]
   51,545,105  ???:setarrvalue [/usr/local/bin/zsh-debug-opt]
   49,052,650  ???:get_tiny_previous_free_msize [/usr/lib/system/libsystem_malloc.dylib]
   48,122,314  ???:ztrdup [/usr/local/bin/zsh-debug-opt]
   45,371,076  ???:mathevalarg'2 [/usr/local/bin/zsh-debug-opt]
   44,923,221  ???:arrlen [/usr/local/bin/zsh-debug-opt]
   44,888,769  ???:__vsnprintf_chk [/usr/lib/system/libsystem_c.dylib]
   43,521,301  ???:malloc [/usr/lib/dyld]
   33,548,312  ???:__chk_overlap [/usr/lib/system/libsystem_c.dylib]
   33,378,378  ???:execlist'2 [/usr/local/bin/zsh-debug-opt]
   32,027,315  ???:_platform_memset$VARIANT$Merom [/usr/lib/system/libsystem_platform.dylib]
   29,584,698  ???:_platform_strchr$VARIANT$Generic [/usr/lib/system/libsystem_platform.dylib]
   28,786,904  ???:hasher [/usr/local/bin/zsh-debug-opt]
   25,459,319  ???:zhalloc [/usr/local/bin/zsh-debug-opt]
   25,436,057  ???:modify [/usr/local/bin/zsh-debug-opt]
   23,233,085  ???:patcompile'2 [/usr/local/bin/zsh-debug-opt]
   23,114,835  ???:zsfree [/usr/local/bin/zsh-debug-opt]
   21,720,915  ???:_os_lock_spin_lock [/usr/lib/system/libsystem_platform.dylib]
   21,033,364  ???:execrestore'2 [/usr/local/bin/zsh-debug-opt]
   21,029,416  ???:ingetc [/usr/local/bin/zsh-debug-opt]
   20,619,575  ???:freearray [/usr/local/bin/zsh-debug-opt]
   18,246,076  ???:fetchvalue [/usr/local/bin/zsh-debug-opt]
   17,288,068  ???:isascii [/usr/lib/system/libsystem_c.dylib]
   16,274,888  ???:filesub'2 [/usr/local/bin/zsh-debug-opt]
   15,162,279  ???:haswilds'2 [/usr/local/bin/zsh-debug-opt]
   12,590,562  ???:parsestrnoerr [/usr/local/bin/zsh-debug-opt]
   10,881,530  ???:szone_malloc [/usr/lib/system/libsystem_malloc.dylib]
   10,206,971  ???:zstrtol_underscore [/usr/local/bin/zsh-debug-opt]
    9,997,508  ???:_pthread_mutex_unlock_slow [/usr/lib/system/libsystem_pthread.dylib]
    9,639,468  ???:_platform_strcmp [/usr/lib/system/libsystem_platform.dylib]
    9,404,226  ???:modify'2 [/usr/local/bin/zsh-debug-opt]
    8,688,566  ???:os_lock_unlock [/usr/lib/system/libsystem_platform.dylib]
    8,688,566  ???:os_lock_lock [/usr/lib/system/libsystem_platform.dylib]
    8,688,366  ???:_os_lock_spin_unlock [/usr/lib/system/libsystem_platform.dylib]
    8,497,692  ???:op [/usr/local/bin/zsh-debug-opt]
    8,390,890  ???:prefork [/usr/local/bin/zsh-debug-opt]
    8,223,809  ???:patcompstart [/usr/local/bin/zsh-debug-opt]
    7,962,974  ???:gethashnode2 [/usr/local/bin/zsh-debug-opt]
    7,766,705  ???:scanmatchtable [/usr/local/bin/zsh-debug-opt]
    7,013,693  ???:parsestrnoerr'2 [/usr/local/bin/zsh-debug-opt]
    6,917,568  ???:op'2 [/usr/local/bin/zsh-debug-opt]
    6,909,521  ???:getindex [/usr/local/bin/zsh-debug-opt]
    6,827,173  ???:_pthread_mutex_lock_slow [/usr/lib/system/libsystem_pthread.dylib]
    6,691,178  ???:hasbraces [/usr/local/bin/zsh-debug-opt]
    6,601,957  ???:mathevalarg [/usr/local/bin/zsh-debug-opt]
    6,523,091  ???:get_node_from_uniquing_table [/usr/lib/system/libsystem_malloc.dylib]
    6,465,840  ???:ecgetstr [/usr/local/bin/zsh-debug-opt]
    6,193,899  ???:getstrvalue [/usr/local/bin/zsh-debug-opt]
    6,178,975  ???:matheval'2 [/usr/local/bin/zsh-debug-opt]
    6,012,923  ???:patcompile [/usr/local/bin/zsh-debug-opt]
    5,721,631  ???:ImageLoaderMachOCompressed::trieWalk(unsigned char const*, unsigned char const*, char const*) [/usr/lib/dyld]
    5,083,953  ???:add [/usr/local/bin/zsh-debug-opt]
    4,864,901  ???:__vsnprintf_chk'2 [/usr/lib/system/libsystem_c.dylib]
    4,728,297  ???:dupstring [/usr/local/bin/zsh-debug-opt]
    4,331,738  ???:fetchvalue'2 [/usr/local/bin/zsh-debug-opt]
    4,212,315  ???:newparamtable [/usr/local/bin/zsh-debug-opt]
    4,034,794  ???:__vfprintf [/usr/lib/system/libsystem_c.dylib]
    3,977,804  ???:pattryrefs [/usr/local/bin/zsh-debug-opt]
    3,604,588  ???:assignstrvalue [/usr/local/bin/zsh-debug-opt]
    3,520,423  ???:matheval [/usr/local/bin/zsh-debug-opt]
    3,518,560  ???:mb_charinit [/usr/local/bin/zsh-debug-opt]


Thread overview: 5+ messages
     [not found] <CGME20161110103845epcas3p3e7cabeffae723219daafa8d3e6b32f12@epcas3p3.samsung.com>
2016-11-10 10:37 ` Sebastian Gniazdowski
2016-11-10 12:31   ` Peter Stephenson
2016-11-10 14:07     ` Sebastian Gniazdowski [this message]
2016-11-10 13:47   ` multibyte optimisations Peter Stephenson
2016-11-10 14:57     ` Sebastian Gniazdowski
