* Multithreading support in pluggable module, but with problems
From: Sebastian Gniazdowski @ 2017-02-11 12:58 UTC
To: zsh-workers

Hello,
being able to build modules outside Zsh, I've created one that allows the
following (this works):

    # zpin 'print -rl "a:b" "c:d"' | zpopulator -A my_hash
    # print -r -- ${(kv)my_hash[@]}
    a b c d

This is done entirely in the background. zpin does a fork() and this way
makes Zsh think it has finished -- the main process exits. zpopulator
starts a thread and makes the main thread exit -- this too is sufficient
to make Zsh think that there is no background job. The aim is, e.g., to do:

    zpin "ls -R1 /home/user" | zpopulator -a my_files
    # echo $zpworkers_count
    1
    <later..>
    # echo $zpworkers_count
    0

and keep working in the shell as normal. I also plan to modify my syntax
highlighting so that paths are checked in the background; this avoids
stalls when paths are on an auto-mounted directory.

However, I cannot resolve all the problems:

1. I duplicate the stdin of "zpopulator" (the reading thread):

    /* Duplicate standard input */
    oconf->stream = fdopen( dup( fileno( stdin ) ), "r" );
    /* Prepare standard input replacement */
    oconf->r_devnull = fopen( "/dev/null", "r" );
    /* Replace standard input with /dev/null */
    dup2( fileno( oconf->r_devnull ), STDIN_FILENO );
    fclose( oconf->r_devnull );

I tried with only the first line of code, and without the last (fclose)
line, and the result is always the same: 50% of runs produce the error:

    zpopulator: Warning: could not close input stream: Bad file descriptor

This comes from fclose(). There is an fcntl() check before it:

    flags = fcntl( oconf->stream->_file, F_GETFD );

and it reports:

    zpopulator: Indeed bad descriptor 11: Bad file descriptor

After fclose(), the ->_file field -- the descriptor -- is even turned
into -1. Any idea why the duplicated descriptor, used only in the worker
thread, is invalid? What could Zsh do with the stdin/stdout file
descriptors of commands it runs to cause such problems?

https://github.com/psprint/zpopulator

--
Sebastian Gniazdowski
psprint2@fastmail.com

^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Multithreading support in pluggable module, but with problems
From: Sebastian Gniazdowski @ 2017-02-11 16:43 UTC
To: zsh-workers

An idea explaining what's happening:

- the streams are pipes,
- if the writer (zpin) ends first, the reader's stdin is a "Bad
  descriptor", not to be fclose()-d,
- this would explain the random behavior, the ~50% rate of fclose()
  failures.

That said, I think FILE has internal buffers that need to be freed --
that's clearly visible in the debugger ("p *stream") -- so "just don't
call fclose()" is not an obvious answer. Maybe I shouldn't use stdio on a
pipe created by Zsh at all, and should handle it with the system calls
read(), write(), fcntl() and close()? Or is my thinking wrong, and pipes
created by Zsh can always be fclose()-d on both ends?

The code below tries to create a pipe whose writer exits instantly, and
the output is:

    2, 0, Operation timed out
    Operation timed out

so it looks like the stream is not trashed by "echo a" exiting instantly.
But popen() might behave differently from what Zsh does.

    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <errno.h>
    #include <fcntl.h>

    int main( int argc, char *argv[] ) {
        FILE *my_end;
        my_end = popen( "/bin/echo a", "r" );

        char buf[32];
        int count = fread( buf, 1, 32, my_end );
        sleep( 1 );

        /* fileno() is the portable way to get the fd; ->_file is
           BSD/macOS-specific */
        int flags = fcntl( fileno( my_end ), F_GETFD );
        printf( "%d, %d, %s\n", count, flags, strerror( errno ) );

        /* note: pclose() is the matching call for a popen()ed stream */
        fclose( my_end );
        printf( "%s\n", strerror( errno ) );
        fflush( stdout );
        return 0;
    }

That said, after guarding fclose() with a successful fcntl(), I did 100
runs of a loop spawning 32 concurrent workers, and got no segfault.

That said, in the debugger the execution might hang, with this backtrace:

    (lldb) bt
    * thread #1: tid = 0x25367, 0x00007fff8be4683a libsystem_kernel.dylib`close + 10, stop reason = signal SIGSTOP
      * frame #0: 0x00007fff8be4683a libsystem_kernel.dylib`close + 10
        frame #1: 0x00000001000b48ea zsh-5.3.1-dev-0`zclose(fd=12) + 266 at utils.c:2107
        frame #2: 0x00000001000b4ae6 zsh-5.3.1-dev-0`redup(x=12, y=0) + 246 at utils.c:2026
        frame #3: 0x000000010002e7fc zsh-5.3.1-dev-0`fixfds(save=0x00007fff5fbfe4f0) + 76 at exec.c:4137
        frame #4: 0x000000010002da2b zsh-5.3.1-dev-0`execcmd_exec(state=0x00007fff5fbff600, eparams=0x00007fff5fbfe600, input=13, output=0, how=18, last1=2) + 20139 at exec.c:3976
        frame #5: 0x0000000100028599 zsh-5.3.1-dev-0`execpline2(state=0x00007fff5fbff600, pcode=387, how=18, input=13, output=0, last1=0) + 441 at exec.c:1861

--
Sebastian Gniazdowski
psprint2@fastmail.com
* Re: Multithreading support in pluggable module, but with problems
From: Bart Schaefer @ 2017-02-11 19:46 UTC
To: zsh-workers

On Feb 11,  4:58am, Sebastian Gniazdowski wrote:
} Subject: Multithreading support in pluggable module, but with problems
}
} # zpin 'print -rl "a:b" "c:d"' | zpopulator -A my_hash
}
} This is done entirely in background. zpin does fork() and this way makes
} Zsh think it's finished - main process exits. zpopulator starts a thread
} and makes main thread exit - this is also sufficient to make Zsh think
} that there is no background job.

So if I understand this correctly, you're attempting to use POSIX threads
to share the my_hash variable between the main shell and a 2nd thread of
which the main shell is not aware, so that the 2nd thread can stuff a
value into my_hash while the main thread goes on about its business?

I can almost promise you there is no way to make that work, at least not
in a crash-proof manner. Even making my_hash PM_READONLY -- which is I
presume what yesterday's question was about -- what is there to stop
the thread-unaware main shell from accessing it at a time when the 2nd
thread is changing it in some way that will send the main thread off
into uncharted memory?

Aside from that, you're running directly afoul of the prescribed behavior
of file descriptor management for shells, which amounts to a prohibition
on passing unexpected file descriptors down to any forked children. If
you want to keep a descriptor around, you have to manipulate it with the
routines in Src/utils.c that manage the fdtable array.

In particular you *might* need to call either movefd() or redup() rather
than call dup2() directly, and you should otherwise be using
addmodulefd() to protect the fd from exec.c:closeallelse(). Examples in
Src/Modules/tcp.c.
* Re: Multithreading support in pluggable module, but with problems
From: Sebastian Gniazdowski @ 2017-02-11 20:14 UTC
To: zsh-workers; +Cc: Bart Schaefer

On Sat, Feb 11, 2017, at 11:46 AM, Bart Schaefer wrote:
> So if I understand this correctly, you're attempting to use POSIX threads
> to share the my_hash variable between the main shell and a 2nd thread of
> which the main shell is not aware, so that the 2nd thread can stuff a
> value into my_hash while the main thread goes on about its business?

Yes. The 2nd thread has its forked pipe data provider -- the "zpin"
command, which does an eval. It can eval freely because it is forked: Zsh
forks the first components of a pipeline, while the last one isn't forked.
That's the tight window this solution fits into.

> I can almost promise you there is no way to make that work, at least not
> in a crash-proof manner. Even making my_hash PM_READONLY -- which is I
> presume what yesterday's question was about -- what is there to stop
> the thread-unaware main shell from accessing it at a time when the 2nd
> thread is changing it in some way that will send the main thread off
> into uncharted memory?

I know it might seem dirty, but what if it's done rock-solid? Also, what
is interesting to me might be repelling to someone outside -- I think I
can tell that from my own reactions to the creativity of others, to
plugins created by others; sadly, I'm miles away from trying them. But
maybe they're just not interesting, that's possible.

I think I have the code close to a working state. It can consecutively
spawn 32 threads one hundred times, using functions copied from Zsh with
queue_signals() etc. removed. What stops conflicts are helper variables.
I've casually added two of them, both readonly:

- $workers_count -- holds the number of active threads; the user is to
  wait for 0,
- $worker_finished -- an array of 32 ones; if a thread is active, its
  slot holds "0".

If in syntax highlighting I were to spawn "check_path()" with worker
number "X" and output to variable "output_X", I would check for
$worker_finished[X] to become "1" in the main loop one or a few times,
join the thread (TODO), and use the variable. This might seem complex,
and maybe it is -- I will verify.

> In particular you *might* need to call either movefd() or redup()
> rather than call dup2() directly, and you should otherwise be using
> addmodulefd() to protect the fd from exec.c:closeallelse(). Examples
> in Src/Modules/tcp.c.

Good to know, thanks. I might reserve 32 file descriptors.

I also think I found some very real proof. In my earlier message I wrote
that I guarded fclose() with an fcntl() check of whether the FD is valid:

    if ( -1 != fcntl( fileno( oconf->stream ), F_GETFD ) ) {
        if ( 0 != fclose( oconf->stream ) ) {

This made almost all FD problems vanish -- literally 99.5% of them. Once
in a while fclose() would still fail, though. I thought: "It's a race
condition; between fcntl() and fclose() something closes the FD". So I
modified it:

    if ( -1 != fcntl( fileno( oconf->stream ), F_GETFD ) ) {
        sleep( 1 );
        if ( 0 != fclose( oconf->stream ) ) {

and now I get many "Bad descriptor" errors every time. So something
really is managing the FD behind the curtains: either, as I originally
suspected, closing one end invalidates the other (there are problems with
that thinking..), or it's Zsh that does the management, which might be a
good thing. It seems I can simply forget about fclose(); apparently it's
not needed. I will check for memory leaks tomorrow to be sure.

--
Sebastian Gniazdowski
* Re: Multithreading support in pluggable module, but with problems
From: Bart Schaefer @ 2017-02-11 20:39 UTC
To: zsh-workers

On Feb 11, 12:14pm, Sebastian Gniazdowski wrote:
}
} I think I got code close to working state. Can consecutively spawn 32
} threads one hundred times. Have functions without queue_signals() etc.
} copied from Zsh. What is to stop conflicts are helper variables.

So, what you mean is that what stops conflicts is well-written shell
scripts that remember to check the semaphores.

}
} > In particular you *might* need to call either movefd() or redup()
} > rather than call dup2() directly, and you should otherwise be using
} > addmodulefd() to protect the fd from exec.c:closeallelse(). Examples
} > in Src/Modules/tcp.c.
}
} Good to know, thanks. I might reserve 32 file descriptors.

There's no reason to reserve descriptors; just add them as you open them,
and close them with zclose(). Also, you shouldn't add any that aren't
actually open.

Please take a look at the way tcp.c or even socket.c manages file
descriptors; it'll probably save you a lot of thrashing.
* Re: Multithreading support in pluggable module, but with problems
From: Sebastian Gniazdowski @ 2017-02-11 20:56 UTC
To: zsh-workers; +Cc: Bart Schaefer

On Sat, Feb 11, 2017, at 12:39 PM, Bart Schaefer wrote:
> So, what you mean is that what stops conflicts is well-written shell
> scripts that remember to check the semaphores.

Yes -- for shell scripts those variables are semaphores. That might be
difficult; e.g., an hour ago I found an obvious race condition that had
been popping up all day. But scripts can require carefulness in exchange
for, e.g., performance. In shell use it might be a simple
"echo $workers_count", seeing 0 there as the signal to go on with reading
the output of:

    zpin "ls -R /home/user" | zpopulator -a my_files

> There's no reason to reserve descriptors just add them as you open them,
> and close them with zclose(). Also, you shouldn't add any that aren't
> actually open.
>
> Please, take a look at the way tcp.c or even socket.c manages file
> descriptors, it'll probably save you a lot of thrashing.

Thanks.

--
Sebastian Gniazdowski
psprint2@fastmail.com