* LMDB test failures under musl on mips
From: Martin Lucina @ 2014-02-13 20:50 UTC
To: musl

Hi,

I'm currently using musl libc and LMDB [1] in a new project. When
developing on a Debian x86_64 host everything works fine, but when building
for a target device (OpenWRT mips or mipsel, I've tried both) with static
linking my LMDB code starts failing with assertions and/or segfaults inside
LMDB itself.

Cross-compiling to statically linked musl on x86_64 does not have the
problem.

It's possible that the problem is LMDB itself; I can ask on the OpenLDAP
lists but I'd like to check here first if someone else has encountered this
problem?

You can reproduce the problem fairly easily by building the mtest* programs
that come with LMDB. Running mtest a few times (after creating ./testdb)
reliably gives either a segfault or various assertion failures in LMDB.

Note that I'm using the prebuilt toolchains from musl.codu.org (thanks!),
and the 0.9.15 release.

Any ideas?

Thanks,

Martin

[1] http://symas.com/mdb/
* Re: LMDB test failures under musl on mips
From: Rich Felker @ 2014-02-13 21:39 UTC
To: musl

Hi,

Thank you for the report. Can you provide some more details on where
it's failing such as the specific tests/assertions? I'll see if
someone with a mips setup can look at it.

Rich

On Thu, Feb 13, 2014 at 09:50:40PM +0100, Martin Lucina wrote:
> Hi,
>
> I'm currently using musl libc and LMDB [1] in a new project. When
> developing on a Debian x86_64 host everything works fine, but when building
> for a target device (OpenWRT mips or mipsel, I've tried both) with static
> linking my LMDB code starts failing with assertions and/or segfaults inside
> LMDB itself.
>
> Cross-compiling to statically linked musl on x86_64 does not have the
> problem.
>
> It's possible that the problem is LMDB itself; I can ask on the OpenLDAP
> lists but I'd like to check here first if someone else has encountered this
> problem?
>
> You can reproduce the problem fairly easily by building the mtest* programs
> that come with LMDB. Running mtest a few times (after creating ./testdb)
> reliably gives either a segfault or various assertion failures in LMDB.
>
> Note that I'm using the prebuilt toolchains from musl.codu.org (thanks!),
> and the 0.9.15 release.
>
> Any ideas?
>
> Thanks,
>
> Martin
>
> [1] http://symas.com/mdb/
* Re: LMDB test failures under musl on mips
From: Szabolcs Nagy @ 2014-02-13 21:42 UTC
To: musl

* Martin Lucina <martin@lucina.net> [2014-02-13 21:50:40 +0100]:
> I'm currently using musl libc and LMDB [1] in a new project. When
> developing on a Debian x86_64 host everything works fine, but when building
> for a target device (OpenWRT mips or mipsel, I've tried both) with static
> linking my LMDB code starts failing with assertions and/or segfaults inside
> LMDB itself.
>
> Cross-compiling to statically linked musl on x86_64 does not have the
> problem.
>
> It's possible that the problem is LMDB itself; I can ask on the OpenLDAP
> lists but I'd like to check here first if someone else has encountered this
> problem?
>

mips was not nearly as extensively tested as x86 targets and it
has a lot of arch specific syscall quirks so it may be a musl bug,
if you have strace on the target then please send a strace log

what is the pagesize used on the target?
(iirc some mips can be non 4k)

> You can reproduce the problem fairly easily by building the mtest* programs
> that come with LMDB. Running mtest a few times (after creating ./testdb)
> reliably gives either a segfault or various assertion failures in LMDB.
>
> Note that I'm using the prebuilt toolchains from musl.codu.org (thanks!),
> and the 0.9.15 release.
>
> Any ideas?
>
> Thanks,
>
> Martin
>
> [1] http://symas.com/mdb/
* Re: LMDB test failures under musl on mips
From: Szabolcs Nagy @ 2014-02-13 23:26 UTC
To: musl

[-- Attachment #1: Type: text/plain, Size: 488 bytes --]

* Martin Lucina <martin@lucina.net> [2014-02-13 21:50:40 +0100]:
> You can reproduce the problem fairly easily by building the mtest* programs
> that come with LMDB. Running mtest a few times (after creating ./testdb)
> reliably gives either a segfault or various assertion failures in LMDB.

ok i could reproduce it
i got the following assertion failure:

mdb.c:2001: Assertion 'mp->mp_pgno != pgno' failed in mdb_page_touch()

(it's on real hw without debugger, but i have strace now)

[-- Attachment #2: mtest.txt --]
[-- Type: text/plain, Size: 3608 bytes --]

execve("./mtest", ["./mtest"], [/* 11 vars */]) = 0
clock_gettime(CLOCK_REALTIME, {1392333473, 890526084}) = 0
getpid() = 28121
mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2abe1000
mmap(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ac62000
open("./testdb/lock.mdb", O_RDWR|O_CREAT|O_LARGEFILE|0x80000, 0664) = 3
rt_sigprocmask(SIG_UNBLOCK, [RT_1 RT_2], NULL, 16) = 0
set_thread_area(0x2abe7d68) = 0
set_tid_address(0x2abe0cb8) = 28121
fcntl64(3, F_SETLK64, {type=F_WRLCK, whence=SEEK_SET, start=0, len=1}, 0x7f9ee820) = 0
_llseek(3, 0, [8192], SEEK_END) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x2abce000
open("./testdb/data.mdb", O_RDWR|O_CREAT|O_LARGEFILE, 0664) = 4
pread(4, "\0\0\0\0\0\0\0\10\0\0\0\0\276\357\300\336\0\0\0\1*\326@\0\0\240\0\0\0\0\20\0"..., 92, 0) = 92
pread(4, "\0\0\0\1\0\0\0\10\0\0\0\0\276\357\300\336\0\0\0\1*\326@\0\0\240\0\0\0\0\20\0"..., 92, 4096) = 92
mmap(0x2ad64000, 10485760, PROT_READ, MAP_SHARED, 4, 0) = 0x2ad64000
open("./testdb/data.mdb", O_RDWR|O_SYNC|O_LARGEFILE) = 5
fcntl64(3, F_SETLK64, {type=F_RDLCK, whence=SEEK_SET, start=0, len=1}, 0x7f9ee878) = 0
brk(0) = 0x423000
brk(0x425000) = 0x425000
ioctl(1, TIOCNXCL, 0x7f9ee6e0) = -1 ENOTTY (Inappropriate ioctl for device)
writev(1, [{"Adding 75", 9}, {" values\n", 8}], 2) = 17
brk(0x427000) = 0x427000
brk(0x429000) = 0x429000
pwrite(4, "\0\0\0\2\0\0\0\1\0\20\17\354\17\370\17\354\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 8192) = 4096
_llseek(4, 40960, [40960], SEEK_SET) = 0
writev(4, [{"\0\0\0\n\0\0\0\2\0\212\5,\17\324\17\250\17|\17P\17$\16\370\16\314\7\354\16\240\16t"..., 4096}, {"\0\0\0\v\0\0\0\2\0\20\17\310\17\344\17\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096}], 2) = 8192
fdatasync(4) = 0
pwrite(5, "\0\1\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0\2\0\0\0\v\0\0\0\0\0\0\0\2\0\0"..., 58, 34) = 58
writev(1, [{"74 duplicates skipped\nkey: 0x2ad"..., 1021}, {"097 ", 4}], 2) = 1025
writev(1, [{", data: 0x2ad67da4 097 151 foo b"..., 1024}, {"\n", 1}], 2) = 1025
writev(1, [{"key: 0x2ad67b64 114 , data: 0x2a"..., 1020}, {"2ad67640", 8}], 2) = 1028
writev(1, [{" 1a3 419 foo bar\nkey: 0x2ad67820"..., 1023}, {"258 ", 4}], 2) = 1027
writev(1, [{", data: 0x2ad68f04 258 600 foo b"..., 1024}, {"\n", 1}], 2) = 1025
writev(1, [{"key: 0x2ad68c98 327 , data: 0x2a"..., 1020}, {"2ad68ab8", 8}], 2) = 1028
writev(1, [{" 377 887 foo bar\nkey: 0x2ad68a88"..., 1024}, {"3f4 ", 4}], 2) = 1028
_llseek(4, 16384, [16384], SEEK_SET) = 0
writev(4, [{"\0\0\0\4\0\0\0\1\0\20\17\354\17\370\17\354\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096}, {"\0\0\0\5\0\0\0\2\0\210\5X\17\324\17\250\17|\17P\17$\16\370\16\314\7\354\16\240\16t"..., 4096}, {"\0\0\0\6\0\0\0\2\0\20\17\310\17\344\17\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096}], 3) = 12288
fdatasync(4) = 0
pwrite(5, "\0\1\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0\2\0\0\0\6\0\0\0\0\0\0\0\2\0\0"..., 58, 4130) = 58
writev(2, [{"mdb.c:2001: Assertion 'mp->mp_pg"..., 71}, {NULL, 0}], 2mdb.c:2001: Assertion 'mp->mp_pgno != pgno' failed in mdb_page_touch()
) = 71
rt_sigprocmask(SIG_BLOCK, ~[RT_0 RT_1 RT_2], [], 16) = 0
gettid() = 28121
getpid() = 28121
tgkill(28121, 28121, SIGIOT) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 16) = 0
--- SIGIOT (Aborted) @ 0 (0) ---
+++ killed by SIGIOT +++
Aborted
* Re: LMDB test failures under musl on mips
From: Martin Lucina @ 2014-02-14 9:31 UTC
To: musl

nsz@port70.net said:
> * Martin Lucina <martin@lucina.net> [2014-02-13 21:50:40 +0100]:
> > You can reproduce the problem fairly easily by building the mtest* programs
> > that come with LMDB. Running mtest a few times (after creating ./testdb)
> > reliably gives either a segfault or various assertion failures in LMDB.
>
> ok i could reproduce it
> i got the following assertion failure:
>
> mdb.c:2001: Assertion 'mp->mp_pgno != pgno' failed in mdb_page_touch()
>
> (it's on real hw without debugger, but i have strace now)

That's what I get, and also these:

mdb.c:5176: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()

or

mdb.c:1713: Assertion 'rc == 0' failed in mdb_page_dirty()

etc.

mtest is somewhat fickle, it uses random() to decide exactly what it's
doing. I have a hunch that I can provoke this with a simpler test program,
going to try that now.

Do you still want those strace logs from me?

Both of the targets (ASUS RT-N66u running Tomato, TP-Link TL-WDR4300
running OpenWRT trunk) I tried have 4k page size, so nothing out of the
ordinary there.

One thing I'd like to try is building against the normal OpenWRT/uClibc
toolchain (or even a plain glibc one) to see if anything changes.
Unfortunately the snapshot binaries they provide require at least glibc
2.14 which I don't have on my machines running Debian stable. I tried using
a toolchain built from source using the OpenWRT buildroot but get random
link errors :-/

Martin
* Re: LMDB test failures under musl on mips
From: Szabolcs Nagy @ 2014-02-14 10:26 UTC
To: musl

* Martin Lucina <martin@lucina.net> [2014-02-14 10:31:56 +0100]:
> That's what I get, and also these:
>
> mdb.c:5176: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
>
> or
>
> mdb.c:1713: Assertion 'rc == 0' failed in mdb_page_dirty()
>
> etc.
>
> mtest is somewhat fickle, it uses random() to decide exactly what it's
> doing. I have a hunch that I can provoke this with a simpler test program,
> going to try that now.

i removed the srandom(time(NULL)) and disabled ASLR
and it's still fickle

i haven't looked further

>
> Do you still want those strace logs from me?
>

no, i think strace does not help here
(at least i didnt see anything obvious)

i don't quite understand the nondeterministic behaviour

it seems to do reads/writes and mmap through two different
fds to the same underlying file, but it does fdatasync on
one and O_SYNC on the other so i think the behaviour should
be deterministic

(i'd need to know more about mdb and see the mmap accesses
as well to figure out what's going on..)

> Both of the targets (ASUS RT-N66u running Tomato, TP-Link TL-WDR4300
> running OpenWRT trunk) I tried have 4k page size, so nothing out of the
> ordinary there.
>

i tried it on a wrt160nl with old openwrt image
(Atheros AR9130 cpu)
* Re: LMDB test failures under musl on mips
From: Martin Lucina @ 2014-02-14 15:53 UTC
To: musl

nsz@port70.net said:
> no, i think strace does not help here
> (at least i didnt see anything obvious)

Same here.

> i don't quite understand the nondeterministic behaviour

Me neither :/ Something to do with magic memory accesses, most likely. I've
gotten in touch with the author via the openldap lists and we are looking
into it.

I built a toolchain for mips-unknown-linux-gnu, using GCC 4.8.1
(crosstool-NG 1.19.0) and configured with eglibc 2.17.

mtest still fails with the same symptoms - so it looks like we can rule out
musl as a cause for the time being.

Will follow up once I have more information.

Thanks for your help,

Martin