Hi all, I first proposed this change over two and a half years ago.[1] Unfortunately, it never seemed to gain any traction. I figured I'd give it another shot. Adding this open-source implementation of memrchr() allows for CGIT to be built on macOS. This can be very convenient, be it for CGIT development or to have a local CGIT instance that'll work even without network connection. The change seems quite small and well contained (less than 60 lines of code added; no other platform affected), and it enables an additional platform/OS for CGIT. Proof that it actually builds (x86 works too, of course): $ file cgit cgit: Mach-O 64-bit executable arm64 $ otool -L cgit cgit: /System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices (...) /usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11) /usr/lib/libiconv.2.dylib (compatibility version 7.0.0, current version 7.0.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.0.0) Regards, -Markus [1] https://www.mail-archive.com/cgit@lists.zx2c4.com/msg03014.html Markus Mayer (1): global: provide memrchr implementation for macOS cgit.h | 4 ++++ cgit.mk | 9 +++++++++ memrchr.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 57 insertions(+) create mode 100644 memrchr.c -- 2.39.0
macOS doesn't come with a memrchr() of its own, so let's provide an implementation it can use. memrchr.c was taken from Apple's own open source site. https://opensource.apple.com/source/sudo/sudo-87.80.2/sudo/lib/util/memrchr.c It was modified in the following ways: - Removed unnecessary includes (config.h, sudo_compat.h). - Removed unnecessary include guard (HAVE_MEMRCHR). - Removed comment referencing an unrelated project. - Added comment mentioning URL to the original location of the file. Signed-off-by: Markus Mayer <code@mmayer.net> --- cgit.h | 4 ++++ cgit.mk | 9 +++++++++ memrchr.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 57 insertions(+) create mode 100644 memrchr.c diff --git a/cgit.h b/cgit.h index ddd2ccba0b88..19f810f22b61 100644 --- a/cgit.h +++ b/cgit.h @@ -397,4 +397,8 @@ extern char *expand_macros(const char *txt); extern char *get_mimetype_for_filename(const char *filename); +#ifdef NEED_MEMRCHR +extern void *memrchr(const void *s, int c, size_t n); +#endif + #endif /* CGIT_H */ diff --git a/cgit.mk b/cgit.mk index 3fcc1ca31440..f85018c38b78 100644 --- a/cgit.mk +++ b/cgit.mk @@ -63,10 +63,19 @@ ifeq ($(uname_S),Linux) HAVE_LINUX_SENDFILE = YesPlease endif +ifeq ($(uname_S),Darwin) + IS_DARWIN = yes +endif + ifdef HAVE_LINUX_SENDFILE CGIT_CFLAGS += -DHAVE_LINUX_SENDFILE endif +ifdef IS_DARWIN + CGIT_CFLAGS += -DNEED_MEMRCHR + CGIT_OBJ_NAMES += memrchr.o +endif + CGIT_OBJ_NAMES += cgit.o CGIT_OBJ_NAMES += cache.o CGIT_OBJ_NAMES += cmd.o diff --git a/memrchr.c b/memrchr.c new file mode 100644 index 000000000000..75c6dcde7432 --- /dev/null +++ b/memrchr.c @@ -0,0 +1,44 @@ +/* + * SPDX-License-Identifier: ISC + * + * Copyright (c) 2007, 2010-2014 + * Todd C. Miller <Todd.Miller@sudo.ws> + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ + +/* + * Imported into CGIT from + * https://opensource.apple.com/source/sudo/sudo-87.80.2/sudo/lib/util/memrchr.c + */ + +#include <sys/types.h> + +/* + * Reverse memchr() + * Find the last occurrence of 'c' in the buffer 's' of size 'n'. + */ +void * +memrchr(const void *s, int c, size_t n) +{ + const unsigned char *cp; + + if (n != 0) { + cp = (unsigned char *)s + n; + do { + if (*(--cp) == (unsigned char)c) + return (void *)cp; + } while (--n != 0); + } + return (void *)0; +} -- 2.39.0
Hi Alex, > This code seems overly complex to me. A do-while seems unintuitive when > you're not performing at least 1 iteration, and having an if that is > basically the same operation as the while condition seems weird. Maybe > I'm missing something, but it looks like it can be written more readable as: I have no objections to changing the code. For my initial submission, I figured I'd use an implementation that comes directly from Apple's own open source server. But if the consensus is to tweak the code to make it smaller / faster / more readable, that's alright by me. I only want CGIT to build on macOS. :-) > void * > memrchr(const void *s, int c, size_t n) > { > unsigned char ch; > const unsigned char *cp; > > ch = c; > > cp = s; > for (cp += n; n != 0; n--) { > cp--; > if (*cp == ch) > return (void *) cp; > } > > return NULL; > } Let me play around with this. > Clang produces similar assembly code for both. GCC however produces > smaller code for the for loop. Thanks for the feedback. Regards, -Markus
On Fri, 27 Jan 2023 at 17:16, Glenn Strauss <gstrauss@gluelogic.com> wrote: > > Fun! A small exercise for comparison if you like. > Cheers, Glenn > > void * > memrchr(const void *s, int c, size_t n) > { > const unsigned char *cp = (const unsigned char *)s + n; > const unsigned char ch = (unsigned char)c; > while (s != cp) { > if (*(--cp) == ch) > return (void *)cp; > } > return NULL; > } Hi all, As promised, I played around a bit. I ran a few experiments with different memrchr() implementations. Everything I did can be found here: https://github.com/mmayer/cgit/tree/memrchr-compare The test-specific code is in the memrchr_test folder[1] within that repo. The four implementations I tried are: memrchr: the original implementation (from Apple's sudo command) that I submitted as v1 memrchr2: Alejandro's suggestion memrchr3: Glen's suggestion memrchr4: for added fun, musl-libc's implementation[2] I also checked the object and assembly files into the repo, so it's easier to look at them if anybody wants to. They live in the memrchr_test/output folder. Here are the results for ARM and x86, both in assembly/object size and runtime. ARM # Object size of memrchr and memrchr2 is the same -rw-r--r-- 1 mmayer staff 552 29 Jan 09:52 memrchr.o -rw-r--r-- 1 mmayer staff 552 29 Jan 09:52 memrchr2.o -rw-r--r-- 1 mmayer staff 544 29 Jan 09:52 memrchr3.o -rw-r--r-- 1 mmayer staff 544 29 Jan 09:52 memrchr4.o # Assembly source of memrchr2 is larger than memrchr -rw-r--r-- 1 mmayer staff 694 29 Jan 09:52 memrchr2.s -rw-r--r-- 1 mmayer staff 691 29 Jan 09:52 memrchr.s -rw-r--r-- 1 mmayer staff 655 29 Jan 09:52 memrchr3.s -rw-r--r-- 1 mmayer staff 655 29 Jan 09:52 memrchr4.s execution time: 18.61453 seconds execution time: 15.39163 seconds execution time: 13.56957 seconds execution time: 13.55493 seconds x86 -rw-r--r-- 1 mmayer staff 656 29 Jan 10:02 memrchr.o -rw-r--r-- 1 mmayer staff 656 29 Jan 10:02 memrchr2.o -rw-r--r-- 1 mmayer staff 656 29 Jan 10:02 memrchr3.o -rw-r--r-- 1 mmayer staff 648 29 Jan 10:02 memrchr4.o -rw-r--r-- 1 mmayer staff 835 29 Jan 10:02 memrchr.s -rw-r--r-- 1 mmayer staff 835 29 Jan 10:02 memrchr2.s -rw-r--r-- 1 mmayer staff 825 29 Jan 10:02 memrchr3.s -rw-r--r-- 1 mmayer staff 818 29 Jan 10:02 memrchr4.s execution time: 20.29937 seconds execution time: 23.67755 seconds execution time: 12.59514 seconds execution time: 11.38668 seconds As you can see, musl-libc provides the smallest implementation that is also the fastest. This is true for ARM and x86. So, I guess it makes the most sense to pick that (memrchr4.c in my experiments). The code is under a MIT license, which I assume is fine for CGIT. What does everybody think? Regards, -Markus [1] https://github.com/mmayer/cgit/tree/memrchr-compare/memrchr_test [2] https://git.musl-libc.org/cgit/musl/tree/src/string/memrchr.c