mailing list of musl libc
* Reformatting dynamic linker error messages? (or, Bikeshed April 2015)
From: Rich Felker @ 2015-04-18 22:54 UTC
  To: musl

One item on the roadmap that I want to work on is making the dynamic
linker error strings translatable. Since the (possibly unwritten) policy
is that we don't translate format strings in libc (doing so would make
untrusted locale files dangerous), this is going to require at least
restructuring the code that generates the messages, and it also calls
for restructuring the text itself. I'm looking for ideas on how to do
this and make the messages as legible and informative as possible.

The errors we have to worry about are:

"Error relocating %s: %s: symbol not found"
"Error relocating %s: cannot allocate TLSDESC for %s",
"Error relocating %s: unsupported relocation type %d",
"Error loading shared library %s: %m (needed by %s)",
"Error relocating %s: RELRO protection failed: %m",
"cannot load %s: %m\n"
"%s: Not a valid dynamic program\n"
"Library %s is not already loaded"
"Error loading shared library %s: %m"
"Invalid library handle %p"
"Symbol not found: %s"
"Dynamic loading not supported"
"Unsupported request %d"

I think we should aim to separate it into a :-delimited form where
there's no grammatical relationship between fields, since making
grammar fit multiple natural languages is basically impossible.
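
A minimal sketch of the colon-delimited form, assuming a hypothetical
helper named join_fields (it is not part of musl). Every field is
treated as an opaque string and the only fixed text is the ": "
separator, so a translated field is never interpreted as a format
string:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: join n fields with ": " into buf (of size len).
 * Output is truncated to fit and always NUL-terminated when len > 0.
 * Returns the length the full message would have had, like snprintf. */
size_t join_fields(char *buf, size_t len, const char *const *fields, size_t n)
{
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        /* Separator goes before every field except the first. */
        const char *parts[2] = { i ? ": " : "", fields[i] };
        for (int j = 0; j < 2; j++) {
            size_t l = strlen(parts[j]);
            if (len && pos < len - 1) {
                size_t room = len - 1 - pos;
                memcpy(buf + pos, parts[j], l < room ? l : room);
            }
            pos += l;
        }
    }
    if (len)
        buf[pos < len - 1 ? pos : len - 1] = 0;
    return pos;
}
```

With the fields "Error relocating library", "libfoo.so", "Symbol not
found" and "bar", this yields a single line with four colon-separated
fields and no grammatical dependency between them.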

Note that some of the above error messages are bad to begin with. We
don't necessarily need to preserve the content as-is; I'd much rather
make the error messages better at the same time.

For errors loading/mapping a library, I think we need to report the
cause, which could be:
- System-level errors opening/reading/etc.
- File-format errors
- Memory-allocation failure

It may also be useful, if the failing library is being loaded as a
dependency for another library, to report the library that needed it.

For errors during relocation, we need to report the library that could
not be relocated, and the reason, which could be:
- Missing symbol (need to show which symbol)
- Unknown/invalid relocation type
- Memory-allocation failure
- System-level failures (mprotect, etc.)

Everything else looks pretty straightforward.

It's also unclear to me whether we should aim to change the signature
of the error() function to take fixed fields, which error() is then
responsible for translating, or whether the caller should translate
fixed strings going in (in which case the caller has freedom to use
lots of different formats).

Rich



* Re: Reformatting dynamic linker error messages? (or, Bikeshed April 2015)
From: Isaac Dunham @ 2015-04-22  0:37 UTC
  To: musl

On Sat, Apr 18, 2015 at 06:54:36PM -0400, Rich Felker wrote:
> One item on the roadmap that I want to work on is making the dynamic
> linker error strings translatable. Since (unwritten, maybe) policy is
> that we don't translate format strings in libc (it makes untrusted
> locale files dangerous), this is going to require at least
> restructuring of the code that generates the messages, and also calls
> for restructuring of the actual text. I'm looking for ideas on how to
> do this and make it the most legible and informative.
> 
> The errors we have to worry with are:

[reordered to make redundancy more obvious]

> "Symbol not found: %s"
> "Error relocating %s: %s: symbol not found"
> "Error relocating %s: cannot allocate TLSDESC for %s",
> "Error relocating %s: unsupported relocation type %d",
> "Error relocating %s: RELRO protection failed: %m",
> "Error loading shared library %s: %m (needed by %s)",
> "Error loading shared library %s: %m"
> "cannot load %s: %m\n"
> "%s: Not a valid dynamic program\n"
> "Library %s is not already loaded"
> "Invalid library handle %p"
> "Dynamic loading not supported"
> "Unsupported request %d"


It looks like there's a bit of redundancy here, where items 2-5 share
"Error relocating %s", items 6 and 7 share "Error loading shared library",
and items 1 and 2 share (save capitalization) "symbol not found".
It would be nice to see that redundancy turned into shared submessages.

I also notice that there's a mix of two equivalent formats:
error("...%m...") and
dprintf(fd, "...%s...", strerror(errno))

which I find odd.

Do we want the calls to strerror(errno) converted to something localized?
If so, %m would seem to be harmful.
On the other hand...is localization set up this early?
 
> I think we should aim to separate it into a :-delimited form where
> there's no grammatical relationship between fields, since making
> grammar fit multiple natural languages is basically impossible.

Agreed.

> Note that some of the above error messages are bad to begin with. We
> don't necessarily need to preserve the content as-is; I'd much rather
> make the error messages better at the same time.
> 
> For errors loading/mapping a library, I think we need to report the
> cause, which could be:
> - System-level errors opening/reading/etc.
> - File-format errors
> - Memory-allocation failure
> 
> It may also be useful, if the failing library is being loaded as a
> dependency for another library, to report the library that needed it.

> For errors during relocation, we need to report the library that could
> not be relocated, and the reason, which could be:
> - Missing symbol (need to show which symbol)
> - Unknown/invalid relocation type
> - Memory-allocation failure
> - System-level failures (mprotect, etc.)
> 
> Everything else looks pretty straightforward.
> 
> It's also unclear to me whether we should aim to change the signature
> of the error() function to take fixed fields, which error() is then
> responsible for translating, or whether the caller should translate
> fixed strings going in (in which case the caller has freedom to use
> lots of different formats).

Consider these strings:

> "Error loading shared library %s: %m (needed by %s)",
> "Error relocating %s: cannot allocate TLSDESC for %s",
> "Error relocating %s: %s: symbol not found"

Looking at the strings in question, it seems clear to me that we will
need to intermix translated and untranslatable strings to make the
meaning clear. While the error message can be adjusted, it gets hard
to comprehend the error when data is not adjacent to its explanation.

Theoretically, you could stipulate that every other field is translated
and empty strings are skipped, but that sounds rather brittle.

Thanks,
Isaac Dunham




* Re: Reformatting dynamic linker error messages? (or, Bikeshed April 2015)
From: Rich Felker @ 2015-04-22  6:15 UTC
  To: musl

On Wed, Apr 22, 2015 at 12:37:41AM +0000, Isaac Dunham wrote:
> On Sat, Apr 18, 2015 at 06:54:36PM -0400, Rich Felker wrote:
> > One item on the roadmap that I want to work on is making the dynamic
> > linker error strings translatable. Since (unwritten, maybe) policy is
> > that we don't translate format strings in libc (it makes untrusted
> > locale files dangerous), this is going to require at least
> > restructuring of the code that generates the messages, and also calls
> > for restructuring of the actual text. I'm looking for ideas on how to
> > do this and make it the most legible and informative.
> > 
> > The errors we have to worry with are:
> 
> [reordered to make redundancy more obvious]

Thanks.

> > "Symbol not found: %s"
> > "Error relocating %s: %s: symbol not found"
> > "Error relocating %s: cannot allocate TLSDESC for %s",
> > "Error relocating %s: unsupported relocation type %d",
> > "Error relocating %s: RELRO protection failed: %m",
> > "Error loading shared library %s: %m (needed by %s)",
> > "Error loading shared library %s: %m"
> > "cannot load %s: %m\n"
> > "%s: Not a valid dynamic program\n"
> > "Library %s is not already loaded"
> > "Invalid library handle %p"
> > "Dynamic loading not supported"
> > "Unsupported request %d"

Here's a first try at making these reasonable for localization (still
left as format strings for clarity even though they can't actually
be):

"Symbol not found: %s"

"Error relocating library: %s: Symbol not found: %s"
"Error relocating library: %s: Unsupported relocation type: %d"
"Error relocating library: %s: Memory protection failure: %m"
"Error relocating library: %s: Out of memory"
and all of the above with:
"Error relocating main program:"
instead of library.

"Error loading dependency: %s (%s): %m"
"Error loading library: %s: %m"

"Error loading file: %s: %m"

"Library not already loaded: %s"
"Invalid library handle: %p"
"Dynamic loading not supported"
"Unsupported request: %d"

The %m's are also something of a fiction since there are a few cases
where we would want a message that doesn't come from errno (e.g.
invalid ELF headers). That's mildly ugly but definitely solvable.
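
One way the non-errno case might be handled, as a sketch only: the
caller states where the final field comes from. The enum, the function
name, and the "Invalid file format" wording are all assumptions made
for illustration, not musl's code or messages:

```c
#include <errno.h>
#include <string.h>

/* Hypothetical classification of load failures. */
enum load_fail {
    LOAD_FAIL_ERRNO,   /* system-level error: use the errno message */
    LOAD_FAIL_FORMAT   /* file-format error, e.g. invalid ELF headers */
};

/* Return the text for the final field of a load error.
 * err is only consulted for LOAD_FAIL_ERRNO. */
const char *load_fail_text(enum load_fail kind, int err)
{
    switch (kind) {
    case LOAD_FAIL_FORMAT:
        return "Invalid file format"; /* placeholder wording (assumption) */
    case LOAD_FAIL_ERRNO:
    default:
        return strerror(err);
    }
}
```

The fixed string would go through the same translation lookup as the
other fixed fields, while the errno case keeps using strerror.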

> It looks like there's a bit of redundancy here, where 2-5 share
> "Error relocating %s", 6 and 7 share "Error loading shared library",
> and 1 and 2 share (save capitalization) "symbol not found".
> It would be nice to see that redundancy turned into shared submessages.

Yes. For example with symbol-not-found, the actual format would be
something like:

"%s: %s: %s: %s", _("Error relocating library"), name, _("Symbol not found"), sym

and:

"%s: %s", _("Symbol not found"), sym

Note that I'm using the GNU gettext _() notation for translation just
to make this simple to write in email; I don't endorse this in code.
:-)
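
Spelled out as compilable C, the symbol-not-found case could look
roughly like the sketch below. lookup_msg is a hypothetical stand-in
for the _() notation, and its two-entry catalog is a made-up
pseudo-locale. The second entry deliberately contains "%s%n" to show
that text coming from a catalog is only ever passed as a %s argument
and so is printed literally rather than interpreted:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical catalog lookup standing in for _().
 * Returns msg itself when the catalog has no entry for it. */
const char *lookup_msg(const char *msg)
{
    static const char *const catalog[][2] = {
        { "Error relocating library", "[Error relocating library]" },
        { "Symbol not found", "[Symbol not found %s%n]" },
    };
    for (size_t i = 0; i < sizeof catalog / sizeof catalog[0]; i++)
        if (!strcmp(msg, catalog[i][0]))
            return catalog[i][1];
    return msg;
}

/* The format string is a fixed literal in the code; only the
 * fields passed as arguments come from the catalog. */
int format_missing_symbol(char *buf, size_t len,
                          const char *lib, const char *sym)
{
    return snprintf(buf, len, "%s: %s: %s: %s",
                    lookup_msg("Error relocating library"), lib,
                    lookup_msg("Symbol not found"), sym);
}
```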

> I also notice that there's a mix of two equivalent formats:
> error("...%m...") and
> dprintf(fd, "...%s...", strerror(errno))
> 
> which I find odd.

I believe this is because the error function was originally unsuitable
for some of those places, but I may be wrong. I think it could be used
now.

> Do we want the calls to strerror(errno) converted to something localized?
> If so, %m would seem to be harmful.

strerror already returns a translated message. So does %m.

> On the other hand...is localization set up this early?

At first the messages would only be available via dlerror after
setlocale. However, I think it may make sense to have the dynamic
linker call setlocale(LC_ALL, "") on the first error. Since the main
program will never be invoked if there are errors, this would not
affect program semantics at all.
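
The setlocale-on-first-error idea can be sketched as below. The
function name and the guard flag are assumptions made for illustration;
this is not how musl's dynamic linker is actually structured:

```c
#include <locale.h>

static int locale_initialized;

/* Hypothetical hook, meant to be called at the start of the
 * error-reporting path. Sets up the locale from the environment the
 * first time it runs. Returns 1 if this call did the setup, 0 if the
 * setup had already been done. */
int init_locale_on_first_error(void)
{
    if (locale_initialized)
        return 0;
    setlocale(LC_ALL, "");
    locale_initialized = 1;
    return 1;
}
```

Because the call sits on the error path only, a startup with no errors
never touches the locale.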

Rich

