Hello,

While working on code that converts arguments from UTF-16 to UTF-8, I found myself wondering who is responsible for checking the well-formedness of UTF-8 strings that are passed to the kernel.  As I suspected, these strings are validated neither in the kernel nor in the C library.  The attached program demonstrates this by creating a file named <0xE0 0x9F 0x80>, which according to the Unicode Standard (6.2, p. 95) is an ill-formed byte sequence: after a lead byte of 0xE0, the second byte must lie in the range 0xA0..0xBF.
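
For reference, here is a minimal sketch of the kind of check that is not being done anywhere along the way.  This is not the attached program; it is just an illustration that follows the table of well-formed UTF-8 byte sequences (Table 3-7) in the Standard.  The function name utf8_well_formed is made up for this example, and I am assuming the caller passes the byte count explicitly.

```c
#include <stddef.h>

/* Hypothetical helper, named for illustration only.
 * Returns 1 if the n bytes at s are well-formed UTF-8 according to
 * Table 3-7 of the Unicode Standard, and 0 otherwise. */
int utf8_well_formed(const unsigned char *s, size_t n)
{
    size_t i = 0;

    while (i < n) {
        unsigned char b = s[i];
        unsigned char lo = 0x80, hi = 0xBF;  /* allowed range of the 2nd byte */
        size_t trail;                        /* number of trailing bytes */
        size_t k;

        if (b <= 0x7F) {                     /* ASCII */
            i++;
            continue;
        } else if (b >= 0xC2 && b <= 0xDF) {
            trail = 1;
        } else if (b == 0xE0) {              /* excludes overlong forms */
            trail = 2; lo = 0xA0;
        } else if (b == 0xED) {              /* excludes surrogates */
            trail = 2; hi = 0x9F;
        } else if (b >= 0xE1 && b <= 0xEF) {
            trail = 2;
        } else if (b == 0xF0) {              /* excludes overlong forms */
            trail = 3; lo = 0x90;
        } else if (b >= 0xF1 && b <= 0xF3) {
            trail = 3;
        } else if (b == 0xF4) {              /* excludes values above U+10FFFF */
            trail = 3; hi = 0x8F;
        } else {
            return 0;                        /* 0x80..0xC1 and 0xF5..0xFF */
        }

        if (n - i <= trail)
            return 0;                        /* sequence is truncated */
        if (s[i + 1] < lo || s[i + 1] > hi)
            return 0;
        for (k = 2; k <= trail; k++)
            if (s[i + k] < 0x80 || s[i + k] > 0xBF)
                return 0;

        i += trail + 1;
    }
    return 1;
}
```

With this sketch, the three bytes 0xE0 0x9F 0x80 are rejected because 0x9F falls below the 0xA0 lower bound that applies after 0xE0, whereas 0xE0 0xA0 0x80 is accepted.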

I am not sure whether this can officially be considered a bug, and it is quite clear that fixing it would entail some performance penalty.  That being said, after I deleted this file from my Ubuntu desktop, most (but not all) attempts to open the Trash folder made Nautilus crash, and order was restored only after I deleted the file permanently from the shell...

Best regards,
zg