ntg-context - mailing list for ConTeXt users
* byte order marks in utf-8 (old)
@ 2005-12-28 12:48 Taco Hoekwater
From: Taco Hoekwater @ 2005-12-28 12:48 UTC (permalink / raw)



Hi,

A long time ago, Hans wrote:
> Patrick Gundlach wrote:
> 
>> are you sure that your scite does not use utf-16 and puts the BOM (byte
>> order mark) there?
> 
> scite indeed does this (kind of annoying)

It is the BOM, in UTF-8 file encoding. A bit pointless (UTF-8 is a
byte-oriented encoding, so there is no byte order to mark), but it is
allowed by the Unicode specification.
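
For illustration, here is a small Python sketch (not anything ConTeXt
or SciTE actually runs) that encodes the BOM character U+FEFF in a few
encodings. The variable names are made up for the example. It shows
why the mark only carries byte-order information in UTF-16: the two
UTF-16 forms differ, while UTF-8 has exactly one form.

```python
import codecs

# U+FEFF is the character used as the byte order mark.
bom = "\ufeff"

utf8_form = bom.encode("utf-8")         # always the same three bytes
utf16_le_form = bom.encode("utf-16-le") # little-endian byte order
utf16_be_form = bom.encode("utf-16-be") # big-endian byte order

print(utf8_form.hex(" "))      # ef bb bf
print(utf16_le_form.hex(" "))  # ff fe
print(utf16_be_form.hex(" "))  # fe ff

# The standard library exposes the same constants.
print(utf8_form == codecs.BOM_UTF8)
```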

> context can handle that for xml files
> 
> i can consider handling it automatically (i.e. when BOM before first start-stop,
> then assume utf-8)

It is a safe bet that any document that starts with the three bytes
   0xEF 0xBB 0xBF
is encoded as UTF-8, esp. if it is supposed to be text input.
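
The check described above could be sketched like this in Python. This
is only an illustration of the idea, not how ConTeXt handles it; the
function name strip_utf8_bom is hypothetical.

```python
import codecs

def strip_utf8_bom(data: bytes) -> tuple[bytes, bool]:
    """Return (payload, had_bom) for raw input bytes.

    If the input starts with 0xEF 0xBB 0xBF, drop those three bytes
    and report that the input can be assumed to be UTF-8.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):], True
    return data, False

# Example input: a BOM followed by plain ASCII text.
raw = b"\xef\xbb\xbf\\starttext Hello \\stoptext"
payload, had_bom = strip_utf8_bom(raw)
if had_bom:
    text = payload.decode("utf-8")
    print(text)
```

Python's built-in "utf-8-sig" codec does the same stripping while
decoding, so raw.decode("utf-8-sig") gives the same text here.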

Cheers,
Taco
