caml-list - the Caml user's mailing list
From: skaller <skaller@users.sourceforge.net>
To: Nuutti Kotivuori <naked+caml@naked.iki.fi>
Cc: Eric Dahlman <edahlman@atcorp.com>,
	skaller@users.sourceforge.net, caml-list@pauillac.inria.fr
Subject: Re: [Caml-list] Bug with really_input under cygwin
Date: 11 Mar 2004 14:42:23 +1100
Message-ID: <1078976542.2452.106.camel@pelican.wigram>
In-Reply-To: <87hdww4tc9.fsf@aka.i.naked.iki.fi>

On Thu, 2004-03-11 at 02:25, Nuutti Kotivuori wrote:

> > even if you're processing text. Never depend on the
> > language or OS conversion functions, it's very unlikely
> > they'll be right. Do all the conversions needed yourself.
> > At least when you find a problem you're not handling
> > correctly you can fix it.
> 
> Luckily not everybody sees the world as glum :-)

I'm not seeing it as glum. I'm pointing out that
today the situation is vastly more complex due to
belated recognition of the need for Standards to
support I18N issues.

Because of this, the idea that \r\n <-> \n is the
only real encoding issue across platforms is wrong.
If only that were the case today, it would be a trivial
problem to resolve.
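
For the \r\n <-> \n case alone, doing the conversion yourself might look
like the following minimal sketch. It assumes the file fits in memory and
that a lone '\r' should be left untouched; the names read_all and
crlf_to_lf are made up for illustration. The channel is opened with
open_in_bin so that no OS or runtime newline translation happens first.

```ocaml
(* Read a whole file as raw bytes. open_in_bin disables any
   platform newline translation. (Hypothetical helper name.) *)
let read_all (path : string) : string =
  let ic = open_in_bin path in
  let n = in_channel_length ic in
  let s = really_input_string ic n in
  close_in ic;
  s

(* Replace each "\r\n" pair with "\n". A '\r' that is not followed
   by '\n' is copied through unchanged (an assumption of this sketch). *)
let crlf_to_lf (s : string) : string =
  let n = String.length s in
  let b = Buffer.create n in
  let i = ref 0 in
  while !i < n do
    if s.[!i] = '\r' && !i + 1 < n && s.[!i + 1] = '\n' then begin
      Buffer.add_char b '\n';
      i := !i + 2
    end else begin
      Buffer.add_char b s.[!i];
      incr i
    end
  done;
  Buffer.contents b
```

Usage would be along the lines of crlf_to_lf (read_all "input.txt"),
where the file name is only a placeholder.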

For example, text files may begin with header bytes (a byte
order mark) indicating whether the file is UTF-8 encoded, or UCS-2
in big- or little-endian order. If found, these bytes must not
be treated as 'text'; they are just encoding indicators.
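
A minimal sketch of checking for such header bytes, assuming the data is
already in a string read in binary mode. The type and function names
(encoding, detect_bom, strip_bom) are hypothetical. The byte values are
the standard byte order marks: EF BB BF for UTF-8, FE FF for big-endian
and FF FE for little-endian 16-bit encodings. Note that FF FE followed by
00 00 can also mean little-endian UTF-32, which this sketch does not
distinguish.

```ocaml
(* Illustrative names only; not from any library. *)
type encoding = Utf8 | Ucs2_be | Ucs2_le | Unknown

(* Look at the leading bytes of [s]. Return the encoding indicated by a
   byte order mark, if any, and how many bytes the mark occupies. *)
let detect_bom (s : string) : encoding * int =
  let n = String.length s in
  let byte i = Char.code s.[i] in
  if n >= 3 && byte 0 = 0xEF && byte 1 = 0xBB && byte 2 = 0xBF then (Utf8, 3)
  else if n >= 2 && byte 0 = 0xFE && byte 1 = 0xFF then (Ucs2_be, 2)
  else if n >= 2 && byte 0 = 0xFF && byte 1 = 0xFE then (Ucs2_le, 2)
  else (Unknown, 0)

(* Drop the mark so that it is never handed on as text. *)
let strip_bom (s : string) : encoding * string =
  let enc, skip = detect_bom s in
  (enc, String.sub s skip (String.length s - skip))
```

Here Unknown only means "no mark found"; the real encoding still has to
be determined some other way.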

Even within Unicode/ISO-10646 there are myriad
'encoding' problems, the famous ones being the use
of combining characters -- and that's *after* you have found
the ISO-10646 code points :)
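
To illustrate the combining character problem with a small sketch: the
same visible letter, e with an acute accent, can be written as the single
code point U+00E9 or as U+0065 followed by the combining accent U+0301.
The variable names below are made up, and the strings are spelled out as
UTF-8 bytes so the example does not depend on the source file's encoding.

```ocaml
(* U+00E9, precomposed e-acute, as UTF-8 bytes. *)
let precomposed = "\xC3\xA9"

(* U+0065 'e' followed by U+0301 combining acute accent, as UTF-8 bytes. *)
let decomposed = "e\xCC\x81"

(* Byte-wise comparison sees two different strings. *)
let same_bytes = (precomposed = decomposed)

(* The byte lengths differ as well: 2 bytes versus 3 bytes. *)
let byte_lengths = (String.length precomposed, String.length decomposed)
```

Here same_bytes is false and byte_lengths is (2, 3), although both
strings would normally be displayed identically. Treating them as equal
requires Unicode normalization, which the standard String module does
not provide.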

So, if you want to handle *text* in a portable way,
you have some work ahead of you. Don't even try to render
it correctly: the required algorithm competes with Mr Ackermann
in performance :D

As long as these kinds of comments are labelled as 'rants',
people will continue to write non-portable software and
fail to face up to the issues.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Check out the Felix programming language http://felix.sf.net


