From: Yoriyuki Yamagata
To: Sylvain Le Gall, Caml List
Date: Tue, 1 Mar 2011 14:49:47 +0900
References: <20110228.093528.996524125295855263.Christophe.Troestler@umons.ac.be> <1298902420.27243.42.camel@thinkpad>
Subject: [Caml-list] Re: GSoC: better UTF-8 support

Sorry, I didn't notice
this thread, since caml-list mail sometimes does not reach me. So allow me to jump into the discussion.

I think the entire discussion went a bit astray. It seems to me that the argument is heading toward specifying the project details as much as possible. In my experience as a GSoC mentor (yes, I was one), students often come up with better ideas than their mentors. Therefore, instead of specifying the details, we had better specify a general direction and let the students decide.

As the general direction, I think we need

1) A lightweight stdlib replacement: data types for Unicode characters and strings, extensible character encoding, and simple IO. Interfaces should be purely functional as far as possible. For example, strings will be immutable, IO will be monadic, etc.

2) A minimal language extension: Unicode character and string literals, a Unicode-aware toplevel (pretty printing), etc.

This is by no means complete support for Unicode, but with this in the stdlib we can add more features of the Unicode standard (through Camomile, for example) or modify third-party libraries to use Unicode.

Best,
-- 
Yoriyuki Yamagata
yoriyuki.y@gmail.com
http://sites.google.com/site/yoriyukiy/