caml-list - the Caml user's mailing list
From: "Daniel Bünzli" <daniel.buenzli@erratique.ch>
To: caml-list@inria.fr
Cc: "David Kaloper Meršinjak" <dk505@cl.cam.ac.uk>
Subject: [Caml-list] Unicode 13.0.0 update for Uucd, Uucp, Uunf and Uuseg
Date: Wed, 11 Mar 2020 10:54:31 +0100
Message-ID: <etPan.5e68b557.34995102.1389d@erratique.ch>

Hello, 

Unicode 13.0.0 was released on the 10th of March.

It adds 5930 characters to the standard, including graphic symbols for legacy computing. If you were looking for characters representing seven-segment decimal digits, now you [have them][0]. For the curious, the [encoding proposal][1] has the motivation and source of these new symbols. For more information about all the other additions, see [this page][2].
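
For those who want to try the new digits from OCaml, here is a minimal sketch using only the standard library (OCaml 4.06 or later for `Buffer.add_utf_8_uchar`). It assumes the segmented digits zero to nine sit at U+1FBF0..U+1FBF9, as shown in the chart linked above; the function name `segmented_of_string` is made up for this illustration.

```ocaml
(* Replace each ASCII decimal digit by the corresponding segmented digit,
   assumed to be U+1FBF0 (zero) through U+1FBF9 (nine). Other bytes are
   copied unchanged, so the input is expected to be ASCII or UTF-8. *)
let segmented_of_string (s : string) : string =
  let b = Buffer.create (String.length s * 4) in
  String.iter
    (fun c ->
      match c with
      | '0' .. '9' ->
          let u = Uchar.of_int (0x1FBF0 + Char.code c - Char.code '0') in
          Buffer.add_utf_8_uchar b u
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

(* Prints the version number with segmented digits; whether they render
   depends on font support for the new block. *)
let () = print_endline (segmented_of_string "13.0.0")
```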

Accordingly, the libraries mentioned at the end of this message had to be updated; consult the individual release notes for details. Both Uucd and Uucp are incompatible releases since new script and block enumerants had to be added.

Uucp has a new Emoji module with the new emoji properties introduced in 13.0.0, which are now used by Uuseg to improve emoji segmentation. The overall compiled size of Uucp shrank a bit; here uucp.cmxs went from 7.8 MB to 4.6 MB. Further reduction can likely be achieved with more work. Thanks to David Kaloper Meršinjak for helping with this.
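
As a rough sketch of how the two libraries fit together, the following queries one property from `Uucp.Emoji` and segments a string into grapheme clusters with `Uuseg_string.fold_utf_8`. It assumes the uucp and uuseg libraries (including the `Uuseg_string` module) are installed and linked; the helper name `grapheme_clusters` and the sample string are chosen only for this illustration.

```ocaml
(* Split a UTF-8 encoded string into its grapheme clusters, in order.
   Uuseg_string.fold_utf_8 calls the folder once per segment. *)
let grapheme_clusters (s : string) : string list =
  List.rev
    (Uuseg_string.fold_utf_8 `Grapheme_cluster (fun acc g -> g :: acc) [] s)

let () =
  (* MAN, ZERO WIDTH JOINER, WOMAN, ZERO WIDTH JOINER, GIRL *)
  let family = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}" in
  Printf.printf "grapheme clusters: %d\n"
    (List.length (grapheme_clusters family));
  (* One of the emoji properties exposed by the Uucp.Emoji module. *)
  Printf.printf "U+1F468 is Extended_Pictographic: %b\n"
    (Uucp.Emoji.is_extended_pictographic (Uchar.of_int 0x1F468))
```

The Unicode grapheme cluster rules keep an emoji joined by ZERO WIDTH JOINER to a following Extended_Pictographic character in one cluster, which is why segmentation depends on these properties.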

A periodic reminder: if Unicode still puzzles you, read the absolute minimal Unicode introduction and the OCaml Unicode tips on [this page][3] (also available via `odig doc uucp`).

Happy retro computing,

Daniel

P.S. The OCaml compiler [detected][4] an obsolete rule in the 13.0.0 update of the Unicode line breaking algorithm.

[0]: https://www.unicode.org/charts/PDF/U1FB00.pdf
[1]: https://www.unicode.org/L2/L2019/19025-terminals-prop.pdf
[2]: http://blog.unicode.org/2020/03/announcing-unicode-standard-version-130.html
[3]: https://erratique.ch/software/uucp/doc/unicode.html
[4]: https://www.unicode.org/mail-arch/unicode-ml/y2020-m03/0000.html

---

Uucd 13.0.0 Unicode character database decoder for OCaml.
http://erratique.ch/software/uucd

Uucp 13.0.0 Unicode character properties for OCaml.
http://erratique.ch/software/uucp

Uunf 13.0.0 Unicode text normalization for OCaml.
http://erratique.ch/software/uunf

Uuseg 13.0.0 Unicode text segmentation for OCaml.
http://erratique.ch/software/uuseg
