ruby-core@ruby-lang.org archive (unofficial mirror)
From: janosch-x via ruby-core <ruby-core@ml.ruby-lang.org>
To: ruby-core@ml.ruby-lang.org
Cc: janosch-x <noreply@ruby-lang.org>
Subject: [ruby-core:116056] [Ruby master Feature#19908] Update to Unicode 15.1
Date: Sat, 06 Jan 2024 21:28:06 +0000 (UTC)
Message-ID: <redmine.journal-106054.20240106212805.4@ruby-lang.org>
In-Reply-To: <redmine.issue-19908.20231002065545.4@ruby-lang.org>

Issue #19908 has been updated by janosch-x (Janosch Müller).





Isn't [this](https://www.unicode.org/reports/tr29/tr29-43.html#Regex_Definitions) the updated regular expression?



```diff
 ccs-base :=     [\p{L}\p{N}\p{P}\p{S}\p{Zs}]
 ccs-extend :=  [\p{M}\p{Join_Control}]
 extended_base :=       ccs-base
 | hangul-syllable
-crlf :=        CR LF
+crlf :=        CR LF | CR | LF
 legacy-core := hangul-syllable
 | ri-sequence
 | xpicto-sequence
 legacy-postcore :=    [Extend ZWJ]
 core :=        hangul-syllable
 | ri-sequence
 | xpicto-sequence
+| conjunctCluster
 | [^Control CR LF]
 postcore :=    [Extend ZWJ SpacingMark]
 precore :=     Prepend
 hangul-syllable :=    L* (V+ | LV V* | LVT) T*
 | L+
 | T+
 xpicto-sequence :=     \p{Extended_Pictographic} (Extend* ZWJ \p{Extended_Pictographic})*
+conjunctCluster :=     \p{InCB=Consonant} ([\p{InCB=Extend} \p{InCB=Linker}]* \p{InCB=Linker} [\p{InCB=Extend} \p{InCB=Linker}]* \p{InCB=Consonant})+
```
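
To make the added `conjunctCluster` rule concrete, here is a minimal Ruby sketch of it. It does not assume that `\p{InCB=...}` is available in Ruby's regex engine; instead, the three character classes are small, hand-picked Devanagari stand-ins for the real property values (an assumption for illustration, not the Unicode data). The constant names and the `conjunct_clusters` helper are hypothetical.

```ruby
# Sketch of the conjunctCluster rule quoted from UAX #29.
# ASSUMPTION: these classes are tiny, hand-picked Devanagari stand-ins
# for the real InCB property values, not the full Unicode data.
CONSONANT = /[\u0915-\u0939]/   # stand-in for \p{InCB=Consonant}
LINKER    = /\u094D/            # virama, stand-in for \p{InCB=Linker}
EXTEND    = /[\u093C\u200D]/    # nukta and ZWJ, stand-in for \p{InCB=Extend}

# [\p{InCB=Extend} \p{InCB=Linker}] from the rule
EXTEND_OR_LINKER = /(?:#{EXTEND}|#{LINKER})/

# Consonant, then one or more "(... Linker ... Consonant)" groups
CONJUNCT_CLUSTER =
  /#{CONSONANT}(?:#{EXTEND_OR_LINKER}*#{LINKER}#{EXTEND_OR_LINKER}*#{CONSONANT})+/

# Hypothetical helper: returns every conjunct cluster found in +str+.
def conjunct_clusters(str)
  str.scan(CONJUNCT_CLUSTER)
end

p conjunct_clusters("\u0915\u094D\u0937") # KA + virama + SSA: one cluster
p conjunct_clusters("\u0915\u0937")       # no linker between consonants: []
```

Under this rule, two consonants only form a single cluster when at least one linker sits between them; extend characters may appear on either side of that linker.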



----------------------------------------

Feature #19908: Update to Unicode 15.1

https://bugs.ruby-lang.org/issues/19908#change-106054



* Author: nobu (Nobuyoshi Nakada)

* Status: Assigned

* Priority: Normal

* Assignee: duerst (Martin Dürst)

----------------------------------------

Unicode 15.1 has been released.



The current enc-unicode.rb seems to fail because the `Indic_Conjunct_Break` property takes values.



I'm not sure how best to handle these properties.

Should they be written as `/\p{InCB_Linker}/`, or as `/\p{InCB=Linker}/` as in the comments in that file?

https://github.com/nobu/ruby/tree/unicode-15.1 uses the former.
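
As a rough way to see which spelling a given Ruby build accepts, the sketch below compiles each candidate with `Regexp.new` and rescues `RegexpError`. The helper name `property_supported?` is hypothetical, and the result depends entirely on the Ruby build being run; nothing here implies that either form is supported.

```ruby
# Hypothetical helper: true if this Ruby build accepts +name+ inside \p{...}.
# Regexp.new raises RegexpError for an unknown character property name.
def property_supported?(name)
  Regexp.new("\\p{#{name}}")
  true
rescue RegexpError
  false
end

# The two spellings discussed in the issue description.
%w[InCB_Linker InCB=Linker].each do |name|
  status = property_supported?(name) ? "accepted" : "rejected"
  puts "\\p{#{name}}: #{status}"
end
```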







-- 

https://bugs.ruby-lang.org/



Thread overview: 10+ messages
2023-10-02  6:55 [ruby-core:114936] " nobu (Nobuyoshi Nakada) via ruby-core
2023-10-02 14:06 ` [ruby-core:114939] " Игорь Пятчиц via ruby-core
2023-12-26  6:52 ` [ruby-core:115899] " duerst via ruby-core
2023-12-26 11:42 ` [ruby-core:115906] " duerst via ruby-core
2024-01-06 21:28 ` janosch-x via ruby-core [this message]
2024-01-09  1:25 ` [ruby-core:116099] " duerst via ruby-core
2024-09-12  1:56 ` [ruby-core:119128] " hsbt (Hiroshi SHIBATA) via ruby-core
2024-09-12  3:21 ` [ruby-core:119130] " duerst via ruby-core
2024-09-12  3:53 ` [ruby-core:119131] " hsbt (Hiroshi SHIBATA) via ruby-core
2025-01-01 15:06 ` [ruby-core:120460] " ima1zumi (Mari Imaizumi) via ruby-core
