Gnus development mailing list
* replacing some utf-8 characters on display
@ 2021-10-27 15:53 Eric S Fraga
  2021-10-27 16:00 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 16+ messages in thread
From: Eric S Fraga @ 2021-10-27 15:53 UTC (permalink / raw)
  To: ding

There is currently a long and very interesting discussion going on about
Unicode variants for emojis, e.g. ⚠️ versus ⚠.  The former is composed of
the latter followed by variation selector 16 (whatever that means).

I am currently receiving emails with ⚠ which I would like to have
display a ⚠️ instead in gnus.  Just for æsthetic reasons...

Is there some automatic way (some washing treatment) to handle this?

Thank you,
eric

-- 
Eric S Fraga via Emacs 28.0.60 & org 9.5 on Debian 11.1



* Re: replacing some utf-8 characters on display
  2021-10-27 15:53 replacing some utf-8 characters on display Eric S Fraga
@ 2021-10-27 16:00 ` Lars Ingebrigtsen
  2021-10-27 16:04   ` Eric S Fraga
  2021-10-27 16:39   ` Robert Pluim
  0 siblings, 2 replies; 16+ messages in thread
From: Lars Ingebrigtsen @ 2021-10-27 16:00 UTC (permalink / raw)
  To: Eric S Fraga; +Cc: ding

Eric S Fraga <e.fraga@ucl.ac.uk> writes:

> I am currently receiving emails with ⚠ which I would like to have
> display a ⚠️ instead in gnus.  Just for æsthetic reasons...
>
> Is there some automatic way (some washing treatment) to handle this?

No, but it's trivial to add.  You just look at each character and add
the VS-16 after it:

(when (eq (aref char-script-table char) 'symbol)
  (insert (string #xfe0f)))

Then you'll get the "emoji" expression (if it exists) from everything
that's a symbol.  I think.  

However, this may be confusing for some symbols, I think?  Not for ⚠,
where it's aesthetic only, but I'm wondering whether there's, like, math
symbols that have the same VS-16 behaviour.  (VS-16 means "use the emoji
glyph instead of the symbol glyph".)
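
A rough sketch of the loop described above, for illustration only: the
name `my-append-vs16-to-symbols' is made up, and the sketch assumes that
`char-script-table' reports `symbol' for the characters of interest, as
in the snippet above.  It leaves alone any character that already
carries a VS-16.

```elisp
;; Hypothetical helper, for illustration only; not part of Gnus.
(defun my-append-vs16-to-symbols ()
  "Insert VS-16 (U+FE0F) after each `symbol'-script character in the buffer."
  (save-excursion
    (goto-char (point-min))
    (while (not (eobp))
      (let ((char (following-char)))
        (forward-char 1)
        (cond
         ;; Already followed by VS-16: step over it and leave it alone.
         ((eq (following-char) #xfe0f)
          (forward-char 1))
         ;; The test from the snippet above.
         ((eq (aref char-script-table char) 'symbol)
          (insert (string #xfe0f))))))))
```

Calling it in a scratch buffer that contains a bare ⚠ should leave the
buffer with ⚠ followed by U+FE0F; running it a second time should
change nothing.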

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* Re: replacing some utf-8 characters on display
  2021-10-27 16:00 ` Lars Ingebrigtsen
@ 2021-10-27 16:04   ` Eric S Fraga
  2021-10-27 16:10     ` Lars Ingebrigtsen
  2021-10-27 16:39   ` Robert Pluim
  1 sibling, 1 reply; 16+ messages in thread
From: Eric S Fraga @ 2021-10-27 16:04 UTC (permalink / raw)
  To: ding

On Wednesday, 27 Oct 2021 at 18:00, Lars Ingebrigtsen wrote:
> No, but it's trivial to add.  You just look at each character and add
> the VS-16 after it:
>
> (when (eq (aref char-script-table char) 'symbol)
>   (insert (string #xfe0f)))

Hi Lars,

Thank you for this.  Makes sense.  But where would I add this (and the
loop over all chars, I guess)?

Thank you,
eric
-- 
Eric S Fraga via Emacs 28.0.60 & org 9.5 on Debian 11.1



* Re: replacing some utf-8 characters on display
  2021-10-27 16:04   ` Eric S Fraga
@ 2021-10-27 16:10     ` Lars Ingebrigtsen
  2021-10-27 16:21       ` Eric S Fraga
  0 siblings, 1 reply; 16+ messages in thread
From: Lars Ingebrigtsen @ 2021-10-27 16:10 UTC (permalink / raw)
  To: Eric S Fraga; +Cc: ding

Eric S Fraga <e.fraga@ucl.ac.uk> writes:

> Thank you for this.  Makes sense.  But where would I add this (and the
> loop over all chars, I guess)?

Define a new article washing function, I think?  (I mean, now that
you've given me the idea, I'll be adding it, because I want it too, but
if you want to implement it first and submit a patch, that's even
better.  😆)

Those functions are pretty standardised -- just copy one,
article-treat-smartquotes for instance, and adjust.
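
A minimal sketch of what such a washing command could look like, under
stated assumptions: `my-article-emojize-warning-signs' is a made-up
name, it handles only the warning sign (U+26A0) rather than all
symbols, and it is not the function that ended up in Gnus.  It relies
on `gnus-with-article-buffer' and `article-goto-body' from gnus-art.el.

```elisp
(require 'gnus-art)

;; Hypothetical command, for illustration only; not part of Gnus.
(defun my-article-emojize-warning-signs ()
  "Append VS-16 to each bare WARNING SIGN (U+26A0) in the article body."
  (interactive)
  ;; Runs the body in the article buffer with `inhibit-read-only' bound.
  (gnus-with-article-buffer
    ;; Skip the headers so only the body is washed.
    (article-goto-body)
    (while (search-forward (string #x26a0) nil t)
      ;; Leave sequences that already end in VS-16 untouched.
      (unless (eq (following-char) #xfe0f)
        (insert (string #xfe0f))))))
```

With an article displayed, `M-x my-article-emojize-warning-signs'
should rewrite the body in place; the headers are not touched.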

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* Re: replacing some utf-8 characters on display
  2021-10-27 16:10     ` Lars Ingebrigtsen
@ 2021-10-27 16:21       ` Eric S Fraga
  0 siblings, 0 replies; 16+ messages in thread
From: Eric S Fraga @ 2021-10-27 16:21 UTC (permalink / raw)
  To: ding

On Wednesday, 27 Oct 2021 at 18:10, Lars Ingebrigtsen wrote:
> Define a new article washing function, I think?  (I mean, now that
> you've given me the idea, I'll be adding it, because I want it too, but
> if you want to implement it first and submit a patch, that's even
> better.  😆)

Given that you can probably write this in a blink, whereas it would take
me some time (I have gone and looked at some of the treat
functions... and I know it will still take me some time), I think I'll
probably wait for you to get around to it.  I'm just glad I've given you
the idea.

But I may try as well, as it never hurts to learn something new...  If I
do, I'll let you know, but don't hold your breath! 😉

-- 
Eric S Fraga via Emacs 28.0.60 & org 9.5 on Debian 11.1



* Re: replacing some utf-8 characters on display
  2021-10-27 16:00 ` Lars Ingebrigtsen
  2021-10-27 16:04   ` Eric S Fraga
@ 2021-10-27 16:39   ` Robert Pluim
  2021-10-28 22:01     ` Lars Ingebrigtsen
  1 sibling, 1 reply; 16+ messages in thread
From: Robert Pluim @ 2021-10-27 16:39 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Eric S Fraga, ding

>>>>> On Wed, 27 Oct 2021 18:00:42 +0200, Lars Ingebrigtsen <larsi@gnus.org> said:
    Lars> However, this may be confusing for some symbols, I think?  Not for ⚠,
    Lars> where it's aesthetic only, but I'm wondering whether there's, like, math
    Lars> symbols that have the same VS-16 behaviour.  (VS-16 means "use the emoji
    Lars> glyph instead of the symbol glyph".)

If the math symbol + VS-16 results in an emoji glyph, itʼs not a math
symbol, itʼs an emoji :-) But there are emoji for things like
'multiply', 'divide' that you might not want to turn into fruit salad
in your email. How likely those are to be used instead of the 'normal'
characters I donʼt know.

Robert
-- 


* Re: replacing some utf-8 characters on display
  2021-10-27 16:39   ` Robert Pluim
@ 2021-10-28 22:01     ` Lars Ingebrigtsen
  2021-11-03 10:33       ` Robert Pluim
  0 siblings, 1 reply; 16+ messages in thread
From: Lars Ingebrigtsen @ 2021-10-28 22:01 UTC (permalink / raw)
  To: Robert Pluim; +Cc: Eric S Fraga, ding

Robert Pluim <rpluim@gmail.com> writes:

> If the math symbol + VS-16 results in an emoji glyph, itʼs not a math
> symbol, itʼs an emoji :-) But there are emoji for things like
> 'multiply', 'divide' that you might not want to turn into fruit salad
> in your email. How likely those are to be used instead of the 'normal'
> characters I donʼt know.

So that's...  ➗?  But it's its own separate code point, so I guess it
shouldn't be a problem.

I think it's worth trying (i.e., adding a washing function that adds
VS-16 after all symbol glyphs) and see whether that leads to anything
annoying.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* Re: replacing some utf-8 characters on display
  2021-10-28 22:01     ` Lars Ingebrigtsen
@ 2021-11-03 10:33       ` Robert Pluim
  2021-11-04  5:14         ` Lars Ingebrigtsen
  0 siblings, 1 reply; 16+ messages in thread
From: Robert Pluim @ 2021-11-03 10:33 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Eric S Fraga, ding

>>>>> On Fri, 29 Oct 2021 00:01:30 +0200, Lars Ingebrigtsen <larsi@gnus.org> said:

    Lars> Robert Pluim <rpluim@gmail.com> writes:
    >> If the math symbol + VS-16 results in an emoji glyph, itʼs not a math
    >> symbol, itʼs an emoji :-) But there are emoji for things like
    >> 'multiply', 'divide' that you might not want to turn into fruit salad
    >> in your email. How likely those are to be used instead of the 'normal'
    >> characters I donʼt know.

    Lars> So that's...  ➗?  But it's its own separate code point, so I guess it
    Lars> shouldn't be a problem.

    Lars> I think it's worth trying (i.e., adding a washing function that adds
    Lars> VS-16 after all symbol glyphs) and see whether that leads to anything
    Lars> annoying.

If you do that, then you will end up with a bunch of symbol codepoints
followed by VS-16 when those codepoints aren't emoji. And I donʼt
think you can let-bind glyphless-char-display to make them disappear.

Does your recent emoji input work result in anything that could give
us a list of 'emoji where Emoji Presentation = No' codepoints?

Robert
-- 


* Re: replacing some utf-8 characters on display
  2021-11-03 10:33       ` Robert Pluim
@ 2021-11-04  5:14         ` Lars Ingebrigtsen
  2021-11-04  5:38           ` Lars Ingebrigtsen
  0 siblings, 1 reply; 16+ messages in thread
From: Lars Ingebrigtsen @ 2021-11-04  5:14 UTC (permalink / raw)
  To: Robert Pluim; +Cc: Eric S Fraga, ding

Robert Pluim <rpluim@gmail.com> writes:

> Does your recent emoji input work result in anything that could give
> us a list of 'emoji where Emoji Presentation = No' codepoints?

Yes, we can just check whether the emoji font has a glyph for the symbol
before inserting the VS-16.  Here's a test message -- only the second
symbol should get a VS-16:

Emoji: ⚠️

Symbol with emoji: ⚠

Symbol without emoji: 𝐖
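
One way such a font check might be written, as a sketch only: the name
`my-emoji-font-has-glyph-p' is hypothetical, using U+1F600 to find "the
emoji font" is an assumption, and nothing in this thread shows that the
committed code does exactly this.

```elisp
;; Hypothetical predicate, for illustration only; not part of Gnus.
(defun my-emoji-font-has-glyph-p (char)
  "Return non-nil if the font used for emoji also has a glyph for CHAR.
Return nil on displays that cannot show multiple fonts."
  (when (display-multi-font-p)
    ;; `internal-char-font' returns (FONT-OBJECT . GLYPH-CODE) or nil;
    ;; U+1F600 (grinning face) stands in for "any emoji".
    (let ((font (car (internal-char-font nil #x1f600))))
      (and font (font-has-char-p font char)))))
```

A washing function could call this predicate on each symbol character
and insert the VS-16 only when it returns non-nil.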

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* Re: replacing some utf-8 characters on display
  2021-11-04  5:14         ` Lars Ingebrigtsen
@ 2021-11-04  5:38           ` Lars Ingebrigtsen
  2021-11-04  5:51             ` Lars Ingebrigtsen
  2021-11-04 12:26             ` Eric S Fraga
  0 siblings, 2 replies; 16+ messages in thread
From: Lars Ingebrigtsen @ 2021-11-04  5:38 UTC (permalink / raw)
  To: Robert Pluim; +Cc: Eric S Fraga, ding

Lars Ingebrigtsen <larsi@gnus.org> writes:

> Yes, we can just check whether the emoji font has a glyph for the symbol
> before inserting the VS-16.  Here's a test message -- only the second
> symbol should get a VS-16:

Seems to be working -- now added to the trunk as
`gnus-treat-emojize-symbols' and is on `W D e'.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* Re: replacing some utf-8 characters on display
  2021-11-04  5:38           ` Lars Ingebrigtsen
@ 2021-11-04  5:51             ` Lars Ingebrigtsen
  2021-11-04  5:55               ` Lars Ingebrigtsen
  2021-11-04 12:26             ` Eric S Fraga
  1 sibling, 1 reply; 16+ messages in thread
From: Lars Ingebrigtsen @ 2021-11-04  5:51 UTC (permalink / raw)
  To: Robert Pluim; +Cc: Eric S Fraga, ding

Lars Ingebrigtsen <larsi@gnus.org> writes:

> Lars Ingebrigtsen <larsi@gnus.org> writes:
>
>> Yes, we can just check whether the emoji font has a glyph for the symbol
>> before inserting the VS-16.  Here's a test message -- only the second
>> symbol should get a VS-16:
>
> Seems to be working -- now added to the trunk as
> `gnus-treat-emojize-symbols' and is on `W D e'.

Hm...  but what happens with characters that are already emoji?

Test: 😀

Symbol: ⚠

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* Re: replacing some utf-8 characters on display
  2021-11-04  5:51             ` Lars Ingebrigtsen
@ 2021-11-04  5:55               ` Lars Ingebrigtsen
  2021-11-10 15:12                 ` Eric S Fraga
                                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Lars Ingebrigtsen @ 2021-11-04  5:55 UTC (permalink / raw)
  To: Robert Pluim; +Cc: Eric S Fraga, ding

Lars Ingebrigtsen <larsi@gnus.org> writes:

> Hm...  but what happens with characters that are already emoji?
>
> Test: 😀️
>
> Symbol: ⚠️

Seems OK, even if it added a VS-16 unnecessarily.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* Re: replacing some utf-8 characters on display
  2021-11-04  5:38           ` Lars Ingebrigtsen
  2021-11-04  5:51             ` Lars Ingebrigtsen
@ 2021-11-04 12:26             ` Eric S Fraga
  1 sibling, 0 replies; 16+ messages in thread
From: Eric S Fraga @ 2021-11-04 12:26 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Robert Pluim, ding

On Thursday,  4 Nov 2021 at 06:38, Lars Ingebrigtsen wrote:
> Seems to be working -- now added to the trunk as
> `gnus-treat-emojize-symbols' and is on `W D e'.

Thank you for this.  I'm currently tracking the v28 branch to help the
release but will try this out when I get back to tracking master...

Thanks again,
eric

-- 
Eric S Fraga via Emacs 28.0.60 & org 9.5 on Debian 11.1


* Re: replacing some utf-8 characters on display
  2021-11-04  5:55               ` Lars Ingebrigtsen
@ 2021-11-10 15:12                 ` Eric S Fraga
  2021-11-10 15:15                 ` Eric S Fraga
  2021-11-10 15:17                 ` Eric S Fraga
  2 siblings, 0 replies; 16+ messages in thread
From: Eric S Fraga @ 2021-11-10 15:12 UTC (permalink / raw)
  To: ding

Lars,

I have finally switched to the master branch and had a chance to try
this out.  Works very well!  Thank you.

-- 
Eric S Fraga via Emacs 29.0.50 & org 9.5 on Debian 11.1



* Re: replacing some utf-8 characters on display
  2021-11-04  5:55               ` Lars Ingebrigtsen
  2021-11-10 15:12                 ` Eric S Fraga
@ 2021-11-10 15:15                 ` Eric S Fraga
  2021-11-10 15:17                 ` Eric S Fraga
  2 siblings, 0 replies; 16+ messages in thread
From: Eric S Fraga @ 2021-11-10 15:15 UTC (permalink / raw)
  To: ding

Lars,

I have finally switched to the master branch and had a chance to try
this out.  Works very well!  Thank you.

(apologies if my post comes through twice, but I'm getting some error
in gnus today, since switching to the master branch, something to do
with string-trim... so I've now set debug-on-error to see if I can
figure out why.)
-- 
Eric S Fraga via Emacs 29.0.50 & org 9.5 on Debian 11.1



* Re: replacing some utf-8 characters on display
  2021-11-04  5:55               ` Lars Ingebrigtsen
  2021-11-10 15:12                 ` Eric S Fraga
  2021-11-10 15:15                 ` Eric S Fraga
@ 2021-11-10 15:17                 ` Eric S Fraga
  2 siblings, 0 replies; 16+ messages in thread
From: Eric S Fraga @ 2021-11-10 15:17 UTC (permalink / raw)
  To: ding

Lars,

I have finally switched to the master branch and had a chance to try
this out.  Works very well!  Thank you.

(apologies if this comes through twice: gnus-recent is not working with
the latest Gnus and I have now disabled it.)

-- 
Eric S Fraga via Emacs 29.0.50 & org 9.5 on Debian 11.1


