Gnus development mailing list
* shr.el: folding Japanese text
@ 2010-10-08  8:10 Katsumi Yamaoka
  2010-10-08 17:05 ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-08  8:10 UTC (permalink / raw)
  To: ding

Hi,

Could you look at the articles in the gwene.jp.itmedia.news.bursts
newsgroup?  You may see that some aren't folded and some are folded
in an ugly way.  There's no concept of word wrapping in Japanese.
Normally there's no space between words, and a word may be folded
in the middle.  Is it funny?  But it's our custom. ;-)
I guess that Korean text and Chinese text are similar.

The following is a quick hack that satisfies me so-so.  For the
moment I don't know how we can switch it for Latin text and
others, though.

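(defun shr-insert (text)
  ;; Start a new line after an image.
  (when (eq shr-state 'image)
    (insert "\n")
    (setq shr-state nil))
  (unless (string-equal text "\n")
    (let ((start (point))
          nls)
      (insert text)
      ;; Fold the inserted text with the standard fill machinery.
      (fill-region start (point))
      (goto-char start)
      (skip-chars-forward "\n")
      ;; nls is zero or negative: minus the number of newlines skipped.
      (setq nls (skip-chars-backward "\n"))
      (cond ((bobp)
             ;; No blank lines at the beginning of the buffer.
             (delete-char (- nls)))
            ((< nls -2)
             ;; Keep at most two consecutive newlines.
             (delete-char (- -2 nls)))))
    (goto-char (point-max))))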

Regards,



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-08  8:10 shr.el: folding Japanese text Katsumi Yamaoka
@ 2010-10-08 17:05 ` Lars Magne Ingebrigtsen
  2010-10-08 17:23   ` Ted Zlatanov
  0 siblings, 1 reply; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-08 17:05 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka <yamaoka@jpl.org> writes:

> Could you see articles in the gwene.jp.itmedia.news.bursts
> newsgroup?  You may see some aren't folded and some are folded
> uglily.  There's no concept of the word wrapping in Japanese.
> Normally there's no space between words.  A word may be folded
> in the middle of it.  Is it funny?  But it's our custom. ;-)
> I guess that Korean text and Chinese text are similar.
>
> The following is a quick hack that satisfies me so-so.  For the
> moment I don't know how we can switch it for Latin text and
> others, though.

Yeah, this should be fixed to work on all languages.  But I'm not sure
what would be the right approach here.  Should we add language-guessing
stuff to switch the folding routine?  Or are there other (simpler) ways
to get satisfactory results in this area?

Right now I'm not seeing any obvious easy solutions here, but surely
somebody has dealt with problems like this before.  :-)  Any ideas?

Just when typing this, it occurs to me that there may be help from the
unicode standard, perhaps?  I mean, if shr-insert inserts a text with
characters from the Japanese, Chinese, Korean and *mumble* planes, then
we can break it on "character" boundaries, instead of where spaces appear.
So if Emacs has a way to say (character-from-non-space-language-p char),
then our problems would be solved.  :-)
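
A rough sketch of such a predicate, for illustration only.  The names
`my-char-from-non-space-language-p' and `my-breakable-positions' are
made up here (they are not Emacs functions), and treating "at least two
columns wide" as a stand-in for "script written without spaces" is an
assumption.  The only real Emacs function relied on is `char-width'.

```elisp
;; Hypothetical helpers, not part of Emacs or shr.el.
(defun my-char-from-non-space-language-p (char)
  "Return non-nil if CHAR is probably from a script written without spaces.
Assumption: any character that occupies two or more columns is
treated as breakable on a character boundary."
  (>= (char-width char) 2))

(defun my-breakable-positions (string)
  "Return the indices in STRING after which a line could be broken.
A break is allowed after a space, or between two wide characters."
  (let ((positions '())
        (i 0)
        (len (length string)))
    (while (< i (1- len))
      (let ((this (aref string i))
            (next (aref string (1+ i))))
        (when (or (eq this ?\s)
                  (and (my-char-from-non-space-language-p this)
                       (my-char-from-non-space-language-p next)))
          (push i positions)))
      (setq i (1+ i)))
    (nreverse positions)))
```

With this, Latin text only offers break points at spaces, while a run
of kanji offers one between every pair of characters.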

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-08 17:05 ` Lars Magne Ingebrigtsen
@ 2010-10-08 17:23   ` Ted Zlatanov
  2010-10-09 16:09     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 44+ messages in thread
From: Ted Zlatanov @ 2010-10-08 17:23 UTC (permalink / raw)
  To: ding

On Fri, 08 Oct 2010 19:05:54 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: 

LMI> Just when typing this, it occurs to me that there may be help from the
LMI> unicode standard, perhaps?  I mean, if shr-insert inserts a text with
LMI> characters from the Japanese, Chinese, Korean and *mumble* planes, then
LMI> we can break it on "character" boundaries, instead of where spaces appear.
LMI> So if Emacs has a way to say (character-from-non-space-language-p char),
LMI> then our problems would be solved.  :-)

http://unicode.org/cldr/utility/properties.jsp

Maybe you can use some of these properties?

http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[:East_Asian_Width=Wide:]
http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[:Line_Break=Ideographic:]
http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[:Word_Break=Katakana:]

I don't know the right way, sorry...

Ted




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-08 17:23   ` Ted Zlatanov
@ 2010-10-09 16:09     ` Lars Magne Ingebrigtsen
  2010-10-09 21:35       ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-09 16:09 UTC (permalink / raw)
  To: ding

Ted Zlatanov <tzz@lifelogs.com> writes:

> http://unicode.org/cldr/utility/properties.jsp
>
> Maybe you can use some of these properties?
>
> http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[:East_Asian_Width=Wide:]

The page just hangs for me...

Oh.  .jsp.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-09 16:09     ` Lars Magne Ingebrigtsen
@ 2010-10-09 21:35       ` Lars Magne Ingebrigtsen
  2010-10-09 21:48         ` Lars Magne Ingebrigtsen
                           ` (3 more replies)
  0 siblings, 4 replies; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-09 21:35 UTC (permalink / raw)
  To: ding

I tried yanking some text from an article in
gwene.cn.com.sina.blog.hanhan (Chinese text, I guess) into a .txt
buffer, and then just `M-q'-ing it.  It didn't look very nice.  It
looked as if it thought the characters were much narrower than they
really were, because it filled the lines way too late.  Like this:

最近一个月没有更新,是因为不知道说什么好,我本人也正在赶新的小说,会在9月
出版,前几天去香港书展,长望着维港,更觉得应该有好的作品奉献给大家,虽然
他人无德无能,但我又何德何能,我深感忧虑,何以解愁,唯有作品。

(I hope that's not a naughty text and I'll end up in jail now for
threatening terroristey behaviour...)

In my Emacs, that text is displayed at least five characters too wide
before the lines were folded.  But that probably depends on what font
you use.

Doesn't Emacs have any way to say "how many pixels have I left now"?

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-09 21:35       ` Lars Magne Ingebrigtsen
@ 2010-10-09 21:48         ` Lars Magne Ingebrigtsen
  2010-10-10  4:57         ` CHENG Gao
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-09 21:48 UTC (permalink / raw)
  To: ding

fill.el has a char table called `fill-nospace-between-words-table' that
maps between characters and ... stuff.  It seems to have the kinsoku
function as the value for Japanese characters, for instance.

It can probably be used to do something useful, but only if we actually
know how wide the characters are as displayed.  Which fill.el doesn't
really seem to know, either...
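
A small sketch of how that table could be queried, assuming fill.el is
already loaded (it normally is in a running Emacs).  The helper name
`my-nospace-script-p' is hypothetical; only
`fill-nospace-between-words-table' comes from fill.el.  Since the stored
value is not necessarily `t', the sketch only distinguishes nil from
non-nil.

```elisp
;; Hypothetical helper; only the char-table itself comes from fill.el.
(defun my-nospace-script-p (char)
  "Return t if fill.el marks CHAR as needing no space between words."
  (and (aref fill-nospace-between-words-table char) t))
```

This says nothing about displayed width; it only tells whether spaces
are expected between words in the character's script.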

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-09 21:35       ` Lars Magne Ingebrigtsen
  2010-10-09 21:48         ` Lars Magne Ingebrigtsen
@ 2010-10-10  4:57         ` CHENG Gao
  2010-10-10 13:42         ` Andreas Schwab
  2010-10-11  7:23         ` Kan-Ru Chen
  3 siblings, 0 replies; 44+ messages in thread
From: CHENG Gao @ 2010-10-10  4:57 UTC (permalink / raw)
  To: ding

*On Sat, 09 Oct 2010 23:35:27 +0200
* Also sprach Lars Magne Ingebrigtsen <larsi@gnus.org>:

> I tried yanking some text from an article in
> gwene.cn.com.sina.blog.hanhan (Chinese text, I guess) into a .txt
> buffer, and then just `M-q'-ing it.  It didn't look very nice.  It
> looked as if it thought the characters were much narrower than they really
> were, because it filled the lines way too late.  Like this:
>
> 最近一个月没有更新,是因为不知道说什么好,我本人也正在赶新的小说,会在9月
> 出版,前几天去香港书展,长望着维港,更觉得应该有好的作品奉献给大家,虽然
> 他人无德无能,但我又何德何能,我深感忧虑,何以解愁,唯有作品。
>
> (I hope that's not a naughty text and I'll end up in jail now for
> threatening terroristey behaviour...)

Don't worry. Han Han is a young and famous novelist (though I have no
interest in his type). He was just explaining why he had not updated
his blog, and that he is writing a new novel, blah blah. Nothing
naughty.




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-09 21:35       ` Lars Magne Ingebrigtsen
  2010-10-09 21:48         ` Lars Magne Ingebrigtsen
  2010-10-10  4:57         ` CHENG Gao
@ 2010-10-10 13:42         ` Andreas Schwab
  2010-10-10 13:47           ` Lars Magne Ingebrigtsen
  2010-10-11  7:23         ` Kan-Ru Chen
  3 siblings, 1 reply; 44+ messages in thread
From: Andreas Schwab @ 2010-10-10 13:42 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> I tried yanking some text from an article in
> gwene.cn.com.sina.blog.hanhan (Chinese text, I guess) into a .txt
> buffer, and then just `M-q'-ing it.  It didn't look very nice.  It
> looked as if it thought the characters were much narrower than they really
> were, because it filled the lines way too late.  Like this:
>
> 最近一个月没有更新,是因为不知道说什么好,我本人也正在赶新的小说,会在9月
> 出版,前几天去香港书展,长望着维港,更觉得应该有好的作品奉献给大家,虽然
> 他人无德无能,但我又何德何能,我深感忧虑,何以解愁,唯有作品。

What's your fill-column?

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-10 13:42         ` Andreas Schwab
@ 2010-10-10 13:47           ` Lars Magne Ingebrigtsen
  2010-10-10 16:22             ` Andreas Schwab
  2010-10-11  4:03             ` James Cloos
  0 siblings, 2 replies; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-10 13:47 UTC (permalink / raw)
  To: ding

Andreas Schwab <schwab@linux-m68k.org> writes:

> What's your fill-column?

72.

The problem is that Emacs says that `char-width' on the kanji is 2,
while they're really 24 pixels wide, compared to the 10 pixels other
characters are.

I guess Emacs doesn't really know this until the redisplay has
happened.  `posn-at-point' gets the number of pixels displayed, but only
after display has happened, for instance?
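
To make the mismatch concrete, here is a sketch of the column
arithmetic, under the numbers quoted above (24-pixel kanji, 10-pixel
narrow characters, fill-column 72).  The helper names are hypothetical;
`string-width' is the standard function that sums `char-width', so it
counts character cells, not pixels.

```elisp
;; Hypothetical helpers for illustration.
(defun my-columns-left (string width)
  "Return WIDTH minus the column width of STRING, as Emacs counts columns."
  (- width (string-width string)))

(defun my-pixel-overshoot (n-wide wide-px narrow-px fill-col)
  "Return the pixels by which N-WIDE wide characters exceed the line.
The line is assumed to be FILL-COL cells of NARROW-PX pixels each,
and every wide character is assumed to be WIDE-PX pixels wide."
  (- (* n-wide wide-px) (* fill-col narrow-px)))
```

For example, (my-pixel-overshoot 36 24 10 72) compares 36 kanji, which
Emacs counts as 72 columns, against 72 narrow cells: 864 pixels against
720, so the line runs 144 pixels (six kanji) too long with such fonts.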

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-10 13:47           ` Lars Magne Ingebrigtsen
@ 2010-10-10 16:22             ` Andreas Schwab
  2010-10-11  4:03             ` James Cloos
  1 sibling, 0 replies; 44+ messages in thread
From: Andreas Schwab @ 2010-10-10 16:22 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> The problem is that Emacs says that `char-width' on the kanji is 2,
> while they're really 24 pixels wide, compared to the 10 pixels other
> characters are.

That depends on the font, of course.  With the fonts I use the double
width characters are exactly twice as wide as the single width ones.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-10 13:47           ` Lars Magne Ingebrigtsen
  2010-10-10 16:22             ` Andreas Schwab
@ 2010-10-11  4:03             ` James Cloos
  1 sibling, 0 replies; 44+ messages in thread
From: James Cloos @ 2010-10-11  4:03 UTC (permalink / raw)
  To: ding

>>>>> "LMI" == Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

LMI> The problem is that Emacs says that `char-width' on the kanji is 2,
LMI> while they're really 24 pixels wide, compared to the 10 pixels other
LMI> characters are.

In that case it is a general problem of font mismatch.  Emacs is still
heavily geared to mono- (or duo-) width fonts, even though it can display
proportional faces.

AFAICT it always guesses distances as a fixed number of character cells
and compensates as best it can after trying to display a string.

I'd love to see support for true proportional editing; it would make
modes like cc much easier on the eyes.  But simultaneously maintaining
nice alignment for both charcell terminals and arbitrary sets of
proportional fonts is a hard problem.

-JimC
-- 
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-09 21:35       ` Lars Magne Ingebrigtsen
                           ` (2 preceding siblings ...)
  2010-10-10 13:42         ` Andreas Schwab
@ 2010-10-11  7:23         ` Kan-Ru Chen
  2010-10-11 18:07           ` Lars Magne Ingebrigtsen
  3 siblings, 1 reply; 44+ messages in thread
From: Kan-Ru Chen @ 2010-10-11  7:23 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> I tried yanking some text from an article in
> gwene.cn.com.sina.blog.hanhan (Chinese text, I guess) into a .txt
> buffer, and then just `M-q'-ing it.  It didn't look very nice.  It
> looked as if it thought the characters were much narrower than they really
> were, because it filled the lines way too late.  Like this:
>
> 最近一个月没有更新,是因为不知道说什么好,我本人也正在赶新的小说,会在9月
> 出版,前几天去香港书展,长望着维港,更觉得应该有好的作品奉献给大家,虽然
> 他人无德无能,但我又何德何能,我深感忧虑,何以解愁,唯有作品。
>
> (I hope that's not a naughty text and I'll end up in jail now for
> threatening terroristey behaviour...)

The text displays totally OK in a terminal emulator, because
historically ideographs are `double-width'. If you use a monospace
CJK font which is exactly twice as wide as the Latin font, you will find
the text is wrapped around the 73rd column as well.

(The text itself is OK, too ;-)

> In my Emacs, that text is displayed at least five characters too wide
> before the lines were folded.  But that probably depends on what font
> you use.
>
> Doesn't Emacs have any way to say "how many pixels have I left now"?

That would be better than counting characters. But consider that we can
use both GUI and terminal interfaces at the same time (multi-tty),
counting pixels seems bogus, too.

Kanru
-- 
Q: Why are my replies five sentences or less?
A: http://five.sentenc.es/




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-11  7:23         ` Kan-Ru Chen
@ 2010-10-11 18:07           ` Lars Magne Ingebrigtsen
  2010-10-12  8:19             ` Katsumi Yamaoka
  0 siblings, 1 reply; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-11 18:07 UTC (permalink / raw)
  To: ding

Kan-Ru Chen <kanru@kanru.info> writes:

> That would be better than counting characters. But consider that we can
> use both GUI and terminal interfaces at the same time (multi-tty),
> counting pixels seems bogus, too.

You'd have to count pixels in the frame you're using to display it...

Anyway, it seems the best we can do is to use `char-width' and break
between these characters in the same way that fill.el does it.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-11 18:07           ` Lars Magne Ingebrigtsen
@ 2010-10-12  8:19             ` Katsumi Yamaoka
  2010-10-12 12:48               ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-12  8:19 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen wrote:
> You'd have to count pixels in the frame you're using to display it...

I'm not quite sure, but at least Japanese people seem to use a
16-dot-wide font for Kanji if an 8-dot-wide font is used for ASCII
(or 14-dot for Kanji and 7-dot for ASCII, etc.).  And the fill
functions seem to have been designed assuming such a habit (i.e.,
the Kanji font is twice the width of the ASCII font).  There seems
to be no remedy, though, for text that looks ugly with proportional
fonts.

> Anyway, it seems the best we can do is to use `char-width' and break
> between these characters in the same way that fill.el does it.

Is there a reason not to use the fill functions?  How about the
following?

--8<---------------cut here---------------start------------->8---
(defun shr-insert (text)
  ;; Start a new line after an image, unless TEXT is only whitespace.
  (when (and (eq shr-state 'image)
             (not (string-match "\\`[ \t\n]+\\'" text)))
    (insert "\n")
    (setq shr-state nil))
  (cond
   ((eq shr-folding-mode 'none)
    (insert text))
   (t
    ;; Strip leading and trailing whitespace from TEXT.
    (if (string-match "\\`[\t\n ]+" text)
        (setq text (substring text (match-end 0))))
    (if (string-match "[\t\n ]+\\'" text)
        (setq text (substring text 0 (match-beginning 0))))
    (unless (string-equal text "")
      (let ((start (line-beginning-position))
            (fill-column shr-width)
            (fill-prefix (make-string shr-indentation ? )))
        (cond ((bolp)
               (insert fill-prefix))
              ;; Already separated by a space; add nothing.
              ((eq (char-before) ? ))
              ;; Add a space unless both neighbors are wide characters.
              ((not (and (>= (char-width (or (char-before) ? )) 2)
                         (>= (char-width (aref text 0)) 2)))
               (insert " ")))
        (insert text)
        ;; Let fill.el refold the current line.
        (fill-region-as-paragraph start (point)))))))
--8<---------------cut here---------------end--------------->8---

It looks good enough to me for any language.



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-12  8:19             ` Katsumi Yamaoka
@ 2010-10-12 12:48               ` Lars Magne Ingebrigtsen
  2010-10-12 14:13                 ` Katsumi Yamaoka
  0 siblings, 1 reply; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-12 12:48 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka <yamaoka@jpl.org> writes:

> Is there a reason not to use the fill functions?  How about the
> following?

I was hoping to avoid using the fill functions so that all the things
that need start-end regions (<A>, <EM>, etc) could rely on the start
point never changing (much) after it's been set.

I'm not sure that's actually the case any more, since the folding algo
has gotten progressively more complicated, so it might make sense to
switch the start-point things to using markers and allowing filling
after the entire paragraph has been finished.
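
A minimal sketch of the marker idea, not taken from shr.el: the
function name and the sample text are made up, and it only shows that
markers set around a piece of text keep tracking it while `fill-region'
turns spaces into newlines elsewhere in the paragraph.

```elisp
;; Hypothetical demo, not shr.el code.
(defun my-fill-with-markers-demo ()
  "Insert a paragraph containing \"LINK\", fill it, and return the region.
The region boundaries are markers, so they move with the text
when filling changes the buffer."
  (with-temp-buffer
    (insert "some words before the link and then ")
    (let ((start (copy-marker (point)))
          end)
      (insert "LINK")
      (setq end (copy-marker (point)))
      (insert " and some more words after it")
      ;; Fill only after the whole paragraph has been inserted.
      (let ((fill-column 20))
        (fill-region (point-min) (point-max)))
      (buffer-substring-no-properties start end))))
```

Had START and END been plain integer positions, they would still be
valid here only because filling happens to keep the buffer length
stable in simple cases; markers do not depend on that.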

I'm also not sure how this would affect performance.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-12 12:48               ` Lars Magne Ingebrigtsen
@ 2010-10-12 14:13                 ` Katsumi Yamaoka
  2010-10-13  8:13                   ` Katsumi Yamaoka
  0 siblings, 1 reply; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-12 14:13 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
> Katsumi Yamaoka <yamaoka@jpl.org> writes:

>> Is there a reason not to use the fill functions?  How about the
>> following?

> I was hoping to avoid using the fill functions so that all the things
> that need start-end regions (<A>, <EM>, etc) could rely on the start
> point never changing (much) after it's been set.

Agreed.  I feel it is possible to achieve.  AFAICT, what the present
`shr-insert' code should or should not do for CJK text is:

1. Don't insert SPC between wide characters; simply concatenate
 short lines.  But SPC inserted between wide character and ASCII
 word is ok (I like that style, though there may be those who do
 not like it).

2. Fold long lines; we can chop wide characters text anywhere.

3. Do kinsoku.

That's all, maybe.  The code I posted last does them, though
fill.el may be too complicated to do them.
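
Rule 1 can be sketched in isolation as follows.  This is only an
illustration with hypothetical names (`my-need-space-between-p',
`my-join-chunks'); it mirrors the `char-width' test in the code posted
earlier and does not attempt rules 2 and 3.

```elisp
;; Hypothetical helpers illustrating rule 1 only.
(defun my-need-space-between-p (before after)
  "Return non-nil if a space should separate characters BEFORE and AFTER.
No space is wanted when both are wide, or when BEFORE is already
a space."
  (not (or (eq before ?\s)
           (and (>= (char-width before) 2)
                (>= (char-width after) 2)))))

(defun my-join-chunks (chunks)
  "Concatenate the strings in CHUNKS, adding spaces only where needed."
  (let ((result ""))
    (dolist (chunk chunks result)
      (unless (string= chunk "")
        (when (and (not (string= result ""))
                   (my-need-space-between-p
                    (aref result (1- (length result)))
                    (aref chunk 0)))
          (setq result (concat result " ")))
        (setq result (concat result chunk))))))
```

Two kanji chunks are then simply concatenated, while an ASCII word next
to a wide character still gets the SPC described above.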

> I'm not sure that's actually the case any more, since the folding algo
> has gotten progressively more complicated, so it might make sense to
> switch the start-point things to using markers and allowing filling
> after the entire paragraph has been finished.

> I'm also not sure how this would affect performance.



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-12 14:13                 ` Katsumi Yamaoka
@ 2010-10-13  8:13                   ` Katsumi Yamaoka
  2010-10-13 16:53                     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-13  8:13 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka wrote:
> Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
>> I was hoping to avoid using the fill functions so that all the things
>> that need start-end regions (<A>, <EM>, etc) could rely on the start
>> point never changing (much) after it's been set.

> Agreed.  I feel it possible to achieve.  AFAICT, what the present
> `shr-insert' code should do or should not do for CJK text are:

> 1. Don't insert SPC between wide characters; simply concatenate
>  short lines.  But SPC inserted between wide character and ASCII
>  word is ok (I like that style, though there may be those who do
>  not like it).

> 2. Fold long lines; we can chop wide characters text anywhere.

> 3. Do kinsoku.

Here's my next try, which conforms to what you intended and
does 1, 2, and 3.  It may not be complete yet though.

ftp://ftp.jpl.org/pub/tmp/shr-insert.el
or http://www.jpl.org/ftp/pub/tmp/shr-insert.el



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-13  8:13                   ` Katsumi Yamaoka
@ 2010-10-13 16:53                     ` Lars Magne Ingebrigtsen
  2010-10-13 19:01                       ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-13 16:53 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka <yamaoka@jpl.org> writes:

> Here's my next try that conforms to what you intended to, and
> does 1, 2, and 3.  It may not be complete yet though.
>
> ftp://ftp.jpl.org/pub/tmp/shr-insert.el

Yes, I think that should work, but I think using
`fill-find-break-point-function-table' instead of "[^\000-\377]" and
stuff will probably give you better results when mixing, say, Japanese
and Russian.
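
As a sketch of the difference (the helper name is hypothetical, and
this assumes fill.el has populated the table, as it does in the Emacs
versions I know of): a Cyrillic letter has a code point above 255, so a
"not Latin-1" test lumps it together with kanji, whereas the table has
no special break function for it.

```elisp
;; Hypothetical helper; the char-table itself is defined in fill.el.
(defun my-break-function-for (char)
  "Return the special line-break function fill.el registers for CHAR.
Return nil if CHAR is broken in the ordinary way, at spaces."
  (aref fill-find-break-point-function-table char))
```

So (my-break-function-for ?я) should be nil while the lookup for a
kanji should not be, which is the distinction the regexp cannot make.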

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-13 16:53                     ` Lars Magne Ingebrigtsen
@ 2010-10-13 19:01                       ` Lars Magne Ingebrigtsen
  2010-10-14  8:16                         ` Katsumi Yamaoka
  0 siblings, 1 reply; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-13 19:01 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Yes, I think that should work, but I think using
> `fill-find-break-point-function-table' instead of "[^\000-\377]" and
> stuff will probably give you better results when mixing, say, Japanese
> and Russian.

I've now tweaked the filling algorithm slightly, and it seems to do the
right thing in the Han Han group (with Chinese text), but I haven't
tested it in any groups with mixed European/Japanese text.  What's a
good test group?

Also, what font(s) should I install on Debian to get a display that has
the traditional kanji-are-twice-as-wide-as-non-kanji buffer?

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-13 19:01                       ` Lars Magne Ingebrigtsen
@ 2010-10-14  8:16                         ` Katsumi Yamaoka
  2010-10-14 10:12                           ` Katsumi Yamaoka
                                             ` (3 more replies)
  0 siblings, 4 replies; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-14  8:16 UTC (permalink / raw)
  To: ding

[-- Attachment #1: Type: text/plain, Size: 1729 bytes --]

Lars Magne Ingebrigtsen wrote:
>> Yes, I think that should work, but I think using
>> `fill-find-break-point-function-table' instead of "[^\000-\377]" and
>> stuff will probably give you better results when mixing, say, Japanese
>> and Russian.

I see.  It's much smarter.

> I've now tweaked the filling algorithm slightly, and it seems to do the
> right thing in the Han Han group (with Chinese text), but I haven't
> tested it in any groups with mixed European/Japanese text.  What's a
> good test group?

Try gwene groups of which the group names contain ".jp.".  For
instance: gwene.jp.gr.gentoo.gentoojp-news

If you don't find it too troublesome, I'd like to recommend
nnshimbun.el.  It makes html articles from contents obtained
from web sites.  You don't have to alter `mm-text-html-renderer'.
My recommendation is:

M-x gnus-group-make-shimbun-group RET asahi RET rss RET

Cf. (info "(emacs-w3m)Gnus"), (info "(emacs-w3m)Nnshimbun")

> Also, what font(s) should I install on Debian to get a display that has
> the traditional kanji-are-twice-as-wide-as-non-kanji buffer?

I believe there should be such fonts, if you've installed all
the fonts Debian distributes. ;-)  I use Fedora and have all the
fonts installed.  My favorites are:

-*-fixed-medium-r-normal-*-16-*-*-*-*-*-iso8859-1
-*-fixed-medium-r-normal-*-16-*-*-*-*-*-jisx0208.1983-0

BTW, the present shr.el code deletes CJK characters that are at
the end of lines, inserts useless SPC between wide characters,
and doesn't seem to do kinsoku.  So, I tried improving them.  A
patch follows.  I think it is near completion.  WDYT?

Though I suspect there are wrong assignments for some Chinese
characters in the kinsoku configuration.  I'll ask Handa-san
later.


[-- Attachment #2: shr.el.patch.gz --]
[-- Type: application/x-gzip, Size: 775 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-14  8:16                         ` Katsumi Yamaoka
@ 2010-10-14 10:12                           ` Katsumi Yamaoka
  2010-10-14 14:13                           ` Katsumi Yamaoka
                                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-14 10:12 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka wrote:
> If you don't think very troublesome, I'd like to recommend
> nnshimbun.el.  It makes html articles from contents obtained
> from web sites.  You don't have to alter `mm-text-html-renderer'.
> My recommendation is:

> M-x gnus-group-make-shimbun-group RET asahi RET rss RET

> Cf. (info "(emacs-w3m)Gnus"), (info "(emacs-w3m)Nnshimbun")

Oops.  To make it generate html articles, not text/plain articles,
you have to have:

(setq shimbun-asahi-prefer-text-plain nil)

Moreover,

(setq shimbun-asahi-japanese-hankaku t)

this would be better for testing for text containing ASCII and
Kanji.  Cf. (info "(emacs-w3m)Zenkaku to hankaku conversion")



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-14  8:16                         ` Katsumi Yamaoka
  2010-10-14 10:12                           ` Katsumi Yamaoka
@ 2010-10-14 14:13                           ` Katsumi Yamaoka
  2010-10-14 18:34                           ` Lars Magne Ingebrigtsen
  2010-10-14 19:05                           ` Lars Magne Ingebrigtsen
  3 siblings, 0 replies; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-14 14:13 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka wrote:
> Though I suspect there are wrong assignments for some Chinese
> characters in the kinsoku configuration.  I'll ask Handa-san
> later.

This will be fixed in Emacs head thanks to Handa-san.  He said
he will add

;; Fullwidth characters
(modify-category-entry '(#xff01 . #xff60) ?\|)

to lisp/international/characters.el.  Then kinsoku for CJK text
will work properly (maybe tomorrow).
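
For anyone who wants to try the effect before it lands, here is a
sketch that applies the same entry to a scratch copy of the category
table and then checks one fullwidth character.  The function names are
hypothetical; `copy-category-table', `set-category-table',
`modify-category-entry' and `char-category-set' are the standard
category functions.

```elisp
;; Hypothetical helpers for experimenting with the category change.
(defun my-line-breakable-p (char)
  "Return non-nil if CHAR has category `|' in the current category table."
  (aref (char-category-set char) ?|))

(defun my-demo-fullwidth-category ()
  "Apply the quoted change to a scratch category table and test U+FF01."
  (with-temp-buffer
    ;; Work on a copy so the standard category table is left alone.
    (set-category-table (copy-category-table))
    (modify-category-entry '(#xff01 . #xff60) ?|)
    (my-line-breakable-p #xff01)))
```

Category `|' is the one fill.el consults (as "\\c|" in regexps) when
deciding that a line may be broken next to a character.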



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-14  8:16                         ` Katsumi Yamaoka
  2010-10-14 10:12                           ` Katsumi Yamaoka
  2010-10-14 14:13                           ` Katsumi Yamaoka
@ 2010-10-14 18:34                           ` Lars Magne Ingebrigtsen
  2010-10-14 19:10                             ` Lars Magne Ingebrigtsen
  2010-10-15  1:18                             ` Katsumi Yamaoka
  2010-10-14 19:05                           ` Lars Magne Ingebrigtsen
  3 siblings, 2 replies; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-14 18:34 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka <yamaoka@jpl.org> writes:

> Try gwene groups of which the group names contain ".jp.".  For
> instance: gwene.jp.gr.gentoo.gentoojp-news

Ok; I've had a peek at that, but since my fonts are so weird, I can't
really tell whether it's being done correctly or not.  :-/

> If you don't think very troublesome, I'd like to recommend
> nnshimbun.el.  It makes html articles from contents obtained
> from web sites.  You don't have to alter `mm-text-html-renderer'.
> My recommendation is:
>
> M-x gnus-group-make-shimbun-group RET asahi RET rss RET
>
> Cf. (info "(emacs-w3m)Gnus"), (info "(emacs-w3m)Nnshimbun")

This is from the emacs-w3m distribution?

>> Also, what font(s) should I install on Debian to get a display that has
>> the traditional kanji-are-twice-as-wide-as-non-kanji buffer?
>
> I believe there should be such fonts, if you've installed all
> the fonts Debian distributes. ;-)  I use Fedora and have all the
> fonts installed.  My favorites are:
>
> -*-fixed-medium-r-normal-*-16-*-*-*-*-*-iso8859-1
> -*-fixed-medium-r-normal-*-16-*-*-*-*-*-jisx0208.1983-0

Let's see...

Ok; I've now installed the jisx font, I think, but Emacs isn't using
it.  Is there a way to query Emacs what font it's using for a particular
char so that I can remove that font?

> BTW, the present shr.el code deletes CJK characters that are at
> the end of lines, inserts useless SPC between wide characters,
> and doesn't seem to do kinsoku.  So, I tried improving them.  A
> patch follows.  I think it is near completion.  WDYT?

Looks good, I think, but why is it inserting SPC characters?  It
shouldn't do that now...

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-14  8:16                         ` Katsumi Yamaoka
                                             ` (2 preceding siblings ...)
  2010-10-14 18:34                           ` Lars Magne Ingebrigtsen
@ 2010-10-14 19:05                           ` Lars Magne Ingebrigtsen
  2010-10-15  8:08                             ` Katsumi Yamaoka
  3 siblings, 1 reply; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-14 19:05 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka <yamaoka@jpl.org> writes:

> BTW, the present shr.el code deletes CJK characters that are at
> the end of lines, inserts useless SPC between wide characters,
> and doesn't seem to do kinsoku.  So, I tried improving them.  A
> patch follows.  I think it is near completion.  WDYT?

Looks good; please apply.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: shr.el: folding Japanese text
  2010-10-14 18:34                           ` Lars Magne Ingebrigtsen
@ 2010-10-14 19:10                             ` Lars Magne Ingebrigtsen
  2010-10-15  0:40                               ` Katsumi Yamaoka
                                                 ` (2 more replies)
  2010-10-15  1:18                             ` Katsumi Yamaoka
  1 sibling, 3 replies; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-14 19:10 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Ok; I've now installed the jisx font, I think, but Emacs isn't using
> it.  Is there a way to query Emacs what font it's using for a particular
> char so that I can remove that font?

(internal-char-font (point)) seems to do the trick.
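
For reference, a minimal sketch of wrapping that call so that it returns
just the font name.  The helper name `my-font-name-at' is hypothetical,
and the guards assume that a graphical frame is needed; on a terminal or
in batch mode the helper simply returns nil.

```elisp
(defun my-font-name-at (pos)
  "Return the XLFD name of the font used for the character at POS, or nil.
Hypothetical helper, not part of Emacs or Gnus."
  (let ((info (and (fboundp 'internal-char-font)
                   (display-graphic-p)
                   ;; `internal-char-font' needs a position that has a
                   ;; character after it.
                   (>= pos (point-min))
                   (< pos (point-max))
                   (internal-char-font pos))))
    ;; On success INFO is a cons of (FONT-OBJECT . GLYPH-CODE).
    (when (and info (fontp (car info)))
      (font-xlfd-name (car info)))))
```

Evaluating (my-font-name-at (point)) with M-: while point is on a kanji
would then report which font was picked for that character.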

On the following text:

GnetooJP は11月6日に開催される関西オープンソース2010に参加します。

The 開 kanji is:

(#<font-object "-misc-fixed-medium-r-normal--14-130-75-75-c-140-jisx0208.1983-0"> . 9295)

The 催 kanji is:

(#<font-object "-isas-song ti-medium-r-normal--24-240-72-72-c-240-gb2312.1980-0"> . 21570)

So my buffers look like a complete mess.

And I just can't find out where the "song ti" font comes from.  I've
tried removing all the half-likely font packages, but I haven't found
the right one.

Help?

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: shr.el: folding Japanese text
  2010-10-14 19:10                             ` Lars Magne Ingebrigtsen
@ 2010-10-15  0:40                               ` Katsumi Yamaoka
  2010-10-15  7:35                                 ` Lars Magne Ingebrigtsen
  2010-10-15  1:28                               ` Kan-Ru Chen
  2010-10-15  6:29                               ` Reiner Steib
  2 siblings, 1 reply; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-15  0:40 UTC (permalink / raw)
  To: ding

[-- Attachment #1: Type: text/plain, Size: 1062 bytes --]

Lars wrote:
> On the following text:

> GnetooJP は11月6日に開催される関西オープンソース2010に参加します。

> The 開 kanji is:
> (#<font-object "-misc-fixed-medium-r-normal--14-130-75-75-c-140-jisx0208.1983-0"> . 9295)
> The 催 kanji is:
> (#<font-object "-isas-song ti-medium-r-normal--24-240-72-72-c-240-gb2312.1980-0"> . 21570)

> So my buffers look like a complete mess.

> And I just can't find out where the "song ti" font comes from.  I've
> tried removing all the half-likely font packages, but I haven't found
> the right one.

> Help?

Er, I'm not an Emacs font expert but with `emacs -Q', under the
C locale, I tried what I'm using:

(set-fontset-font
       (face-attribute 'default :fontset)
       'japanese-jisx0208
       "-*-fixed-medium-r-normal-*-16-*-*-*-*-*-jisx0208.1983-0")

(set-face-font 'default
	       "-*-fixed-medium-r-normal-*-16-*-*-*-*-*-iso8859-1")

This displays that text as in the screenshot attached below.  I have
no Xdefaults for Emacs at all, since I want to control all X settings
from ELisp.


[-- Attachment #2: Screenshot.png --]
[-- Type: image/png, Size: 761 bytes --]


* Re: shr.el: folding Japanese text
  2010-10-14 18:34                           ` Lars Magne Ingebrigtsen
  2010-10-14 19:10                             ` Lars Magne Ingebrigtsen
@ 2010-10-15  1:18                             ` Katsumi Yamaoka
  1 sibling, 0 replies; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-15  1:18 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen wrote:
> Katsumi Yamaoka <yamaoka@jpl.org> writes:
>> M-x gnus-group-make-shimbun-group RET asahi RET rss RET
>>
>> Cf. (info "(emacs-w3m)Gnus"), (info "(emacs-w3m)Nnshimbun")

> This is from the emacs-w3m distribution?

Yes it is.  But how about this nnml folder instead?

ftp://ftp.jpl.org/pub/tmp/asahi.tar.gz

It contains the latest 100 articles from the Asahi Shimbun, which is
one of the most popular newspapers in Japan.




* Re: shr.el: folding Japanese text
  2010-10-14 19:10                             ` Lars Magne Ingebrigtsen
  2010-10-15  0:40                               ` Katsumi Yamaoka
@ 2010-10-15  1:28                               ` Kan-Ru Chen
  2010-10-15  6:29                               ` Reiner Steib
  2 siblings, 0 replies; 44+ messages in thread
From: Kan-Ru Chen @ 2010-10-15  1:28 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>
>> Ok; I've now installed the jisx font, I think, but Emacs isn't using
>> it.  Is there a way to query Emacs what font it's using for a particular
>> char so that I can remove that font?
>
> (internal-char-font (point)) seems to do the trick.

Or M-x describe-char

> On the following text:
>
> GnetooJP は11月6日に開催される関西オープンソース2010に参加します。
>
> The 開 kanji is:
>
> (#<font-object "-misc-fixed-medium-r-normal--14-130-75-75-c-140-jisx0208.1983-0"> . 9295)
>
> The 催 kanji is:
>
> (#<font-object "-isas-song ti-medium-r-normal--24-240-72-72-c-240-gb2312.1980-0"> . 21570)
>
> So my buffers look like a complete mess.
>
> And I just can't find out where the "song ti" font comes from.  I've
> tried removing all the half-likely font packages, but I haven't found
> the right one.
>
> Help?

The "song ti" font comes from xfonts-base package.

I use ttf-wqy-microhei (WenQuanYi Micro Hei), which is a font derived
from Google's Droid Sans font and covers a large part of the ISO 10646
range.  To set the font for a specific character set, you can use the
fontset mechanism; see `(emacs) Fontsets'.

However, since I'm not using a fixed-width font, the pixel width is not
strictly twice that of the ASCII font.
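
A rough sketch of the fontset approach, assuming the WenQuanYi Micro Hei
family is installed and that overriding the default fontset for the CJK
scripts is what is wanted.  The choice of scripts here is only an
illustration, not a recommendation from this thread.

```elisp
;; Assumption: the "WenQuanYi Micro Hei" family is installed; substitute
;; any family that `(font-family-list)' reports on your system.
(when (and (fboundp 'set-fontset-font)
           (display-graphic-p))
  (dolist (script '(han kana cjk-misc))
    ;; t names the default fontset; each symbol is a script name
    ;; taken from `char-script-table'.
    (set-fontset-font t script
                      (font-spec :family "WenQuanYi Micro Hei"))))
```

Because only those scripts are overridden, Latin text keeps using
whatever the default face specifies.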





* Re: shr.el: folding Japanese text
  2010-10-14 19:10                             ` Lars Magne Ingebrigtsen
  2010-10-15  0:40                               ` Katsumi Yamaoka
  2010-10-15  1:28                               ` Kan-Ru Chen
@ 2010-10-15  6:29                               ` Reiner Steib
  2010-10-15 18:05                                 ` Andreas Schwab
  2 siblings, 1 reply; 44+ messages in thread
From: Reiner Steib @ 2010-10-15  6:29 UTC (permalink / raw)
  To: ding

On Thu, Oct 14 2010, Lars Magne Ingebrigtsen wrote:

> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>
>> Ok; I've now installed the jisx font, I think, but Emacs isn't using
>> it.  Is there a way to query Emacs what font it's using for a particular
>> char so that I can remove that font?
>
> (internal-char-font (point)) seems to do the trick.

M-x describe-char RET :-)

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/




* Re: shr.el: folding Japanese text
  2010-10-15  0:40                               ` Katsumi Yamaoka
@ 2010-10-15  7:35                                 ` Lars Magne Ingebrigtsen
  2010-10-15 10:28                                   ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-15  7:35 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka <yamaoka@jpl.org> writes:

> (set-fontset-font
>        (face-attribute 'default :fontset)
>        'japanese-jisx0208
>        "-*-fixed-medium-r-normal-*-16-*-*-*-*-*-jisx0208.1983-0")

Thanks; that did the trick.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: shr.el: folding Japanese text
  2010-10-14 19:05                           ` Lars Magne Ingebrigtsen
@ 2010-10-15  8:08                             ` Katsumi Yamaoka
  2010-10-15 14:02                               ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-15  8:08 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen wrote:
> Katsumi Yamaoka <yamaoka@jpl.org> writes:

>> BTW, the present shr.el code deletes CJK characters that are at
>> the end of lines, inserts useless SPC between wide characters,
>> and doesn't seem to do kinsoku.  So, I tried improving them.  A
>> patch follows.  I think it is near completion.  WDYT?

> Looks good; please apply.

Done.  But I noticed this change breaks rendering of some kind
of table.  It seems that `shr-render-td' was designed assuming
the length of a line should be less than `shr-width'.  But it
will be less than *or equal to* `shr-width' now (kinsoku may
lengthen it more).  I'm not sure it's ok for all the cases but
here's a workaround.

--- shr.el~	2010-10-15 08:04:53 +0000
+++ shr.el	2010-10-15 08:05:28 +0000
@@ -687,7 +687,7 @@
     (let ((cache (cdr (assoc (cons width cont) shr-content-cache))))
       (if cache
 	  (insert cache)
-	(let ((shr-width width)
+	(let ((shr-width (1- width))
 	      (shr-indentation 0))
 	  (shr-generic cont))
 	(delete-region

In addition, I tried removing the trailing spaces inserted on each
line.  It's still incomplete, though.
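
To check whether rendered lines really stay within the limit, one could
measure the widest line of a rendered region and compare it with
`shr-width'.  The helper below is only a sketch; `my-max-line-width' is
a hypothetical name and not part of shr.el.  It relies on
`current-column', which counts a wide character as two columns.

```elisp
(defun my-max-line-width (start end)
  "Return the largest column reached by any line between START and END.
Hypothetical helper, not part of shr.el."
  (save-excursion
    (goto-char start)
    (let ((max-width 0))
      (while (< (point) end)
        (end-of-line)
        ;; `current-column' accounts for double-width characters.
        (setq max-width (max max-width (current-column)))
        (forward-line 1))
      max-width)))
```

Calling (my-max-line-width (point-min) (point-max)) in a rendered buffer
and comparing the result with the width in effect shows whether any
line reaches or exceeds the limit.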




* Re: shr.el: folding Japanese text
  2010-10-15  7:35                                 ` Lars Magne Ingebrigtsen
@ 2010-10-15 10:28                                   ` Lars Magne Ingebrigtsen
  2010-10-15 10:48                                     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-15 10:28 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

>> (set-fontset-font
>>        (face-attribute 'default :fontset)
>>        'japanese-jisx0208
>>        "-*-fixed-medium-r-normal-*-16-*-*-*-*-*-jisx0208.1983-0")
>
> Thanks; that did the trick.

But now characters like this:

that&#8217;s

’

get rendered as a double-width character instead of as ':

        character: ’ (8217, #o20031, #x2019)
preferred charset: unicode (Unicode (ISO10646))
       code point: 0x2019
           syntax: . 	which means: punctuation
         category: .:Base, c:Chinese, h:Korean, j:Japanese
      buffer code: #xE2 #x80 #x99
        file code: #xE2 #x80 #x99 (encoded by coding system utf-8-emacs)
          display: by this font (glyph code)
    x:-jis-fixed-medium-r-normal--16-150-75-75-c-160-jisx0208.1983-0 (#x2147)

Character code properties: customize what to show
  name: RIGHT SINGLE QUOTATION MARK
  old-name: SINGLE COMMA QUOTATION MARK
  general-category: Pf (Punctuation, Final quote)

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: shr.el: folding Japanese text
  2010-10-15 10:28                                   ` Lars Magne Ingebrigtsen
@ 2010-10-15 10:48                                     ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-15 10:48 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

>         character: ’ (8217, #o20031, #x2019)
> preferred charset: unicode (Unicode (ISO10646))

I fixed that by explicitly setting the main font in Emacs:

(set-face-font 'default
	       "-b&h-lucidatypewriter-medium-r-normal-sans-12-120-75-75-m-70-iso8859-1")


-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: shr.el: folding Japanese text
  2010-10-15  8:08                             ` Katsumi Yamaoka
@ 2010-10-15 14:02                               ` Lars Magne Ingebrigtsen
  2010-10-17  1:04                                 ` Lars Magne Ingebrigtsen
  2010-10-18  5:19                                 ` Katsumi Yamaoka
  0 siblings, 2 replies; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-15 14:02 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka <yamaoka@jpl.org> writes:

> Done.  But I noticed this change breaks rendering of some kind
> of table.  It seems that `shr-render-td' was designed assuming
> the length of a line should be less than `shr-width'.  But it
> will be less than *or equal to* `shr-width' now (kinsoku may
> lengthen it more).  I'm not sure it's ok for all the cases but
> here's a workaround.

Please apply, and we'll see what happens...  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: shr.el: folding Japanese text
  2010-10-15  6:29                               ` Reiner Steib
@ 2010-10-15 18:05                                 ` Andreas Schwab
  2010-10-15 20:16                                   ` Reiner Steib
  0 siblings, 1 reply; 44+ messages in thread
From: Andreas Schwab @ 2010-10-15 18:05 UTC (permalink / raw)
  To: ding

Reiner Steib <reinersteib+gmane@imap.cc> writes:

> On Thu, Oct 14 2010, Lars Magne Ingebrigtsen wrote:
>
>> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>>
>>> Ok; I've now installed the jisx font, I think, but Emacs isn't using
>>> it.  Is there a way to query Emacs what font it's using for a particular
>>> char so that I can remove that font?
>>
>> (internal-char-font (point)) seems to do the trick.
>
> M-x describe-char RET :-)

C-u C-x =

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




* Re: shr.el: folding Japanese text
  2010-10-15 18:05                                 ` Andreas Schwab
@ 2010-10-15 20:16                                   ` Reiner Steib
  0 siblings, 0 replies; 44+ messages in thread
From: Reiner Steib @ 2010-10-15 20:16 UTC (permalink / raw)
  To: ding

On Fri, Oct 15 2010, Andreas Schwab wrote:

> Reiner Steib <reinersteib+gmane@imap.cc> writes:
>
>> On Thu, Oct 14 2010, Lars Magne Ingebrigtsen wrote:
>>
>>> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>>> (internal-char-font (point)) seems to do the trick.
>>
>> M-x describe-char RET :-)
>
> C-u C-x =

The former is easier to remember for me.

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/




* Re: shr.el: folding Japanese text
  2010-10-15 14:02                               ` Lars Magne Ingebrigtsen
@ 2010-10-17  1:04                                 ` Lars Magne Ingebrigtsen
  2010-10-18  5:19                                   ` Katsumi Yamaoka
  2010-10-18  5:19                                 ` Katsumi Yamaoka
  1 sibling, 1 reply; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-17  1:04 UTC (permalink / raw)
  To: ding

The rendering of the HTML here seems suboptimal:

<x1-OUNZF2Nd2VVfDqcGJ+0P+EWocFQ@gwene.org>

It's an article that talks about Chinese words, so it's mostly English
text with some Chinese here and there.  The spaces have been removed
from the start and end of the Chinese text, which makes it look
awkward.  Should the space stripping be altered in some way?

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: shr.el: folding Japanese text
  2010-10-15 14:02                               ` Lars Magne Ingebrigtsen
  2010-10-17  1:04                                 ` Lars Magne Ingebrigtsen
@ 2010-10-18  5:19                                 ` Katsumi Yamaoka
  2010-10-19  7:52                                   ` Katsumi Yamaoka
  1 sibling, 1 reply; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-18  5:19 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen wrote:
> Katsumi Yamaoka <yamaoka@jpl.org> writes:

>> Done.  But I noticed this change breaks rendering of some kind
>> of table.  It seems that `shr-render-td' was designed assuming
>> the length of a line should be less than `shr-width'.  But it
>> will be less than *or equal to* `shr-width' now (kinsoku may
>> lengthen it more).  I'm not sure it's ok for all the cases but
>> here's a workaround.

> Please apply, and we'll see what happens...  :-)

That's not so serious, only makes appearance of table ugly.  So,
I'm going to begin with learning how table is made.




* Re: shr.el: folding Japanese text
  2010-10-17  1:04                                 ` Lars Magne Ingebrigtsen
@ 2010-10-18  5:19                                   ` Katsumi Yamaoka
  2010-10-18 19:01                                     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-18  5:19 UTC (permalink / raw)
  To: ding

Lars Magne Ingebrigtsen wrote:
> The rendering of the HTML here seems suboptimal:

> <x1-OUNZF2Nd2VVfDqcGJ+0P+EWocFQ@gwene.org>

> It's an article that talks about Chinese words, so it's mostly English
> text with some Chinese here and there.  The spaces have been removed
> from the start and end of the Chinese text, which makes it look
> awkward.  Should the space stripping be altered in some way?

I've improved the way to examine whether space is necessary
between characters.  Now space will not be inserted if

the previous character is wide and categorized as kinsoku-bol[1],
or
both before and behind characters are categorized as nospace[2].

This is better for mixed Japanese and ASCII text, too. :)

[1] (aref (char-category-set CHAR) ?>)
[2] (aref fill-nospace-between-words-table CHAR)
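
The two conditions above could be sketched as a single predicate.  This
is an illustration only, not the code committed to shr.el: the name
`my-shr-space-needed-p' is hypothetical, and "wide" is assumed to mean a
`char-width' greater than 1.

```elisp
;; kinsoku.el assigns the `>' (kinsoku-bol) category to characters.
(require 'kinsoku)

(defun my-shr-space-needed-p (prev next)
  "Return non-nil if a space should separate characters PREV and NEXT.
Hypothetical helper mirroring the two rules described above."
  (not (or
        ;; Rule 1: PREV is wide and may not begin a line (kinsoku-bol).
        (and (> (char-width prev) 1)
             (aref (char-category-set prev) ?>))
        ;; Rule 2: both neighbours belong to scripts that do not put
        ;; spaces between words.
        (and (aref fill-nospace-between-words-table prev)
             (aref fill-nospace-between-words-table next)))))
```

With this, two Latin letters still get a space between them, while two
adjacent kanji do not.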




* Re: shr.el: folding Japanese text
  2010-10-18  5:19                                   ` Katsumi Yamaoka
@ 2010-10-18 19:01                                     ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-18 19:01 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka <yamaoka@jpl.org> writes:

> I've improved the way to examine whether space is necessary
> between characters.  Now space will not be inserted if
>
> the previous character is wide and categorized as kinsoku-bol[1],
> or
> both before and behind characters are categorized as nospace[2].

Looks good; thanks.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: shr.el: folding Japanese text
  2010-10-18  5:19                                 ` Katsumi Yamaoka
@ 2010-10-19  7:52                                   ` Katsumi Yamaoka
  2010-10-19 18:12                                     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-19  7:52 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka wrote:
> That's not so serious, only makes appearance of table ugly.  So,
> I'm going to begin with learning how table is made.

Table rendering is mostly ok now.  What we have to do next will
be to support caption, thead, and tfoot in table.




* Re: shr.el: folding Japanese text
  2010-10-19  7:52                                   ` Katsumi Yamaoka
@ 2010-10-19 18:12                                     ` Lars Magne Ingebrigtsen
  2010-10-20  7:29                                       ` Katsumi Yamaoka
  0 siblings, 1 reply; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-19 18:12 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka <yamaoka@jpl.org> writes:

> Table rendering is mostly ok now.  What we have to do next will
> be to support caption, thead, and tfoot in table.

And colspan and rowspan.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: shr.el: folding Japanese text
  2010-10-19 18:12                                     ` Lars Magne Ingebrigtsen
@ 2010-10-20  7:29                                       ` Katsumi Yamaoka
  2010-10-20 17:22                                         ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 44+ messages in thread
From: Katsumi Yamaoka @ 2010-10-20  7:29 UTC (permalink / raw)
  To: ding

[-- Attachment #1: Type: text/plain, Size: 365 bytes --]

Lars Magne Ingebrigtsen wrote:
> Katsumi Yamaoka <yamaoka@jpl.org> writes:

>> Table rendering is mostly ok now.  What we have to do next will
>> be to support caption, thead, and tfoot in table.

> And colspan and rowspan.  :-)

Ok, it's the next todo. :)
I've added the caption, thead, and tfoot supports.  An html mail
with which I tested it is attached below.


[-- Attachment #2: testmail.gz --]
[-- Type: application/x-gzip, Size: 686 bytes --]


* Re: shr.el: folding Japanese text
  2010-10-20  7:29                                       ` Katsumi Yamaoka
@ 2010-10-20 17:22                                         ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 44+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-20 17:22 UTC (permalink / raw)
  To: ding

Katsumi Yamaoka <yamaoka@jpl.org> writes:

>> And colspan and rowspan.  :-)
>
> Ok, it's the next todo. :)

:-)

> I've added the caption, thead, and tfoot supports.  An html mail
> with which I tested it is attached below.

Looks good.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





Thread overview: 44+ messages
2010-10-08  8:10 shr.el: folding Japanese text Katsumi Yamaoka
2010-10-08 17:05 ` Lars Magne Ingebrigtsen
2010-10-08 17:23   ` Ted Zlatanov
2010-10-09 16:09     ` Lars Magne Ingebrigtsen
2010-10-09 21:35       ` Lars Magne Ingebrigtsen
2010-10-09 21:48         ` Lars Magne Ingebrigtsen
2010-10-10  4:57         ` CHENG Gao
2010-10-10 13:42         ` Andreas Schwab
2010-10-10 13:47           ` Lars Magne Ingebrigtsen
2010-10-10 16:22             ` Andreas Schwab
2010-10-11  4:03             ` James Cloos
2010-10-11  7:23         ` Kan-Ru Chen
2010-10-11 18:07           ` Lars Magne Ingebrigtsen
2010-10-12  8:19             ` Katsumi Yamaoka
2010-10-12 12:48               ` Lars Magne Ingebrigtsen
2010-10-12 14:13                 ` Katsumi Yamaoka
2010-10-13  8:13                   ` Katsumi Yamaoka
2010-10-13 16:53                     ` Lars Magne Ingebrigtsen
2010-10-13 19:01                       ` Lars Magne Ingebrigtsen
2010-10-14  8:16                         ` Katsumi Yamaoka
2010-10-14 10:12                           ` Katsumi Yamaoka
2010-10-14 14:13                           ` Katsumi Yamaoka
2010-10-14 18:34                           ` Lars Magne Ingebrigtsen
2010-10-14 19:10                             ` Lars Magne Ingebrigtsen
2010-10-15  0:40                               ` Katsumi Yamaoka
2010-10-15  7:35                                 ` Lars Magne Ingebrigtsen
2010-10-15 10:28                                   ` Lars Magne Ingebrigtsen
2010-10-15 10:48                                     ` Lars Magne Ingebrigtsen
2010-10-15  1:28                               ` Kan-Ru Chen
2010-10-15  6:29                               ` Reiner Steib
2010-10-15 18:05                                 ` Andreas Schwab
2010-10-15 20:16                                   ` Reiner Steib
2010-10-15  1:18                             ` Katsumi Yamaoka
2010-10-14 19:05                           ` Lars Magne Ingebrigtsen
2010-10-15  8:08                             ` Katsumi Yamaoka
2010-10-15 14:02                               ` Lars Magne Ingebrigtsen
2010-10-17  1:04                                 ` Lars Magne Ingebrigtsen
2010-10-18  5:19                                   ` Katsumi Yamaoka
2010-10-18 19:01                                     ` Lars Magne Ingebrigtsen
2010-10-18  5:19                                 ` Katsumi Yamaoka
2010-10-19  7:52                                   ` Katsumi Yamaoka
2010-10-19 18:12                                     ` Lars Magne Ingebrigtsen
2010-10-20  7:29                                       ` Katsumi Yamaoka
2010-10-20 17:22                                         ` Lars Magne Ingebrigtsen
