ntg-context - mailing list for ConTeXt users
* distinguish different characters from different languages
@ 2013-06-09 14:58 Tim Li
  2013-06-09 17:31 ` Hans Hagen
  2013-06-09 22:57 ` hwitloc
  0 siblings, 2 replies; 5+ messages in thread
From: Tim Li @ 2013-06-09 14:58 UTC (permalink / raw)
  To: ntg-context



 Hi,
 
Is there a way in ConTeXt to distinguish (or recognise) characters from different languages, especially to distinguish those used in China, Japan and Korea (CJK) from English?
 
For example, sentence (1) and its translation, sentence (2), below mix English with Chinese characters:
 
    as far as I know, Chinese write 排版 as typography.  (1)
    translation: 据我所知,中国人将typography写作排版。      (2)
 
If I put this sentence in a ConTeXt source file, how can I recognise English and Chinese characters so that I can insert a space (say, a 1/4 space) wherever English words are nested in Chinese? The result of sentence (2) in the PDF file would then look like this: 据我所知,中国人将 typography 写作排版。
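
The boundary detection asked about here can be illustrated outside ConTeXt with a minimal Python sketch. It is not ConTeXt's own mechanism: the function names (`is_cjk`, `is_latin`, `pad_boundaries`), the chosen Unicode ranges, and the use of a thin space (U+2009) as the gap are all assumptions made for illustration.

```python
THIN_SPACE = "\u2009"  # assumption: a thin space stands in for the "1/4 space"


def is_cjk(ch):
    """Rough CJK test based on a few common Unicode blocks (not exhaustive)."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3400 <= cp <= 0x4DBF   # CJK Unified Ideographs Extension A
        or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7AF   # Hangul syllables
    )


def is_latin(ch):
    """Treat ASCII letters and digits as 'English' characters."""
    return ch.isascii() and ch.isalnum()


def pad_boundaries(text, gap=THIN_SPACE):
    """Insert `gap` wherever a CJK character directly touches a Latin one."""
    out = []
    prev = ""
    for ch in text:
        crosses = prev and (
            (is_cjk(prev) and is_latin(ch)) or (is_latin(prev) and is_cjk(ch))
        )
        if crosses:
            out.append(gap)
        out.append(ch)
        prev = ch
    return "".join(out)
```

Called as `pad_boundaries("据我所知,中国人将typography写作排版。", " ")`, this returns `据我所知,中国人将 typography 写作排版。`. Text that already has a space at the boundary is left unchanged, because a space is neither CJK nor Latin in this classification. The `gap` argument can be any string, so a preprocessing step could just as well emit a TeX spacing command there.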
 
Are there any materials or topics about this?
 
Tim
 		 	   		  


___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________


* Re: distinguish different characters from different languages
  2013-06-09 14:58 distinguish different characters from different languages Tim Li
@ 2013-06-09 17:31 ` Hans Hagen
  2013-06-09 22:57 ` hwitloc
  1 sibling, 0 replies; 5+ messages in thread
From: Hans Hagen @ 2013-06-09 17:31 UTC (permalink / raw)
  To: ntg-context

On 6/9/2013 4:58 PM, Tim Li wrote:
> Are there some materials or topics about this?

see files in test suite under subpath 'scripts'

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: distinguish different characters from different languages
  2013-06-09 14:58 distinguish different characters from different languages Tim Li
  2013-06-09 17:31 ` Hans Hagen
@ 2013-06-09 22:57 ` hwitloc
  2013-06-10  7:34   ` Tim Li
  1 sibling, 1 reply; 5+ messages in thread
From: hwitloc @ 2013-06-09 22:57 UTC (permalink / raw)
  To: mailing list for ConTeXt users


This seems to be about inter-word spacing, rather than character sets.

For the phrase:  "据我所知,中国人将typography写作排版"

The intuitive behaviour for ConTeXt would be to preserve the explicit space after the comma, but here the word "typography" is not separated from the rest of the text by spaces.

If you input the text as   "据我所知,中国人将 typography 写作排版"
Then the spaces should be preserved as in English or other languages.
That was once not the case for Japanese, but I believe a (temporary?) fix went into the ongoing development version.  The spaces were removed because Chinese and Japanese do not put spaces between words in normal text.
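
The idea that spaces between CJK characters can be dropped while explicit spaces next to Latin words are kept can be sketched in Python as follows. This is only an illustration of the rule described in this message, not how ConTeXt implements it; the name `drop_inner_spaces` and the character ranges are assumptions.

```python
import re

# Assumption: a few common blocks stand in for "CJK" (kana, Extension A,
# and the main CJK Unified Ideographs block).
CJK = r"\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff"

# Match runs of spaces or tabs that have a CJK character on both sides.
_between_cjk = re.compile(rf"(?<=[{CJK}])[ \t]+(?=[{CJK}])")


def drop_inner_spaces(text):
    """Remove whitespace between two CJK characters; keep all other spaces."""
    return _between_cjk.sub("", text)
```

With this rule a stray space inside Chinese text disappears, while the spaces typed around "typography" survive because one neighbour of each space is a Latin letter.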

For now, can you use ~ or a similar escape sequence to force a space where you want it?


* Re: distinguish different characters from different languages
  2013-06-09 22:57 ` hwitloc
@ 2013-06-10  7:34   ` Tim Li
  2013-06-10 10:00     ` Hans Hagen
  0 siblings, 1 reply; 5+ messages in thread
From: Tim Li @ 2013-06-10  7:34 UTC (permalink / raw)
  To: mailing list for ConTeXt users


@hwitloc
I don't think this is only an inter-word question, especially when you are typesetting a book like `The Joy of Chinese`, which involves many paragraphs with English words nested in Chinese sentences.
If users have to pay attention to inserting spaces every time they switch languages, they pay less attention to the content they are typesetting. In this case we need ConTeXt to do the task (inserting spaces when switching to English from Chinese) automatically.
 
@Hans
Which file or files should I read under the `scripts` subpath? All of them?


 

* Re: distinguish different characters from different languages
  2013-06-10  7:34   ` Tim Li
@ 2013-06-10 10:00     ` Hans Hagen
  0 siblings, 0 replies; 5+ messages in thread
From: Hans Hagen @ 2013-06-10 10:00 UTC (permalink / raw)
  To: ntg-context

On 6/10/2013 9:34 AM, Tim Li wrote:
> @Hans
> Which file or Which files should I read in the subpath of `scripts`? All?

it helps to know what can be downloaded from the website

http://www.pragma-ade.com/download-1.htm

the test suite has examples

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------