ntg-context - mailing list for ConTeXt users
* Translating PDF-files
@ 2011-01-19 12:58 Cecil Westerhof
  2011-01-19 13:05 ` Martin Schröder
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Cecil Westerhof @ 2011-01-19 12:58 UTC (permalink / raw)
  To: mailing list for ConTeXt users



Probably not really a ConTeXt question, but maybe someone knows the answer.

Someone asked me how to convert a PDF to XML and back. The reason is that
he has a PDF in English, but he would also like to have it in Russian. His idea
is to convert the PDF file to XML, translate the XML file with
Google Translate and convert the translated XML file to PDF. He asked me how
to do this. Of course it does not have to be an XML file; if Google Translate
can work with a TeX file, there is no reason not to do it that way.

Does anyone know how to do this, or have pointers on how to do it?

-- 
Cecil Westerhof

___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________


* Re: Translating PDF-files
  2011-01-19 12:58 Translating PDF-files Cecil Westerhof
@ 2011-01-19 13:05 ` Martin Schröder
  2011-01-19 13:10   ` Cecil Westerhof
  2011-01-19 13:12 ` R. Ermers
  2011-01-19 13:50 ` luigi scarso
  2 siblings, 1 reply; 18+ messages in thread
From: Martin Schröder @ 2011-01-19 13:05 UTC (permalink / raw)
  To: mailing list for ConTeXt users

2011/1/19 Cecil Westerhof <cldwesterhof@gmail.com>:
> Someone asked me how to convert a PDF to XML and back. The reason is that
> he has a PDF in English, but he would also like to have it in Russian. His idea

It will be _much_ easier to get the original English sources and
translate _them_ (and create a new PDF from the translation). Trust me.

Best
   Martin

* Re: Translating PDF-files
  2011-01-19 13:05 ` Martin Schröder
@ 2011-01-19 13:10   ` Cecil Westerhof
  0 siblings, 0 replies; 18+ messages in thread
From: Cecil Westerhof @ 2011-01-19 13:10 UTC (permalink / raw)
  To: mailing list for ConTeXt users



2011/1/19 Martin Schröder <martin@oneiros.de>

> 2011/1/19 Cecil Westerhof <cldwesterhof@gmail.com>:
> > Someone asked me how to convert a PDF to XML and back. The reason is that
> > he has a PDF in English, but he would also like to have it in Russian. His idea
>
> It will be _much_ easier to get the original English sources and
> translate _them_ (and create a new PDF from the translation). Trust me.
>

That would be my guess too, and it was my first comment to this person. ;-} But he
wants to do it this way. His idea is to have standard PDFs on his website, but
let people choose which language they want them in, and then have them
translated on the fly. He also has PDFs that he can redistribute, but for which
he will not get the sources.

-- 
Cecil Westerhof


* Re: Translating PDF-files
  2011-01-19 12:58 Translating PDF-files Cecil Westerhof
  2011-01-19 13:05 ` Martin Schröder
@ 2011-01-19 13:12 ` R. Ermers
  2011-01-19 13:35   ` Cecil Westerhof
  2011-01-19 14:30   ` Arthur Reutenauer
  2011-01-19 13:50 ` luigi scarso
  2 siblings, 2 replies; 18+ messages in thread
From: R. Ermers @ 2011-01-19 13:12 UTC (permalink / raw)
  To: mailing list for ConTeXt users


If your acquaintance actually needs an accurate translation into Russian, I wonder why he would choose Google Translate for that. Remember that Russian has 7 cases, and a complex verbal system with many different forms, all of which need to be deduced by Google from the much poorer English prepositions and the verbal forms in the text. Even though the result will no doubt show Cyrillic words, which looks interesting, the factual result will be rubbish, and most likely unintelligible to any Russian.

Regards,

Robert


Op 19 jan 2011, om 13:58 heeft Cecil Westerhof het volgende geschreven:

> Probably not really a ConTeXt question, but maybe someone knows the answer.
> 
> Someone asked me how to convert a PDF to XML and back. The reason is that he has a PDF in English, but he would also like to have it in Russian. His idea is to convert the PDF file to XML, translate the XML file with Google Translate and convert the translated XML file to PDF. He asked me how to do this. Of course it does not have to be an XML file; if Google Translate can work with a TeX file, there is no reason not to do it that way.
> 
> Does anyone know how to do this, or have pointers on how to do it?
> 
> -- 
> Cecil Westerhof

* Re: Translating PDF-files
  2011-01-19 13:12 ` R. Ermers
@ 2011-01-19 13:35   ` Cecil Westerhof
  2011-01-19 14:30   ` Arthur Reutenauer
  1 sibling, 0 replies; 18+ messages in thread
From: Cecil Westerhof @ 2011-01-19 13:35 UTC (permalink / raw)
  To: mailing list for ConTeXt users



2011/1/19 R. Ermers <r.ermers@hccnet.nl>

>
> If your acquaintance actually needs an accurate translation into Russian, I
> wonder why he would choose Google Translate for that. Remember that Russian
> has 7 cases, and a complex verbal system with many different forms, all of
> which need to be deduced by Google from the much poorer English prepositions
> and the verbal forms in the text. Even though the result will no doubt show
> Cyrillic words, which looks interesting, the factual result will be rubbish,
> and most likely unintelligible to any Russian.
>

I do not know if he requires Russian; he was talking about Ukrainian. But
that may well have the same problems.

Automatic translation is always a problem. I do not even like the results
from English to Dutch. But his reasoning is: 'better a badly translated
document than no document'. I am not sure I agree 100%, but if that is
what he wants, who am I to -keep- telling him he is wrong?

-- 
Cecil Westerhof


* Re: Translating PDF-files
  2011-01-19 12:58 Translating PDF-files Cecil Westerhof
  2011-01-19 13:05 ` Martin Schröder
  2011-01-19 13:12 ` R. Ermers
@ 2011-01-19 13:50 ` luigi scarso
  2011-01-19 14:14   ` Cecil Westerhof
  2 siblings, 1 reply; 18+ messages in thread
From: luigi scarso @ 2011-01-19 13:50 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Wed, Jan 19, 2011 at 1:58 PM, Cecil Westerhof <cldwesterhof@gmail.com> wrote:
> Probably not really a ConTeXt question, but maybe someone knows the answer.
>
> Someone asked me how to convert a PDF to XML and back. The reason is that
> he has a PDF in English, but he would also like to have it in Russian. His idea
> is to convert the PDF file to XML, translate the XML file with
> Google Translate and convert the translated XML file to PDF. He asked me how
> to do this. Of course it does not have to be an XML file; if Google Translate
> can work with a TeX file, there is no reason not to do it that way.
google for pdftotext
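
For readers who land here from a search, the extraction step could be sketched roughly as follows. This is only an illustration, not something posted in the thread: it assumes the pdftotext command-line tool is installed and on the PATH, the function names are made up for this example, and it covers only getting text out of the PDF (translating it and producing a new PDF are separate problems).

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Build the argument list for the pdftotext command-line tool.

    Hypothetical helper for this sketch; it only assembles the command.
    """
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if keep_layout:
        # -layout asks pdftotext to keep the physical page layout as well as it can.
        cmd.append("-layout")
    cmd += [str(pdf_path), str(txt_path)]
    return cmd


def extract_text(pdf_path, txt_path=None, keep_layout=True):
    """Run pdftotext and return the path of the text file it wrote."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    pdf_path = Path(pdf_path)
    # Default output name: same as the PDF, with a .txt extension.
    txt_path = Path(txt_path) if txt_path else pdf_path.with_suffix(".txt")
    subprocess.run(pdftotext_command(pdf_path, txt_path, keep_layout), check=True)
    return txt_path
```

Calling `extract_text("manual.pdf")` (the file name is just a placeholder) would write `manual.txt` next to the PDF. Structure such as headings or tables is not recovered this way; only the text is.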


-- 
luigi

* Re: Translating PDF-files
  2011-01-19 13:50 ` luigi scarso
@ 2011-01-19 14:14   ` Cecil Westerhof
  2011-01-19 14:23     ` luigi scarso
  0 siblings, 1 reply; 18+ messages in thread
From: Cecil Westerhof @ 2011-01-19 14:14 UTC (permalink / raw)
  To: mailing list for ConTeXt users



2011/1/19 luigi scarso <luigi.scarso@gmail.com>

> On Wed, Jan 19, 2011 at 1:58 PM, Cecil Westerhof <cldwesterhof@gmail.com>
> wrote:
> > Probably not really a ConTeXt question, but maybe someone knows the answer.
> >
> > Someone asked me how to convert a PDF to XML and back. The reason is that
> > he has a PDF in English, but he would also like to have it in Russian. His idea
> > is to convert the PDF file to XML, translate the XML file with
> > Google Translate and convert the translated XML file to PDF. He asked me how
> > to do this. Of course it does not have to be an XML file; if Google Translate
> > can work with a TeX file, there is no reason not to do it that way.
> google for pdftotext
>

Already done. What looked the most promising was pdftohtml. Just wondering
if there is a better way.

-- 
Cecil Westerhof


* Re: Translating PDF-files
  2011-01-19 14:14   ` Cecil Westerhof
@ 2011-01-19 14:23     ` luigi scarso
  2011-01-19 14:35       ` Cecil Westerhof
  0 siblings, 1 reply; 18+ messages in thread
From: luigi scarso @ 2011-01-19 14:23 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Wed, Jan 19, 2011 at 3:14 PM, Cecil Westerhof <cldwesterhof@gmail.com> wrote:
> Already done. What looked the most promising was pdftohtml. Just wondering
> if there is a better way.
What do you want exactly?
Preserve structure? Formulas? Layout?
Unless this information is embedded (tagged) in the PDF,
you will have to do (a lot of) manual work.

Also google for pdfdraw mupdf
-- 
luigi

* Re: Translating PDF-files
  2011-01-19 13:12 ` R. Ermers
  2011-01-19 13:35   ` Cecil Westerhof
@ 2011-01-19 14:30   ` Arthur Reutenauer
  2011-01-19 15:03     ` R. Ermers
  1 sibling, 1 reply; 18+ messages in thread
From: Arthur Reutenauer @ 2011-01-19 14:30 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

> Even though the result will no doubt show Cyrillic words, which looks interesting, the factual result will be rubbish, and most likely unintelligible to any Russian.

  That's an interesting statement; do you have any experience with that
at all, or are you simply speculating?  I have never heard any claim
that machine translation would be more difficult for some particular
languages.  It's generally a hard problem, and each language has its
specific issues, not only Russian (that has 6 cases, by the way, not 7,
and really only one fully conjugated tense).

	Arthur

* Re: Translating PDF-files
  2011-01-19 14:23     ` luigi scarso
@ 2011-01-19 14:35       ` Cecil Westerhof
  2011-01-19 14:41         ` luigi scarso
  0 siblings, 1 reply; 18+ messages in thread
From: Cecil Westerhof @ 2011-01-19 14:35 UTC (permalink / raw)
  To: mailing list for ConTeXt users



2011/1/19 luigi scarso <luigi.scarso@gmail.com>

> On Wed, Jan 19, 2011 at 3:14 PM, Cecil Westerhof <cldwesterhof@gmail.com>
> wrote:
> > Already done. What looked the most promising was pdftohtml. Just wondering
> > if there is a better way.
> What do you want exactly?
> Preserve structure? Formulas? Layout?
> Unless this information is embedded (tagged) in the PDF,
> you will have to do (a lot of) manual work.
>

My contact 'just' wants to translate the document. I already told him that
this is easier said than done. But he is adamant. (Notwithstanding that
several people already gave up on his quest.) I think structure and layout
should be maintained. But I think it will be mostly 'simple' documents with
text and some graphics. So I do not expect to have formula trouble.


> Also google for pdfdraw mupdf
>

I will do that.

-- 
Cecil Westerhof


* Re: Translating PDF-files
  2011-01-19 14:35       ` Cecil Westerhof
@ 2011-01-19 14:41         ` luigi scarso
  2011-01-19 14:51           ` Cecil Westerhof
  0 siblings, 1 reply; 18+ messages in thread
From: luigi scarso @ 2011-01-19 14:41 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Wed, Jan 19, 2011 at 3:35 PM, Cecil Westerhof <cldwesterhof@gmail.com> wrote:
> 2011/1/19 luigi scarso <luigi.scarso@gmail.com>
>>
>> On Wed, Jan 19, 2011 at 3:14 PM, Cecil Westerhof <cldwesterhof@gmail.com>
>> wrote:
>> > Already done. What looked the most promising was pdftohtml. Just wondering
>> > if there is a better way.
>> What do you want exactly?
>> Preserve structure? Formulas? Layout?
>> Unless this information is embedded (tagged) in the PDF,
>> you will have to do (a lot of) manual work.
>
> My contact 'just' wants to translate the document. I already told him that
> this is easier said than done. But he is adamant. (Notwithstanding that
> several people already gave up on his quest.) I think structure and layout
> should be maintained. But I think it will be mostly 'simple' documents with
> text and some graphics. So I do not expect to have formula trouble.
Hm, maybe you can have a look at Inkscape then (at least version 0.48).

-- 
luigi

* Re: Translating PDF-files
  2011-01-19 14:41         ` luigi scarso
@ 2011-01-19 14:51           ` Cecil Westerhof
  0 siblings, 0 replies; 18+ messages in thread
From: Cecil Westerhof @ 2011-01-19 14:51 UTC (permalink / raw)
  To: mailing list for ConTeXt users



2011/1/19 luigi scarso <luigi.scarso@gmail.com>

> Hm, maybe you can have a look at Inkscape then (at least version 0.48).
>

I will do that.

-- 
Cecil Westerhof


* Re: Translating PDF-files
  2011-01-19 14:30   ` Arthur Reutenauer
@ 2011-01-19 15:03     ` R. Ermers
  2011-01-19 15:18       ` Arthur Reutenauer
  0 siblings, 1 reply; 18+ messages in thread
From: R. Ermers @ 2011-01-19 15:03 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

Off topic: Well, I speak Russian and some other languages. Yes, you are right: the 5th case is the locative (after o), the 6th case is the instrumental. One does not count the cases every day :-)

It is not a language in general that is difficult, but the pair a language is in: the pair English-Russian is, in some aspects, more difficult than the other way around because of the choice for the perfective aspect or imperfective aspect of the tenses. An English text does not offer any clues as to which aspect to choose, but anyone who wants to speak Russian has to decide instantly. A program is unlikely to do that.

These problems might not exist for the pair Ukrainian-Russian, or perhaps (?) Polish-Russian, or - who knows - Basque-Russian.
The options for determining the appropriate aspect, if programmers succeed in building them at all, are, for example, not needed in the pair English-Dutch.

The reversed pair Russian-English poses different problems, such as when and where to put an article. The program has to derive from the context whether a given Russian noun in the text should be interpreted as determined or undetermined, and then whether it is appropriate to put the article, etcetera.

Robert


>> Even though the result will no doubt show Cyrillic words, which looks interesting, the factual result will be rubbish, and most likely unintelligible to any Russian.
> 
>  That's an interesting statement; do you have any experience with that
> at all, or are you simply speculating?  I have never heard any claim
> that machine translation would be more difficult for some particular
> languages.  It's generally a hard problem, and each language has its
> specific issues, not only Russian (that has 6 cases, by the way, not 7,
> and really only one fully conjugated tense).
> 
> 	Arthur

* Re: Translating PDF-files
  2011-01-19 15:03     ` R. Ermers
@ 2011-01-19 15:18       ` Arthur Reutenauer
  2011-01-19 16:06         ` R. Ermers
  0 siblings, 1 reply; 18+ messages in thread
From: Arthur Reutenauer @ 2011-01-19 15:18 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

> It is not a language in general that is difficult, but the pair a language is in: the pair English-Russian is, in some aspects, more difficult than the other way around because of the choice for the perfective aspect or imperfective aspect of the tenses. An English text does not offer any clues as to which aspect to choose, but anyone who wants to speak Russian has to decide instantly. A program is unlikely to do that.

  I'm sorry, but that's pure speculation.  It would be interesting to
see research about machine translation for some particular language
pairs, though.

	Arthur

* Re: Translating PDF-files
  2011-01-19 15:18       ` Arthur Reutenauer
@ 2011-01-19 16:06         ` R. Ermers
  2011-01-19 16:18           ` Alan BRASLAU
  0 siblings, 1 reply; 18+ messages in thread
From: R. Ermers @ 2011-01-19 16:06 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

Still off topic:

Well, this is partly lexicological knowledge and research on translation. Each language pair and translation direction has its peculiar problems.

Whether or not you are able to say it is "pure speculation" depends on how familiar you are with computer linguistics, and its progress in determining semantic content from texts (step 1) and rephrasing it in a given target language (step 2).

I'm glad that you accept that it is about the pair and the direction of the translation.

Robert

>> It is not a language in general that is difficult, but the pair a language is in: the pair English-Russian is, in some aspects, more difficult than the other way around because of the choice for the perfective aspect or imperfective aspect of the tenses. An English text does not offer any clues as to which aspect to choose, but anyone who wants to speak Russian has to decide instantly. A program is unlikely to do that.
> 
>  I'm sorry, but that's pure speculation.  It would be interesting to
> see research about machine translation for some particular language
> pairs, though.
> 
> 	Arthur

* Re: Translating PDF-files
  2011-01-19 16:06         ` R. Ermers
@ 2011-01-19 16:18           ` Alan BRASLAU
  2011-01-19 19:23             ` Alan BRASLAU
  2011-01-20  9:53             ` Yury G. Kudryashov
  0 siblings, 2 replies; 18+ messages in thread
From: Alan BRASLAU @ 2011-01-19 16:18 UTC (permalink / raw)
  To: ntg-context

A fun exercise is to put a text through google translate into any language,
then pass the result back into the original language.

via Russian: Fun exercise is to put the text through Google Translate in any
language, and then pass the result back to the original language.

via Japanese: Exercise is fun, Google is placing text via translation into 
other languages To pass the result to the original language.

via French (for Arthur): A fun exercise is to put a text through Google 
translate in any language, then pass the result in the original language.

...
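
To put a rough number on the round trips quoted above, a word-level comparison could look like the sketch below. It is an illustration added for readers, not part of the original exercise: the function name is made up, the translated strings are simply copied from this message, and a similarity ratio from Python's difflib is a crude measure that says nothing about whether the Russian, Japanese or French in the middle was any good.

```python
from difflib import SequenceMatcher


def round_trip_similarity(original, round_tripped):
    """Return a similarity ratio between 0 and 1, comparing the texts word by word."""
    a = original.lower().split()
    b = round_tripped.lower().split()
    return SequenceMatcher(None, a, b).ratio()


original = ("A fun exercise is to put a text through google translate into any "
            "language, then pass the result back into the original language.")

# Round-trip results as quoted in the message above.
via_russian = ("Fun exercise is to put the text through Google Translate in any "
               "language, and then pass the result back to the original language.")
via_japanese = ("Exercise is fun, Google is placing text via translation into "
                "other languages To pass the result to the original language.")
via_french = ("A fun exercise is to put a text through Google translate in any "
              "language, then pass the result in the original language.")

for label, text in [("Russian", via_russian),
                    ("Japanese", via_japanese),
                    ("French", via_french)]:
    print(f"via {label}: {round_trip_similarity(original, text):.2f}")
```

A high ratio only means the round trip landed close to the starting sentence; two translation errors that cancel out would score just as well.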

Alan

On Wednesday 19 January 2011 17:06:53 R. Ermers wrote:
> Still off topic:
> 
> Well, this is partly lexicological knowledge and research on translation.
> Each language pair and translation direction has its peculiar problems.
> 
> Whether or not you are able to say it is "pure speculation" depends on how
> familiar you are with computer linguistics, and its progress in
> determining semantic content from texts (step 1) and rephrasing it in a
> given target language (step 2).
> 
> I'm glad that you accept that it is about the pair and the direction of the
> translation.
> 
> Robert
> 
> >> It is not a language in general that is difficult, but the pair a
> >> language is in: the pair English-Russian is, in some aspects, more
> >> difficult than the other way around because of the choice for the
> >> perfective aspect or imperfective aspect of the tenses. An English text
> >> does not offer any clues as to which aspect to choose, but anyone who
> >> wants to speak Russian has to decide instantly. A program is unlikely
> >> to do that.
> >> 
> >  I'm sorry, but that's pure speculation.  It would be interesting to
> > see research about machine translation for some particular language
> > pairs, though.
> > 
> > 	Arthur

-- 
Alan Braslau
CEA DSM-IRAMIS-SPEC
CNRS URA 2464
Orme des Merisiers
91191 Gif-sur-Yvette cedex FRANCE
tel: +33 1 69 08 73 15
fax: +33 1 69 08 87 86
mailto:alan.braslau@cea.fr

 .''`.
: :'  :
`. `'`
  `-

* Re: Translating PDF-files
  2011-01-19 16:18           ` Alan BRASLAU
@ 2011-01-19 19:23             ` Alan BRASLAU
  2011-01-20  9:53             ` Yury G. Kudryashov
  1 sibling, 0 replies; 18+ messages in thread
From: Alan BRASLAU @ 2011-01-19 19:23 UTC (permalink / raw)
  To: ntg-context

More off topic:

A fun 'esercise be to put some text drough google translate into any language, 
den pass de result back into de o'iginal language. What it is, Mama!

(via jive, a good exercise in lex and yacc, very politically incorrect!)


On Wednesday 19 January 2011 17:18:35 Alan BRASLAU wrote:
> A fun exercise is to put a text through google translate into any language,
> then pass the result back into the original language.
> 
> via Russian: Fun exercise is to put the text through Google Translate in
> any language, and then pass the result back to the original language.
> 
> via Japanese: Exercise is fun, Google is placing text via translation into
> other languages To pass the result to the original language.
> 
> via French (for Arthur): A fun exercise is to put a text through Google
> translate in any language, then pass the result in the original language.
> 
> ...
> 
> Alan
> 
> On Wednesday 19 January 2011 17:06:53 R. Ermers wrote:
> > Still off topic:
> > 
> > Well, this is partly lexicological knowledge and research on translation.
> > Each language pair and translation direction has its peculiar problems.
> > 
> > Whether or not you are able to say it is "pure speculation" depends on
> > > how familiar you are with computational linguistics, and its progress in
> > determining semantic content from texts (step 1) and rephrasing it in a
> > given target language (step 2).
> > 
> > I'm glad that you accept that it is about the pair and the direction of
> > the translation.
> > 
> > Robert
> > 
> > >> It is not a language in general that is difficult, but the pair a
> > >> language is in: the pair English-Russian is, in some aspects, more
> > >> difficult than the other way around because of the choice for the
> > >> perfective aspect or imperfective aspect of the tenses. An English
> > >> text does not offer any clues as to which aspect to choose, but
> > >> anyone who wants to speak Russian has to decide instantly. A program
> > >> is unlikely to do that.
> > >> 
> > > I'm sorry, but that's pure speculation.  It would be interesting to
> > > see research about machine translation for some particular language
> > > pairs, though.
> > > 
> > > 	Arthur
> > > 

-- 
Alan Braslau
CEA DSM-IRAMIS-SPEC
CNRS URA 2464
Orme des Merisiers
91191 Gif-sur-Yvette cedex FRANCE
tel: +33 1 69 08 73 15
fax: +33 1 69 08 87 86
mailto:alan.braslau@cea.fr

 .''`.
: :'  :
`. `'`
  `-



* Re: Translating PDF-files
  2011-01-19 16:18           ` Alan BRASLAU
  2011-01-19 19:23             ` Alan BRASLAU
@ 2011-01-20  9:53             ` Yury G. Kudryashov
  1 sibling, 0 replies; 18+ messages in thread
From: Yury G. Kudryashov @ 2011-01-20  9:53 UTC (permalink / raw)
  To: ntg-context

Alan BRASLAU wrote:

> A fun exercise is to put a text through google translate into any
> language, then pass the result back into the original language.
> 
> via Russian: Fun exercise is to put the text through Google Translate in
> any language, and then pass the result back to the original language.
The Russian translation is much worse than the English->Russian->English 
one.




end of thread, other threads:[~2011-01-20  9:53 UTC | newest]

Thread overview: 18+ messages
2011-01-19 12:58 Translating PDF-files Cecil Westerhof
2011-01-19 13:05 ` Martin Schröder
2011-01-19 13:10   ` Cecil Westerhof
2011-01-19 13:12 ` R. Ermers
2011-01-19 13:35   ` Cecil Westerhof
2011-01-19 14:30   ` Arthur Reutenauer
2011-01-19 15:03     ` R. Ermers
2011-01-19 15:18       ` Arthur Reutenauer
2011-01-19 16:06         ` R. Ermers
2011-01-19 16:18           ` Alan BRASLAU
2011-01-19 19:23             ` Alan BRASLAU
2011-01-20  9:53             ` Yury G. Kudryashov
2011-01-19 13:50 ` luigi scarso
2011-01-19 14:14   ` Cecil Westerhof
2011-01-19 14:23     ` luigi scarso
2011-01-19 14:35       ` Cecil Westerhof
2011-01-19 14:41         ` luigi scarso
2011-01-19 14:51           ` Cecil Westerhof
