ntg-context - mailing list for ConTeXt users
* Find too long sentences
From: "H. Özoguz" @ 2013-02-15 12:38 UTC
  To: ntg-context

Good Friday there,

I am working on a book that contains many overly long sentences, which led
me to the following idea/question:
Is it possible to detect the length of a sentence and have ConTeXt
indicate in the PDF when a sentence is too long?

For example, I am thinking of a command like
\version[longsentence,15]

which places a symbol like "*" in the margin if a sentence has more than
15 words.

First, correct sentence recognition could be a challenge: it is not enough
to count the words between two periods, because of abbreviations. But there
is probably a known algorithm for handling those problems. I think a
feature like this could be interesting for ConTeXt (and for my work :D)

Thanks for your comments
Huseyin

___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________



* Re: Find too long sentences
From: Marco Patzer @ 2013-02-15 12:52 UTC
  To: ntg-context



On 2013–02–15 "H. Özoguz" wrote:

> I am working on a book that contains many overly long sentences, which
> led me to the following idea/question:
> Is it possible to detect the length of a sentence and have ConTeXt
> indicate in the PDF when a sentence is too long?

This sounds more like a job for the text editor. Many text editors
already know what a sentence is. That means it's as easy as looping
over all sentences and counting the characters/bytes and performing
some action (e.g. placing the cursor or injecting a ConTeXt macro).
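
The "loop over all sentences, count, inject a macro" idea could look
roughly like the Python sketch below. It is only an illustration under
stated assumptions: the sentence pattern is deliberately naive (it breaks
at every '.', '!' or '?', so abbreviations split sentences early and a long
sentence containing them can be missed), the function name
mark_long_sentences and the 250-character default are invented here, and
\inmargin{*} is just one possible marker macro.

```python
import re

# Naive sentence pattern: text up to and including '.', '!' or '?'
# (plus any closing quote or bracket), or trailing text without end punctuation.
SENTENCE = re.compile(r"[^.!?]+(?:[.!?]+[\"')\]]*|$)")


def mark_long_sentences(source, max_chars=250, marker=r"\inmargin{*}"):
    """Return source with marker inserted in front of every overly long sentence."""

    def mark(match):
        sentence = match.group(0)
        # Keep the whitespace before the sentence in front of the marker.
        leading = len(sentence) - len(sentence.lstrip())
        if len(sentence.strip()) > max_chars:
            return sentence[:leading] + marker + sentence[leading:]
        return sentence

    # A function as replacement is used literally, so the backslash is safe.
    return SENTENCE.sub(mark, source)
```

Run over the text of a source file, this leaves short sentences untouched
and only adds the marker, so removing the marker again restores the
original text.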

Marco


* Re: Find too long sentences
From: Philipp Gesang @ 2013-02-15 12:54 UTC
  To: mailing list for ConTeXt users



···<date: 2013-02-15, Friday>···<from: "H. Özoguz">···

> Good Friday there,
> 
> I am working on a book that contains many overly long sentences, which
> led me to the following idea/question:
> Is it possible to detect the length of a sentence and have ConTeXt
> indicate in the PDF when a sentence is too long?

Places to start:
  http://en.wikipedia.org/wiki/Sentence_breaking
  http://en.wikipedia.org/wiki/Natural_language_processing#Major_tasks_in_NLP

Good luck :P
Philipp




-- 
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments


* Re: Find too long sentences
From: Marco Patzer @ 2013-02-15 16:16 UTC
  To: ntg-context



On 2013–02–15 Marco Patzer wrote:

> In vim pressing “vis” (visualise inner sentence) marks the current
> sentence, then pressing “g<Ctrl-g>” yields:
> 
> Selected 2 of 4 lines; 14 of 50 words; 82 of 296 bytes
> 
> That means the current sentence is 82 bytes long. The rest is up to
> you. Pick a language you like; vim uses its own scripting language but
> also has bindings for Python, Perl, Lua, etc. Pseudo-code:
> 
> go to beginning of file
> start:
>   get byte length of sentence
>   if length > max_length
>     % do something
>   fi
>   if there is no next sentence, stop
>   move on to the next sentence
>   goto start

Here's a quick and naïve vim function which moves the cursor to the
last sentence containing more than 250 bytes when you hit F9.

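function! GoToLastTooLongSentence()
  let maxSentenceLength = 250
  while line('.') != 1
    " Step back one sentence, then yank it into the unnamed register.
    " (:normal treats '|' and a trailing comment as keystrokes, so each
    " command gets its own line.)
    normal! (
    normal! yis
    let num = strlen(@")
    if num > maxSentenceLength
      " Select the sentence; just for demonstration, remove this.
      normal! vis
      break
    endif
  endwhile
endfunction
nnoremap <F9> :call GoToLastTooLongSentence()<cr>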

Marco


* Re: Find too long sentences
From: Marco Patzer @ 2013-02-15 14:52 UTC
  To: ntg-context



On 2013–02–15 "H. Özoguz" wrote:

> Which text editor can do that, find too long sentences?

None, since no editor knows by default what “too long” is. But a few
editors (at least vim and I assume emacs as well) have an idea of
what a sentence is. Both are scriptable, which means you can tell
them what you consider “too long”. If you use a different editor,
read the manual or ask your favourite search engine.

In vim pressing “vis” (visualise inner sentence) marks the current
sentence, then pressing “g<Ctrl-g>” yields:

Selected 2 of 4 lines; 14 of 50 words; 82 of 296 bytes

That means the current sentence is 82 bytes long. The rest is up to
you. Pick a language you like; vim uses its own scripting language but
also has bindings for Python, Perl, Lua, etc. Pseudo-code:

go to beginning of file
start:
  get byte length of sentence
  if length > max_length
    % do something
  fi
  if there is no next sentence, stop
  move on to the next sentence
  goto start

Or just define a regular expression for a sentence (google is your
friend) and use a scripting language directly if your editor is not
scriptable.
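
A minimal sketch of that last route in Python, assuming a sentence ends at
'.', '!' or '?' followed by whitespace and that a small hand-made
abbreviation list is enough to avoid false breaks. The names ABBREVIATIONS,
split_sentences and long_sentences are invented for this illustration; the
15-word limit comes from the original question. An abbreviation that really
does end a sentence (e.g. a final "etc.") will glue two sentences together,
so treat the result as a hint, not a verdict.

```python
import re

# Hypothetical abbreviation list; extend it for the language of the book.
ABBREVIATIONS = {"e.g.", "i.e.", "etc.", "Dr.", "Mr.", "Mrs.", "Prof.", "vs."}

# Candidate sentence boundary: whitespace preceded by '.', '!' or '?'.
BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split text into sentences, merging pieces that end in a known abbreviation."""
    sentences = []
    current = ""
    for piece in BOUNDARY.split(text.strip()):
        if not piece:
            continue
        current = f"{current} {piece}".strip()
        if current.split()[-1] in ABBREVIATIONS:
            continue  # not a real sentence end, keep collecting
        sentences.append(current)
        current = ""
    if current:
        sentences.append(current)
    return sentences


def long_sentences(text, max_words=15):
    """Return (word_count, sentence) pairs for sentences longer than max_words."""
    result = []
    for sentence in split_sentences(text):
        count = len(sentence.split())
        if count > max_words:
            result.append((count, sentence))
    return result
```

Calling long_sentences() on the contents of a source file gives a list that
can be printed or turned into editor jump targets. TeX macros in the text
are counted as words here, which is usually close enough for a first pass.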

Marco


* Find too long sentences
From: "H. Özoguz" @ 2013-02-15 14:11 UTC
  To: ntg-context

Which text editor can do that, find too long sentences?

Huseyin


On 15.02.2013 13:55, ntg-context-request@ntg.nl wrote:
> This sounds more like a job for the text editor. Many text editors
> already know what a sentence is. That means it's as easy as looping
> over all sentences and counting the characters/bytes and performing
> some action (e.g. placing the cursor or injecting a ConTeXt macro).
>
> Marco

