zsh-users
* Capitalization tuning
@ 2008-08-31  0:37 henman
  2008-08-31  3:14 ` Mikael Magnusson
  2008-09-01  1:56 ` Aaron Davies
  0 siblings, 2 replies; 5+ messages in thread
From: henman @ 2008-08-31  0:37 UTC (permalink / raw)
  To: zsh-users


In my audio filename normalization process, i.e., making the filenames comply with a convention, I need to capitalize all words.

I used the parameter expansion flag (C) to capitalize filenames.


In the manual it says
"C Capitalize the resulting words. 'Words' in this case refers to sequences of alphanumeric characters separated by non-alphanumerics, not to words that result from field splitting."

The problem with the current operation of the Capitalization function is that it capitalizes the first letter of conjunction tails as well.

For example:
     $ CONJUNCT="I'll" && echo ${(C)CONJUNCT}
     I'Ll
     $ CONJUNCT="They've" && echo ${(C)CONJUNCT}
     They'Ve

and so on.

My question is: can I make the capitalization function treat "'" as a word character?
By, say, changing the system parameter WORDCHARS, or by specifying some other flag or separator?
    like:
         W:sep:   or
         s:string: or
         by changing IFS temporarily  or
         the w and s:string: subscript flags
 etc.?


I noticed that there is no modifier for capitalization, only for l)owercase and u)ppercase.  So the parameter expansion flag must be used to capitalize, or so I gather from reading the manual dated "April 2, 2008".


I could do a brute-force check for all known conjunctions, but I'd prefer a more elegant procedure.

Thanks for any advice.

regards
   d. henman



* Re: Capitalization tuning
  2008-08-31  0:37 Capitalization tuning henman
@ 2008-08-31  3:14 ` Mikael Magnusson
  2008-09-01 23:36   ` d.henman
  2008-09-01  1:56 ` Aaron Davies
  1 sibling, 1 reply; 5+ messages in thread
From: Mikael Magnusson @ 2008-08-31  3:14 UTC (permalink / raw)
  To: zsh-users

2008/8/31  <henman@tech.email.ne.jp>:
>
> In my audio filename normalization process, i.e., making the filenames comply with a convention, I need to capitalize all words.
>
> I used the parameter expansion flag (C) to capitalize filenames
>
>
> In the manual it says
> "C Capitalize the resulting words. 'Words' in this case refers to sequences of alphanumeric characters separated by non-alphanumerics, not to words that result from field splitting."
>
> The problem with the current operation of the Capitalization function is that it capitalizes the first letter of conjunction tails as well.
>
> For example:
>     $ CONJUNCT="I'll" && echo ${(C)CONJUNCT}
>     I'Ll
>     $ CONJUNCT="They've" && echo ${(C)CONJUNCT}
>     They'Ve
>
> and so on.
>
> My question is: can I make the capitalization function treat "'" as a word character?
> By, say, changing the system parameter WORDCHARS, or by specifying some other flag or separator?
>    like:
>         W:sep:   or
>         s:string: or
>         by changing IFS temporarily  or
>         the w and s:string: subscript flags
>  etc.?
>
>
> I noticed that there is no modifier for capitalization, only for l)owercase and u)ppercase.  So the parameter expansion flag must be used to capitalize, or so I gather from reading the manual dated "April 2, 2008".
>
>
> I could do a brute-force check for all known conjunctions, but I'd prefer a more elegant procedure.
>
> Thanks for any advice.
>
> regards
>   d. henman

This is possibly not the best, or even a good, way, but it seems to work:
${${(C)${CONJUNCT//\'/\'1}}//\'1/\'}

-- 
Mikael Magnusson



* Re: Capitalization tuning
  2008-08-31  0:37 Capitalization tuning henman
  2008-08-31  3:14 ` Mikael Magnusson
@ 2008-09-01  1:56 ` Aaron Davies
  1 sibling, 0 replies; 5+ messages in thread
From: Aaron Davies @ 2008-09-01  1:56 UTC (permalink / raw)
  To: henman; +Cc: zsh-users

On Sun, Aug 31, 2008 at 8:37 AM,  <henman@tech.email.ne.jp> wrote:

> The problem with the current operation of the Capitalization function is that it capitalizes the first letter of conjunction tails as well.
>
> For example:
>     $ CONJUNCT="I'll" && echo ${(C)CONJUNCT}
>     I'Ll
>     $ CONJUNCT="They've" && echo ${(C)CONJUNCT}
>     They'Ve

This won't solve your problem, but it might help you research it
better: the term you're looking for is "contraction", not
"conjunction". A "conjunction" is a grammatical category for words
that join other words or phrases together. (The canonical list is
"and", "or", and "but", but there are others of various types.) A
contraction is a word formed by putting two (or very occasionally
more) words together and replacing some letters in the middle with an
apostrophe.

See <http://en.wikipedia.org/wiki/Contraction_(grammar)> and
<http://en.wikipedia.org/wiki/Grammatical_conjunction> for more
information. (See <http://www.youtube.com/watch?v=mkO87mkgcNo> for
some awesome 70's-style edutainment. :-)
-- 
Aaron Davies
aaron.davies@gmail.com



* Re: Capitalization tuning
  2008-08-31  3:14 ` Mikael Magnusson
@ 2008-09-01 23:36   ` d.henman
  2008-09-02  8:43     ` Peter Stephenson
  0 siblings, 1 reply; 5+ messages in thread
From: d.henman @ 2008-09-01 23:36 UTC (permalink / raw)
  To: zsh-users


Mikael:
  the double substitution method you provided below works well.  

zsh-developers:
  I would be willing to write a new expansion flag, or modify the existing code for
  (C) capitalize, so that it would use an array or list of word separators which
  a user could set, or allow users to supply a word separator list, e.g.
  s:separator-list:.  For example, for filenames I usually get rid of spaces in
  names with s/ /_/, and the existing (C) flag apparently treats "_" just like a space.

  I have never looked at the source code, but there's always a first time.  If someone would point me to where the (C) parameter expansion flag processing is done, I'd look at it.  If it has too many interactions with other code, I'd leave it be, as it might not be worth the effort.  Any ideas?

Thanks,
  darel henman


Mikael Magnusson <mikachu@gmail.com> wrote:
> 2008/8/31  <henman@tech.email.ne.jp>:
> ......
> > The problem with the current operation of the Capitalization function is that it capitalizes the first letter of conjunction tails as well.
> >
> > For example:
> >     $ CONJUNCT="I'll" && echo ${(C)CONJUNCT}
> >     I'Ll
> >     $ CONJUNCT="They've" && echo ${(C)CONJUNCT}
> >     They'Ve
> >
> > ....
> 
> This is possibly not the best, or even a good, way, but it seems to work:
> ${${(C)${CONJUNCT//\'/\'1}}//\'1/\'}
> 
> -- 
> Mikael Magnusson



* Re: Capitalization tuning
  2008-09-01 23:36   ` d.henman
@ 2008-09-02  8:43     ` Peter Stephenson
  0 siblings, 0 replies; 5+ messages in thread
From: Peter Stephenson @ 2008-09-02  8:43 UTC (permalink / raw)
  To: zsh-users

On Tue, 02 Sep 2008 08:36:45 +0900
"d.henman" <dhenman@gmail.com> wrote:
> zsh-developers:
>   I would be willing to write a new expansion flag, or modify the existing
>   code for (C) capitalize, so that it would use an array or list of word
>   separators which a user could set, or allow users to supply a word
>   separator list, e.g. s:separator-list:.  For example, for filenames I
>   usually get rid of spaces in names with s/ /_/, and the existing (C)
>   flag apparently treats "_" just like a space.
> 
> I have never looked at the source code, but there's always a first
> time.  If someone would point me to where the (C) parameter expansion flag
> processing is done, I'd look at it.  If it has too many interactions with
> other code, I'd leave it be, as it might not be worth the effort.  Any ideas?

It's in casemodify() in hist.c.  You could add, say, K as a flag with
delimiters like s and a string of additional word characters to do this; that
would be in paramsubst() in subst.c (which is very complicated).  You'd
need to pass the string down to casemodify() to be used if it wasn't NULL.

The big complication is with multibyte characters; you'd need to convert
the delimiters into an array of wide characters, remembering the string is
metafied (there are functions to handle this sort of conversion which are
already used for the string to be capitalized itself).

None of this is that different from things the shell does anyway, but a lot
of it probably isn't obvious at first sight.

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070


