Gnus development mailing list
* on matching more naked URLs in articles
@ 2000-01-22 14:32 Steinar Bang
  2000-04-21 12:17 ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 13+ messages in thread
From: Steinar Bang @ 2000-01-22 14:32 UTC


Before I waste time on it: has anyone looked into changing
gnus-button-url-regexp to match naked URLs without a protocol
field?  I.e. URLs like dodrt.dod.no, vvv.no, imdb.com etc.

And is there a way to get the changed gnus-button-url-regexp into
gnus-button-alist without restarting Gnus?  (I suspect that ever
longer and more cryptic regexps require quick cycles of changing and
trying.)

Today's gnus-button-url-regexp is
 "\\b\\(s?https?\\|ftp\\|file\\|gopher\\|news\\|telnet\\|wais\\|mailto\\):\\(//[-a-zA-Z0-9_.]+:[0-9]*\\)?\\([-a-zA-Z0-9_=!?#$@~`%&*+|\\/:;.,]\\|\\w\\)+\\([-a-zA-Z0-9_=#$@~`%&*+|\\/]\\|\\w\\)"

I'm not sure where the best place to change it is.  My first attempt
would be something like this:
 "\\b\\(\\(s?https?\\|ftp\\|file\\|gopher\\|news\\|telnet\\|wais\\|mailto\\):\\(//[-a-zA-Z0-9_.]+:[0-9]*\\)?\\([-a-zA-Z0-9_=!?#$@~`%&*+|\\/:;.,]\\|\\w\\)+\\([-a-zA-Z0-9_=#$@~`%&*+|\\/]\\|\\w\\)\\|\\([A-Za-z]+\\.\\)+\\(com\\|org\\|no\\|se\\)\\(/[A-Za-z0-9/]+\\)?\\)"

I'm not sure whether the top-level domain should come from a fixed
list, or whether we should just match [A-Za-z]+.  Are there efficiency
reasons for doing it either way?
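
One way to experiment with the fixed-list variant is to generate the
TLD alternation with `regexp-opt' instead of writing it by hand.  The
following is only a sketch: my-naked-url-tlds and my-naked-url-regexp
are made-up names, not Gnus variables, and the TLD list is an
arbitrary sample.

```elisp
;; Hypothetical names; nothing here is part of Gnus.
(defvar my-naked-url-tlds '("com" "org" "net" "no" "se")
  "Sample list of top-level domains accepted in protocol-less URLs.")

(defvar my-naked-url-regexp
  (concat "\\b\\([A-Za-z0-9-]+\\.\\)+"       ; one or more "label." parts
          (regexp-opt my-naked-url-tlds t)   ; a TLD from the fixed list
          "\\b"                              ; the TLD must end the word
          "\\(/[A-Za-z0-9/]*\\)?")           ; optional path part
  "Sketch of a regexp matching host names that end in a listed TLD.")

(string-match my-naked-url-regexp "see dodrt.dod.no now")  ; => 4
(string-match my-naked-url-regexp "bash-2.04.tar.gz")      ; => nil
```

With a fixed list, a file name such as bash-2.04.tar.gz is rejected
simply because "gz" is not listed; the price is that any TLD missing
from the list is silently skipped.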

Most URLs exchanged this way seem to be .com URLs or URLs in the
native national domain of a particular newsgroup.





* Re: on matching more naked URLs in articles
  2000-01-22 14:32 on matching more naked URLs in articles Steinar Bang
@ 2000-04-21 12:17 ` Lars Magne Ingebrigtsen
  2000-04-21 13:51   ` Steinar Bang
  0 siblings, 1 reply; 13+ messages in thread
From: Lars Magne Ingebrigtsen @ 2000-04-21 12:17 UTC


Steinar Bang <sb@metis.no> writes:

> Before I waste time on it: has anyone looked into changing
> gnus-button-url-regexp to match naked URLs without a protocol
> field?  I.e. URLs like dodrt.dod.no, vvv.no, imdb.com etc.

I think that would be a good idea, and perhaps just matching stuff
that consists of a-z with two (or more) dots in between would do the
trick?  (Or one dot if it ends in "com".)
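
A literal reading of that heuristic can be tried out as follows.  This
is a sketch only; my-dotted-name-regexp is a made-up name, and the
regexp covers just the two cases described above (lower-case letters
with two or more dots, or a single dot when the name ends in "com").

```elisp
;; Hypothetical name; a literal reading of the heuristic above.
(defconst my-dotted-name-regexp
  (concat "\\b[a-z]+\\(?:\\.[a-z]+\\)\\{2,\\}\\b"  ; two or more dots
          "\\|"
          "\\b[a-z]+\\.com\\b")                     ; one dot, ends in "com"
  "Sketch: a-z names with two or more dots, or one dot before \"com\".")

(let ((case-fold-search nil))
  (mapcar (lambda (s) (and (string-match my-dotted-name-regexp s) t))
          '("dodrt.dod.no" "imdb.com" "vvv.no" "21.2.31")))
;; => (t t nil nil)
```

Note that vvv.no, one of the examples from the original question, has
only one dot and does not end in "com", so this reading misses it.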

> And is there a way to get the changed gnus-button-url-regexp into
> gnus-button-alist without restarting Gnus?

Well -- you can just use `setq'.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   larsi@gnus.org * Lars Magne Ingebrigtsen




* Re: on matching more naked URLs in articles
  2000-04-21 12:17 ` Lars Magne Ingebrigtsen
@ 2000-04-21 13:51   ` Steinar Bang
  2000-04-21 18:43     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 13+ messages in thread
From: Steinar Bang @ 2000-04-21 13:51 UTC


>>>>> Lars Magne Ingebrigtsen <larsi@gnus.org>:

> Steinar Bang <sb@metis.no> writes:

>> Before I waste time on it: has anyone looked into changing
>> gnus-button-url-regexp to match naked URLs without a protocol
>> field?  I.e. URLs like dodrt.dod.no, vvv.no, imdb.com etc.

> I think that would be a good idea, and perhaps just matching stuff
> that consists of a-z with two (or more) dots in between would do the
> trick?  (Or one dot if it ends in "com".)

Maybe.  I guess most published URLs these days don't have a
directory part.  I think maybe we also need A-Z, "-" and 0-9 to
catch many of these.

>> And is there a way to get the changed gnus-button-url-regexp into
>> gnus-button-alist without restarting Gnus?

> Well -- you can just use `setq'.  :-)

I tried that, but when I looked at the variable it seemed to hold the
old value.
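
If the regexp string was copied into gnus-button-alist when Gnus was
loaded, then setting gnus-button-url-regexp afterwards would not reach
the alist, and the matching entry would have to be replaced as well.
Whether 5.8.x builds the alist that way is an assumption here, not
something confirmed in this thread.  Under that assumption, a small
helper (my-replace-button-regexp and my-new-regexp are hypothetical
names) could look like this:

```elisp
(defun my-replace-button-regexp (alist old new)
  "Return a copy of ALIST with entries whose car equals OLD keyed by NEW.
ALIST is assumed to have the shape of `gnus-button-alist', where each
entry starts with a regexp string.  Other entries are returned as is."
  (mapcar (lambda (entry)
            (if (equal (car entry) old)
                (cons new (cdr entry))
              entry))
          alist))

;; Intended use inside a running Gnus (not evaluated here).  `setq'
;; assigns left to right, so the alist is updated while
;; `gnus-button-url-regexp' still holds the old string:
;;
;; (setq gnus-button-alist
;;       (my-replace-button-regexp gnus-button-alist
;;                                 gnus-button-url-regexp
;;                                 my-new-regexp)
;;       gnus-button-url-regexp my-new-regexp)
```

If the alist entry refers to the variable by name rather than holding
a copy of the string, no entry matches and the alist is left as it was.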




* Re: on matching more naked URLs in articles
  2000-04-21 13:51   ` Steinar Bang
@ 2000-04-21 18:43     ` Lars Magne Ingebrigtsen
  2000-04-21 20:31       ` Steinar Bang
  2000-04-21 22:24       ` David Aspinwall
  0 siblings, 2 replies; 13+ messages in thread
From: Lars Magne Ingebrigtsen @ 2000-04-21 18:43 UTC


Steinar Bang <sb@metis.no> writes:

> Maybe.  I guess most published URLs these days don't have a
> directory part.  I think maybe we also need A-Z and "-" and 0-9 to
> catch many of these.

Yup.  Could you send a patch for this?

-- 
(domestic pets only, the antidote for overdose, milk.)
   larsi@gnus.org * Lars Magne Ingebrigtsen




* Re: on matching more naked URLs in articles
  2000-04-21 18:43     ` Lars Magne Ingebrigtsen
@ 2000-04-21 20:31       ` Steinar Bang
  2000-04-21 22:24       ` David Aspinwall
  1 sibling, 0 replies; 13+ messages in thread
From: Steinar Bang @ 2000-04-21 20:31 UTC


>>>>> Lars Magne Ingebrigtsen <larsi@gnus.org>:

> Steinar Bang <sb@metis.no> writes:
>> Maybe.  I guess most published URLs these days don't have a
>> directory part.  I think maybe we also need A-Z and "-" and 0-9 to
>> catch many of these.

> Yup.  Could you send a patch for this?

If I ever get around to doing it, and if I get it working, I most
certainly will. ;-)




* Re: on matching more naked URLs in articles
  2000-04-21 18:43     ` Lars Magne Ingebrigtsen
  2000-04-21 20:31       ` Steinar Bang
@ 2000-04-21 22:24       ` David Aspinwall
  2000-04-22 12:13         ` Lars Magne Ingebrigtsen
  1 sibling, 1 reply; 13+ messages in thread
From: David Aspinwall @ 2000-04-21 22:24 UTC


>>"larsi" == Lars Magne Ingebrigtsen <larsi@gnus.org> writes:


> Yup.  Could you send a patch for this?

Changing gnus-button-url-regexp to

"\\b\\(\\(s?https?\\|ftp\\|file\\|gopher\\|news\\|telnet\\|wais\\|mailto\\):\\(//[-a-zA-Z0-9_.]+:[0-9]*\\)?\\([-a-zA-Z0-9_=!?#$@~`%&*+|\\/:;.,]\\|\\w\\)+\\([-a-zA-Z0-9_=#$@~`%&*+|\\/]\\|\\w\\)\\)\\|[-a-zA-Z0-9_]+\\.[-a-zA-Z0-9_]+\\(\\.[-a-zA-Z0-9_]+[-a-zA-Z0-9_/]+\\)+"

seems to work, at least in 5.8.3.






* Re: on matching more naked URLs in articles
  2000-04-21 22:24       ` David Aspinwall
@ 2000-04-22 12:13         ` Lars Magne Ingebrigtsen
  2000-04-26 18:22           ` Karl Kleinpaste
       [not found]           ` <vxkog6w4t9y.fsf@mesquite.charcoal <3166007242308746@oakhurst.penguinpowered.com>
  0 siblings, 2 replies; 13+ messages in thread
From: Lars Magne Ingebrigtsen @ 2000-04-22 12:13 UTC


David Aspinwall <aspinwall@TimesTen.com> writes:

> Changing gnus-button-url-regexp to
> 
> "\\b\\(\\(s?https?\\|ftp\\|file\\|gopher\\|news\\|telnet\\|wais\\|mailto\\):\\(//[-a-zA-Z0-9_.]+:[0-9]*\\)?\\([-a-zA-Z0-9_=!?#$@~`%&*+|\\/:;.,]\\|\\w\\)+\\([-a-zA-Z0-9_=#$@~`%&*+|\\/]\\|\\w\\)\\)\\|[-a-zA-Z0-9_]+\\.[-a-zA-Z0-9_]+\\(\\.[-a-zA-Z0-9_]+[-a-zA-Z0-9_/]+\\)+"

I've now changed the regexp to this.

-- 
(domestic pets only, the antidote for overdose, milk.)
   larsi@gnus.org * Lars Magne Ingebrigtsen




* Re: on matching more naked URLs in articles
  2000-04-22 12:13         ` Lars Magne Ingebrigtsen
@ 2000-04-26 18:22           ` Karl Kleinpaste
  2000-04-27  8:35             ` Steinar Bang
                               ` (2 more replies)
       [not found]           ` <vxkog6w4t9y.fsf@mesquite.charcoal <3166007242308746@oakhurst.penguinpowered.com>
  1 sibling, 3 replies; 13+ messages in thread
From: Karl Kleinpaste @ 2000-04-26 18:22 UTC


<sarcasm> Bloody marvelous. </sarcasm>

Now, Gnus is urlifying stuff such as bash-2.04.tar.gz and 21.2.31.

I'm not at all convinced this is a good thing.  The heuristic is
nowhere near adequately restrictive.  Maybe (only maybe) if you
restrict the trailing component to TLDs, it would be adequate.
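
The over-matching can be reproduced outside Gnus.  The sketch below
copies only the protocol-less half of the new regexp from the message
above; my-loose-naked-regexp is a made-up name used for the test.

```elisp
;; The protocol-less half of the regexp posted earlier in the thread,
;; copied verbatim.  The constant name is hypothetical.
(defconst my-loose-naked-regexp
  "[-a-zA-Z0-9_]+\\.[-a-zA-Z0-9_]+\\(\\.[-a-zA-Z0-9_]+[-a-zA-Z0-9_/]+\\)+"
  "Naked-URL alternative of the regexp under discussion.")

(mapcar (lambda (s)
          (and (string-match my-loose-naked-regexp s)
               (match-string 0 s)))
        '("bash-2.04.tar.gz" "21.2.31" "dodrt.dod.no" "imdb.com"))
;; => ("bash-2.04.tar.gz" "21.2.31" "dodrt.dod.no" nil)
```

The pattern asks for at least two dots and at least two characters
after the last one, and nothing more, so file names and version
numbers qualify while a single-dot name such as imdb.com does not.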




* Re: on matching more naked URLs in articles
  2000-04-26 18:22           ` Karl Kleinpaste
@ 2000-04-27  8:35             ` Steinar Bang
  2000-04-27 10:17               ` Toby Speight
  2000-04-29 14:27             ` Thomas Skogestad
  2000-04-29 20:36             ` François Pinard
  2 siblings, 1 reply; 13+ messages in thread
From: Steinar Bang @ 2000-04-27  8:35 UTC


>>>>> Karl Kleinpaste <karl@charcoal.com>:

> <sarcasm> Bloody marvelous. </sarcasm>
> Now, Gnus is urlifying stuff such as bash-2.04.tar.gz and 21.2.31.

> I'm not at all convinced this is a good thing.  The heuristic is
> nowhere near adequately restrictive.  Maybe (only maybe) if you
> restrict the trailing component to TLDs, it would be adequate.

I don't see any problem with some non-URLs being highlighted, as long
as the real ones are, since that saves me cutting and pasting.

My visual nerve is far more bothered by the highlighting of quoted
text that kicks in in undesired places (e.g. around forwarded code
examples).  I've turned off everything I could find in the Gnus
customize group (the only place I could figure out to do this),
and I live with the rest.




* Re: on matching more naked URLs in articles
  2000-04-27  8:35             ` Steinar Bang
@ 2000-04-27 10:17               ` Toby Speight
  0 siblings, 0 replies; 13+ messages in thread
From: Toby Speight @ 2000-04-27 10:17 UTC


Steinar> Steinar Bang <URL:mailto:sb@metis.no>
Karl> Karl Kleinpaste <URL:mailto:karl@charcoal.com>

0> In article <vxkog6w4t9y.fsf@mesquite.charcoal.com>, Karl wrote:

Karl> <sarcasm> Bloody marvelous. </sarcasm>
Karl>
Karl> Now, Gnus is urlifying stuff such as bash-2.04.tar.gz and
Karl> 21.2.31.
Karl>
Karl> I'm not at all convinced this is a good thing.  The heuristic is
Karl> nowhere near adequately restrictive.  Maybe (only maybe) if you
Karl> restrict the trailing component to TLDs, it would be adequate.


0> In article <whk8hk0wn6.fsf@viffer.metis.no>, Steinar wrote:

Steinar> I don't see any problem with some non-URLs being
Steinar> highlighted, as long as the real ones are, since that saves
Steinar> me cutting and pasting.


This is obviously an area where personal preferences differ.  I think
we need to put some canned "loose" and "tight" regexps as :fixed
choices in the defcustom.

IOW, instead of

  :type 'regexp

we should have

  :type '(choice (const :tag "Match conservatively" "<tight>")
                 (const :tag "Match liberally" "<loose>")
                 (regexp :tag "User-defined"))

where the consts are suitable defaults for the two POVs.

I think we can assume that users either (a) use Customize, or (b) are
clueful enough to find the definition and copy the regexps from there.
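
A self-contained sketch of such a defcustom follows.  All names are
made up and the two regexps are short stand-ins for the real "tight"
and "loose" values, not the ones Gnus uses; the group is chosen only
so that the form evaluates outside Gnus.

```elisp
;; Hypothetical stand-ins for the two canned regexps.
(defconst my-button-url-regexp-tight
  "\\b\\(https?\\|ftp\\|mailto\\):[^ \t\n]+"
  "Conservative stand-in: a protocol prefix is required.")

(defconst my-button-url-regexp-loose
  (concat my-button-url-regexp-tight
          "\\|\\b[-a-zA-Z0-9_]+\\(\\.[-a-zA-Z0-9_]+\\)+")
  "Liberal stand-in: bare dotted names are accepted as well.")

(defcustom my-button-url-regexp my-button-url-regexp-tight
  "Regexp used to find URLs in article text (illustration only)."
  :group 'convenience
  ;; In a widget type, keywords such as :tag come before the value.
  :type `(choice (const :tag "Match conservatively"
                        ,my-button-url-regexp-tight)
                 (const :tag "Match liberally"
                        ,my-button-url-regexp-loose)
                 (regexp :tag "User-defined")))
```

Customize would then offer the two canned values by their tags and
fall back to a free-form regexp field for anything else.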





* Re: on matching more naked URLs in articles
  2000-04-26 18:22           ` Karl Kleinpaste
  2000-04-27  8:35             ` Steinar Bang
@ 2000-04-29 14:27             ` Thomas Skogestad
  2000-04-29 20:36             ` François Pinard
  2 siblings, 0 replies; 13+ messages in thread
From: Thomas Skogestad @ 2000-04-29 14:27 UTC


* Karl Kleinpaste

| Now, Gnus is urlifying stuff such as bash-2.04.tar.gz and 21.2.31.

Is there any way to turn this marvelous new feature off?

(I've already turned off all sorts of coloring of quotes etc. And now
suddenly Gnus wants to color Linux kernel versions and other stuff.)

-- 
thomas.skogestad@jusstud.uio.no
http://quimby.gnus.org/circus/problem.gif
ftp://ftp.splode.com/pub/users/friedman/emacs-lisp/kill-a-lawyer.el




* Re: on matching more naked URLs in articles
       [not found]           ` <vxkog6w4t9y.fsf@mesquite.charcoal <3166007242308746@oakhurst.penguinpowered.com>
@ 2000-04-29 15:44             ` Karl EICHWALDER
  0 siblings, 0 replies; 13+ messages in thread
From: Karl EICHWALDER @ 2000-04-29 15:44 UTC
  Cc: ding

Thomas Skogestad <tskogest@jusstud.uio.no> writes:

> * Karl Kleinpaste

> | Now, Gnus is urlifying stuff such as bash-2.04.tar.gz and 21.2.31.

> Is there any way to turn this marvelous new feature off?

I again activated the old regexp:

(setq gnus-button-url-regexp "\\b\\(s?https?\\|ftp\\|file\\|gopher\\|news\\|telnet\\|wais\\|mailto\\):\\(//[-a-zA-Z0-9_.]+:[0-9]*\\)?\\([-a-zA-Z0-9_=!?#$@~`%&*+|\\/:;.,]\\|\\w\\)+\\([-a-zA-Z0-9_=#$@~`%&*+|\\/]\\|\\w\\)")

Note: I didn't check whether Toby's well-thought-out proposal is already
implemented.

-- 
work : ke@suse.de                          |
     : http://www.suse.de/~ke/             |          ------    ,__o
home : ke@gnu.franken.de                   |         ------   _-\_<,
     : http://www.franken.de/users/gnu/ke/ |        ------   (*)/'(*)





* Re: on matching more naked URLs in articles
  2000-04-26 18:22           ` Karl Kleinpaste
  2000-04-27  8:35             ` Steinar Bang
  2000-04-29 14:27             ` Thomas Skogestad
@ 2000-04-29 20:36             ` François Pinard
  2 siblings, 0 replies; 13+ messages in thread
From: François Pinard @ 2000-04-29 20:36 UTC
  Cc: ding

Karl Kleinpaste <karl@charcoal.com> writes:

> Now, Gnus is urlifying stuff such as bash-2.04.tar.gz and 21.2.31.
> I'm not at all convinced this is a good thing.  The heuristic is nowhere
> near adequately restrictive.

I initially received the new emphasis on file names and version numbers
as a nice display feature, and sent a note to Lars to thank him.

However, I realize that these new things are also clickable, and when
clicked, they get interpreted as browsable URLs.  This is not nice, as there
are many false matches.  If a numeric quad representing an IP address is
emphasised, no problem with me.  Clicking on it might reasonably try a
ping, but launching Netscape is a bit much! :-) If I click on a mere file
name, I would expect that file to be visited if it exists locally maybe,
but launching Netscape looks like overkill to me...
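
The idea of choosing the action from the kind of text that was matched
could be sketched like this.  my-classify-button-text and the symbols
it returns are made up for illustration; Gnus has no such function as
far as this thread shows.

```elisp
(defun my-classify-button-text (text)
  "Return a symbol naming the kind of thing TEXT looks like.
The result is `ip-address' for a dotted numeric quad, `local-file'
when TEXT names an existing file, and `url' otherwise."
  (cond
   ;; Four groups of digits separated by dots, and nothing else.
   ((string-match-p "\\`[0-9]+\\(?:\\.[0-9]+\\)\\{3\\}\\'" text)
    'ip-address)
   ;; A name that exists relative to `default-directory'.
   ((file-exists-p text)
    'local-file)
   ;; Anything else is left to the browser.
   (t 'url)))

(my-classify-button-text "192.168.0.1")  ; => ip-address
```

A button callback could then dispatch on the symbol, for instance
pinging for ip-address, calling `find-file' for local-file, and
`browse-url' only for url.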

Emphasis is welcome, but clickability might be rethought.  I understand
that the equilibrium between getting a lot of false matches, and missing
good matches, is a delicate matter, and related to the taste of users.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard






