From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/3653 Path: news.gmane.org!not-for-mail From: Rich Felker Newsgroups: gmane.linux.lib.musl.general Subject: Re: embedded newbies site. Date: Tue, 16 Jul 2013 10:00:53 -0400 Message-ID: <20130716140053.GO29800@brightrain.aerifal.cx> References: <1373940214.3719.5@driftwood> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1373983266 805 80.91.229.3 (16 Jul 2013 14:01:06 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Tue, 16 Jul 2013 14:01:06 +0000 (UTC) To: musl@lists.openwall.com Original-X-From: musl-return-3657-gllmg-musl=m.gmane.org@lists.openwall.com Tue Jul 16 16:01:08 2013 Return-path: Envelope-to: gllmg-musl@plane.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1Uz5oR-0004Ob-GZ for gllmg-musl@plane.gmane.org; Tue, 16 Jul 2013 16:01:07 +0200 Original-Received: (qmail 15602 invoked by uid 550); 16 Jul 2013 14:01:07 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: Original-Received: (qmail 15594 invoked from network); 16 Jul 2013 14:01:06 -0000 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Xref: news.gmane.org gmane.linux.lib.musl.general:3653 Archived-At: On Tue, Jul 16, 2013 at 07:50:29AM -0400, LM wrote: > design. For instance, there are some negative mentions about the PCRE > library, but when I tried to track down the cons for using it, I only found > dated performance comparisons showing how poorly it worked if you don't use > the newer JIT implementation. What might be a positive for a system that's The whole concept of regular expressions is that they're regular, meaning they're matchable in O(n) time with O(1) space. PCRE (the implementation) uses backtracking for everything, giving it exponentially-bad performance (JIT cannot fix this), and PCRE (the language) has a lot of features that are fundamentally not regular and thus can't be implemented efficiently. Also, the behavior of some of the features (e.g. greedy vs non-greedy matching) were not designed intentionally but just arose out of the backtracking implementation, and thus don't make a lot of sense unless you think from the standpoint of such an implementation. Aside from performance, PCRE is harmful to CS education because it undermines the whole definition of "regular" when students learn a sense of "regular expression" that's not actually regular. Of course this can be worked around if the instructor teaches this issue well when teaching PCRE, but I think normally PCRE is just taught as a tool without any theoretical background. Rich