Greedy and Lazy Regular Expressions

jussij

Post by jussij »

Consider the following snippet of text:

Code:

<March>, <12>, <2009>

Assume the user is trying to extract the <March> component of this text. They might write the following regular expression:

Code:

<.*>

But to their surprise, when the search is run it selects the entire line of text: <March>, <12>, <2009>

One way to fix this is to modify the query so that it also specifies the text that must not be matched, in this case any closing > character, as shown below:

Code:

<[^>]*>

But the reason the original regular expression behaves this way is that quantifiers are greedy by default, meaning they will try to match as much text as possible.
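
The greedy behaviour and the negated character class fix can also be reproduced outside Zeus. The following is only a rough sketch using Python's re module; it assumes the Zeus regular expression engine treats these two patterns the same way Python does, and the variable names are made up for the example.

```python
import re

text = "<March>, <12>, <2009>"

# Greedy: .* first runs to the end of the line, then backs up to the last '>'.
greedy = re.search(r"<.*>", text).group(0)

# Negated class: [^>]* cannot cross a '>', so the match ends at the first one.
negated = re.search(r"<[^>]*>", text).group(0)

print(greedy)   # <March>, <12>, <2009>
print(negated)  # <March>
```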

A much simpler fix to this problem is to make the quantifier lazy by adding a question mark after it, as shown below:

Code:

<.*?>

NOTE: Lazy regular expressions are also referred to as being minimal, non-greedy, reluctant or un-greedy.
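
For comparison, here is a small sketch of the lazy form in Python's re module. Again, this assumes the Zeus engine handles the `*?` quantifier the way Python does; the variable names are purely illustrative.

```python
import re

text = "<March>, <12>, <2009>"

# Lazy: .*? matches as little as possible, so it stops at the first '>'.
first = re.search(r"<.*?>", text).group(0)

# findall returns every non-overlapping lazy match found on the line.
all_parts = re.findall(r"<.*?>", text)

print(first)      # <March>
print(all_parts)  # ['<March>', '<12>', '<2009>']
```

With the lazy quantifier, each bracketed component is matched on its own instead of the whole line being swallowed by a single match.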

Cheers Jussi
AlanStewart

Greedy

Post by AlanStewart »

That's a very handy thing to know! Thank you! I've been struggling with this problem ever since I had to give up using Brief!