Consider the following snippet of text:
<March>, <12>, <2009>
Suppose the user wants to extract only the <March> component of this text, so they might write the following regular expression:
But to their surprise, when the search is run it selects the entire line of text:
<March>, <12>, <2009>
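The behaviour can be reproduced with a short Python sketch. The pattern `<.+>` is an assumption here, since the original expression is not shown in the text; it is simply the most common way to write "an opening bracket, some characters, a closing bracket".

```python
import re

# Sample line taken from the text above.
text = "<March>, <12>, <2009>"

# ASSUMED pattern: "<", one or more of any character, then ">".
greedy = re.compile(r"<.+>")

match = greedy.search(text)
greedy_result = match.group()

# The ".+" runs to the end of the line and backtracks only as far
# as the LAST ">", so the whole line is selected.
print(greedy_result)
```

The match starts at the first `<` as expected, but it ends at the final `>` rather than the nearest one.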
One way to fix this is to modify the expression so that it also specifies the text that must not be matched, as shown below:
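A minimal sketch of this approach in Python, assuming the fix is a negated character class (`[^>]`, meaning "any character except `>`"); the exact expression the author used is not shown, so treat this pattern as illustrative.

```python
import re

text = "<March>, <12>, <2009>"

# ASSUMED pattern: "<", one or more characters that are NOT ">", then ">".
# Because the closing bracket is excluded from the repeated part,
# the match cannot run past the first ">".
negated = re.compile(r"<[^>]+>")

first_component = negated.search(text).group()
all_components = negated.findall(text)

print(first_component)
print(all_components)
```

With this pattern each bracketed component is matched separately, so `search` returns only the first one and `findall` returns all three.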
The reason the regular expression behaves this way is that quantifiers are greedy by default, meaning they try to match as much text as possible.
A much simpler fix is to make the regular expression lazy by adding a question mark after the quantifier, as shown below:
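As a rough illustration in Python, again assuming the original expression was along the lines of `<.+>`, the lazy version places `?` directly after the `+` quantifier:

```python
import re

text = "<March>, <12>, <2009>"

# ASSUMED patterns; the only difference is the "?" after the quantifier.
greedy = re.compile(r"<.+>")   # matches as much as possible
lazy = re.compile(r"<.+?>")    # matches as little as possible

greedy_match = greedy.search(text).group()
lazy_match = lazy.search(text).group()

print(greedy_match)
print(lazy_match)
```

The lazy quantifier stops at the first `>` it can, so only the first component is selected, while the greedy one still consumes the whole line.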
NOTE: Lazy regular expressions are also referred to as minimal, non-greedy, reluctant or un-greedy.
Cheers Jussi