Multi-line Matching in Regular Expressions

Rat
Posts: 68
Joined: Wed Jun 15, 2011 1:18 am

Multi-line Matching in Regular Expressions

Post by Rat »

Is there any way to perform a multi-line Search (and/or Replace) operation using regular expressions in Zeus?

p.s. I'd also like to know if there is a multi-file Search and Replace option as well. (I can only find a multi-file Search function.)
jussij
Site Admin
Posts: 2650
Joined: Fri Aug 13, 2004 5:10 pm

Post by jussij »

Quote: Is there any way to perform a multi-line Search (and/or Replace) operation using regular expressions in Zeus?

The \n string can be used to represent a line feed, which means it should be possible to construct a multi-line regexp search string.

In other words, where you think there should be a line feed, use the \n string instead.

Cheers Jussi
Rat
Posts: 68
Joined: Wed Jun 15, 2011 1:18 am

Post by Rat »

Hi Jussi,

Thanks for the reply...

I have tried various pattern combinations including "[.\r\n]+" and "[\w\s\r\n+" etc. I do quite a lot of Regular Expression construction and am reasonably familiar with the PCRE syntax. So far I've had no luck, hence the posting.

If you could tell me which Reg Ex library you're using in Zeus I could check the documentation on its syntax variant.

p.s. It's also not obvious how to do Greedy and Ungreedy searches.
jussij
Site Admin
Posts: 2650
Joined: Fri Aug 13, 2004 5:10 pm

Post by jussij »

Quote: I have tried various pattern combinations including "[.\r\n]+" and "[\w\s\r\n+" etc.

Unfortunately these types of searches will not work in Zeus :(

Zeus stores each line of the document in a separate buffer, meaning the document is held in many disjoint blocks of memory. But the regular expression engine only works if the data is held in one contiguous block of memory.

So to make the \n search work for this kind of discontinuous memory Zeus does some extra work. What Zeus does is count the number of \n characters in the search pattern, then concatenate that many lines plus one into a working buffer and pass that on to the regex engine.

So in your case there is only one \n so at most only two lines will be searched at any one time.
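
To make the windowing idea concrete, here is a rough Python sketch of the behaviour described above. It is not Zeus's actual code: the function name windowed_search is made up for illustration, and Python's re module stands in for PCRE. It counts the \n sequences written in the pattern text, joins that many lines plus one, and hands each joined chunk to the regex engine.

```python
import re

def windowed_search(lines, pattern):
    """Return (window_start_index, matched_text) or None.

    Hypothetical model of the approach described above: the window is
    the number of backslash-n sequences written in the pattern, plus one.
    """
    window = pattern.count(r"\n") + 1
    regex = re.compile(pattern)
    for start in range(len(lines)):
        # Join only `window` lines, so a match can never span more than that.
        chunk = "\n".join(lines[start:start + window])
        match = regex.search(chunk)
        if match:
            return start, match.group(0)
    return None

lines = ["start", "middle", "end"]

# Two \n sequences in the pattern text, so three lines are joined.
print(windowed_search(lines, r"start\n.*\nend"))

# No \n in the pattern text, so one line at a time: nothing spans lines.
print(windowed_search(lines, r"start[\s\S]*?end"))
```

Under this model a pattern such as "[.\r\n]+" contains a single \n, so only two lines are ever visible to the engine at once, which matches the explanation given in the post.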

Quote: If you could tell me which Reg Ex library you're using in Zeus

PCRE - Perl Compatible Regular Expressions - http://pcre.org/

Quote: p.s. It's also not obvious how to do Greedy and Ungreedy searches.

From the online help:

p+?
In the previous p+ example the search would have found single or multiple numbers of the p character. For example it finds the multiple ppp pattern at the end of the line.

The reason this happens is because the search defaults to being greedy and as such it will try to consume as many matching characters as possible.

In some cases a more lazy match is preferable and this can be achieved by appending the '?' operator to the end of the search string.

You can find this in the online help by using the Help, Using and Configuring Zeus menu and then selecting the Regular Expression folder from the Contents panel.
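
As a quick illustration of the greedy versus lazy difference described in the help text, here is a small Python sketch. Python's re module is used only as a stand-in; it shares the +? lazy quantifier syntax with PCRE, but the sample strings are invented for this example.

```python
import re

text = "<b>one</b> <b>two</b>"

# Greedy: .+ grabs as much as it can, then backs off to the last ">".
greedy = re.search(r"<.+>", text).group(0)

# Lazy: .+? grabs as little as it can, stopping at the first ">".
lazy = re.search(r"<.+?>", text).group(0)

print(greedy)
print(lazy)

# The p+ example from the help text behaves the same way.
greedy_p = re.search(r"p+", "a ppp").group(0)
lazy_p = re.search(r"p+?", "a ppp").group(0)
print(greedy_p, lazy_p)
```

Note that the ? goes directly after the quantifier it modifies (p+?), so in a longer pattern it is not necessarily the last character of the search string.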

Cheers Jussi
Rat
Posts: 68
Joined: Wed Jun 15, 2011 1:18 am

Post by Rat »

Quote: So to make the \n search work for this kind of discontinuous memory Zeus does some extra work. What Zeus does is count the number of \n characters in the search pattern, then concatenate that many lines plus one into a working buffer and pass that on to the regex engine.

I would like to, for instance, search and replace multi-line comments marked with "/*" and "*/" delimiters. The number of lines contained between the delimiters is unknown.

Quote: In the previous p+ example the search would have found single or multiple numbers of the p character. For example it finds the multiple ppp pattern at the end of the line. The reason this happens is because the search defaults to being greedy and as such it will try to consume as many matching characters as possible. In some cases a more lazy match is preferable and this can be achieved by appending the '?' operator to the end of the search string.

Yep, that'll do it (to be honest, I was thinking of a pattern modifier like the "/U" used in PHP, but that's non-standard and unnecessary).
jussij
Site Admin
Posts: 2650
Joined: Fri Aug 13, 2004 5:10 pm

Post by jussij »

Quote: I would like to, for instance, search and replace multi-line comments marked with "/*" and "*/" delimiters. The number of lines contained between the delimiters is unknown.

And unfortunately this is not possible in Zeus :(

The best I can suggest is a regexp that looks something like this:

Code:

(/\*)+([\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*)(\*/)+
This will find any block comment that is 1 to 5 lines long.

As you can see, this is the line search pattern:

Code:

[\n]*.*
So you can add any number of additional line patterns to the string to increase the number of lines covered by the search.

For example this pattern will see any block comment up to 12 lines long:

Code:

(/\*)+([\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*)(\*/)+
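
Rather than typing the repeated line pattern by hand, the search string can be generated. The following Python sketch is only an illustration (the helper name block_comment_pattern is hypothetical, and Python's re module stands in for PCRE); it builds the same shape of pattern for any line count so the result can be pasted into the search dialog.

```python
import re

def block_comment_pattern(max_lines):
    """Build the repeated-line pattern for comments up to max_lines long."""
    line_part = r"[\n]*.*"  # the per-line piece shown above
    return r"(/\*)+(" + line_part * max_lines + r")(\*/)+"

five = block_comment_pattern(5)
twelve = block_comment_pattern(12)
print(five)
print(twelve)

# Sanity check outside Zeus: a three-line comment is matched in full.
comment = "/* a\n b\n c */"
found = re.search(five, comment).group(0)
print(found == comment)
```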
Having said this, when testing these patterns I did notice a slight bug in Zeus :(

If, for example, you use the 12 line pattern and you have two small block comments only a few lines apart, the second block will not be found, as it will have been consumed by the earlier concatenation used to find the first :(

Cheers Jussi
Rat
Posts: 68
Joined: Wed Jun 15, 2011 1:18 am

Post by Rat »

Thanks Jussi...

Quote: p.s. I'd also like to know if there is a multi-file Search and Replace option as well. (I can only find a multi-file Search function.)
Rat
Posts: 68
Joined: Wed Jun 15, 2011 1:18 am

Post by Rat »

Quote: If, for example, you use the 12 line pattern and you have two small block comments only a few lines apart, the second block will not be found, as it will have been consumed by the earlier concatenation used to find the first

Are you sure this isn't the normal default Greedy matching behaviour?
jussij
Site Admin
Posts: 2650
Joined: Fri Aug 13, 2004 5:10 pm

Post by jussij »

Quote: Are you sure this isn't the normal default Greedy matching behaviour?

You are 100% correct. Turns out this is not a bug in Zeus but rather my bad regular expression :oops:

This less-greedy regexp is much better ;)

Code:

(/\*+)([\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*)(\*/+)
There may well be an even better way to express this, but the secret is that the number of \n's in the search string determines the maximum number of lines of a block comment that can be found.
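
The greedy effect Rat pointed out can be reproduced outside Zeus. The sketch below uses Python's re module as a stand-in for PCRE, and the helper bounded_pattern is hypothetical. Its lazy variant swaps in .*? for .*, which is an assumption of this sketch and not the exact pattern posted above; it is there only to show how a lazy quantifier stops at the first closing delimiter.

```python
import re

def bounded_pattern(max_lines, lazy):
    """Block-comment pattern limited to max_lines lines of content."""
    dot = r".*?" if lazy else r".*"
    return r"(/\*)(" + (r"[\n]?" + dot) * max_lines + r")(\*/)"

two_comments = "/* one */\nint x;\n/* two */"

# Greedy: the match runs from the first "/*" to the last "*/" in range.
greedy_match = re.search(bounded_pattern(5, lazy=False), two_comments).group(0)

# Lazy: the match stops at the first "*/".
lazy_match = re.search(bounded_pattern(5, lazy=True), two_comments).group(0)

print(repr(greedy_match))
print(repr(lazy_match))

# The lazy form still crosses lines when the comment really is multi-line.
multi = re.search(bounded_pattern(5, lazy=True), "/* a\n b */").group(0)
print(repr(multi))
```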

Cheers Jussi
AlanStewart
Posts: 83
Joined: Fri Jun 02, 2006 6:52 pm

Re: Multi-line Matching in Regular Expressions

Post by AlanStewart »

I might be necroing a VERY old thread but it helped me solve a problem I was having so I wanted to post my solution.

I need to clean up some CSS code which contains a number of unused definitions marked with XXX:

Code:

header .h_title XXX
{
/*outline: 1px solid white;*/
   display: inline-block;
   position: relative;
/*   width: calc(960px - 4px - 20px - 137px - 10px - 5px);*/
   margin: 8px 0px 0px 20px;
/*   text-align: center;*/
   vertical-align: top;
/*   font-size: 1.3em;*/
/*   z-index: 2;*/
}
(I know, it's a mess.)

I was having a devil of a time getting something simple working but with this post I realized I could hack the system:

(^.+\WXXX\n)+({\n(.*\n)*?}\n\n)|(\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n)

I just need to make sure that there's enough \n in there to trigger a bigger buffer. And the ? in (.*\n)*? keeps it from getting too greedy and matching multiple blocks at once.

The only downside is that it doesn't work so well with nested @media blocks, but that's just something I need to watch out for myself.
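
For anyone wanting to try the idea outside Zeus, here is a rough Python sketch of the two ingredients: the lazy (.*\n)*? and the padding alternative. It is an approximation under stated assumptions: Python's re module stands in for PCRE, the braces are escaped and re.MULTILINE is set explicitly, the sample CSS is invented, and the padding of 20 is an arbitrary choice for the example.

```python
import re

# Core of the pattern from the post, with braces escaped for Python's re.
CORE_LAZY = r"(^.+\WXXX\n)+(\{\n(.*\n)*?\}\n\n)"
CORE_GREEDY = r"(^.+\WXXX\n)+(\{\n(.*\n)*\}\n\n)"

css = (
    "a XXX\n{\n  x: 1;\n}\n\n"
    "b\n{\n  y: 2;\n}\n\n"
    "c XXX\n{\n  z: 3;\n}\n\n"
)

lazy_hits = [m.group(0) for m in re.finditer(CORE_LAZY, css, re.MULTILINE)]
greedy_hits = [m.group(0) for m in re.finditer(CORE_GREEDY, css, re.MULTILINE)]

print(len(lazy_hits))    # each XXX block is matched on its own
print(len(greedy_hits))  # a single match that also swallows the "b" block

# The padding alternative only adds \n sequences to the pattern text.
padded = CORE_LAZY + "|(" + r"\n" * 20 + ")"
padded_hits = [m.group(0) for m in re.finditer(padded, css, re.MULTILINE)]
print(padded.count(r"\n"))
```

In this sketch the padded pattern finds the same blocks as the unpadded one; the extra alternative matters only to an editor that sizes its working buffer from the number of \n sequences in the pattern, as described earlier in the thread.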

So thank you for the hint that helped me solve this problem!