Multi-line Matching in Regular Expressions

Post by **Rat** » Fri Dec 02, 2011 3:12 am

Is there any way to perform a multi-line Search (and/or Replace) operation using regular expressions in Zeus?

p.s. I'd also like to know if there is a multi-file Search and Replace option as well. (I can only find a multi-file Search function.)

Post by **jussij** » Fri Dec 02, 2011 2:45 pm

Is there any way to perform a multi-line Search (and/or Replace) operation using regular expressions in Zeus?

The \n string can be used to represent a line feed, which means it should be possible to construct a multi-line regexp search string.

In other words, wherever you think there should be a line feed, use the \n string instead.

Cheers Jussi

Post by **Rat** » Sun Dec 04, 2011 12:50 am

Hi Jussi,

Thanks for the reply...

I have tried various pattern combinations including "[.\r\n]+" and "[\w\s\r\n+" etc. I do quite a lot of Regular Expression construction and am reasonably familiar with the PCRE syntax. So far I've had no luck, hence the posting.

If you could tell me which Reg Ex library you're using in Zeus I could check the documentation on its syntax variant.

p.s. It's also not obvious how to do Greedy and Ungreedy searches.

Post by **jussij** » Mon Dec 05, 2011 12:28 am

I have tried various pattern combinations including "[.\r\n]+" and "[\w\s\r\n+" etc.

Unfortunately these types of searches will not work in Zeus.

Zeus stores each line of the document in a separate buffer, meaning the document is held in many separate blocks of memory. But the regular expression engine only works if the data is held in one contiguous block of memory.

So to make the \n search work with this kind of non-contiguous memory, Zeus does some extra work. It counts the number of \n characters in the search pattern, concatenates that many lines plus one into a working buffer, and passes that buffer to the regex engine.

In your case the pattern contains only one \n, so at most two lines will be searched at any one time.
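The windowing described above can be sketched in Python. This is only an illustration of the idea, not Zeus's actual code: the function name `windowed_search`, the naive way of counting `\n` in the pattern text, and the use of Python's `re` module in place of PCRE are all assumptions.

```python
import re

def windowed_search(lines, pattern):
    """Hypothetical model of the behaviour described in the post:
    join (count of backslash-n in the pattern) + 1 consecutive lines
    into one working buffer and run the regex on that buffer only."""
    # Naive count of the two-character sequence backslash + n in the pattern text.
    window = pattern.count(r"\n") + 1
    regex = re.compile(pattern)
    for start in range(len(lines)):
        buffer = "\n".join(lines[start:start + window])
        match = regex.search(buffer)
        if match:
            return start, match.group(0)
    return None

lines = ["/* first", "second", "third */", "int x;"]

# One \n in the pattern: two lines are joined, so a two-line match works.
two_line = windowed_search(lines, r"first\nsecond")

# Still only one \n in the pattern: the buffer never holds three lines,
# so a match that needs to span three lines cannot be found.
three_line = windowed_search(lines, r"first[\w\s\r\n]+third")
```

Under this model the character class `[\w\s\r\n]+` would happily cross any number of lines in an ordinary regex engine, but it never gets the chance because the buffer it is given is only two lines long.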

If you could tell me which Reg Ex library you're using in Zeus

PCRE - Perl Compatible Regular Expressions - http://pcre.org/

p.s. It's also not obvious how to do Greedy and Ungreedy searches.

From the online help:

p+?

In the previous p+ example the search would have found single or multiple numbers of the p character. For example it finds the multiple ppp pattern at the end of the line.

The reason this happens is because the search defaults to being greedy and as such it will try to consume as many matching characters as possible.

In some cases a more lazy match is preferable and this can be achieved by appending the '?' operator to the end of the search string.

You can find this in the online help by using the Help, Using and Configuring Zeus menu and then selecting the Regular Expression folder from the Contents panel.
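The difference between `p+` and `p+?` can be seen in a short Python sketch. Python's `re` module is used here as a stand-in for PCRE (both support the same lazy `?` suffix); the sample line is made up for illustration.

```python
import re

line = "apple ppp"

# Greedy: each match takes as many consecutive p characters as it can.
greedy = re.findall(r"p+", line)

# Lazy: each match stops as soon as the pattern is satisfied, i.e. after one p.
lazy = re.findall(r"p+?", line)

print(greedy)  # ['pp', 'ppp']
print(lazy)    # ['p', 'p', 'p', 'p', 'p']
```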

Cheers Jussi

Post by **Rat** » Mon Dec 05, 2011 1:51 am

So to make the \n search work with this kind of non-contiguous memory, Zeus does some extra work. It counts the number of \n characters in the search pattern, concatenates that many lines plus one into a working buffer, and passes that buffer to the regex engine.

I would like to, for instance, search and replace multi-line comments marked with "/*" and "*/" delimiters. The number of lines contained between the delimiters is unknown.

In the previous p+ example the search would have found single or multiple numbers of the p character. For example it finds the multiple ppp pattern at the end of the line.

The reason this happens is because the search defaults to being greedy and as such it will try to consume as many matching characters as possible.

In some cases a more lazy match is preferable and this can be achieved by appending the '?' operator to the end of the search string.

Yep, that'll do it (to be honest I was thinking of a pattern modifier like the "/U" used in PHP, but that's non-standard and unnecessary).

Post by **jussij** » Tue Dec 06, 2011 12:18 am

I would like to, for instance, search and replace multi-line comments marked with "/*" and "*/" delimiters. The number of lines contained between the delimiters is unknown.

And unfortunately this is not possible in Zeus.

The best I can suggest is a regexp that looks something like this:

Code:

(/\*)+([\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*)(\*/)+

This will find any block comment that is 1 to 5 lines long.

As you can see, this is the line search pattern:

Code:

[\n]*.*

So you can add further line patterns to the string to increase the number of lines covered by the search.

For example, this pattern will find any block comment up to 12 lines long:

Code:

(/\*)+([\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*[\n]*.*)(\*/)+
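Rather than typing the repeated line pattern by hand, the search string can be generated. The sketch below is in Python and uses its `re` module as a stand-in for PCRE; the helper name `block_comment_pattern` is hypothetical, and the matching is done on an ordinary string, so it does not reproduce Zeus's line buffering.

```python
import re

def block_comment_pattern(max_lines):
    """Build the block-comment pattern with the line pattern repeated max_lines times."""
    line_unit = r"[\n]*.*"  # optional line feeds, then the rest of one line
    return r"(/\*)+(" + line_unit * max_lines + r")(\*/)+"

five = block_comment_pattern(5)
twelve = block_comment_pattern(12)

short_comment = "/* a\n b\n c */"           # 3 lines
long_comment = "/*\n1\n2\n3\n4\n5\n6\n*/"    # 8 lines

found_short = re.search(five, short_comment)          # fits in 5 line patterns
found_long_5 = re.search(five, long_comment)          # too long for 5
found_long_12 = re.search(twelve, long_comment)       # fits in 12
```

Each `[\n]*.*` unit can step over one run of line feeds, so the number of units sets an upper bound on how many non-blank lines the comment may span.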

Having said this, when testing these patterns I did notice a slight bug in Zeus.

If, for example, you use the 12 line pattern and you have two small block comments only a few lines apart, the second block will not be found, as it will have been consumed by the earlier concatenation used to find the first.

Cheers Jussi

Post by **Rat** » Tue Dec 06, 2011 12:24 am

Thanks Jussi...

p.s. I'd also like to know if there is a multi-file Search and Replace option as well. (I can only find a multi-file Search function.)

Post by **Rat** » Tue Dec 06, 2011 12:58 am

If, for example, you use the 12 line pattern and you have two small block comments only a few lines apart, the second block will not be found, as it will have been consumed by the earlier concatenation used to find the first.

Are you sure this isn't the normal default Greedy matching behaviour?

Post by **jussij** » Tue Dec 06, 2011 1:59 am

Are you sure this isn't the normal default Greedy matching behaviour?

You are 100% correct. It turns out this is not a bug in Zeus but rather my bad regular expression.

This less-greedy regexp is much better:

Code:

(/\*+)([\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*[\n]?.*)(\*/+)

There may well be an even better way to express this, but the secret is that the number of \n's in the search string sets the maximum number of lines a matched block comment can span.
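The greedy effect discussed in this exchange can be reproduced outside Zeus. The sketch below uses Python's `re` module as a stand-in for PCRE and runs on a plain string, so it says nothing about Zeus's buffering. The lazy variant built from `\n?.*?` is an assumption added for comparison; it is not one of the patterns posted in this thread.

```python
import re

# Two small block comments only a couple of lines apart.
source = "/* a */\nint x;\n/* b */"

# The 12 line pattern from earlier in the thread: every quantifier is greedy.
greedy = r"(/\*)+(" + r"[\n]*.*" * 12 + r")(\*/)+"

# Hypothetical lazy variant: each unit takes as little as possible.
lazy = r"(/\*)(" + r"\n?.*?" * 12 + r")(\*/)"

greedy_hits = [m.group(0) for m in re.finditer(greedy, source)]
lazy_hits = [m.group(0) for m in re.finditer(lazy, source)]

print(greedy_hits)  # one match running from the first /* to the last */
print(lazy_hits)    # two matches, one per comment
```

The greedy pattern swallows both comments and the code between them in a single match, which is why the second comment appears never to be found on its own.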

Cheers Jussi

Post by **AlanStewart** » Wed Sep 26, 2018 8:52 pm

I might be necroing a VERY old thread but it helped me solve a problem I was having so I wanted to post my solution.

I need to clean up some CSS code which contains a number of unused definitions marked with XXX:

Code:

header .h_title XXX
{
/*outline: 1px solid white;*/
   display: inline-block;
   position: relative;
/*   width: calc(960px - 4px - 20px - 137px - 10px - 5px);*/
   margin: 8px 0px 0px 20px;
/*   text-align: center;*/
   vertical-align: top;
/*   font-size: 1.3em;*/
/*   z-index: 2;*/
}

(I know, it's a mess.)

I was having a devil of a time getting something simple working but with this post I realized I could hack the system:

(^.+\WXXX\n)+({\n(.*\n)*?}\n\n)|(\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n)

I just need to make sure that there are enough \n's in there to trigger a bigger buffer: the second alternative is only padding and is never expected to match. And the ? in (.*\n)*? keeps it from getting too greedy and matching multiple blocks at once.
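For anyone wanting to try the matching half of that pattern outside Zeus, here is a sketch in Python. It assumes Python's `re` module as a stand-in for PCRE, escapes the braces, adds the MULTILINE flag so that `^` matches at each line start, and drops the padding alternative because there is no line buffer to enlarge. The sample CSS is made up for illustration.

```python
import re

css = (
    "header .h_title XXX\n"
    "{\n"
    "   display: inline-block;\n"
    "}\n"
    "\n"
    "footer .f_note\n"
    "{\n"
    "   color: red;\n"
    "}\n"
    "\n"
)

# Selector line ending in XXX, then a brace block followed by a blank line.
# The lazy (.*\n)*? stops at the first closing brace instead of the last one.
unused_block = re.compile(r"(^.+\WXXX\n)+(\{\n(.*\n)*?\}\n\n)", re.MULTILINE)

cleaned = unused_block.sub("", css)
print(cleaned)
```

Only the definition marked with XXX is removed; the unmarked `footer .f_note` block is left untouched.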

The only downside is that it doesn't work so well with nested @media blocks, but that's just something I need to watch out for myself.

So thank you for the hint that helped me solve this problem!
