
Regular expression replace end of line infinite loop issue

Posted: Sat May 17, 2014 2:26 pm
by econoplas
Hello Jussi,

I recently encountered an infinite looping problem with zeus when I regexp replace end-of-line ($) with some string "x".

When I do this in zeus 3.97r, I experience an infinite loop where zeus is replacing the end of the first line with "x" repeatedly until I cancel it. And it consumes 100% of one CPU core while it is busy replacing.

To reproduce:
1. Open a new untitled document (I was using an file)
2. Enter 3 lines of text in the file such as:

a
b
c

3. Press CTRL-HOME to go to the top of the file
4. Press CTRL-R to open the Replace Dialog and enter/click the following:
Find what: $
Replace with: x
[x] Use regular expression
(o) Search down
Click "Replace All" button
5. After a few seconds click the "Cancel" button.

I noticed that the first line of the file has several thousand "x" characters at the end after a couple of seconds, and the 2nd and 3rd line are unchanged. Instead I expected my file to contain 3 lines: ax, bx, and cx. During the replace operation the Status bar flashes & updates frequently with "Replacement Completed with NNNN occurences changed" where NNNN keeps increasing continually.

Thought I would bring it up so you could possibly look into this issue next time you are looking at the code to see if it can be fixed. This is not urgent in other words, as I can always do this in vim or emacs when needed.

I use this approach all the time to paste a list of things into a quickie python or ruby script, and then regex replace the BOL with " and the EOL with ", to make a list of string items for lookup tables & dictionaries/hashes and such.
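For readers who want to see the intended result outside an editor, here is a minimal Python sketch of the same two-step replace. The helper name `quote_lines` and the sample items are made up for illustration; Python's `re` module is used only as a stand-in for an editor that accepts a bare ^ or $ as the search pattern.

```python
import re

def quote_lines(text):
    """Wrap every line in double quotes and append a comma."""
    # Step 1: insert an opening quote at each beginning of line (BOL).
    text = re.sub(r'^', '"', text, flags=re.MULTILINE)
    # Step 2: insert a closing quote and a comma at each end of line (EOL).
    return re.sub(r'$', '",', text, flags=re.MULTILINE)

items = "apple\nbanana\ncherry"  # hypothetical pasted list
print(quote_lines(items))
```

The output can then be pasted between [ and ] to form a list literal.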

Thanks for your support! Troy

Posted: Mon May 19, 2014 12:02 am
by jussij
Hi Troy,

Thanks for posting the bug report.
"And it consumes 100% of one CPU core while it is busy replacing."

This is always a good indication that Zeus might be stuck in a tight loop.

Usually Ctrl+C, Ctrl+Break or Esc will kill these loops.
"I noticed that the first line of the file has several thousand "x" characters at the end after a couple of seconds, and the 2nd and 3rd line are unchanged."
I agree this might seem strange but I'm not sure it is a bug :?:

As the $ represents the end of line, what is happening in Zeus is this:
  1. Find end of line
  2. Replace end of line found with x (which naturally moves the end of line right by one)
  3. Goto (1)
So as you can see, this describes exactly what you're seeing, as Zeus keeps finding the same end of line. But I'm not sure that this is in fact wrong :?

The real issue here is that the search string is only the end-of-line marker and nothing more, and as the replace moves the end of line to the right, the search and replace loop never leaves the first line.

Zeus does have code in the search and replace designed to stop search and replace infinite loops and that code goes something like this:
  1. Find the text
  2. Remove find text and add the replacement text
  3. Move the start of the next find to the end of the replaced text
Once again you can see this logic describes exactly what you are seeing.

Fixing this would need a change to that logic, but once again I'm not convinced that the logic is wrong.
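The three steps above, and one possible change to them, can be sketched in Python. This is only a model of the behaviour described in this thread, not Zeus source code: the function name, the `skip_after_empty_match` flag and the `limit` safety cap are all assumptions made for the illustration.

```python
def replace_eol_all(lines, replacement, skip_after_empty_match, limit=1000):
    """Model a line-based Replace All for the pattern '$' (hypothetical)."""
    lines = list(lines)
    row, count = 0, 0
    while row < len(lines) and count < limit:
        # 1. Find the text: '$' is a zero-width match at the end of the line.
        match_col = len(lines[row])
        # 2. Remove the found text (nothing) and add the replacement text.
        lines[row] = lines[row][:match_col] + replacement
        count += 1
        # 3. The next find starts at the end of the replaced text, which is
        #    once again the end of the same line, so '$' matches there again.
        if skip_after_empty_match:
            # Assumed change: after a zero-width match, resume on the next line.
            row += 1
    return lines, count

print(replace_eol_all(["a", "b", "c"], "x", False, limit=5))
print(replace_eol_all(["a", "b", "c"], "x", True))
```

With the flag off, the model stays on the first line until the cap is reached, which matches the report. With the flag on, each line receives exactly one x.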

Also notice if you had done this search and replace:

This Three Character Search Text: (.$)
This Three Character Replace Text: \1x
That search and replace would have given the required result ;)
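The same pattern and replacement can be tried in Python as a quick check. This is a sketch using Python's `re` module, whose matching rules are not necessarily identical to Zeus's, and the three-line sample text is assumed from the bug report.

```python
import re

text = "a\nb\nc"  # the three sample lines from the report
# (.$) captures the last character of each line; \1 puts it back before the x.
result = re.sub(r'(.$)', r'\1x', text, flags=re.MULTILINE)
print(result)
```

Because the match consumes one real character, the search position always moves forward and the replace cannot get stuck on the same end of line.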

Cheers Jussi

PS: When writing regular expressions these things can happen quite easily, so what I generally do is turn on the Repeat find on replace option and do one find and one replace. That sequence effectively tests the first loop of the multi-loop Replace All option.

So what about blank lines?

Posted: Wed Aug 27, 2014 4:27 pm
by econoplas
Hi Jussi,

Thanks for the tip! I quickly filed it in my memory under "things zeus does differently with regexp replaces". I agree it sounds like this is "as designed" and different from most other editors, but not necessarily a bug. It just requires adjustment for users migrating to zeus from other editors that support a bare "$" regexp replacement (which most do).

One case I came across recently where this approach doesn't work is blank lines: replacing the end of line using the (.$) with \1x approach has no effect on them.

1. Enter 5 blank lines (or blank lines interleaved with non-blank)
2. Go to the top of the file
3. Open the replace dialog...

Find what: (.$)
Replace with: \1x
[x] Use regular expression
(o) Search down
Click "Replace All" button

Obviously (.$) doesn't match my blank lines because there is no character available to match the "." So I also tried just ($) without the "." and it also goes into an infinite loop. I get "xx\1xx\1xx\1xx ......" appended to my first blank line.

As a work-around I was able to use either ^$ to match the entire line or just ^ by itself to match the start of line in the "Find what:" field, and it works great. I didn't use ( .. ) around the ^$ or the ^ in the "Find what" field, and consequently didn't need the corresponding \1 in the "Replace with:" field.

Find what: ^$ ( NOTE: or simply ^ )
Replace with: x (NOTE: don't need the \1 here)
[x] Use regular expression
(o) Search down
Click "Replace All" button
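The difference between the three patterns on blank lines can be checked with a short Python sketch. Python's `re` module stands in for the editor here, so the results show what the patterns mean in general rather than exactly how Zeus applies them; the sample text is made up.

```python
import re

text = "a\n\nc"  # a blank line between two non-blank lines

# (.$) needs one real character before the end of line, so the blank line is skipped.
with_capture = re.sub(r'(.$)', r'\1x', text, flags=re.MULTILINE)
# ^$ matches only lines that are completely empty.
blank_only = re.sub(r'^$', 'x', text, flags=re.MULTILINE)
# ^ matches the start of every line, blank or not.
every_line = re.sub(r'^', 'x', text, flags=re.MULTILINE)

print(repr(with_capture))
print(repr(blank_only))
print(repr(every_line))
```

Note that ^ on its own also prefixes the non-blank lines, so ^$ is the safer choice when only the blank lines should change.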

I am posting this just in case someone reads this forum article at some point and tries the proposed approach of using (.$) or ($) with \1x and runs into problems on blank lines.

Thanks again! Troy :wink:

Posted: Thu Aug 28, 2014 12:07 am
by jussij
Hi Troy,

Thanks for taking the time to post your feedback.

As you mention in your post, the $ pattern can be a little tricky, mainly because of the way Zeus does its line-based searching.

But as you also point out in that post, in most cases there will be some other pattern that will do what is required. It is just a matter of finding it ;)

Thanks again for the detailed post.

Cheers Jussi