Zeus Macro for deleting duplicate lines?
I've got a file of a few thousand email addresses and I'd like to delete all the exact duplicates but I can't figure out how to do it... anyone?
Re: Zeus Macro for deleting duplicate lines?
This remove_lines.lua file should do the trick:
Code: Select all
function key_macro()
  screen_update_disable()

  -- save current search settings
  search_options_save()

  -- save the current cursor details
  cursor_save()

  -- the different available scope options
  SCOPE_FORWARD = 0 -- forward from current cursor
  SCOPE_REVERSE = 1 -- reverse from current cursor
  SCOPE_MARKED  = 2 -- marked region only
  SCOPE_ENTIRE  = 3 -- entire contents of current document
  SCOPE_ALL     = 4 -- all currently open documents

  -- set the required search scope option
  set_search_option("Scope", SCOPE_ENTIRE)

  -- set the other search options
  set_search_option("UseCase"   , 0)
  set_search_option("WholeWord" , 0)
  set_search_option("RegExpress", 1)

  -- 0 = do the search and replace for the next match only
  replace_all = 0

  -- 1 = do the search and replace for all matches found
  --     (this later assignment is the one in effect)
  replace_all = 1

  -- search for any run of empty lines and replace it with a single empty line
  replace("^(\\n$)(\\1$)+" , "\\n", replace_all)

  -- restore the cursor details
  cursor_restore()
  search_options_restore()
  screen_update_enable()
  screen_update()
end

key_macro() -- run the macro
The macro uses a regexp to search for multiple blank lines and replace them with a single blank line:
Code: Select all
replace("^(\\n$)(\\1$)+" , "\\n", replace_all)
Also, if you use this scope you can have the macro run for all open files:
Code: Select all
SCOPE_ALL = 4 -- all currently open documents
IMPORTANT: As always, before running this script make sure you have backup copies of the files (or, better still, have the files under source control) so you can restore them should something go wrong.
Cheers Jussi
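For anyone who wants to try the blank-line cleanup outside Zeus, here is a minimal Python sketch of the same idea. It is not the Zeus regexp from the macro above: the pattern is a plain Python `re` pattern, the function name `collapse_blank_lines` is made up for illustration, and it assumes the text uses `\n` line endings and that a "blank" line is completely empty.

```python
import re

def collapse_blank_lines(text):
    # Three or more newlines in a row mean two or more blank lines in a row;
    # keep exactly two newlines, which leaves one blank line.
    return re.sub(r"\n{3,}", "\n\n", text)

sample = "first\n\n\n\nsecond\n\nthird\n"
print(collapse_blank_lines(sample))
```

Lines that contain only spaces or tabs are left alone by this sketch, so they would need to be stripped first if they should count as blank.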
Re: Zeus Macro for deleting duplicate lines?
If I understand your script correctly, this macro simply removes multiple blank lines from the file... not the duplicated text *on* the lines...?
Example:
fred@foo.com
mary@foo.com
mary@foo.com
mary@foo.com
tom@foo.com
tom2@foo.com
victor@foo.com
victor@foo.com
william@foo.com
william@msn.com
The script I'm looking for would remove two of the mary@foo.com and one victor@foo.com lines in the example above. The rest of the lines are unique.
Re: Zeus Macro for deleting duplicate lines?
Sorry. My mistake for not taking the time to read your original post correctly.
Yes, you are correct: that macro removes multiple empty lines, replacing them with a single empty line.
To remove adjacent duplicate lines, just change the regexp to this:
Code: Select all
-- replace any run of adjacent duplicate lines with a single line of text
replace("^(.*)(\\r?\\n\\1)+$" , "\\1", replace_all)
Cheers Jussi
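To see what the back-reference in that regexp does, here is a small sketch that runs the same pattern through Python's `re` module instead of the Zeus replace function. Zeus may treat some regexp details differently, so take this only as an illustration; `remove_adjacent_duplicates` is a hypothetical helper name, and the sample data is the email list from the post above.

```python
import re

# Same pattern as in the macro, written as a Python raw string.
# re.MULTILINE makes ^ and $ match at the start and end of every line.
ADJACENT_DUPLICATES = re.compile(r"^(.*)(\r?\n\1)+$", re.MULTILINE)

def remove_adjacent_duplicates(text):
    # \1 is the first copy of the line; the repeated copies are dropped
    return ADJACENT_DUPLICATES.sub(r"\1", text)

emails = "\n".join([
    "fred@foo.com",
    "mary@foo.com",
    "mary@foo.com",
    "mary@foo.com",
    "tom@foo.com",
    "tom2@foo.com",
    "victor@foo.com",
    "victor@foo.com",
    "william@foo.com",
    "william@msn.com",
])

result = remove_adjacent_duplicates(emails)
print(result)
```

Because the `(...)+` group repeats, a run of three or more identical lines collapses in a single pass. Duplicates that are not next to each other are not touched.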
Re: Zeus Macro for deleting duplicate lines?
Thanks! I didn't realize that you could refer to a regexp pattern-match string within the <find> string.
Given your updated example, I ended up doing a simple REPLACE, finding "^(.*)\n\1$", replacing with "\1".
Of course, I had to perform the search & replace several times to get rid of the multiple occurrences of a single string, but WOW - thanks!
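The need for several passes comes from the pattern matching only one pair of lines at a time. A rough Python sketch of that behaviour is below; it uses Python's `re` module rather than the Zeus dialog, and `dedupe_by_repeated_passes` is a name invented for this example.

```python
import re

# One line followed by exactly one identical line (no repeat operator)
PAIR = re.compile(r"^(.*)\n\1$", re.MULTILINE)

def dedupe_by_repeated_passes(text):
    passes = 0
    while True:
        # subn returns the new text and the number of replacements made
        new_text, count = PAIR.subn(r"\1", text)
        if count == 0:
            return text, passes
        text = new_text
        passes += 1

text, passes = dedupe_by_repeated_passes(
    "mary@foo.com\nmary@foo.com\nmary@foo.com\ntom@foo.com"
)
print(passes)
print(text)
```

Each pass merges pairs, so three copies of a line take two passes. Adding a repeat to the second half of the pattern, as in `(\r?\n\1)+`, lets one pass handle a run of any length.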
Re: Zeus Macro for deleting duplicate lines?
Yep, running the regexp directly from the Search/Replace dialog is also an option, as the macro replace function calls the same underlying replace functionality.
Another option is the Editor, Sort menu, which also has an option to remove duplicate lines. I use this feature quite a bit.
But one warning about using this Sort option.
As the name suggests, this dialog sorts the contents of the file (or the marked area), and it also has the option to remove any duplicates.
All of those edits are 100% undoable, which means that if you try this on a really big file you might run into issues.
First, the bigger the file, the longer it takes to complete. Second, because the changes are undoable, the operation consumes a lot of memory, so if the file is big enough the editor might even crash when it runs out of memory.
Cheers Jussi
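If the file should keep its original order, or the duplicates are not next to each other, one alternative outside the editor is a small Python sketch like the one below. This is not what the Zeus Sort dialog does internally; `unique_lines` is a hypothetical helper that keeps the first occurrence of each line and drops later exact repeats.

```python
def unique_lines(lines):
    """Yield each line the first time it is seen, in the original order."""
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line

emails = [
    "mary@foo.com",
    "fred@foo.com",
    "mary@foo.com",
    "tom@foo.com",
    "fred@foo.com",
]
print(list(unique_lines(emails)))
```

The set only holds one copy of each distinct line, so a few thousand email addresses fit comfortably in memory. The comparison is exact, so lines that differ in case or trailing spaces count as different.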

Zeus Macro for deleting duplicate lines
I am a bit rusty on my Pascal. I have a need to randomize the lines in a text file. Does anyone have a sample script that might help me get started? I need to do this several times a week so it seems worth the effort to create a script. The files contain between 10 and 20 thousand lines.
Re: Zeus Macro for deleting duplicate lines?
I would use Python to do this, and since Zeus comes with a version of Python, you will not need to install anything additional to make this work.
As an example, the shuffle.py code below reads in an input file, shuffles its contents and writes the shuffled results to an output file:
Code: Select all
from random import shuffle

def ShuffleFile(inputFileName, outputFileName):
    # read in the input file
    with open(inputFileName, 'r') as fileInput:
        lines = fileInput.readlines()

    # make sure the last line ends with a newline, otherwise it
    # would be glued onto another line after the shuffle
    if lines and not lines[-1].endswith('\n'):
        lines[-1] += '\n'

    # shuffle the lines
    shuffle(lines)

    # write the output file (change to overwrite same file)
    with open(outputFileName, 'w') as fileOutput:
        for item in lines:
            fileOutput.write(item)

in_file = "d:/temp/input.txt"
out_file = "d:/temp/output.txt"

# call the shuffle function
ShuffleFile(in_file, out_file)
To run the script from the command line:
Code: Select all
python.exe shuffle.py
To run the script from inside Zeus:
1. Make sure you have an input file called: d:/temp/input.txt
2. Open the shuffle.py file in Zeus
3. Use the Macros, Execute 'shuffle.py' Script menu
4. To see the results open the file called: d:/temp/output.txt
Cheers Jussi
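The comment in shuffle.py mentions overwriting the same file. One cautious way to do that, sketched here as an assumption and not as part of the original script, is to write the shuffled lines to a temporary file in the same folder and only then swap it over the original. The name `shuffle_file_in_place` and the optional `seed` parameter are made up for this example.

```python
import os
import random
import tempfile

def shuffle_file_in_place(path, seed=None):
    """Shuffle the lines of a file, replacing it only after the new copy is written."""
    with open(path, 'r') as source:
        lines = source.readlines()

    # make sure the last line ends with a newline before it moves
    if lines and not lines[-1].endswith('\n'):
        lines[-1] += '\n'

    # a private Random instance; pass a seed to get a repeatable order
    random.Random(seed).shuffle(lines)

    # write to a temporary file in the same folder, then swap it in
    folder = os.path.dirname(os.path.abspath(path))
    handle, temp_path = tempfile.mkstemp(dir=folder, text=True)
    with os.fdopen(handle, 'w') as target:
        target.writelines(lines)
    os.replace(temp_path, path)
```

It would be called as `shuffle_file_in_place("d:/temp/input.txt")`. If the script fails while writing, the original file is still intact, which matters when the job runs several times a week.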
Re: Zeus Macro for deleting duplicate lines?
This random.py script can be used to randomize a marked region of text from inside the editor. To randomize a region of text inside a file, first load the file into the editor, mark the lines of text to be randomized and then run the macro.
Cheers Jussi
Code: Select all
#
#        Name: Random Marked Area
#
#      Author: Jussi Jumppanen
#
#    Language: Python
#
# Description: Using a marked region of text, randomize the marked lines
#              of text and replace them with a random result set.
#
import zeus
import random

def key_macro():
    if zeus.is_document():
        if zeus.is_read_only() == False:
            if (zeus.is_marked() == 1):
                lines = []

                zeus.cursor_save()

                # get the marked area details
                mode   = zeus.get_marked_mode()
                top    = zeus.get_marked_top()
                bottom = zeus.get_marked_bottom()
                left   = zeus.get_marked_left()
                right  = zeus.get_marked_right()
                delta  = bottom - top

                # copy and shuffle the lines
                for index in range(top, bottom + 1):
                    text = zeus.get_line_text(index)
                    lines.append(text)

                random.shuffle(lines)

                zeus.MarkDelete()

                # insert the new lines and restore the markings
                for line in lines:
                    zeus.line_insert(top, line)

                zeus.set_marked_area(mode, top, left, bottom, right)
                zeus.cursor_restore()
            else:
                zeus.message('This requires a marked region of text.')
                zeus.beep()
        else:
            zeus.message('This macro only works for a writable document.')
            zeus.beep()
    else:
        zeus.message('This macro only works for documents.')
        zeus.beep()

key_macro() # run the macro
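The core of that macro, shuffling only the lines between top and bottom and leaving the rest alone, can be tried without Zeus. The sketch below works on a plain Python list and does not use the zeus module at all; `shuffle_region` is a hypothetical helper, and the 0-based inclusive indices are an assumption of this example, not a statement about how Zeus numbers lines.

```python
import random

def shuffle_region(lines, top, bottom, rng=random):
    """Return a copy of lines with lines[top..bottom] (inclusive) shuffled."""
    region = lines[top:bottom + 1]
    rng.shuffle(region)
    return lines[:top] + region + lines[bottom + 1:]

lines = ["a", "b", "c", "d", "e", "f", "g", "h"]
shuffled = shuffle_region(lines, 2, 5, random.Random(0))
print(shuffled)
```

Passing a seeded `random.Random` instance makes the order repeatable for testing; leaving `rng` at its default uses the module-level generator, as the macro does.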