Possible issue with preserving UTF-8 BOM

Posted: Mon Jan 06, 2014 8:49 pm
by aphor
Both "Editor Options" and "Document Options" contain a section for "File Encoding".

If I set:
- "Editor Options"->"File Encoding"="Automatic" with "Add BOM" checked
- "Document Options"->"File Encoding"="Use editor defaults" ("Add BOM" cannot be selected with this option)
then I expected:
- the default ("Automatic") Editor Option would be to preserve existing file format

However, with those settings, the BOM is stripped from UTF-8 encoded files when I save a modified file. Also, when I use "File"->"Save As..." with "Encoding=UTF-8" plus "Add BOM ..." selected, the saved file does not get a BOM.

Explicitly setting the "Document Options" to "UTF-8" with "Add BOM" enabled gives me the behavior I want, but I'm not sure if the original configuration was expected to preserve the BOM.
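For anyone following along, the "preserve existing file format" behavior I expected can be sketched in a few lines of Python. This is only an illustration of the idea, not how Zeus implements it: the function name `save_preserving_bom` and the `add_bom_for_new_files` parameter are hypothetical, and the sketch assumes the file is always written as UTF-8.

```python
import codecs
import os


def save_preserving_bom(path, text, add_bom_for_new_files=False):
    """Write text as UTF-8, keeping a BOM only if the file already had one.

    Hypothetical helper: for a file that does not exist yet, fall back to
    the add_bom_for_new_files default (loosely modeled on "Add BOM").
    Returns True if a BOM was written.
    """
    write_bom = add_bom_for_new_files
    if os.path.exists(path):
        # Look at the first three bytes of the existing file (EF BB BF).
        with open(path, "rb") as f:
            write_bom = f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8

    data = text.encode("utf-8")
    if write_bom:
        data = codecs.BOM_UTF8 + data

    with open(path, "wb") as f:
        f.write(data)
    return write_bom
```

With this sketch, saving over a file that starts with a BOM keeps the BOM, saving over one without a BOM leaves it without, and only brand-new files use the default.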

Cheers!

Posted: Tue Jan 07, 2014 5:46 am
by jussij
aphor wrote: I expected the default ("Automatic") Editor Option would be to preserve existing file format

Yep. That is the way it should work.
aphor wrote: However, with those settings, the BOM is stripped from UTF-8 encoded files when I save a modified file. Also, when I use "File"->"Save As..." with "Encoding=UTF-8" plus "Add BOM ..." selected, the saved file does not get a BOM.
And that does look like a bug :(

Watch this space for a new beta version.

Cheers Jussi

Posted: Tue Jan 07, 2014 6:26 am
by jussij
I'm having trouble replicating this issue :(

What I did was:

1) Set up the Editor Options as Automatic with Add BOM enabled.
2) Set up the XML Document type as UTF-8 with Add BOM enabled.
3) Set up the Text Document type as Automatic.
4) Created a new test1.xml XML file and saved it to disk. Using the HEX dump in the Tools menu, I could see that the test1.xml file had a BOM.
5) Did a File Save As of test1.xml to test1.txt. Using the HEX dump, I could see that the new test1.txt file also had a BOM.
6) Modified the test1.txt file and saved it. That save retained the BOM.
7) Did a Save As of test1.txt to test2.txt. The test2.txt file also retained the BOM.
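The BOM checks in the steps above can also be scripted outside the editor, which helps when comparing several files. The following is a minimal Python sketch; the helper names `has_utf8_bom` and `head_hex` are made up for illustration and are not part of Zeus.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM bytes EF BB BF."""
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def head_hex(path, count=8):
    """Return the first `count` bytes of the file as space-separated hex."""
    with open(path, "rb") as f:
        data = f.read(count)
    return " ".join(f"{b:02x}" for b in data)
```

Calling `has_utf8_bom("test1.txt")` after each save shows whether the BOM survived, and `head_hex` gives a quick look at the leading bytes, much like the first row of a hex dump.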

Cheers Jussi

Posted: Tue Jan 07, 2014 7:25 pm
by aphor
Thanks for looking.

Following your steps, I also cannot repro the issue.

If I use "C# Document Type" as the second file type (where you are using "Text Document Type"), though, I see the behavior I originally reported.

I tried going through the full options set for both the text and c# document types, to account for the difference, but didn't come up with a reason it only repros with *.cs files. I can e-mail you my zeuscs-*.ext files if that would help... it may be that I just changed a setting that had an implication I didn't intend.

Posted: Wed Jan 08, 2014 4:18 am
by jussij
There did seem to be an issue with the BOM and the file Save As action dialog.

The automatic option does make the BOM behaviour a little more complex and difficult to test.

But I think this issue should be fixed in the latest beta found here: http://www.zeusedit.com/z300/ze397r-beta.zip

Cheers Jussi

Posted: Wed Jan 08, 2014 10:57 pm
by aphor
That appears to fix it. I'll update this thread if I see any issues, but I'm 5 files in so far and haven't seen anything wrong.