A small HTML file was created using the most commonly used HTML syntax. Its size was 2402 bytes. It was imported into MS Word 2000, where its size increased nearly tenfold to 23040 bytes. It was then exported as an HTML file of 8217 bytes, still about 3.4 times as big as the original file.
The same file was imported into Word Perfect 7, where its size became 12361 bytes, about five times as big as the original HTML file. It was then exported as an HTML file of 2545 bytes, only slightly bigger (about 6%) than the original file.
The HTML added by MS Word was a vast amount of unnecessary code, roughly 2.4 times as much as the desired content. The only HTML added by Word Perfect was a set of default color choices, though it also shifted the positions of the carriage return characters.
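The ratios quoted above can be checked with a short calculation. The sketch below uses only the byte counts reported in the text. The function names are made up for illustration, and it assumes the "desired content" is the original 2402-byte file, so that everything beyond that size counts as added code.

```python
# Sizes in bytes, as reported in the text.
ORIGINAL = 2402

sizes = {
    "MS Word 2000, native file": 23040,
    "MS Word 2000, HTML export": 8217,
    "Word Perfect 7, native file": 12361,
    "Word Perfect 7, HTML export": 2545,
}


def ratio(size, original=ORIGINAL):
    """How many times as big the file is compared with the original."""
    return size / original


def overhead(size, original=ORIGINAL):
    """Added bytes as a multiple of the original (assumed 'desired') content."""
    return (size - original) / original


for label, size in sizes.items():
    print(f"{label}: {size} bytes, "
          f"{ratio(size):.2f}x the original, "
          f"{overhead(size):.2f}x added")
```

Under that assumption, the MS Word HTML export carries about 2.4 times as much added code as content, while the Word Perfect export adds about 6%.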
The last test was to load the Word Perfect file into MS Word and export it as HTML. The results were not satisfactory. An attempt was also made to load the MS Word file into Word Perfect for export, but Word Perfect was not able to load the MS Word file.
Here are Bill Parke's comments on the text added by MS Word: "99% of what Word adds is stylesheet information, and 99% of that is non-standard (i.e. not CSS1 or 2 -- all that "mso-... " stuff), not to mention all the redundant style information included with various HTML elements throughout the document."
Eric Lescasse said, "Dreamweaver 3 (and Dreamweaver 3 UltraDev) from Macromedia has a Word HTML "cleaner" feature which removes about 80% to 90% of the tags found in Word HTML documents without apparently losing any visual aspect of the document!"
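To give a rough idea of what such a cleanup involves, here is a minimal sketch that strips the non-standard "mso-" declarations from inline style attributes. It is not how Dreamweaver's cleaner works (that is not described here). The function name and regular expressions are hypothetical, and it handles only inline style attributes, not the stylesheet block or other markup that Word adds.

```python
import re

# An inline style attribute, quoted with either ' or " (assumption: Word
# HTML may use either). DOTALL lets the value span several lines.
STYLE_ATTR = re.compile(r"""\s+style\s*=\s*(["'])(.*?)\1""",
                        re.IGNORECASE | re.DOTALL)

# One "mso-...: value;" declaration at the start of the style value or
# directly after a semicolon.
MSO_DECL = re.compile(r"(?:^|(?<=;))\s*mso-[\w-]+\s*:[^;]*;?", re.IGNORECASE)


def strip_mso_styles(html):
    """Remove mso-* declarations from inline style attributes (sketch only)."""
    def clean(match):
        quote, value = match.group(1), match.group(2)
        remaining = MSO_DECL.sub("", value).strip()
        if not remaining:
            return ""  # nothing left, so drop the whole attribute
        return f" style={quote}{remaining}{quote}"
    return STYLE_ATTR.sub(clean, html)


sample = "<p style='margin:0in;mso-pagination:widow-orphan;font-size:12.0pt'>Hi</p>"
print(strip_mso_styles(sample))
```

Because this is plain pattern matching rather than real HTML parsing, it would also rewrite any body text that happens to look like a style attribute. A real cleanup tool would need a proper parser.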