HTML Cleaner

NOTE: Works in Firefox, Chrome, and Opera. DOES NOT WORK WITH MS EDGE.

HTML Cleaner is an online tool you can use to clean up messy code created by saving a Word or Excel document as a web page. It can also clean up the messy code created by a WYSIWYG editor. This includes cleaning up your tables. Tables should only be used for presenting data, and they can carry quite a bit of excess code if created in Word or Excel.

If your pages are filled with unwanted, messy code, it not only undermines the uniform look and feel of your website but is also bad for SEO. So this simple tool can be a big help to you.

How to use the HTML Cleaner?

Opening HTML Cleaner presents you with two input fields - a WYSIWYG editor on the left and a plain text editor on the right. Copy and paste the code you want to clean into the plain text editor on the right. On the left you will see how your code displays. For the purpose of this article, I am using Cemeteries in Washington County as an example. It is a large table, and you can view the page source to see all of the generated code.

Screenshot HTML Cleaner.

Further down on the page is the control panel where you can set up the cleaning preferences. This is a list of checkboxes with the most commonly used operations a web editor might need.

Screenshot cleaning options. 

Make sure that you check the following options when cleaning your tables so that your images and hyperlinks remain. Each option below is followed by before and after samples of the code.

Remove Tag attributes, except src (img tags) and href (anchor tags) - This option preserves your HTML structure but removes every attribute, such as classes, inline styles, and other tag attributes, except the src attribute of image tags and the href attribute of anchor tags. These two are kept because there are separate options for removing links and images from the HTML source.

Original Code
<table style="width: 300px; text-align: center;" border="1" cellpadding="5">
<tbody>
<tr>
<th width="75"><strong>Name</strong></th>
<th colspan="2"><span style="font-weight: bold;">Telephone</span></th>
</tr>
<tr>
<td>John</td>
<td><a href="tel:0123456785">0123 456 785</a></td>
<td><img src="images/check.gif" alt="checked" /></td>
</tr>
</tbody>
</table>
Cleaned Code
<table>
<tbody>
<tr>
<th><strong>Name</strong></th>
<th><span>Telephone</span></th>
</tr>
<tr>
<td>John</td>
<td><a href="tel:0123456785">0123 456 785</a></td>
<td><img src="images/check.gif"/></td>
</tr>
</tbody>
</table> 
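To make the effect of this option concrete, here is a minimal Python sketch of the same idea. It is not the tool's actual implementation. The names AttributeStripper, strip_attributes and KEEP are made up for illustration; the only rule taken from the description above is that src survives on img tags and href survives on anchor tags.

```python
from html import escape
from html.parser import HTMLParser

# Attributes to keep, per tag (assumption based on the option's description).
KEEP = {"img": {"src"}, "a": {"href"}}


class AttributeStripper(HTMLParser):
    """Re-emits the HTML it is fed, dropping every attribute not listed in KEEP."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &nbsp; as written.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _open(self, tag, attrs, closer):
        kept = "".join(
            f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name in KEEP.get(tag, ()) and value is not None
        )
        self.out.append(f"<{tag}{kept}{closer}")

    def handle_starttag(self, tag, attrs):
        self._open(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._open(tag, attrs, "/>")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")


def strip_attributes(html: str) -> str:
    parser = AttributeStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(strip_attributes('<td><img src="images/check.gif" alt="checked" /></td>'))
```

As in the cleaned sample above, the alt attribute is dropped along with everything else, so you may want to add alt text back by hand afterwards.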

Remove Inline Styles - Using inline styles means that the styles of the elements are not assigned by classes or ids but are specified within the HTML tag using the style="..." attribute. With this feature of the HTML Cleaner you can remove all unwanted inline CSS from your document with a single click.

Original Code with Inline Styles
<p style="color: #228;">How are you today?</p>
<table style="width: 300px; text-align: center;" border="1" cellpadding="5">
<tr>
<th width="75"><strong><em>Name</em></strong></th>
<th colspan="2"><span style="font-weight: bold;">Telephone</span></th>
</tr>
<tr>
<td>John</td>
<td><a style="color: #F00; font-weight: bold;" href="tel:0123456785">0123 456 785</a></td>
<td><img width="25" height="30" src="images/check.gif" alt="checked" /></td>
</tr>
</table>
Cleaned Code with Inline Styles removed
<p>How are you today?</p>
<table border="1" cellpadding="5">
<tr>
<th width="75"><strong><em>Name</em></strong></th>
<th colspan="2"><span>Telephone</span></th>
</tr>
<tr>
<td>John</td>
<td><a href="tel:0123456785">0123 456 785</a></td>
<td><img width="25" height="30" src="images/check.gif" alt="checked" /></td>
</tr>
</table>
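If you would like to see what this option amounts to, the following Python sketch does roughly the same thing with two regular expressions. It is an approximation under stated assumptions, not the tool's own code: the function name remove_inline_styles is hypothetical, and the pattern assumes attribute values never contain a > character.

```python
import re

# Matches an opening tag such as <table style="..." border="1">.
TAG = re.compile(r"<[a-zA-Z][^<>]*>")
# Matches a style attribute with a quoted value, plus the whitespace before it.
STYLE_ATTR = re.compile(r"""\s+style\s*=\s*(?:"[^"]*"|'[^']*')""", re.IGNORECASE)


def remove_inline_styles(html: str) -> str:
    """Drop style="..." from every opening tag and leave everything else alone."""
    return TAG.sub(lambda m: STYLE_ATTR.sub("", m.group(0)), html)


print(remove_inline_styles('<p style="color: #228;">How are you today?</p>'))
print(remove_inline_styles('<table style="width: 300px;" border="1" cellpadding="5">'))
```

Presentational attributes such as border, cellpadding, width and height are not inline styles, so they stay in place, just as they do in the cleaned sample above.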

Remove Classes & IDs - Removes all the classes and IDs from your messy HTML code. If you want to rewrite these attributes instead, leave this option unchecked and use the find and replace tool to specify your own id and class names. Be careful with this one if you have used classes or IDs for your layout.

Original Code with Classes
<h1 class="bigHeading">Names</h1>
<table class="greyTable" style="width: 300px;" border="1">
<tbody>
<tr><th class="TableHeading" width="75">Name</th><th class="TableHeading" colspan="2">Telephone</th></tr>
<tr>
<td>John</td>
<td><a class="tapToCall" href="tel:0123456785">0123 456 785</a></td>
<td><img id="checked1234" src="images/check.gif" alt="checked" width="25" height="30" /></td>
</tr>
</tbody>
</table>
Cleaned Code with Classes Removed
<h1>Names</h1>
<table style="width: 300px;" border="1">
<tbody>
<tr><th width="75">Name</th><th colspan="2">Telephone</th></tr>
<tr>
<td>John</td>
<td><a href="tel:0123456785">0123 456 785</a></td>
<td><img src="images/check.gif" alt="checked" width="25" height="30" /></td>
</tr>
</tbody>
</table>
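
Because removing classes and IDs can break a layout, it may help to know exactly which names are about to disappear. The Python sketch below is a hypothetical helper, not part of the HTML Cleaner: remove_classes_and_ids returns the cleaned markup together with a list of the removed names, which you can then check against your stylesheet. It assumes attribute values are quoted and contain no > character.

```python
import re

TAG = re.compile(r"<[a-zA-Z][^<>]*>")
CLASS_OR_ID = re.compile(
    r"""\s+(class|id)\s*=\s*(?:"([^"]*)"|'([^']*)')""", re.IGNORECASE
)


def remove_classes_and_ids(html: str):
    """Return (cleaned_html, removed), where removed lists (attribute, value) pairs."""
    removed = []

    def note_and_drop(match):
        value = match.group(2) if match.group(2) is not None else match.group(3)
        removed.append((match.group(1).lower(), value))
        return ""

    cleaned = TAG.sub(lambda m: CLASS_OR_ID.sub(note_and_drop, m.group(0)), html)
    return cleaned, removed


cleaned, removed = remove_classes_and_ids(
    '<h1 class="bigHeading">Names</h1><img id="checked1234" src="images/check.gif" />'
)
print(cleaned)
print(removed)
```

Any name in the removed list that also appears in your CSS or scripts is a sign that you should leave this option unchecked or restyle the page afterwards.
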
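Remove successive &nbsp;&nbsp; - It's a bad habit to create gaps by typing a run of non-breaking spaces instead of setting a margin or using a table. This option collapses each run of successive &nbsp; entities into a single space; a single &nbsp; between two words is kept. Vertical gaps are a related problem: they are basically paragraphs containing a single non-breaking space (<p>&nbsp;</p>), and this option leaves them in place. Use this feature together with (sometimes) the Remove successive spaces option, and use the Remove tags containing one &nbsp; option if you also want to get rid of the empty lines.

Original Code using successive &nbsp;&nbsp;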
<p>&nbsp;</p>
<p>&nbsp;</p>
<h2>Names</h2>
<p><strong>Name &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Telephone</strong></p>
<p>John&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 0123&nbsp;456&nbsp;78 &nbsp;&nbsp;&nbsp; <img src="images/check.gif" alt="checked" /></p>
<p>&nbsp;</p>
<p>&nbsp;</p>
Cleaned Code with successive &nbsp;&nbsp; removed
<p>&nbsp;</p>
<p>&nbsp;</p>
<h2>Names</h2>
<p><strong>Name Telephone</strong></p>
<p>John 0123&nbsp;456&nbsp;78 <img src="images/check.gif" alt="checked" /></p>
<p>&nbsp;</p>
<p>&nbsp;</p>
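
The behaviour shown in this sample can be approximated in Python with a single regular expression. This is a sketch inferred from the before and after code, not the tool's published logic; the function name collapse_nbsp is made up, and treating ordinary spaces next to a run as part of the run is an assumption.

```python
import re

# Two or more &nbsp; entities in a row, with any ordinary whitespace around them.
NBSP_RUN = re.compile(r"\s*(?:&nbsp;\s*){2,}")


def collapse_nbsp(html: str) -> str:
    """Replace each run of successive &nbsp; entities with one ordinary space."""
    return NBSP_RUN.sub(" ", html)


line = "<p>John&nbsp;&nbsp;&nbsp; 0123&nbsp;456&nbsp;78 &nbsp;&nbsp;&nbsp; end</p>"
print(collapse_nbsp(line))       # runs collapse, single &nbsp; entities stay
print(collapse_nbsp("<p>&nbsp;</p>"))  # a lone &nbsp; paragraph is untouched
```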

Remove Span Tags - Inline text styles are often set by using the span tags. Activating this option will remove all span tags including their styles, classes etc.

Original Code with span tag
<p>Hello <span class="head">World</span>!</p>
Cleaned Code with span tag removed
<p>Hello World!</p>
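
As the sample shows, only the span tags themselves go away while the text inside them stays. A minimal Python sketch of that unwrapping, with a hypothetical function name and assuming no > characters inside attribute values, could look like this:

```python
import re

# Opening or closing span tag, with any attributes; \b keeps <spanner> safe.
SPAN_TAG = re.compile(r"</?span\b[^>]*>", re.IGNORECASE)


def remove_span_tags(html: str) -> str:
    """Delete <span ...> and </span> but keep whatever was between them."""
    return SPAN_TAG.sub("", html)


print(remove_span_tags('<p>Hello <span class="head">World</span>!</p>'))
```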

Remove Comments - There's not much to explain with this feature. It does what the title says: it removes EVERY HTML comment. Everything written between the opening <!-- and the closing --> is considered a comment. Be careful with this one, as you may not want all of your comments removed, especially if you happen to be using a dynamic web template.

Original Code with comment
<p>Lorem ipsum dolor</p> <!-- sit amet -->
Cleaned Code with comment removed
<p>Lorem ipsum dolor</p>
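
The Python sketch below illustrates why the warning about templates matters. It is an approximation of the option, not the tool's code. The function name remove_comments is hypothetical, and the #BeginEditable / #EndEditable markers are only an illustrative example of the comment-style region markers a dynamic web template relies on.

```python
import re

# Non-greedy match from <!-- to the nearest -->, across line breaks.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def remove_comments(html: str) -> str:
    """Delete every HTML comment, including ones a template depends on."""
    return COMMENT.sub("", html)


sample = "<p>Lorem ipsum dolor</p> <!-- sit amet -->"
print(remove_comments(sample).rstrip())

# Illustrative template markers: they are comments too, so they vanish as well.
template = '<!-- #BeginEditable "content" --><p>Body</p><!-- #EndEditable -->'
print(remove_comments(template))
```

Once the markers are gone, the editor can no longer tell which regions of the page are editable, so clean template-based pages with this option unchecked.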

Encode special characters - With this option you can turn the encoding of special HTML characters on or off. While the other source cleaning options take effect when you hit the Clean HTML button, this one is in effect all the time while you modify the visual editor. When this setting is checked or unchecked, the editors refresh immediately and apply the character encoding as selected.

Code with Encoding turned OFF
<p><em>−<sup>b</sup>⁄<sub>R</sub> θ' cos<sup>2</sup> θ − m x'' cos θ − m R θ'' cos<sup>2</sup>θ + m R θ' <sup>2</sup>sin θ cos θ − m g sin θ − <sup>b</sup>⁄<sub>R</sub> θ' sin<sup>2</sup> θ = m R θ'' sin<sup>2</sup>θ + m R θ' <sup>2</sup>sin θ cos θ</em></p>
Code with Encoding turned ON
<p><em>&minus;<sup>b</sup>&frasl;<sub>R</sub> &theta;' cos<sup>2</sup> &theta; &minus; m x'' cos &theta; &minus; m R &theta;'' cos<sup>2</sup>&theta; + m R &theta;' <sup>2</sup>sin &theta; cos &theta; &minus; m g sin &theta; &minus; <sup>b</sup>&frasl;<sub>R</sub> &theta;' sin<sup>2</sup> &theta; = m R &theta;'' sin<sup>2</sup>&theta; + m R &theta;' <sup>2</sup>sin &theta; cos &theta;<br /></em></p>
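
To show what encoding means here, the following Python sketch turns non-ASCII characters such as the minus sign, the fraction slash and theta into the named entities seen in the encoded sample. It uses the standard library's html.entities table. The function name encode_special is made up, and leaving the markup characters <, > and & untouched is an assumption about how the tool behaves.

```python
from html import unescape
from html.entities import codepoint2name


def encode_special(text: str) -> str:
    """Replace non-ASCII characters with named entities where one exists."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)                            # plain ASCII, markup included
        elif code in codepoint2name:
            out.append(f"&{codepoint2name[code]};")   # e.g. &theta;
        else:
            out.append(f"&#{code};")                  # numeric fallback
    return "".join(out)


original = "\u2212<sup>b</sup>\u2044<sub>R</sub> \u03b8' cos<sup>2</sup> \u03b8"
encoded = encode_special(original)
print(encoded)
print(unescape(encoded) == original)  # decoding gives back the original text
```

Encoded text survives being saved or pasted in the wrong character set, which is the usual reason for switching this option on.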

Set new lines and Text indents - Once all the other active cleaning options have run, this option makes the hierarchy of the HTML tags visible through indentation: it inserts the required number of tabs at the beginning of each line for better readability. NOTE: I do not use this one simply because I can accomplish the same thing using Expression Web.
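
For readers who have not seen this kind of output, here is a rough Python sketch of tab-based indentation. It is a simplified illustration, not how the HTML Cleaner or Expression Web does it: the function name indent_html and the short VOID list are assumptions, every tag and text run is put on its own line, and comments containing a > character are not handled.

```python
import re

# A token is either one tag or one run of text between tags.
TOKEN = re.compile(r"<[^>]+>|[^<]+")
TAG_NAME = re.compile(r"<\s*([a-zA-Z0-9]+)")
# Tags that never have a closing tag (deliberately short, illustrative list).
VOID = {"br", "hr", "img", "input", "link", "meta"}


def indent_html(html: str) -> str:
    depth = 0
    lines = []
    for token in TOKEN.findall(html):
        text = token.strip()
        if not text:
            continue                      # skip whitespace between tags
        if text.startswith("</"):
            depth = max(depth - 1, 0)     # closing tag: step back out first
            lines.append("\t" * depth + text)
            continue
        lines.append("\t" * depth + text)
        name = TAG_NAME.match(text)
        opens = name is not None and not text.startswith("<!")
        if opens and name.group(1).lower() not in VOID and not text.endswith("/>"):
            depth += 1                    # opening tag: children go one level deeper
    return "\n".join(lines)


print(indent_html("<table><tr><td>John</td></tr></table>"))
```

Putting inline text on its own line can change how whitespace renders in the browser, so treat output like this as a reading aid.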

Once you click the big Clean HTML button, the checked operations are executed on the current HTML source and the results are displayed. The text editor on the right shows your cleaned code, and the WYSIWYG editor shows how the page displays. NOTE: If you are cleaning a table that has any empty cells, make sure you have NOT TICKED Remove tags containing one &nbsp;

Screenshot Cleaned code.

Next to the settings is the section allocated to the Find and replace tool. You can add up to 12 substitutions which will be executed once the other HTML cleaning operations have been completed.
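
The order matters here: substitutions run after the cleaning, so they see the cleaned markup and not what you pasted. The Python sketch below models that with plain text replacement. The function name apply_substitutions is hypothetical, and treating each entry as literal text instead of a pattern is an assumption; only the limit of 12 comes from the tool.

```python
MAX_SUBSTITUTIONS = 12  # limit stated for the find and replace tool


def apply_substitutions(cleaned_html: str, pairs) -> str:
    """Apply (find, replace) pairs in order to already-cleaned HTML."""
    if len(pairs) > MAX_SUBSTITUTIONS:
        raise ValueError(f"at most {MAX_SUBSTITUTIONS} substitutions are allowed")
    for find, replace in pairs:
        cleaned_html = cleaned_html.replace(find, replace)
    return cleaned_html


cleaned = "<table><tr><th><strong>Name</strong></th></tr></table>"
pairs = [
    ("<table>", '<table class="data">'),  # give the bare table a class of your own
    ("<strong>", ""),                     # th is already bold, drop the extra tags
    ("</strong>", ""),
]
print(apply_substitutions(cleaned, pairs))
```

This is also how you can give a table your own class names after the Remove Classes & IDs option has stripped the generated ones.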

Example of a table that has hyperlinks

Example of a table that has hyperlinks  CLEANED

Example of a table that has hyperlinks  CLEANED and STYLED

IMPORTANT NOTE: Using the free version of the HTML Cleaner requires that you include links in the edited documents. The cleanup tool might add a promotional third-party link to the end of the cleaned documents, and you need to leave this code unchanged as long as you use the free version.