How to Convert a File to HTML

Charlie R. Claywell
Reviewed by Vikki Olds

In a perfect world, anyone could take a word processor document and convert it for their web site with just the click of the mouse. However, anyone who has tried to convert text to HTML knows it's not that simple. The reason is that some of the functions used in a word processor (such as tabs) do not have an equivalent HTML tag. You can convert a text file to HTML, but you may have to give up some control over the layout of your document.

Three Conversion Options

1. Do a "Save As"

The most obvious option if you are using Microsoft Word is to do a 'Save As' and select 'Web Page'. In other word processing programs, choose the HTML option when you save or publish the document. For example, in WordPerfect, go to Publish >> HTML. This may work if the document is very simple, but it is not always effective.

Problems that result with this approach include:

  • This HTML conversion approach can normally handle paragraph breaks, soft returns and minimal formatting like italic, bold or underlining, but it struggles with anything beyond that.
  • The layout will change because width is controlled by a style in HTML. The web version of your document will simply expand to its widest possible size. This can be controlled to a small degree by placing all your text inside a borderless table using percentages for the width, but problems can still occur.
  • Bloated coding can occur, especially with Word documents. Instead of converting paragraphs into plain <p> tags in HTML, Word attaches inline styles to each paragraph. The same is true for other elements like ordered or unordered lists.
  • Word also creates XML files and uses the @font-face rule to further control the HTML rendering of the page. This extra coding can add hundreds of lines of code to a document that may be only one page long inside Word. The extra coding and files can make file management and site maintenance unnecessarily difficult.
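
The bloat described above can often be trimmed after the fact. The following is a minimal Python sketch, not a full cleaner: it assumes the exported markup looks roughly like typical Word output (inline style and class attributes, plus Office-specific tags such as <o:p>), and the function name strip_word_markup is made up for this illustration. Because it relies on regular expressions rather than a real HTML parser, it can also alter body text that happens to look like an attribute, so check the result before publishing.

```python
import re

# Comments, including the conditional comments Word wraps around Office-only markup.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
# Namespaced tags such as <o:p> and </o:p>; the text between them is kept.
NAMESPACED_TAG = re.compile(r"</?[a-zA-Z]+:[^>]*>")
# style, class, and lang attributes with double-quoted, single-quoted, or bare values.
PRESENTATION_ATTR = re.compile(
    r"""\s+(?:style|class|lang)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)


def strip_word_markup(html_text: str) -> str:
    """Remove comments, Office-specific tags, and presentation attributes."""
    html_text = COMMENT.sub("", html_text)
    html_text = NAMESPACED_TAG.sub("", html_text)
    html_text = PRESENTATION_ATTR.sub("", html_text)
    return html_text


# A hypothetical fragment in the style of a Word export.
sample = (
    '<p class=MsoNormal style="margin-bottom:0in">Hello '
    "<b style='color:red'>world</b><o:p></o:p></p>"
)
print(strip_word_markup(sample))  # prints: <p>Hello <b>world</b></p>
```

Once the inline styles are gone, the look of the page can be controlled from a single style sheet instead of from every paragraph.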

Since most documents are not that simple, and instead contain images, bulleted lists, headers, and tables with alternating row or column colors, Save As just doesn't work most of the time.

2. Use a Web Text Editor

Some programs, like Dreamweaver, have a 'convert Word to HTML' option built into the program. If you have Dreamweaver or another web text editor tool like TextPad, you may be able to save yourself considerable time by working backwards. Instead of creating the text document first, create the Web version first.

By creating your document in the editor, you can ensure that it displays correctly in a browser. If you paste the same content into a Word document using the option of retaining source formatting, your Word document should closely resemble the Web page. If you have already created the text document, check whether your web text editor has a feature for importing or converting it.

3. Online Conversion Tools

If you are unable to use a web text editor to create your document, then the next best option is a conversion tool.

  • Word2CleanHTML: With Word2CleanHTML you simply paste your text into the tool, check or uncheck six options below the text (like remove empty paragraphs, etc.) and then click Convert to Clean HTML. Unlike some converters, this one handles bulleted lists fine, but it changes numbered lists to paragraphs. This converter is best used for text without tabs or colors. All colors are removed by the tool because of the way text processors apply color to a font -- and because it is more practical to control colors in HTML with a style sheet.
  • TextFixer: Like Word2CleanHTML, TextFixer lets you paste your content into the tool and just click a submit button. However, unlike Word2CleanHTML, you cannot select any options to further clean up the code. For example, you cannot check an option to get rid of empty paragraph tags. TextFixer also struggles more with tables. In Word2CleanHTML, tables for the most part maintain their overall size for columns, rows and the table itself as the converter attempts to adjust the width for each part of the table. In TextFixer, the table simply collapses on itself because the tool removes any width or padding the table originally had. Neither of these two tools retains the color of the table's rows.
  • ZamZar: This tool seems to be the best online converter. ZamZar does have a couple of weaknesses, though, including the extra steps required to get your converted file. In this tool, you upload your text document and choose the format you want it converted to. Besides HTML, options like JPG, PDF and PNG are offered. After the file is converted, you will receive an email with a link so you can download the new document -- which is a zip file if you have images in the document. Of the three converters listed here, it is the only one that retains colors for fonts and tables -- and it also retains any images in the original text document. Of course, you would need to upload the processed images to your server. The other issue with this tool is that it embeds CSS styles in the HTML document (instead of using a separate style sheet) to maintain the original look of the document.

Plain or Rich Text

You can use either a plain text or rich text format.

  • You will probably end up with cleaner HTML code if you use a plain text editor like Notepad, but the tradeoff is that, unless you use it a lot, you will have to learn it.
  • Most people rely on Microsoft Word or other rich text editors to process desktop documents and publications.

Regardless of which text tool you use, conversion of some elements will always be shaky at best. Elements like tabs, cell spacing and cell padding in tables, and bulleted or numbered lists tend to be the hardest to convert cleanly and consistently.
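
To see why plain text tends to convert more cleanly, here is a rough Python sketch of a plain-text-to-HTML conversion. It is only an illustration under a few assumptions: blank lines separate paragraphs, single line breaks are soft returns, and each tab becomes a fixed run of non-breaking spaces because HTML has no tab tag. The function name text_to_html and the tab_width parameter are invented for this example.

```python
import html
import re


def text_to_html(text: str, tab_width: int = 4) -> str:
    """Turn plain text into <p> blocks, with <br> for soft returns."""
    # Normalize Windows line endings and trim blank lines at both ends.
    normalized = text.replace("\r\n", "\n").strip("\n")
    paragraphs = []
    # One or more blank lines mark a paragraph break.
    for block in re.split(r"\n\s*\n", normalized):
        lines = []
        for line in block.split("\n"):
            # Escape &, < and > so they display as text instead of markup.
            escaped = html.escape(line)
            # Approximate each tab with non-breaking spaces (an assumption,
            # not a true equivalent of a word processor tab stop).
            lines.append(escaped.replace("\t", "&nbsp;" * tab_width))
        paragraphs.append("<p>" + "<br>\n".join(lines) + "</p>")
    return "\n".join(paragraphs)


sample = "First line\nsecond line\n\n\tIndented & <tagged>"
print(text_to_html(sample))
```

The tab handling shows the compromise involved: the indent survives, but it no longer lines up with tab stops the way it did in the original document.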

Additional Options

If your text document is complicated and does not easily convert to HTML, you still have two more options.

  • The first and easiest is to convert the text file to a PDF. Doing this should create an exact duplicate of your text document. Simply upload the PDF to the server and link to it.
  • If you prefer not to force a user to open a PDF, then the last option is to hand-code the document in HTML. By hand-coding the content, you can ensure that all applicable styles are attached to each text element and that the page will display exactly as planned.
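
For the hand-coding route, it can help to start from a bare page skeleton and fill in the content yourself. The Python sketch below shows one possible skeleton; the file name styles.css, the class name document, and the function name build_page are placeholders chosen for this example, not requirements.

```python
import html

# A minimal page skeleton; all styling is left to an external style sheet.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>{title}</title>
<link rel="stylesheet" href="{stylesheet}">
</head>
<body>
<main class="document">
{body}
</main>
</body>
</html>
"""


def build_page(title: str, body_html: str, stylesheet: str = "styles.css") -> str:
    """Wrap hand-written body markup in the skeleton above."""
    return PAGE_TEMPLATE.format(
        title=html.escape(title),  # the title is plain text, so escape it
        stylesheet=html.escape(stylesheet, quote=True),
        body=body_html,  # assumed to be HTML you have already written
    )


page = build_page("Q&A Notes", "<h1>Notes</h1>\n<p>First paragraph.</p>")
print(page)
```

With this layout, a rule in the style sheet such as .document { max-width: 40em; } would keep the text from stretching to the full width of the browser window.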

Conversion Decisions

When preparing to convert a document to HTML, there are a number of decisions to make. The option that is best for you is based on the complexity of coding and the tools available to you. Whether you rely on your word processing program, use a text editor or online conversion tool or resort to hand-coding, it is possible to get the results that you want.
