Fanzing's Web Design Tutorial

Part 5: Document Conversion

Do I Have To Fix My 180-Page Epic By Hand?

So far, so good.  You know how to mark up your documents with paragraph tags, bold and italic tags, and maybe a few of the fancier tags that I told you about in Part 3.  Just one thing: Do you HAVE to type a <p> tag and a </p> tag around all 590 paragraphs in that long Batman story you wrote last month?

I know the feeling.  BELIEVE me, I know the feeling.   Because for the last two years, I've had to prepare hundreds of unformatted texts…some of them dozens of pages long!  And if there's anything I'm qualified to teach, it's shortcuts for making documents of all kinds (text, Word, WordPerfect, rich text format) into HTML pages.

For a long time, my best method was to take Rich Text Format files and open them in Arachnophilia, which did the conversion for me.  Rich Text Format is a file format that falls somewhere between plain text and a full word processor document.  Unlike TXT files, it saves formatting like bold, italics and fonts.  Quite a lifesaver that was.  Unfortunately, Arachnophilia is not only glitchy, it has bad habits.  It does not put <p> tags around the paragraphs, for one thing.  Also, at times it closes tags in the wrong order.  Both can be problems.

HOWEVER, all is not lost.  I still recommend using Arachnophilia for working with the code.   And I'd much rather have to "clean up" the code on a 50-page epic than do the HTML coding from scratch!

 

So Many Documents, So Many Methods!

I'll admit, I don't really know where to begin with this one.  First, all of you have different kinds of documents.  Each kind of document has several different ways to convert it.  And depending on the conversion used, you may need to clean up the code.

Some of you write in text (using Notepad or similar).  Others work in Microsoft Word.  Others use a different type of word processor and then use Rich Text Format to convert it.  And on top of all that, you can transform all of these documents from one kind to another!

Hence, I'm not going to make any kind of claims to completeness or authority. After you get the hang of this and decide to experiment, you may come up with an easier way to do things.  There may be other conversion programs out there that do a better job.  If so, let me know!

Here are some of the options for each.

 

IF YOU HAVE A MICROSOFT WORD DOCUMENT (.DOC), YOU CAN DO ANY OF THE FOLLOWING:

  • Open the document in Microsoft Word. Under "File", choose "Save as HTML".  This option is available in Word '97 and later.  See comments below about using this method. 
  • Save the document as an .RTF file, then open it in Arachnophilia.
  • Save the document as a text file, then follow one of the options for transforming text files.  You may wonder "Why would anyone use this method?"  See the discussion below about unnecessary font tags.

Dreamweaver 3 automatically cleans up MS-Word converted HTML. Thus, I'd recommend using the MS-Word conversion if at all possible.

IF YOU HAVE A TEXT (.TXT) FILE, YOU CAN DO ANY OF THE FOLLOWING:

  • Open the TXT file in Microsoft Word.  Under "File", choose "Save as HTML".  This option is available in Word '97 and later.  I'd recommend this option if possible.
  • Code the HTML by hand in Notepad.
  • Open the TXT file in Arachnophilia and add the tags yourself, using some of the handy buttons and menus to simplify the process.  Then save it as an HTML file.
  • Open the TXT file in Arachnophilia.  Then go to the "File" menu, select "New" and choose "RTF File".  Copy the TXT file's contents into the RTF file window.  Save the RTF file and close both windows.  Then open that RTF file in Arachnophilia and the file will be converted into HTML.  From there, tidy up the code (it will not have paragraph tags; Arachnophilia uses two break tags instead).
  • Open the TXT file in Netscape Composer, then "Save".  It will be saved as HTML and converted.  While I don't like WYSIWYG editors, particularly for complex documents, this one doesn't appear to be all that bad for basic conversion.  Not a lot of clutter.  Like Arachnophilia, the paragraph tags need work.  However, Composer at least puts in the opening <p> instead of break tags.
  • Look for the options offered by other programs.  I'm sure there are many other methods possible.  Perhaps you'll find the perfect one that formats paragraphs properly!
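
If you happen to have Python on your machine, a few lines of script can also do the paragraph tagging for a plain text file.  This is only a minimal sketch and not one of the tools I've used above; the function name text_to_paragraphs is made up for illustration, and it assumes your paragraphs are separated by blank lines.

```python
import html
import re


def text_to_paragraphs(text: str) -> str:
    """Wrap each blank-line-separated block of plain text in <p>...</p>."""
    # Normalize Windows and old Mac line endings so the split below works.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # One or more blank lines (which may contain stray spaces) separate paragraphs.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Join hard-wrapped lines and collapse runs of whitespace.
        body = " ".join(block.split())
        if body:
            # Escape &, < and > so they show up as text instead of being read as tags.
            paragraphs.append("<p>" + html.escape(body, quote=False) + "</p>")
    return "\n\n".join(paragraphs)
```

Read your TXT file into a string, pass it to the function, and save the result inside the body of an HTML page.  Headers, bold and italics still have to be marked up by hand.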

IF YOU HAVE A RICH TEXT FORMAT DOCUMENT (.RTF), YOU CAN DO ANY OF THE FOLLOWING:

  • Open the document in Microsoft Word. Under "File", choose "Save as HTML".  This option is available in Word '97 and later.  I'd recommend this option if possible.
  • Other word processors may also offer the "Save As HTML" option; you'll need to evaluate the resulting code for yourself.
  • Open the document in Arachnophilia and it will be converted to HTML. Then you just have to clean it up.
  • Save the document as a text file, then follow one of the options for transforming text files.  You may wonder "Why would anyone use this method?"  See the discussion below about unnecessary font tags.
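
Since Arachnophilia's conversion separates paragraphs with two break tags instead of paragraph tags, that particular fix can be scripted.  Here is a rough Python sketch, assuming the break tags look like <br><br> (possibly in capitals or with a line break between them) and that you feed it only the story text, not the whole HTML file.  The name breaks_to_paragraphs is hypothetical.

```python
import re

# Two or more <br> tags in a row (any letter case, whitespace allowed after each)
# are treated as the gap between two paragraphs.
PARAGRAPH_GAP = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)


def breaks_to_paragraphs(body: str) -> str:
    """Turn text separated by double break tags into <p>...</p> paragraphs."""
    paragraphs = []
    for chunk in PARAGRAPH_GAP.split(body):
        chunk = chunk.strip()
        if chunk:
            paragraphs.append("<p>" + chunk + "</p>")
    return "\n\n".join(paragraphs)
```

A single break tag is left alone, so intentional line breaks inside a paragraph survive.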

 

What Is The Goal?

I have often heard web designers railing against this design program or that program…particularly the WYSIWYG programs…because they put in all kinds of junk code or unnecessary formatting.  This is true, although I think some of the worst offenders of old are improving (and some designers are particularly snooty about anything that isn't coded by hand 100%!). 

Since you may have just learned HTML in the earlier pages of this tutorial, you probably aren't as tough a critic as a techie would be.  You may be wondering how you can tell "good code" from "bad code".  More to the point, since we have specific needs for our Fanzing web design, what are we concerned about?

My biggest concerns are:

  • That all paragraphs have a <p> at the beginning and a </p> at the end.
  • That HTML tags are turned off in the proper sequence (as we discussed in Part 2, on the basics of HTML).
  • That the font tags be removed or used at a minimum.
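
The first two concerns can be checked by a script rather than by eye.  The following is a minimal sketch using Python's standard html.parser module; the names NestingChecker and check_nesting are my own inventions, and the list of tags that never get closed is deliberately short.  It reports tags that are closed out of order and tags (including <p>) that are never closed.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag, so the checker skips them.
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input"}


class NestingChecker(HTMLParser):
    """Record tags that are closed out of order or have no opening tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of tags that are currently open
        self.problems = []   # human-readable messages

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        line, _ = self.getpos()
        if not self.open_tags:
            self.problems.append(f"line {line}: </{tag}> has no opening tag")
        elif self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(
                f"line {line}: </{tag}> closed while <{self.open_tags[-1]}> was still open"
            )
            # Drop the most recent matching opener, if any, so checking can continue.
            if tag in self.open_tags:
                idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
                del self.open_tags[idx]


def check_nesting(source: str) -> list:
    """Return a list of problems found; an empty list means the tags pair up."""
    checker = NestingChecker()
    checker.feed(source)
    checker.close()
    for tag in checker.open_tags:
        checker.problems.append(f"<{tag}> is never closed")
    return checker.problems
```

For example, feeding it <p><b><i>text</b></i></p> produces one complaint about the </b>, while <p><b><i>text</i></b></p> produces none.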

 

About Font Tags

The first two concerns have been covered in depth already. What of this third requirement about font tags?  I didn't even teach you about font tags, right?  Unfortunately, while you don't need to learn about font tags for our purposes, you will probably encounter them…because some of these conversion methods add them!

You shouldn't HAVE to work with font tags.  Size, color and typeface are the three main things changed by the font tag.  Now, on some web sites, those change all the time.  It makes the web site more colorful.  However, at Fanzing we're generally working with one font face, one font color and one font size.  In the past, this was just a good idea.  Now that I'm using a style sheet to make all paragraphs all over the site look the same, the <font> tags can really undermine that.  Thus, font tags must be removed if the software puts them in.  (And if you put them in intentionally, such usage needs to be rare.)

If you're converting from plain text, no problem.  None of the conversion methods should add font tags.

If you're converting from RTF and Word formats directly to HTML, you should be on the lookout for them.  Some documents (it really depends on the way you first typed the document) may have a font tag at the beginning and the ending of the piece.  Others may end up with tons of icky code throughout, requiring a massive clean-up.
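
If a converted file does come out littered with font tags, a search-and-replace can strip them while leaving the text and the other tags alone.  Below is a small Python sketch of that idea; strip_font_tags is a hypothetical name, and the pattern assumes no font attribute contains a ">" character.

```python
import re

# Matches <font ...> and </font> in any letter case; the text between them is kept.
FONT_TAG = re.compile(r"</?font\b[^>]*>", re.IGNORECASE)


def strip_font_tags(source: str) -> str:
    """Remove opening and closing font tags but keep everything inside them."""
    return FONT_TAG.sub("", source)
```

Bold, italic and paragraph tags pass through untouched, so the formatting you actually want is preserved.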

One trick, which you should apply at your discretion, is to select "Save As…" and change the document from RTF or DOC to a TXT file.  This gets rid of all formatting!   For most authors, who only have paragraphs and headers at the most, that's not a bad sacrifice; that means you make the conversion, then put the proper markup around the headers and you're in business!  However, if you've got a lot of bolds, italics and other formatting already in the story, I wouldn't use this method.  I'd do another form of clean-up.

 

Clean-up is such a multi-pronged project that I thought I'd give it its own section.  Continue on to Part 6: Clean-Up.

Michael Hutchison is Editor-In-Chief of Fanzing.com. He is the world's biggest Elongated Man fan and runs the only EM fan site. He lives in Rochester, MN.
AIM: Fanzinger
ICQ: 70101007

 
 
This tutorial is © 2000 Michael Hutchison
Fanzing is not associated with DC Comics.
All DC Comics characters, trademarks and images (where used) are ™ DC Comics, Inc.
DC characters are used here in fan art and fiction in accordance with their generous "fair use" policies.

Fanzing site version 7.3
Updated 3/12/2009