Introduction
This essay will get you started writing
HTML (Hypertext Markup Language)
so you can put up a website on the Internet. HTML is
a platform-independent technique for distributing formatted documents via the web. The
bold, italic etc. in the document you are reading now
(presumably on a web browser), is encoded by embedding tags like <B> and
<I>. This markup scheme works on any brand of computer and allows websites to
send all information in a standard way, without having to worry about what brand of
computer the recipient has, or what software she uses.
Learning HTML
I found the easiest way to
learn HTML
is to look at other people’s examples, to cut and paste from them and to
experiment by fiddling the various parameters to see what the visual effects
are. Trying to make sense of W3C (World Wide Web Consortium)
HTML
standards requires a PhD in computer language theory. Anyone can play monkey and copy
from other sites.
Documentation
It might help to buy an introductory text,
but HTML
is so simple that it is probably not necessary.
Proofreading
To ensure your HTML
will work when you upload it to a website server, use all lowercase filenames and
directory names. Avoid spaces and punctuation (except _) in names. Make sure you use
relative links to your own files — no C: or
file://localhost/C|/ style absolute references.
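As a rough illustration of the naming and linking advice above, here is a sketch of a link that should survive uploading next to one that will not. The file and directory names are made up for the example.

```html
<!-- Good: lowercase names, no spaces, path relative to the current page -->
<a href="images/site_map.html">site map</a>

<!-- Bad: absolute reference to a local disk, plus spaces and capitals.
     It works on your own machine but breaks once uploaded. -->
<a href="file://localhost/C|/My Website/Site Map.html">site map</a>
```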
The W3C offers an
online validator for the various HTML
dialects. It is a sort of Lint for HTML.
It can help ensure your HTML
will work properly on browsers other than the one you tested it on. The
W3C also controls the various HTML
standards.
For speed and control, I use
CSE HTMLValidator to check my web pages offline and in batches.
When it comes to HTML4 (Hypertext Markup Language v 4) and CSS style
sheets, browser support is shaky. TopStyle will help you keep track of which features work on which
browsers.
This section just summarises the tags. Sometimes what I
tell you here will be enough to use them. It is really just designed to jog your
memory. Look elsewhere for details or experiment!
Class
It is easiest to use the class attribute, then specify what
it means in your CSS style sheet.
The tags used to apply CSS (Cascading Style Sheets) styles
Use of CSS class
Start Tag | End Tag | Description
<span class="strawberry"> | </span> | Encloses text of the strawberry class. The browser will look in the style sheet to figure out what attributes should be applied to strawberry text, perhaps a size, colour, alignment, font etc.
<div class="strawberry sweet"> | </div> | Applies two classes to the same tag, here both strawberry and sweet, to a group of lines. The browser will look in the style sheet to figure out what attributes should be applied to strawberry and to sweet text. Note the class names are separated by a space, not a comma, so the attribute value must be quoted.
<ul class="strawberry"> | </ul> | Like a regular ul except everything in it should be treated as strawberry text.
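To make the link between the class attribute and the style sheet concrete, here is a minimal sketch. The rules given for strawberry and sweet are invented for illustration; your own style sheet would define whatever properties suit your site.

```html
<style type="text/css">
/* hypothetical rules for the two example classes */
.strawberry { color: #c00000; font-weight: bold; }
.sweet      { font-style: italic; }
</style>

<p>Plain text with a <span class="strawberry">strawberry</span> word.</p>

<!-- both rules apply here, so this text is red, bold and italic -->
<div class="strawberry sweet">Both classes apply to this block.</div>
```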
Groups
Groups, Lists, Glossaries
Start Tag | End Tag | Description
<ol> | </ol> | ordered (numbered) list
<ul> | </ul> | unordered (bulleted) list. Consider using a borderless table with a column of titles and a column of detail instead. The bullets themselves don’t give much additional information.
<menu> | </menu> | menu list, more compact than ul.
<li> | </li> | list item
<dl> | </dl> | definition list
<dt> | </dt> | term being defined
<dd> | </dd> | definition of the term
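Here is a small sketch of how the list tags in the table fit together. The content is invented; only the nesting matters.

```html
<!-- numbered list with a bulleted list nested inside one item -->
<ol>
  <li>Write the page</li>
  <li>Check it
    <ul>
      <li>validate the HTML</li>
      <li>try it in a second browser</li>
    </ul>
  </li>
  <li>Upload it</li>
</ol>

<!-- glossary: each dt is a term, each dd its definition -->
<dl>
  <dt>HTML</dt>
  <dd>Hypertext Markup Language</dd>
  <dt>CSS</dt>
  <dd>Cascading Style Sheets</dd>
</dl>
```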
Line Breaks
Line Breaks
Tags | Description
<br> or <br /> | new line, no extra space. To prepare for XHTML (extensible Hypertext Markup Language), it is better to use <br />.
<br clear=all /> | gets past any flow-around illustration.
<p> or <p>…</p> | new paragraph, blank line inserted. To prepare for XHTML, it is better to use <p>…</p> surrounding each paragraph.
<p align=center>…</p> | centre each line
<hr> or <hr /> | horizontal rule. To prepare for XHTML, it is better to use <hr />.
Font selectors (without CSS)
These are mostly obsolete in HTML5 (Hypertext Markup Language v 5). You must use CSS
styles and classes instead.
Font Colours and Size
Start Tag | End Tag | Appearance | Description
<h1> | </h1> | sample | major heading
<h6> | </h6> | sample | most minor heading
<b> | </b> | sample | bold, discouraged. Use <strong>
<strong> | </strong> | sample | bold, formerly <b>
<i> | </i> | sample | italic, discouraged. Use <em>
<em> | </em> | sample | italic (emphasised), formerly <i>
<tt> | </tt> | sample | typewriter font
<u> | </u> | sample | underlined.
<pre> | </pre> | sample | preformatted
<font size="+3"> | </font> | sample | or 3 for absolute size rather than increase
<font color=red> | </font> | sample | see choice of colours.
<td bgcolor="#ffeedd"> | </td> | sample | see choice of colours.
<font face="Comic Sans MS,Helvetica,sans-serif"> | </font> | sample | suggest a typeface. The user must have it installed; you can specify alternates in order of preference. You should end with one of the CSS default fonts serif, sans-serif or monospace.
<big> | </big> | sample | shorthand for <font size="+1">
<small> | </small> | sample | shorthand for <font size="-1">
<dfn> | </dfn> | sample | definition
<em> | </em> | sample | emphasis, usually renders as italic.
<cite> | </cite> | sample | book titles
<code> | </code> | sample | program listings
<kbd> | </kbd> | sample | keystrokes
<samp> | </samp> | sample | computer status messages
<sup> | </sup> | 2 | superscript. You can also use entities like &sup2; to get ²
<strong> | </strong> | sample | strong emphasis, usually rendered as bold.
<var> | </var> | sample | to be replaced by a specific value when used. Typically rendered in italics.
<u> | </u> | sample | underline
<address> | </address> | sample | email address, possibly street address.
<blockquote> | </blockquote> | Prematurely Aged Hawks: Why do hawks so often look as if they are undergoing chemotherapy? Are they overcompensating for personal frailty? Does all that paranoia prematurely age them? ~ Roedy (1948-02-04 age:70) | long quotation
Nested Quote Escaping
When you need to specify a " in the middle of
text, you can use the &quot; entity to represent
the character, or just leave it unescaped as ".
When you need to specify a " in the middle of a
quoted attribute value, you have two ways of handling it:
- <param name=say value="He said &quot;All your base are belong to us&quot;">
- <param name=say value='He said "All your base are belong to us"'>
You can insert comments in your
HTML
that are ignored. You can insert them in the text but not inside tags. Anything
between <!-- … --> is ignored. Comments can
span lines. <!> is a dummy comment. Avoid the
string -- inside comments. I always put a space after
<!-- and before -->,
though it is not strictly necessary. Note the asymmetry of the start and end tags.
Comments are not treated as white space, e.g. grand<!-- comment -->stand will render as grandstand, not
grand stand.
a <!-- large --> dog
a<!-- large --> dog
a <!-- large -->dog
will all render the same way: a dog but of course
a<!-- large -->dog
will render as adog.
Anchors
typical
<h%><a name="GLOSSARY"></a>Roedy’s Java Glossary</h
Rules for making up anchor names:
- The HTML 4.01 spec section 6.2
states that anchor names must begin with a letter a-z,
A-Z and may be followed by any number of letters,
digits 0-9, hyphens -,
underscores _, colons :,
and periods (.). So a leading underscore _
is not permitted. All-numeric anchors are not permitted.
- Anchor names are compared case-sensitively, but two names that differ only in case, such as Apple and APPLE, may not appear in the same document. To be safe, always consistently use UPPER CASE.
- For indirect links, use a trailing underscore _ on
the anchor name, e.g. MAC_, so you will know not to refer
people to those dummy anchors, but rather directly to the
HTML
that the anchor points to. For example, the HTML
at anchor MAC_ may say see
MACINTOSH. People are lazy and will get angry if
you send them to anchor MAC_ rather than anchor
MACINTOSH because they have to do an extra click to get
to MACINTOSH where the real information is.
Sun flagrantly ignores these rules and uses space, ( ) and comma in its anchors
in generated Javadoc.
Links To
typical
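As a rough sketch, a link to a named anchor can look like the following. The GLOSSARY anchor name is taken from the Anchors section; the heading level and the file name jgloss.html are assumptions made up for the example.

```html
<!-- define the anchor, HTML 4 style -->
<h2><a name="GLOSSARY"></a>Glossary</h2>

<!-- link to it from elsewhere on the same page -->
<a href="#GLOSSARY">jump to the glossary</a>

<!-- link to it from another page; jgloss.html is a hypothetical file -->
<a href="jgloss.html#GLOSSARY">glossary on another page</a>
```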
Colours
The above
colour chart shows Netscape’s 133 standard colours and
HTML
3.2’s 16 standard colours. It shows the colours displayed eight ways, (colour
on white, colour on black, black on colour, white on colour) both using alpha names
and hex names. You can check out your browser for Netscape colour compatibility. It
shows the standard Netscape 8.0 alpha names such as aliceblue and also the hex, RGB (Red Green Blue)
and HSB (Hue Saturation Brightness)
values both as HTML
and raw ASCII (American Standard Code for Information Interchange)
text.
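To show the difference between an alpha name and a hex name, here is a small sketch that asks for the same colour both ways. aliceblue corresponds to #f0f8ff; the table and span are just convenient things to colour.

```html
<table>
  <tr>
    <!-- the same background colour, first by alpha name, then by hex RGB -->
    <td bgcolor="aliceblue">alpha name</td>
    <td bgcolor="#f0f8ff">hex name</td>
  </tr>
</table>

<!-- the CSS way of asking for the same colour -->
<span style="background-color: #f0f8ff">CSS hex</span>
```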
<fig> <caption>
<credit> <overlay> are not supported in the big three browsers.
Indenting
One way to indent a block is to wrap it in:
<ul>...</ul>
Happily, the technique also nests properly.
The official way is to use CSS styles.
<div style="padding-left: 30px">...</div>
If you want to pad all paragraphs, put this in the head
section or in the style sheet.
<style type="text/css">
p {padding-left: 30px}
</style>
Or do
<style type="text/css">
p.leftpad {padding-left: 30px}
</style>
and then it will only indent paragraphs that are marked like this:
<p class="leftpad">...</p>
Unfortunately, the technique does not handle nesting. <div> does,
however.
To indent only the first line of a paragraph:
<p style="text-indent: 30px">
I like to create my web pages with a text editor, but if you want a tool to help
you compose HTML
in a more WYSIWYG (What You See Is What You Get)
style try one of these:
- SlickEdit: This is what I use,
a general purpose editor. It has HTML
and Java syntax colouring, which makes it much easier to avoid typos, and Java and
HTML
beautifying to nicely indent the tags.
- DreamWeaver: A professional tool
for creating HTML and HTML
with embedded JSP (Java Server Pages), PHP (PHP: Hypertext Preprocessor),
ASP (Active Server Pages), ColdFusion etc.
- TopStyle: helps you compose and manage
your style sheets.
- Microsoft FrontPage 2000
- The Quoter Amanuensis will
automatically convert HTML’s
reserved characters to their &amp; &eacute; &copy; etc. form.
You just copy your text to the clipboard, click CONVERT on the amanuensis, then
paste the converted text into your document in a text editor such as SlickEdit.
Decorating
Here are some tools for snazzing up your web pages
with graphics or other gizmos:
Special Character Entity Codes
Here are special characters and
the codes you must key to get them in HTML.
The official term for them is entities. These work no
matter what encoding the browser is using. If you want codes that change as the
encoding changes see this ASCII
table.
The entities such as &divide; and &#x2713; only
work in HTML, not Java. In Java, you get at the exotic characters
by encoding them in hex in your strings like this: \u00f7\u2713 to produce ÷
✓. See String literals for
more details.
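For a feel of how entities look in practice, here is a short sketch mixing named and numeric forms. The sentence content is invented; the entities themselves are standard.

```html
<p>
<!-- &lt; and &gt; keep the browser from treating this as a real tag -->
&lt;b&gt; shows up as literal text, not as bold.<br>
<!-- named entities -->
Fish &amp; chips, caf&eacute;, &copy; 2005<br>
<!-- the same check mark as a hex and as a decimal numeric entity -->
6 &divide; 2 = 3 &#x2713; &#10003;
</p>
```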
For the official set of W3C
entities see this definitive list of entities.
Please tell me about any omissions in my own tables.
Last revised/verified: 2005-06-24
Standard Prelude
Here is a standard header you could use on all
your HTML
files, with the obvious modifications.
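As a stand-in, here is a minimal sketch of such a header, assuming HTML 4.01 with the loose (Transitional) DTD. It is not my actual header. Every name, file and content value in it is a placeholder you would replace, and the ICRA label is left out.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<!-- encoding, language and style sheet language -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en">
<meta http-equiv="Content-Style-Type" content="text/css">
<!-- descriptive meta tags; all values are placeholders -->
<meta name="Author" content="Jane Example">
<meta name="Copyright" content="(c) 2005 Jane Example">
<meta name="Description" content="One or two sentences summarising the page.">
<meta name="Generator" content="hand-written">
<meta name="Keywords" content="words, that, appear, in, the, page">
<!-- links; the file names are hypothetical -->
<link rel="stylesheet" type="text/css" href="example.css">
<link rel="home" href="index.html">
<link rel="prev" href="previous.html">
<link rel="next" href="next.html">
<link rel="icon" type="image/png" href="images/icon16.png">
<title>Example Page Title</title>
</head>
<body>
<p>Page content goes here.</p>
</body>
</html>
```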
The header, link, meta and body tags have the following purposes:
DOCTYPE
says which level of HTML
you are using.
- If you use a strict DTD (Document Type Definition),
you must have absolutely perfect HTML,
something quite unlikely unless your website is generated by a computer
program.
- If your page has frames, use a frameset
DTD.
- For most websites, use a more relaxed loose
DTD.
- If you want an XML-like markup with strictly matching tags and more
consistency than regular HTML,
use one of the XHTML
DTDs.
Here are the DOCTYPEs I use
Which DOCTYPE to
use? You need a different one depending on whether you are doing plain,
frameset or strict HTML. Note DOCTYPE HTML PUBLIC should be upper case.
Here are the possible DOCTYPES:
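For reference, these are the standard W3C declarations for HTML 4.01 and XHTML 1.0 Strict, shown together only for comparison. A real page carries exactly one of them, as its very first line, without the labelling comments used here.

```html
<!-- strict -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

<!-- loose (transitional), for most websites -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<!-- frameset, for pages with frames -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">

<!-- XHTML 1.0 strict; note that here html is lower case -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
```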
Visual Slick Edit
HTML
tidy will improperly convert your
DTDs (Document Type Definitions)
to lower case.
link href
point to your master CSS
style sheet.
link meta
ICRA rating for sexual
and violent content. Replaces old PICS (Platform for Internet Content Selection)
labels.
link rev
give author’s email address. Used by Lynx browser. Watch
out for spam harvesters. I don’t use it.
link home
Allows keystroke shortcut to get to home page.
link icon
icon for the task bar. Must be 16x16.
IE (Internet Explorer)
ignores it in favour of the site level favicon.ico. Netscape ignores it entirely. Opera, Firefox
and SeaMonkey will scale it to 16 × 15 which
will usually badly distort it. The URL (Uniform Resource Locator)
is relative to the current page.
link prev
specify the logically previous page so browser can do a better
back/up. You can also have a next link to the next logical page to help the
browser navigate. There are all kinds of special links you can embed for browsing
via a toolbar.
Content-Language
the language you wrote the page in,
usually en for English. See the list of
possible language codes.
Content-Style-Type
Says we are using
CSS
as our stylesheet language. Without it embedded style markup might be ignored or
misinterpreted.
Content-Type
Tells which character set encoding you are using, e.g. which accented letters you
use. Normally iso-8859-1 for Latin-1, sometimes
iso-8859-3 for Esperanto, or Unicode (or utf-8 for compact
Unicode) to handle nearly all languages, including Arabic, Thai and Chinese.
icra-label
encodes your document for adult content, if any.
Author
author’s name.
Copyright
© copyright of the web page.
Description
Summary of what’s on the page. Used by search engines
to summarise the page for people. Should be more detailed than the title.
Generator
What tool you used to create the HTML.
Keywords
important words in the document. Used by search engines to
direct people to your document. Avoid using concept words that don’t
actually appear in the document. Search engines think you are cheating.
title
Title used to display a hit by a search engine. Also used on the
window bar when the page is displayed.
body tags
are for CSS-challenged browsers that don’t understand the
style sheet.
Body Tag Details
If you use CSS,
you don’t need any attributes on the <body> tag.
Body tag fields for pre-CSS browsers.
Body Tags
Field | Function
bgcolor | background RGB in hex
background | *.gif to use as a tiled background. For repeating backgrounds, it is best to make the *.gif 25 pixels high even if in theory 1 pixel would do. That speeds rendering even though it slows download.
text | ordinary text colour
link | clickable links not yet visited
vlink | links that have already been visited
alink | active link text, what you just clicked.
marginheight | pixels in border top/bottom. Oddly 0 is an illegal value. Leave marginheight out altogether for 0.
marginwidth | pixels in border left/right. Oddly 0 is an illegal value. Leave marginwidth out altogether for 0.
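Here is a sketch of the colour fields from the table used on a body tag, followed by roughly equivalent CSS rules for comparison. The colour values are arbitrary examples.

```html
<!-- pre-CSS: colours given as fields on the body tag -->
<body bgcolor="#ffffff" text="#000000" link="#0000ff" vlink="#800080" alink="#ff0000">
<p>Some text with <a href="index.html">a link</a>.</p>
</body>

<!-- roughly the same effect with CSS; this belongs in the head section -->
<style type="text/css">
body      { background-color: #ffffff; color: #000000; }
a:link    { color: #0000ff; }
a:visited { color: #800080; }
a:active  { color: #ff0000; }
</style>
```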
To make such code easier to maintain, you could use SSI (Server Side Includes). You then need maintain only one copy of the
standard headers. I do it with HTML
macros, which do not require any code on the server. If you look at my
HTML
source, you can see how I generate standard headers. I have not yet released the
tools I use to the public. If you are curious how I generate my website using macros,
see the HTML4 entry.
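For the SSI route, a page might pull in a shared header with a directive like the one below. This is a sketch assuming an Apache-style server with includes enabled (often only for files ending in .shtml). The path /includes/header.html is hypothetical.

```html
<!-- the server replaces this directive with the contents of the named file -->
<!--#include virtual="/includes/header.html" -->
<p>Page-specific content goes here.</p>
```

If the server does not process includes, the directive is just an ordinary comment and the browser shows nothing in its place.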
CSS
Styles
In general, avoid inline styles and use style sheets. That way you can
make a change to your style sheet and your whole website is instantly updated. If you
really want to do it, look around the net at people’s headers.
CSS
Style Sheets
Have a look at my style sheet. Looking at an
example will probably explain nearly everything you need to know. TopStyle makes it easier to edit style sheets, but it
won’t explain what the million little fields are for. You will probably figure
it out much faster that way than by reading documentation prepared for and by
mathematicians.
HTML Used Only In Emails
HTML | Purpose
<x-sigsep><p></x-sigsep> | Separates body from the signature.
 | Indicates some text in your email, in this case http://mindprod.com/ that looked to Eudora like a URL, that it has automatically converted into a link.
 | Used for quoting in replies. Rendered as nested vertical bars down the left margin.
 | An embedded, as opposed to attached, image. The image itself is made into a hidden attachment that is base64 encoded. HTML email cannot presume web access when mail is written, so ordinary <img tags can’t be used.
Tips
- You may read that it is a wise idea to replace links to ../index.html with ./.
The problem is, if you do it, your links will not work if your files are loaded from local hard disk. Instead of jumping to the home page, you will get a directory listing of the home directory.
Learning More
- Visibone make a series of
cheat sheets, both online and printed. Some show the full Unicode set and the
extended Unicode &xxx codes. Others show CSS.
Others show XHTML. Some are colour charts.