Introduction
This essay will get you started writing HTML (Hypertext Markup Language)
so you can put up a website on the Internet.
HTML is a platform-independent technique for distributing formatted documents via
the web. The bold, italic etc. in the document you are
reading now (presumably in a web browser) is
encoded by embedding tags like <B> and <I>. This markup
scheme works on any brand of computer and
allows websites to send all information in a standard way, without
having to worry about what brand of computer
the recipient has, or what software she uses.
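As a minimal sketch of what such embedded tags look like (the sentence is only an illustration, not taken from any real page):

```html
<p>This word is <b>bold</b> and this one is <i>italic</i>.</p>
```

The browser removes the tags and displays the enclosed words in bold and italic.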
Learning HTML
I found the easiest way to learn HTML is to look at other people’s examples, to cut
and paste from them, and to experiment by fiddling with the various
parameters to see what the visual effects are. Trying to make sense of the
W3C (World Wide Web Consortium) HTML
standards requires a PhD in computer language theory. Anyone can
play monkey and copy from other sites.
Documentation
It might help to buy an introductory text, but HTML
is so simple that it is probably not necessary.
Proofreading
To ensure your HTML
will work when you upload it to a website server, use all lowercase
filenames and directory names. Avoid spaces and punctuation (except _) in names. Make sure you
use relative links to your own files,
with no C: or file://localhost/C|/ style absolute references.
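The naming and linking advice above can be sketched as follows. The filenames and directory layout are hypothetical, chosen only to illustrate the rules.

```html
<!-- Assumed layout: this page and the files it links to live on the same site. -->

<!-- Good: relative links, lowercase names, no spaces. -->
<a href="glossary.html">glossary</a>
<a href="images/site_map.gif">site map</a>
<a href="../index.html">home</a>

<!-- Bad: absolute reference to a local drive; it breaks once uploaded. -->
<a href="file://localhost/C|/website/Glossary.html">glossary</a>
```

A relative link keeps working when the whole directory tree is copied to the server, because it never mentions where the tree lives.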
The W3C offers an online validator for the various HTML
dialects. It is sort of a Lint for HTML.
It can help ensure your HTML
will work properly on browsers other than the one you tested it on.
The W3C also controls the various HTML standards.
For speed and control, I use
CSE HTMLValidator to check my web pages offline and in batches.
When it comes to HTML4 and CSS
style sheets, browser support is shaky.
TopStyle will help you keep track of which features work on which browsers.
The Tags
This section just summarises the tags. Sometimes what I tell you here
will be enough to use them. It is really
just designed to jog your memory. Look elsewhere for details or experiment!
Class
It is easiest to use the class attribute, then specify what it means in
your CSS style sheet.
| Use of CSS (Cascading Style Sheets) class |
| Start Tag | End Tag | Description |
| <span class=strawberry> | </span> | Encloses text of the strawberry class. The browser will look in the style sheet to figure out what attributes should be applied to strawberry text, perhaps a size, colour, alignment, font etc. |
| <div class="strawberry sweet"> | </div> | Applies two classes to the same tag, here both strawberry and sweet, to a group of lines. The browser will look in the style sheet to figure out what attributes should be applied to strawberry and to sweet text, perhaps a size, colour, alignment, font etc. Note the class names are separated by a space, not a comma, so the attribute value must be quoted. |
| <ul class=strawberry> | </ul> | Like a regular ul, except everything in it should be treated as strawberry text. |
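Here is a small sketch of how the class attribute and the style sheet fit together. The rules chosen for strawberry and sweet are hypothetical; any CSS properties would do.

```html
<!-- Hypothetical rules; pick whatever attributes suit your site. -->
<style type="text/css">
.strawberry {color: #cc0033; font-family: serif}
.sweet {font-style: italic}
</style>

<p>Plain text with a <span class="strawberry">strawberry phrase</span> in it.</p>

<!-- Two classes: separated by a space, so the value must be quoted. -->
<div class="strawberry sweet">
<p>Both sets of rules apply to everything in this block.</p>
</div>

<ul class="strawberry">
<li>every item here is strawberry text</li>
</ul>
```

Changing a rule in the style block changes every tag that carries that class.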
Groups
| Groups, Lists, Glossaries |
| Start Tag | End Tag | Description |
| <ol> | </ol> | ordered numbered list |
| <ul> | </ul> | unordered bulleted list. Consider using a borderless table with a column of titles and a column of detail instead. The bullets themselves don’t give much additional information. |
| <menu> | </menu> | menu list, more compact than ul. |
| <li> | </li> | list item |
| <dl> | </dl> | dictionary list |
| <dt> | </dt> | dictionary term being defined |
| <dd> | </dd> | dictionary definition |
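A short sketch of the three main list types from the table. The item text is made up for illustration.

```html
<ol>
<li>first step</li>
<li>second step</li>
</ol>

<ul>
<li>an item</li>
<li>another item</li>
</ul>

<dl>
<dt>anchor</dt>
<dd>a named spot in a page that links can jump to</dd>
<dt>entity</dt>
<dd>a code such as &amp;copy; that stands for a special character</dd>
</dl>
```

Each dt term is followed by the dd that defines it.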
Line Breaks
| Line Breaks |
| Tags | Description |
| <br> or <br /> | new line, no extra space. To prepare for XHTML (Extensible Hypertext Markup Language), it is better to use <br />. |
| <br clear=all /> | gets past any flow-around illustration. |
| <p> or <p>…</p> | new paragraph, blank line inserted. To prepare for XHTML, it is better to use <p>…</p> surrounding each paragraph. |
| <p align=center>…</p> | centre each line |
| <hr> or <hr /> | horizontal rule. To prepare for XHTML, it is better to use <hr />. |
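The line-break tags above might be combined like this. It is a sketch in the XHTML-ready style the table recommends; photo.gif is a hypothetical image file.

```html
<p>First line of an address<br />
second line, with no extra space above it.</p>

<p><img src="photo.gif" align="left" alt="hypothetical illustration" />
Text that flows around the illustration.
<br clear="all" />
Text that starts below the illustration.</p>

<hr />

<p align="center">A centred paragraph.</p>
```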
Font selectors (without CSS)
| Font Colours and Size |
| Start Tag | End Tag | Appearance | Description |
| <h1> | </h1> | sample | major heading |
| <h6> | </h6> | sample | most minor heading |
| <b> | </b> | sample | bold, cf. strong |
| <i> | </i> | sample | italic, cf. em |
| <tt> | </tt> | sample | typewriter font |
| <u> | </u> | sample | underlined |
| <pre> | </pre> | sample | preformatted |
| <font size=+3> | </font> | sample | relative increase in size; use 3 for an absolute size rather than an increase |
| <font color=red> | </font> | sample | see choice of colours. |
| <td bgcolor="#ffeedd"> | </td> | sample | see choice of colours. |
| <font face="Comic Sans MS,Helvetica,sans-serif"> | </font> | sample | suggests a typeface. The user must have it installed; you can specify alternates in order of preference. You should end with one of the CSS default fonts serif, sans-serif or monospace. |
| <big> | </big> | sample | shorthand for <font size=+1> |
| <small> | </small> | sample | shorthand for <font size=-1> |
| <dfn> | </dfn> | sample | definition |
| <em> | </em> | sample | emphasis, usually rendered as italic. |
| <cite> | </cite> | sample | book titles |
| <code> | </code> | sample | program listings |
| <kbd> | </kbd> | sample | keystrokes |
| <samp> | </samp> | sample | computer status messages |
| <sup> | </sup> | 2 | superscript. You can also use entities like &sup2; to produce ². |
| <strong> | </strong> | sample | strong emphasis, usually rendered as bold. |
| <var> | </var> | sample | to be replaced by a specific value when used. Typically rendered in italics. |
| <address> | </address> | sample | email address, possibly street address. |
| <blockquote> | </blockquote> | Diderot took the ground that, if orthodox religion be true Christ was guilty of suicide. Having the power to defend himself he should have used it. ~ Robert Green Ingersoll (born: 1833-08-11 died: 1899-07-21 at age: 65) | long quotation |
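A few of the tags from the table in use. This is only a sketch; the sentences are invented for illustration.

```html
<h1>Major heading</h1>

<p>Press <kbd>Ctrl+S</kbd> to save <var>filename</var>.
The program replies <samp>1 file saved</samp>.</p>

<p>E = mc<sup>2</sup>, also written E = mc&sup2;.</p>

<p><font face="Comic Sans MS,Helvetica,sans-serif" size="+1" color="red">A
suggested typeface, one size larger, in red.</font></p>

<pre>
column one    column two
</pre>
```

The font face list ends with sans-serif so that a browser lacking the first two typefaces still has a fallback.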
Nested Quote Escaping
When you need to specify a " in the
middle of text, you can use the &quot;
entity to represent the character, or just leave it unescaped as ".
When you need to specify a " in the
middle of a quoted attribute value, you
have two ways of handling it:
- Use the &quot; entity inside a double-quoted value: <param name=say value="He said &quot;All your base are belong to us&quot;">
- Enclose the value in single quotes instead: <param name=say value='He said "All your base are belong to us"'>
Comments
You can insert comments in your HTML
that are ignored. You can insert them in the text but not inside tags.
Anything between <!-- … --> is ignored.
Comments can span lines. <!> is a
dummy comment. Avoid the string -- inside
comments. I always put a space after <!-- and before -->,
though it is not strictly
necessary. Note the asymmetry of the start and end tags.
Comments are not treated as white space, e.g. grand<!-- … -->stand will render as grandstand,
not grand stand.
a <!-- large --> dog
a<!-- large --> dog
a <!-- large -->dog
will all render the same way: a dog, but of course
a<!-- large -->dog
will render as adog.
Anchors
typical <h2><a name="GLOSSARY"></a>Roedy’s Java Glossary</h2>
Rules for making up anchor names:
- The HTML 4.01 spec section 6.2
states that anchor names must begin with a letter a-z,
A-Z, and may be followed by any number
of letters, digits 0-9, hyphens -, underscores _,
colons :, and periods (.).
So a leading _ is not
permitted. All-numeric anchors are not permitted.
- Anchor names are supposed to be case insensitive. Apple
is supposed to be treated
as the same as APPLE. To be safe, always
consistently use UPPER CASE.
- For indirect links, use a trailing underscore _
on the anchor name, e.g.
MAC_, so you will know not to refer people
to those dummy anchors, but rather directly
to the HTML that the anchor points to. For example, the HTML
at anchor MAC_ may say see MACINTOSH.
People are lazy and will get angry if you send them to
anchor MAC_ rather than anchor MACINTOSH,
because they have
to do an extra click to get to MACINTOSH
where the real information is.
Sun flagrantly ignores these rules and uses space, ( ) and comma in its
anchors in generated Javadoc.
Links To
typical
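As a sketch of how anchors and links pair up: the filename glossary.html is hypothetical, and the GLOSSARY anchor is the one from the Anchors section.

```html
<!-- In glossary.html (hypothetical filename): define the anchor. -->
<h2><a name="GLOSSARY"></a>Roedy’s Java Glossary</h2>

<!-- From the same page: jump to the anchor. -->
<a href="#GLOSSARY">glossary</a>

<!-- From another page on the same site: relative link plus anchor. -->
<a href="glossary.html#GLOSSARY">glossary</a>

<!-- To another site entirely: an absolute URL. -->
<a href="http://mindprod.com/">mindprod.com</a>
```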
Colours
The colour chart shows Netscape’s
133 standard colours and HTML
3.2’s 16 standard colours. It shows the colours displayed eight ways
(colour on white, colour on black, black on colour, white on colour),
both using alpha names and hex names. You
can check out your browser for Netscape colour compatibility. It shows
the standard Netscape 8.0 alpha names such
as aliceblue and also the hex,
RGB (Red Green Blue) and
HSB (Hue Saturation Brightness)
values, both as HTML and raw
ASCII (American Standard Code for Information Interchange) text.
Figures
<fig> <caption> <credit>
<overlay> are not supported in the big
three browsers.
Indenting
A quick way to indent is to wrap the text in list tags without any list items:
<ul>...</ul>
Happily, the technique also nests properly.
The official way is to use CSS styles.
<div style="padding-left: 30px">...</div>
If you want to pad all paragraphs, put this in the head
section or in the style sheet.
<style type="text/css">
p {padding-left: 30px}
</style>
Or do
<style type="text/css">
p.leftpad {padding-left: 30px}
</style>
and then it will only indent paragraphs that are marked like this:
<p class="leftpad">...</p>
Unfortunately, the class technique does not handle nesting. <div>
does, however.
To indent only the first line of a paragraph, use text-indent:
<p style="text-indent: 30px">
Composition Tools
I like to create my web pages with a text editor, but if you want a
tool to help you compose HTML in a more
WYSIWYG (What You See Is What You Get)
style, try one of these:
- SlickEdit: This is what I use, a general purpose editor.
It has HTML
and Java syntax colouring, which makes it much easier to avoid typos,
and Java and HTML beautifying
to nicely indent the tags.
- DreamWeaver: A professional tool for creating HTML,
and HTML with embedded
JSP (Java Server Pages),
PHP (PHP: Hypertext Preprocessor),
ASP (Active Server Pages),
ColdFusion etc.
- Netscape Composer: part of Netscape Communicator.
- TopStyle: helps you compose and manage your style sheets.
- Microsoft FrontPage 2000
- The Quoter Amanuensis will automatically convert HTML’s
reserved characters to their &amp; &eacute;
&copy; etc. form. You just copy your text
to the clipboard, click CONVERT on the amanuensis, then paste the
converted text into your document in a text
editor such as SlickEdit.
Decorating
Here are some tools for snazzing up your web pages with graphics or
other gizmos:
Special Character Entity Codes
Here are special characters and the codes you must key to get them in HTML.
The official term for them is
entities. These work no matter what encoding
the browser is using. If you want codes
that change as the encoding changes, see this ASCII table.
Entities such as &divide; and &#x2713;
only work in HTML, not Java. In
Java, you get at the exotic characters by encoding them in hex in your
strings like this: \u00f7\u2713
to produce ÷ ✓. See String literals for more details.
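A brief sketch of entities in running text. The sentences are invented; the comment after each line shows the characters the browser should display.

```html
<p>3 &lt; 5 &amp; 5 &gt; 3</p>      <!-- 3 < 5 & 5 > 3 -->
<p>caf&eacute; &copy; 2005</p>       <!-- café © 2005 -->
<p>6 &divide; 2 = 3 &#x2713;</p>     <!-- 6 ÷ 2 = 3 ✓ -->
```

The last one, &#x2713;, is a numeric character reference: it names the character by its hex Unicode value rather than by a mnemonic.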
For the official set of W3C
entities, see this definitive
list of entities. Please tell me about any
omissions in my own tables.
Last revised/verified: 2005-06-24
Standard Prelude
Here is a standard header you could use on all your HTML
files, with the obvious modifications.
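The author's own header is not reproduced here, so what follows is only a sketch of what such a prelude might look like, assuming an HTML 4.01 page with the loose DTD and Latin-1 encoding. Every filename, title and content value is a placeholder. The ICRA label and link rev entries are left out because their values are site-specific.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta http-equiv="Content-Language" content="en">
<meta http-equiv="Content-Style-Type" content="text/css">
<title>Hypothetical Page Title</title>
<meta name="Author" content="Your Name">
<meta name="Copyright" content="Your Name">
<meta name="Description" content="One or two sentences summarising the page.">
<meta name="Keywords" content="words, that, appear, in, the, page">
<meta name="Generator" content="hand-written in a text editor">
<link rel="stylesheet" type="text/css" href="style.css">
<link rel="home" href="index.html">
<link rel="prev" href="previous.html">
<link rel="next" href="next.html">
<link rel="icon" type="image/x-icon" href="favicon.ico">
</head>
<body>
<h1>Hypothetical Page Title</h1>
</body>
</html>
```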
The header, link, meta and body tags have the following purposes:
DOCTYPE
says which level of HTML you are using.
- If you use a strict
DTD (Document Type Definition),
you must have absolutely perfect HTML, something
quite unlikely unless your website is generated by a computer program.
- If your page has frames, use a frameset DTD.
- For most websites, use a more relaxed loose DTD.
- If you want an XML-like markup with strictly matching tags and
more consistency than regular HTML, use
one of the XHTML DTDs.
Which DOCTYPE to use? You need a
different one depending on whether you are doing plain, frameset or
strict HTML. Note that DOCTYPE HTML
PUBLIC should be upper case.
Visual SlickEdit HTML
tidy will improperly convert your
DTDs (Document Type Definitions)
to lower case.
link href
points to your master CSS style sheet.
link meta
ICRA rating for sexual and
violent content. Replaces old
PICS (Platform for Internet Content Selection) labels.
link rev
gives the author’s email address. Used by the Lynx browser. Watch out
for spam harvesters. I don’t use it.
link home
allows a keystroke shortcut to get to the home page.
link icon
icon for the task bar. Must be 16x16.
IE (Internet Explorer)
ignores it in favour of the site level favicon.ico.
Netscape ignores it entirely. Opera, Firefox and SeaMonkey will
scale it to 16 × 15, which will usually
badly distort it. The
URL (Uniform Resource Locator)
is relative to the current page.
link prev
specifies the logically previous page so the browser can do a better
back/up. You can also have a next link to the
next logical page to help the browser navigate. There are all kinds
of special links you can embed for browsing via a toolbar.
Content-Language
the language you wrote the page in, usually en
for English. See the list of possible language codes.
Content-Style-Type
says we are using CSS
as our stylesheet language. Without it, embedded style markup might
be ignored or misinterpreted.
Content-Type
tells which character set encoding
you are using, e.g. which
accented letters you use. Normally iso-8859-1
for Latin-1, sometimes
iso-8859-3 for Esperanto, or Unicode
(or utf-8 for compact Unicode) to handle
nearly all languages, including Arabic,
Thai and Chinese.
icra-label
encodes your document for adult content, if any.
Author
author’s name.
Copyright
© copyright of the web page.
Description
summary of what’s on the page. Used by search engines to
summarise the page for people. Should be more
detailed than the title.
Generator
what tool you used to create the HTML.
Keywords
important words in the document. Used by search engines to direct
people to your document. Avoid using
concept words that don’t actually appear in the document,
or search engines will think you are cheating.
title
title used to display a hit by a search engine. Also used on the
window bar when the page is displayed.
body tags
are for CSS-challenged browsers that don’t understand the
style sheet.
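For reference, the strict, loose and frameset choices described under DOCTYPE correspond to the following standard W3C declarations for HTML 4.01, followed by two XHTML 1.0 variants. This is a listing to pick from, not the author's own selection.

```html
<!-- Use exactly one of these as the very first line of the page. -->

<!-- HTML 4.01 strict -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

<!-- HTML 4.01 loose (transitional) -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<!-- HTML 4.01 frameset -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">

<!-- XHTML 1.0 strict -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!-- XHTML 1.0 transitional -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
```

Note that the XHTML declarations use lower-case html, while the HTML 4.01 ones use upper-case HTML.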
Body Tag Details
If you use CSS,
you don’t need these attributes on the <body> tag.
| Body Tags |
| Field | Function |
| bgcolor | background RGB in hex |
| background | *.gif to use as a tiled background. For repeating backgrounds, it is best to make the *.gif 25 pixels high even if in theory 1 pixel would do. That speeds rendering even though it slows download. |
| text | ordinary text colour |
| link | clickable links not yet visited |
| vlink | links that have already been visited |
| alink | active link text, what you just clicked. |
| marginheight | pixels in border top/bottom. Oddly 0 is an illegal value. Leave marginheight out altogether for 0. |
| marginwidth | pixels in border left/right. Oddly 0 is an illegal value. Leave marginwidth out altogether for 0. |
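A sketch of a body tag using the fields in the table. The colours and the tile.gif filename are arbitrary placeholders.

```html
<!-- Colours and the tile filename are hypothetical. -->
<body bgcolor="#ffffff" background="tile.gif" text="#000000"
      link="#0000ff" vlink="#800080" alink="#ff0000"
      marginheight="5" marginwidth="5">
<p>Page content goes here.</p>
</body>
```

With a style sheet, roughly the same effect would come from rules for body, a:link, a:visited and a:active instead of these attributes.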
To make such code easier to maintain, you could use SSI
(Server Side Includes). You then need to maintain only one copy of the standard headers.
I do it with HTML
macros, which do not
require any code on the server. If you look at my HTML
source, you can see how I generate standard headers. I
have not yet released the tools I use to the public. If you are curious
how I generate my website using macros,
see the HTML entry.
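For the Server Side Includes route, a page might look like the sketch below. It assumes an Apache-style server with includes enabled for the page (often this means a .shtml extension), and the /includes/ paths are hypothetical. It does not show the author's macro approach, whose tools are unreleased.

```html
<!-- Assumes an Apache-style server with includes enabled; the paths are hypothetical. -->
<!--#include virtual="/includes/header.html" -->
<h1>Page-specific content</h1>
<p>Only this part differs from page to page.</p>
<!--#include virtual="/includes/footer.html" -->
```

The server replaces each include directive with the contents of the named file before sending the page, so the browser never sees the directive.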
CSS Styles
In general, avoid inline styles, and use style sheets. That way you can
make a change to your style sheet and your
whole website is instantly updated. If you really want to do it, look
around the net at people’s headers.
CSS Style Sheets
Have a look at my style sheet. Looking
at an example will probably explain
nearly everything you need to know. TopStyle
makes it easier to edit
style sheets, but it won’t explain what the million little fields
are for. You will probably figure it out
much faster that way than by reading documentation prepared for and by
mathematicians.
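As a tiny illustration of the difference between an inline style and a style sheet rule: this is not the author's style sheet, and the rule values are made up.

```html
<!-- Inline style: must be repeated on every tag that needs it. -->
<p style="padding-left: 30px; color: #cc0033">An indented red paragraph.</p>

<!-- Style sheet rule: defined once, reused through the class attribute. -->
<style type="text/css">
body {background: #ffffff; color: #000000; font-family: Verdana,Helvetica,sans-serif}
h1 {font-size: 150%; text-align: center}
p.warning {padding-left: 30px; color: #cc0033}
</style>

<p class="warning">An indented red paragraph.</p>
<p class="warning">Another one, with no repeated styling.</p>
```

In practice the rules would live in a separate file referenced by a link tag in the header, so that one edit updates the whole site.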
| HTML Used Only In Emails |
| HTML | Purpose |
| <x-sigsep><p></x-sigsep> | Separates body from the signature. |
| | Indicates some text in your email, in this case http://mindprod.com/, that looked to Eudora like a URL and that it has automatically converted into a link. |
| | Used for quoting in replies. Rendered as nested vertical bars down the left margin. |
| | An embedded, as opposed to attached, image. The image itself is made into a hidden attachment that is base64 encoded. HTML email cannot presume web access when mail is written, so ordinary <img> tags can’t be used. |
Learning More
- dot HTML online HTML guide.
- dot CSS online CSS guide.
- Visibone makes a series of cheat sheets, both online and
printed. Some show the full Unicode set and the extended Unicode
&xxx codes. Others show CSS.
Others show XHTML.
Some are colour charts.