
RFC Conversion to HTML


Disclaimer

This essay does not describe an existing computer program, just one that should exist. This essay is about a suggested student project in Java programming. This essay gives a rough overview of how it might work. I have no source, object, specifications, file layouts or anything else useful to implementing this project. Everything I have prepared to help you is right here.

This project outline is not like the artificial, tidy little problems you are spoon-fed in school, where all the facts you need are included, nothing extraneous is mentioned, and the answer is fully specified, along with hints to nudge you toward a single expected canonical solution. This project is much more like the real world of messy problems, where it is up to you to fully define the end point, or a series of ever more difficult versions of this project, and to research the information yourself to solve them.

Everything I have to say to help you with this project is written below. I am not prepared to help you implement it or to give you any additional materials. I have too many other projects of my own.

Though I am a programmer by profession, I don’t do people’s homework for them. That just robs them of an education.

You have my full permission to implement this project in any way you please and to keep all the profits from your endeavour.

Please do not email me about this project without reading the disclaimer above.

The Problem

If you don’t know what an RFC (Request for Comments) is, check out the RFC Java glossary entry. RFC Internet standards are a collected set of vanilla text documents that are difficult to use for several reasons:

Your Task

Have a look at a typical RFC: RFC 1945, an obsolete standard for HTTP (Hypertext Transfer Protocol). This RFC is in HTML, but crudely converted from the text file.

To convert these documents properly to HTML you need to:

  1. Identify title lines, ASCII (American Standard Code for Information Interchange) diagrams, ordinary text, etc. and enclose them in suitable HTML tags. Remove ASCII underlining and replace it with CSS (Cascading Style Sheets) equivalents. For ASCII diagrams to work, <pre> alone is not sufficient. You also need to suggest fonts in your CSS style sheet that are perfectly monospace.
  2. Consider writing a program that converts ASCII diagrams to *.png images with nicely formed arrows and proper corners.
  3. Replace text references to other RFCs with <a href> tags.
  4. Cross-reference other RFCs that mention this RFC with a links section at the bottom.
  5. Identify obsoleted RFCs. Don’t discard them. They are often more informal than the lawyerly replacements and hence easier to understand. However, you need to mark them as obsolete and insert links at the top to the replacing RFCs. You also want links back to the obsoleted RFCs.
  6. Build an index of just the words in the various paragraph title lines. Provide the index as a set of HTML pages, one per letter of the alphabet, and also as an Applet to help you find what you need, even if you are viewing the set of RFCs offline. Order the links so that the RFC that mentions a word most often comes first. Intelligently prune the index.
  7. Create a hyperlinked Table of Contents for each RFC so you can rapidly jump to the part of it of interest.
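
Step 3 above is probably the easiest place to start. The following is a minimal sketch, not a finished converter. It assumes that references appear on a single line in forms such as "RFC 1945", "RFC1945" or "RFC-1945", and that each converted document lives in a file named rfcNNNN.html. The class name RfcLinker and that file naming scheme are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: turns plain-text mentions such as "RFC 2616" into HTML links. */
final class RfcLinker {
    // Matches "RFC 2616", "RFC2616" and "RFC-2616". The word boundaries keep
    // the pattern from firing inside longer tokens.
    private static final Pattern RFC_REF = Pattern.compile("\\bRFC[ -]?(\\d{1,5})\\b");

    /** Escapes the three characters that matter in HTML body text. */
    static String escapeHtml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Escapes the text, then wraps each RFC mention in a link to rfcNNNN.html (an assumed naming scheme). */
    static String linkify(String plainText) {
        Matcher m = RFC_REF.matcher(escapeHtml(plainText));
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int number = Integer.parseInt(m.group(1)); // parseInt drops leading zeros
            String link = "<a href=\"rfc" + number + ".html\">" + m.group() + "</a>";
            m.appendReplacement(out, Matcher.quoteReplacement(link));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(linkify("See RFC 1945 & RFC2616 for <details>."));
    }
}
```

Escaping happens before linking so that the generated tags are not themselves escaped. References split across a line break ("RFC" at the end of one line, the number on the next) are not handled here and would need the manual touchup or a paragraph-level pass.
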
This is an unusual project in that you don’t have to do it all by computer. You can always do manual touchup for tricky sections. You would like it to be as automated as possible so that you can quickly add in new RFCs. Happily, RFCs never change, so you don’t have to worry about maintaining your touchups.
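
For the automated part of steps 1 and 7, one possible heuristic is sketched below. It assumes that a section heading starts in column 0 with a section number such as "1." or "3.2", while body text is indented. That assumption will not hold for every RFC, which is where the manual touchup comes in. The class RfcToc, the "section-" anchor scheme and the CSS class names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: finds numbered section headings and builds a linked table of contents. */
final class RfcToc {
    /** One detected heading, e.g. number "1.1" and title "Purpose". */
    static final class Heading {
        final String number;
        final String title;

        Heading(String number, String title) {
            this.number = number;
            this.title = title;
        }

        String anchor() { return "section-" + number; }     // hypothetical id scheme
        int depth() { return number.split("\\.").length; }  // "3.2" has depth 2
    }

    // Assumed heuristic: a heading starts in column 0 with a section number,
    // then whitespace, then the title. Indented lines never match.
    private static final Pattern HEADING =
        Pattern.compile("^(\\d+(?:\\.\\d+)*)\\.?\\s+(\\S.*?)\\s*$");

    static List<Heading> findHeadings(List<String> lines) {
        List<Heading> found = new ArrayList<>();
        for (String line : lines) {
            Matcher m = HEADING.matcher(line);
            if (m.matches()) {
                found.add(new Heading(m.group(1), m.group(2)));
            }
        }
        return found;
    }

    static String escapeHtml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Emits a flat list; the depthN classes let a style sheet indent sub-sections. */
    static String tocHtml(List<Heading> headings) {
        StringBuilder html = new StringBuilder("<ul class=\"toc\">\n");
        for (Heading h : headings) {
            html.append("  <li class=\"depth").append(h.depth()).append("\">")
                .append("<a href=\"#").append(h.anchor()).append("\">")
                .append(h.number).append(' ').append(escapeHtml(h.title))
                .append("</a></li>\n");
        }
        return html.append("</ul>\n").toString();
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "1.  Introduction",
            "   Body text is indented, so it is skipped.",
            "1.1  Purpose");
        System.out.print(tocHtml(findHeadings(sample)));
    }
}
```

The same Heading objects can drive the body conversion: when a heading line is written out, give it the id returned by anchor() so that the table of contents links land on it.
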

Ideally you will get the RFCs into an SQL (Structured Query Language) database so you can do clever searches. You would manually add keywords to each entry. You might be able to detect the changes in standards.
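
Before committing to a database, the ranked title-word index from step 6 can be prototyped in memory. The sketch below is not SQL and makes several assumptions: the class name TitleWordIndex is invented, the stop-word list used for pruning is a tiny placeholder, and "mentions a word most often" is read as the number of times the word occurs in that RFC's title lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

/** Sketch only: an in-memory index of title words, ranked by how often each RFC uses them. */
final class TitleWordIndex {
    // Placeholder stop list used to prune the index; a real one would be much longer.
    private static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("a", "an", "and", "for", "in", "of", "on", "the", "to"));

    // word -> (RFC number -> occurrences of the word in that RFC's title lines)
    private final Map<String, Map<Integer, Integer>> counts = new TreeMap<>();

    void addTitle(int rfcNumber, String titleLine) {
        for (String token : titleLine.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.length() < 2 || STOP_WORDS.contains(token)) {
                continue; // pruned: empty, single-character, or stop word
            }
            counts.computeIfAbsent(token, k -> new HashMap<>()).merge(rfcNumber, 1, Integer::sum);
        }
    }

    /** RFC numbers that use the word, most occurrences first; ties go to the lower RFC number. */
    List<Integer> lookup(String word) {
        Map<Integer, Integer> perRfc =
            counts.getOrDefault(word.toLowerCase(Locale.ROOT), Collections.emptyMap());
        List<Integer> rfcs = new ArrayList<>(perRfc.keySet());
        rfcs.sort((a, b) -> {
            int byCount = Integer.compare(perRfc.get(b), perRfc.get(a)); // higher count first
            return byCount != 0 ? byCount : Integer.compare(a, b);
        });
        return rfcs;
    }

    /** Groups the indexed words by first character, one group per index page. */
    SortedMap<Character, List<String>> wordsByLetter() {
        SortedMap<Character, List<String>> pages = new TreeMap<>();
        for (String word : counts.keySet()) {
            pages.computeIfAbsent(word.charAt(0), k -> new ArrayList<>()).add(word);
        }
        return pages;
    }

    public static void main(String[] args) {
        TitleWordIndex index = new TitleWordIndex();
        // Made-up title lines, for illustration only.
        index.addTitle(1945, "Introduction to the Protocol");
        index.addTitle(2616, "Protocol Parameters");
        index.addTitle(2616, "Protocol Versions");
        System.out.println(index.lookup("protocol"));
        System.out.println(index.wordsByLetter());
    }
}
```

The nested map translates directly into a three-column table (word, RFC number, count) if the data later moves into a database.
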

You can make money with this by posting the formatted RFCs on a website with a Google AdSense ad discreetly off to the side. When I last looked (last revised/verified: 2007-07-05), there were 4938 RFCs. It is within reason that you could at least quickly proofread them all for gross formatting problems. Thousands of programmers reference these every day. If yours are the most convenient, the word will get out that yours is the best site to browse them. Google does the rest with ads cleverly matched to the content of each RFC. You could also offer the set to read offline, indexed with Google Desktop, via the Replicator.

Since I wrote this, I discovered the rfc-editor.org search engine. It has an SQL search engine. It tracks obsoletes, deletes, is-deleted-by, etc. It provides the RFCs in text form and in a PDF (Portable Document Format) version that exactly matches the text version. However, it does not provide an internally cross-referenced HTML version, which is what you really need as a web reference.

You can find out which RFCs are obsolete and which RFC obsoletes what at faqs.org.
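
Another possible source is the header block of each RFC itself, since many RFCs carry a line such as "Obsoletes: 2068" there. The sketch below reads those lines and inverts the result, giving the "obsoleted by" links needed for step 5. It assumes the whole list fits on one line; wrapped lists and other header variations are not handled. The class name ObsoletesScanner is invented, and the sample header is abridged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: extracts "Obsoletes:" data from RFC header lines. */
final class ObsoletesScanner {
    // Captures a comma-separated number list after "Obsoletes:" and stops at
    // anything else, such as an author name in the right-hand column.
    private static final Pattern OBSOLETES = Pattern.compile(
        "^Obsoletes:\\s*((?:RFCs?\\s*)?\\d+(?:\\s*,\\s*(?:RFC\\s*)?\\d+)*)");
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    /** Returns the RFC numbers that the given header says it obsoletes. */
    static List<Integer> parseObsoletes(List<String> headerLines) {
        List<Integer> result = new ArrayList<>();
        for (String line : headerLines) {
            Matcher m = OBSOLETES.matcher(line);
            if (!m.find()) {
                continue;
            }
            Matcher n = NUMBER.matcher(m.group(1));
            while (n.find()) {
                result.add(Integer.parseInt(n.group()));
            }
        }
        return result;
    }

    /** Inverts "new obsoletes old" into "old is obsoleted by new". */
    static Map<Integer, List<Integer>> invert(Map<Integer, List<Integer>> obsoletes) {
        Map<Integer, List<Integer>> obsoletedBy = new TreeMap<>();
        for (Map.Entry<Integer, List<Integer>> entry : obsoletes.entrySet()) {
            for (Integer old : entry.getValue()) {
                obsoletedBy.computeIfAbsent(old, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        return obsoletedBy;
    }

    public static void main(String[] args) {
        // Abridged header lines in the style of RFC 2616.
        List<String> header = Arrays.asList(
            "Request for Comments: 2616",
            "Obsoletes: 2068                                    J. Gettys",
            "Category: Standards Track");
        Map<Integer, List<Integer>> obsoletes = new HashMap<>();
        obsoletes.put(2616, parseObsoletes(header));
        System.out.println(invert(obsoletes));
    }
}
```

The inverted map supplies the banner for the top of each obsolete RFC, while the original map supplies the links back to the documents it replaced.
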

For a more ambitious project you might like to partially gray out any text that has been obsoleted and provide clickable links to the replacing text, going directly to it, not via a chain.
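
Linking directly to the replacement rather than via a chain amounts to following the "obsoleted by" links until they run out. The sketch below does this for whole RFCs only; mapping individual passages to their replacing text would need data that has to be gathered by hand. The class name ReplacementResolver is invented, and the RFC numbers in the example are made up.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch only: collapses a chain of "obsoleted by" links to its end points. */
final class ReplacementResolver {
    /**
     * Follows old -> new links from the given RFC until it reaches RFCs that
     * nothing has replaced. The visited set guards against cycles in bad data.
     * Returns an empty set when the RFC itself is still current.
     */
    static Set<Integer> currentReplacements(int rfc, Map<Integer, List<Integer>> obsoletedBy) {
        Set<Integer> finals = new TreeSet<>();
        Set<Integer> visited = new HashSet<>();
        Deque<Integer> pending = new ArrayDeque<>();
        pending.push(rfc);
        while (!pending.isEmpty()) {
            int current = pending.pop();
            if (!visited.add(current)) {
                continue; // already handled
            }
            List<Integer> successors =
                obsoletedBy.getOrDefault(current, Collections.emptyList());
            if (successors.isEmpty()) {
                if (current != rfc) {
                    finals.add(current); // end of a chain
                }
            } else {
                pending.addAll(successors);
            }
        }
        return finals;
    }

    public static void main(String[] args) {
        // Made-up numbers: 100 was replaced by 200, which was split into 300 and 301.
        Map<Integer, List<Integer>> obsoletedBy = new HashMap<>();
        obsoletedBy.put(100, Arrays.asList(200));
        obsoletedBy.put(200, Arrays.asList(300, 301));
        System.out.println(currentReplacements(100, obsoletedBy));
    }
}
```

A set is returned rather than a single number because one RFC is sometimes replaced by several.
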

CSS
Formatted Partial Index of all RFCs
Google AdSense
HTML
HTML Cheat Sheet
IETF list of RFCs
Most popular RFCs
RFC
RFC search engine: shows replacements for obsolete RFCs

This page is posted
on the web at:

http://mindprod.com/project/rfcconversion.html

Canadian Mind Products
Contact Roedy. Please feel free to link to this page without explicit permission.
