
Mailreader/Newsreader


Disclaimer

This essay does not describe an existing computer program, just one that should exist. This essay is about a suggested student project in Java programming. This essay gives a rough overview of how it might work. I have no source, object, specifications, file layouts or anything else useful to implementing this project. Everything I have prepared to help you is right here.

This project outline is not like the artificial, tidy little problems you are spoon-fed in school, where all the facts you need are included, nothing extraneous is mentioned and the answer is fully specified, along with hints to nudge you toward a single expected canonical solution. This project is much more like the real world of messy problems, where it is up to you to fully define the end point, or a series of ever more difficult versions of this project, and to research the information yourself to solve them.

Everything I have to say to help you with this project is written below. I am not prepared to help you implement it or give you any additional materials. I have too many other projects of my own.

Though I am a programmer by profession, I don’t do people’s homework for them. That just robs them of an education.

You have my full permission to implement this project in any way you please and to keep all the profits from your endeavour.

Please do not email me about this project without reading the disclaimer above.

The problems with existing mailreader/newsreaders are:

In the new scheme, all mail is compressed, digitally signed and encrypted. All newsposts are compressed and digitally signed. Even completely anonymous posts are signed with a special anonymous digital signature and even noms-de-plume are digitally signed. All transmission is 8-bit. The whole business of exchanging and verifying keys, encryption and decryption, compression and decompression is totally automatic and transparent. The user is totally unaware of it, other than the process of applying for various strengths of digital ID.
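To make the compress-then-sign idea concrete, here is a minimal sketch using only the standard Java library. The class name SignedPost, the choice of RSA with SHA256withRSA and the 2048-bit key size are my assumptions for illustration, not part of the scheme. Encryption of mail is left out to keep the sketch short.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper: compress a post, then sign the compressed bytes.
final class SignedPost {
    // Assumption: RSA 2048 stands in for whatever key type a digital ID would carry.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Deflate the UTF-8 bytes of the text.
    static byte[] compress(String text) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflate back to text; fails loudly on truncated input.
    static String decompress(byte[] packed) {
        Inflater inflater = new Inflater();
        inflater.setInput(packed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n > 0) {
                    out.write(buf, 0, n);
                } else if (inflater.finished()) {
                    break;
                } else {
                    throw new IllegalStateException("truncated or unsupported data");
                }
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    static byte[] sign(byte[] packed, PrivateKey key) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initSign(key);
            s.update(packed);
            return s.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean verify(byte[] packed, byte[] signature, PublicKey key) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initVerify(key);
            s.update(packed);
            return s.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair keys = newKeyPair();
        byte[] packed = compress("Hello, newsgroup!");
        byte[] sig = sign(packed, keys.getPrivate());
        System.out.println("verified: " + verify(packed, sig, keys.getPublic()));
        System.out.println("body: " + decompress(packed));
    }
}
```

Signing the compressed bytes rather than the plain text means a server can check the signature without decompressing anything, which is one possible design choice among several.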

You can track a piece of electronic mail just the way you can track a parcel, to discover if it has been received at the recipient’s ISP (Internet Service Provider), at the recipient’s computer and at the recipient. You don’t have to do anything active to track mail. You just see little status icons changing. If a piece of mail is overdue, it will show up in a way that attracts your attention. You can withdraw mail right up to the point it is read.

You can enclose programs with your email or posts. They don’t actually necessarily get sent until the recipient asks for them. With one click, the recipient can install that program or receive the enclosure.

Your messages might contain large items such as *.gifs, sound files or movies. The recipient can configure if he wants these transmitted inline, or if he would prefer to click on them only if he wants them. Someone with a 56K modem might use delayed display for images over 100K. Someone with a cable modem might use delayed display for images over 1 MB. Quick-to-transmit thumbnails would automatically replace *.gifs you don’t want to display inline.
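The recipient's size threshold could be as simple as the following sketch. The class name InlinePolicy and the reading of 100K as 100 × 1024 bytes are assumptions made for illustration.

```java
// Hypothetical recipient-side policy: show small items inline, delay large ones.
final class InlinePolicy {
    enum Display { INLINE, DELAYED_WITH_THUMBNAIL }

    private final long maxInlineBytes;

    InlinePolicy(long maxInlineBytes) {
        this.maxInlineBytes = maxInlineBytes;
    }

    // Items over the threshold are replaced by a thumbnail until clicked.
    Display decide(long itemBytes) {
        return itemBytes <= maxInlineBytes ? Display.INLINE : Display.DELAYED_WITH_THUMBNAIL;
    }

    public static void main(String[] args) {
        InlinePolicy modemUser = new InlinePolicy(100L * 1024);        // assumed meaning of 100K
        InlinePolicy cableUser = new InlinePolicy(1024L * 1024);       // assumed meaning of 1 MB
        System.out.println(modemUser.decide(300_000));
        System.out.println(cableUser.decide(300_000));
    }
}
```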

Messages can contain all manner of HTML features. All browsers of this style of email can handle that, so there is no need to track whether the recipient can accept HTML.

Like ICQ (Internet Chat Query), someone cannot send you mail without your prior permission. They can’t send you mail because they don’t have your public key to encrypt the mail. You receive only encrypted mail. They can only send you a tiny request for permission to send a message.

By periodically changing your public key and sending the new key only to those you want to continue receiving mail from, you effectively cut off any pests you may have unwisely given your key to. The basic idea is no one can send you stuff you don’t want. It does not even get into your ISP’s mail server. Spammers and other pests can’t spend your datacommunications resources or your time.

Your address book automatically updates. When you move, change ISPs (Internet Service Providers) etc, everyone in your address book authorised to send you mail gets informed and their address books automatically update. Eventually the scheme would be used to automatically inform all the magazines you subscribe to of your new mailing address.

Whatever information you choose to broadcast, automatically stays up to date in other people’s address books, including possibly personal details like birthday, birthdate, height, weight, children’s names, credit card numbers, bank account number… You would disclose different amounts of this information to different people. Once you mark it disclosable, it would stay up to date automatically in everyone else’s address book.

Attribution (quoting and tracking who said what) would be handled technically by not embedding quotes in messages. Instead a reference to the original message and an offset/length would be embedded in the message. When the reader’s viewer automatically expands the quote inline it would look like an ordinary quotation, but with guaranteed accuracy and a guaranteed accurate author, clearly marked as a quotation. There is no way to put words in another’s mouth or ascribe them to the wrong person. Further, the viewer could configure his viewer to show only the first N lines of quotes. To see more he must click or scroll. Further, the reader can always see the original quote in full context. Your client software looks after retrieving the quote from local store or from the server. Delayed-read technology similar to that I described for embedded images also applies to long quotes of old material.
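A quote-by-reference could be represented roughly as follows. This is only a sketch: the names QuoteStore and QuoteRef, the in-memory map standing in for the local store or server, and the "[...]" marker for a clipped quote are all my own inventions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical local store that expands quote references at display time.
final class QuoteStore {
    // A quote is never copied into a message; only this reference is embedded.
    static final class QuoteRef {
        final long messageId;
        final int offset;
        final int length;

        QuoteRef(long messageId, int offset, int length) {
            this.messageId = messageId;
            this.offset = offset;
            this.length = length;
        }
    }

    // messageId -> { author, body }
    private final Map<Long, String[]> messages = new HashMap<>();

    void put(long messageId, String author, String body) {
        messages.put(messageId, new String[] { author, body });
    }

    // Expand a reference inline, showing at most maxLines lines of the quote.
    String expand(QuoteRef ref, int maxLines) {
        String[] msg = messages.get(ref.messageId);
        if (msg == null) {
            throw new IllegalArgumentException("unknown message " + ref.messageId);
        }
        String body = msg[1];
        if (ref.offset < 0 || ref.length < 0 || ref.offset > body.length()
                || ref.length > body.length() - ref.offset) {
            throw new IllegalArgumentException("quote range outside original message");
        }
        String[] lines = body.substring(ref.offset, ref.offset + ref.length).split("\n", -1);
        StringBuilder sb = new StringBuilder(msg[0]).append(" said:\n");
        int shown = Math.min(maxLines, lines.length);
        for (int i = 0; i < shown; i++) {
            sb.append(lines[i]).append('\n');
        }
        if (shown < lines.length) {
            sb.append("[...]\n"); // the reader clicks or scrolls to see the rest
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        QuoteStore store = new QuoteStore();
        store.put(1L, "Peter Pan", "line one\nline two\nline three");
        System.out.print(store.expand(new QuoteRef(1L, 0, 17), 1));
    }
}
```

Because the author and text are looked up from the original message, the viewer cannot display a quote that the original author never wrote.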

Here is how a long quote might look:


Roedy Green, [poohbah@mindprod.com] on 2000-10-10 19:45 posted:
I’ve been thinking about what Bill said in his play Hamlet, Act I Scene III. He has this bumbling old guy Polonius speak some real pearls:
Bill, [shakespear@mindprod.com] on 1680-10-10 17:52 commented:


Roedy Green, [poohbah@mindprod.com] on 2000-10-10 19:55 commented:
Rambles a bit, but that line fourth from the end is a keeper.

One advantage of accurately tracking attributions is you can now filter out twits, even when other people quote them.

You can send email by clicking the email address. You can comment on any quotation, either the original or in the quoted context by right clicking on the quoted material. Nested quoting could be displayed in a number of ways. Here is one that does not require indenting.


Roedy Green, [poohbah@mindprod.com] on 2000-10-10 19:45 posted:
I’ve been thinking about how quoting might work after reading Eddy’s post:
Eddy, [eddy@mindprod.com] on 2000/10/10 17:52 quoted:
Peter Pan, [peter@mindprod.com] on 2000/10/10 17:38 said:
The reader would have the option to limit how much of that quote was displayed at once and to see the quote in its original context.
Eddy, [eddy@mindprod.com] on 2000/10/10 17:52 commented:
Ah — now I see what you’re getting at. I like this a lot, as long as support for it doesn’t get in the way of normal reading as we have now.
Roedy Green, [poohbah@mindprod.com] on 2000-10-10 19:45 commented:
Yes, upward compatibility is a difficult problem. I think we just have to start over.

When you send a complimentary email along with your post, with accurate attributions, you can control if it goes to the person who originally asked the question, or to somebody who attempted an answer, or both. Such complimentary emails automatically include the reference to the public post, so the recipient need not reload it.

You might make use of colour to code those whose information turned out to be trustworthy in the past and those who talk through their hats. You might suppress the attribution information entirely for normal reading, only turning it on when you want it.

You can file your mail and newsgroups, tagging them, possibly automatically, with many possible keys. You don’t need a strictly hierarchical system. There is an SQL (Structured Query Language) database lookup so you can retrieve messages without having to linearly search the entire message base.
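The non-hierarchical tagging can be illustrated with a small in-memory index. This is a stand-in for the SQL lookup, not a replacement for it; the class name TagIndex and its methods are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the database lookup:
// one message can carry many tags, and no tag is nested under another.
final class TagIndex {
    private final Map<String, Set<Long>> idsByTag = new HashMap<>();

    void tag(long messageId, String... tags) {
        for (String t : tags) {
            idsByTag.computeIfAbsent(t, k -> new TreeSet<>()).add(messageId);
        }
    }

    // Ids of messages carrying ALL the given tags, found without scanning every message.
    SortedSet<Long> find(String... tags) {
        SortedSet<Long> result = null;
        for (String t : tags) {
            Set<Long> ids = idsByTag.getOrDefault(t, Collections.<Long>emptySet());
            if (result == null) {
                result = new TreeSet<>(ids);
            } else {
                result.retainAll(ids);
            }
        }
        return result == null ? new TreeSet<Long>() : result;
    }

    public static void main(String[] args) {
        TagIndex index = new TagIndex();
        index.tag(1L, "java", "mail");
        index.tag(2L, "java");
        System.out.println(index.find("java", "mail"));
    }
}
```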

To speed service, different newsgroups should be hosted from different servers. Your newsreader client can then simultaneously fetch information from many different servers, automatically selecting a backup server if the usual one is busy or down. For folks online 24 hours a day, servers could notify when new messages are available for pickup on newsgroups with low traffic.

All messages on the server would be in Unicode. This gets rid of the problem of local character set encodings that may not match between writer and reader. This means when the writer keys é the reader sees é. Messages are encoded with which languages they contain. It is possible to post a message along with translations of it. Clients automatically select the appropriate version of the message.
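Client-side selection of the appropriate version might look like this sketch. The class name VersionPicker is hypothetical, and I assume language codes are compared exactly as written, e.g. EN and FR.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical client-side choice among a message and its posted translations.
final class VersionPicker {
    // bodyByLanguage: language code -> text of that version.
    // readerLanguages: the reader's languages, most preferred first.
    // Returns empty when the reader understands none of the versions.
    static Optional<String> pick(Map<String, String> bodyByLanguage, List<String> readerLanguages) {
        for (String lang : readerLanguages) {
            String body = bodyByLanguage.get(lang);
            if (body != null) {
                return Optional.of(body);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> versions = new HashMap<>();
        versions.put("EN", "Hello");
        versions.put("FR", "Bonjour");
        System.out.println(pick(versions, Arrays.asList("DE", "FR")).orElse("(filtered out)"));
    }
}
```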

Since everyone participating has a digital id and automatically digitally signs all messages, there is no problem determining who is really a moderator and who is not. It becomes almost impossible to spoof moderator privileges. Only moderators can cancel messages. You can, of course, withdraw your own messages. There are three types of conferences:

  1. Unmoderated free for all.
  2. Pre moderation. Every message must be OKed by one of the moderators before any non-moderators can see it. The problem with this technique is it slows down the pace of discussion to a stately crawl.
  3. Post moderation. This is a more elaborate version of the moderation scheme used with great success on BIX/Cosy. All posted messages are immediately available. Moderators can cancel offending messages. These effectively disappear from the system, even from local stores. Repeat offenders are put on pre-screen probation, or eventually are blocked completely from further posting of any messages. Posts from new members are pre-screened until such time as the moderator gives the OK for direct posting. The problem with this technique is it occasionally lets an offensive message through for a while. You need a clear set of guidelines available to all at a button click about what is considered acceptable in any given newsgroup. Rules might include such things as:
    1. Topic relevance.
    2. Questions must not be covered by the newsgroup FAQ (Frequently Asked Questions), also available at any time with a button click.
    3. No ad-hominem attacks.
    4. No advertising.
    5. No profanity. Language must be suitable for children.
    If a message is cancelled, the author is automatically notified of why. He can revise it and repost it.
Because everyone must have a digital ID, it becomes much easier to control who is allowed to read or post, if you want to. To prevent expelled people from coming back under a new email id, there are two things you can do:
  1. You collect a small fee to join. That makes it much harder for someone expelled for misconduct to return under a different email id. They must come up with a new credit card number as well.
  2. Alternatively, if you don’t want to collect fees, you can demand that people have Thawte trusted email IDs. These mean that a notary, or a number of Web of Trust members, has verified the passport or other papers of the email id holder. This ties the email ID to a specific person. Then they can’t easily masquerade under multiple email ids. Such a trusted ID would be used for any newsgroup.
The system does not permit cross-posting (posting the same message in more than one newsgroup). Of course, someone could try cutting and pasting and posting almost the same message to several different newsgroups. That is not permitted. There are several places where it could be detected, first of all in the client. Cross-posted messages would be blocked or cancelled. Posting a message saying only I agree in many different newsgroups would then be considered a cross-post. I have not decided if this is a good or bad side effect.
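One way the client might detect cross-posts is to fingerprint a normalised form of each message body. The following is a sketch under my own assumptions: the class name CrossPostDetector is hypothetical, and the normalisation (trim, lower case, collapse whitespace) only catches copies that differ in spacing and capitalisation, not reworded ones.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical detector: remembers the first newsgroup each body was posted to.
final class CrossPostDetector {
    private final Map<String, String> firstGroupByFingerprint = new HashMap<>();

    // SHA-256 of the body after trimming, lower-casing and collapsing whitespace.
    static String fingerprint(String body) {
        String normalised = body.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(normalised.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns false when the same body has already been posted to a different newsgroup.
    boolean accept(String newsgroup, String body) {
        String previous = firstGroupByFingerprint.putIfAbsent(fingerprint(body), newsgroup);
        return previous == null || previous.equals(newsgroup);
    }

    public static void main(String[] args) {
        CrossPostDetector detector = new CrossPostDetector();
        System.out.println(detector.accept("comp.lang.java", "I agree"));
        System.out.println(detector.accept("comp.lang.c", "  i   AGREE "));
    }
}
```

Note that this sketch blocks the short I agree message in a second newsgroup, which is exactly the side effect described above.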

There are several features to help you keep track of whom you are talking to:

Implementation

The most important thing to nail down is the protocol and the messages the server and client exchange. You could use RMI (Remote Method Invocation) for maximum ease and flexibility, CORBA (Common Object Request Broker Architecture) for more language agnosticism, DataOutputStream, LEDataOutputStream or even ASCII (American Standard Code for Information Interchange) in a pinch. All Strings are Unicode, UTF-8 encoded. All messages are digitally signed. This is just a first cut, but it gives you an idea of just how simple it could be.
Messages Exchanged by Client and Server

| Message | Who Says It? | Fields | Notes |
| --- | --- | --- | --- |
| WHO ARE YOU? | both client and server | random challenge phrase | Challenge phrase must be echoed back encrypted with the private key to prove identity in the I AM. |
| I AM | both client and server | email id, public key, encrypted challenge phrase. | |
| CHANGED | client | timestamp | When the client last changed the list of newsgroups he picks up from this server. |
| PROFILE? | server | | Requests a list of newsgroups the client is interested in, etc. The server lets the client know how fresh its information is for the various fields. It might never keep any information permanently on file and would return a 0 timestamp to indicate that. PKZIPped. |
| PROFILE | client | Complete list of newsgroup names, list of languages the client speaks using two-letter codes, perhaps other fields such as a VCF (Versit Card File) business card, standard message signature, resume, ICQ number, for the fields in the profile for which the server does not have up to date information. | Issued only in response to a PROFILE? message. PKZIPped. The list of languages is used to filter out messages in languages the client cannot understand, e.g. EN if you speak only English. |
| GET HEADERS | client | List of timestamp pairs, from and to time for each of the newsgroups. You can bypass a newsgroup by setting the pair to 0,0. All newsgroups must be represented. Also a boolean indicating whether you want your own messages back or not. PKZIPped. | You don’t get the message bodies, just the header info. |
| GET MESSAGES | client | List of timestamp pairs, from and to time for each of the newsgroups. You can bypass a newsgroup by setting the pair to 0,0. All newsgroups must be represented. Also a boolean indicating whether you want your own messages back or not. PKZIPped. | This scheme lets you pick up past messages, lost messages or most commonly recently posted messages. If you crash and restore from an old backup, your local store automatically refreshes without any special effort. |
| DOWNLOAD | server | A jar file, with one member per message. These are compressed. The messages have an internal format. | |
| UPLOAD | client | A jar file, with one member per message to post. These are compressed. The messages have an internal format. This is the same compressed format that messages are stored in on the server. | |
| STATUS | both client and server | How did the most recent UPLOAD or DOWNLOAD go? | Error number. 0 is ok. |
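The WHO ARE YOU? / I AM exchange amounts to a challenge and response. Here is a rough sketch using the standard Java library. I am treating "echoed back encrypted with the private key" as a digital signature over the challenge, and the names Handshake, newChallenge, answer and check, the 32-byte challenge and the RSA algorithm are all assumptions for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.security.Signature;
import java.security.SignatureException;

// Hypothetical sketch of the WHO ARE YOU? / I AM identity check.
final class Handshake {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Assumption: RSA 2048 stands in for the key pair behind a digital ID.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // WHO ARE YOU?: a fresh random challenge, never reused.
    static byte[] newChallenge() {
        byte[] challenge = new byte[32];
        RANDOM.nextBytes(challenge);
        return challenge;
    }

    // I AM: the challenge signed with the sender's private key.
    static byte[] answer(byte[] challenge, PrivateKey key) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initSign(key);
            s.update(challenge);
            return s.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The challenger checks the answer against the public key claimed in the I AM.
    static boolean check(byte[] challenge, byte[] answer, PublicKey claimedKey) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initVerify(claimedKey);
            s.update(challenge);
            return s.verify(answer);
        } catch (SignatureException e) {
            return false; // a malformed answer proves nothing
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair client = newKeyPair();
        byte[] challenge = newChallenge();
        byte[] reply = answer(challenge, client.getPrivate());
        System.out.println("identity proven: " + check(challenge, reply, client.getPublic()));
    }
}
```

Since both sides send WHO ARE YOU?, each side would run this once as challenger and once as responder.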
Individual messages are stored in PKZIP compressed format, digitally signed. Ideally compression would preload the ZIP engine with HTML tags to get a 90% compression.
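Preloading the compressor with HTML tags can be sketched with the preset-dictionary feature of java.util.zip.Deflater and Inflater. The dictionary contents and the class name HtmlDictionaryZip below are made up for illustration; a real dictionary would be tuned on actual messages, and I make no claim about the ratio this particular one achieves.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: deflate with a preset dictionary of common HTML tags.
final class HtmlDictionaryZip {
    // Assumption: an illustrative dictionary; both ends must share the exact same bytes.
    static final byte[] DICTIONARY =
            "<html><head><title></title></head><body><p></p><a href=\"\"></a><br><b></b><i></i></body></html>"
                    .getBytes(StandardCharsets.UTF_8);

    static byte[] compress(String html) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setDictionary(DICTIONARY); // must be set before any input is deflated
        deflater.setInput(html.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    static String decompress(byte[] packed) {
        Inflater inflater = new Inflater();
        inflater.setInput(packed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n > 0) {
                    out.write(buf, 0, n);
                } else if (inflater.needsDictionary()) {
                    inflater.setDictionary(DICTIONARY); // the stream asks for the shared dictionary
                } else if (inflater.finished()) {
                    break;
                } else {
                    throw new IllegalStateException("truncated data");
                }
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Hi</title></head><body><p>Hello</p></body></html>";
        byte[] packed = compress(html);
        System.out.println(html.length() + " chars -> " + packed.length + " bytes");
        System.out.println(decompress(packed).equals(html));
    }
}
```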
Individual Message Format

| Field | Notes |
| --- | --- |
| Newsgroup name | Dots in names mean nothing at this level, though they may to some administrative tools. The server could peel this off each message for long term storage and regenerate it on transmission, but typically storage is cheap compared with the cost of recalculating digital signatures. |
| Message ID | A timestamp assigned by the client. The first person to post with a given timestamp gets that ID. Others posting later would have to adjust slightly. |
| Author email | |
| Author name | |
| Title | Must be blank for all but an original post. |
| language | What language this post is written in. |
| format | MIME (Multipurpose Internet Mail Extensions) type, usually HTML, simplified HTML that can be rendered directly in the client or plain text. |
| refer type | Could be original post, comment on another post, translation of another, or attachment to some other post. |
| ok to translate? | No: the author reserves the right to translations in all languages. Yes: it is ok for other people to post translations. Ask: others must ask first and perhaps have their translations checked by some third party before posting. |
| reference | Timestamp of another post in this same newsgroup that this post is commenting on. |
| message body | Usually simple HTML or plain text. |
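The rule for Message IDs, where the first poster keeps the timestamp and later posters adjust slightly, could work like this sketch. The class name MessageIds, the one-instance-per-newsgroup arrangement and the choice to adjust by adding one are my assumptions.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical per-newsgroup allocator of timestamp-based message IDs.
final class MessageIds {
    private final Set<Long> taken = new HashSet<>();

    // The first claim on a timestamp gets it; later claims are nudged forward
    // until a free value is found.
    synchronized long claim(long proposedTimestamp) {
        long id = proposedTimestamp;
        while (!taken.add(id)) {
            id++;
        }
        return id;
    }

    public static void main(String[] args) {
        MessageIds ids = new MessageIds();
        long now = System.currentTimeMillis();
        System.out.println(ids.claim(now));
        System.out.println(ids.claim(now)); // second poster with the same timestamp
    }
}
```

Keeping IDs unique within a newsgroup matters because the reference field identifies the post being commented on purely by its timestamp.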

Spam

To get serious about spam, you need to make it illegal to send spam without prior authorisation. Further, you need the law to force the ISPs to detect spam.

For legitimate spam, such as product announcements from some company you are interested in, you give that company a digitally signed authorisation giving it the right to send you email for a given period of time.

The ISPs are legally required to block all bulk email without such authorisation.

There needs to be a way to revoke that permission. In that case the spam won’t likely be blocked until it reaches your mailserver, or even your email program.
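The time-limited, revocable authorisation could be checked along these lines. This sketch leaves out the digital signature on the authorisation entirely; the class name BulkMailPermissions and the use of millisecond timestamps are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical recipient-side record of who may send bulk mail, and until when.
final class BulkMailPermissions {
    // sender id -> { validFromMillis, validUntilMillis }
    private final Map<String, long[]> grants = new HashMap<>();

    void grant(String sender, long validFromMillis, long validUntilMillis) {
        grants.put(sender, new long[] { validFromMillis, validUntilMillis });
    }

    // Revoking simply forgets the grant; later mail from that sender is blocked.
    void revoke(String sender) {
        grants.remove(sender);
    }

    boolean mayDeliver(String sender, long nowMillis) {
        long[] window = grants.get(sender);
        return window != null && nowMillis >= window[0] && nowMillis <= window[1];
    }

    public static void main(String[] args) {
        BulkMailPermissions permissions = new BulkMailPermissions();
        long now = System.currentTimeMillis();
        permissions.grant("news@example.com", now, now + 30L * 24 * 60 * 60 * 1000);
        System.out.println(permissions.mayDeliver("news@example.com", now));
        permissions.revoke("news@example.com");
        System.out.println(permissions.mayDeliver("news@example.com", now));
    }
}
```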

There is no such thing as anonymous spam any more. This means you won’t even see a request to send spam, the way you might from some private individual in my scheme to discourage being pestered by individuals.

How then do the spammers fight back? They try to masquerade their spam as individual emails, varying each with some artificial intelligence, using a myriad of sending addresses. In my scheme, you can’t send email without a digital id. If foreign ISPs or their customers forge fake ids to send spam, or in any other way send spam, the entire ISP must be blocked by law. Similarly, if an id issuer is corrupt, it becomes technically feasible to invalidate all ids issued by that issuer.

Governments and ISPs have been slow to act. Microsoft has even weakened the anti-spam features of its Outlook mail program.

Don’t give up hope. It is not necessary for governments or ISPs to act. It is possible to band together with fellow users and warn each other about spam in an automated way so that most people don’t have to see it. The Vipul’s Razor project is just such an attempt. Unfortunately, at this stage, it is usable only by Perl geeks rather than by ordinary users.

Chain of Trust

I think it should be possible to click on any quotation mentioned in a newsgroup posting or email and trace it back to the original source. There should be a digital signing mechanism so that you can trust that the quote indeed did come from that source, a person, a newspaper, a website, a book…

Further, consider the case where the original source was the White House website. They often remove material or modify it in a quite Orwellian way. The quoting mechanism should still mark as valid a quotation that was later changed and should note the new version as well.
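Checking a quotation against a source that may have changed since could be sketched as follows. The archive of earlier versions, the class name SourceArchive and the three verdicts are hypothetical, and the digital signing of each version is omitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical archive that remembers every version of a source it has seen.
final class SourceArchive {
    enum Verdict { MATCHES_CURRENT, MATCHES_EARLIER_VERSION, NOT_FOUND }

    // source name or URL -> versions in the order they were recorded, newest last
    private final Map<String, List<String>> versions = new HashMap<>();

    void record(String source, String text) {
        versions.computeIfAbsent(source, k -> new ArrayList<>()).add(text);
    }

    // A quote stays valid if any recorded version contained it,
    // but the reader is told when the source has since changed.
    Verdict verify(String source, String quote) {
        List<String> history = versions.get(source);
        if (history == null || history.isEmpty()) {
            return Verdict.NOT_FOUND;
        }
        if (history.get(history.size() - 1).contains(quote)) {
            return Verdict.MATCHES_CURRENT;
        }
        for (String old : history) {
            if (old.contains(quote)) {
                return Verdict.MATCHES_EARLIER_VERSION;
            }
        }
        return Verdict.NOT_FOUND;
    }

    public static void main(String[] args) {
        SourceArchive archive = new SourceArchive();
        archive.record("example.gov/statement", "We promised A and B.");
        archive.record("example.gov/statement", "We promised B.");
        System.out.println(archive.verify("example.gov/statement", "promised A"));
    }
}
```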

This mechanism should also work when a website quotes a website, newspaper or book.

One problem we have now is that there is no easy mechanism in HTML to track the source of every piece of information. Even well-meaning websites can have material on them from discredited sources.


This page is posted on the web at: http://mindprod.com/project/mailreadernewsreader.html

Canadian Mind Products