
What’s That Char?


Disclaimer

This essay does not describe an existing computer program, just one that should exist. This essay is about a suggested student project in Java programming. This essay gives a rough overview of how it might work. I have no source, object, specifications, file layouts or anything else useful to implementing this project. Everything I have prepared to help you is right here.

This project outline is not like the artificial, tidy little problems you are spoon-fed in school, where all the facts you need are included, nothing extraneous is mentioned and the answer is fully specified, along with hints to nudge you toward a single expected canonical solution. This project is much more like the real world of messy problems, where it is up to you to fully define the end point, or a series of ever more difficult versions of the project, and to research the information yourself to solve them.

Everything I have to say to help you with this project is written below. I am not prepared to help you implement it or to give you any additional materials. I have too many other projects of my own.

Though I am a programmer by profession, I don’t do people’s homework for them. That just robs them of an education.

You have my full permission to implement this project in any way you please and to keep all the profits from your endeavour.

Please do not email me about this project without reading the disclaimer above.

The Problem

Here are four problems I have come across:

  1. I am screenscraping and I find a strange character on the screen. I need to write code to match it. I need to know what its \uxxxx code is.
  2. I am coding a French word in my webpage. I need the alpha entity for a-grave.
  3. I see a clever effect on somebody’s web site perhaps using an obscure character or perhaps a pair of characters. I want to know how it was done.
  4. I see a letter. It is important that I know whether it is one of 0Oo8 or |l1I or ″ ′ ´ “ ” ‘ ’ " '. I would like positive confirmation. In some fonts you just cannot tell. Serial numbers and passwords give no context.

Your Mission

Your mission is to write a simple utility that works like this:

The user can paste a phrase containing problematic characters into it, then drag the cursor across the phrase letter by letter. As the cursor hits each letter, the utility draws the letter 2 cm (0.79 in) tall so you can get a good look at it. It displays the hex 0x0000 code, the \uxxxx code, the decimal entity &#0000;, the hex entity &#x0000; and an alpha entity &xxxx; if there is one. Further, it gives a short description of what the character is.
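Here is a minimal sketch of how the various notations for one code point might be computed. The class and method names are my own invention for illustration, not part of any existing package. The description comes from the JDK's built-in Character.getName, which is only a stand-in for whatever description table you end up using.

```java
/** Sketch: the notations the utility would display for one code point. */
final class CharNotations {
    /** Hex code, zero-padded to at least four digits, e.g. 0x00e0. */
    static String hexCode(int codePoint) {
        return String.format("0x%04x", codePoint);
    }

    /** Java source escape. Code points above 0xFFFF need a surrogate pair, so two escapes. */
    static String javaEscape(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char c : Character.toChars(codePoint)) {
            sb.append(String.format("\\u%04x", (int) c));
        }
        return sb.toString();
    }

    /** Decimal numeric entity for HTML. */
    static String decimalEntity(int codePoint) {
        return "&#" + codePoint + ";";
    }

    /** Hex numeric entity for HTML. */
    static String hexEntity(int codePoint) {
        return "&#x" + Integer.toHexString(codePoint) + ";";
    }

    /** Unicode name as known to the JDK, or a placeholder if the code point is unassigned. */
    static String description(int codePoint) {
        String name = Character.getName(codePoint);
        return name == null ? "(unassigned)" : name;
    }
}
```

For a-grave, CharNotations.javaEscape(0xe0) gives \u00e0 and CharNotations.decimalEntity(0xe0) gives &#224;. The alpha entity is not computed here since it needs a lookup table.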

In addition there is a hex spinner to let you select the character to find the scoop on.
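The non-GUI side of such a spinner might look roughly like this: turn typed hex text into a code point, reject nonsense, and step up or down without leaving the valid Unicode range. HexInput and its methods are hypothetical names, and the choice to accept the 0x and U+ prefixes is my assumption. Wiring it to a Swing JSpinner or some other widget is left to you.

```java
/** Sketch: parsing and stepping logic that could sit behind a hex spinner. */
final class HexInput {
    /**
     * Parses hex text such as "e0", "0x00E0" or "U+1F600".
     * Returns -1 if the text is not hex or is outside the Unicode range.
     */
    static int parseCodePoint(String text) {
        String s = text.trim();
        // Strip an optional 0x or U+ prefix, ignoring case.
        if (s.regionMatches(true, 0, "0x", 0, 2) || s.regionMatches(true, 0, "U+", 0, 2)) {
            s = s.substring(2);
        }
        try {
            int cp = Integer.parseInt(s, 16);
            return Character.isValidCodePoint(cp) ? cp : -1;
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Moves delta code points up or down, clamped to the valid range. */
    static int step(int codePoint, int delta) {
        int next = codePoint + delta;
        return Math.max(Character.MIN_CODE_POINT, Math.min(Character.MAX_CODE_POINT, next));
    }
}
```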

You can then copy/paste any of the information about the character into your program or webpage.

Implementation

There is a CSV (Comma-Separated Values) table with all the information you need in the entities package. It has information on all the codes with alpha entities plus a few more. The Unicode consortium has descriptions for the remainder. You could write a prepare utility to compact the table into binary, possibly serialised, gzipped form for fast loading as a resource.
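A prepare utility along those lines could be sketched as follows. I am assuming a three-column layout of hex code, entity name, description. That layout is a guess for illustration only, so check the real table and adjust. EntityTable and its methods are hypothetical names. The naive split does not handle quoted fields, so a real CSV reader would be needed if the table uses them.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.TreeMap;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Sketch: build a lookup table from CSV lines, then gzip-serialise it for fast loading. */
final class EntityTable {
    /** Parses lines of the ASSUMED form hexCode,entityName,description. */
    static TreeMap<Integer, String[]> parse(List<String> lines) {
        TreeMap<Integer, String[]> table = new TreeMap<>();
        for (String line : lines) {
            String[] f = line.split(",", 3); // limit 3 lets the description contain commas
            if (f.length < 3) {
                continue; // skip blank or malformed lines
            }
            int codePoint = Integer.parseInt(f[0].trim(), 16);
            table.put(codePoint, new String[] { f[1].trim(), f[2].trim() });
        }
        return table;
    }

    /** Serialises the table and gzips it, giving the bytes to store as a resource. */
    static byte[] compact(TreeMap<Integer, String[]> table) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes))) {
            out.writeObject(table);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray(); // safe here: the streams are closed and flushed
    }

    /** Reads back a table written by compact, e.g. from getResourceAsStream. */
    @SuppressWarnings("unchecked")
    static TreeMap<Integer, String[]> load(InputStream in) {
        try (ObjectInputStream ois = new ObjectInputStream(new GZIPInputStream(in))) {
            return (TreeMap<Integer, String[]>) ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

At run time the utility would call load once at startup, then look up each code point with table.get(codePoint). Element 0 of the result is the entity name and element 1 is the description.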

Have a look at the code for Quoter.ToHex to see how to iterate over codepoints in a String.
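In case you cannot find that code, here is a minimal sketch of the usual idiom. It is not the Quoter source, just the standard loop using String.codePointAt and Character.charCount, wrapped in a hypothetical CodePointWalker class. The point is that a character above 0xFFFF occupies two chars in a String, so you cannot simply step one char at a time.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: walk a String one code point at a time rather than one char at a time. */
final class CodePointWalker {
    /** Returns the code points of s in order, with surrogate pairs combined. */
    static List<Integer> codePoints(String s) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            result.add(cp);
            // Advance by 1 for ordinary characters, 2 for a surrogate pair.
            i += Character.charCount(cp);
        }
        return result;
    }
}
```

Each code point found this way is what you would look up and display as the cursor passes over it.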

Entities package
HTML entities
Unicode viewer

This page is posted
on the web at:

http://mindprod.com/project/whatsthatchar.html

Canadian Mind Products
