
Multilingual PADs


This essay does not describe an existing computer program, just one that should exist. It outlines a suggested student project in Java programming and gives a rough overview of how it might work. I have no source, object code, specifications, file layouts or anything else useful for implementing this project. Everything I have prepared to help you is right here.

This project outline is not like the artificial, tidy little problems you are spoon-fed in school, where all the facts you need are included, nothing extraneous is mentioned and the answer is fully specified, along with hints to nudge you toward a single expected canonical solution. This project is much more like the real world of messy problems, where it is up to you to fully define the end point, or a series of ever more difficult versions of this project, and to research the information yourself to solve them.

Everything I have to say to help you with this project is written below. I am not prepared to help you implement it or give you any additional materials. I have too many other projects of my own.

Though I am a programmer by profession, I don’t do people’s homework for them. That just robs them of an education.

You have my full permission to implement this project in any way you please and to keep all the profits from your endeavour.

Please do not email me about this project without reading the disclaimer above.

A PAD (Portable Application Description) is an XML (eXtensible Markup Language) file that describes a computer program in a standardised way. It tells you what the program is for and where to get the program and its documentation. It has provision for descriptions in many languages.

Most of the time the PAD contains only English. This project aims to automatically add other languages to PAD files to make programs accessible to a wider audience.

The basic idea is to use Google Translate or a similar service to translate each section of the PAD file. Google can translate to about 40 different languages. You would use the tool from the command line like this:

rem ------- pad ------ source target1 target2 target3
addLanguage myprog.pad English French German Thai

AddLanguage would translate the various English-marked fields into French, German and Thai using Google Translate and insert them into the PAD. It would leave other translations untouched. It could, of course, start with French and produce an English translation.
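Here is a minimal sketch of that core step in Java, just to show the shape of the problem. It assumes the descriptions sit under a Program_Descriptions element with one child element per language name; check that against real PAD files before relying on it. The Translator interface and the AddLanguageSketch class are hypothetical names invented for this illustration, and the translator used in main is a fake that merely tags the text.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical hook for whatever translation service you end up using. */
interface Translator {
    String translate(String text, String sourceLanguage, String targetLanguage);
}

final class AddLanguageSketch {

    /** Returns the first direct child element with the given tag name, or null. */
    static Element child(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && n.getNodeName().equals(name)) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Adds one target-language block, unless the PAD already has one. */
    static void addLanguage(Document pad, String source, String target, Translator translator) {
        // ASSUMPTION: descriptions live under <Program_Descriptions>/<LanguageName>.
        Element descriptions = (Element) pad.getElementsByTagName("Program_Descriptions").item(0);
        if (descriptions == null) {
            return; // not the layout this sketch expects
        }
        Element from = child(descriptions, source);
        if (from == null || child(descriptions, target) != null) {
            return; // nothing to translate from, or an existing translation to leave untouched
        }
        Element to = pad.createElement(target);
        for (Node n = from.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                // Copy each field name and fill it with the translated text.
                Element field = pad.createElement(n.getNodeName());
                field.setTextContent(translator.translate(n.getTextContent(), source, target));
                to.appendChild(field);
            }
        }
        descriptions.appendChild(to);
    }

    /** Parses XML text into a DOM tree. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse PAD text", e);
        }
    }

    public static void main(String[] args) {
        // A tiny stand-in for a PAD file; the element names are assumptions.
        Document pad = parse("<XML_DIZ_INFO><Program_Descriptions><English>"
                + "<Char_Desc_45>Sorts files by date</Char_Desc_45>"
                + "</English></Program_Descriptions></XML_DIZ_INFO>");
        // Fake translator: a real one would call the translation service.
        Translator fake = (text, from, to) -> "[" + to + "] " + text;
        addLanguage(pad, "English", "French", fake);
        Element all = (Element) pad.getElementsByTagName("Program_Descriptions").item(0);
        System.out.println(child(all, "French").getTextContent());
    }
}
```

A real tool would loop over every target language on the command line and write the modified document back to disk.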

There are strict limits on how many characters each field can have. If the translation does not fit, all you can do is go over the limit and warn the author that it will have to be trimmed, either by trimming the original English or by trimming the translation. See the commentator student project for ideas on how to make it easier to trim.
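The over-limit warning could be sketched roughly like this. It assumes that the limit can be read off the end of the field name (as in a name like Char_Desc_45) and that the limit is counted in Unicode code points; both are assumptions you would need to confirm against the PAD specification. PadLimits is a hypothetical class name.

```java
import java.util.Optional;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PadLimits {

    // ASSUMPTION: field names such as Char_Desc_45 end with their character limit.
    private static final Pattern TRAILING_NUMBER = Pattern.compile("_(\\d{1,6})$");

    /** Returns the limit encoded in the field name, or empty if there is none. */
    static OptionalInt limitFor(String fieldName) {
        Matcher m = TRAILING_NUMBER.matcher(fieldName);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    /** Returns a warning for the author if the text is too long, otherwise empty. */
    static Optional<String> check(String fieldName, String language, String text) {
        OptionalInt limit = limitFor(fieldName);
        // ASSUMPTION: the limit counts code points, not bytes or UTF-16 units.
        int length = text.codePointCount(0, text.length());
        if (limit.isPresent() && length > limit.getAsInt()) {
            return Optional.of(String.format(
                    "%s %s is %d characters, %d over the limit of %d; please trim it.",
                    language, fieldName, length, length - limit.getAsInt(), limit.getAsInt()));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String french = "Trie les fichiers par date de derni\u00e8re modification";
        check("Char_Desc_45", "French", french).ifPresent(System.err::println);
    }
}
```

The tool would still insert the oversized translation, and the warning tells the author which field to trim.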

PADGen will not accept any entities or HTML (Hypertext Markup Language), not even hex entities. You must code all accented characters as literal Unicode characters.
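If the translation service hands back text containing numeric entities, one way to turn them into plain Unicode characters is sketched below. EntityDecoder is a hypothetical helper name. It handles only decimal and hex numeric references; named entities such as &eacute; would need a lookup table that is not shown here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EntityDecoder {

    // Matches hex (&#xE9;) and decimal (&#233;) numeric character references.
    private static final Pattern NUMERIC_ENTITY =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    /** Replaces numeric character references with the characters they stand for. */
    static String decodeNumericEntities(String text) {
        Matcher m = NUMERIC_ENTITY.matcher(text);
        StringBuffer out = new StringBuffer(); // StringBuffer keeps this Java 8 compatible
        while (m.find()) {
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2));
            // Leave anything that is not a valid code point exactly as it was.
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeNumericEntities("caf&#xE9; cr&#232;me"));
    }
}
```

You would also want to write the finished PAD out in an encoding such as UTF-8 so the accented characters survive intact.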

There are four basic approaches you could use to interface with Google Translate.

  1. Paste text into the Google Translate web form and screen scrape the results.
  2. Compose HTML documents, post them, then ask Google to translate them.
  3. Compose XML documents with just the data you want to translate in them and upload them to the Google translation toolkit. You would get HTML back, which you could then insert. The toolkit is designed for manual tweaking of the translations.
  4. Upload XML to the standard Google Translate interface and screen scrape the result.

You could also try BabelFish, which supports 75 languages.

Commentator Student Project
Google Translate

Contact Roedy. Please feel free to link to this page without explicit permission.
