url-encoded : Java Glossary

©1996-2010 2009-01-28 Roedy Green, Canadian Mind Products
url-encoded
A way of armouring, i.e. safely sending, awkward characters. Browsers use url-encoding on HTTP GET and POST requests to the server. With GET, they embed the data in the URL. Url-encoding is also used by the application/x-www-form-urlencoded MIME type.

You see url-encoding every time you do a Google search e.g.

http://www.google.com/search?client=opera&rls=en&q=%22rabbits%22%2BEaster+eggs&sourceid=opera&ie=utf-8&oe=utf-8
The request url-encodes my query:
"rabbits"+Easter eggs

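As a rough check, the q parameter of that URL can be decoded with java.net.URLDecoder. This is only a minimal sketch: it assumes Java 10 or later (for the Charset overload) and assumes UTF-8, as the ie=utf-8 parameter suggests. The class name QueryDecodeDemo is made up for illustration.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class QueryDecodeDemo {
    // Decodes one form-style value: %xy becomes a byte, + becomes a space.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The value of the q parameter from the search URL above.
        System.out.println(decode("%22rabbits%22%2BEaster+eggs"));
        // prints: "rabbits"+Easter eggs
    }
}
```
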
There are two flavours of url-encoding, one used in URLs, and one used in forms.

URL Encoding

Ironically, despite the name, you are not supposed to use java.net.URLEncoder.encode or java.net.URLDecoder.decode to encode or decode URLs or GET parameters, though they will work most of the time. Unfortunately, the URL class provides no escaping features. You must build the address with the URI class and convert it to a URL with toURL(). The encoding algorithm is described in RFC 3986.
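
The URI route described above might look something like the following. It is a sketch under assumptions, not a definitive recipe: the host www.example.com, the path, the query and the class name UriEncodeDemo are all invented for illustration. It relies on the five-argument URI constructor (scheme, authority, path, query, fragment), which percent-encodes characters that are illegal in each component.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class UriEncodeDemo {
    // Builds a URI from unencoded parts; illegal characters such as
    // spaces are percent-encoded by the multi-argument constructor.
    static URI build(String host, String path, String query) {
        try {
            return new URI("http", host, path, query, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) throws MalformedURLException {
        // Hypothetical host, path and query, chosen only for the demo.
        URI uri = build("www.example.com", "/my docs/read me.html", "q=Easter eggs");
        System.out.println(uri.toASCIIString());
        // prints: http://www.example.com/my%20docs/read%20me.html?q=Easter%20eggs

        URL url = uri.toURL(); // convert once the URI is properly escaped
        System.out.println(url);
    }
}
```

Note that characters which are legal in a component, such as & and = in the query, are left alone, so this does not escape an individual parameter value that itself contains them.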

To decode a String, you just feed it to the single-argument URI constructor, then extract the various fields with methods like URI.getPath().
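
A minimal sketch of that decoding step, again with a made-up address and a hypothetical class name UriDecodeDemo. It contrasts getPath(), which returns the decoded text, with getRawPath(), which returns the path exactly as written.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriDecodeDemo {
    // Parses an already-encoded address with the single-argument constructor.
    static URI parse(String text) {
        try {
            return new URI(text);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical address, chosen only for the demo.
        URI uri = parse("http://www.example.com/my%20docs/read%20me.html?q=Easter%20eggs");
        System.out.println(uri.getPath());    // prints: /my docs/read me.html
        System.out.println(uri.getRawPath()); // prints: /my%20docs/read%20me.html
        System.out.println(uri.getQuery());   // prints: q=Easter eggs
    }
}
```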

Properly speaking, you should not see bare & in URLs embedded in HTML; they should be pre-encoded as &amp;. I wrote a utility called Amper that processes *.html files to make this correction.

Form Encoding

Form url-encoding and decoding are handled by java.net.URLEncoder.encode and java.net.URLDecoder.decode. This is only intended for String data with a few awkward characters in it, not heavy-duty binary. Encodings you will likely use in conjunction with URLEncoder include ISO-8859-1 (Latin-1), UTF-8 and windows-1250.

When you use URLEncoder.encode you must specify a byte-oriented encoding such as UTF-8 or ISO-8859-1. The algorithm first converts the characters to bytes in that encoding, then encodes the bytes. Thus the encoded string depends on the encoding you choose. The encoding is not embedded in the output. You just have to know what it is when an incoming url-encoded string arrives.
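
To illustrate how the result depends on the chosen encoding, here is a small sketch that encodes the same String two ways. It assumes Java 10 or later for the Charset overload of URLEncoder.encode; the class and method names are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class CharsetMattersDemo {
    static String utf8(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    static String latin1(String s) {
        return URLEncoder.encode(s, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café", with e-acute as the last character
        System.out.println(utf8(text));   // prints: caf%C3%A9 (two bytes for e-acute)
        System.out.println(latin1(text)); // prints: caf%E9 (one byte for e-acute)
    }
}
```

A receiver that decodes caf%C3%A9 as ISO-8859-1 would get two garbage characters instead of the e-acute, which is why both ends have to agree on the encoding out of band.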

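java.net.URLEncoder uses the following set of characters to convert 8-bit data into printable characters: a to z, A to Z, 0 to 9, -, ., *, and _. It works like this: those characters pass through unchanged, a space becomes +, and every other character is converted to bytes in the chosen encoding, with each byte written as % followed by two hex digits.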

In the best case, your message is the same size as the original. In a pathological case, your message can balloon up to three times the original size.
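
The best and worst cases can be seen with a few short inputs. This is a sketch only, assuming Java 10 or later and UTF-8; the class name FormEncodeSizeDemo and its helper are invented for the example.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class FormEncodeSizeDemo {
    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Best case: only safe characters, so nothing changes.
        System.out.println(encode("a-b.c*d_e"));   // prints: a-b.c*d_e

        // A space becomes +, still one character.
        System.out.println(encode("Easter eggs")); // prints: Easter+eggs

        // Pathological case: every character needs %xy, tripling the size.
        String worst = encode("&&&");
        System.out.println(worst);                 // prints: %26%26%26
        System.out.println(worst.length());        // prints: 9
    }
}
```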

Learning More

Sun’s Javadoc on the URLEncoder class
Sun’s Javadoc on the URLDecoder class
Sun’s Javadoc on the URL class
Sun’s Javadoc on the URI class
Sun’s Javadoc on URI.toString
Sun’s Javadoc on URI.toASCIIString

You can get the freshest copy of this page from: http://mindprod.com/jgloss/urlencoded.html