HTTP : Java Glossary

©1996-2009 2008-08-22 Roedy Green, Canadian Mind Products
HTTP
Hypertext Transfer Protocol. A protocol used on the Internet by web browsers to transport text and graphics. It focuses on grabbing a page at a time, rather than setting up a session. Applets also use it to download jars, classes and resources. Browsers use it to download files and images, not just HTML text.
  • Browser To Server
  • Server To Browser
  • Language & Charset
  • Sample Code
  • Under the Hood
  • Speeding Up HTTP
  • response codes
  • Learning More
  • Links

Message Headers From Browser To Server

Fields in the headers let browsers and servers communicate. For example:
HTTP Headers that Browsers Send Servers
Field Typical Value Meaning
User-Agent: Which browser is being used/simulated. Typical values:
  • Java.exe default: Java/1.6.0_17 (Last revised/verified: 2009-09-28)
  • Opera: Opera/9.80 (Windows NT 6.0; U; en) Presto/2.2.15 Version/10.00 (Last revised/verified: 2009-09-28)
  • Firefox: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.4) Gecko/20091016 Firefox/3.5.4 (.NET CLR 3.5.30729) (Last revised/verified: 2009-10-27)
  • Sea Monkey: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.4) Gecko/20091017 SeaMonkey/2.0 (Last revised/verified: 2009-10-27)
  • Flock: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.13) Gecko/2009080717 Firefox/3.0.13 (Last revised/verified: 2009-09-28)
  • Safari: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/531.9 (KHTML, like Gecko) Version/4.0.3 Safari/531.9.1 (Last revised/verified: 2009-09-28)
  • Avant: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Trident/4.0; Avant Browser; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.40729; .NET CLR 3.0.30729) (Last revised/verified: 2009-09-28)
  • Chrome: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/532.0 (KHTML, like Gecko) (Last revised/verified: 2009-09-28)
  • IE 8: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729) (Last revised/verified: 2009-09-28)
  • IE 7: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506)
  • Netscape: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.5pre) Gecko/20070710 firefox/2.0.0.4 Navigator/9.0b2
Host: localhost:8081 destination host, in the form server:port.
Accept: application/xhtml+voice+xml;version=1.2, application/x-xhtml+voice+xml;version=1.2, application/x-shockwave-flash,text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,text/css,*/*;q=0.1
or
text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
MIME types the browser is willing to accept. The encoding of this field is described in RFC 2616 section 14 and in the more friendly w3.org version. Roughly, the q numbers define your preference: the higher the number, the higher the preference. The default is 1. The q applies to the preceding MIME type. You set this with URLConnection.setRequestProperty( "Accept", … ); not "accept" as the Sun docs erroneously suggest.
Accept-Language: en Language the browser is willing to accept.
Accept-Charset: windows-1252, utf-8, utf-16, iso-8859-1;q=0.6, *;q=0.1 Character set encodings the browser is willing to accept.
Accept-Encoding: deflate, gzip, x-gzip, identity, *;q=0 compression schemes the browser is willing to accept.
  • deflate: zlib format defined in RFC 1950 plus the deflate compression mechanism described in RFC 1951. This is a stripped down gzip without the header.
  • gzip, alias x-gzip: Java-style gzip RFC 1952 Lempel-Ziv coding with a 32 bit CRC.
  • compress, alias x-compress, UNIX compress
  • identity means as-is, no compression. Use it in the Accept-Encoding request header, but not the Content-Encoding header. Just leave out the Content-Encoding if it is identity.
Referer: http://mindprod.com/jgloss/http.html the web page that contained the link that triggered this request.
If-Modified-Since: Mon, 06 Feb 2006 01:24:23 GMT Only bother with the request if the file has changed since this date, otherwise the browser already has a copy in cache.
Connection: Keep-Alive asks the server to keep the socket open for further messages. Persistent connections are the default in HTTP 1.1, so you don’t need to send it.
Keep-Alive: 300 asks the server to keep the socket open for 300 seconds for further messages.
Pragma: no-cache requests getting a fresh copy from the server, rather than from a cache.
Content-Type: application/x-www-form-urlencoded MIME type of the payload to the server.
Content-Length: 114 length in encoded bytes of the payload to the server.
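
As a rough illustration of the table above, here is a minimal sketch that sets several of these fields on a java.net.URLConnection before anything is sent. The class name RequestHeaderSketch, the method name prepare and the User-Agent value are made up for this example; URL.openConnection, setRequestProperty and getRequestProperty are standard JDK calls. The sketch never connects, so it needs no server.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

final class RequestHeaderSketch {
    /** Builds an unconnected URLConnection carrying browser-like request headers. */
    static URLConnection prepare(String address) throws IOException {
        // openConnection only creates the object; no socket is opened yet.
        URLConnection conn = new URL(address).openConnection();
        // Hypothetical User-Agent value, chosen only for this example.
        conn.setRequestProperty("User-Agent", "Mozilla/5.0 (compatible; ExampleClient/1.0)");
        // Note the capitalised field name "Accept".
        conn.setRequestProperty("Accept", "text/html,text/plain;q=0.8,*/*;q=0.1");
        conn.setRequestProperty("Accept-Language", "en");
        conn.setRequestProperty("Accept-Encoding", "gzip, deflate");
        conn.setRequestProperty("Referer", "http://mindprod.com/jgloss/http.html");
        // Host and Content-Length are left for the JDK to fill in.
        return conn;
    }

    public static void main(String[] args) throws IOException {
        URLConnection conn = prepare("http://localhost:8081/");
        System.out.println(conn.getRequestProperties());
    }
}
```

If you ask for gzip or deflate yourself like this, you should expect to decompress the response yourself as well.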

Beware of using HttpURLConnection.setFollowRedirects( false ); it reportedly causes trouble in recent JDKs. Even when following redirects is set true, HttpURLConnection will not automatically follow pages that redirect with: <META HTTP-EQUIV="Refresh".
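
Because a META refresh lives in the HTML rather than in the HTTP headers, code that wants to follow one has to find it by hand. The following is only a sketch under simplifying assumptions: the class name MetaRefreshSketch and method refreshTarget are hypothetical, and the regular expression handles only the common attribute order (http-equiv first, then content with a quoted "seconds; URL=target" value). A real page may need a proper HTML parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MetaRefreshSketch {
    // Matches e.g. <META HTTP-EQUIV="Refresh" CONTENT="0; URL=http://example.com/new.html">
    private static final Pattern REFRESH = Pattern.compile(
            "<meta\\s+http-equiv\\s*=\\s*[\"']?refresh[\"']?\\s+content\\s*=\\s*[\"']"
                    + "\\s*\\d+\\s*;\\s*url\\s*=\\s*([^\"'>\\s]+)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the refresh target URL if the page contains a matching META tag. */
    static Optional<String> refreshTarget(String html) {
        Matcher m = REFRESH.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String page = "<html><head><META HTTP-EQUIV=\"Refresh\" "
                + "CONTENT=\"0; URL=http://example.com/new.html\"></head></html>";
        System.out.println(refreshTarget(page).orElse("no refresh tag"));
    }
}
```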

Message Headers From Server To Browser

HTTP Headers that Servers Send Browsers
Field Typical Value Meaning
Server: Apache/2.0.55 (NETWARE) mod_perl/1.99_12 Perl/v5.8.4 Which server software being used.
Accept-Ranges: bytes Informs the browser that the server supports downloading just parts of files, down to single-byte granularity.
Keep-Alive: timeout=15, max=99 how long to keep this socket open for more messages.
Connection: Keep-Alive asks the browser to keep the socket open for further messages.
Content-Type: image/png MIME type of the payload from the server. Also used to specify the character set encoding, e.g. Content-Type: text/html; charset=utf-8
Content-Encoding: gzip gzip or x-gzip or deflate, or not present if no compression.
Content-disposition: attachment;filename="smile.png" Server suggests a filename to save this download under.
Content-Length: 842 length in encoded bytes of the payload from the server.
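
A client that advertises compression has to undo it according to the Content-Encoding field. Here is a minimal sketch of that dispatch using java.util.zip. The class name ContentEncodingSketch and the methods decode and drain are hypothetical names for this example. It assumes deflate means the zlib-wrapped form described above; some servers reportedly send raw deflate data instead, which this sketch does not handle.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

final class ContentEncodingSketch {
    /** Wraps the raw body stream according to the Content-Encoding value (null if absent). */
    static InputStream decode(InputStream raw, String contentEncoding) throws IOException {
        if (contentEncoding == null) {
            return raw; // header not present: no compression
        }
        switch (contentEncoding.trim().toLowerCase(Locale.ROOT)) {
            case "identity":
                return raw;
            case "gzip":
            case "x-gzip":
                return new GZIPInputStream(raw);
            case "deflate":
                return new InflaterInputStream(raw); // expects the zlib wrapper
            default:
                throw new IOException("Unsupported Content-Encoding: " + contentEncoding);
        }
    }

    /** Reads a stream to the end and returns its bytes. */
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Simulate a gzip-compressed response body, then decode it.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write("hello".getBytes(StandardCharsets.UTF_8));
        }
        InputStream body = decode(new ByteArrayInputStream(compressed.toByteArray()), "gzip");
        System.out.println(new String(drain(body), StandardCharsets.UTF_8));
    }
}
```

With a real connection, the header value would come from URLConnection.getContentEncoding() and the raw stream from getInputStream().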

In the real world, the conversations between browser/client and server are much more complicated and slipshod than you might suppose. Each query often results in a flurry of permanent and temporary redirects back and forth. Each element on an HTML page must be requested independently. Sometimes servers will send back a fail error code, then send the page anyway. Or they will send a 404 with an OK text response code. Sometimes servers refuse HEAD requests, but accept the equivalent GET. Sometimes servers send back https: in response to an http: request. Sometimes servers give you a totally different page from the one you requested and don’t tell you the one you wanted is no longer available. Sometimes servers redirect to localhost, or send back gibberish messages. Sometimes a server won’t send you a page if you have recently requested it; it expects you to have cached it. Browsers just do their best to muddle through. When you start emulating browsers with code, you get pretty flaky programs.

Language and Charset

You might wonder where the server encodes the language and character set. The Content-Type header can carry a charset, but often the information is instead embedded in the HTML documents, with tags like this:
<!-- embedding language and charset inside an HTML document -->
<meta http-equiv="Content-Language" content="en">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
Embedding this information makes it easier for web page authors to control, even if it makes finding the information slightly more difficult for the browser.
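
Since the charset may turn up in the Content-Type header or in a meta tag, a client has to look in both places. The sketch below tries the header first, then the start of the document. The class name CharsetSketch and method detect are hypothetical. Falling back to ISO-8859-1 is an assumption of this example, and the pattern is deliberately crude: it takes the first charset= it sees, wherever that appears.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CharsetSketch {
    // Finds charset=NAME in a header value or in a meta tag's content attribute.
    private static final Pattern CHARSET = Pattern.compile(
            "charset\\s*=\\s*[\"']?([^\\s;\"'>]+)", Pattern.CASE_INSENSITIVE);

    /**
     * Picks a charset from the Content-Type header if it names one,
     * otherwise from the beginning of the HTML; either argument may be null.
     */
    static Charset detect(String contentTypeHeader, String htmlPrefix) {
        for (String source : new String[] { contentTypeHeader, htmlPrefix }) {
            if (source == null) {
                continue;
            }
            Matcher m = CHARSET.matcher(source);
            if (m.find()) {
                try {
                    return Charset.forName(m.group(1));
                } catch (IllegalArgumentException e) {
                    // Unknown or malformed charset name: try the next source.
                }
            }
        }
        return StandardCharsets.ISO_8859_1; // assumed default for this sketch
    }

    public static void main(String[] args) {
        String meta = "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=iso-8859-1\">";
        System.out.println(detect("text/html; charset=utf-8", meta));
        System.out.println(detect("text/html", meta));
    }
}
```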

Sample Code

This code for doing GET and POST is from the com.mindprod.http package. You can download the whole package.

Code to do a GET

Code to do a POST

Code to do a HEAD

Code to do a Fetch (generic GET)

Code to do a PROBE (check if page present)

Base class for GET, POST, and PROBE

Code to READ

Read the response either as bytes (readBytesBlocking) or converted to a String (readStringBlocking):
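
The com.mindprod.http listings are not reproduced here. As a stand-in, here is a generic sketch of the two reading styles: drain an InputStream into a byte array, or decode those bytes into a String. The class name ReadSketch and the methods readBytes and readString are hypothetical and are not the package's own readBytesBlocking and readStringBlocking.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ReadSketch {
    /** Blocks until end of stream and returns everything read as bytes. */
    static byte[] readBytes(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        // read returns -1 only when the server has finished sending.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    /** Reads the whole stream, then decodes it with the given charset. */
    static String readString(InputStream in, Charset charset) throws IOException {
        return new String(readBytes(in), charset);
    }

    public static void main(String[] args) throws IOException {
        // A byte array stands in for the stream from URLConnection.getInputStream().
        InputStream fake = new ByteArrayInputStream("<html></html>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readString(fake, StandardCharsets.UTF_8));
    }
}
```

Decoding only after all the bytes have arrived avoids splitting a multi-byte character across two reads.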

Under the Hood

What happens when your Java-based browser requests a page?
  1. URL.openConnection just sets up a place to build the HTTP header. It does no communicating with the outside world.
  2. HttpURLConnection.connect() requests sending the header to the server.
  3. This request triggers opening a TCP/IP socket connection to the server. This is done by sending a SYN connection request packet. The server sends back a SYN+ACK. Then the client sends an ACK, upon which may be piggybacked some data.
  4. This triggers sending the GET header composed of all the header fields set up before the .connect call. The GET request header includes a list of the encodings and compression algorithms the browser would like in response. .connect does not return until the HTTP header is safely sent out on the wire.
  5. The browser calls HttpURLConnection.getResponseCode to see if the request went OK. This blocks until the server responds with an HTTP response header.
  6. Then the browser calls HttpURLConnection.getInputStream and reads the text of the message from the server containing the requested web page. Using the standard TCP/IP protocol flow-control features, the server sends data only as fast as the browser can read it.
  7. The browser then scans the web page for the URLs of embedded images and puts out GET requests for them.
  8. Then various images usually come back from the server on the original socket. The browser could elect to request each image on its own socket so they can arrive simultaneously.
The stream is made purely of printable characters. The server can detect the start of a new GET request by looking for line terminators.
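
To make the text format concrete, here is a small sketch that assembles the characters of a GET request header by hand: a request line, a Host line, any further fields, then an empty line, each terminated by CR LF. The class name RawRequestSketch and method buildGet are hypothetical. In normal use HttpURLConnection composes this text for you.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RawRequestSketch {
    /** Composes the text of an HTTP 1.1 GET request header. */
    static String buildGet(String host, String path, Map<String, String> headers) {
        StringBuilder sb = new StringBuilder();
        // Request line: method, path, protocol version.
        sb.append("GET ").append(path).append(" HTTP/1.1\r\n");
        sb.append("Host: ").append(host).append("\r\n");
        for (Map.Entry<String, String> e : headers.entrySet()) {
            sb.append(e.getKey()).append(": ").append(e.getValue()).append("\r\n");
        }
        // An empty line marks the end of the header.
        sb.append("\r\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Accept-Language", "en");
        headers.put("Accept-Encoding", "gzip, deflate");
        System.out.print(buildGet("localhost:8081", "/index.html", headers));
    }
}
```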

Speeding Up HTTP

There are several things you might consider to speed up HTTP transmissions.

Learning More

RFC 1945 HTTP 1.0 specification.

RFC 2045 MIME Part One: Format of Internet Message Bodies, specifies the various headers used to describe the structure of MIME messages.

RFC 2046 MIME Part Two: Media Types, describes the general structure of the MIME media typing system and defines an initial set of media types.

RFC 2047 MIME Part Three: Message Header Extensions for non-ASCII text.

RFC 4288 and RFC 4289 MIME Part Four: Registration Procedures.

RFC 2049 MIME Part Five: Conformance Criteria and Examples, provides some illustrative examples of MIME message formats.

RFC 2183 The Content-Disposition header field.

RFC 2616 HTTP 1.1 specification.

RFC 2617 for details on how to send username and password in HTTP headers to restrict access.

Sun’s Javadoc on URLConnection class
Sun’s Javadoc on HttpURLConnection class
Avant
browser
CGI
Chrome
Details on HTTP headers
File I/O Amanuensis: to see how to write code that reads and writes via HTTP-CGI
Firefox
Flock
forms: see the raw socket information exchanged
HTTP Client
IE
MIME
network properties
Opera
remote file access
response codes
RFC
Safari
Sea Monkey

You can get the freshest copy of this page from: http://mindprod.com/jgloss/http.html