Java Coding Part Two: Adding Authentication
Now that we understand how easy it is to fetch a URL from Java, what about sites that require passwords and proxy servers that require authentication? A complete discussion of the HTTP protocol is best left for side reading, but HTTP Basic authentication essentially requires that the username and password be joined in the username:password format and then Base64 encoded. Here too Java comes to the rescue by providing classes in the JDKClasses.zip file to do the hard work. Please note that this code assumes you know in advance that the site requires a password and that you have one. These few lines are all it takes:
URLConnection conn = destination.openConnection();
// Basic authentication sends "username:password", Base64 encoded
String authString = username + ":" + password;
// sun.misc.BASE64Encoder is an internal Sun class; newer JDKs provide java.util.Base64
String auth = "Basic " + new sun.misc.BASE64Encoder().encode(authString.getBytes());
// this is how you send the username and password to the site
conn.setRequestProperty("Authorization", auth);
This sure beats trying to figure out for yourself how to do the encoding and the subsequent handshaking with the web site. Note that the Java OpenSSL wrapper doesn't completely integrate with HTTPClient, meaning that you won't be able to use this convenient feature with HTTPS. In the sample code, however, I show how to get through a proxy server by hand (a few extra lines of code).
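For readers on a JDK that no longer ships sun.misc.BASE64Encoder, the header value described above can be sketched with the public java.util.Base64 class (available since Java 8). This is a minimal sketch, not the article's sample code: the class name BasicAuthHeader and the method build are hypothetical, and the choice of UTF-8 for converting the credentials to bytes is an assumption (the original snippet uses the platform default charset).

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, named here for illustration only.
final class BasicAuthHeader {
    private BasicAuthHeader() {}

    // Builds the value for an Authorization header using the Basic scheme:
    // the literal "Basic " followed by Base64("username:password").
    static String build(String username, String password) {
        String credentials = username + ":" + password;
        // Assumption: encode the credentials as UTF-8 before Base64 encoding.
        byte[] raw = credentials.getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        // Prints the header value for a sample username and password.
        System.out.println(build("Aladdin", "open sesame"));
    }
}
```

The returned string would be passed to conn.setRequestProperty("Authorization", ...) exactly as in the snippet above. Keep in mind that Base64 is an encoding, not encryption, so the password is only as private as the connection carrying it.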
Authenticating proxy servers require the same format as HTTP authentication, except for the last line, which sets the Proxy-Authorization header instead of Authorization. Disclaimer: I haven't tested this next part, since my proxy server doesn't require authentication; it controls access by IP address instead. I suspect that most proxy servers are similar, but I provide the code just the same.
URLConnection conn = destination.openConnection();
// this next part is only used for proxy servers that require authentication.
// Haven't tested it yet.
String authString2 = proxyInfo.username + ":" + proxyInfo.password;
String auth2 = "Basic " + new sun.misc.BASE64Encoder().encode(authString2.getBytes());
// same encoding as before, but sent in the Proxy-Authorization header
conn.setRequestProperty("Proxy-Authorization", auth2);
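The proxy case can be sketched in a self-contained form as well. The example below is an assumption-laden illustration rather than the article's sample code: it routes the connection through an explicit java.net.Proxy object (the article does not say how its proxy is configured), the names ProxyAuthSketch, basic, and open are hypothetical, and, like the snippet above, it has not been exercised against a real authenticating proxy. It is aimed at plain HTTP destinations only.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, named here for illustration only.
final class ProxyAuthSketch {
    private ProxyAuthSketch() {}

    // "Basic " + Base64("user:password"); UTF-8 is an assumption.
    static String basic(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    // Prepares (but does not connect) a connection that goes through the
    // given HTTP proxy and carries a Proxy-Authorization header.
    static URLConnection open(URL destination, String proxyHost, int proxyPort,
                              String proxyUser, String proxyPassword) throws IOException {
        // createUnresolved defers the DNS lookup until the connection is used.
        Proxy proxy = new Proxy(Proxy.Type.HTTP,
                InetSocketAddress.createUnresolved(proxyHost, proxyPort));
        URLConnection conn = destination.openConnection(proxy);
        conn.setRequestProperty("Proxy-Authorization", basic(proxyUser, proxyPassword));
        return conn;
    }
}
```

A caller would then read from conn.getInputStream() as usual; nothing is sent over the network until that point. If the destination site also needs a password, the Authorization header can be set on the same connection in the same way.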
So now you can access just about any site, be it FTP, HTTP, or HTTPS, subject to the HTTPS limits noted above. At this point I need to ask you to remember that any text you capture is copyrighted by the web site in question. I'm not a lawyer and I cannot tell you all the ramifications of that fact, but you should use common sense and discretion all the same. When in doubt, contact the owner of the web site and ask for permission. The only place this should not be required is for public records produced by the government, which may be reproduced without express permission.