
I've created a web crawler using Java and Play Framework 1.2.3. Now I'd like to crawl some web pages protected by a classic login/password form.

In effect, it's like this Play test:

@Test
public void someTestOfASecuredAction() {
    Map<String, String> loginUserParams = new HashMap<String, String>();
    loginUserParams.put("username", "admin");
    loginUserParams.put("password", "admin");
    Response loginResponse = POST("/login", loginUserParams);

    Request request = newRequest();
    request.cookies = loginResponse.cookies; // this makes the request authenticated
    request.url = "/some/secured/action";
    request.method = "POST";
    request.params.put("someparam", "somevalue");
    Response response = makeRequest(request);
    assertIsOk(response); // Passes!
}

The difference is that the target is an external website, not the site served by Play.

So I tried to do that with Play's WS client:

Map<String,String> params = new HashMap<String, String>();
params.put( "telecom_username", 
            Play.configuration.getProperty("telecom.access.user") );

params.put( "telecom_password", 
            Play.configuration.getProperty("telecom.access.pass") );

HttpResponse response = WS.url(loginUrl)
                        .setParameters(params)
                        .followRedirects(true)
                        .post();

When I do that and look at response.getString(), I see the redirection page where the cookies are set before continuing. But if I then request a protected page, I'm still not logged in. It's as if the cookies were never set, and the HttpResponse object has no cookie-related methods, unlike the Response in the test code above.

I've also tried the authenticate() method on WS.url(), but it doesn't work either.

I don't know whether what I'm trying to do is possible with Play's WS client, but I could use some help with this.

Thanks a lot!

Camille Laborde

1 Answer


OK, I found a way to do it, although I did it the hard way. Here it is:

First, the GET, where we store the session cookie. Note the charset I'm using, and that I knew the name of the cookie I was looking for; you could store them all instead. You may also want to encrypt them.

HttpResponse wsResponse = WS.url(comercialYComunUrl).get();
String responseString = wsResponse.getString("ISO-8859-1");

if (wsResponse.getStatus() == 200) {
  List<Header> headers = wsResponse.getHeaders();
  // get all the cookies
  List<String> cookies = new ArrayList<String>();
  for (Header header: headers) {
    if (header.name.equalsIgnoreCase("Set-Cookie")) { // header names are case-insensitive
      cookies.addAll(header.values); // keep the values of every Set-Cookie header
    }
  }
  // look for the session cookies
  String sessionCookie = "";
  for (String cookie : cookies) {
    if (cookie.toUpperCase().contains("ASPSESSIONID")) {
      sessionCookie = cookie.split(";", 2)[0];
    }
  }
  // store it on the session
  session.put("COOKIE", sessionCookie);
}
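
To keep every cookie rather than one known name, the Set-Cookie values can be folded into a single Cookie request header. The following is a minimal sketch in plain Java, independent of Play; the class CookieHeaders and the method toCookieHeader are hypothetical names made up for this illustration. It keeps only the name=value part of each cookie and ignores attributes such as path or expiry, so it is not a full cookie implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Play or the JDK.
final class CookieHeaders {
    private CookieHeaders() {}

    // Builds a "Cookie" request header value from raw Set-Cookie header values.
    static String toCookieHeader(List<String> setCookieValues) {
        // A later cookie with the same name replaces the earlier value.
        Map<String, String> pairs = new LinkedHashMap<String, String>();
        for (String value : setCookieValues) {
            // Everything after the first ';' is an attribute (path, expires, ...).
            String pair = value.split(";", 2)[0].trim();
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip malformed entries without a name
            }
            pairs.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        StringBuilder header = new StringBuilder();
        for (Map.Entry<String, String> entry : pairs.entrySet()) {
            if (header.length() > 0) {
                header.append("; ");
            }
            header.append(entry.getKey()).append('=').append(entry.getValue());
        }
        return header.toString();
    }
}
```

The returned string could then be stored in place of the single session cookie, for example with session.put("COOKIE", CookieHeaders.toCookieHeader(cookies)).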

And now the POST:

String url = "http://www.url.com/";
String charset = "ISO-8859-1";
String param1 = "value1";
String param2 = "value2";
String param3 = "value3";

String query = String.format("param1=%s&param2=%s&param3=%s", 
     URLEncoder.encode(param1, charset),
     URLEncoder.encode(param2, charset),
     URLEncoder.encode(param3, charset));

URLConnection connection = new URL(url).openConnection();
connection.setDoOutput(true); // Triggers POST.
connection.setRequestProperty("Accept-Charset", charset);
connection.setRequestProperty("Content-Type",
    "application/x-www-form-urlencoded;charset=" + charset);
connection.addRequestProperty("Cookie", session.get("COOKIE"));
OutputStream output = null;
try {
     output = connection.getOutputStream();
     output.write(query.getBytes(charset));
} finally {
     if (output != null) try { output.close(); } catch (IOException logOrIgnore) {}
}
InputStream responseStream = connection.getInputStream();

StringWriter writer = new StringWriter();
IOUtils.copy(responseStream, writer, charset); // decode with the same charset as the request
String response = writer.toString();

That worked for me. My source was this great post: How to use java.net.URLConnection to fire and handle HTTP requests?
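
The hand-written String.format query above is easy to get wrong when parameters are added or renamed. As a rough sketch, the same form body can be built from a map; the class FormEncoder and its encode method are hypothetical names for this illustration, and only java.net.URLEncoder from the JDK is used.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical helper for illustration; not part of Play or the JDK.
final class FormEncoder {
    private FormEncoder() {}

    // Encodes parameters as an application/x-www-form-urlencoded body.
    static String encode(Map<String, String> params, String charset) {
        StringBuilder query = new StringBuilder();
        try {
            for (Map.Entry<String, String> entry : params.entrySet()) {
                if (query.length() > 0) {
                    query.append('&');
                }
                query.append(URLEncoder.encode(entry.getKey(), charset))
                     .append('=')
                     .append(URLEncoder.encode(entry.getValue(), charset));
            }
        } catch (UnsupportedEncodingException e) {
            // Rethrow unchecked so callers are not forced to handle a bad charset name.
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
        return query.toString();
    }
}
```

With a LinkedHashMap the parameters keep their insertion order, and the result can be written to the connection with output.write(query.getBytes(charset)) as in the code above.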

----------------------------EDIT-------------------------------------

OK, I wasn't all that happy with that answer, so I found a better way:

String url = "http://www.url.com/";
String charset = "ISO-8859-1";

String param1 = "value1";
String param2 = "value2";
String param3 = "value3";

WSRequest wsRequest = WS.url(url);
wsRequest.parameters.put("param1", param1);
wsRequest.parameters.put("param2", param2);
wsRequest.parameters.put("param3", param3);
wsRequest.headers.put("Cookie", session.get("COOKIE"));
HttpResponse wsResponse = wsRequest.post();
String responseString = wsResponse.getString(charset);

And it works ^.^

Chango