26

I need to write a Python script that makes multiple HTTP requests to the same site. Unless I'm wrong (and I may very well be), urllib reauthenticates for every request. For reasons I won't go into, I need to be able to authenticate once and then use that session for the rest of my requests.

I'm using Python 2.3.4.

Hector Scout
  • Authentication is driven by the site. If they ask for authentication (via 401 response), your client can provide it. You can (sometimes) head this off. Depends on the site's use of the Nonce. – S.Lott May 28 '09 at 23:22

3 Answers

29

Use the Requests library. From http://docs.python-requests.org/en/latest/user/advanced/#session-objects :

The Session object allows you to persist certain parameters across requests. It also persists cookies across all requests made from the Session instance.

import requests

s = requests.Session()

# The first request sets a cookie; the session sends it back on the second.
s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get("http://httpbin.org/cookies")

print(r.text)
# '{"cookies": {"sessioncookie": "123456789"}}'
Piotr Dobrogost
26

If you want to keep the authentication, you need to reuse the cookie. I'm not sure whether urllib2 is available in Python 2.3.4, but here is an example of how to do it:

import urllib2

# url1 and url2 stand for the URLs you are requesting
req1 = urllib2.Request(url1)
response = urllib2.urlopen(req1)
cookie = response.headers.get('Set-Cookie')

# Use the cookie in subsequent requests
req2 = urllib2.Request(url2)
req2.add_header('cookie', cookie)
response = urllib2.urlopen(req2)
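
Note that a raw Set-Cookie value usually carries attributes (Path, Expires, HttpOnly and so on) that should not be echoed back in a Cookie header. Here is a minimal sketch of stripping them, assuming Python 3's http.cookies module (it is called Cookie in Python 2). The helper name cookie_header_from_set_cookie and the sample cookie value are made up for illustration:

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_value):
    """Build a Cookie request-header value from one Set-Cookie value.

    Hypothetical helper: it keeps only the name=value pairs and drops
    attributes such as Path or HttpOnly.
    """
    parsed = SimpleCookie()
    parsed.load(set_cookie_value)
    return "; ".join(
        "%s=%s" % (name, morsel.value) for name, morsel in parsed.items()
    )


# Sample Set-Cookie value (an assumption, not taken from a real site)
header = cookie_header_from_set_cookie("sessionid=abc123; Path=/; HttpOnly")
print(header)
```

This still ignores the domain, path and expiry matching a real client performs, so a cookie jar is the safer choice for anything beyond a quick script.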
Nadia Alramli
  • python 2.3.4 does have urllib2. Thanks – Hector Scout Jun 01 '09 at 14:35
  • That's not as straightforward as you show. See RFC 6265, section *5.4 The Cookie Header*, where you find this statement: *The user agent MUST use an algorithm equivalent to the following algorithm to compute the "cookie-string" from a cookie store and a request-uri:* followed by the algorithm itself. – Piotr Dobrogost Oct 07 '12 at 19:34
16

Python 2

If this is cookie-based authentication, use HTTPCookieProcessor:

import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")

If this is HTTP authentication, use HTTPBasicAuthHandler or HTTPDigestAuthHandler:

import urllib2
# Create an OpenerDirector with support for Basic HTTP Authentication...
auth_handler = urllib2.HTTPBasicAuthHandler()
auth_handler.add_password(realm='PDQ Application',
                          uri='https://mahler:8092/site-updates.py',
                          user='klem',
                          passwd='kadidd!ehopper')
opener = urllib2.build_opener(auth_handler)
# ...and install it globally so it can be used with urlopen.
urllib2.install_opener(opener)
urllib2.urlopen('http://www.example.com/login.html')

... and use the same opener for every request.

Python 3

In Python 3, urllib2 and cookielib were moved to urllib.request and http.cookiejar respectively.
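
A rough Python 3 sketch that combines the cookie and Basic auth handlers in one opener. The function name build_session_opener, the URLs and the credentials are placeholders, and it assumes the site uses Basic auth and/or cookies:

```python
import http.cookiejar
import urllib.request


def build_session_opener(base_url, user, password):
    """Return (opener, jar); reuse the opener for every request."""
    jar = http.cookiejar.CookieJar()
    # Passing None as the realm lets the credentials match any realm
    # announced for URLs under base_url.
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, base_url, user, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar),
        urllib.request.HTTPBasicAuthHandler(password_mgr),
    )
    return opener, jar


# Placeholder URL and credentials, for illustration only
opener, jar = build_session_opener("http://example.com/", "klem", "secret")
# opener.open("http://example.com/login.html")  # cookies set here land in jar
# opener.open("http://example.com/other")       # ...and are sent back here
```

The open() calls are left commented out so the snippet runs without network access. Cookies received through the opener are stored in jar and sent back on later requests made with the same opener.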

lispmachine