2

I tried to use HTTrack and SiteSucker to download the content of a website that requires a login, but the pages they downloaded are all things like the login form, the registration page and seekpasswords.html, without the real content. The website is much like Treehouse or Udemy. I bought a course on it, but the course expires next month, so I was hoping to save the content before it vanishes (is that illegal? I don't know).

Neither HTTrack nor SiteSucker asked me for my login info, though. So how can I download the content, or is it simply impossible?

Thanks!

Hang Chen
  • 133
  • 1
  • 5

1 Answer

0

What you have to do is use a program called Website Ripper Copier Pro. It lets you insert your own cookies, which maintains access to the subscriber content throughout the download. It's still a fiddly process and you may get unexpectedly logged out, so play with the settings and adjust accordingly. Below is what worked for me:

  • Open Internet Explorer and log into the site
  • Then open the site in Chrome and log in. Right-click the page, go to Inspect / Application (in the top bar) / Cookies (in the sidebar) and select your website from the drop-down list, which then shows all of its cookies on the right side
  • Then go into Website Ripper Copier Pro / Start a new project / Select 'Copy websites for offline browsing' / Enter the starting address
  • Click next until you come to Advanced page filters
  • To filter links by URL: click on 'URL filters' and enter the log-out link of your website (right-click it in Chrome and copy) under 'Exclusion'. Also add any other links on the site you don't want to download
  • To filter links by description: click on 'Description filters' and add the keywords log out, logout, sign out, signout to the 'Exclusion' section
  • Then click next and choose your save destination. Also click on 'Cookies' in the same top bar
  • Tick 'Directly accept and return cookies', tick 'Import cookies from Internet Explorer' and tick 'Use your own cookies'
  • Click 'add' under 'Use your own cookies' and transfer the data you can see in Chrome for your website's cookies, one cookie at a time (tedious work, yes)
  • Click 'Run now'
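
The idea behind these steps (send the browser's session cookies with every request, and never follow the log-out link) is not specific to Website Ripper Copier Pro. As a rough illustration only, here is a minimal Python sketch using just the standard library. The cookie names, the keyword list and the helper names `build_cookie_header`, `is_excluded` and `fetch` are all made up for this example, and it assumes the site accepts cookies copied from Chrome as described above.

```python
import urllib.request

# Hypothetical keyword list, mirroring the description filters in the steps above.
LOGOUT_KEYWORDS = ("logout", "log out", "log-out", "signout", "sign out", "sign-out")


def build_cookie_header(cookies):
    """Join name/value pairs copied from the browser into a Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def is_excluded(url, link_text=""):
    """Return True for links that would end the session, such as a log-out link.

    This is a crude substring match on the URL and the link text.
    """
    haystack = f"{url} {link_text}".lower()
    return any(keyword in haystack for keyword in LOGOUT_KEYWORDS)


def fetch(url, cookies):
    """Request one page while sending the copied session cookies."""
    request = urllib.request.Request(
        url, headers={"Cookie": build_cookie_header(cookies)}
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

To try it, you would copy each cookie's name and value from Chrome into a dict, for example `{"sessionid": "abc123"}` (a placeholder, not a real cookie), and call `fetch(page_url, cookies)` only for links where `is_excluded(...)` returns False. If the pages that come back are still login forms, the cookies have probably expired or the site checks something else as well.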
Toto
  • 19,304