Download files from a URL – Python script

Downloading files from a web server is a routine task that happens all the time in our everyday life. Isn't it?

For example, when you access a web site, your browser downloads some files to your computer. Usually these are .html files, often along with .css and .js files. You don't notice it because the browser handles everything itself. However, there can be situations where you need to access a specific URL and download some files from it. No doubt this can be done manually, but what if there are hundreds of files to download? This article discusses how to get it done with minimal effort using a Python script.

Disclaimer: Using a script to access or download content from a web server can generate more traffic to that server than normal browsing does. Therefore, do not use this knowledge to access any resource on a web server without getting prior permission. Use this script at your own risk. The author bears no responsibility.

For this purpose we are going to use urllib, a library that is built into Python. The urlopen function and the read method of the object it returns give access to the content of the resource, and the write method of a file object saves that content to the local machine. Here is how the code is organized.

Python Version 2.7.15

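import urllib

url = 'http://url-for-resources'

# Open the URL and read the whole response body into memory
webf = urllib.urlopen(url)
txt = webf.read()
webf.close()

# 'wb' writes the bytes unchanged, so binary files are not corrupted
with open('local-machine-folder-path', 'wb') as f:
	f.write(txt)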
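
The script above targets Python 2.7. On Python 3, urlopen lives in urllib.request instead, so the same idea could be sketched roughly as follows. This is only a sketch under that assumption: the function name download and its two parameters are made up for illustration and are not part of the original script.

```python
import shutil
from urllib.request import urlopen  # Python 3 location of urlopen


def download(url, dest_path):
    """Fetch url and save the response body to dest_path, byte for byte."""
    # Both handles are closed automatically, even if the copy fails midway.
    with urlopen(url) as response, open(dest_path, "wb") as out:
        # Copy in chunks so a large file is never held in memory at once.
        shutil.copyfileobj(response, out)
    return dest_path
```

Calling download with the URL of the resource and a local path would then save one file, in the same way as the Python 2 version.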

It is pretty straightforward. The next objective is to automate the process, and using a loop is the easiest way. There is one issue to overcome, though: how to get the file names. If the file names follow a regular sequence, for example data_01.json, data_02.json, data_03.json and so on, the sequence can be generated in the code. Otherwise the file names have to be supplied to the code. One solution is to save the file names in a text file and have the script read them from there.
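
For the regular-sequence case, a minimal sketch of generating the names could look like this. The range of 1 to 3 and the two-digit padding are assumptions based on the example names; adjust them to match the real files.

```python
# Assumed naming pattern from the example: data_01.json, data_02.json, ...
first, last = 1, 3

# %02d pads the counter with a leading zero to two digits.
file_list = ["data_%02d.json" % n for n in range(first, last + 1)]

print(file_list)
```

The resulting list can be looped over in exactly the same way as a list of names read from a file.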

Here is the Python script for the above approach.


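import urllib

# Read the list of file names, one name per line
with open("local-file-with-file-names", "r") as names:
	file_list = names.readlines()

for x in file_list:

	filename = x.strip()
	if not filename:
		continue  # skip blank lines

	url = 'http://url-for-resources/' + filename

	# Fetch the remote file
	webf = urllib.urlopen(url)
	txt = webf.read()
	webf.close()

	# Save it under the same name; 'wb' keeps binary content intact
	with open('local-file-path-to-save/' + filename, 'wb') as f:
		f.write(txt)

	print('Saving file ' + filename)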

The file that lists the downloads must contain one file name per line.
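
With hundreds of files, a single missing file or network hiccup would stop the whole loop. One possible way to keep going is sketched below for Python 3, where urlopen lives in urllib.request. The function name download_all and its parameters are hypothetical, and the sketch assumes base_url already ends with a slash.

```python
import os
import shutil
from urllib.request import urlopen  # Python 3 location of urlopen


def download_all(base_url, names, dest_dir):
    """Download base_url + name for each name; return the names that failed."""
    failed = []
    for name in names:
        target = os.path.join(dest_dir, name)
        try:
            # urlopen runs first, so no local file is created if the fetch fails.
            with urlopen(base_url + name) as response, open(target, "wb") as out:
                shutil.copyfileobj(response, out)
        except OSError as err:
            # URLError and HTTPError are subclasses of OSError in Python 3.
            print("Failed to save " + name + ": " + str(err))
            failed.append(name)
    return failed
```

The returned list makes it easy to see which names to retry once the run has finished.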
