Commit 1d2f76a

Simple pdf downloader
1 parent 9d84ae8 commit 1d2f76a

2 files changed: +21 -0 lines changed

PDF_Downloader/pdf.py

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+import os
+import requests
+from urllib.parse import urljoin
+from bs4 import BeautifulSoup
+
+# Set this to the URL of the page whose linked PDFs you want to download
+url = ""
+
+# The folder is created automatically if it does not exist yet
+folder_location = r'./NewFolder'
+os.makedirs(folder_location, exist_ok=True)
+
+response = requests.get(url)
+soup = BeautifulSoup(response.text, "html.parser")
+for link in soup.select("a[href$='.pdf']"):
+    # Name each PDF after the last segment of its link; these are unique in this case
+    filename = os.path.join(folder_location, link['href'].split('/')[-1])
+    with open(filename, 'wb') as f:
+        f.write(requests.get(urljoin(url, link['href'])).content)
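
The script names each file after the last segment of its link and relies on those names being unique. A minimal sketch of how that step might be hardened, assuming two links can share a final segment or contain percent-escapes, is shown below. The helper name `pdf_targets` and the `example.com` URLs are hypothetical and not part of this commit; only the standard library is used.

```python
import os
from urllib.parse import unquote, urljoin, urlsplit


def pdf_targets(page_url, hrefs, folder):
    """Map each href to an (absolute_url, local_path) pair with unique paths.

    Hypothetical helper, not part of the original commit.
    """
    targets = []
    used = set()
    for href in hrefs:
        # Resolve relative links such as "a/report.pdf" against the page URL.
        absolute_url = urljoin(page_url, href)
        # Decode percent-escapes ("%20" -> " "), then keep the last path segment.
        name = os.path.basename(unquote(urlsplit(absolute_url).path))
        stem, ext = os.path.splitext(name)
        candidate, n = name, 1
        # On a name collision, append _1, _2, ... instead of overwriting a file.
        while candidate in used:
            candidate = f"{stem}_{n}{ext}"
            n += 1
        used.add(candidate)
        targets.append((absolute_url, os.path.join(folder, candidate)))
    return targets


if __name__ == "__main__":
    # Illustrative input only; these URLs are placeholders.
    for pdf_url, path in pdf_targets(
        "https://example.com/docs/index.html",
        ["a/report.pdf", "b/report.pdf"],
        "NewFolder",
    ):
        print(pdf_url, "->", path)
```

In the download loop, the `href` values collected by `soup.select("a[href$='.pdf']")` could be passed to such a helper, and each returned URL fetched and written to its paired path.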

PDF_Downloader/requirements.txt

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+beautifulsoup4==4.10.0
+requests==2.18.4
