Getting Started with Python - Web Crawling with BeautifulSoup
SW70
2024. 7. 2. 23:19
Let's try web crawling in Python with BeautifulSoup4.
As an example, we'll crawl Naver's dividend stock page.
※ Installing BeautifulSoup4
pip install beautifulsoup4 requests
※ Source
import requests
from bs4 import BeautifulSoup as bs

BASE_URL = "https://finance.naver.com/sise/dividend_list.naver?field=dividend_rate&sosok=&ordering=desc&page="

arr = []
for page_no in range(1, 28):  # pages 1 through 27
    page = requests.get(BASE_URL + str(page_no))
    soup = bs(page.text, "html.parser")
    # Every link inside the dividend table; each one is a stock name
    elements = soup.select('table.type_1 tr td a')
    # Used below: append(), [-6:], strip(), replace(), float(), .parent
    for a in elements:
        # a.parent is the <td>, a.parent.parent is the <tr> row
        b = a.parent.parent.select('td')
        # Last 6 characters of the href (the stock code) and the stock name
        arr2 = [a.attrs['href'][-6:], a.text]
        # Columns 1 to 11 hold the numeric values of the row
        for col in range(1, 12):
            cell = b[col].text.strip()
            if cell != '-':
                arr2.append(float(cell.replace(',', '')))
            else:
                arr2.append('')  # '-' means no value
        arr.append(arr2)
print(arr)
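The string handling in the source above (`[-6:]`, `strip()`, `replace()`, `float()`) can be tried without touching the network. The following is only a minimal sketch: the helper names `parse_cell` and `stock_code` are hypothetical and not part of the original script, and the sample inputs are made-up values, not real data from the Naver page.

```python
def parse_cell(text):
    """Convert a table cell's text to float; return '' for the '-' placeholder."""
    cleaned = text.strip()           # drop surrounding whitespace and newlines
    if cleaned == '-':
        return ''                    # same convention as the script: empty string for "no value"
    return float(cleaned.replace(',', ''))  # remove thousands separators, then convert


def stock_code(href):
    """Take the last six characters of a link, assumed here to be the stock code."""
    return href[-6:]


# Hypothetical sample values for illustration only
print(stock_code('/item/main.naver?code=000000'))  # 000000
print(parse_cell(' 1,234.5 '))                     # 1234.5
print(repr(parse_cell('\n-\t')))                   # ''
```

Pulling the conversion into small functions like this makes it easier to check the edge cases (commas, the `-` placeholder) before running the crawl over all pages.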
※ Run
