Getting Started with Python - Web Crawling with BeautifulSoup

호야70 2024. 7. 2. 23:19



Let's try web crawling in Python with BeautifulSoup4.

As an example, let's crawl the Naver dividend stocks page.

pip install requests beautifulsoup4

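import requests
from bs4 import BeautifulSoup as bs

# Assumption: the original post does not show the request line, so this URL
# (the Naver Finance dividend list) is a guess and may need adjusting.
url = "https://finance.naver.com/sise/dividend_list.naver"

arr = []
for page_no in range(1, 28):  # pages 1 through 27
    page = requests.get(url, params={"page": page_no})
    soup = bs(page.text, "html.parser")
    # Every stock name is a link inside a cell of the table with class "type_1"
    elements = soup.select('table.type_1 tr td a')
    # Used below: append(), [-6:], strip(), replace(), float(), .parent
    for a in elements:
        # a.parent is the <td>, a.parent.parent is the <tr>; take all cells of that row
        b = a.parent.parent.select('td')
        # The last 6 characters of the href are the stock code
        arr2 = [a.attrs['href'][-6:], a.text]
        for col in range(1, 12):
            text = b[col].text.strip()
            if text != '-':
                # Drop thousands separators before converting to a number
                arr2.append(float(text.replace(',', '')))
            else:
                # '-' means no value; keep an empty string as a placeholder
                arr2.append('')
        arr.append(arr2)

print(arr)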
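To see what the selector and the `.parent` hops are doing without hitting the network, here is a minimal sketch that runs the same steps on a small inline table. The markup in `SAMPLE_HTML`, the stock names and codes, and the helper names `to_number` and `parse_rows` are all made up for illustration; the real page has more columns and its markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup that imitates the assumed table layout.
SAMPLE_HTML = """
<table class="type_1">
  <tr><th>Name</th><th>Price</th><th>Dividend</th><th>Yield</th></tr>
  <tr>
    <td><a href="/item/main.naver?code=000001">Alpha</a></td>
    <td>12,350</td><td>1,200</td><td>9.72</td>
  </tr>
  <tr>
    <td><a href="/item/main.naver?code=000002">Beta</a></td>
    <td>8,000</td><td>-</td><td>-</td>
  </tr>
</table>
"""


def to_number(text):
    """Convert '12,350' to 12350.0; return '' for the '-' placeholder."""
    text = text.strip()
    if text == '-':
        return ''
    return float(text.replace(',', ''))


def parse_rows(html):
    """Return one list per stock row: [code, name, value, value, ...]."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for a in soup.select('table.type_1 tr td a'):
        # a.parent is the <td>, a.parent.parent is the enclosing <tr>
        cells = a.parent.parent.select('td')
        row = [a.attrs['href'][-6:], a.text]
        # Skip the first cell (the name) and convert the rest
        row.extend(to_number(td.text) for td in cells[1:])
        rows.append(row)
    return rows


rows = parse_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

Pulling the number conversion into its own small function keeps the row loop short, and the header row is skipped automatically because it contains no `a` tag.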


Beginner Coder Hoya