◎ Stock Information Crawling
In [1]:
# Import libraries
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import time
import pandas as pd
from selenium.common.exceptions import NoSuchElementException
◎ Downloading the Day's Buying Data by Investor Type
In [2]:
# Download the per-investor buying files
# Selenium 4 deprecates the positional driver path, chrome_options=, and
# find_element_by_xpath, so this cell uses Service, options=, and find_element(By.XPATH, ...).
from selenium.webdriver.chrome.service import Service

options = webdriver.ChromeOptions()
options.add_argument('start-maximized')  # maximize the window
options.binary_location = 'C:/Program Files/Google/Chrome/Application/chrome.exe'
wd = webdriver.Chrome(service=Service('C:/dev_python/chromedriver.exe'), options=options)
url = 'http://data.krx.co.kr/contents/MDC/MDI/mdiLoader/index.cmd?menuId=MDC0201'
wd.get(url)
time.sleep(2)

def click(xpath, pause=1):
    # Click the element at xpath, then pause so the page can update
    wd.find_element(By.XPATH, xpath).click()
    time.sleep(pause)

# Click through the menu filters
click('//*[@id="jsMdiMenu"]/div[4]/ul/li[1]/ul/li[2]/div/div[1]/ul/li[2]/a')
click('//*[@id="jsMdiMenu"]/div[4]/ul/li[1]/ul/li[2]/div/div[1]/ul/li[2]/ul/li[3]/a')
click('//*[@id="jsMdiMenu"]/div[4]/ul/li[1]/ul/li[2]/div/div[1]/ul/li[2]/ul/li[3]/ul/li[3]/a', pause=2)

# Positions in the invstTpCd dropdown:
# 11 = foreigners, 7 = pension funds, 8 = institutions total
for option_index in (11, 7, 8):
    # Buying status for this investor type
    click('//*[@id="invstTpCd"]')
    click(f'//*[@id="invstTpCd"]/option[{option_index}]')
    click('//*[@id="MDCSTAT024_FORM"]/div[1]/div/table/tbody/tr[3]/td/div/div/button[2]')
    click('//*[@id="jsSearchButton"]')
    # The last two clicks start the file download
    click('//*[@id="MDCSTAT024_FORM"]/div[2]/div[1]/p[2]/button[2]/img')
    click('/html/body/div[2]/section[2]/section/section/div/div/form/div[2]/div[2]/div[2]/div/div[2]/a', pause=3)

wd.quit()  # end the WebDriver session and close the browser
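The download cell pauses for a fixed 3 seconds after each download click, which may be too short on a slow connection. Below is a minimal sketch of polling the download folder instead. The helper name `wait_for_download` and its parameters are hypothetical, and the folder is whichever directory Chrome is configured to save into. It relies on Chrome keeping a `.crdownload` suffix on a file until the download is complete.

```python
import time
from pathlib import Path


def wait_for_download(directory, known_files, timeout=30.0, poll=0.5):
    """Return the first new, fully written file in `directory`, or None on timeout.

    `known_files` is the set of file names that were present before the click.
    """
    directory = Path(directory)
    deadline = time.monotonic() + timeout
    while True:
        for path in directory.iterdir():
            # Skip files that already existed and Chrome's partial downloads
            if (path.is_file()
                    and path.name not in known_files
                    and path.suffix != '.crdownload'):
                return path
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
```

To use it, record `{p.name for p in Path(download_dir).iterdir()}` just before the download click, then call `wait_for_download(download_dir, known_files)` in place of `time.sleep(3)`.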
◎ Building a DataFrame of the Day's Top 50 Gainers
In [3]:
# Build a DataFrame of the day's top 50 gainers
# Selenium 4 deprecates the positional driver path, chrome_options=, and
# find_element_by_xpath, so this cell uses Service, options=, and find_element(By.XPATH, ...).
from selenium.webdriver.chrome.service import Service

options = webdriver.ChromeOptions()
options.add_argument('start-maximized')  # maximize the window
options.binary_location = 'C:/Program Files/Google/Chrome/Application/chrome.exe'
wd = webdriver.Chrome(service=Service('C:/dev_python/chromedriver.exe'), options=options)
url = 'http://data.krx.co.kr/contents/MDC/MDI/mdiLoader/index.cmd?menuId=MDC0302'
wd.get(url)
time.sleep(2)

# Click through the menu filters
wd.find_element(By.XPATH, '//*[@id="jsMdiMenu"]/div[4]/ul/li[6]/ul/li[2]/div/div[1]/ul/li[1]/a').click()
time.sleep(1)
wd.find_element(By.XPATH, '//*[@id="jsMdiMenu"]/div[4]/ul/li[6]/ul/li[2]/div/div[1]/ul/li[1]/ul/li[1]/a').click()
time.sleep(3)
wd.find_element(By.XPATH, '//*[@id="MDCEASY015_FORM"]/div[1]/div/table/tbody/tr[3]/td/div/div/button[2]').click()
time.sleep(1)
wd.find_element(By.XPATH, '//*[@id="jsSearchButton"]').click()
time.sleep(1)  # added pause so the grid can refresh before the page source is read

# Collect the price rows
html = BeautifulSoup(wd.page_source, 'html.parser')
even_rows = html.find_all('tr', {'class': 'CI-GRID-EVEN'})
odd_rows = html.find_all('tr', {'class': 'CI-GRID-ODD'})

def clean(text):
    # Preprocessing: drop the newlines and tabs around the cell text
    return text.replace('\n', '').replace('\t', '')

data_list = []
# Rows alternate EVEN, ODD on the page, so read them in pairs to keep the ranking order
for even_row, odd_row in zip(even_rows, odd_rows):
    for row in (even_row, odd_row):
        company = clean(row.find('td', {'data-name': 'ISU_ABBRV'}).text)    # company name
        updown = clean(row.find('td', {'data-name': 'FLUC_RT'}).span.text)  # change rate (%)
        data_list.append({'회사명': company, '등락률': updown})

# Build the DataFrame (회사명 = company name, 등락률 = change rate)
result_df = pd.DataFrame(data_list, columns=['회사명', '등락률'])
#result_df.to_excel("C:/Users/90000527/Desktop/업무/data11.xlsx", index = False)
wd.quit()  # end the WebDriver session and close the browser
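The grid marks alternating rows with the classes `CI-GRID-EVEN` and `CI-GRID-ODD`, and the cell above collects the two classes separately before pairing them back up by position. If the grid ever holds an odd number of rows, pairing by position can miss the last one. Here is a minimal sketch of a merge that tolerates unequal lengths, using `itertools.zip_longest`. The helper name `interleave` and the sample row labels are hypothetical stand-ins for the parsed `tr` elements.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking the gap when one list is shorter


def interleave(first, second):
    """Merge two lists alternately: first[0], second[0], first[1], second[1], ..."""
    merged = []
    for a, b in zip_longest(first, second, fillvalue=_MISSING):
        if a is not _MISSING:
            merged.append(a)
        if b is not _MISSING:
            merged.append(b)
    return merged


# Stand-ins for the EVEN and ODD rows of a five-row grid
even_rows = ['row0', 'row2', 'row4']
odd_rows = ['row1', 'row3']
print(interleave(even_rows, odd_rows))
# ['row0', 'row1', 'row2', 'row3', 'row4']
```

With the real page, the two `find_all` results would be passed in place of the sample lists, and the merged list iterated once to extract the company name and change rate.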
In [4]:
result_df
Out[4]:
index | 회사명 (company) | 등락률 (change rate, %) |
---|---|---|
0 | 대성창투 | +29.87 |
1 | 대동전자 | +29.86 |
2 | 대한과학 | +29.72 |
3 | 청담글로벌 | +29.49 |
4 | 모아텍 | +23.35 |
5 | 플레이그램 | +19.43 |
6 | 에이디칩스 | +18.97 |
7 | 대동기어 | +16.62 |
8 | 우리기술투자 | +15.98 |
9 | 알로이스 | +15.75 |
10 | 테크엔 | +14.80 |
11 | 이비테크 | +14.67 |
12 | 키다리스튜디오 | +14.58 |
13 | 스템랩 | +14.38 |
14 | 엔에스엠 | +14.24 |
15 | 엔지브이아이 | +14.04 |
16 | 엄지하우스 | +13.78 |
17 | 코오롱글로벌 | +13.48 |
18 | 나우코스 | +13.16 |
19 | 엔에스엔 | +12.88 |
20 | 아우딘퓨쳐스 | +12.86 |
21 | 에스알바이오텍 | +12.67 |
22 | 비플라이소프트 | +12.26 |
23 | 동양에스텍 | +11.87 |
24 | 케어젠 | +11.62 |
25 | 씨에스베어링 | +11.30 |
26 | 신풍제약 | +11.11 |
27 | 앙츠 | +11.11 |
28 | 타임기술 | +10.82 |
29 | 슈프리마아이디 | +10.76 |
30 | 팬엔터테인먼트 | +10.76 |
31 | 대동금속 | +10.12 |
32 | 코데즈컴바인 | +10.08 |
33 | 삼천당제약 | +9.94 |
34 | 락앤락 | +9.82 |
35 | 비덴트 | +9.72 |
36 | 한국비엔씨 | +9.58 |
37 | 지투파워 | +9.52 |
38 | 탑코미디어 | +9.52 |
39 | 데이터스트림즈 | +9.50 |
40 | 뿌리깊은나무들 | +9.50 |
41 | 코나솔 | +9.37 |
42 | 하인크코리아 | +9.17 |
43 | 보라티알 | +9.13 |
44 | 태경케미컬 | +9.13 |
45 | 버킷스튜디오 | +9.11 |
46 | 린드먼아시아 | +8.60 |
47 | 씨아이테크 | +8.49 |
48 | 이마트 | +8.33 |
49 | 신화인터텍 | +8.26 |
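The 등락률 values are scraped as text such as `+29.87`, so sorting or filtering on the column compares strings rather than numbers. Below is a minimal sketch of converting them to floats. The helper name `parse_change_rate` is hypothetical, and the comma handling is an assumption in case larger values are ever formatted with thousands separators. The sample rows are taken from the output above.

```python
def parse_change_rate(text):
    """Convert a change-rate string such as '+29.87' to a float."""
    # Strip surrounding whitespace and any thousands separators before converting
    return float(text.strip().replace(',', ''))


rows = [('대성창투', '+29.87'), ('모아텍', '+23.35'), ('이마트', '+8.33')]
parsed = [(name, parse_change_rate(rate)) for name, rate in rows]

# Companies that rose by more than 20% on the day
over_20 = [name for name, rate in parsed if rate > 20]
print(over_20)
# ['대성창투', '모아텍']
```

On the DataFrame itself, the same function could be applied with `result_df['등락률'].map(parse_change_rate)`.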
In [5]:
# Widen the notebook container to 90% of the page
from IPython.display import display, HTML
display(HTML("<style>.container {width:90% !important;}</style>"))