Crawling the similar-word list for the answer to 꼬맨틀 (Korean Semantle) #1072

Han_Star 2025. 3. 9. 21:39

꼬맨틀 is a game in which you guess the word of the day.

Each time you make a guess, the game shows a similarity score that tells you how close your word is to the answer.

https://semantle-ko.newsjel.ly/

The answer to 꼬맨틀 #1072 was '서류' (document).

My first guess was '물건' (object), which landed within the top 1,000, so the game was fun to play.

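Once you guess the answer or give up, you can view the 1,000 words most similar to it.

However, this list is available for only three days, after which it can no longer be accessed.

So I decided to crawl it and save a copy.

The page that lists the words similar to the answer is simple in structure.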

soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly to avoid a warning
table = soup.find('table')
answer_id = soup.find('span', id='answer_id').get_text(strip=True)
answer = soup.find('span', id='answer').get_text(strip=True)

To pull the table out of the HTML, you can use BeautifulSoup's find('table') method.

The catch is that the data on the answer page is generated dynamically.

So I use Selenium to get the page source after the browser has rendered it.

driver = webdriver.Chrome()
driver.get('https://semantle-ko.newsjel.ly/nearest1k/1072')
time.sleep(3)  # give the page time to render
html = driver.page_source
driver.quit()
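
A fixed time.sleep(3) can be too short on a slow connection and wasteful on a fast one. As a rough sketch, the wait could instead poll driver.page_source until the table markup shows up. The helper name wait_for_page_source, the '<table' marker, and the default timeout are assumptions for illustration, not part of the original code.

```python
import time


def wait_for_page_source(driver, marker="<table", timeout=10.0, poll=0.5):
    """Poll driver.page_source until `marker` appears, then return the HTML.

    `driver` is any object with a `page_source` attribute, such as a
    Selenium WebDriver. The marker string is an assumption about the page.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = driver.page_source
        if marker in html:
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{marker!r} not found within {timeout} s")
        time.sleep(poll)


# Usage with a real browser session (not executed here):
# html = wait_for_page_source(driver)
```

Selenium also ships WebDriverWait in selenium.webdriver.support.ui for the same purpose, which may be the better choice when waiting on a specific element.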

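# Wrap the HTML in StringIO: recent pandas versions deprecate passing a literal HTML string.
dfs = pd.read_html(StringIO(str(table)))  # needs: from io import StringIO
df = dfs[0]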

pd.read_html() reads every table found in the HTML into a DataFrame and returns them as a list.

Since we passed in only one table, dfs[0] is the one we want.
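
To make the "list of tables, each a grid of rows" idea concrete, here is a small standard-library sketch that collects the cell text of a table row by row. It is only an illustration of the structure, not how pandas implements read_html(), and the sample HTML and its column names are made up rather than taken from the real page.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built
        self._cell = None   # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up sample; the real page's columns may differ.
sample = """
<table>
  <tr><th>rank</th><th>word</th></tr>
  <tr><td>1</td><td>alpha</td></tr>
  <tr><td>2</td><td>beta</td></tr>
</table>
"""

parser = TableRows()
parser.feed(sample)
header, *body = parser.rows
print(header)
print(body)
```

pd.read_html() goes further than this by inferring column types and building the DataFrame, which is why it is the more convenient choice here.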

The table was fetched correctly!

꼬맨틀#1072_서류_유사단어.xlsx

0.03MB

Here is the full code.
To use it for another puzzle, change the number in the URL.

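from io import StringIO
import time

from bs4 import BeautifulSoup
import pandas as pd
from selenium import webdriver

# Load the page in a real browser so the dynamically generated table is rendered.
driver = webdriver.Chrome()
driver.get('https://semantle-ko.newsjel.ly/nearest1k/1072')  # change 1072 to another puzzle number
time.sleep(3)  # give the page time to render
html = driver.page_source
driver.quit()

# Parse the rendered HTML and pick out the table and the answer metadata.
soup = BeautifulSoup(html, 'html.parser')
table = soup.find('table')
answer_id = soup.find('span', id='answer_id').get_text(strip=True)
answer = soup.find('span', id='answer').get_text(strip=True)

print(answer_id, answer)

# read_html returns a list of DataFrames; only one table was passed in.
dfs = pd.read_html(StringIO(str(table)))
df = dfs[0]

df  # displays the DataFrame when run in a notebook

# Save as an Excel file (requires an Excel writer such as openpyxl).
df.to_excel(f'꼬맨틀#{answer_id}_{answer}_유사단어.xlsx', index=False)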
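
Since only the number in the URL changes between puzzles, the URL and the output filename can be derived from the puzzle number. The sketch below is one way to do that. The helper names are hypothetical, and recent_puzzle_ids assumes that puzzle numbers increase by one per day and that the three-day window includes today's puzzle, neither of which is confirmed by the site.

```python
BASE_URL = "https://semantle-ko.newsjel.ly/nearest1k/"


def nearest_url(puzzle_id: int) -> str:
    """Build the similar-word page URL for a given puzzle number."""
    return f"{BASE_URL}{puzzle_id}"


def output_filename(puzzle_id, answer: str) -> str:
    """Build the Excel filename in the same pattern as the code above."""
    return f"꼬맨틀#{puzzle_id}_{answer}_유사단어.xlsx"


def recent_puzzle_ids(latest: int, days: int = 3) -> list[int]:
    """Puzzle numbers assumed to still be accessible, oldest first.

    Assumes one puzzle per day and that the window includes `latest`.
    """
    return list(range(latest - days + 1, latest + 1))


for pid in recent_puzzle_ids(1072):
    print(pid, nearest_url(pid))
```

Each URL could then be passed to driver.get() in place of the hard-coded one, reusing the rest of the script unchanged.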
