My First Python Crawler (Jupyter Notebook)

A first attempt at writing a web crawler in Python, using Jupyter Notebook.

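Introduction

Although... but... to avoid falling behind the "AI" trend, I still need to play around with Python-related tools.

The first one is a web crawler.

References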

Day-1 Python爬蟲小人生(1)

Development environment

IDE: Jupyter Notebook

Language: Python

Installing the required packages

Prerequisite: Python and pip are already installed.

Install Jupyter Notebook

> pip install jupyter notebook

The Python Requests package

> pip install requests

The Python beautifulsoup4 package

> pip install beautifulsoup4

As always, setting up the environment hit plenty of snags (Orz). Make friends with the almighty Google. Hang in there, you can do it.
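When the installation is bumpy, a quick check from Python can help confirm which of the three packages actually ended up installed. The following is only a minimal sketch: the helper name `installed_versions` is made up for illustration, and it assumes Python 3.8 or newer (for `importlib.metadata`).

```python
from importlib import metadata

# Distribution names as used in the pip commands above.
PACKAGES = ["notebook", "requests", "beautifulsoup4"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(PACKAGES)
for name, version in report.items():
    print(name, version if version else "NOT INSTALLED")
```

Any package reported as missing is a candidate for rerunning its `pip install` command.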

Starting Jupyter Notebook

After the installation finished, the launch command only worked for me when run as administrator.

> jupyter notebook

Python command log

Fetch the web page with an HTTP GET.

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.ptt.cc/bbs/MobileComm/index.html")
print(r.text)
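The request above can be made slightly more defensive. The sketch below is not part of the original notes: the helper name `fetch_html`, the 10-second timeout, and the User-Agent string are all assumptions chosen for illustration. It adds a timeout so a stalled connection does not hang the notebook cell, and calls `raise_for_status()` so an HTTP error status raises an exception instead of being parsed as if it were the page.

```python
import requests

# Assumed values for illustration; neither appears in the original notes.
DEFAULT_TIMEOUT = 10  # seconds
USER_AGENT = "my-first-crawler/0.1"  # hypothetical identifier


def fetch_html(url, timeout=DEFAULT_TIMEOUT):
    """Fetch a page with HTTP GET and return its body as text."""
    response = requests.get(
        url,
        headers={"User-Agent": USER_AGENT},
        timeout=timeout,
    )
    # Raise requests.HTTPError on 4xx/5xx responses.
    response.raise_for_status()
    return response.text
```

In a notebook cell, `fetch_html("https://www.ptt.cc/bbs/MobileComm/index.html")` would be expected to return the same text as `r.text` above.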

Use a parser to extract the target "div.title a".

import requests
from bs4 import BeautifulSoup

soup = BeautifulSoup(r.text, "html.parser")  # parse the page content with html.parser
sel = soup.select("div.title a")  # select the <a> tags inside <div class="title"></div> and store them in sel

for s in sel:
    print(s["href"], s.text)
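To see what the "div.title a" selector does without hitting the network, the same extraction can be tried on an inline HTML string. This is a sketch under assumptions: the sample markup, the helper name `extract_links`, and the article paths are invented for illustration and only imitate the shape of a list page. It also shows two details: a `div.title` without an `<a>` inside simply produces no match, and a relative `href` can be turned into a full URL with `urljoin`.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://www.ptt.cc/bbs/MobileComm/index.html"

# Hypothetical sample markup, shaped like the list page the selector targets.
SAMPLE_HTML = """
<div class="r-ent">
  <div class="title"><a href="/bbs/MobileComm/M.1.A.001.html">First post</a></div>
</div>
<div class="r-ent">
  <div class="title">(entry without a link)</div>
</div>
<div class="r-ent">
  <div class="title"><a href="/bbs/MobileComm/M.2.A.002.html">Second post</a></div>
</div>
"""


def extract_links(html, base_url=BASE_URL):
    """Return (absolute_url, title) pairs for every "div.title a" match."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (urljoin(base_url, a["href"]), a.get_text(strip=True))
        for a in soup.select("div.title a")
    ]


links = extract_links(SAMPLE_HTML)
for url, title in links:
    print(url, title)
```

Passing `r.text` from the earlier cell instead of `SAMPLE_HTML` would apply the same extraction to the live page.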

Pics or it didn't happen

(EOF)


Last updated 2 years ago
