Environment Setup
Python 3 / Request libraries / Parsing libraries / Databases / Storage libraries / Web libraries / App scraping libraries / Web crawling frameworks
-
Python 3
- On Windows 11, Python can be installed directly from the Microsoft Store
- On Linux (Debian/Ubuntu):
apt-get install python3
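To confirm the interpreter is usable after installation, a small check like the one below may help. It is only a sketch; the helper name `python_is_at_least` is made up for illustration and is not part of any library.

```python
import sys


def python_is_at_least(major, minor=0):
    """Return True if the running interpreter is at least major.minor."""
    return sys.version_info >= (major, minor)


if __name__ == "__main__":
    # Print the full version string, then a simple yes/no for Python 3.
    print(sys.version)
    print("Python 3 available:", python_is_at_least(3))
```

Run it with the same command you intend to use for the crawler scripts (for example `python3`), so the check covers the interpreter you will actually use.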
-
Request libraries
-
requests
pip3 install requests
-
selenium
pip install selenium
-
ChromeDriver
- View the Chrome version in About Chrome
- Download the corresponding version from ChromeDriver
- Add ChromeDriver to your environment variables
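One way to check that ChromeDriver really ended up on the PATH is to look it up from Python. The sketch below uses only the standard library; `find_executable` is a hypothetical helper name, and it assumes the binary is called `chromedriver`.

```python
import shutil


def find_executable(name):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    # Assumption: the driver binary is named "chromedriver".
    path = find_executable("chromedriver")
    if path:
        print("chromedriver found at", path)
    else:
        print("chromedriver is not on PATH")
```

If the lookup returns nothing, opening a new terminal after editing the environment variables is usually worth trying first, since existing shells keep the old PATH.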
-
PhantomJS
Newer Selenium versions no longer support PhantomJS; use headless Chrome through ChromeDriver instead.
Verification:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Run Chrome without a visible window
chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
driver = webdriver.Chrome(options=chrome_options)
driver.get("https://dreaife.icu/")
print(driver.current_url)
driver.quit()  # close the browser when done
-
aiohttp
pip install aiohttp
Optionally, for faster DNS resolution:
pip install aiodns
-
-
Parsing libraries
-
lxml
pip install lxml
-
beautifulsoup4
pip install beautifulsoup4
-
pyquery
pip install pyquery
-
tesserocr
-
Install Tesseract
-
Install tesserocr
On Windows, install it from a prebuilt wheel:
pip install <name>.whl
-
Verification
import tesserocr
from PIL import Image

image = Image.open('G:/codeS/backOnGithub/Jupyter/spider/image.png')
print(tesserocr.image_to_text(image))
Note: if you encounter the error
File "tesserocr.pyx", line 2580, in tesserocr._tesserocr.image_to_text
RuntimeError: Failed to init API, possibly an invalid tessdata path
first copy the tessdata folder into the directory named in the error message.
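When chasing the tessdata path error, it can help to check whether a given folder actually contains language data. The following is a rough sketch using only the standard library; `has_traineddata` is a hypothetical helper, and reading `TESSDATA_PREFIX` assumes you have set that environment variable (it may well be unset on your machine).

```python
import os
from pathlib import Path


def has_traineddata(folder):
    """Return True if `folder` exists and holds at least one .traineddata file."""
    folder = Path(folder)
    return folder.is_dir() and any(folder.glob("*.traineddata"))


if __name__ == "__main__":
    # Assumption: TESSDATA_PREFIX, if set, points at a tessdata folder.
    candidate = os.environ.get("TESSDATA_PREFIX", "")
    if candidate:
        print(candidate, "contains language data:", has_traineddata(candidate))
    else:
        print("TESSDATA_PREFIX is not set; pass the folder from the error message instead")
```

Calling `has_traineddata()` on the folder shown in the error message should tell you quickly whether the language files are missing there.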
-
-
-
Databases
- MySQL
- MongoDB
- Redis
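Once the three database servers are installed, a quick way to see whether they are running is to try connecting to their ports. The sketch below assumes each server runs locally on its default port (3306 for MySQL, 27017 for MongoDB, 6379 for Redis); `is_port_open` is an illustrative helper, not a library function.

```python
import socket

# Assumption: default ports on the local machine; adjust if yours differ.
DEFAULT_PORTS = {"MySQL": 3306, "MongoDB": 27017, "Redis": 6379}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in DEFAULT_PORTS.items():
        state = "reachable" if is_port_open("127.0.0.1", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A reachable port only shows that something is listening there; it does not confirm credentials or that the service is the expected database.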
-
Storage libraries
-
PyMySQL
pip install pymysql
-
PyMongo
pip install pymongo
-
redis-py
pip install redis
-
RedisDump
Install Ruby first, then run:
gem install redis-dump
-
-
Web libraries
-
Flask
pip install flask
-
Tornado
pip install tornado
-
-
App scraping libraries
-
Charles
-
mitmproxy
pip install mitmproxy
-
Appium
-
-
Web crawling frameworks
-
pyspider
pip install pyspider
If it fails to run on Windows 11, you can refer to this article
-
scrapy
-
scrapy-splash
-
scrapy-redis
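After working through the list, it may be convenient to see in one pass which of the pip packages are installed. The sketch below uses the standard library's `importlib.metadata` (Python 3.8+). The `PACKAGES` list is assembled from the names on this page, assuming those are the distribution names on PyPI, and `installed_version` is a hypothetical helper.

```python
from importlib import metadata

# Assumption: distribution names as used with pip on this page.
PACKAGES = [
    "requests", "selenium", "aiohttp", "lxml", "beautifulsoup4", "pyquery",
    "tesserocr", "pymysql", "pymongo", "redis", "flask", "tornado",
    "mitmproxy", "pyspider", "scrapy", "scrapy-splash", "scrapy-redis",
]


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in PACKAGES:
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

This only reports what pip has installed for the current interpreter; non-Python tools such as Charles, Appium, or ChromeDriver need to be checked separately.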
-