Running pyspider on Windows 11 with Docker
Installing pyspider directly on Windows 11 failed with multiple errors.
The official website, however, offers a Docker-based installation method, so I used that instead.
Directly via Docker
# mysql
docker run --name mysql -d -v /data/mysql:/var/lib/mysql -e MYSQL_ALLOW_EMPTY_PASSWORD=yes mysql:latest

# rabbitmq
docker run --name rabbitmq -d rabbitmq:latest

# phantomjs
docker run --name phantomjs -d binux/pyspider:latest phantomjs

# result worker
docker run --name result_worker -m 128m -d --link mysql:mysql --link rabbitmq:rabbitmq binux/pyspider:latest result_worker

# processor, run multiple instances if needed.
docker run --name processor -m 256m -d --link mysql:mysql --link rabbitmq:rabbitmq binux/pyspider:latest processor

# fetcher, run multiple instances if needed.
docker run --name fetcher -m 256m -d --link phantomjs:phantomjs --link rabbitmq:rabbitmq binux/pyspider:latest fetcher --no-xmlrpc

# scheduler
docker run --name scheduler -d --link mysql:mysql --link rabbitmq:rabbitmq binux/pyspider:latest scheduler

# webui
docker run --name webui -m 256m -d -p 5000:5000 --link mysql:mysql --link rabbitmq:rabbitmq --link scheduler:scheduler --link phantomjs:phantomjs binux/pyspider:latest webui

Using docker-compose
services:
  phantomjs:
    image: binux/pyspider:latest
    command: phantomjs
  result:
    image: binux/pyspider:latest
    external_links:
      - mysql
      - rabbitmq
    command: result_worker
  processor:
    image: binux/pyspider:latest
    external_links:
      - mysql
      - rabbitmq
    command: processor
  fetcher:
    image: binux/pyspider:latest
    external_links:
      - rabbitmq
    links:
      - phantomjs
    command: fetcher
  scheduler:
    image: binux/pyspider:latest
    external_links:
      - mysql
      - rabbitmq
    command: scheduler
  webui:
    image: binux/pyspider:latest
    external_links:
      - mysql
      - rabbitmq
    links:
      - scheduler
      - phantomjs
    command: webui
    ports:
      - "5000:5000"

Then just run:
docker-compose up -d
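The compose file lists mysql and rabbitmq under external_links, which means they are expected to be containers started outside of Compose, for example with the two docker run commands from the previous section. Below is a minimal sketch of a pre-flight check that could be run before docker-compose up -d. The function names container_running and check_external_links are made up for this example. Depending on the Compose version and file format, externally started containers may also need to share a network with the Compose services, and this sketch does not check for that.

```shell
# Succeeds only if the named container exists and is currently running.
container_running() {
  [ "$(docker inspect -f '{{.State.Running}}' "$1" 2>/dev/null)" = "true" ]
}

# Report the state of the two containers named under external_links.
# Returns 0 if both are running, 1 otherwise.
check_external_links() {
  missing=0
  for name in mysql rabbitmq; do
    if container_running "$name"; then
      echo "ok: $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

A possible use is check_external_links && docker-compose up -d, so that Compose only starts when both containers are already running.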
Once the containers are up, visit http://localhost:5000/. If the pyspider web UI loads, pyspider is running successfully.
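The web UI may take a little while to start answering after the containers come up. As a rough sketch, the check can be scripted with curl instead of refreshing the browser. The helper name wait_for_http and its default values (30 attempts, 2 seconds apart) are assumptions chosen for illustration. The port comes from the -p 5000:5000 mapping used above.

```shell
# Poll a URL with curl until it responds or the attempts run out.
# Usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_http() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP error responses as failure; -sS: quiet, but still print errors
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "reachable: $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not reachable after $attempts attempts: $url" >&2
  return 1
}
```

Calling wait_for_http "http://localhost:5000/" only confirms that something answers on that port. Opening the page in a browser is still the way to confirm it is the pyspider web UI.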

https://dreaife.tokyo/en/posts/docker-pyspider-win/ Some information may be outdated