Python Crawler(6)Deployment on Docker on EC2

The start.sh will be similar to the one used for the Raspberry Pi deployment.

The file conf/scrapyd.conf stays the same.
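
Since start.sh itself is not listed in this post, here is a minimal sketch of what such a script might look like. It is a guess, not the actual file: the function name `start_scrapyd` and the `SCRAPYD_HOME` and `SCRAPYD_BIN` variables are made up for illustration. It relies on the fact that scrapyd picks up a `scrapyd.conf` from the current working directory.

```shell
#!/bin/sh
# Hypothetical start.sh sketch; the real script is not shown in this post.
# SCRAPYD_HOME and SCRAPYD_BIN are illustrative names, not from the original.

start_scrapyd() {
    home="${SCRAPYD_HOME:-/tool/scrapyd}"   # directory that holds scrapyd.conf
    bin="${SCRAPYD_BIN:-scrapyd}"           # command that starts the daemon

    # scrapyd reads scrapyd.conf from the current directory, so change into it first
    cd "$home" || return 1

    # exec replaces the shell, so scrapyd receives the container's stop signals directly
    exec "$bin"
}
```

In a real start.sh the last line would simply call `start_scrapyd`.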

In the Makefile, I only changed the name of the Docker image:
IMAGE=sillycat/public
TAG=centos7-scrapyd
NAME=centos7-scrapyd

docker-context:

build: docker-context
	docker build -t $(IMAGE):$(TAG) .

run:
	docker run -d -p 6800:6800 --name $(NAME) $(IMAGE):$(TAG)

debug:
	docker run -ti -p 6800:6800 --name $(NAME) $(IMAGE):$(TAG) /bin/bash

clean:
	docker stop $(NAME)
	docker rm $(NAME)

logs:
	docker logs $(NAME)

publish:
	docker push $(IMAGE):$(TAG)

fetch:
	docker pull $(IMAGE):$(TAG)
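
After `make build` and `make run`, it helps to confirm that scrapyd is really answering on the published port 6800. The sketch below polls scrapyd's `daemonstatus.json` endpoint. The helper name `wait_for_scrapyd`, its arguments and the default of 10 attempts are my own choices for illustration and are not part of the original setup.

```shell
#!/bin/sh
# Poll scrapyd's daemonstatus.json endpoint until it answers or we give up.
# wait_for_scrapyd and its defaults are illustrative, not from the original post.

wait_for_scrapyd() {
    host="${1:-localhost}"    # host where the container's port 6800 is published
    attempts="${2:-10}"       # number of polls before giving up
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f makes curl fail on HTTP errors, -s hides the progress meter
        if curl -fs "http://${host}:6800/daemonstatus.json"; then
            echo
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    echo "scrapyd did not answer on ${host}:6800" >&2
    return 1
}
```

A typical sequence would be `make build`, `make run`, then `wait_for_scrapyd localhost`. If the check fails, `make logs` shows what the container printed.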

The Dockerfile is where most of the differences are.
# Prepare the OS
FROM centos:7
MAINTAINER Carl Luo <luohuazju@gmail.com>

# Install the software
RUN yum -y update
RUN yum install -y gcc
RUN yum install -y python-devel

# Install pip
RUN mkdir -p /install/
WORKDIR /install/
RUN curl "https://bootstrap.pypa.io/get-pip.py" -o "get-pip.py"
RUN python get-pip.py

# Install scrapyd
RUN pip install scrapyd

# Copy the config
RUN mkdir -p /tool/scrapyd/
ADD conf/scrapyd.conf /tool/scrapyd/

# Set up the app
EXPOSE 6800
RUN mkdir -p /app/
ADD start.sh /app/
WORKDIR /app/
CMD [ "./start.sh" ]
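
The post does not list the steps on the EC2 side, so the following is only a sketch under stated assumptions: an Amazon Linux instance, the default `ec2-user` account, and a security group that opens port 6800 to the machines that need it. The function names `prepare_ec2_host` and `deploy_scrapyd` are hypothetical. The image, tag and container name are the ones from the Makefile, and the two docker commands mirror its `fetch` and `run` targets.

```shell
#!/bin/sh
# Sketch: prepare an EC2 host and start the published image.
# Assumes Amazon Linux and the ec2-user account; function names are illustrative.

prepare_ec2_host() {
    sudo yum install -y docker          # install Docker from the distro repository
    sudo service docker start           # start the Docker daemon
    sudo usermod -aG docker ec2-user    # let ec2-user run docker without sudo (applies on next login)
}

deploy_scrapyd() {
    image="${1:-sillycat/public}"
    tag="${2:-centos7-scrapyd}"
    name="${3:-centos7-scrapyd}"
    # Same commands as the Makefile's fetch and run targets
    docker pull "${image}:${tag}" &&
    docker run -d -p 6800:6800 --name "$name" "${image}:${tag}"
}
```

On a fresh instance, run `prepare_ec2_host` once, log out and back in so the group change takes effect, then run `deploy_scrapyd`.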

References:
http://sillycat.iteye.com/blog/2394767

2017-09-28 00:55