Skip to content

Chenkunlin/go-crawler

 
 

Repository files navigation

go-crawler

A distributed crawler based on Golang.


Details:

https://blog.csdn.net/chao2016/article/details/81697353

Tree:

go-crawler
├── README.md
├── engine
│   ├── concurrent.go
│   ├── simple.go
│   ├── types.go
│   └── worker.go
├── fetcher
│   └── fetcher.go
├── frontend
│   ├── controller
│   │   └── searchresult.go
│   ├── model
│   │   └── page.go
│   ├── starter.go
│   └── view
│       ├── css
│       │   └── style.css
│       ├── index.html
│       ├── js
│       │   └── index.js
│       ├── logo.png
│       ├── searchresult.go
│       ├── searchresult_test.go
│       ├── template.html
│       ├── template.test.html
│       └── template_test.go
├── main.go
├── model
│   └── profile.go
├── persist
│   ├── itemsaver.go
│   └── itemsaver_test.go
├── scheduler
│   ├── queued.go
│   └── simple.go
└── zhenai
    └── parser
        ├── city.go
        ├── citylist.go
        ├── citylist_test.go
        ├── citylist_test_data.html
        ├── profile.go
        ├── profile_test.go
        └── profile_test_data.html

Download:

git clone git@github.com:chao2015/go-crawler.git

or download the previous version via the release page

dependences:

// 1. docker 18.06.0-ce
// 2. 安装elasticsearch
docker run -d -p 9200:9200 elasticsearch
// 3. 安装elastic client:
go get -v gopkg.in/olivere/elastic.v5

Run:

docker ps

// 若有(elasticsearch CONTAINER ID)
docker kill 6a8ea105cbc6

docker run -d -p 9200:9200 elasticsearch

mv go-crawler/ $GOPATH/src/
cd $GOPATH/src/go-crawler/
go run main.go
go run frontend/starter.go
// http://localhost:8888
// 搜索示例:男 已购房 已购车 Age:(<30) Height:(>180)

Have fun! ^_^

About

A distributed crawler based on Golang.

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Go 87.9%
  • CSS 11.1%
  • JavaScript 1.0%