Information Security Inc.
Web Crawlers
Information Security Confidential - Partner Use Only
Contents
• What are Web Crawlers?
• Ways to crawl a website
• References
What are Web Crawlers?
• Web crawlers go by several names: industry jargon calls them
spiders or bots, but the technical term is web crawler. A crawler
fetches a page, extracts its links, and then visits those links in turn
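Whatever the name, the core loop is the same: fetch a page, pull out its links, and queue the ones not yet seen. A minimal standard-library sketch of that loop follows. The names `LinkParser`, `fetch`, `crawl`, and the `max_pages` limit are illustrative assumptions, not part of the original deck, and the sketch ignores robots.txt and rate limiting.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download a page and return its body as text (assumed default fetcher)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=50):
    """Breadth-first crawl that stays on the start URL's host."""
    host = urlparse(start_url).netloc
    queue, seen, visited = deque([start_url]), {start_url}, []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except Exception:
            continue  # skip pages that fail to load
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links against the current page and drop fragments.
            link = urljoin(url, href).split("#")[0]
            parts = urlparse(link)
            if parts.scheme in ("http", "https") and parts.netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Calling `crawl("https://example.com/")` would return the list of visited URLs in the order they were fetched. Passing a custom `fetch_page` function makes it possible to try the loop against canned pages without touching the network.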
Ways to crawl a website
• Metasploit
• HTTrack
• Black Widow
• Burp Suite Spider
• Scrapy framework
(https://doc.scrapy.org/en/master/intro/tutorial.html)
▲ Example Spider (extract all links and follow them)
References
• Wikipedia
https://en.wikipedia.org/wiki/Web_crawler
• ScienceDaily
https://www.sciencedaily.com/terms/web_crawler.htm
• Metasploit
https://www.metasploit.com
• HTTrack
https://www.httrack.com
• Black Widow
http://softbytelabs.com/us/downloads.html
• Burp Suite
https://portswigger.net/burp
• Scrapy
https://scrapy.org/