Analyse your SEO Data with R and Kibana
Analyse your SEO Data with R and Kibana
June 10th, 2016
Vincent Terrasi
Vincent Terrasi
--
SEO Director - Groupe M6Web CuisineAZ, PasseportSanté, MeteoCity, …
--
Join the OVH adventure in July 2016
Blog : data-seo.com
Agenda
Mission: Build a Real-Time Log Analysis Tool
1. Using Screaming Frog to crawl a website
2. Using R for SEO analysis
3. Using PaasLogs to centralize logs
4. Using Kibana to build fancy dashboards
5. Test!
3
“The world is full of obvious things which nobody by any chance ever observes.”
— Sherlock Holmes
Real-Time Log Analysis Tool 4
• Screaming Frog
• Google Analytics
• R Crawler
• IIS Logs
• Apache Logs
• Nginx Logs
Using Screaming Frog
Screaming Frog : Export Data 6
Add your URL and click the Start button.
When the crawl is finished, click the Export button and save the XLSX file.
Screaming Frog : Data ! 7
"Address"
"Content"
"Status Code"
"Status"
"Title 1"
"Title 1 Length"
"Title 1 Pixel Width"
"Title 2"
"Title 2 Length"
"Title 2 Pixel Width"
"Meta Description 1"
"Meta Description 1 Length"
"Meta Description 1 Pixel Width"
"Meta Keyword 1"
"Meta Keywords 1 Length"
"H1-1"
"H1-1 length"
"H2-1"
"H2-1 length"
"H2-2"
"H2-2 length"
"Meta Robots 1"
"Meta Refresh 1"
"Canonical Link Element 1"
"Size"
"Word Count"
"Level"
"Inlinks"
"Outlinks"
"External Outlinks"
"Hash"
"Response Time"
"Last Modified"
"Redirect URI"
"GA Sessions"
"GA % New Sessions"
"GA New Users"
"GA Bounce Rate"
"GA Page Views Per Session"
"GA Avg Session Duration"
"GA Page Value"
"GA Goal Conversion Rate All"
"GA Goal Completions All"
"GA Goal Value All"
"Clicks"
"Impressions"
"CTR"
"Position"
"H1-2"
"H1-2 length"
Using R
Why R? 9
Scriptable
Big community
Mac / PC / Unix
Open source
7500 packages
Documentation
WheRe? How? 10
https://www.cran.r-project.org/
Rgui / RStudio
Using R : Step 1 11
Export all URLs
"request";"section";"active";"speed";"compliant";"depth";"inlinks"
Packages:
stringr
ggplot2
dplyr
readxl
R Examples 12
Crawl via Screaming Frog
Classify URLs by: section, load time, number of inlinks
Detect active pages: min. 1 visit per month
Detect compliant pages: canonical not equal, meta noindex, bad HTTP status code
Detect duplicate meta
R : read files 13
# Read the Screaming Frog export (readxl package)
urls <- read_excel("internal_html_blog.xlsx",
                   sheet = 1,
                   col_names = TRUE,
                   skip = 1)
# Or read a semicolon-separated CSV export (base R)
urls <- read.csv2("internal_html_blog.csv", sep = ";", header = TRUE)
Detect Active Pages 14
#default
urls_select$Active <- FALSE
urls_select$Active[ which(urls_select$`GA Sessions` > 0) ] <- TRUE
#factor
urls_select$Active <- as.factor(urls_select$Active)
Classify URLs by Section 15
# conf.csv holds one URL pattern per line (stri_detect_fixed is from stringi)
schemas <- read.csv("conf.csv", header = FALSE, col.names = "schema", stringsAsFactors = FALSE)
urls_select$Cat <- "no match"
for (j in 1:nrow(schemas))
{
  urls_select$Cat[ which(stri_detect_fixed(urls_select$Address, schemas$schema[j])) ] <- schemas$schema[j]
}
conf.csv:
/agenda/sorties-cinema/
/agenda/parutions/
/agenda/evenements/
/agenda/programme-tv/
/encyclopedie/
Classify URLs By Load Time 16
urls_select$Speed <- NA
urls_select$Speed[ which(urls_select$`Response Time` < 0.501) ] <- "Fast"
urls_select$Speed[ which(urls_select$`Response Time` >= 0.501
                         & urls_select$`Response Time` < 1.001) ] <- "Medium"
urls_select$Speed[ which(urls_select$`Response Time` >= 1.001
                         & urls_select$`Response Time` < 2.001) ] <- "Slow"
urls_select$Speed[ which(urls_select$`Response Time` >= 2.001) ] <- "Slowest"
urls_select$Speed <- as.factor(urls_select$Speed)
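The threshold-by-threshold assignments can also be written as a single `cut()` call in base R; a minimal sketch with invented response times (same breakpoints as the slide):

```r
# Bucket response times (seconds) into the slide's four speed classes.
# right = FALSE makes each interval [lower, upper), matching < / >= above.
times <- c(0.2, 0.7, 1.5, 3.0)  # invented example values
speed <- cut(times,
             breaks = c(-Inf, 0.501, 1.001, 2.001, Inf),
             labels = c("Fast", "Medium", "Slow", "Slowest"),
             right = FALSE)
```

`cut()` returns a factor directly, so the final `as.factor()` step is no longer needed.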
Classify URLs By Number of Inlinks 17
urls_select$`Group Inlinks` <- "URLs with No Follow Inlinks"
urls_select$`Group Inlinks`[ which(urls_select$Inlinks == 1) ] <- "URLs with 1 Follow Inlink"
urls_select$`Group Inlinks`[ which(urls_select$Inlinks > 1
                                   & urls_select$Inlinks < 6) ] <- "URLs with 2 to 5 Follow Inlinks"
urls_select$`Group Inlinks`[ which(urls_select$Inlinks >= 6
                                   & urls_select$Inlinks < 11) ] <- "URLs with 6 to 10 Follow Inlinks"
urls_select$`Group Inlinks`[ which(urls_select$Inlinks >= 11) ] <- "URLs with more than 10 Follow Inlinks"
urls_select$`Group Inlinks` <- as.factor(urls_select$`Group Inlinks`)
Detect Compliant Pages 18
# A page is non-compliant if any of:
# - bad HTTP status code
# - canonical not equal to the URL
# - meta noindex
urls_select$Compliant <- TRUE
urls_select$Compliant[ which(urls_select$`Status Code` != 200
                             | urls_select$`Canonical Link Element 1` != urls_select$Address
                             | urls_select$Status != "OK"
                             | grepl("noindex", urls_select$`Meta Robots 1`)
                             ) ] <- FALSE
urls_select$Compliant <- as.factor(urls_select$Compliant)
Detect Duplicate Meta 19
urls_select$`Status Title` <- 'Unique'
urls_select$`Status Title`[ which(urls_select$`Title 1 Length` == 0) ] <- 'Not Set'
urls_select$`Status Description` <- 'Unique'
urls_select$`Status Description`[ which(urls_select$`Meta Description 1 Length` == 0) ] <- 'Not Set'
urls_select$`Status H1` <- 'Unique'
urls_select$`Status H1`[ which(urls_select$`H1-1 length` == 0) ] <- 'Not Set'
urls_select$`Status Title`[ which(duplicated(urls_select$`Title 1`)) ] <- 'Duplicate'
urls_select$`Status Description`[ which(duplicated(urls_select$`Meta Description 1`)) ] <- 'Duplicate'
urls_select$`Status H1`[ which(duplicated(urls_select$`H1-1`)) ] <- 'Duplicate'
urls_select$`Status Title` <- as.factor(urls_select$`Status Title`)
urls_select$`Status Description` <- as.factor(urls_select$`Status Description`)
urls_select$`Status H1` <- as.factor(urls_select$`Status H1`)
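One caveat with `duplicated()`: it marks only the second and later copies, so the first page of each duplicate group keeps the status 'Unique'. If every member of a group should be flagged, scan in both directions; a minimal base-R sketch with invented titles:

```r
# duplicated(x) flags later copies only; adding fromLast = TRUE also flags
# the first copy, so every row sharing a title is marked as a duplicate.
titles <- c("Home", "Recipes", "Home", "About")  # invented examples
is_dup <- duplicated(titles) | duplicated(titles, fromLast = TRUE)
```

With the slide's code, only the third title would be 'Duplicate'; this version flags the first and third.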
Generate CSV 20
urls_light <- select(urls_select, Address, Cat, Active, Speed, Compliant, Level, Inlinks) %>%
  mutate(Address = gsub("http://moniste.fr", "", Address))
colnames(urls_light) <- c("request","section","active","speed","compliant","depth","inlinks")
write.csv2(urls_light, "file.csv", row.names = FALSE)
Package dplyr: select and mutate
Edit colnames
Use write.csv2
R : ggplot2 command 21
DATA
Create the ggplot object and populate it with data (always a data frame)
ggplot( mydata, aes( x=section,y=count, fill=active ))
LAYERS
Add layer(s)
+ geom_point()
FACET
Used for conditioning on variable(s)
+ facet_grid(~rescode)
ggplot2 : Geometry 22
R Chart : Active Pages 23
urls_level_active <- group_by(urls_select, Level, Active) %>%
  summarise(count = n()) %>%
  filter(Level < 12)
p <- ggplot(urls_level_active, aes(x = Level, y = count, fill = Active)) +
  geom_bar(stat = "identity", position = "stack") +
  scale_fill_manual(values = c("#e5e500", "#4DBD33")) +
  labs(x = "Depth", y = "Crawled URLs")
# display
print(p)
# save to file
ggsave(file = "chart.png")
R Chart : GA Sessions 24
urls_cat_gasessions <- aggregate( urls_select$`GA Sessions`, by=list(Cat=urls_select$Cat, urls_select$Compliant), FUN=sum, na.rm=TRUE)
colnames(urls_cat_gasessions) <- c("Category","Compliant","GA Sessions")
p <- ggplot(urls_cat_gasessions, aes(x=Category, y=`GA Sessions`, fill=Compliant))+
geom_bar(stat = "identity", position = "stack") +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
labs(x = "Section", y ="Sessions") +
scale_fill_manual(values=c("#e5e500","#4DBD33"))
#display
print(p)
# save in file
ggsave(file = "chart.png")
R Chart : Compliant 25
urls_cat_compliant_statuscode <- group_by(urls_select,Cat,
Compliant,`Status Code`) %>%
summarise(count = n()) %>%
filter(grepl(200,`Status Code`) | grepl(301,`Status Code`))
p <- ggplot(urls_cat_compliant_statuscode, aes(x=Cat, y=count,
fill= Compliant ) ) +
geom_bar(stat = "identity", position = "stack") +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
facet_grid(`Status Code` ~ .) +
labs(x = "Section", y ="Crawled URLs") +
scale_fill_manual(values=c("#e5e500","#4DBD33"))
R : SEO Cheat Sheet 26
Package dplyr
select() lets you rapidly zoom in on a useful subset of columns
mutate() adds new columns or replaces existing ones in a data frame
filter() selects a subset of rows in a data frame
Package ggplot2
aes - geom
ggsave()
Package readxl
read_excel()
Base R
read.csv2()
write.csv2()
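The three dplyr verbs on this cheat sheet have direct base-R equivalents; a minimal sketch on an invented toy data frame:

```r
# Toy data frame (invented values) mirroring the crawl columns
df <- data.frame(Address = c("http://example.com/a", "http://example.com/b"),
                 Inlinks = c(3, 12),
                 Level   = c(1, 4),
                 stringsAsFactors = FALSE)
# select(): keep a subset of columns
sel <- df[, c("Address", "Inlinks")]
# mutate(): add or replace a column
sel$request <- gsub("http://example.com", "", sel$Address)
# filter(): keep a subset of rows
deep <- df[df$Level > 2, ]
```

The dplyr versions read better in pipelines, but the base forms are handy when a package is unavailable.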
ELK
Architecture 28
Hard to monitor and optimize host server performance
Architecture 29
Using PaasLogs
PaasLogs 31
PaasLogs 32
164 nodes in the Elasticsearch cluster
180 connected machines
Between 100,000 and 300,000 logs processed per second
12 billion logs in transit every day
211 billion documents stored
8 clicks and 3 copy/pastes to use it!
PaasLogs: Step 1 33
PaasLogs : Step 2 34
PaasLogs 35
PaasLogs : Streams 36
The Streams are the recipients of your logs. When you send a log with the right stream token, it automatically arrives in your stream in an awesome piece of software named Graylog.
PaasLogs : Dashboards 37
The Dashboard is the global view of your logs. A Dashboard is an efficient way to exploit your logs and to view global information, such as metrics and trends about your data, without being overwhelmed by log details.
PaasLogs : Aliases 38
The Aliases allow you to access your data directly from Kibana or via an Elasticsearch query.
DON'T FORGET TO ENABLE KIBANA INDICES AND SET YOUR USER PASSWORD
PaasLogs : Inputs 39
The Inputs allow you to ask OVH to host your own dedicated collector, such as Logstash or Flowgger.
PaasLogs : Network Configuration 40
PaasLogs : Plugins Logstash 41
OVHCOMMONAPACHELOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\]
"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion_num:float})?|%{DATA:rawrequest})"
%{NUMBER:response_int:int} (?:%{NUMBER:bytes_int:int}|-)
OVHCOMBINEDAPACHELOG %{OVHCOMMONAPACHELOG} "%{NOTSPACE:referrer}" %{QS:agent}
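To see which fields the pattern yields, here is a base-R regex sketch applied to one hypothetical combined-log line. The capture names follow the grok pattern above; the regex is an approximation for illustration, not the grok engine itself:

```r
# One invented Apache combined-log line
line <- '203.0.113.7 - - [10/Jun/2016:10:00:00 +0200] "GET /agenda/parutions/ HTTP/1.1" 200 5120 "-" "Googlebot/2.1"'
# Captures: clientip, timestamp, verb, request, response, bytes
m <- regmatches(line, regexec(
  '^(\\S+) \\S+ \\S+ \\[([^]]+)\\] "(\\S+) (\\S+) [^"]*" (\\d+) (\\d+)', line))[[1]]
clientip <- m[2]
verb     <- m[4]
request  <- m[5]
response <- as.integer(m[6])
bytes    <- as.integer(m[7])
```

These are exactly the fields (`clientip`, `verb`, `request`, `response_int`, `bytes_int`) that the dashboards filter on later.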
PaasLogs : Config Logstash 42
if [type] == "apache" {
grok {
match => [ "message", "%{OVHCOMBINEDAPACHELOG}"]
patterns_dir => "/opt/logstash/patterns"
}
}
if [type] == "csv_infos" {
csv {
columns => ["request", "section","active", "speed",
"compliant","depth","inlinks"]
separator => ";"
}
}
How to send Logs to PaasLogs ? 43
Use Filebeat 44
Filebeat : Install 45
# Install Filebeat
curl -L -O https://download.elastic.co/beats/filebeat/filebeat_1.2.1_amd64.deb
sudo dpkg -i filebeat_1.2.1_amd64.deb
# or see https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-installation.html
# Edit filebeat.yml
nano /etc/filebeat/filebeat.yml
# > set the path to the log file
# > set the host: c002-5717e1b5d2ee5e00095cea38.in.laas.runabove.com
# Copy Key.crt
# Start / stop Filebeat
sudo /etc/init.d/filebeat start
sudo /etc/init.d/filebeat stop
# Check the logs if something goes wrong
tail -f /var/log/filebeat.log
Filebeat : Edit filebeat.yml 46
filebeat:
  prospectors:
    -
      paths:
        - /home/ubuntu/lib/apache2/log/access.log
      input_type: log
      fields_under_root: true
      document_type: apache
    -
      paths:
        - /home/ubuntu/workspace/csv/crawled-urls-filebeat-*.csv
      input_type: csv
      fields_under_root: true
      document_type: csv_infos
output:
  logstash:
    hosts: ["c002-5717e1b5d2ee5e00095cea38.in.laas.runabove.com:5044"]
    worker: 1
    tls:
      certificate_authorities: ["/home/ubuntu/workspace/certificat/key.crt"]
Filebeat : Start 47
Copy / Paste Key.crt
-----BEGIN CERTIFICATE-----
MIIDozCCAougAwIBAgIJALxR4fTZlzQMMA0GCSqGSIb3DQEBCwUAMGgxCzAJBgNVBAYTAkZSMQ
8wDQYDVQQIDAZGcmFuY2UxDjAMBgNVBAcMBVBhcmlzMQwwCgYDVQQKDANPVkgxCzAJBgNVB
AYTAkZSMR0wGwYDVQQDDBRpbi5sYWFzLnJ1bmFib3ZlLmNvbTAeFw0xNjAzMTAxNTEzMDNaFw0
xNzAzMTAxNTEzMDNaMGgxCzAJBgNVBAYTAkZSMQ8wDQYDVQQIDAZGcmFuY2UxDjAMBgNVBA
cMBVBhcmlzMQwwCgYDVQQKDANPVkgx
-----END CERTIFICATE-----
Start Filebeat
sudo /etc/init.d/filebeat start
sudo /etc/init.d/filebeat stop
How to combine multiple sources ? 48
PaasLogs : Plugins ES 49
Description: copies fields from previous log events in Elasticsearch to current events
if [type] == "apache" {
  elasticsearch {
    hosts => "laas.runabove.com"
    index => "logsDataSEO" # alias
    ssl => true
    query => 'type:csv_infos AND request: "%{[request]}"'
    fields => [["speed","speed"],["compliant","compliant"],
               ["section","section"],["active","active"],
               ["depth","depth"],["inlinks","inlinks"]]
  }
}
# TIP : fields => [[src,dest],[src,dest]]
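The enrichment this filter performs is essentially a lookup join: each Apache event pulls section/speed/etc. from the crawl CSV event with the same `request`. A minimal base-R sketch of the same idea, with invented toy data:

```r
# Invented Apache hits and crawl-CSV rows sharing a "request" key
hits <- data.frame(request = c("/a", "/b", "/a"),
                   status  = c(200, 200, 301),
                   stringsAsFactors = FALSE)
crawl <- data.frame(request = c("/a", "/b"),
                    section = c("agenda", "encyclopedie"),
                    speed   = c("Fast", "Slow"),
                    stringsAsFactors = FALSE)
# Left join: every hit keeps its row, gaining the crawl attributes
enriched <- merge(hits, crawl, by = "request", all.x = TRUE)
```

In production the join happens per event inside Logstash; the sketch only shows the shape of the result.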
Using Kibana
Kibana : Install 51
Download Kibana 4.1
• Download and unzip Kibana 4
• Extract your archive
• Open config/kibana.yml in an editor
• Set the elasticsearch.url to point at your Elasticsearch instance
• Run ./bin/kibana (or bin\kibana.bat on Windows)
• Point your browser at http://yourhost.com:5601
Kibana : Edit Kibana.yml 52
Update kibana.yml
server.port: 8080
server.host: "0.0.0.0"
elasticsearch.url: "https://laas.runabove.com:9200"
elasticsearch.preserveHost: true
kibana.index: "ra-logs-33078"
kibana.defaultAppId: "discover"
elasticsearch.username: "ra-logs-33078"
elasticsearch.password: "rHftest6APlolNcc6"
Kibana : Line Chart 53
Number of active pages crawled by Google over a period of time
Kibana : Vertical Bar Chart 54
Kibana : Pie Chart 55
How to compare two periods ? 56
Kibana : Use Date Range 57
Final Architecture 58
Sources: IIS, Apache, Nginx, HA Proxy
Filebeat → PaasLogs → Kibana
Soft real-time; old logs are also supported
Test yourself 59
Use the Screaming Frog Spider Tool
www.screamingfrog.co.uk
Learn R
www.datacamp.com
www.data-seo.com
www.moise-le-geek.fr/push-your-hands-in-the-r-introduction/
Test PaasLogs
www.runabove.com
Install Kibana
www.elastic.co/downloads/kibana
TODO List 60
- Create a GitHub Repository with all source code
- Add Plugin Logstash to do a reverse DNS lookup
- Schedule A Crawl By Command Line
- Upload Screaming Frog File to web server
Thank you
Keep in touch: @vincentterrasi
Vincent Terrasi, June 10th, 2016