Apache logs monitoring

Monitoring Apache

There are many ways to examine Apache's status and performance

apachectl -v tells you the version number

apachectl -V gives you the complete compile settings

apachectl status gives you the server's status in the form of a scoreboard where, for each Apache child, you see its status as one of these characters:

_ waiting for connection

S starting up

R reading a request

W sending a reply

K keepalive

D performing DNS lookup

C closing connection

L logging information

G gracefully finishing

I idle cleanup

. open slot with no current process

_____CCCCCCC_____RR_CCCCCRR_________CC_CCC__......._____CCCCCCCRW______..................____CCCCLLCCCCCR____..................

Extended Status

You can obtain even more information (including PID) using ./apachectl fullstatus

this gives you a snapshot of the current status of each child

to use fullstatus:

load the mod_status.so module (not needed in Apache 2.2 if the module is compiled into the server, as it is in a default build)

add the directive ExtendedStatus On to your httpd.conf file

add a <Location> container for the address /server-status in your httpd.conf file that has the directive SetHandler server-status
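
The steps above can be sketched as the following httpd.conf fragment. It assumes an Apache 2.2-style layout; the module path modules/mod_status.so is an assumption and depends on how your server was built.

```apache
# Only needed if mod_status was built as a shared module (path is an assumption)
LoadModule status_module modules/mod_status.so

# Collect the extra per-request details shown by fullstatus
ExtendedStatus On

# Map the URL /server-status to the status handler
<Location /server-status>
    SetHandler server-status
</Location>
```

After restarting Apache, ./apachectl fullstatus should show the extended listing; the access restrictions discussed under Security still need to be added to the container.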

Now, when you type ./apachectl fullstatus, the listing gives you more details:

Srv: child server number and generation (in the form 5-1), and PID

Accesses: number of accesses for this connection and for this child

Mode (as per last slide, _, C, R, W, etc)

CPU usage, number of seconds

Seconds since beginning of most recent request

Milliseconds required to process most recent request

Kilobytes transferred for the connection

Mbytes transferred for this child

server-status and server-info

You can also view this information via a web browser

Either server status information (as from the last slide), or server information

for either/both of these, add a <Location /server-status> or <Location /server-info> container

NOTE: the URL for these is simply http://ipaddress/server-status or http://ipaddress/server-info

also add to the container the proper handler, SetHandler server-status or SetHandler server-info (server-info requires the mod_info module)

Information available by server-info includes

version, compilation date

modules loaded, directives of each

hostname, port

timeout, keep-alive directives

server root, configuration file location

Security

Making this information available presents a security risk

by knowing the version of Apache, it is easier for an attacker to break into the server and manipulate/destroy files

yet it is useful for a web administrator to be able to check status or server information at any time, either locally or remotely

In the <Location> container from the previous slide, let's add proper allow/deny statements to limit who can access this information

deny access to all except for the specific IP address or subnet from which our webadmin will access the server information

Order deny,allow

Deny from all

Allow from 10.2.3.0/24

by using 0 as the last octet together with /24, we are allowing access to anyone on this subnetwork (10.2.3.x)

the 24 is the length of the network mask in bits, which indicates how many leading octets must match (8 for the first octet, 16 for the first two, 24 for the first three)

do this for both the /server-status and /server-info containers (if we use both)
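
Putting the handler and the access rules together, the two containers might look like the sketch below. It uses the Apache 2.2 Order/Deny/Allow syntax from the slide and the example subnet 10.2.3.0/24; substitute your own administrator's address or subnet.

```apache
# Status page: reachable only from the 10.2.3.x subnet
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 10.2.3.0/24
</Location>

# Server information page: same restriction
<Location /server-info>
    SetHandler server-info
    Order deny,allow
    Deny from all
    Allow from 10.2.3.0/24
</Location>
```

On Apache 2.4, the three access lines in each container are normally replaced by the single directive Require ip 10.2.3.0/24.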

Error Pages

Apache is configured to generate a generic page on an error based on the status code

these response pages may lack useful information, so Apache allows you to alter the default configuration on errors

you can:

create your own error pages

create your own error scripts, for instance a PHP script

generate a short automated message

use a multi-language error page available in the error directory

redirect the attempt to a local URL (see for instance what happens at www.nku.edu when you specify any incorrect URL/filename)

redirect the attempt to an external URL

in your httpd.conf file, you set these up using the ErrorDocument directive of the form:

ErrorDocument error-code document-name (or message)

Examples

ErrorDocument 401 /subscribe.html

here, presumably the user was not able to log in successfully and thus generated a 401 error, so we bring up the page /subscribe.html

ErrorDocument 404 /cgi-bin/notfound.php

here, we run a script that we set up to handle any 404 (URL not found) errors (this is what NKU does)

ErrorDocument 500 "Server Error!!"

here, we return a page with the text Server Error!!

ErrorDocument 410 /var/web/errors/HTTP_GONE.html.var

here, we use one of the error pages made available in apache

these can respond differently based on several situations: the language of choice (based on language negotiation), and the response can include the value(s) of environment variable(s) such as HTTP_REFERER

ErrorDocument 505 http://www.errors.org/error505.cgi

redirect to an external URL because of wrong HTTP version

Using the Multi-Language Files

To use the multi-language error document files available in your error directory, there are several steps you will have to take

create an alias from /error/ to the actual location in your filespace of your error documents

Alias /error/ /usr/local/apache2/error/

notice the use of the trailing / here!

create a <Directory> container for that directory containing at a minimum

Options IncludesNoExec

AddOutputFilter Includes html

AddHandler type-map var

the files in this directory end with a .var extension

Order deny,allow

Allow from all (this is needed since / (root) is denied to all)

add your ErrorDocument directive

e.g., ErrorDocument 404 /error/HTTP_NOT_FOUND.html.var

these already exist in the file httpd-multilang-errordoc.conf

More on Multi-Language Pages

The nice thing about the multi-language error pages that are available in Apache is that, based on browser information, the language returned can be tailored to the user

if you look at any of these files, you see entries for Content-language for a number of different languages

based on the Accept-Language header sent by the browser, the Body with the matching Content-language is selected and returned

further, an if statement allows for a more specialized message depending on whether the page was reached directly or from a referer (a link)

In order to get the language selected appropriately, you might want to include two additional directives in your <Directory> container from the previous slide:

LanguagePriority list (of languages here, e.g., en cs de es)

ForceLanguagePriority Prefer Fallback
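
Taken together, the multi-language setup from these two slides might look like the following sketch. The path /usr/local/apache2/error/ is the one used in the slide and is an assumption about your installation; the language list is only an example.

```apache
# Map the URL prefix /error/ onto the directory holding the .var files
Alias /error/ "/usr/local/apache2/error/"

<Directory "/usr/local/apache2/error">
    # Allow server-side includes (without exec) in the error pages
    Options IncludesNoExec
    AddOutputFilter Includes html
    # Treat .var files as type maps for content negotiation
    AddHandler type-map var
    Order deny,allow
    Allow from all
    # Preferred languages when the browser expresses no usable preference
    LanguagePriority en cs de es
    ForceLanguagePriority Prefer Fallback
</Directory>

# Point individual status codes at the negotiated documents
ErrorDocument 404 /error/HTTP_NOT_FOUND.html.var
ErrorDocument 410 /error/HTTP_GONE.html.var
```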

External Redirects on Errors

An external redirect is not a matter of simply passing the buck

recall from chapter 5: a redirect sends a response to the web browser with a redirect status code (30x) and a new URL

the web browser then sends out a new HTTP request for the new URL

this can confuse crawlers and other agents that were expecting either content back from their requests or error codes if the request could not be fulfilled; instead, they are given a new URL to pursue

the redirection can also cause problems if it arises during authentication: because the browser does not receive a 401 code, it will not prompt the user for a password, potentially leaving the user confused as to why the original request was not fulfilled and why they were taken to a different location

Automatic Logging

There are two forms of logging that are taken care of automatically

access logging: logging every request sent by clients (browsers, users, software)

error logging: logging any request that results in an error

Either type of event will place a new entry into the appropriate log file

Each entry will contain at a minimum

the time/date of the request

the URL

the IP address of the requester

For errors, the status code will be included with the entry

For accesses, the command serviced (e.g., GET), the status code, and the browser's specification (type, OS, HTTP version) will be included with the entry

Typically, Apache performs the logging itself rather than invoking syslogd or klogd as with other Linux services

Error Logging

Errors can be written to a log file, sent to a pipe (that is, piped to a Linux command) or written to the Linux syslog service

There are two Apache directives to control error logging

ErrorLog: specify the file, pipe, or syslog

if you do not set ErrorLog, it defaults to writing to the file logs/error_log under ServerRoot

if you specify a filename, it is assumed to be under ServerRoot unless you specify the full path

if the filename starts with |, then the information is piped to the command that follows the |, as in |cat, which would display the error information in the terminal window (probably a poor option)

if you specify syslog, the syslogd service is used and follows the action in the /etc/syslog.conf file for local7 messages

LogLevel: one of emerg, alert, crit, error, warn, notice, info, debug (see table 7-2 on page 182 for more detail)
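
As a rough illustration of the three ErrorLog destinations, the fragment below shows one active choice and two commented-out alternatives. The program /usr/local/bin/error-filter is hypothetical and stands in for whatever command you would pipe to.

```apache
# Option 1: a file, relative to ServerRoot
ErrorLog logs/error_log

# Option 2: the syslog service, using the local7 facility
# ErrorLog syslog:local7

# Option 3: a pipe to an external program (hypothetical path)
# ErrorLog "|/usr/local/bin/error-filter"

# Record messages of severity warn and above
LogLevel warn
```

Only one ErrorLog directive should be active per server or virtual host.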

IE Browser Error Pages

IE tends to ignore the error pages sent by Apache and it displays its own, more generic page

MS considers its own pages to be more user friendly

the problem is that the error page sent by Apache might include some useful content that an IE user will not see

IE will only display the server's error pages for

403, 405, 410 errors if the page's size > 256 bytes

400, 404, 406, 408, 409, 500, 501, 505 errors if the page's size > 512 bytes

but these pages, as generated by Apache, tend to be smaller than the byte sizes listed above

there is a way to force IE to display the sent error page using the Windows Registry, but most users will not be aware of this

or, you could create your own error pages and make sure that they are > 512 bytes to force IE to display your pages

I tried both of these and I could not get IE to display the Apache page, so I'm unsure if it's even possible!

I/O Logging

Aside from logging requests and errors, you can log regular Apache I/O if desired

this requires the use of the mod_dumpio module

this is not part of the base Apache, so it must be separately compiled

add the LoadModule statement to httpd.conf

there are three directives

DumpIOInput On (or Off, the default)

DumpIOOutput On (or Off, the default)

LogLevel value, where value is one of emerg, alert, crit, error, warn, notice, info, debug; here, you need to use debug

the I/O logging is sent to your error log file, and because this generates an enormous number of messages, you will probably not want to use this feature at all, or at least not for very long
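
A minimal sketch of the I/O logging setup described above, for an Apache 2.2-style configuration. The module path modules/mod_dumpio.so is an assumption about where the compiled module was installed.

```apache
# Load the separately compiled module (path is an assumption)
LoadModule dumpio_module modules/mod_dumpio.so

# Dump everything received from and sent to clients into the error log
DumpIOInput On
DumpIOOutput On

# mod_dumpio output only appears at the most verbose level
LogLevel debug
```

On Apache 2.4 the module logs at a trace level instead, so the last line would typically become LogLevel dumpio:trace7.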

Access Logging

All HTTP requests to your server are logged in the access log

these include requests that result in errors

Unlike error logging, these can only be logged to a specified file or written to a pipe

they cannot be sent to syslogd

You can specialize the access log using the mod_log_config module, which offers two directives

CustomLog, which allows you to specify a new place for the output (a different file or a pipe)

LogFormat, which allows you to specify how accesses are logged in terms of what types of information (we will see details on this in the next slide)

in addition, the mod_setenvif module can be used to set various environment variables based on attributes of a request

Log Formatting

The LogFormat directive allows you to specify how you want your log entries to appear

you are able to define different formats and have them sent to different files, although this may not be useful

LogFormat format name

CustomLog location name

format is a specification of the type of information to record and in what order it should be recorded (covered over the next few slides)

location is the location in your file system where you want the log file to be written; if you specify a relative path, it is relative to ServerRoot

name is the same on both lines and is used to link a specified format to a log file

you can shorten this by just writing CustomLog location format, omitting the LogFormat directive and the name, but this means that you cannot share a format between two or more different log files

You can also specify under what condition(s) a format might be used (for instance, if the access resulted in a 200 status)

therefore, you can specify multiple logs, each with its own format

More

The format will comprise a series of percent directives (covered on the next slide) that specify what information should be logged (recorded)

these include such pieces of information as the requester's IP address, URL requested, time of request, etc

the entire format is placed inside quotation marks; for example, "%a %U" means IP address of client and URL requested

Conditional directives allow you to specify what status code(s) you desire for that piece of information to be logged

multiple status codes are separated by commas, and the code(s) appear between % and the directive letter

%200a means to log %a (IP addr of client) if the status code is 200

%400,401,402,403,404U means to log %U (URL) if the status code is any of 400-404

you can also place ! in front of the number to negate the condition, as in %!200a (log %a unless the status code is 200)

if the condition is not met, the requested value is replaced by a hyphen (- ) in the log file

Useful Percent Directives

The full set of percent directives is given in table 7-3 on page 188; here, we look at the most useful

%a remote IP address

%A server IP address

%B bytes sent excluding header

%c connection status when complete

%D duration of request (in microseconds)

%f filename (resource)

%H request protocol

%m request method

%P PID of child servicing request

%s status

%t time of request

%u remote user (only available if user has authenticated)

%U requested URL

%{X}e output the value of environment variable X

%{X}i output the value of request header X (X might be User-agent or Referer)

%{format}t output the time using the provided format

Examples

The Common Log Format is a standard format developed for NCSA servers

this format string is "%h %l %u %t \"%r\" %>s %b", which is the host, remote logname (or - if not known/supported), user name (if known through login), date, request (inside quotation marks, written as \" since \ is the escape character), status (3 digit code, with the > selecting the final status), and bytes transferred, excluding the headers

Imagine that your website is linked from other sites and you want to know how often a visitor has reached your site through one of those links (referers)

use %{Referer}i -> %U

this records into your log file the referer and the URL (how they got here and where they tried to go)

Or you might want to know the web browser of a visitor

use %{User-agent}i
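
The formats discussed on this slide could be wired up as in the sketch below. The nicknames common, referer, and agent and the log file names are conventional choices, not requirements.

```apache
# Common Log Format: host, logname, user, time, request line, final status, bytes
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common

# Where visitors came from and which URL they requested
LogFormat "%{Referer}i -> %U" referer
CustomLog logs/referer_log referer

# Which browser each visitor used
LogFormat "%{User-agent}i" agent
CustomLog logs/agent_log agent
```

Each CustomLog line writes one entry per request, so this configuration produces three parallel log files.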

Multiple Logs

Let's imagine that we want to have one log for all successful operations, one log for redirections, and one log for 40x errors

LogFormat "%200a %200U %200t" success

CustomLog logs/success_log success

LogFormat "%301,302,303a %301,302,303U %301,302,303t" redirection

CustomLog logs/redirection_log redirection

LogFormat "%401,403,404,410a %401,403,404,410U %401,403,404,410t" error40x

CustomLog logs/error_log_40x error40x

We could change the format so that each log file logs different types of information

for instance we might want to know the specific error for the error_log_40x file by adding %s

note that %s will return the original request's status in the case of an internal redirection; if we want the final status, use %>s

or the size of the file (%B) on a 200 success
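
As a sketch of the two variations just described, the success and error40x formats could be extended as follows. The nicknames and file names are the ones from the slide; only the trailing fields are new.

```apache
# Successful requests: also record the response size in bytes
LogFormat "%200a %200U %200t %200B" success
CustomLog logs/success_log success

# 40x errors: also record the final status code
LogFormat "%401,403,404,410a %401,403,404,410U %401,403,404,410t %>s" error40x
CustomLog logs/error_log_40x error40x
```

Note that every request still produces a line in each file; fields whose condition is not met simply appear as hyphens.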

SetEnvIf Directives

This directive allows you to set an environment variable which you can then use for your logging

the format of the directive is

SetEnvIf attribute regex env-variable[=value]

you can set multiple variables if desired

the attribute is usually a value from the request header (e.g., Host, User-Agent, Referer, Range) or it can be one of Remote_Addr, Remote_User, Request_Method, Request_Protocol, Request_URI, or it can be an already defined environment variable

example: SetEnvIf Referer www\.nku\.edu internal

this sets the variable internal (to true)

example: SetEnvIf Remote_Addr 127\.0\.0\.1 self

example: SetEnvIf Request_URI \.gif$ type=gif

SetEnvIf Request_URI \.jpg$ type=jpg

this will set the variable type to be of the type of image requested
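
To show how such variables feed into logging, here is a sketch that logs image requests to their own file and keeps local requests out of the main log. The nickname images and the file name logs/image_log are hypothetical.

```apache
# Tag image requests with their type
SetEnvIf Request_URI "\.gif$" type=gif
SetEnvIf Request_URI "\.jpg$" type=jpg

# Log only requests where the variable type is set, and include its value
LogFormat "%h %t %U %{type}e" images
CustomLog logs/image_log images env=type

# Tag requests from the server itself and leave them out of the main log
SetEnvIf Remote_Addr "127\.0\.0\.1" self
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common env=!self
```

The env=name condition logs a request only when the variable is set, and env=!name only when it is not.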

Log Rotation

If you are running a web server for even a modest sized web site, you may receive thousands of hits a day

each of these is logged in the access_log file and the error_log file may become large as well

log rotation is the process of moving the current log file into a retired log file

these typically appear with .# after their name, as in access_log, access_log.1, access_log.2, access_log.3; on each rotation the previous access_log.3 is deleted and the new access_log starts blank

depending on how quickly a log file fills up, you may want to rotate the files every day, every week or every month

while you might write your own script to handle this and then schedule it as a cron job, there is a built-in Apache program called rotatelogs that does this for you

this program is typically in the same directory as apachectl

you run it as rotatelogs filename rotationtime (in seconds; 86400 is every day), normally from a piped CustomLog or ErrorLog directive
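
In practice rotatelogs is usually started by Apache itself through a piped log, roughly as sketched below. The paths under /usr/local/apache2 are assumptions about the installation, and the nickname common is assumed to be defined by a LogFormat directive elsewhere.

```apache
# Rotate the access log once a day (86400 seconds)
CustomLog "|/usr/local/apache2/bin/rotatelogs /usr/local/apache2/logs/access_log 86400" common

# Rotate the error log on the same schedule
ErrorLog "|/usr/local/apache2/bin/rotatelogs /usr/local/apache2/logs/error_log 86400"
```

Be aware that rotatelogs names each file by appending the start time of the period to the given filename rather than using .1, .2, .3 suffixes, and it does not delete old files on its own.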

Favicon.ico

The favicon is an icon, typically the site's logo, that is displayed in the browser's address bar next to the URL (you can also see these in bookmarks)

the icon will reside in the web site's home directory (DocumentRoot)

If a site does not have a favicon.ico in that directory, typically the error and access logs fill up with error messages

you have three ways to prevent this

create an icon and put it in this directory

create a 0 byte file whose name is favicon.ico in this directory

suppress the log messages as follows:

SetEnvIf Request_URI favicon\.ico favicon

CustomLog logs/access_log common env=!favicon

this says: for any request for favicon.ico, set the variable favicon to true, and log a request only when favicon is not set

Reporting Programs

You want to search your log files for useful information

how many people are visiting? what errors are arising? is the same IP address sending numerous requests (e.g., a denial of service attack)?

wading through thousands of entries can be time consuming

you have many choices, such as using awk or writing your own shell scripts

with awk, you could count the number of times each unique IP address is found to see if you are being attacked

with your own script, you could generate a report that lists all of the 404 errors by URL so that you could see if there are URLs that are being misinterpreted by the users

AWStats is a reporting tool that can dig through your file(s) for useful information, like trends, that you might want to share with your marketing department

this is open source software