For any good monitoring solution, the quality of the probes is key; otherwise you have a "garbage in -- garbage out" situation.
Many Unix utilities can be reused as probes with a scripting-language wrapper. For example, most UNIX systems have the sar utility, whose output can be piped into a Perl script and/or converted into HTML; the latter can be sent as a status report to a monitoring Web server. Sar usually runs periodically from cron (for example, every 15 minutes), but you can implement any type of scheduling you wish. Similarly, the most basic heartbeat capabilities can be achieved using ping or a similar module in Perl, Python, or another scripting language. It is important to design and use a unified architecture for most probes (some specialized probes can, of course, be an exception). As Damir Delija aptly noted in his Sys Admin article Unix Monitoring Scripts:
A monitoring tool or script is part of system management and to be really efficient must be part of an enterprise-wide effort, not a standalone tool. Its purpose is to detect problems and send alerts or, rarely, to try to correct the problem. Basically, a monitoring/alerting tool consists of four different parts:
- Configuration -- Defines the environment and does initializations, sets the defaults, etc.
- Sensor -- Collects data from the system or fetches pre-stored data.
- Conditions -- Decides whether events are fired.
- Actions -- Takes action if events are fired.
If these elements are simply bundled into a script without thinking, the script will be ineffective and un-adaptable. Good tools also include an abstraction layer added to simplify things later, when modifications are done.
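To make the four parts concrete, here is a minimal Perl sketch of a disk-space probe laid out along these lines. The threshold, the wrapped command, and the alert address are illustrative assumptions, not taken from Delija's article:

#!/usr/bin/perl
# Minimal probe skeleton following the four-part structure above.
use strict;
use warnings;

# 1. Configuration: defaults and environment (placeholder values)
my %cfg = (
    threshold => 90,                  # alert when any filesystem exceeds 90%
    alert_to  => 'admin@example.com', # hypothetical recipient
);

# 2. Sensor: collect data from the system (here, df output)
my @usage;
for my $line (`df -kl`) {
    next if $line =~ /^Filesystem/i;
    my @f = split ' ', $line;
    next unless $f[4] && $f[4] =~ /^(\d+)%$/;
    push @usage, { fs => $f[5], pct => $1 };
}

# 3. Conditions: decide whether events fire
my @events = grep { $_->{pct} > $cfg{threshold} } @usage;

# 4. Actions: take action on fired events (here, one mail alert per event)
for my $e (@events) {
    open my $mail, '|-', "mail -s 'Disk space alert' $cfg{alert_to}"
        or die "cannot run mail: $!";
    print $mail "$e->{fs} is at $e->{pct}% of capacity\n";
    close $mail;
}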
Generally, in any monitoring task Perl is your friend, and there is a tremendous amount of free Perl probes available on the Internet, either as standalone modules/utilities or as part of monitoring packages. Usually they are well written and quite simple, and thus can be adapted to your task without too much effort. For example, if you want to monitor web logs, a package such as W3Perl can be quite handy.
Many probes can be created as simple Perl wrappers around a command whose output is important for judging the health of the server, of a subsystem (CPU, disk space, etc.), or of a particular application. If you put some effort into developing a common structure and then enforce it (that is, adopt a unified probe architecture), you can achieve significant savings in writing code and will spend less time and effort on maintenance. Careful selection and adoption (with possible extension) of one of the existing probe architectures, for example mon or Nagios, is important for successful development of a custom monitoring infrastructure; it is important not to reinvent the wheel.
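For instance, here is a sketch (not code from mon or Nagios) of a trivial wrapper probe around uptime that emits load averages as key-value pairs, one per line. If all probes follow such a unified output format, the rest of the infrastructure does not have to care which command each one wraps:

#!/usr/bin/perl
# Sketch of a unified wrapper probe: run a standard command (uptime) and
# emit its interesting numbers as key-value pairs on stdout.
use strict;
use warnings;

my $uptime = `uptime`;
die "uptime failed\n" unless defined $uptime;

# Typical tail of uptime output: "load average: 0.15, 0.20, 0.18"
if ($uptime =~ /load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)/) {
    print "loadavg1 $1\nloadavg5 $2\nloadavg15 $3\n";
    exit 0;
}
die "could not parse uptime output: $uptime";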
As soon as you have a set of useful probes, you need infrastructure to run them. In many cases you do not need anything fancy, and it can be really, really simple. The most basic structure of a monitoring package is an infinite loop that runs the probes each polling interval. Each probe can write to a named pipe attached to a converter of key-value pairs (or of HTML, if you want to be fancy, but please keep it simple). The script on the other end is essentially an agent: code designed to pass the message to the server. For remote probes, SMTP mail can be used as the simplest delivery mechanism, or probes can communicate directly with the Web server via Web forms.
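A minimal runner along these lines might look like the following sketch; the probe directory and the polling interval are placeholder choices:

#!/usr/bin/perl
# Minimal probe runner: an infinite loop that executes every probe found in
# a directory once per polling interval and forwards their key-value output.
use strict;
use warnings;

my $probe_dir = '/usr/local/admin/probes';   # placeholder location
my $interval  = 300;                         # poll every 5 minutes

while (1) {
    for my $probe (glob "$probe_dir/*") {
        next unless -x $probe;
        my $output = `$probe 2>&1`;          # run probe, capture key-value pairs
        # Forward the result; here we just print it, but this is where an
        # agent would write to a named pipe, mail it, or POST it to a server.
        print "=== $probe ===\n$output";
    }
    sleep $interval;
}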
Here is a subset of the probes used in the old Tivoli Distributed Monitoring:
Disk Resource Monitoring Sources
- Inodes free -- inodes
- Inodes used -- inodesused
- Percent inodes used -- inodesusedpct
- Percent space used -- diskusedpct
- Space free -- diskavail
- Space used -- diskused
- Tivoli DB free space -- tivdbspace

Security Monitoring Sources
- Check file permissions -- fileperm
- Compare files -- filediff
- Daemon status -- daemon
- File checksum -- filechk
- File size -- filesize
- Occurrences in file -- countstr
- Process instances -- daemonct
- User logins by user -- ulogins
- Users logged in -- ulogintot

Network Monitoring Sources
- Client RPC timeouts -- rpctmout
- Host status -- host
- Network collisions -- netcoll
- Network collisions/packet -- netcollpct
- NFS bad calls -- badnfs
- Input packet errors -- netinerr
- Input packets -- netin
- Output packet errors -- netouterr
- Output packets -- netout
- Remote oserv status -- oserv
- RPC bad calls -- badrpc

System Resources Monitoring Sources
- Available swap space -- swapavail
- Host status -- host
- Lingering terminated processes -- zombies
- Load average -- loadavg
- Mail queue length -- mailqlen
- Page-outs -- pageouts

Printer Monitoring
- Daemon status -- daemon
- Jobs in print queue -- printjobs
- Status of print queue -- printstat
- Total size queued -- printjobsize

User-Defined Monitoring Sources
- Asynchronous numeric -- nasync
- Asynchronous string -- sasync
- Numeric script -- ncustom
- String script -- scustom
Author: unixsysny | Created: 8/31/2004 | Rating: 13 of 16 | Views: 12,971 | Compatibility: Korn Shell
Summary: This disk capacity monitoring script could be tweaked to run as a bash script also; I just happen to prefer ksh.
#! /bin/ksh
########### Function SENDMAIL sends mail to the sysadmins regarding disk capacity ##############
SENDMAIL () {
mail -s 'Disk Space Alert' $INTERESTED <<EOF
$PERC % ALERT
Filesystem $FILESYSTEM has reached $PERC% of its capacity.
EOF
return
}
########## VARIABLES (recipient addresses were redacted in the original) ###########
INTERESTED=[email protected],[email protected],[email protected]
######### Run df -kl and extract filesystem and usage percentage into a temporary holding file ########
df -kl | grep -iv filesystem | awk '{ print $6"\t "$5}' | cut -d"%" -f1 > holding
######### Reassign standard input to the temporary holding file #######
exec < holding
######### Read FILESYSTEM and PERCentage; for each line execute the case test ########
while read FILESYSTEM PERC
do
######## Test each FILESYSTEM's PERCentage against its threshold amount. If the PERCentage is
######## greater than the threshold amount, execute the SENDMAIL function.
case "$FILESYSTEM" in
/)                 if [[ $PERC -gt 80 ]]; then SENDMAIL; fi ;;
/stand)            if [[ $PERC -gt 47 ]]; then SENDMAIL; fi ;;
/proc)             if [[ $PERC -gt 1  ]]; then SENDMAIL; fi ;;
/dev/fd)           if [[ $PERC -gt 80 ]]; then SENDMAIL; fi ;;
/dev/_tcp)         if [[ $PERC -gt 80 ]]; then SENDMAIL; fi ;;
/home)             if [[ $PERC -gt 42 ]]; then SENDMAIL; fi ;;
/home2)            if [[ $PERC -gt 10 ]]; then SENDMAIL; fi ;;
/system/processor) if [[ $PERC -gt 1  ]]; then SENDMAIL; fi ;;
/tmp)              if [[ $PERC -gt 30 ]]; then SENDMAIL; fi ;;
/var/tmp)          if [[ $PERC -gt 45 ]]; then SENDMAIL; fi ;;
/osm3)             if [[ $PERC -gt 80 ]]; then SENDMAIL; fi ;;
/osm)              if [[ $PERC -gt 80 ]]; then SENDMAIL; fi ;;
/osm1)             if [[ $PERC -gt 84 ]]; then SENDMAIL; fi ;;
/osm2)             if [[ $PERC -gt 80 ]]; then SENDMAIL; fi ;;
*) mail -s 'Invalid FILESYSTEM found' $INTERESTED <<EOF
Filesystem $FILESYSTEM has been discovered by the Diskmonitor process.
EOF
;;
esac
done
By David Gavin, Tue, 1998-12-01.
Mr. Gavin provides tools for systems data collection and display and discusses what information is needed and why. For the last few years, I have been supporting users on various flavors of UNIX systems and have found the System Accounting Reports data invaluable for performance analysis. When I began using Linux for my personal workstation, the lack of a similar performance data collection and reporting tool set was a real problem. It's hard to get management to upgrade your system when you have no data to back up your claims of "I need more POWER!". Thus, I started looking for a package to get the information I needed, and found out there wasn't any. I fell back on the last resort -- I wrote my own, using as many existing tools as possible. I came up with scripts that collect data and display it graphically in an X11 window or hard copy.
What Do We Want to Know?
To get a good idea of how a system is performing, watch key system resources over a period of time to see how their usage and availability changes depending upon what's running on the system. The following categories of system resources are ones I wished to track.
CPU Utilization: The central processing unit, as viewed from Linux, is always in one of the following states:
- idle: available for work, waiting
- user: high-level functions, data movement, math, etc.
- system: performing kernel functions, I/O and other hardware interaction
- nice: like user, but for jobs run at lowered priority, which yield the CPU to tasks with higher priority
By noting the percentage of time spent in each state, we can discover overloading of one state or another. Too much idle means nothing is being done; too much system time indicates a need for faster I/O or additional devices to spread the load. Each system will have its own profile when running its workload, and by watching these numbers over time, we can determine what's normal for that system. Once a baseline is established, we can easily detect changes in the profile.
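On Linux the counters behind these percentages are exposed in /proc/stat, so a probe can compute them directly. Here is a minimal sketch that samples the aggregate cpu line twice and prints the share of each state over the interval:

#!/usr/bin/perl
# Sketch: sample the aggregate "cpu" line of /proc/stat twice and report the
# percentage of time spent in the user, nice, system, and idle states.
use strict;
use warnings;

sub cpu_times {
    open my $fh, '<', '/proc/stat' or die "cannot read /proc/stat: $!";
    my $line = <$fh>;                  # first line: cpu user nice system idle ...
    close $fh;
    my (undef, @t) = split ' ', $line;
    return @t[0 .. 3];                 # user, nice, system, idle jiffies
}

my @before = cpu_times();
sleep 5;                               # sampling interval (arbitrary choice)
my @after  = cpu_times();

my @delta = map { $after[$_] - $before[$_] } 0 .. 3;
my $total = $delta[0] + $delta[1] + $delta[2] + $delta[3];
my @name  = qw(user nice system idle);
printf "%-6s %5.1f%%\n", $name[$_], 100 * $delta[$_] / ($total || 1) for 0 .. 3;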
Interrupts: Most I/O devices use interrupts to signal the CPU when there is work for it to do. For example, SCSI controllers will raise an interrupt to signal that a requested disk block has been read and is available in memory. A serial port with a mouse on it will generate an interrupt each time a button is pressed/released or when the mouse is moved. Watching the count of each interrupt can give you a rough idea of how much load the associated device is handling.
Context Switching: Time slicing is the term often used to describe how computers can appear to be doing multiple jobs at once. Each task is given control of the system for a certain ``slice'' of time, and when that time is up, the system saves the state of the running process and gives control of the system to another process, making sure that the necessary resources are available. This administrative process is called context switching. In some operating systems, the cost of this switching can be fairly expensive, sometimes using more resources than the processes it is switching. Linux is very good in this respect, but by watching the amount of this activity, you will learn to recognize when a system has a lot of tasks actively consuming resources.
Memory: When many processes are running and using up available memory, the system will slow down as processes get paged or swapped out to make room for other processes to run. When the time slice is exhausted, that task may have to be written out to the paging device to make way for the next process. Memory-utilization graphs help point out memory problems.
Paging: As mentioned above, when available memory begins to get scarce, the virtual memory system will start writing pages of real memory out to the swap device, freeing up space for active processes. Disk drives are fast, but when paging gets beyond a certain point, the system can spend all of its time shuttling pages in and out. Paging on a Linux system can also be increased by the loading of programs, as Linux ``demand pages'' each portion of an executable as needed.
Swapping: Swapping is much like paging. However, it migrates entire process images, consisting of many pages of memory, from real memory to the swap devices, rather than the page-by-page mechanism normally used for paging.
Disk I/O: Linux keeps statistics on the first four disks: total I/O, reads, writes, block reads, and block writes. These numbers can show uneven loading of multiple disks and the balance of reads versus writes.
Network I/O: Network I/O can be used to diagnose problems and examine loading of the network interface(s). The statistics show traffic in and out, collisions, and errors encountered in both directions.
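Most of the remaining counters are also exposed under /proc on Linux; the following sketch (file layouts assume a reasonably modern kernel, not the ones current when this article was written) gathers a few of them, again as key-value pairs:

#!/usr/bin/perl
# Sketch: collect a few of the counters discussed above from /proc on a
# modern Linux and print them as key-value pairs, one per line.
use strict;
use warnings;

my %out;

# Interrupts and context switches (totals since boot) from /proc/stat
open my $stat, '<', '/proc/stat' or die "/proc/stat: $!";
while (<$stat>) {
    $out{interrupts}       = (split)[1] if /^intr\s/;
    $out{context_switches} = (split)[1] if /^ctxt\s/;
}
close $stat;

# Paging and swapping activity from /proc/vmstat
open my $vm, '<', '/proc/vmstat' or die "/proc/vmstat: $!";
while (<$vm>) {
    my ($k, $v) = split;
    $out{$k} = $v if $k =~ /^(pgpgin|pgpgout|pswpin|pswpout)$/;
}
close $vm;

# Per-interface network traffic from /proc/net/dev
open my $net, '<', '/proc/net/dev' or die "/proc/net/dev: $!";
while (<$net>) {
    next unless /^\s*(\S+):\s*(.*)$/;
    my ($if, @f) = ($1, split ' ', $2);
    $out{"net_${if}_rx_bytes"} = $f[0];
    $out{"net_${if}_tx_bytes"} = $f[8];
}
close $net;

print "$_ $out{$_}\n" for sort keys %out;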
These charts can also help in the following instances:
- The system is running jobs you aren't aware of during hours when you are not present.
- Someone is logging on or remotely running commands on the system without your knowledge.
This sort of information will often show up as a spike in the charts at times when the system should have been idle. Sudden increases in activity can also be due to jobs run by crontab.
These scripts monitor web servers, disk space, DNS, and SMTP servers using the ksh shell, wget, and basic Perl modules. Their goal is to be simple and easy to use. You may use one, none, or some of the scripts, as they are all independent.
The scripts have the following features:
- All scripts are open source under the GPL.
- All scripts are designed to be easy to use, requiring only basic UNIX skills to install.
- All service scripts use a redundant SMTP client to send alerts; up to 3 SMTP servers are supported (a sketch of this idea appears after the cron examples below).
- Two service scripts (SMTP and DNS) create a hyperlinked drill-down page (see the Screenshots section) to check on the current status of ports.
- Monitor_smtp.sh checks for service availability without sending an email through the SMTP server; instead it checks for a valid SMTP banner on port 25. (It can be modified to send a full email as well.)
- Monitor_dns.sh monitors DNS service availability by checking for a known, defined host in your DNS.
Installation of the scripts consists of:
- Creating a local user to run each script and scheduling the scripts through that user's crontab. You may run each script as any user capable of running wget, df -k, and top. I suggest creating a user called monitor.
- Extracting and placing all scripts (main and support) on a web server and making them executable. Also move the images to an images directory under your web root.
Ex.
wget http://monitorsuite.sourceforge.net/monitor_suite.tgz
tar zxvf monitor_suite.tgz
mkdir -p /usr/local/admin/bin
mv *sh /usr/local/admin/bin/
mv *pl /usr/local/admin/bin/
chmod 755 /usr/local/admin/bin/*
mv *gif /var/www/html/images
chown monitor.monitor /usr/local/admin/bin/*
- Reading through each main mon*.sh script and filling out the local variables (for example, your webroot and the pager recipient).
- Creating a directory under the root of your web server where each script will write its logs and history. I used webmon for monitor_web.sh. The other scripts are similar: I used smtpmon for monitor_smtp.sh and stats for monitor_stats.pl. Monitor_disk.sh is different in that it is the only one installed locally on each server you wish to monitor.
Ex.
mkdir /var/www/html/smtpmon/
mkdir /var/www/html/webmon/
mkdir /var/www/html/dnsmon/
- Making sure the user running the scripts has permission to write to the script home:
chown monitor /var/www/html/smtpmon/
chown monitor /var/www/html/dnsmon/
chown monitor /var/www/html/webmon/
- Moving the support files into place in the new script homes:
cp footer.txt /var/www/html/smtpmon/
cp footer.txt /var/www/html/webmon/
mv smtpservers.txt /var/www/html/smtpmon/
mv smtp_header.txt /var/www/html/smtpmon/
mv urls.txt /var/www/html/webmon/
mv dnsservers.txt /var/www/html/dnsmon/
- Installing wget if it is not in your Unix/Linux distribution (for monitor_web.sh).
- Installing the Net::SMTP Perl module if it is not in your Unix/Linux distribution (used in smtp.pl and send_alert.pl).
- Installing the Net::Telnet Perl module if it is not in your Unix/Linux distribution (for monitor_smtp.sh and monitor_dns.sh; used in connect.pl).
- Setting up the scripts in cron. For example:
#-->Web monitor
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /usr/local/admin/bin/monitor_web.sh > /dev/null 2>&1
#-->Smtp monitor
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /usr/local/admin/bin/monitor_smtp.sh > /dev/null 2>&1
#-->DNS monitor
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /usr/local/admin/bin/monitor_dns.sh > /dev/null 2>&1
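As an aside, the redundant SMTP client behavior mentioned in the feature list can be sketched in a few lines of Perl with Net::SMTP. This is an illustration of the idea, not the suite's actual send_alert.pl, and all addresses below are placeholders:

#!/usr/bin/perl
# Sketch of a "redundant SMTP client": try each configured SMTP server in
# turn until one accepts the alert. Hosts and addresses are placeholders.
use strict;
use warnings;
use Net::SMTP;

my @servers = ('192.0.2.10', '192.0.2.11', '192.0.2.12');  # placeholder IPs
my $from    = 'monitor@example.com';                       # placeholder
my $to      = 'oncall@example.com';                        # placeholder
my $subject = shift @ARGV || 'Monitoring alert';
my $body    = shift @ARGV || 'No details supplied';

for my $host (@servers) {
    my $smtp = Net::SMTP->new($host, Timeout => 15) or next;
    $smtp->mail($from) or next;
    $smtp->to($to)     or next;
    $smtp->data("Subject: $subject\n\n$body\n");
    $smtp->quit;
    exit 0;                            # delivered; stop trying further servers
}
die "alert could not be delivered via any SMTP server\n";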
Notes
- Each wget log is about half a K (500 bytes). Multiply this by the number of servers you are monitoring and by the monitoring frequency (e.g., every 5 minutes equals 12 times an hour, or 288 times a day) to understand how much space you need for history; at that rate a single server generates roughly 500 bytes x 288 = 144 KB a day, or about 4.3 MB a month. As a reference, we have 10 production web servers being monitored 24x7, and 6 months of logs take up about 500 MB.
- Each smtp log is about 1.5K; this includes debugging info (recommended, not required).
- Each successful dns log is about 200 bytes; each error log is about 1K.
- Logging for monitor_disk.sh is done either through cron or via a wrapper script; alternatively, you may modify the source.
- We are using the insecure rsh protocol in the monitor_stats.pl script to show you how to get this set up quickly, but it is recommended that you use ssh with properly distributed keys.
- Support files for each script (if needed) are listed directly below the script.
- If you get a "bad interpreter" message, make sure the first line points to a valid shell you have installed. (ksh and bash should both work)
- I suggest using IP addresses (or hosts-file entries) for the SMTP servers you send alerts through, in case of a DNS outage.
Screenshots
Main Scripts
- Monitor_web.sh: Web Server Monitoring Script
- Monitor_disk.sh: Unix Disk Monitoring Script
- Monitor_smtp.sh: SMTP Monitoring Script
- Monitor_stats.pl: System Monitoring Dashboard
- Monitor_dns.sh: DNS Monitoring Script
Support Scripts
- Send_alert.pl (used for sending alerts instead of using /bin/mail)
- Connect.pl (used for testing ports; a sketch of the idea appears below)
- Banner.pl (used for testing emails)
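A port test of the kind connect.pl performs can be sketched with the core IO::Socket::INET module (the real connect.pl uses Net::Telnet; this variant is just an illustration):

#!/usr/bin/perl
# Sketch of a simple TCP port tester: exit 0 if the port accepts a
# connection within the timeout, exit 1 otherwise.
use strict;
use warnings;
use IO::Socket::INET;

my ($host, $port) = @ARGV;
die "usage: $0 host port\n" unless defined $port;

my $sock = IO::Socket::INET->new(
    PeerAddr => $host,
    PeerPort => $port,
    Proto    => 'tcp',
    Timeout  => 10,
);
if ($sock) {
    print "OK: $host:$port accepted the connection\n";
    close $sock;
    exit 0;
}
print "FAIL: $host:$port unreachable: $!\n";
exit 1;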
Related links:
freshmeat.net: Project details for Unix Server Monitoring Scripts
Monitoring Unix System Processes with Psmon, by Nicola Worthington: Psmon is a system monitoring script written in Perl, licensed under the Apache license, that is quite useful if you run servers with critical processes on them. You can download the latest version from the psmon homepage (version 1.39 as of this writing).
Unix Monitoring Scripts, by Damir Delija (Sys Admin)
Linux and UNIX system monitoring: Bash shell scripts directory