Parallel command execution means executing the same command or script on multiple servers. Such tools
can be command-line or interactive. The connection to the remote hosts is typically performed via ssh configured for
password-less SSH login. Some of these tools also use rdist for transferring files and ssh for executing them.
If the number of target servers is small (say, less than 32) you can create your own
parallel execution script. In any case it often makes sense to use one written in a scripting language you know, so that you
can adapt its functionality to your specific needs and correct errors if necessary.
Right now Ansible/Puppet/Chef etc. are fashionable and hot. But they do not contain any exciting new ideas, and they
add another layer of complexity to an already complex infrastructure that sysadmins need to master and maintain. The reality is that 80% of what they do can be done
with simple parallel execution tools and simple scripts written in bash or your favorite scripting language, with much less fuss and without the
need to learn another complex and obscure software package.
Some of these parallel execution scripts are simple and elegant and can be maintained with ease (for example C-tools, which are written in Python). Some
are really powerful when you work with a large number of servers (pdsh); that package also contains a powerful parallel copying program (pdcp).
The package is well debugged and rock solid, which can't be said about heavyweight tools like Puppet.
The most typical usage is the distribution of changed configuration files to multiple servers. The prerequisites are SSH keys (see
Passwordless SSH login) on all servers and one simple
file that defines the groups of servers you administer (the format is not uniform and depends on the package you select, but it
can be created with a simple script or a pipe from the /etc/hosts file). That can be done in an hour or so.
After that you can perform simultaneous actions on multiple hosts or file transfers to/from them. If you have NFS or
another shared or parallel filesystem, you can use it for distribution of files too. This is the typical case in computational clusters. With NFS
the need for a complex tool like Puppet is even smaller.
Such tools are really necessary in cluster and grid environments, which have their own specific configuration (with a common
NFS filesystem for all nodes) and for this reason are not well served by heavyweight tools like Puppet. If a common filesystem
exists for all nodes, the latter can be viewed as overkill. See
Cluster management tools for more details. Specialized packages also exist
for clusters (one example is Bright Cluster Manager)
which also integrate operations on DRAC/ILO and installation/reinstallation of computational nodes and common cluster software.
For execution of a simple command on up to, say, 32 nodes, you can use a simple for loop or xargs:
for host in hostA hostB hostC ; do
ssh $host do_something
done
The for loop can also read the list of nodes from a file.
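A minimal sketch of that variant (the file name hosts.txt is hypothetical):
while read -r host; do
    ssh "$host" do_something
done < hosts.txt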
pdsh -- pdsh is a variant of the rsh(1) command.
RPMs are available from Fedora EPEL (e.g. pdsh-2.26-4.el6.x86_64.rpm for CentOS 6)
and from SourceForge.net. Unlike rsh(1), which runs commands
on a single remote host, pdsh can run multiple remote commands in parallel. pdsh
uses a "sliding window" (or fanout) of threads to conserve resources on the initiating
host while allowing some connections to time out.
xargs. You can use xargs, feeding it a list of servers and using the ssh or
scp command to distribute files. GNU parallel is
a Perl script written by Ole Tange that extends and improves the capabilities of xargs;
it can optimize this operation by creating multiple jobs, one for each server. To
transfer a file to a remote computer you can use the option --transfer. It is available in most
Linux distributions.
GNU xargs also has an option that allows you to take advantage of multiple cores
on your machine: its -P option allows xargs to invoke the specified
command multiple times in parallel, which makes it preferable to the for loop.
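As a quick sketch (the host file and command are hypothetical), the same fan-out can be done with GNU xargs: each host name read from hosts.txt is substituted for {} and up to eight ssh sessions run at once.
< hosts.txt xargs -I{} -P 8 ssh -o BatchMode=yes {} uptime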
Mussh -- Mussh is
a Bash script that allows you to execute a command or script over ssh on multiple hosts with one
command. When possible, mussh will use ssh-agent and RSA/DSA keys to minimize the
need to enter your password more than once.
For simple corrections of configuration files on multiple servers, which can't be replaced by simple distribution of a file
from a single source, you can use Expect and the remote protocol of your choice
(ssh, telnet, etc.). Expect provides all the necessary functionality, has tons of examples around,
and a cool book (Exploring Expect). NOTE: You can also use a tool that replicates each operation you perform on one host to multiple hosts in real
time, like ClusterSSH, which may be a better tool for simple corrections.
Perl
GNU version of parallel -- this is a powerful,
flexible and highly underappreciated tool. To run commands on more than one remote computer,
run:
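(The command itself is not shown in this excerpt; a minimal sketch with hypothetical host names would be:)
parallel --nonall --sshlogin server1,server2 uptime
Here --nonall runs the given command once on each --sshlogin host rather than once per input argument.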
It can also transfer files to the remote computers before command execution with the --transfer option and clean
up after the commands have executed with the --cleanup option.
The option --return filename transfers files back from the remote
computers. --return is used with --sshlogin when the arguments are files on the remote computers.
When processing is done the file filename will be transferred from the remote computer using rsync
and will be put relative to the default login dir.
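A sketch combining these options (host and file names are hypothetical): each listed local file is copied to the remote host, processed there, the resulting .out file is fetched back with rsync, and the transferred copies are removed afterwards.
parallel --sshlogin server1 --transfer --return {}.out --cleanup "wc -l {} > {}.out" ::: report1.txt report2.txt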
clusterssh -- Cluster SSH opens terminal windows with connections to specified hosts
and an administration console. Any text typed into the administration console is replicated
to all other connected and active windows. This tool is intended for,
but not limited to, cluster administration where the same configuration or commands must be run
on each node within the cluster. Performing these commands all at once via this
tool ensures all nodes are kept in sync.
Perl has several modules which provide all the necessary functionality to build your own
tools, for example Net::OpenSSH, Net::SSH::Perl, Net::SFTP, etc.
See also:
pssh -- a set of Python scripts.
PSSH provides parallel versions of OpenSSH and
related tools. Included are pssh, pscp, prsync, pnuke, and
pslurp. The project includes psshlib which can be used to create custom applications.
SSH Power Tool (sshpt)
was designed for parallel SSH without requiring that the user set up pre-shared SSH keys. It supports execution via sudo and can also copy
files and execute them afterwards (optionally, via sudo as well). By default it outputs results
in CSV format, but sshpt.py doubles as an importable Python module so you can use it in your own
programs (I used to use it as a back-end behind a custom-built web-based reporting tool at my
former employer).
Tentakel is another Python-based tool for distributed command execution. It is a program
for executing the same command on many hosts in parallel using ssh (it supports other methods
too). Its main advantage is that you can create several sets of servers according to your requirements, for example
a webserver group, a mail server group, and a home servers group.
The command is executed in parallel on all servers in a group (saving time). By default,
every result is printed to stdout (screen). The output format can be defined for each group.
Multi Remote Tools (MrTools) is a suite of tools for managing large, distributed environments.
It can be used to execute scripts on multiple remote hosts without prior installation, copy
a file or directory to multiple hosts as efficiently as possible in a relatively secure way, and
collect a copy of a file or directory from multiple hosts.
parsync (a wrapper for rsync). Several similar wrappers exist.
If you have an NFS share that is mounted on all hosts, you can use it together with parallel command execution tools to distribute files.
This method is often used in computational clusters, since they have a filesystem (NFS for a small cluster, IBM GPFS or Lustre for
a large one) mounted on all nodes.
The cluster comes with a simple parallel shell named pdsh. The pdsh shell is handy for
running commands across the cluster. There is a man page that describes the capabilities of pdsh
in detail. One of the useful features is the capability of specifying all or a subset of the
cluster. For example: pdsh -a targets all nodes of the cluster, including the master.
pdsh -a -x node00 targets all nodes of the cluster except the master. pdsh -w node[01-08]
targets the 8 nodes of the cluster named node01, node02, ..., node08.
Another utility that is useful for formatting the output of pdsh is dshbak. Here we will
show some handy uses of pdsh.
Show the current date and time on all nodes of the cluster:
pdsh -a date
Show the current load and system uptime for all nodes of the cluster:
pdsh -a uptime
Show the version of the operating system on all nodes:
pdsh -a cat /etc/redhat-release
Check who is logged in at the MetaGeek lab:
pdsh -w node[01-32] who
Show all processes that have the substring pbs on the cluster (these will be the PBS servers running on each node):
pdsh -a ps augx | grep pbs | grep -v grep
The utility dshbak formats the output from pdsh by consolidating the output from
each node. The option -c shows identical output from different nodes just once. Try
the following commands:
pdsh -w node[01-32] who | dshbak
pdsh -w node[01-32] who | dshbak -c
pdsh -a date | dshbak -c
Administrators can build wrapper commands around pdsh for commands that are
frequently used across multiple systems and Serviceguard clusters. Several such wrapper
commands are provided with DSAU. These wrappers are Serviceguard cluster-aware and default to
fanning out cluster-wide when used in a Serviceguard environment. These wrappers support most
standard pdsh command line options and also support long options (--option syntax).
cexec is a general purpose pdsh wrapper. In addition to the standard
pdsh features, cexec includes a reporting feature. Use the
--report_loc option to have cexec display the report location for a command.
The command report records the command issued in addition to the nodes where the command
succeeded, failed, or the nodes that were unreachable. The report can be used with the
--retry option to replay the command against nodes that failed, succeeded, were
unreachable, or all nodes.
ccp
ccp is a wrapper for pdcp and copies files cluster-wide or to the
specified set of systems.
cps
cps fans out a ps command across a set of systems or cluster.
ckill
ckill allows the administrator to signal a process by name since the pid of a
specific process will vary across a set of systems or the members of a cluster.
cuptime
cuptime displays the uptime statistics for a set of systems or a cluster.
cwall
cwall displays a wall(1M) broadcast message on multiple hosts.
All the wrappers support the CFANOUT_HOSTS environment variable when not executing in a
Serviceguard cluster. The environment variable specifies a file containing the list of hosts to
target, one hostname per line. This will be used if no other host specifications are present on
the command line. When no target nodelist command line options are used and CFANOUT_HOSTS is
undefined, the command will be executed on the local host.
For more information on these commands, refer to their reference manpages.
Hm, this seems like a good idea, but I'm not sure dshbak is the right
place for this. (That script is meant to simply reformat output which
is prefixed by "node: ")
If you'd like to track up/down nodes, you should check out Al Chu's
Cerebro and whatsup/libnodeupdown:
http://www.llnl.gov/linux/cerebro/cerebro.html
http://www.llnl.gov/linux/whatsup/
But I do realize that reporting nodes that did not respond to pdsh
would also be a good feature. However, it seems to me that pdsh itself
would have to do this work, because only it knows the list of hosts originally
targeted. (How would dshbak know this?)
As an alternative I sometimes use something like this:
# pdsh -a true 2>&1 | sed 's/^[^:]*: //' | dshbak -c
----------------
emcr[73,138,165,293,313,331,357,386,389,481,493,499,519,522,526,536,548,553,560,564,574,601,604,612,618,636,646,655,665,676,678,693,700-701,703,706,711,713,715,717-718,724,733,737,740,759,767,779,817,840,851,890]
----------------
mcmd: connect failed: No route to host
----------------
emcrj
----------------
mcmd: xpoll: protocol failure in circuit setup
i.e. strip off the leading pdsh@...: and send all errors to stdout. Then
collect errors with dshbak to see which hosts are not reachable.
Maybe we should add an option to pdsh to issue a report of failed hosts
at the end of execution?
mark
NOTE: if you don't want to enter passwords for each server, then you need to have an
authorized_key installed on the remote servers. If necessary, you can use the environment
variable PDSH_SSH_ARGS to specify ssh options, including which identity file to
use ( -i ).
The commands will be run in parallel on all servers, and output from them will be
intermingled (with the hostname prepended to each output line). You can view the output nicely
formatted and separated by host using pdsh's dshbak utility:
dshbak logfile.txt | less
Alternatively, you can pipe through dshbak before redirecting to a logfile:
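A hypothetical example of that pipeline, with made-up host names:
pdsh -w server[1-6] uptime | dshbak -c > logfile.txt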
IMO it's better to save the raw log file and use dshbak when required, but
that's just my subjective preference. For remote commands that produce only a single line of
output (e.g. uname or uptime), dshbak is overly verbose since the raw prefixed output is
already nicely concise, e.g. from my home network:
You can define hosts and groups of hosts in a file called /etc/genders and then
specify the host group with pdsh -g instead of pdsh -w . e.g. with an
/etc/genders file like this:
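(The file itself isn't reproduced in this excerpt; a minimal /etc/genders consistent with the commands described below might look like this, with hypothetical host names:)
server1  all,web
server2  all,web
server3  all
server4  all
server5  all,mysql
server6  all,mysql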
pdsh -g all uname -a will run uname -a on all servers. pdsh
-g web uptime will run uptime only on server1 and server2. pdsh -g
web,mysql df -h / will run df on servers 1, 2, 5, and 6, and so on.
BTW, one odd thing about pdsh is that it is configured to use rsh
by default instead of ssh. You need to either:
use -R ssh on the pdsh command line (e.g. pdsh -R ssh -w server[0-9] ...)
export PDSH_RCMD_TYPE=ssh before running pdsh
run echo ssh > /etc/pdsh/rcmd_default to set ssh as the
permanent default.
There are several other tools that do the same basic job as pdsh . I've tried
several of them and found that they're generally more hassle to set up and use.
pdsh pretty much just works with zero or minimal configuration.
dsh -q displays the values of the dsh variables (DSH_NODE_LIST, DCP_NODE_RCP...)
dsh <command> runs the command on each server in DSH_NODE_LIST
dsh <command> | dshbak same as above, just formats the output to separate each host
dsh -w aix1,aix2 <command> executes the command on the given servers (dsh -w aix1,aix2 "oslevel -s")
dsh -e <script> runs the given script on each server (for me it was faster to dcp it and then run the script with dsh on the remote server)
dcp <file> <location> copies a file to the given location (without a location the home dir will be used)
dping -n aix1,aix2 pings the listed servers
dping -f <filename> pings all servers given in the file (-f)
The pscp utility allows you to transfer/copy files to multiple remote Linux
servers from a single terminal with one single command. This tool is a part of pssh (Parallel
SSH Tools), which provides parallel versions of OpenSSH and other similar tools such as:
pscp – a utility for copying files in parallel to a number of hosts.
prsync – a utility for efficiently copying files to multiple hosts in parallel.
pnuke – helps to kill processes on multiple remote hosts in parallel.
pslurp – helps to copy files from multiple remote hosts to a central host in parallel.
When working in a network environment where there are multiple hosts on the network, a
System Administrator may find these tools listed above very useful.
Pscp – Copy Files to Multiple Linux Servers. In this article, we shall look at some useful
examples of the pscp utility to transfer/copy files to multiple Linux hosts on a network. To use
the pscp tool, you need to install the PSSH utility on your Linux system; for installation of
PSSH you can read this article.
Almost all the options used with these tools are the same, except for a few that
are related to the specific functionality of a given utility.
How to Use Pscp to Transfer/Copy Files to Multiple Linux Servers. While using pscp you need to
create a separate file that lists the Linux server IP addresses and SSH port numbers that you
need to connect to.
Let's create a new file called "myscphosts.txt" and add the list of Linux host
IP addresses and SSH port (default 22) numbers as shown:
192.168.0.3:22
192.168.0.9:22
Once you've added hosts to the file, it's time to copy files from the local machine to multiple
Linux hosts under the /tmp directory with the help of the following command:
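(The exact command is not reproduced in this excerpt; based on the options explained below, a representative invocation might look like this, with a hypothetical archive name:)
pscp -h myscphosts.txt -l root -A -v backup.tar.gz /tmp/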
Warning: do not enter your password if anyone else has superuser
privileges or access to your account.
Password:
[1] 17:48:25 [SUCCESS] 192.168.0.3:22
[2] 17:48:35 [SUCCESS] 192.168.0.9:22
Explanation of the options used in the above command:
-h switch is used to read hosts from the given file.
-l switch specifies a default username for all hosts that do not define a specific user.
-A switch tells pscp to ask for a password and send it to ssh.
-v switch is used to run pscp in verbose mode.
Copy Directories to Multiple Linux Servers. If you want to copy an entire directory, use the
-r option, which will recursively copy entire directories as shown.
Warning: do not enter your password if anyone else has superuser
privileges or access to your account.
Password:
[1] 17:48:25 [SUCCESS] 192.168.0.3:22
[2] 17:48:35 [SUCCESS] 192.168.0.9:22
You can view the manual page for pscp or use the pscp --help command to get help.
It didn't work for me either. I can get into the machine through the same IP and port as
I've put into the hosts.txt file, but I still get the messages below:
Have you placed the correct remote SSH host IP address and port number in the
myscphosts.txt file? Please confirm, add the correct values, and then try again.
"... A very common way of using pdsh is to set the environment variable WCOLL to point to the file that contains the list of hosts you want to use in the pdsh command. For example, I created a subdirectory PDSH where I create a file hosts that lists the hosts I want to use ..."
The -w option means I am specifying the node(s) that will run the command. In this case, I specified the IP address
of the node (192.168.1.250). After the list of nodes, I add the command I want to run, which is uname -r in this case.
Notice that pdsh starts the output line by identifying the node name.
If you need to mix rcmd modules in a single command, you can specify which module to use in the command line,
by putting the rcmd module before the node name. In this case, I used ssh and typical ssh syntax.
A very common way of using pdsh is to set the environment variable WCOLL to point to the file that contains the list
of hosts you want to use in the pdsh command. For example, I created a subdirectory PDSH where I create a file
hosts that lists the hosts I want to use:
[laytonjb@home4 ~]$ mkdir PDSH
[laytonjb@home4 ~]$ cd PDSH
[laytonjb@home4 PDSH]$ vi hosts
[laytonjb@home4 PDSH]$ more hosts
192.168.1.4
192.168.1.250
I'm only using two nodes: 192.168.1.4 and 192.168.1.250. The first is my test system (like a cluster head node), and the second
is my test compute node. You can put hosts in the file as you would on the command line separated by commas. Be sure not to put a
blank line at the end of the file because pdsh will try to connect to it. You can put the environment variable WCOLL
in your .bashrc file:
export WCOLL=/home/laytonjb/PDSH/hosts
As before, you can source your .bashrc file, or you can log out and log back in.
Specifying Hosts
I won't list all the several other ways to specify a list of nodes, because the pdsh website
[9] discusses virtually
all of them; however, some of the methods are pretty handy. The simplest way to specify the nodes is on the command line using
the -w option:
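(The listing is omitted in this excerpt; based on the text it would be along these lines, using the two node addresses from the earlier examples:)
pdsh -w 192.168.1.4,192.168.1.250 uname -r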
In this case, I specified the node names separated by commas. You can also use a range of hosts as follows:
pdsh -w host[1-11]
pdsh -w host[1-4,8-11]
In the first case, pdsh expands the host range to host1, host2, host3, ..., host11. In the second case, it expands the hosts similarly
(host1, host2, host3, host4, host8, host9, host10, host11). You can go to the pdsh website for more information on hostlist expressions
[10] .
Another option is to have pdsh read the hosts from a file other than the one to which WCOLL points. The command shown in
Listing 2 tells
pdsh to take the hostnames from the file /tmp/hosts, which is listed after -w ^ (with no space between
the "^" and the filename). You can also use several host files, as Listing 2 shows.
Listing 2 Read Hosts from File
$ more /tmp/hosts
192.168.1.4
$ more /tmp/hosts2
192.168.1.250
$ pdsh -w ^/tmp/hosts,^/tmp/hosts2 uname -r
192.168.1.4: 2.6.32-431.17.1.el6.x86_64
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
The option -w -192.168.1.250 excluded node 192.168.1.250 from the list and only output the information for 192.168.1.4.
You can also exclude nodes using a node file, or a list of hostnames to be excluded from the command also works.
More Useful pdsh Commands
Now I can shift into second gear and try some fancier pdsh tricks. First, I want to run a more complicated command on all of the
nodes ( Listing 3
). Notice that I put the entire command in quotes. This means the entire command is run on each node, including the first (
cat /proc/cpuinfo ) and second ( grep bogomips ) parts.
Listing 3 Quotation Marks 1
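(The listing is not reproduced here; the command under discussion, with the whole pipeline in quotes so that it runs entirely on each node, would look like this:)
pdsh -w 192.168.1.4,192.168.1.250 "cat /proc/cpuinfo | grep bogomips"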
In the output, the node precedes the command results, so you can tell what output is associated with which node. Notice that the
BogoMips values are different on the two nodes, which is perfectly understandable because the systems are different. The first node
has eight cores (four cores and four Hyper-Thread cores), and the second node has four cores.
You can use this command across a homogeneous cluster to make sure all the nodes are reporting back the same BogoMips value. If
the cluster is truly homogeneous, this value should be the same. If it's not, then I would take the offending node out of production
and check it.
A slightly different command shown in
Listing 4 runs
the first part contained in quotes, cat /proc/cpuinfo , on each node and the second part of the command, grep
bogomips , on the node on which you issue the pdsh command.
Listing 4 Quotation Marks 2
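(Again the listing is omitted; the variant being described, with only the first part quoted so that the grep runs on the local node, would be:)
pdsh -w 192.168.1.4,192.168.1.250 "cat /proc/cpuinfo" | grep bogomips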
The point here is that you need to be careful on the command line. In this example, the differences are trivial, but other commands
could have differences that might be difficult to notice.
One very important thing to note is that pdsh does not guarantee a return of output in any particular order. If you have a list
of 20 nodes, the output does not necessarily start with node 1 and increase incrementally to node 20. For example, in
Listing 5 , I run
vmstat on each node and get three lines of output from each node.
In this series of blog posts I'm taking a look at a few very useful tools that can make your
life as the sysadmin of a cluster of Linux machines easier. This may be a Hadoop cluster, or
just a plain simple set of 'normal' machines on which you want to run the same commands and
monitoring.
Previously we looked at using SSH keys for
intra-machine authorisation, which is a pre-requisite for what we'll look at here -- executing
the same command across multiple machines using PDSH. In the next post of the series we'll see
how we can monitor OS metrics across a cluster with colmux.
PDSH is a very smart little tool that enables you to issue the same command on multiple
hosts at once, and see the output. You need to have set up ssh key authentication from the
client to host on all of them, so if you followed the steps in the first section of this
article you'll be good to go.
The syntax for using it is nice and simple:
-w specifies the addresses. You can use numerical ranges [1-4]
and/or comma-separated lists of hosts. If you want to connect as a user other than the
current user on the calling machine, you can specify it here (or as a separate
-l argument)
After that is the command to run.
For example run against a small cluster of four machines that I have:
robin@RNMMBP $ pdsh -w root@rnmcluster02-node0[1-4] date
rnmcluster02-node01: Fri Nov 28 17:26:17 GMT 2014
rnmcluster02-node02: Fri Nov 28 17:26:18 GMT 2014
rnmcluster02-node03: Fri Nov 28 17:26:18 GMT 2014
rnmcluster02-node04: Fri Nov 28 17:26:18 GMT 2014
... ... ...
Example - install and start collectl on all nodes
I started looking into pdsh when it came to setting up a cluster of machines from scratch.
One of the must-have tools I like to have on any machine that I work with is the excellent
collectl .
This is an OS resource monitoring tool that I initially learnt of through Kevin Closson and Greg Rahn , and provides the kind of information you'd get
from top etc – and then some! It can run interactively, log to disk, run as a service
– and it also happens to integrate
very nicely with graphite , making it a no-brainer choice for any server.
So, instead of logging into each box individually I could instead run this:
pdsh -w root@rnmcluster02-node0[1-4] yum install -y collectl
pdsh -w root@rnmcluster02-node0[1-4] service collectl start
pdsh -w root@rnmcluster02-node0[1-4] chkconfig collectl on
Yes, I know there are tools out there like puppet and chef that are designed for doing this
kind of templated build of multiple servers, but the point I want to illustrate here is that
pdsh enables you to do ad-hoc changes to a set of servers at once. Sure, once I have my cluster
built and want to create an image/template for future builds, then it would be daft if
I were building the whole lot through pdsh-distributed yum commands.
Example - setting up
the date/timezone/NTPD
Often the accuracy of the clock on each server in a cluster is crucial, and we can easily do
this with pdsh:
robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] ntpdate pool.ntp.org
rnmcluster02-node03: 30 Nov 20:46:22 ntpdate[27610]: step time server 176.58.109.199 offset -2.928585 sec
rnmcluster02-node02: 30 Nov 20:46:22 ntpdate[28527]: step time server 176.58.109.199 offset -2.946021 sec
rnmcluster02-node04: 30 Nov 20:46:22 ntpdate[27615]: step time server 129.250.35.250 offset -2.915713 sec
rnmcluster02-node01: 30 Nov 20:46:25 ntpdate[29316]: 178.79.160.57 rate limit response from server.
rnmcluster02-node01: 30 Nov 20:46:22 ntpdate[29316]: step time server 176.58.109.199 offset -2.925016 sec
Set NTPD to start automatically at boot:
robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] chkconfig ntpd on
Start NTPD:
robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] service ntpd start
Example - using a HEREDOC (here-document) and sending quotation marks in a command with
PDSH
Here documents
(heredocs) are a nice way to embed multi-line content in a single command, enabling the
scripting of a file creation rather than the clumsy instruction to " open an editor and
paste the following lines into it and save the file as /foo/bar ".
Fortunately heredocs work just fine with pdsh, so long as you remember to enclose the whole
command in quotation marks. And speaking of which, if you need to include quotation marks in
your actual command, you need to escape them with a backslash. Here's an example of both,
setting up the configuration file for my ever-favourite gnu screen on all the nodes of the
cluster:
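(The actual listing from the original post isn't included here; a minimal sketch of the pattern, with hypothetical .screenrc content, might look like the following. Note the backslash-escaped quotation marks inside the outer double quotes.)
pdsh -w root@rnmcluster02-node0[1-4] "cat > ~/.screenrc <<EOF
hardstatus alwayslastline \"%{= kG}[ %H ][ %d/%m %c ]\"
startup_message off
EOF"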
Now when I login to each individual node and run screen, I get a nice toolbar at the
bottom:
Combining
commands
To combine commands together that you send to each host you can use the standard bash
operator semicolon ;
robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] "date;sleep 5;date"
rnmcluster02-node01: Sun Nov 30 20:57:06 GMT 2014
rnmcluster02-node03: Sun Nov 30 20:57:06 GMT 2014
rnmcluster02-node04: Sun Nov 30 20:57:06 GMT 2014
rnmcluster02-node02: Sun Nov 30 20:57:06 GMT 2014
rnmcluster02-node01: Sun Nov 30 20:57:11 GMT 2014
rnmcluster02-node03: Sun Nov 30 20:57:11 GMT 2014
rnmcluster02-node04: Sun Nov 30 20:57:11 GMT 2014
rnmcluster02-node02: Sun Nov 30 20:57:11 GMT 2014
Note the use of the quotation marks to enclose the entire command string. Without them the
bash interpreter will take the ; as the delineator of the local commands,
and try to run the subsequent commands locally:
robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] date;sleep 5;date
rnmcluster02-node03: Sun Nov 30 20:57:53 GMT 2014
rnmcluster02-node04: Sun Nov 30 20:57:53 GMT 2014
rnmcluster02-node02: Sun Nov 30 20:57:53 GMT 2014
rnmcluster02-node01: Sun Nov 30 20:57:53 GMT 2014
Sun 30 Nov 2014 20:58:00 GMT
You can also use && and || to run subsequent commands
conditionally if the previous one succeeds or fails respectively:
robin@RNMMBP $ pdsh -w root@rnmcluster02-node[01-4] "chkconfig collectl on && service collectl start"
rnmcluster02-node03: Starting collectl: [ OK ]
rnmcluster02-node02: Starting collectl: [ OK ]
rnmcluster02-node04: Starting collectl: [ OK ]
rnmcluster02-node01: Starting collectl: [ OK ]
Piping and file redirects
Similar to combining commands above, you can pipe the output of commands, and you need to
use quotation marks to enclose the whole command string.
The difference is that you'll be shifting the whole of the pipe across the network in order
to process it locally, so if you're just grepping etc this doesn't make any sense. For use of
utilities held locally and not on the remote server though, this might make sense.
File redirects work the same way – within quotation marks and the redirect will be to
a file on the remote server, outside of them it'll be local:
robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node[01-4] "chkconfig>/tmp/pdsh.out"
robin@RNMMBP ~ $ ls -l /tmp/pdsh.out
ls: /tmp/pdsh.out: No such file or directory
robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node[01-4] chkconfig>/tmp/pdsh.out
robin@RNMMBP ~ $ ls -l /tmp/pdsh.out
-rw-r--r-- 1 robin wheel 7608 30 Nov 19:23 /tmp/pdsh.out
Cancelling PDSH operations
As you can see from above, the precise syntax of pdsh calls can be hugely important. If you
run a command and it appears 'stuck', or if you have that heartstopping realisation that the
shutdown -h now you meant to run locally you ran across the cluster, you can press
Ctrl-C once to see the status of your commands:
robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node[01-4] sleep 30
^Cpdsh@RNMMBP: interrupt (one more within 1 sec to abort)
pdsh@RNMMBP: (^Z within 1 sec to cancel pending threads)
pdsh@RNMMBP: rnmcluster02-node01: command in progress
pdsh@RNMMBP: rnmcluster02-node02: command in progress
pdsh@RNMMBP: rnmcluster02-node03: command in progress
pdsh@RNMMBP: rnmcluster02-node04: command in progress
and press it twice (or within a second of the first) to cancel:
robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node[01-4] sleep 30
^Cpdsh@RNMMBP: interrupt (one more within 1 sec to abort)
pdsh@RNMMBP: (^Z within 1 sec to cancel pending threads)
pdsh@RNMMBP: rnmcluster02-node01: command in progress
pdsh@RNMMBP: rnmcluster02-node02: command in progress
pdsh@RNMMBP: rnmcluster02-node03: command in progress
pdsh@RNMMBP: rnmcluster02-node04: command in progress
^Csending SIGTERM to ssh rnmcluster02-node01
sending signal 15 to rnmcluster02-node01 [ssh] pid 26534
sending SIGTERM to ssh rnmcluster02-node02
sending signal 15 to rnmcluster02-node02 [ssh] pid 26535
sending SIGTERM to ssh rnmcluster02-node03
sending signal 15 to rnmcluster02-node03 [ssh] pid 26533
sending SIGTERM to ssh rnmcluster02-node04
sending signal 15 to rnmcluster02-node04 [ssh] pid 26532
pdsh@RNMMBP: interrupt, aborting.
If you've got threads yet to run on the remote hosts, but want to keep running whatever has
already started, you can use Ctrl-C, Ctrl-Z:
robin@RNMMBP ~ $ pdsh -f 2 -w root@rnmcluster02-node[01-4] "sleep 5;date"
^Cpdsh@RNMMBP: interrupt (one more within 1 sec to abort)
pdsh@RNMMBP: (^Z within 1 sec to cancel pending threads)
pdsh@RNMMBP: rnmcluster02-node01: command in progress
pdsh@RNMMBP: rnmcluster02-node02: command in progress
^Zpdsh@RNMMBP: Canceled 2 pending threads.
rnmcluster02-node01: Mon Dec 1 21:46:35 GMT 2014
rnmcluster02-node02: Mon Dec 1 21:46:35 GMT 2014
NB the above example illustrates the use of the -f argument to limit how many
threads are run against remote hosts at once. We can see the command is left running on the
first two nodes and returns the date, whilst the Ctrl-C - Ctrl-Z stops it from being executed
on the remaining nodes.
PDSH_SSH_ARGS_APPEND
By default, when you ssh to new host for the first time you'll be prompted to validate the
remote host's SSH key fingerprint.
The authenticity of host 'rnmcluster02-node02 (172.28.128.9)' can't be established.
RSA key fingerprint is 00:c0:75:a8:bc:30:cb:8e:b3:8e:e4:29:42:6a:27:1c.
Are you sure you want to continue connecting (yes/no)?
This is one of those prompts that the majority of us just hit enter at and ignore; if that
includes you then you will want to make sure that your PDSH call doesn't fall in a heap because
you're connecting to a bunch of new servers all at once. PDSH is not an interactive tool, so if
it requires input from the hosts it's connecting to it'll just fail. To avoid this SSH prompt,
you can set up the environment variable PDSH_SSH_ARGS_APPEND as follows:
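(The export statement itself is missing from this excerpt; reconstructed from the description that follows, it would be:)
export PDSH_SSH_ARGS_APPEND="-q -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null"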
The -q makes failures less verbose, and the -o passes in a couple
of options, StrictHostKeyChecking to disable the above check, and
UserKnownHostsFile to stop SSH keeping a list of host IP/hostnames and
corresponding SSH fingerprints (by pointing it at /dev/null ). You'll want this if
you're working with VMs that are sharing a pool of IPs and get re-used, otherwise you get this
scary failure:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
00:c0:75:a8:bc:30:cb:8e:b3:8e:e4:29:42:6a:27:1c.
Please contact your system administrator.
For both of these above options, make sure you're aware of the security implications that
you're opening yourself up to. For a sandbox environment I just ignore them; for anything where
security is of importance make sure you are aware of quite which server you are connecting to
by SSH, and protecting yourself from MitM attacks.
When working with multiple Linux machines I would first and foremost make sure SSH
keys are set up in order to ease management through password-less logins.
After SSH keys, I would recommend pdsh for parallel execution of the same SSH command across
the cluster. It's a big time saver particularly when initially setting up the cluster given the
installation and configuration changes that are inevitably needed.
In the next article of this series we'll see how the tool colmux is a powerful way to
monitor OS metrics across a cluster.
So now it's your turn -- what particular tools or tips do you have for working with a
cluster of Linux machines? Leave your answers in the comments below, or tweet them to me at
@rmoff.
[1] 18:10:10 [SUCCESS] [email protected]
Sun Feb 26 18:10:10 IST 2017
[2] 18:10:10 [SUCCESS] vivek@dellm6700
Sun Feb 26 18:10:10 IST 2017
[3] 18:10:10 [SUCCESS] [email protected]
Sun Feb 26 18:10:10 IST 2017
[4] 18:10:10 [SUCCESS] [email protected]
Sun Feb 26 18:10:10 IST 2017
Run the uptime command on each host: $ pssh -i -h ~/.pssh_hosts_files uptime
Sample outputs:
You can now automate common sysadmin tasks such as patching all servers: $ pssh -h ~/.pssh_hosts_files -- sudo yum -y update
OR $ pssh -h ~/.pssh_hosts_files -- sudo apt-get -y update
$ pssh -h ~/.pssh_hosts_files -- sudo apt-get -y upgrade
How do I use pssh to copy a file to all servers?
The syntax is: pscp -h ~/.pssh_hosts_files src dest
To copy $HOME/demo.txt to /tmp/ on all servers, enter: $ pscp -h ~/.pssh_hosts_files $HOME/demo.txt /tmp/
Sample outputs:
Or use the prsync command for efficient copying of files: $ prsync -h ~/.pssh_hosts_files /etc/passwd /tmp/
$ prsync -h ~/.pssh_hosts_files *.html /var/www/html/
How do I kill processes in
parallel on a number of hosts?
Use the pnuke command for killing processes in parallel on a number of hosts. The syntax
is:
$ pnuke -h ~/.pssh_hosts_files process_name
### kill nginx and firefox on hosts: $ pnuke -h ~/.pssh_hosts_files firefox
$ pnuke -h ~/.pssh_hosts_files nginx
See pssh/pscp command man pages for more information.
Thanks for the article! Always looking for more options to perform similar tasks.
When you want to interact with multiple hosts simultaneously, MobaXterm
(mobaxterm.mobatek.net), is a powerful tool. You can even use your favorite text editor
(vim, emacs, nano, ed) in real time.
Each character typed is sent in parallel to all hosts and you immediately see the
effect. Selectively toggling whether the input stream is sent to individual host(s) during
a session allows for custom changes that only affect a desired subset of hosts.
MobaXterm has a free home version as well as a paid professional edition. The company
was highly responsive to issue reports that I provided and corrected the issues
quickly.
I have no affiliation with the company other than being a happy free edition
customer.
If you're a Linux system administrator, chances are you've got more than one machine that
you're responsible for on a daily basis. You may even have a bank of machines that you maintain
that are similar -- a farm of Web servers, for example. If you have a need to type the same
command into several machines at once, you can login to each one with SSH and do it serially,
or you can save yourself a lot of time and effort and use a tool like ClusterSSH.
ClusterSSH is a Tk/Perl wrapper around standard Linux tools like XTerm and SSH. As such,
it'll run on just about any POSIX-compliant OS where the libraries exist -- I've run it on
Linux, Solaris, and Mac OS X. It requires the Perl libraries Tk ( perl-tk on
Debian or Ubuntu) and X11::Protocol ( libx11-protocol-perl on Debian or Ubuntu),
in addition to xterm and OpenSSH.
Installation
Installing ClusterSSH on a Debian or Ubuntu system is trivial -- a simple sudo apt-get
install clusterssh will install it and its dependencies. It is also packaged for use
with Fedora, and it is installable via the ports system on FreeBSD. There's also a MacPorts
version for use with Mac OS X, if you use an Apple machine. Of course, it can also be compiled
from source.
Configuration
ClusterSSH can be configured either via its global configuration file --
/etc/clusters , or via a file in the user's home directory called
.csshrc. I tend to favor the user-level configuration as that lets multiple
people on the same system set up their ClusterSSH client as they choose. Configuration is
straightforward in either case, as the file format is the same. ClusterSSH defines a "cluster"
as a group of machines that you'd like to control via one interface. With that in mind, you
enumerate your clusters at the top of the file in a "clusters" block, and then you describe
each cluster in a separate section below.
For example, let's say I've got two clusters, each consisting of two machines. "Cluster1"
has the machines "Test1" and "Test2" in it, and "Cluster2" has the machines "Test3" and "Test4"
in it. The ~/.csshrc (or /etc/clusters) control file would look like
this:
clusters = cluster1 cluster2
cluster1 = test1 test2
cluster2 = test3 test4
You can also make meta-clusters -- clusters that refer to clusters. If you wanted to make a
cluster called "all" that encompassed all the machines, you could define it two ways. First,
you could simply create a cluster that held all the machines, like the following:
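(The example definitions are not shown in this excerpt; based on the machines above they would look roughly like this. The flat variant:)
clusters = cluster1 cluster2 all
cluster1 = test1 test2
cluster2 = test3 test4
all = test1 test2 test3 test4
Alternatively, the meta-cluster variant defines "all" in terms of the other clusters:
all = cluster1 cluster2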
By calling out the "all" cluster as containing cluster1 and cluster2, if either of those
clusters ever change, the change is automatically captured so you don't have to update the
"all" definition. This will save you time and headache if your .csshrc file ever grows in
size.
Using ClusterSSH
Using ClusterSSH is similar to launching SSH by itself. Simply running cssh -l
<username> <clustername> will launch ClusterSSH and log you in as the
desired user on that cluster. In the figure below, you can see I've logged into "cluster1" as
myself. The small window labeled "CSSH [2]" is the Cluster SSH console window. Anything I type
into that small window gets echoed to all the machines in the cluster -- in this case, machines
"test1" and "test2". In a pinch, you can also login to machines that aren't in your .csshrc
file, simply by running cssh -l <username> <machinename1>
<machinename2> <machinename3> .
If I want to send something to one of the terminals, I can simply switch focus by clicking
in the desired XTerm, and just type in that window like I usually would. ClusterSSH has a few
menu items that really help when dealing with a mix of machines. As per the figure below, in
the "Hosts" menu of the ClusterSSH console there are several options that come in handy.
"Retile Windows" does just that if you've manually resized or moved something. "Add host(s)
or Cluster(s)" is great if you want to add another set of machines or another cluster to the
running ClusterSSH session. Finally, you'll see each host listed at the bottom of the "Hosts"
menu. By checking or unchecking the boxes next to each hostname, you can select which hosts the
ClusterSSH console will echo commands to. This is handy if you want to exclude a host or two
for a one-off or particular reason. The final menu option that's nice to have is under the
"Send" menu, called "Hostname". This simply echoes each machine's hostname to the command line,
which can be handy if you're constructing something host-specific across your cluster.
Caveats with ClusterSSH
Like many UNIX tools, ClusterSSH has the potential to go horribly awry if you aren't
very careful with its use. I've seen ClusterSSH mistakes take out an entire tier of
Web servers simply by propagating a typo in an Apache configuration. Having access to multiple
machines at once, possibly as a privileged user, means mistakes come at a great cost. Take
care, and double-check what you're doing before you punch that Enter key.
Conclusion
ClusterSSH isn't a replacement for having a configuration management system or any of the
other best practices when managing a number of machines. However, if you need to do something
in a pinch outside of your usual toolset or process, or if you're doing prototype work,
ClusterSSH is indispensable. It can save a lot of time when doing tasks that need to be done on
more than one machine, but like any power tool, it can cause a lot of damage if used
haphazardly.
Execute Commands Simultaneously on Multiple Servers
Run the same command at the same time on multiple systems, simplifying administrative tasks and reducing synchronization problems.
If you have multiple servers with similar or identical configurations
(such as nodes in a cluster), it's often difficult to make sure the contents
and configuration of those servers are identical. It's even more difficult
when you need to make configuration modifications from the command line,
knowing you'll have to execute the exact same command on a large number of
systems (better get coffee first). You could try writing a script to perform
the task automatically, but sometimes scripting is overkill for the work to
be done. Fortunately, there's another way to execute commands on multiple
hosts simultaneously.
A great solution for this problem is an excellent tool called multixterm, which enables you
to simultaneously open xterms to any number of systems, type your commands in a single
central window, and have the commands executed in each of the xterm windows you've started.
Sound appealing? Type once, execute many -- it sounds like a new pipelining instruction set.
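(The sample command from the hack isn't shown in this excerpt; based on the description of the -xc option and the %n wildcard below, it would be along these lines:)
multixterm -xc "ssh %n" host1 host2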
This command will open ssh connections to host1 and host2 (Figure 4-1). Anything typed in
the area labeled "stdin window" (which is usually gray or green, depending on your color
scheme) will be sent to both windows, as shown in the figure.
As you can see from the sample command, the -xc option stands for execute command, and it
must be followed by the command that you want to execute on each host, enclosed in double
quotation marks. If the specified command includes a wildcard such as %n, each hostname that
follows the command will be substituted into the command in turn when it is executed. Thus,
in our example, the commands ssh host1 and ssh host2 were both executed by multixterm, each
within its own xterm window.
See Also:
man multixterm
"Enable Quick telnet/SSH Connections from the Desktop" [Hack #41]
"Disconnect Your Console Without Ending Your Session" [Hack #34]
Now I can shift into second gear and try some fancier pdsh tricks.
First, I want to run a more complicated command on all of the nodes.
Notice that I put the entire command in quotes. This means the entire
command is run on each node, including the first (cat /proc/cpuinfo) and
second (grep bogomips, model, cpu) parts.
[shaha@oc8535558703 PDSH]$ pdsh 'cat /proc/cpuinfo' | egrep 'bogomips|model|cpu'
ubuntu@ec2-52-58-254-227: cpu family : 6
ubuntu@ec2-52-58-254-227: model : 63
ubuntu@ec2-52-58-254-227: model name : Intel(R) Xeon(R) CPU E5-2676 v3
@ 2.40GHz
ubuntu@ec2-52-58-254-227: cpu MHz : 2400.070
ubuntu@ec2-52-58-254-227: cpu cores : 1
ubuntu@ec2-52-58-254-227: cpuid level : 13
ubuntu@ec2-52-58-254-227: bogomips : 4800.14
ec2-user@ec2-52-59-121-138: cpu family : 6
ec2-user@ec2-52-59-121-138: model : 62
ec2-user@ec2-52-59-121-138: model name : Intel(R) Xeon(R) CPU E5-2670
v2 @ 2.50GHz
ec2-user@ec2-52-59-121-138: cpu MHz : 2500.036
ec2-user@ec2-52-59-121-138: cpu cores : 1
ec2-user@ec2-52-59-121-138: cpuid level : 13
ec2-user@ec2-52-59-121-138: bogomips : 5000.07
[shaha@oc8535558703 PDSH]$
What's the next best thing after ssh in a for loop? Only needs to scale to a hundred or so hosts.
A client-side only implementation is preferable. Some options I've dug up:
Posted by: Anonymous [ip: 98.215.194.203] on November 01, 2008 09:22 PM
No, it's not. Let's take a sample size of 32 hosts and run a quick command on each:
$ time for f
in `cat hosts`;
do ssh $f 'ls / > /dev/null';
done
real 2m45.195s
That's roughly 5.15 seconds per host. If this were a 5000 node network we're looking at about
7.1 hours to complete this command. Let's do the same test with pssh and a max parallel of 10:
$ time pssh -p 10 -h hosts
"ls > /dev/null"
real 0m17.220s
That's some considerable savings. Let's try each one in parallel and set the max to 32:
$ time pssh -p 32 -h hosts
"ls > /dev/null"
real 0m7.436s
If one run took about 5 seconds, doing them all at the same time also took about 5 seconds, just
with a bit of overhead. I don't have a 5000-node network (anymore), but you can see there are considerable
savings by doing some things in parallel. You probably wouldn't ever run 5000 commands in parallel,
but really that's a limit of your hardware and network. If you had a beefy enough host machine you
probably could run 50, 100 or even 200 in parallel if the machine could handle it.
Posted by: Anonymous [ip: 99.156.88.107] on November 02, 2008 07:54 PM
It's absolutely not good enough. 4 or so years ago a coworker and I wrote a suite of parallel ssh
tools to help perform security related duties on the very large network in our global corp. With
our tools on a mosix cluster using load balanced ssh-agents across multiple nodes we could run up to
1000 outbound sessions concurrently. This made tasks such as looking for users' processes or cronjobs
on 10,000+ hosts world wide a task that could be done in a reasonable amount of time, as opposed
to taking more than a day.
Parallel SSH execution and a single shell to control them all
Posted by: Anonymous [ip: 24.14.35.105] on November 02, 2008 12:43 PM
I use the parallel option to xargs for this. Tried shmux, and some other tools, but xargs
seems to work best for me. Just use a more recent GNU version. Some older GNU versions, some AIX
versions, etc. have some issues. The only real gotcha that I've run into is that it will stop the whole
run if a command exits non-zero. Just write a little wrapper that exits 0 and you're good to go.
I've used this in two ~1000-server environments to push code (pipe tar over ssh for better compatibility)
and remotely execute commands.
Parallel SSH execution and a single shell to control them all
Posted by: Anonymous [ip: 87.230.108.21] on November 06, 2008 10:08 AM
What I'm curious about is this:
"""
if you want to interactively edit the same file on multiple machines, it might be quicker to use
a parallel SSH utility and edit the file on all nodes with vi rather than concoct a script to do
the same edit.
"""
I would have found a short note on which of these three is capable of doing so very helpful. Cluster
SSH's description sounds as though it would be the tool that could do it, but I just don't have the
time to test it just yet.
Has anyone tried that yet? Or does anyone know which tool this statement refers to?
Group Shell (also called gsh)
is a remote shell multiplexor. It lets you control many remote shells at once in a single shell.
Unlike other command dispatchers, it is interactive, so shells spawned on the remote hosts are
persistent.
It requires only a SSH server on the remote hosts, or some other way to open a remote
shell.
gsh allows you to run commands on multiple hosts by adding tags to the gsh command.
gsh tag "remote command"
Important things to remember:
/etc/ghosts contains a list of all the servers and tags
gsh is a lot more fun once you've set up ssh keys to your servers
Examples
List uptime on all servers in the linux group:
gsh linux "uptime"
Check to see if an IP address was blocked with CSF by checking the csf and csfcluster groups/tags:
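(The example command isn't reproduced here; assuming csf -g searches the firewall rules for an address and that both tags are defined in /etc/ghosts, a sketch would be:)
gsh csf "csf -g 192.0.2.1"
gsh csfcluster "csf -g 192.0.2.1"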
pssh provides parallel versions of the OpenSSH tools that are useful for controlling large numbers
of machines simultaneously. It includes parallel versions of ssh, scp, and rsync, as well as a parallel
kill command.
Use tentakel for parallel, distributed command execution.
Often you want to execute a command not only on one computer, but on several at once. For example,
you might want to report the current statistics on a group of managed servers or update all of your
web servers at once.
1 The Obvious Approach
You could simply do this on the command line with a shell script like the following:
for host in hostA hostB hostC ; do
ssh $host do_something
done
However, this has several disadvantages:
It is slow because the connections to the remote hosts do not run in parallel. Every connection
must wait for the previous one to finish.
Managing many sets of hosts can become a complicated task because there is no easy way to define
groups of hosts (e.g., mailservers or workstations).
The output is simply whatever the remote programs print, and it is hard to read because there are
no marks indicating when the output for a specific host begins or ends.
2 How tentakel Can Help
While you could write a shell script to address some of these disadvantages, you might want to
consider tentakel, which is available in the ports collection. Its execution starts multiple
threads that run independently of each other. The maximum waiting time depends on the longest running
remote connection, not on the sum of all of them. After the last remote command has returned,
tentakel displays the results of all remote command executions. You can also configure
how the output should look, combining or differentiating the results from individual hosts.
tentakel operates on groups of hosts. A group can have two types of members: hosts
or references to other groups. A group can also have parameters to control various aspects of the
connection, including username and access method (rsh or ssh, for example).
3 Installing and Configuring tentakel
Install tentakel from the ports collection:
cd /usr/ports/sysutils/tentakel
make install clean
You can instead install tentakel by hand; consult the INSTALL file in the
distribution. A make install should work in most cases, provided that you have a working
Python environment installed.
After the installation, create the configuration file tentakel.conf in the directory
$HOME/.tentakel/. See the example file in /usr/local/share/doc/tentakel/tentakel.conf.example
for a quick overview of the format.
Alternatively, copy the file into /usr/local/etc/ or /etc/, depending on your system's
policy, in order to have a site-wide tentakel.conf that will be used when there is no user-specific
configuration. As an administrator, you may predefine groups for your users this way.
Assuming that you have a farm of three servers, mosel, aare, and spree,
of which the first two are web servers, your configuration might resemble this:
set format="%d\n%o\n"
group webservers(user="webmaster")
+mosel +aare
group servers(user="root")
@webservers +spree
With this definition, you can use the group name servers to execute a command on
all your servers as root and the group name webservers to execute it
only on your web servers as user webmaster.
The first line defines the output format, as explained in the table below.
tentakel output format characters:
%d -- The hostname
%o -- The output of the remotely executed commands
\n -- A newline character
This tells tentakel to print the hostname, followed by the lines of the remote
output, for each server sequentially. You can enrich the format string with additional directives,
such as %s for the exit status from commands. See the manpage for more information.
As you can see from the servers definition, there is no need to list all servers
in each group; include servers from other groups using the @groupname notation.
On the remote machines, the only required configuration is to ensure that you can log into them
from the tentakel machine without entering a password. Usually that will mean using
ssh and public keys, which is also tentakel's default. tentakel
provides the parameter method for using different mechanisms, so refer to the manpage
for details.
4 Using tentakel
To update the web pages on all web servers from a CVS repository:
% tentakel -g webservers "cd /var/www/htdocs && cvs update"
### mosel(0):
cvs update: Updating .
U index.html
U main.css
### aare(1):
C main.css
cvs update: Updating .
%
Note the use of quotes around the command to be executed. This prevents the local shell from interpreting
special characters such as & or ;.
If no command is specified, tentakel invokes interactive mode:
dsh (the distributed shell) is a program which executes a single command on multiple remote
machines. It can execute this command in parallel (i.e., on any number of machines at a time)
or in serial (by specifying parallel execution of the command on 1 node at a time). It was originally
designed to work with rsh, but has full support for ssh and with a little tweaking of the top part
of the dsh executable, should work with any program that allows remote execution of a command without
an interactive login.
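A minimal sketch of invoking dsh, assuming the Debian "dancer's shell" packaging, where the machine list lives in /etc/dsh/machines.list; flag names and file locations may differ in other dsh variants:

# run a command concurrently on two explicitly named machines
dsh -c -m hostA -m hostB -- uptime
# run it on every machine listed in /etc/dsh/machines.list
dsh -c -a -- uptime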
Massh is a mass ssh tool that allows for parallel execution of commands on remote systems. This
makes it possible to update and manage hundreds or even thousands of systems. It can also push files
in parallel and run scripts.
It also includes Pingz, a mass pinger that can do DNS lookups, and Ambit, a string expander that
allows for both pre-defined groups of hosts and arbitrary strings that represent host groupings.
The combination of Massh and Ambit creates a powerful way to manage groups of systems as configurable
units. This allows a focus on managing an environment of services, not servers. Clean, organized
output sets it apart from other mass ssh tools.
With the help of a tool called tentakel, you can run distributed command execution. It is a program
for executing the same command on many hosts in parallel using ssh (it supports other methods too).
The main advantage is that you can create several sets of servers according to your requirements, for
example a webserver group, a mail server group, a home servers group, etc. The command is executed in
parallel on all servers in a group (a time saver). By default, every result is printed to stdout (the screen).
The output format can be defined for each group.
You need to install tentakel on the admin workstation (192.168.1.1). We have two server groups: the first
is a group of web servers with three hosts, and the other is homeservers with two hosts.
The only requirement on the remote hosts is a running sshd server. You also need to set up
ssh-key based login between the admin workstation and all group servers/hosts to take full advantage
of this tentakel distributed command execution method.
Tentakel requires a working Python installation. It is known to work with Python 2.3; Python 2.2
and Python 2.1 are not supported. If you are using an old version of Python, please upgrade it.
Let us see how to install and configure tentakel.
Visit the SourceForge home page to download
tentakel, or download RPM files from the tentakel home page.
Untar the source code:
# tar -zxvf tentakel-2.2.tgz
You should be the root user for the install step. To install it, type:
# make
# make install
For demonstration purposes we will use the following setup:
Admin PC: running Debian Linux, user jadmin
Group: homeservers, with hosts 192.168.1.12 and 192.168.1.15
Copy the sample tentakel configuration file tentakel.conf.example to the /etc directory:
# cp tentakel.conf.example /etc/tentakel.conf
Modify /etc/tentakel.conf according to above setup, at the end your file should look like as follows:
# first section: global parameters
set ssh_path="/usr/bin/ssh"
set method="ssh" # ssh method
set user="jadmin" # ssh username for remote servers
#set format="%d %o\n" # output format see man page
#set maxparallel="3" # run at most 3 commands in parallel
# our home servers with two hosts
group homeservers ()
+192.168.1.12 +192.168.1.15
# localhost
group local ()
+127.0.0.1
Save the file and exit to the shell prompt. Where:
group homeservers () : the group name
+192.168.1.12 +192.168.1.15 : host inclusion; each entry after a + is included in the group and can be an IP address or a hostname.
Configure
ssh-key based login between the admin workstation and the group servers for the jadmin user,
so that no password prompt appears.
Log in as jadmin and type the following command:
$ tentakel -g homeservers
interactive mode
tentakel(homeservers)>
Where:
-g groupname : selects the group groupname. The group must be defined in the configuration file (here
it is homeservers). If not specified, tentakel implicitly assumes the default group.
At the tentakel(homeservers)> prompt, run the uname and uptime commands as follows:
exec "uname -mrs"
exec "uptime"
A few more examples:
Find out who is logged on to all homeservers and what they are doing (type at the shell prompt):
$ tentakel -g homeservers "w"
Executes the uptime command on all hosts defined in group homeservers:
$ tentakel -g homeservers uptime
As you can see, tentakel is a very powerful and easy-to-use tool. It also supports the concept of
plugins. A plugin is a single Python module and must appear in the $HOME/.tentakel/plugins/ directory.
The main advantage of plugins is customization according to your needs. For example, an entire web server
or MySQL server farm can be controlled according to your requirements.
However, tentakel is not the only utility for this kind of work. There are programs that do similar
things or are related to tentakel in some way. The complete list can be found online
here. tentakel should work on
almost all variants of UNIX/BSD and Linux distributions.
RGANG looks good too. It incorporates an algorithm to build a tree-like structure (or "worm"
structure) that allows the distribution processing time to scale very well to 1000 or more nodes.
Looks rock solid.
If you're somewhat traditional, just use Expect.
It does most of the same, has tons of examples around, and a cool book (Exploring Expect).
And you can handle stuff that NEEDS a terminal, like ssh password prompts or the passwd program
used to change passwords. And it works on Windows.
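A minimal sketch of that kind of interaction, driven from a shell here-document (the host, account and password are placeholders, and the exact prompt text varies between systems):

# answer an ssh password prompt automatically, then run a command
expect <<'EOF'
spawn ssh admin@host uptime
expect "assword:"
send "S3cret!\r"
expect eof
EOF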
Expect is very nice. Back in the Solaris days I had a complete monitoring system written with rsh and
the expect tool. One advantage of ssh is that it provides an API for C/C++ programs, so you get better performance.
Thanks for mentioning tentakel in your blog. You also mentioned rgang, which looks nice indeed.
However, there are two reasons why I don't like rgang: 1) the license is not as free as tentakel's
(at least it does not look like it, as far as I can tell without being a lawyer), and 2) it looks much
less maintained than tentakel :)
makeself is a small shell script that generates a self-extractable compressed TAR archive from
a directory. The resulting file appears as a shell script, and can be launched as is. The archive
will then uncompress itself to a temporary directory and an arbitrary command will be executed (for
example, an installation script).
This is pretty similar to archives generated with WinZip Self-Extractor
in the Windows world.
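A minimal sketch of building such an archive (the directory name, output file and install.sh startup script are placeholders; makeself.sh takes the directory to pack, the output file name, a label, and the command to run after extraction):

# pack ./myapp-1.0 into a self-extracting installer that runs ./install.sh after unpacking
makeself.sh ./myapp-1.0 myapp-1.0.run "MyApp 1.0 installer" ./install.sh
# the result is launched like any shell script
sh ./myapp-1.0.run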
Spacewalk is a Linux and Solaris systems management solution. It allows you to inventory your
systems (hardware and software information), install and update software on your systems, collect
and distribute your custom software packages into manageable groups, provision (Kickstart) your systems,
manage and deploy configuration files to your systems, monitor your systems, provision virtual guests,
and start/stop/configure virtual guests.
Cluster SSH opens terminal windows with connections to specified hosts and an administration console.
Any text typed into the administration console is replicated to all other connected and active windows.
This tool is intended for, but not limited to, cluster administration
where the same configuration or commands must be run on each node within the cluster.
Performing these commands all at once via this tool ensures all nodes are kept in sync.
As an administrator of SLES/OES Linux clusters or multiple SUSE Linux servers, you are probably
familiar with the fact that you often have to make an identical change on more than one server. These
can be things like editing files, executing commands, collecting data, or other administrative tasks.
There are a couple of ways to do this. You can write a script that performs the change for you,
or you can SSH into a server, make the change, and repeat that task manually for every server.
Both ways can cost a significant amount of time: writing and testing a shell script takes time,
and performing the task by hand on, let's say, five or more servers also costs time.
Now, wouldn't it be a real timesaver if you had only one console in which you can perform tasks
on multiple servers simultaneously? This solution can be found in ClusterSSH.
Solution:
With ClusterSSH it is possible to make a SSH connection to multiple servers and
perform tasks from one single command window, without any scripting.
The 'cssh' command lets you connect to any server specified as a command line argument, or
to groups of servers (or cluster nodes) defined in a configuration file.
The 'cssh' command opens a terminal window to every server, which can be used to review the output
sent from the cssh console, or to edit a single host directly. Commands given in the cssh console
are executed on every connected host. When you start typing in the cssh console you'll see that the
same command also shows up on the command line of the connected systems.
The state of connected systems can be toggled from the cssh console. So if you want to exclude
certain hosts temporarily from a specific command, you can do this with a single mouse click. Also,
hosts can be added on the fly, and open terminal windows can automatically be rearranged.
One caveat to be aware of is when editing files: never assume that a file is identical on all systems.
For example, lines in a file you are editing may be in a different order. Don't just go to a certain
line number and start editing. Instead, search for the text you want to edit, just to be sure the
correct text is edited on all connected systems.
Example:
Configuration files section from the man-page:
/etc/clusters
This file contains a list of tags to server names mappings. When any name is used on the command
line it is checked to see if it is a tag in /etc/clusters (or the .csshrc file, or any additional
cluster file specified by -c). If it is a tag, then the tag is replaced with the list of servers
from the file. The file is formatted as follows:
<tag> [user@]<server> [user@]<server> [...]
i.e.
# List of servers in live
live admin1@server1 admin2@server2 server3 server4
Clusters may also be specified within the user's .csshrc file, as documented below.
/etc/csshrc & $HOME/.csshrc
This file contains configuration overrides - the defaults are as marked. Default options are overwritten
first by the global file, and then by the user file.
Environment:
ClusterSSH can be used with any system running the SSH daemon.
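As a minimal sketch, assuming the /etc/clusters example above is in place and cssh is installed, the console can be opened either with a cluster tag or with hosts named directly on the command line:

# one xterm per server in the "live" tag, plus the administration console
cssh live
# or connect to explicitly named hosts
cssh admin1@server1 server3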
About: pssh provides parallel versions of the OpenSSH tools that are useful for controlling
large numbers of machines simultaneously. It includes parallel versions
of ssh, scp, and rsync, as well as a parallel kill command.
Changes: A 64-bit bug was fixed: select now uses None when there is no timeout rather than
sys.maxint. EINTR is caught on select, read, and write calls. Longopts were fixed for pnuke, prsync,
pscp, pslurp, and pssh. Missing environment variables options support was added.
The idea behind this tool originally came from wanting to do something on each machine in our
network. Existing scripts would serially go to each machine, run the command, wait for it to finish,
and continue to the next machine. There was no reason why this couldn't be done in parallel. The
problems, however, were many. First of all, the output from finishing parallel jobs needs to be buffered
in such a way that different machines don't print their results on top of each other. A final
bit was added because it was nice to have output alphabetical rather than first-done, first-seen.
The result is a parallel job spawner that displays output from the machines alphabetically, as soon
as it is available. If ``alpha'' takes longer than ``zebra'', there will be no output past ``alpha''
until it is finished. As soon as ``alpha'' is finished, though, everyone's output is printed.
Sending a SIGUSR1 to gsh(1) will cause it to report which machines are still pending.
(Effectively turns on --debug for one cycle.)
Latest version is
1.0.2.
This tool was developed totally separately from
another tool which has the
same name. It looks like both this author and I had the same idea. The significant differences between
these versions appear to be that I cleaned up the macro language and added a lot of options for
behavior.
Well, this is seriously undocumented code. The short version is:
perl Makefile.PL
make
make install
And then create a file called /etc/ghosts which lists all the machines you
want to contact. It would look something like this:
# Macros
sunprod=solaris-e450
# Machines
#
# Name Group Hardware OS
bilbo prod intel linux
baggins prod e4500 solaris
tolkien devel e450 solaris
Machine groups are combined with "+"s and "-"s as you see fit:
ghosts intel+e450
ghosts prod-intel
The "ghosts" command just shows the resulting list. Give "gsh" a group to run
a command:
gsh devel+intel "cat /etc/motd"
You'll need to have ssh set up with trusted RSA keys, though. I should
cover that in here too, but it's REALLY late tonight, and I just want to
get this posted so my buddy will quit bugging me about downloading the
"latest" version. :P
See the TODO file for the huge list of things I need to do. Mostly
documentation. :)
Credit where credit is due: this is very loosely based on the "gsh" tool
that came (comes?) with the Perl distribution, and on extra work by
Mike Murphy. My version will do things in parallel, and does proper macro
expansions. It is released under the GNU General Public License.
Kees Cook
[email protected]
http://outflux.net/
I just discovered this very cool feature of Konsole. You can log into multiple servers (via ssh)
and run the same command in each Konsole tab at once. It's great when you have many computers with
same configuration. Just log in, and select one of Konsole's tabs to be the one to "broadcast" input
to all others. It works for all tabs in a single Konsole window.
It is also useful when you have several users on the same computer, and you wish to make sure all of
them have the same rights, and that they can perform some operations without stepping on each others
toes.
One of the problems is monitoring the effects of commands. Well, you can detach the tabs (Detach
Session menu item) after you set up the broadcasting. If you have large enough screen, you can set
up 8 or 9 windows nicely, and watch what's happening. Really useful stuff.
One warning though: don't forget to turn it off once you're done. It's easy to forget yourself and
start some clean-up job (rm -rf /) which is only meant for one of the machines.
Fanout and fanterm are two utilities that allow you
to run commands on multiple machines. The difference is that fanout only runs non-interactive commands
(like dd, cat, adduser, uname -a, etc.) and pipelines built of
these. The output is collected into a single display that can be viewed by less or redirected to
a file.
Fanterm, on the other hand, allows you to run interactive text mode commands on multiple machines
at the same time. Your keystrokes are sent to a shell or application running on each of the target
systems. The output from each system is shown in a separate xterm.
Fanout allows you to run non-interactive commands on remote machines simultaneously, collecting
the output in an organized fashion. The syntax is:
fanout [--noping] "{space separated list of systems}" "{commands to run}"
By default, fanout pings each of the remote machines to make sure they're up before trying to
ssh to them. If they're not pingable (because of a firewall), put --noping as the
first parameter.
A "System" is a bit of a misnomer; it could be a fully qualified domain, an entry in /etc/hosts,
an IP address, an entry in ~/.ssh/config, or any of those preceeded with user_account@
. In short, if you can type ssh something and get a command prompt, it can be used
as a "system" above.
You can run as many commands as you'd like on the remote systems. They need to be separated by
semicolons. You can also run pipelines of commands, such as the one in the example below.
If you set the SERVERS variable in your environment (I set a number of these in ~/.bash_profile),
you can run commands on these machines over and over:
export SERVERS="web1 web2 mail"
fanout "$SERVERS" "uname -a ; ( if [ -f /var/log/dmesg ]; then cat /var/log/dmesg ; else dmesg
; fi ) | egrep -i '(hd[a-h]|sd[a-h])' ; ls -al /proc/kcore ; cat /proc/cpuinfo" >serverspecs
Sample run
[wstearns@sparrow fanout]$ fanout "localhost wstearns@localhost aaa.bbb.ccc" "uptime" | less
aaa.bbb.ccc unavailable
Starting localhost
Starting wstearns@localhost
Fanout executing "uptime"
Start time Fri Apr 7 00:13:07 EDT 2000 , End time Fri Apr 7 00:13:20 EDT 2000
==== On aaa.bbb.ccc ====
==== Machine unreachable by ping
==== On localhost ====
12:13am up 3 days, 10:44, 0 users, load average: 0.17, 0.17, 0.22
==== As wstearns on localhost ====
12:13am up 3 days, 10:44, 0 users, load average: 0.15, 0.16, 0.22
The command(s) you execute run concurrently on each remote machine. Output does not show up until
all are done.
Fanterm is started in a similar way with a list of target machines (three in this example), and you'll
get 3 additional xterms. Type your commands in the original terminal; each command
will be sent to each machine and you'll see the output from each machine in the other xterms. This
even works for interactive commands like editors.
There isn't a Capistrano package in the Debian system, although it is a very good utility.
Here is a description of what Capistrano is:
Capistrano is a utility that can execute commands in parallel on multiple
servers. It allows you to define tasks, which can include commands that
are executed on the servers. You can also define roles for your servers,
and then specify that certain tasks apply only to certain roles.
AdminCoPy A wrapper for SSH/SCP
to run one command on many hosts.
ACP is basically a wrapper for SSH and SCP that allows a user to select, or manually enter, a
group of hosts to connect to. The user can run a command or copy some files/directories to multiple
hosts by issuing a single command on the "admin" host. It requires the fping program, as it checks
the specified hosts for connectivity, and will only try to run the command on/copy files to hosts
that are reachable.
A tool to run multiple commands on multiple hosts from a central shell over SSH.
It is a Perl program that uses Net::SSH::Perl and allows systems administrators to run multiple commands
on multiple hosts from one central host.
I work in a shop with approximately 100 servers running Linux and Solaris. Every
Tuesday we do a publish of our production content that requires me to login to
several machines to run the same commands on each. I thought that it was
ridiculous for me to open 8 terminal windows to run the same dumb commands over and
over again. Thus the birth of cssh. Cssh is a Perl script that I wrote to allow
me to admin my servers by performing any number of commands to any number of
servers from one centrally located shell login.
As I was writing cssh I started to think of other things I could put into cssh to
allow administration to be that much easier and more flexible. Thus came the
current project called cssh. The program is small all by itself. It's sort of
large, however, when you consider all the modules that are needed to make the
ssh portion of cssh work.
Cssh is under fairly heavy development. I schedule as much time as I can to work
on it around my personal life issues :). So, I have a fairly large todo list. I
am excited for others to test it out and give me feedback on what they feel should
be done on cssh.
You can click below for the current documentation. This project is very
new so please don't expect too much, yet.
Here is the file itself. Please read the documentation to learn about how to use
cssh. It might be a little confusing at first, but as you read through the
documentation, its use should become more understandable. Feel free to contact me for
support. Right now, since the project is so new I should be able to assist you. I
suspect that if this project matures that personal support will become increasingly
more difficult. We will cross that bridge when we get there.
Added: Small install script. See documentation for installation *not* using install
script. To install using the install script simply extract the tarball:
tar xvfz cssh_beta-0.03.2.tar.gz (or, depending on which compressed format you downloaded, tar xvfj cssh_beta-0.03.2.tar.bz2)
then: cd cssh_beta-0.03.2
and: ./install.sh < prefix >
where prefix is the path under which you want to install cssh.
If you run the install script as root then the cssh_configs directory will be installed
in /etc by default. The man page will be placed in /usr/share/man/man1 by default
as well, and if you don't specify a prefix as root then cssh will be installed in
/usr/bin (it should actually be /usr/local/bin) otherwise it will be installed in
< prefix >/bin. Currently, there is no way, short of changing the script a
little, to change this behavior. I whipped this up in a matter of minutes and tested
it for a few hours making little changes here and there. It's meant to make installation
a little easier. It will install Net::SSH::Perl for you. NOTE! You really should
run the installation script as root given the nature of Net::SSH::Perl.
MrTools is a suite of tools for managing large, distributed environments.
It can be used to execute scripts on multiple remote hosts without prior installation,
copy of a file or directory to multiple hosts as efficiently as possible in a relatively secure way,
and collect a copy of a file or directory from multiple hosts.
Release focus: Initial freshmeat announcement
Changes:
Hash tree cleanup in the thread tracking code was improved in all tools in the suite. MrTools has now
adopted version 3 of the GPL. A shell quoting issue in mrexec.pl was fixed. This fixed
several known limitations, including the ability to use mrexec.pl with Perl scripts and awk if statements.
This fix alone has redefined mrexec.pl's capabilities, making an already powerful tool even more
powerful.
Once installed, the pssh package provides a number of new commands:
parallel-slurp
This command allows you to copy files from multiple remote hosts to the local system. We'll
demonstrate the usage shortly.
parallel-ssh
This command allows you to run commands upon a number of systems in parallel. We'll also demonstrate
this command shortly.
parallel-nuke
This command lets you kill processes on multiple remote systems.
parallel-scp
This is the opposite of parallel-slurp and allows you to copy a file, or files,
to multiple remote systems.
General Usage
Each of the new commands installed by the pssh package will expect to read a list of hostnames
from a text file. This makes automated usage a little bit more straightforward, and simplifies
the command-line parsing.
Running Commands On Multiple Hosts
The most basic usage is to simply run a command upon each host, and not report upon the output.
For example, given the file hosts.txt containing a number of hostnames, we can run:
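A minimal sketch of such runs, using the Debian-style command names listed above; hosts.txt (one hostname per line) and the file paths are placeholders:

# run uptime on every host in hosts.txt without printing the remote output
parallel-ssh -h hosts.txt uptime
# the -i option prints each host's output inline instead
parallel-ssh -h hosts.txt -i uptime
# copy a local file to all hosts
parallel-scp -h hosts.txt /etc/ntp.conf /etc/ntp.conf
# fetch /var/log/messages from every host into per-host subdirectories under ./logs
parallel-slurp -h hosts.txt -L ./logs /var/log/messages messages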