Pdsh -- a multithreaded remote shell client

fanout · pdcp

pdsh ... |  dshbak
dsh
:
dshbak
-w TARGETS... 
-w host0,host1,host2... 
-
foo19.
 -w 
pdsh -w foo[01-05] command 
pdsh -w foo[7,9-10] command
pdsh -w foo[0-5] -x foo[1-3] command 
Run command on foo0-eth0,foo1-eth0,foo2-eth0,foo3-eth0  
 pdsh -w foo[0-3]-eth0 command
pdsh -w "foo[01-05]" command 
     host[0-10] -> host0,host1,host2,...host10
     host[0-2,10] -> host0,host1,host2,host10
^
Read hosts from /tmp/hosts    
 pdsh -w ^/tmp/hosts ... 

Also works for multiple files:
 pdsh -w ^/tmp/hosts,^/tmp/morehosts ...
/'
/
/node.*/
Select only hosts ending in a 0 via regex:
 pdsh -w host[0-20],/0$/ ...
-
Run on all hosts (-a) except host0:
 pdsh -a -w -host0 ...

Exclude all hosts ending in 0:
 pdsh -a -w -/0$/ ...

Exclude hosts in file /tmp/hosts:
 pdsh -a -w -^/tmp/hosts ...
ssh:user1@host0
ssh
host0
user1
Run with user `foo' on hosts h0,h1,h2, and user `bar' on hosts h3,h5:
 pdsh -w foo@h[0-2],bar@h[3,5] ...

Use ssh and user "u1" for hosts h[0-2]:
 pdsh -w ssh:u1@h[0-2] ...
pdsh_rcmd_type
-x TARGETS...
Exclude hosts ending in 0:
 pdsh -a -x /0$/ ...

Exclude hosts in file /tmp/hosts:
 pdsh -a -x ^/tmp/hosts ...

Run on hosts node1-node100, excluding node50:
  pdsh -w node[1-100] -x node50 ...
WCOLL
pdsh -w ^$WCOLL ...
/etc/genders
cnode[101-137],lmain        all
cnode[101-116]              category=default
cnode[117-120]              largememory
lmain                       headnode
cnode[121-136]              blades

 -g blades 
cnode121
cnode136 
nodename 		attr[=value],attr[=value],... 
nodename1,nodename2,... attr[=value],attr[=value],... 
nodenames[A-B] 		attr[=value],attr[=value],... 
[Ctrl-C]
pdsh@hpc137: interrupt (one more within 1 sec to abort)
pdsh@hpc137:  (^Z within 1 sec to cancel pending threads)
pdsh@hpc137: hpc0: connecting
pdsh@hpc137: hpc1: command in progress
pdsh@hpc137: hpc2: command in progress
pdsh@hpc137: hpc3: connecting
pdsh@hpc137: hpc4: connecting
...
[Ctrl-C Ctrl-Z]
pdsh@hpc137: interrupt (one more within 1 sec to abort)
pdsh@hpc137:  (^Z within 1 sec to cancel pending threads)
pdsh@hpc137: hpc0: connecting
pdsh@hpc137: hpc1: command in progress
pdsh@hpc137: hpc2: command in progress
pdsh@hpc137: hpc3: connecting
pdsh@hpc137: hpc4: connecting
pdsh@hpc137: hpc5: connecting
pdsh@hpc137: hpc6: connecting
pdsh@hpc137: Canceled 8 pending threads.=
pdsh -av 'grep . /proc/sys/kernel/ostype'
ehype78: Linux
ehype79: Linux
ehype76: Linux
ehype85: Linux
ehype77: Linux
ehype84: Linux
...
[Ctrl-C]
pdsh@hype137: interrupt (one more within 1 sec to abort)
pdsh@hype137:  (^Z within 1 sec to cancel pending threads)
pdsh@hype137: hype0: connecting
pdsh@hype137: hype1: command in progress
pdsh@hype137: hype2: command in progress
pdsh@hype137: hype3: connecting
pdsh@hype137: hype4: connecting
...
[Ctrl-C Ctrl-Z]
pdsh@hype137: interrupt (one more within 1 sec to abort)
pdsh@hype137:  (^Z within 1 sec to cancel pending threads)
pdsh@hype137: hype0: connecting
pdsh@hype137: hype1: command in progress
pdsh@hype137: hype2: command in progress
pdsh@hype137: hype3: connecting
pdsh@hype137: hype4: connecting
pdsh@hype137: hype5: connecting
pdsh@hype137: hype6: connecting
pdsh@hype137: Canceled 8 pending threads.
:
dshbak
# export PDSH_RCMD_TYPE=ssh   # To override rsh and make ssh the default

pdsh -w ssh:host[0-10]             # host0,host1,host2,...host10
pdsh -w ssh:host[0-2,10]           # host0,host1,host2,host10 
pdsh -w ^/tmp/hosts ...            # Read hosts from /tmp/hosts 
pdsh -w host[0-20],/0$/            # only hosts ending in a 0 via regex pdsh -a -w -host0 ...              # Run on all hosts (-a) except host0 pdsh -a -w -^/tmp/hosts ...        # Exclude hosts in file /tmp/hosts 
pdsh -a -x ^/tmp/hosts ...         # Same - Exclude hosts in file /tmp/hosts 
pdsh -w node[1-100] -x node50 ...  # Run on hosts node1-node100, excluding node50 
export WCOLL=/tmp/hosts            # WCOLL env variable containing a list of hosts to target 
pdsh hostname                      # running ‘hostname’ on hosts from WCOLL

$ pdsh -g all uname -r
indra: 3.2.0-3-amd64
kali: 3.2.0-3-amd64
ganesh: 3.2.0-3-amd64
hanuman: 3.2.0-2-686-pae
pdsh -g blades 'export http_proxy=http://proxy.mycorp.com:8080 && yum -y install gnuplot'

  pdsh -g blades 'cat /Templates/sys/proxy >> /root/.bashrc'
  pdsh -g blades tail -2 /root/.bashrc
  pdsh -g blades "groupadd -g 100000 testuser && useradd -u 100000 -g 100000 -G users -m testuser
  pdsh -g blades cp /home/DIRCOLOR /etc  # assuming home is shared for all nodes  
# vim /etc/profile.d/pdsh.sh
# setup pdsh for cluster users
export PDSH_RCMD_TYPE='ssh'
export WCOLL='/etc/pdsh/machines'
/etc/pdsh/machines
node1
node2
node3
.......
.......
 -a 
-g 
 -j
:
pdsh -av 'grep . /proc/sys/kernel/ostype'
hpc78: Linux
hpc79: Linux
hpc76: Linux
hpc85: Linux
hpc77: Linux
hpc84: Linux
...
 nodeattr -c "login||mgmt"
nodeattr -c "login&&cpus=4"
nodeattr -c "~(login||mgmt)" 
pdsh -w `nodeattr -c login` command
for i in `nodeattr -n login`; do rsh $i who; done
nodeattr head && echo yes
nodeattr -Q "head&&ntpserver" && echo yes
/etc/genders
hostname:
host0 pdsh_rcmd_type=ssh
pdsh -V
   pdsh -V
   pdsh-2.23 (+debug)
   rcmd modules: ssh,rsh,mrsh,exec (default: mrsh)
   misc modules: slurm,dshgroup,nodeupdown (*conflicting: genders)
   [* To force-load a conflicting module, use the -M <name> option]
conflicting
-L
> pdsh -L
8 modules loaded:

Module: misc/dshgroup
Author: Mark Grondona <[email protected]>
Descr:  Read list of targets from dsh-style "group" files
Active: yes
Options:
-g groupname      target hosts in dsh group "groupname"
-X groupname      exclude hosts in dsh group "groupname"

Module: rcmd/exec
Author: Mark Grondona <[email protected]>
Descr:  arbitrary command rcmd connect method
Active: yes

Module: misc/genders
Author: Jim Garlick <[email protected]>
Descr:  target nodes using libgenders and genders attributes
Active: no
Options:
-g query,...      target nodes using genders query
-X query,...      exclude nodes using genders query
-F file           use alternate genders file `file'
-i                request alternate or canonical hostnames if applicable
-a                target all nodes except those with "pdsh_all_skip" attribute
-A                target all nodes listed in genders database

...
genders
dshgroup
PDSH_MISC_MODULES
rcmd
pdsh -R exec -w foo[0-10] grep BUG console.%h
pdsh -a date
pdsh -a
      uptime
pdsh -a cat /etc/redhat-release

pdsh -w node[01-32] who

pdsh -a ps augx | grep pbs | grep -v grep

dshbak
 pdsh -w node[01-32] who | dshbak 
 pdsh -w node[01-32] who | dshbak -c 
 pdsh -a date | dshbak -c

option
Hm, this seems like a good idea, but I'm not sure dshbak is the right 
place for this. (That script is meant to simply reformat output which
is prefixed by "node: ")

If you'd like to track up/down nodes, you should check out Al Chu's
Cerebro and whatsup/libnodeupdown:

http://www.llnl.gov/linux/cerebro/cerebro.html
http://www.llnl.gov/linux/whatsup/

But I do realize that reporting nodes that did not respond to pdsh
would also be a good feature. However, it seems to me that pdsh itself
would have to do this work, because only it knows the list of hosts originally
targeted. (How would dshbak know this?)

As an alternative I sometimes use something like this:

 # pdsh -a true 2>&1 | sed 's/^[^:]*: //'  | dshbak -c
----------------
emcr[73,138,165,293,313,331,357,386,389,481,493,499,519,522,526,536,548,553,560,564,574,601,604,612,618,636,646,655,665,676,678,693,700-701,703,706,711,713,715,717-718,724,733,737,740,759,767,779,817,840,851,890]
----------------
 mcmd: connect failed: No route to host
----------------
emcrj
----------------
 mcmd: xpoll: protocol failure in circuit setup

i.e. strip off the leading pdsh@...: and send all errors to stdout. Then
 collect errors with dshbak to see which hosts are not reachable.

Maybe we should add an option to pdsh to issue a report of failed hosts
at the end of execution?

mark

> 

pdsh -w server[0-9],server10 'command1 ; command2 ;... ; command5' > logfile.txt

dshbak logfile.txt | less

pdsh -w server[0-9],server10 'command1 ; command2 ;... ; command5' | 
    dshbak > logfile.txt

# pdsh -g all uptime
kali:  12:03:18 up 33 days, 23 min,  3 users,  load average: 0.21, 0.06, 0.06
hanuman:  12:03:37 up 10 days, 21:59,  2 users,  load average: 0.04, 0.05, 0.05
indra:  12:02:57 up 13 days,  1:12,  3 users,  load average: 0.30, 0.26, 0.30
ganesh:  12:03:10 up 6 days, 10:11,  6 users,  load average: 1.18, 1.34, 1.35

server1  all,web
server2  all,web
server3  all,mail
server4  all,mail
server5  all,mysql
server6  all,mysql

PDSH_SSH_ARGS_APPEND="-q -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o PreferredAuthentications=publickey"
robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] "cat > ~/.screenrc <<EOF  
hardstatus alwayslastline \"%{= RY}%H %{kG}%{G} Screen(s): %{c}%w %=%{kG}%c  %D, %M %d %Y  LD:%l\"  
startup_message off  
msgwait 1  
defscrollback 100000  
nethack on  
EOF  
"
export PDSH_SSH_ARGS_APPEND="-o ConnectTimeout=5 -o CheckHostIP=no -o StrictHostKeyChecking=no"
sudo apt-get install pdsh

echo 'ssh' > /etc/pdsh/rcmd_default
$ pdsh -g all uname -r
indra: 3.2.0-3-amd64
kali: 3.2.0-3-amd64
ganesh: 3.2.0-3-amd64
hanuman: 3.2.0-2-686-pae

#! /bin/bash 

# a better version would use a command line arg (e.g. -h) to get a
# comma-separated list of hostnames, but hard-coding it here illustrates
# the concept well enough.
HOSTS="[email protected] [email protected]"

# last argument is the target directory on the remote hosts
target_dir="$BASH_ARGV"

# all but the last arg are the files to copy
files=${@:1:$((${#@} - 1))}

for h in $HOSTS; do
    scp $files "$h:$target_dir"
done
[David: This probably won't actually post, because I'm sending it from
another account than the one that's subscribed to the list.  I don't want to
fix this address issue right now.  Feel free to share the below with the
list.]

In HPC clusters, I use and love pdsh, built with genders support.

http://sourceforge.net/projects/pdsh/
http://sourceforge.net/projects/genders/

Some quick 'n' dirty examples follow, leaving out a whole lot of context and
discussion and fine points.  I just want to get across some of the joy of
having a tool like this. :)

Suppose you have a 1000-node cluster with hostnames like node0001 through
node1000, and pdsh with genders support.  You could create a file
/etc/genders with contents like this:

node[0001-1000],adm[01-03]    all
node[0001-0040]    compute
node[0001-0040]    rack1
...
node[0951-1000]    highmem
adm[01-03]            admin

and to find which compute nodes are down, do something like this:

pdsh -g compute true

whichever ones are down, will eventually return an ssh error.  (This assumes
passwordless ssh.)

To check only rack 1:  pdsh -g rack1 true
Or only your high-memory nodes: pdsh -g highmem true
Or only your administrative servers: pdsh -g admin true

Or to check whether all have the same filesystems mounted:

pdsh -g all "mount | sort" | dshbak -c

will return a summary of which nodes returned the same output.  It might
look like this:

node[0001-0432],node[0434-1000]
-----------
<some normal output for compute nodes>

admin[01-03]
-----------
<some normal output for admin nodes>

node0433
-----------
<some abnormal output>

and now you know you have a problem with node0433.

pdsh is fast.  Running the preceding command might take 5-10 seconds to get
the results on a typical 1000-node cluster, so it is quick and easy to use,
and supports quick work (especially ad hoc investigations of what's going
on).  The quickness is because pdsh forks several worker processes, 32 by
default; you can change that higher (for more parallelism) or lower (e.g. 1
to get serial operation) with the -f <number> option.

I also love the "exec" functionality of pdsh.  For example:

pdsh -R exec -g all ping -c1 -w1 %h

will ping every node instead of ssh'ing to every node to run a command.  The
advantage of doing this with pdsh instead of a loop is that it takes
advantage of the parallelism offered by pdsh.  Ping is a trivial example;
you can do lots of other things of course, like run an Expect script across
all your network switches. :)

David

On Mon, Jul 12, 2010 at 10:52 AM, David N. Blank-Edelman <dnb at ccs.neu.edu>wrote:

> Hi-
>  It's been a bit quiet here so I thought I'd ask a favorite tool question.
>
> I've been looking at the current crop of utilities that allow you to easily
> run a command on N of your machines in parallel.
>
> There are sort of two flavors:
>   1) those that let you type the same thing on N machines interactively
> (e.g. http://guichaz.free.fr/gsh/ or
> http://sourceforge.net/projects/clusterssh/) and
>
>   2) those that just run the command line as given (e.g.
> http://web.taranis.org/shmux/,
> http://www.netfort.gr.jp/~dancer/software/dsh.html<http://www.netfort.gr.jp/%7Edancer/software/dsh.html>,
> http://code.google.com/p/parallel-ssh/,
> http://sourceforge.net/projects/pdsh/,
> http://sourceforge.net/projects/mussh/).
>
> I'm mostly interested in the tools in category #2, but I'd be happy to hear
> about cool ones from #1 was well.
>
> What do you use and why? What do you like about it, dislike about it?
>
> Thanks!
>
>    -- dNb
> _______________________________________________
> sage-members mailing list
> sage-members at mailman.sage.org
> http://mailman.sage.org/mailman/listinfo/sage-members
$ pdsh -R ssh -w ^machines "<command>"

$ pdsh -w foo[1-5] echo "Hello World"
$ clush -w foo[1-5] echo "Hello World"
On 11 May 2010, at 19:20, Prentice Bisbal wrote:

> Since so many of you use and recommend pdsh, I have a few questions for
> you:
> 
> 1. Do you build and RPM from the .spec file, which doesn't support
> genders, or do you configure/compile yourself?

I build it myself.  From the top of my head the options I use are --with-ssh --without-rsh.  Last time i built it if both were built the default was to prefer rsh over ssh which should probably be changed at some point.

> 2. If not using genders, what is the syntax of the /etc/machines file? I
> assume it's the same as the gender file, but that's just a hunch.

It's just a flat list of hosts, one per line, although I believe it can take host-specs as well.  e.g compute[0-1023]

> 3. Are there any advantages/disadvantages to using machines over genders?

Genders is much more flexible, machines is easier to configure.

Two more things of note, "dshbak -c" is worth knowing about, pipe the output of pdsh into this and it'll sort the output by hostname and compress hosts with identical output into a single report.

The other really useful aspect of pdsh is the "-R exec" option, instead of running the command on a remote node it runs the command locally but replaces %h with the hostname.  One trivial example is "pdsh -a -R exec grep %h /var/log/messages | dskbak -c" but once you get used to it you can use it for much more advanced commands, earlier on today I ran "pdsh -w [0-25] -R exec tune2fs -O extents /dev/mapper/ost_%h" to re-tune all the devices in a lustre filesystem.

Ashley.

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
pdsh x86_64_fileservers reboot
On Mon, 7 Nov 2011 17:24:32 -0800, Michael Lampe <mlampe0@...> wrote:
> Mark A. Grondona wrote:
> 
> > You can set /proc/sys/net/ipv4/tcp_tw_recycle and
> > /proc/sys/net/tcp_tw_reuse on the nodes if you want to be
> > sure TIME_WAIT connections aren't interfering here, but
> > I really don't think that is the problem.
> 
> Yep, the problem is on the frontend, not on the nodes.
> 
> To repeat myself in this thread:
> 
> "The idiot has been identified -- it's me."
> 
> When I assembled & installed this cluster ~3 years ago, I thought it a 
> good idea to have it a firewall -- and why not allow the nodes also 
> access to the outer world?
> 
> Because I was too lame to learn iptables, I installed firestarter. A 
> nice little tool, that does exactly that with just a few clicks. It's 
> really a nice little tool, but it also installed a file 
> /etc/firestarter/sysctl-tuning, which -- just for your personal 
> amusement -- is attached.
> 
> Nobody ever complained about this "tuning" -- OK, MPI uses Infiniband, 
> but NFS (over TCP/IP) is as ok as it can be. And I'm also eating my dog 
> food ...
> 
> So again: Thanks a bunch!! (And sorry for the wasted time.)

No problem, and glad to hear things are working now!? ;-)

> 
> -Michael
> 
> 
> ======================================================================
> 
> 
> # --------( Sysctl Tuning - Recommended Parameters )--------
> 
> # Turn off IP forwarding by default
> # (this will be enabled if you require masquerading)
> 
> if [ -e /proc/sys/net/ipv4/ip_forward ]; then
>    echo 0 > /proc/sys/net/ipv4/ip_forward
> fi
> 
> # Do not log 'odd' IP addresses (excludes 0.0.0.0 & 255.255.255.255)
> 
> if [ -e /proc/sys/net/ipv4/conf/all/log_martians ]; then
>    echo 0 > /proc/sys/net/ipv4/conf/all/log_martians
> fi
> 
> 
> # --------( Sysctl Tuning - TCP Parameters )--------
> 
> # Turn off TCP Timestamping in kernel
> if [ -e /proc/sys/net/ipv4/tcp_timestamps ]; then
>    echo 0 > /proc/sys/net/ipv4/tcp_timestamps
> fi
> 
> # Set TCP Re-Ordering value in kernel to '5'
> if [ -e /proc/sys/net/ipv4/tcp_reordering ]; then
>    echo 5 > /proc/sys/net/ipv4/tcp_reordering
> fi
> 
> # Turn off TCP ACK in kernel
> if [ -e /proc/sys/net/ipv4/tcp_sack ]; then
>    echo 0 > /proc/sys/net/ipv4/tcp_sack
> fi
> 
> #Turn off TCP Window Scaling in kernel
> if [ -e /proc/sys/net/ipv4/tcp_window_scaling ]; then
>    echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
> fi
> 
> #Set Keepalive timeout to 1800 seconds
> if [ -e /proc/sys/net/ipv4/tcp_keepalive_time ]; then
>    echo 1800 > /proc/sys/net/ipv4/tcp_keepalive_time
> fi
> 
> #Set FIN timeout to 30 seconds
> if [ -e /proc/sys/net/ipv4/tcp_fin_timeout ]; then
>    echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
> fi
> 
> # Set TCP retry count to 3
> if [ -e /proc/sys/net/ipv4/tcp_retries1 ]; then
>    echo 3 > /proc/sys/net/ipv4/tcp_retries1
> fi
> 
> #Turn off ECN notification in kernel
> if [ -e /proc/sys/net/ipv4/tcp_ecn ]; then
>    echo 0 > /proc/sys/net/ipv4/tcp_ecn
> fi
> 
> 
> # --------( Sysctl Tuning - SYN Parameters )--------
> 
> # Turn on SYN cookies protection in kernel
> if [ -e /proc/sys/net/ipv4/tcp_syncookies ]; then
>    echo 1 > /proc/sys/net/ipv4/tcp_syncookies
> fi
> 
> 
> # Set SYN ACK retry attempts to '3'
> if [ -e /proc/sys/net/ipv4/tcp_synack_retries ]; then
>    echo 3 > /proc/sys/net/ipv4/tcp_synack_retries
> fi
> 
> # Set SYN backlog buffer to '64'
> if [ -e /proc/sys/net/ipv4/tcp_max_syn_backlog ]; then
>    echo 64 > /proc/sys/net/ipv4/tcp_max_syn_backlog
> fi
> 
> # Set SYN retry attempts to '6'
> if [ -e /proc/sys/net/ipv4/tcp_syn_retries ]; then
>    echo 6 > /proc/sys/net/ipv4/tcp_syn_retries
> fi
> 
> 
> # --------( Sysctl Tuning - Routing / Redirection Parameters )--------
> 
> # Turn on source address verification in kernel
> if [ -e /proc/sys/net/ipv4/conf/all/rp_filter ]; then
>    for f in /proc/sys/net/ipv4/conf/*/rp_filter
>    do
>     echo 1 > $f
>    done
> fi
> 
> # Turn off source routes in kernel
> if [ -e /proc/sys/net/ipv4/conf/all/accept_source_route ]; then
>    for f in /proc/sys/net/ipv4/conf/*/accept_source_route
>    do
>     echo 0 > $f
>    done
> fi
> 
> # Do not respond to 'redirected' packets
> if [ -e /proc/sys/net/ipv4/secure_redirects ]; then
>    echo 0 > /proc/sys/net/ipv4/secure_redirects
> fi
> 
> # Do not reply to 'redirected' packets if requested
> if [ -e /proc/sys/net/ipv4/send_redirects ]; then
>    echo 0 > /proc/sys/net/ipv4/send_redirects
> fi
> 
> # Do not reply to 'proxyarp' packets
> if [ -e /proc/sys/net/ipv4/proxy_arp ]; then
>    echo 0 > /proc/sys/net/ipv4/proxy_arp
> fi
> 
> # Set FIB model to be RFC1812 Compliant
> # (certain policy based routers may break with this - if you find
> #  that you can't access certain hosts on your network - please set
> #  this option to '0' - which is the default)
> 
> if [ -e /proc/sys/net/ipv4/ip_fib_model ]; then
>    echo 2 > /proc/sys/net/ipv4/ip_fib_model
> fi
> 
> # --------( Sysctl Tuning - ICMP/IGMP Parameters )--------
> 
> # ICMP Dead Error Messages protection
> if [ -e /proc/sys/net/ipv4/icmp_ignore_bogus_error_responses ]; then
>    echo 1 > /proc/sys/net/ipv4/icmp_ignore_bogus_error_responses
> fi
> 
> # ICMP Broadcasting protection
> if [ -e /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts ]; then
>    echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
> fi
> 
> # IGMP Membership 'overflow' protection
> # (if you are planning on running your box as a router - you should either
> #  set this option to a number greater than 5, or disable this protection
> #  altogether by commenting out this option)
> 
> if [ -e /proc/sys/net/ipv4/igmp_max_memberships ]; then
>    echo 1 > /proc/sys/net/ipv4/igmp_max_memberships
> fi
> 
> 
> # --------( Sysctl Tuning - Miscellanous Parameters )--------
> 
> # Set TTL to '64' hops
> # (If you are running a masqueraded network, or use policy-based
> #  routing - you may want to increase this value depending on the load
> #  on your link.)
> 
> if [ -e /proc/sys/net/ipv4/conf/all/ip_default_ttl ]; then
>    for f in /proc/sys/net/ipv4/conf/*/ip_default_ttl
>    do
>     echo 64 > $f
>    done
> fi
> 
> # Always defragment incoming packets
> # (Some cable modems [ Optus @home ] will suffer intermittent connection
> #  droputs with this setting. If you experience problems, set this to '0')
> 
> if [ -e /proc/sys/net/ipv4/ip_always_defrag ]; then
>    echo 1 > /proc/sys/net/ipv4/ip_always_defrag
> fi
> 
> # Keep packet fragments in memory for 8 seconds
> # (Note - this option has no affect if you turn packet defragmentation
> #  (above) off!)
> 
> if [ -e /proc/sys/net/ipv4/ipfrag_time ]; then
>    echo 8 > /proc/sys/net/ipv4/ipfrag_time
> fi
> 
> # Do not reply to Address Mask Notification Warnings
> # (If you are using your machine as a DMZ router or a PPP dialin server
> #  that relies on proxy_arp requests to provide addresses to it's clients
> #  you may wish to disable this option by setting the value to '1'
> 
> if [ -e /proc/sys/net/ipv4/ip_addrmask_agent ]; then
>    echo 0 > /proc/sys/net/ipv4/ip_addrmask_agent
> fi
> 
> if [ "$EXT_PPP" = "on" ]; then
>      # Turn on dynamic TCP/IP address hacking
>      # (Some broken PPPoE clients require this option to be enabled)
>      if [ -e /proc/sys/net/ipv4/ip_dynaddr ]; then
>          echo 1 > /proc/sys/net/ipv4/ip_dynaddr
>      fi
> else
>      if [ -e /proc/sys/net/ipv4/ip_dynaddr ]; then
>          echo 0 > /proc/sys/net/ipv4/ip_dynaddr
>      fi
>      fi
> fi
> # --------( Sysctl Tuning - IPTables Specific Parameters )--------
> 
> # Doubling current limit for ip_conntrack
> if [ -e /proc/sys/net/ipv4/ip_conntrack_max ]; then
>    echo 16384 > /proc/sys/net/ipv4/ip_conntrack_max
> fi 
# vim /etc/profile.d
# setup pdsh for cluster users
export PDSH_RCMD_TYPE='ssh'
export WCOLL='/etc/pdsh/machines'
# vim /etc/pdsh/machines/
node1
node2
node3
.......
.......
# pdsh -a "rpm -Uvh /root/htop-1.0.2-1.el6.rf.x86_64.rpm"
# pdsh -w host1,host2 "rpm -Uvh /root/htop-1.0.2-1.el6.rf.x86_64.rpm"
$ pdsh -w 192.168.1.250 uname -r
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
$ pdsh -w ssh:[email protected] uname -r
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
[laytonjb@home4 ~]$ mkdir PDSH
[laytonjb@home4 ~]$ cd PDSH
[laytonjb@home4 PDSH]$ vi hosts
[laytonjb@home4 PDSH]$ more hosts
192.168.1.4
192.168.1.250
export WCOLL=/home/laytonjb/PDSH/hosts

$ pdsh -w 192.168.1.4,192.168.1.250 uname -r
192.168.1.4: 2.6.32-431.17.1.el6.x86_64
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
pdsh -w host[1-11]
pdsh -w host[1-4,8-11]
$ more /tmp/hosts
192.168.1.4
$ more /tmp/hosts2
192.168.1.250
$ pdsh -w ^/tmp/hosts,^/tmp/hosts2 uname -r
192.168.1.4: 2.6.32-431.17.1.el6.x86_64
192.168.1.250: 2.6.32-431.11.2.el6.x86_64

$ pdsh -w -192.168.1.250 uname -r
192.168.1.4: 2.6.32-431.17.1.el6.x86_64

$ pdsh -w -^/tmp/hosts2 uname -r
192.168.1.4: 2.6.32-431.17.1.el6.x86_64

$ pdsh -x 192.168.1.4 uname -r
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
$ pdsh -x ^/tmp/hosts uname -r
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
$ more /tmp/hosts
192.168.1.4

robin@RNMMBP $ pdsh -w root@rnmcluster02-node0[1-4] date
rnmcluster02-node01: Fri Nov 28 17:26:17 GMT 2014  
rnmcluster02-node02: Fri Nov 28 17:26:18 GMT 2014  
rnmcluster02-node03: Fri Nov 28 17:26:18 GMT 2014  
rnmcluster02-node04: Fri Nov 28 17:26:18 GMT 2014
pdsh -w root@rnmcluster02-node0[1-4] yum install -y collectl  
pdsh -w root@rnmcluster02-node0[1-4] service collectl start  
pdsh -w root@rnmcluster02-node0[1-4] chkconfig collectl on

robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] yum install -y ntp ntpdate
robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] ln -sf /usr/share/zoneinfo/Europe/London /etc/localtime

robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] ntpdate pool.ntp.org  
rnmcluster02-node03: 30 Nov 20:46:22 ntpdate[27610]: step time server 176.58.109.199 offset -2.928585 sec  
rnmcluster02-node02: 30 Nov 20:46:22 ntpdate[28527]: step time server 176.58.109.199 offset -2.946021 sec  
rnmcluster02-node04: 30 Nov 20:46:22 ntpdate[27615]: step time server 129.250.35.250 offset -2.915713 sec  
rnmcluster02-node01: 30 Nov 20:46:25 ntpdate[29316]: 178.79.160.57 rate limit response from server.  
rnmcluster02-node01: 30 Nov 20:46:22 ntpdate[29316]: step time server 176.58.109.199 offset -2.925016 sec

robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] chkconfig ntpd on

robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] service ntpd start

robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] "cat > ~/.screenrc <<EOF  
hardstatus alwayslastline \"%{= RY}%H %{kG}%{G} Screen(s): %{c}%w %=%{kG}%c  %D, %M %d %Y  LD:%l\"  
startup_message off  
msgwait 1  
defscrollback 100000  
nethack on  
EOF  
"

robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] "date;sleep 5;date"  
rnmcluster02-node01: Sun Nov 30 20:57:06 GMT 2014  
rnmcluster02-node03: Sun Nov 30 20:57:06 GMT 2014  
rnmcluster02-node04: Sun Nov 30 20:57:06 GMT 2014  
rnmcluster02-node02: Sun Nov 30 20:57:06 GMT 2014  
rnmcluster02-node01: Sun Nov 30 20:57:11 GMT 2014  
rnmcluster02-node03: Sun Nov 30 20:57:11 GMT 2014  
rnmcluster02-node04: Sun Nov 30 20:57:11 GMT 2014  
rnmcluster02-node02: Sun Nov 30 20:57:11 GMT 2014

robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node0[1-4] date;sleep 5;date  
rnmcluster02-node03: Sun Nov 30 20:57:53 GMT 2014  
rnmcluster02-node04: Sun Nov 30 20:57:53 GMT 2014  
rnmcluster02-node02: Sun Nov 30 20:57:53 GMT 2014  
rnmcluster02-node01: Sun Nov 30 20:57:53 GMT 2014  
Sun 30 Nov 2014 20:58:00 GMT

robin@RNMMBP $ pdsh -w root@rnmcluster02-node[01-4] "chkconfig collectl on && service collectl start"

rnmcluster02-node03: Starting collectl: [  OK  ]  
rnmcluster02-node02: Starting collectl: [  OK  ]  
rnmcluster02-node04: Starting collectl: [  OK  ]  
rnmcluster02-node01: Starting collectl: [  OK  ]

robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node[01-4] "chkconfig | grep collectl"  
rnmcluster02-node03: collectl           0:off   1:off   2:on    3:on    4:on    5:on    6:off  
rnmcluster02-node01: collectl           0:off   1:off   2:on    3:on    4:on    5:on    6:off  
rnmcluster02-node04: collectl           0:off   1:off   2:on    3:on    4:on    5:on    6:off  
rnmcluster02-node02: collectl           0:off   1:off   2:on    3:on    4:on    5:on    6:off

robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node[01-4] chkconfig | grep collectl  
rnmcluster02-node02: collectl           0:off   1:off   2:on    3:on    4:on    5:on    6:off  
rnmcluster02-node04: collectl           0:off   1:off   2:on    3:on    4:on    5:on    6:off  
rnmcluster02-node03: collectl           0:off   1:off   2:on    3:on    4:on    5:on    6:off  
rnmcluster02-node01: collectl           0:off   1:off   2:on    3:on    4:on    5:on    6:off

robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node[01-4] "chkconfig>/tmp/pdsh.out"  
robin@RNMMBP ~ $ ls -l /tmp/pdsh.out  
ls: /tmp/pdsh.out: No such file or directory

robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node[01-4] chkconfig>/tmp/pdsh.out  
robin@RNMMBP ~ $ ls -l /tmp/pdsh.out  
-rw-r--r--  1 robin  wheel  7608 30 Nov 19:23 /tmp/pdsh.out

robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node[01-4] sleep 30  
^Cpdsh@RNMMBP: interrupt (one more within 1 sec to abort)
pdsh@RNMMBP:  (^Z within 1 sec to cancel pending threads)  
pdsh@RNMMBP: rnmcluster02-node01: command in progress  
pdsh@RNMMBP: rnmcluster02-node02: command in progress  
pdsh@RNMMBP: rnmcluster02-node03: command in progress  
pdsh@RNMMBP: rnmcluster02-node04: command in progress

robin@RNMMBP ~ $ pdsh -w root@rnmcluster02-node[01-4] sleep 30  
^Cpdsh@RNMMBP: interrupt (one more within 1 sec to abort)
pdsh@RNMMBP:  (^Z within 1 sec to cancel pending threads)  
pdsh@RNMMBP: rnmcluster02-node01: command in progress  
pdsh@RNMMBP: rnmcluster02-node02: command in progress  
pdsh@RNMMBP: rnmcluster02-node03: command in progress  
pdsh@RNMMBP: rnmcluster02-node04: command in progress  
^Csending SIGTERM to ssh rnmcluster02-node01
sending signal 15 to rnmcluster02-node01 [ssh] pid 26534  
sending SIGTERM to ssh rnmcluster02-node02  
sending signal 15 to rnmcluster02-node02 [ssh] pid 26535  
sending SIGTERM to ssh rnmcluster02-node03  
sending signal 15 to rnmcluster02-node03 [ssh] pid 26533  
sending SIGTERM to ssh rnmcluster02-node04  
sending signal 15 to rnmcluster02-node04 [ssh] pid 26532  
pdsh@RNMMBP: interrupt, aborting.

robin@RNMMBP ~ $ pdsh -f 2 -w root@rnmcluster02-node[01-4] "sleep 5;date"  
^Cpdsh@RNMMBP: interrupt (one more within 1 sec to abort)
pdsh@RNMMBP:  (^Z within 1 sec to cancel pending threads)  
pdsh@RNMMBP: rnmcluster02-node01: command in progress  
pdsh@RNMMBP: rnmcluster02-node02: command in progress  
^Zpdsh@RNMMBP: Canceled 2 pending threads.
rnmcluster02-node01: Mon Dec  1 21:46:35 GMT 2014  
rnmcluster02-node02: Mon Dec  1 21:46:35 GMT 2014

The authenticity of host 'rnmcluster02-node02 (172.28.128.9)' can't be established.  
RSA key fingerprint is 00:c0:75:a8:bc:30:cb:8e:b3:8e:e4:29:42:6a:27:1c.  
Are you sure you want to continue connecting (yes/no)?

export PDSH_SSH_ARGS_APPEND="-q -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null"

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!  
Someone could be eavesdropping on you right now (man-in-the-middle attack)!  
It is also possible that a host key has just been changed.  
The fingerprint for the RSA key sent by the remote host is  
00:c0:75:a8:bc:30:cb:8e:b3:8e:e4:29:42:6a:27:1c.  
Please contact your system administrator.

pdsh
pdsh
pdsh
pdsh -a <command>
<command>
pdsh -a -x node00 <command>
<command>
pdsh -w node[01-08] <command>
<command>
node01, node02, ..., node08
pdsh
dshbak
pdsh
pdsh -a date
pdsh -a uptime
mpd
pdsh -a ps augx | grep mpd
pdsh -a mpdcleanup 
pdsh -a /bin/rm -f /tmp/pvm*
dshbak
pdsh
-c
pdsh -a ls -l /etc/ntp | dshbak -c
[amit@onyx amit]$ pdsh -a ls -l /etc/ntp | dshbak -c
----------------
ws[01-16]
----------------
 total 16
 -rw-r--r--  1 root root   8 Jun  4 11:53 drift
 -rw-------  1 root root 266 Jun  4 11:53 keys
 -rw-r--r--  1 root root  13 Jun  4 11:53 ntpservers
 -rw-r--r--  1 root root  13 Jun  4 11:53 step-tickers
----------------
ws00
----------------
 total 16
 -rw-r--r--    1 ntp      ntp             8 Sep  5 21:51 drift
 -rw-------    1 ntp      ntp           266 Feb 13  2003 keys
 -rw-r--r--    1 root     root           58 Oct  3  2003 ntpservers
 -rw-r--r--    1 ntp      ntp            23 Oct  3  2003 step-tickers
----------------
ws[17-32]
----------------
 total 16
 -rw-r--r--  1 root root   8 May 27 13:31 drift
 -rw-------  1 root root 266 May 27 13:31 keys
 -rw-r--r--  1 root root  13 May 27 13:31 ntpservers
 -rw-r--r--  1 root root  13 May 27 13:31 step-tickers
[amit@onyx amit]$
./configure --with-ssh --without-rsh
make
make install
[laytonjb@home4 ~]$ pdsh -v
pdsh: invalid option -- 'v'
Usage: pdsh [-options] command ...
-S                return largest of remote command return values
-h                output usage menu and quit
-V                output version information and quit
-q                list the option settings and quit
-b                disable ^C status feature (batch mode)
-d                enable extra debug information from ^C status
-l user           execute remote commands as user
-t seconds        set connect timeout (default is 10 sec)
-u seconds        set command timeout (no default)
-f n              use fanout of n nodes
-w host,host,...  set target node list on command line
-x host,host,...  set node exclusion list on command line
-R name           set rcmd module to name
-M name,...       select one or more misc modules to initialize first
-N                disable hostname: labels on output lines
-L                list info on all loaded modules and exit
available rcmd modules: ssh,exec (default: ssh)
export PDSH_RCMD_TYPE=ssh
$ pdsh -w 192.168.1.250 ls -s
pdsh@home4: 192.168.1.250: rcmd: socket: Permission denied
$ pdsh -w 192.168.1.250 uname -r
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
$ pdsh -w ssh:[email protected] uname -r
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
[laytonjb@home4 ~]$ mkdir PDSH
[laytonjb@home4 ~]$ cd PDSH
[laytonjb@home4 PDSH]$ vi hosts
[laytonjb@home4 PDSH]$ more hosts
192.168.1.4
192.168.1.250
export WCOLL=/home/laytonjb/PDSH/hosts
$ pdsh -w 192.168.1.4,192.168.1.250 uname -r
192.168.1.4: 2.6.32-431.17.1.el6.x86_64
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
pdsh -w host[1-11]
pdsh -w host[1-4,8-11]
[laytonjb@home4 ~]$ pdsh -w ^/tmp/hosts uptime
192.168.1.4:  15:51:39 up  8:35, 12 users,  load average: 0.64, 0.38, 0.20
192.168.1.250:  15:47:53 up 2 min,  0 users,  load average: 0.10, 0.10, 0.04
[laytonjb@home4 ~]$ more /tmp/hosts
192.168.1.4
192.168.1.250
$ more /tmp/hosts
192.168.1.4
$ more /tmp/hosts2
192.168.1.250
$ pdsh -w ^/tmp/hosts,^/tmp/hosts2 uname -r
192.168.1.4: 2.6.32-431.17.1.el6.x86_64
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
$ pdsh -w -192.168.1.250 uname -r
192.168.1.4: 2.6.32-431.17.1.el6.x86_64
$ pdsh -w -^/tmp/hosts2 uname -r
192.168.1.4: 2.6.32-431.17.1.el6.x86_64
$ pdsh -x 192.168.1.4 uname -r
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
$ pdsh -x ^/tmp/hosts uname -r
192.168.1.250: 2.6.32-431.11.2.el6.x86_64
$ more /tmp/hosts
192.168.1.4
[laytonjb@home4 ~]$ pdsh 'cat /proc/cpuinfo | grep bogomips'
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.250: bogomips : 5624.23
192.168.1.250: bogomips : 5624.23
192.168.1.250: bogomips : 5624.23
192.168.1.250: bogomips : 5624.23
[laytonjb@home4 ~]$ pdsh 'cat /proc/cpuinfo' | grep bogomips
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.4: bogomips   : 6997.39
192.168.1.250: bogomips : 5624.23
192.168.1.250: bogomips : 5624.23
192.168.1.250: bogomips : 5624.23
192.168.1.250: bogomips : 5624.23
laytonjb@home4 ~]$ pdsh vmstat 1 2
192.168.1.4: procs  ------------memory------------   ---swap-- -----io---- --system--  -----cpu-----
192.168.1.4:  r  b     swpd   free    buff   cache     si   so    bi    bo   in    cs  us sy id wa st
192.168.1.4:  1  0        0 30198704  286340  751652    0    0     2     3   48    66   1  0 98  0  0
192.168.1.250: procs -----------memory------------   ---swap-- -----io---- --system-- ------cpu------
192.168.1.250:  r  b   swpd   free    buff   cache     si   so    bi    bo   in    cs us sy  id wa st
192.168.1.250:  0  0      0 7248836   25632  79268      0    0    14     2   22    21  0  0  99  0  0
192.168.1.4:    1  0      0 30198100  286340 751668     0    0     0     0  412   735  1  0  99  0  0
192.168.1.250:  0  0      0 7249076   25632  79284      0    0     0     0   90    39  0  0 100  0  0

Softpanorama May the source be with you, but remember the KISS principle ;-)	Home	Switchboard	Unix Administration	Red Hat	TCP/IP Networks	Neoliberalism	Toxic Managers
	(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and bastardization of classic Unix

News	Enterprise Unix System Administration	Recommended Links	Unix Configuration Management Tools	Examples	dshbak - filter that normalizes pdsh output	pdcp - copy files to groups of hosts in parallel	rpdcp
rsync	rdist	Cluster SSH	Mussh	Pdsh a multithreaded remote shel	pdcp	pssh	parallel
SSH Power Tool	Tentakel	Multi Remote Tools	Slurping	C3 Tools	Perl Admin Tools and Scripts	Grid Engine	Unix System Monitoring
SSH Usage in Pipes	Password-less SSH login	scp	sftp	Tips	History	Humor	Etc

Top Visited <p>Your browser does not support iframes.</p>					Switchboard
					Latest
					Past week
					Past month

Genders

	Jim Garlick

	High Performance Computing Systems Group
	Livermore Computing
	Lawrence Livermore National Laboratory

	June 19, 2000

	(Revised by Albert Chu - May 2003, April 2004)
	(Revised by Ben Casses - July 2015)

	------------------------------------------------------------------------
	NOTE: This document is part of the "Genders" software package,
	approved by LLNL for public release. Please see the DISCLAIMER file
	distributed with the package for restrictions.
	------------------------------------------------------------------------

	Introduction

	Genders is a simple concept that has been extended to address configuration
	management issues that arise on MPP clusters[1]. The basic concept is that a
	genders file containing a list of node names and their attributes is
	replicated on each node in a cluster. By representing the configuration of
	an MPP in this file, scripts and rdist Distfiles that might otherwise
	contain hard-coded node lists, can be generalized to work with different
	configurations of the same cluster, and even with multiple clusters.

	The framework has proven to be useful on
	ASCI Blue-Pacific SST system at 1464 nodes, to collections of two or three
	workstations.

	Basic Operation

	This section describes the formats of the genders file and the nodeattr
	command. The genders file, briefly described in the introduction, contains
	a list of node names and attributes. The genders file is typically found
	in /etc (on Linux when installed from the genders RPM) or /admin/etc (on
	other systems).

	Genders Format

	Each line of the genders file may have one of the following formats. See
	the section "Host Ranges" below for most information on host ranges.

	nodename attr[=value],attr[=value],...
	nodename1,nodename2,... attr[=value],attr[=value],...
	nodenames[A-B] attr[=value],attr[=value],...

	The nodename(s) is the shortened[2] hostname of a node. This is followed by
	any number of spaces or tabs, and then the comma-separated list of attributes,
	each of which can optionally have a value. The substitution string "%n" can
	be used in an attribute value to represent the nodename. Nodenames can be
	listed on multiple lines, so a node's attributes can be specified on multiple
	lines. However, no single node may have duplicate attributes.

	There must be no spaces embedded in the attribute list and there is
	no provision for continuation lines. Commas and equal characters are special
	and cannot appear in attribute names or their values. Comments are prefixed
	with the hash chracter (#) and can appear anywhere in the file. The active
	genders file is typically found at /etc/genders or /admin/etc/genders.

	Here is an example genders file from a small 16-node linux cluster:

	# slc cluster genders file
	slci,slcj,slc[0-15] eth2=e%n,cluster=slc,all
	slci passwdhost
	slci,slcj management
	slc[1-15] compute

	Host Ranges

	As noted in section above, the genders database accepts ranges of
	nodenames in the general form: prefix[n-m,l-k,...], where n < m and l <
	k, etc., as an alternative to explicit lists of nodenames.

	This range syntax is meant only as a convenience on clusters with a
	prefixNN naming convention and specification of ranges should not be
	considered necessary -- the list foo1,foo9 could be specified as such,
	or by the range foo[1,9].

	Some examples of range usage follow:

	foo01,foo02,foo03,foo04,foo05: foo[01-05]
	foo3,foo7,foo9,foo11: foo[3,7,9-11]
	fooi,fooj,foo0,foo1,foo2: fooi,fooj,foo[0-2]

	Host ranges with suffixes, i.e. foo[0-2]x, are generally supported, but behavior
	may differ.

	Nodeattr Usage

	A program called nodeattr is used to query data in the genders file. Because
	the genders file is replicated on all nodes in a cluster, this query is a
	local operation, not dependent on the network. Nodeattr is invoked as
	follows:

	nodeattr [-f genders] [-q \| -c \| -n \| -s] [-r] attr[=val]
	nodeattr [-f genders] [-v] [node] attr[=val]
	nodeattr [-f genders] -l [node]
	nodeattr [-f genders] -k

	The -f option specifies a genders file path other than the default.

	The -q, -c, -n, or -s options followed by an attribute name cause a list of
	nodes having the specified attribute to be printed on stdout, formatted
	according to the option specified: -q (default) is host range, -c is
	comma-separated, -n is newline separated, and -s is space separated.

	If none of the formatting options are specified, nodeattr returns a zero
	value if the local node has the specified attribute, else nonzero. The -v
	option causes any value associated with the attribute to go to stdout. If a
	node name is specified before the attribute, it is queried instead of the
	local node.

	The -l option is used to generate a list of attributes for a particular
	node. If no node is listed, all attributes appearing in the genders file
	will be listed.

	The -k option checks the genders file for proper formatting. Information
	about improper formatting will be output to standard error.

	Nodeattr can usually be found in /usr/bin (Linux RPM) or /admin/scripts
	(other systems).

	Programming with Genders

	Three different APIs have been developed that allow programmers to parse
	and query the genders file. "libgenders" is a C API which can be linked
	by including genders.h and linking the library libgenders. "Libgenders"
	is a Perl module developed with Perl extensions. "Genders" is another
	Perl modules, but offers a more traditional Perl API than "Libgenders."
	The reader can learn more about these libraries by reading the manpages
	libgenders(3), Libgenders(3), and Genders(3).

	Example: Rc Script

	The nodeattr command can enable a single script to behave differently
	depending on the role or gender of the node it is executing on, simplifying
	configuration management of the MPP. For example, all nodes in an MPP
	could have the same rc.local file:

	#!/bin/sh
	if nodeattr raid; then
	initialize_raid
	fi
	if nodeattr login;
	then sshd
	fi
	...

	If the node has "raid" in its attribute list, it would execute
	initialize_raid; if it had "login" in its attribute list, it would start the
	secure shell daemon. This technique is simpler and less prone to failure
	than adding code to test for "clues" to a nodes function, such as the
	existence of a RAID device in /dev or a different set of installed products
	on a login node. Further, changing the role of a node does not require
	a laborious audit of scripts; one simply changes the genders file and
	all genders-enabled scripts change their behavior.

	Attribute values might be used to store node configuration information such
	as a node's default IP route. For example, the node may have an attribute:

	iprouter=big-rtr.llnl.gov

	and rc script could extract this value and pass it to the route command:

	route add net default `nodeattr -v iprouter`

	Example: Host Lists

	Host lists are commonly needed on MPP's to set up parallel jobs or to run
	system administration commands on particular sets of nodes. In the following
	example, nodeattr generates a newline-separated list of nodes with the login
	attribute to pass to the fping parallel ping command:

	fping `nodeattr -c login`

	Batch system configurations typically need lists of nodes that belong to
	batch queues. These might be generated by assigning nodes belonging to a
	particular queue an attribute in the genders file and then using nodeattr to
	generate the batch configuration. For example, an NQS manager host might get
	its configuration as follows, assuming the nqsmgr and nqscompute attributes
	are appropriately assigned in the genders file:

	#!/bin/sh

	#
	# Configure queues on the nqsmgr node.
	#
	if nodeattr nqsmgr; then

	# All nodes with the nqscompute attribute will be B queue destinations.
	# Make a comma-separated list of B@host destinations.
	for host in `nodeattr -n nqscompute`; do
	CPU_QUEUES=${CPU_QUEUES}B@${host},
	done
	CPU_QUEUES=`echo $CPU_QUEUES\|sed 's/,$//'`

	echo Adding manager queues
	qmgr << EOT
	set log_file /var/spool/nqs/log
	set shell_strategy fixed = (/bin/sh)
	set default destination_retry wait 1
	create pipe_queue X priority = 10 \
	server = (/usr/lib/nqs/pipeclient) \
	destination = ($CPU_QUEUES)
	run_limit = 1
	set default batch_request queue X
	enable queue X
	start queue X
	exit
	EOT
	fi

	#
	# Configure queues on the nqscompute nodes.
	#
	if nodeattr nqscompute; then
	echo Adding compute queues
	qmgr << EOT2
	set log_file /var/spool/nqs/log
	set shell_strategy fixed = (/bin/sh)
	create batch_queue A priority=10
	create pipe_queue B priority=10 \
	server=(/usr/lib/nqs/pipeclient) \
	destination=(A) run_limit=1
	enable queue A
	enable queue B
	start queue A
	start queue B
	exit
	EOT2
	fi
	exit 0

	Example: nqs_mknmap

	The following script uses the Genders library to get a list of all nodes
	in the cluster. It then adds them to the NMAP database used by NQS:

	#!/usr/bin/perl
	#
	# $Id: TUTORIAL,v 1.9 2004-11-13 19:58:09 achu Exp $
	# $Source: /g/g0/achu/temp/genders-cvsbackup-full/genders/TUTORIAL,v $
	#

	use Genders;

	$prog = "nqs_mknmap";
	$path_nmapmgr = "/usr/bin/nmapmgr";
	$path_nmapdir = "/etc/nmap";

	if (opendir(DIR, $path_nmapdir)) {
	foreach $file (grep(!/^\.\.?$/, readdir(DIR))) {
	unlink("$path_nmapdir/$file");
	}
	closedir(DIR);
	}
	if (! -d $path_nmapdir) {
	mkdir($path_nmapdir, 0755);
	}

	if (open(PIPE, "\|$path_nmapmgr")) {
	printf PIPE ("create\n");
	$i = 0;
	$obj = Genders->new();
	foreach $node ($obj->getnodes()) {
	printf PIPE ("add mid %d %s\n", $i, $node);
	$i++;
	}
	printf PIPE ("exit\n");
	close(PIPE);
	} else {
	printf STDERR ("%s: could not exec $path_nmap_mgr\n", $prog);
	exit(1);
	}
	exit(0);

	------------------------------------------------------------------------
	[1] Here a cluster is a collection of nodes that are administered as a unit,
	where each node runs its own UNIX operating system image.

	[2] This should be the hostname as reported by the hostname -s command.
	Names for other network interfaces on the same node do not appear in the
	genders file.

	------------------------------------------------------------------------
	$Id: TUTORIAL,v 1.9 2004-11-13 19:58:09 achu Exp $
	$Source: /g/g0/achu/temp/genders-cvsbackup-full/genders/TUTORIAL,v $

Pdsh -- a multithreaded remote shell client

The structure of output

Using gender database:

qsh/mqsh module options

rms module options

SDR module options

dshgroup module options

netgroup module options

Rcmd Modules

Limitations

Old News ;-)

[Feb 27, 2021] Parallel Shell

Feb 27, 2021 | cs.boisestate.edu

[Feb 27, 2021] pdsh Utility Wrappers

Feb 27, 2021 | docstore.mik.ua

[Feb 27, 2021] Parallel Distributed Shell - Thread- [Pdsh-users] Re- Dshbak option

Feb 27, 2021 | sourceforge.net

[Feb 27, 2021] Shell scripting - want to login on some server, which are in same domain and execute command and exit

Feb 27, 2021 | unix.stackexchange.com

[Feb 27, 2021] AIX for System Administrators

Feb 27, 2021 | aix4admins.blogspot.com

[Feb 27, 2021] pdsh - A Sysadmin's Secret Weapon

Jul 27, 2011 | aminastaneh.net

clush

Aug 4, 2013 | linuxtoolkit.blogspot.in

[Nov 12, 2018] Shell Games

Notable quotes:

"... A very common way of using pdsh is to set the environment variable WCOLL to point to the file that contains the list of hosts you want to use in the pdsh command. For example, I created a subdirectory PDSH where I create a file hosts that lists the hosts I want to use ..."

Nov 12, 2018 | www.linux-magazine.com

[Nov 08, 2018] Parallel command execution with PDSH

Notable quotes:

"... open an editor and paste the following lines into it and save the file as /foo/bar ..."

Nov 08, 2018 | www.rittmanmead.com

[Sep 02, 2014] Parallel Shell

The cluster comes with a simple parallel shell named pdsh. The pdsh shell is handy for running commands across the cluster. See the man page, which describes the capabilities of pdsh in detail. One of the useful features is the capability of specifying all or a subset of the cluster.

For example:

[May 24, 2012] Parallel remote "shelling" via pdsh by Jonathan Rad

May 24, 2012 | radfest.wordpress.com

Shell Games by Jeff Layton

... ... ...

Google matched content

Softpanorama Recommended

Project pages

Man pages

Articles

Livermore lab

Etc

The cluster comes with a simple parallel shell named `pdsh`. The `pdsh` shell is handy for running commands across the cluster. See the man page, which describes the capabilities of `pdsh` in detail. One of the useful features is the capability of specifying all or a subset of the cluster.