cdist
Introduction
cdist is an agentless system which is much less known than either Ansible or Rex. The authors claim to adhere to the KISS principle, which is positive, but such declarations are generally not worth much. It is licensed under the GPL. It was initially released in 2010 at ETH Zurich, so it originated in a university environment, which has its own specifics. And it shows.
It was initially written, and is still maintained, by Nico Schottelius and Steven Armstrong. It requires only ssh and a POSIX shell on the target host. On the master host it requires Python 3.2. cdist is used at a couple of organizations in Switzerland, such as ETH Zurich (the Swiss Federal Institute of Technology in Zurich, from which Albert Einstein graduated) and the OMA Browser project, as well as in the USA, Germany and France. Unlike most Unix configuration systems, cdist is not distributed as a package (like .deb or .rpm), but is installed via git.
Why they are using Python 3.2 (not available by default on RHEL up to 7.2) on the master while writing commodity software is a mystery to me. Documentation is both very scarce and very bad. It is almost impossible to understand how the system operates and why a particular structure was adopted. But there is a cdist group on LinkedIn. The major part of the discussion about cdist happens on the mailing list and on the IRC channel #cstar on the Freenode network. The last version is from 2015, but the latest commit on GitHub is from Aug 19, 2016. It was mentioned on Hacker News, on Reddit and on Twitter. Ubuntu has man pages for it available in Web format. It has some following; see Migrating away from Puppet to cdist (Python3) on Hacker News.
cdist consists of two main components:
- The core, which runs on the master host. The core of cdist is implemented in Python 3.2 and provides all the executables to configure target hosts. The core operates in a push model: it connects from the source host to the target hosts and executes scripts on them. SSH is used for communication and file transfer.
- The configuration scripts, called types, which are executed on target hosts via SSH. The "types" are written in Bourne shell, which is not the best flavour of shell available (ksh93 is much better; bash is better as well and should be used, as the main domain for cdist is Linux flavours, not anything else).
To allow parallel configuration of hosts, the core supports a parallel mode in which it creates a child process for every target host. This model allows cdist to scale horizontally, and with the computing resources available on a typical server it can reach a pretty high number of instances.
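The "one child process per target host" model can be sketched in plain shell with background jobs. This is only an illustration of the idea, not how cdist does it (the core is written in Python); configure_host is a hypothetical stand-in for the work done for one target, and the host names are made up.

```shell
#!/bin/sh
# Sketch only: one background child process per target host.
# configure_host is a hypothetical placeholder, not a cdist command.
configure_host() {
    # A real run would connect over SSH; here we only report the host name.
    echo "configured $1"
}

run_parallel() {
    outdir=$(mktemp -d)
    for host in "$@"; do
        # Each host gets its own child process and its own log file.
        configure_host "$host" > "$outdir/$host.log" 2>&1 &
    done
    wait    # block until every child has finished
    cat "$outdir"/*.log
    rm -rf "$outdir"
}

run_parallel web1 web2 db1
```

Writing each child's output to its own file keeps the results of different hosts from interleaving on the terminal.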
cdist uses a push-based approach, in which a server pushes configurations to the client. It is a one-way system: the clients do not poll for updates. All commands are run from the single master host. The entry point for any configuration is the shell script conf/manifest/init, which is called the initial manifest in cdist terms. It runs in several "stages", with only the final one being execution of scripts on the target. That allows generation of code in one of the previous steps.
cdist does contain three ideas that brought my attention to it:
- The use of a regular POSIX shell as the DSL. This is an idea I also subscribe to.
- The idea of "code generators": shell scripts that are not executed directly on the target hosts, but instead generate shell code, which is later executed on the target hosts (nodes). These days, code generation is not a widely used technique, and among the few applications that still use it we can mention only XSLT, which is typically used to transform XML to HTML. But it could be used for more generic "template driven code generation". See the book Program Generators with XML and Java for more information.
- I would also like to mention the creative use of the Unix hierarchical directory structure for encoding information about "objects" in this configuration management system.
Usage of shell as the DSL means that after you install cdist, you do not need to learn an ugly new DSL and curse the designers for incompetence and bugs. But cdist does not use the idea of translating from the "classic Linux" approach. It uses, as is typical for all other Unix configuration management systems, a set of new, custom primitives called types, and that's problematic. For example, here is a description of the type __package, which, as you can guess, allows you to install packages on the target systems:
This cdist type allows you to install or uninstall packages on the target. It dispatches the actual
work to the package system dependent types.
REQUIRED PARAMETERS: None
OPTIONAL PARAMETERS:
- name: The name of the package to install. Default is to use the object_id
as the package name.
- version: The version of the package to install. Default is to install the version
chosen by the local package manager.
- type: The package type to use. Default is determined based on the $os explorer variable,
e.g. __package_apt for Debian, __package_emerge for Gentoo.
- state: Either "present" or "absent", defaults to "present".
EXAMPLES
# Install the package vim on the target
__package vim --state present
# Same but install specific version
__package vim --state present --version 7.3.50
# Force use of a specific package type
__package vim --state present --type __package_apt
In my very limited understanding of the system, a type is a complex object consisting of a set of executables (let's say object methods ;-) and files (let's say object variables). The whole of cdist looks like a large API for shell, designed to simplify writing complex configuration management scripts. A type is structured as a subtree in the Unix file system, consisting of a set of files and directories. The subtree has the same name as the type and is available to scripts via the $__type variable, while the data of an individual object is available via $__object. The files that make up the tree are listed under PATHS below.
Types are stored in the directory called $CDIST_ROOT/cdist/conf/type/. Each type name is prefixed with two underscores (as in __file) to prevent collisions with other executables in $PATH, because in scripts the names of those components are used without qualification by the directory. So the names should not conflict with system executables.
Here is an example that might help to understand how those directories and files are created. It contains a partial definition of the type __nginx_vhost:
TARGET="$CDIST_ROOT/cdist/conf/type/__nginx_vhost"
# Create the type directory and its parameter subdirectories first
mkdir -p "$TARGET/parameter/default"
echo servername >> "$TARGET/parameter/required"
echo logdirectory >> "$TARGET/parameter/optional"
echo loglevel >> "$TARGET/parameter/optional"
echo use_ssl >> "$TARGET/parameter/boolean"
echo warning > "$TARGET/parameter/default/loglevel"
echo server_alias >> "$TARGET/parameter/optional_multiple"
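To make the role of parameter/default more concrete, here is a minimal sketch of how a value could be looked up with a fallback to the type's default. The helper get_param and the throwaway directory tree are my own assumptions for illustration; how cdist itself resolves defaults internally is not shown in the documentation quoted here.

```shell
#!/bin/sh
# Sketch only: resolve a parameter for an object, falling back to the
# type's default. get_param is a hypothetical helper, not part of cdist.
get_param() {
    # $1 = object directory, $2 = type directory, $3 = parameter name
    if [ -f "$1/parameter/$3" ]; then
        cat "$1/parameter/$3"
    elif [ -f "$2/parameter/default/$3" ]; then
        cat "$2/parameter/default/$3"
    else
        return 1
    fi
}

# Build a throwaway type and object tree to try it out.
demo=$(mktemp -d)
mkdir -p "$demo/type/parameter/default" "$demo/object/parameter"
echo warning > "$demo/type/parameter/default/loglevel"
echo www.example.org > "$demo/object/parameter/servername"

get_param "$demo/object" "$demo/type" servername   # value set on the object
get_param "$demo/object" "$demo/type" loglevel     # falls back to the default
```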
As the manifest of a type is a shell script, you can call other "types" from it, creating a kind of "poor man's" inheritance in shell. For example, the type __package abstracts from the OS-specific package manager in the following way (this is a bad example, which also shows a weakness of cdist, the absence of a meaningful abstraction of the OS version, but never mind):
os="$(cat "$__global/explorer/os")" # get the OS for the target
case "$os" in
archlinux) type="pacman" ;;
debian|ubuntu) type="apt" ;;
gentoo) type="emerge" ;;
*)
echo "Don't know how to manage packages on: $os" >&2
exit 1
;;
esac
__package_$type "$@" # execute the script appropriate for the OS on the target.
This is actually a very ugly solution (see the letter by a user, Ideas for a nicer way to support different os's-implementation in types), which results in this case statement being present in each type definition (which demonstrates the lack of imagination of the authors).
Code generation is another interesting feature of cdist. Instead of writing a script for all cases imaginable, it allows you to generate the code for a specific node, taking into account the version of Linux it is running and other relevant parameters. The result is an order of magnitude easier to understand than generic scripts.
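A minimal sketch of this idea: a generator that prints only the one command that fits the node in question. The function gen_install_code is hypothetical and the package manager invocations are my own choice for illustration; they are not taken from the cdist sources.

```shell
#!/bin/sh
# Sketch only: emit shell code tailored to one target instead of shipping
# a generic script. gen_install_code is a hypothetical helper.
gen_install_code() {
    # $1 = OS name as reported for the target, $2 = package name
    case "$1" in
        debian|ubuntu) printf 'apt-get install -y "%s"\n' "$2" ;;
        archlinux)     printf 'pacman -S --noconfirm "%s"\n' "$2" ;;
        gentoo)        printf 'emerge "%s"\n' "$2" ;;
        *)
            echo "Don't know how to install packages on: $1" >&2
            return 1
            ;;
    esac
}

# The generator prints code; nothing is installed at this point.
gen_install_code debian vim
```

The script that reaches a Debian node is then a single apt-get line, with no trace of the branches for other systems.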
Such generated scripts can be executed either on the master or on target nodes and use "context files" generated in other steps of cdist execution (results of execution of "explorer" scripts). In the generated scripts, you have access to the following cdist variables:
- __object -- the path to the object directory (out/object/<object> in the PATHS list), which holds the object's parameters and explorer output.
- __object_id -- the id of the object (for example, vim in __package vim).
Scripts can only read information from this tree, not write to it, as there is no backup copy of these files and they can't be restored after the script execution.
if [ -f "$__object/parameter/name" ]; then
name="$(cat "$__object/parameter/name")"
else
name="$__object_id"
fi
The idea of type in cdist
The main components of cdist are so-called types, which bundle functionality. Each type consists of a set of shell scripts (similar to OO methods) and can reuse "sub-types".
Every type can access what was written on stdin when it was called. The input is saved into the stdin file in the object directory.
Example use of a type (e.g. in cdist/conf/type/__archlinux_hostname):
__file /etc/rc.conf --source - << eof
...
HOSTNAME="$__target_host"
...
eof
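The stdin mechanism above can be tried out without cdist by faking the object directory. In this sketch $__object is set by hand, the host name is made up, and read_stdin is a hypothetical helper; in a real run cdist provides the variable and writes the stdin file itself.

```shell
#!/bin/sh
# Sketch only: simulate an object directory and read the saved stdin.
# Outside a real cdist run $__object is not set, so we fake it here.
__object=$(mktemp -d)
__target_host=web1.example.org   # hypothetical host name

# Stand-in for what cdist does when a type is called with a here-document.
cat > "$__object/stdin" << eof
HOSTNAME="$__target_host"
eof

# Inside the type: use stdin only if something was actually passed in.
read_stdin() {
    if [ -s "$__object/stdin" ]; then
        cat "$__object/stdin"
    fi
}

read_stdin
```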
In the manifest of a type you can use other types, so your type extends their functionality.
A good example is the __package type, which in a shortened version looks like this:
os="$(cat "$__global/explorer/os")"
case "$os" in
archlinux) type="pacman" ;;
debian|ubuntu) type="apt" ;;
gentoo) type="emerge" ;;
*)
echo "Don't know how to manage packages on: $os" >&2
exit 1
;;
esac
__package_$type "$@"
Explorers -- scripts that write one-line information about the host to stdout
Explorers are small shell scripts, which are always executed on the target host. The aim of an explorer is to extract properties of the target host which, taken together, provide enough context for types to act correctly on the target system, based on the type of OS and other individual properties. An explorer writes its result to stdout, usually as a one-liner, but the output may be empty or multi-line, especially in the case of type explorers.
There are general explorers, which are run at an early stage, and type explorers. Both work almost exactly the same way, with the difference that the values of the general explorers are stored in a general location and the type-specific ones below the object.
Explorers can reuse other explorers on the target system by calling $__explorer/<explorer_name> (general and type explorers) or $__type_explorer/<explorer_name> (type explorers).
In case of significant errors, the explorer may exit non-zero and return an error message on stderr,
which will cause cdist to abort.
You can also use stderr for debugging purposes while developing a new explorer. A very simple explorer may look like this:
hostname
which provides the hostname of a given host.
A more complex explorer, which checks for the status of a package, may look like this:
if [ -f "$__object/parameter/name" ]; then
name="$(cat "$__object/parameter/name")"
else
name="$__object_id"
fi
# Expect dpkg to fail if the package is not known / installed
dpkg -s "$name" 2>/dev/null || exit 0
The following global explorers are available:
- cpu_cores
- cpu_sockets
- disks
- hostname
- interfaces
- lsb_codename
- lsb_description
- lsb_id
- lsb_release
- machine
- machine_type
- memory
- os
- os_version
- runlevel
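The output of these global explorers ends up as one file per explorer under $__global/explorer, which is how the __package example reads the os value. Below is a small sketch of a manifest-style lookup; the directory is simulated with made-up values, and explorer_value is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: branch on explorer output the way a manifest could.
# The directory is simulated; in a real run cdist provides $__global.
__global=$(mktemp -d)
mkdir -p "$__global/explorer"
echo debian > "$__global/explorer/os"
echo 4 > "$__global/explorer/cpu_cores"

explorer_value() {
    # $1 = name of a global explorer; prints its stored output
    cat "$__global/explorer/$1"
}

os=$(explorer_value os)
cores=$(explorer_value cpu_cores)
echo "target runs $os with $cores cores"
```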
Code generators in cdist are called GENCODE scripts
There are two types of code generators in cdist (called gencode scripts):
- gencode-local. The generated script, which is the output of the gencode-local generator, is executed locally on the master host.
- gencode-remote. The generated script, which is the output of the gencode-remote generator, is executed on the target host.
The gencode scripts can make use of the parameters, the properties extracted by any of the global explorers, as well as the type-specific explorers.
If a gencode script encounters an error, it should print diagnostic messages to stderr and exit non-zero. If you need to debug a gencode script, you can write to stderr:
# Debug output to stderr
echo "My fancy debug line" >&2
# Output to be saved by cdist for execution on the target
echo "touch /etc/cdist-configured"
In the generated scripts, you have access to the following cdist variables:
- __object
- __object_id
but only for read operations: if you overwrite the files they point to, there is no backup copy of those files after the script execution. So when you generate a script with the following content, it will work:
if [ -f "$__object/parameter/name" ]; then
name="$(cat "$__object/parameter/name")"
else
name="$__object_id"
fi
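Putting parameters and explorer output together, a gencode-remote style generator can stay silent when the target is already in the desired state. The sketch below is my own illustration under several assumptions: a type explorer named state, the explorer directory name under the object, and the apt-get commands are all hypothetical and not copied from any cdist type.

```shell
#!/bin/sh
# Sketch only: a gencode-remote style generator that emits code solely
# when the target differs from the desired state. The "state" explorer,
# the directory names and the package commands are assumptions.
gencode_package() {
    # $1 = object directory, $2 = package name
    state_should=$(cat "$1/parameter/state" 2>/dev/null || echo present)
    state_is=$(cat "$1/explorer/state" 2>/dev/null || echo absent)

    # Nothing to do: print no code, so nothing runs on the target.
    [ "$state_is" = "$state_should" ] && return 0

    case "$state_should" in
        present) echo "apt-get install -y \"$2\"" ;;
        absent)  echo "apt-get remove -y \"$2\"" ;;
        *) echo "Unknown state: $state_should" >&2; return 1 ;;
    esac
}

obj=$(mktemp -d)
mkdir -p "$obj/parameter" "$obj/explorer"
echo present > "$obj/parameter/state"
echo absent > "$obj/explorer/state"
gencode_package "$obj" zsh
```

Emitting no code at all for the "nothing to do" case is what makes repeated runs harmless.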
Configuration
The configuration is written in Bourne shell and consists of:
- The initial manifest conf/manifest/init (which defines which host is assigned which types)
- Global explorers (to gain information about the target system)
- Types (which provide all functionality and consist of a manifest, type explorers and gencode scripts)
Although all of these are written in shell script, the order of execution in the manifests does not matter: cdist employs an idempotent configuration.
All user-configurable parts are contained in manifests or gencode scripts, which are shell scripts. Shell scripts were chosen because Unix system administrators are usually proficient in reading and writing shell scripts.
cdist reads its configuration from the initial manifest (conf/manifest/init), in which hosts are
mapped to types:
case "$__target_host" in
myhostname)
__package zsh --state present
__addifnosuchline /tmp/cdist-welcome --line "Welcome to cdist"
;;
esac
Names of types in cdist DSL always start with "__" to avoid conflicts in PATH. They are called like
normal shell scripts and can perform advanced parameter parsing as well as reading from stdin:
# Provide a default file, but let the user change it
__file /home/frodo/.bashrc --source "/etc/skel/.bashrc" \
--state exists \
--owner frodo --mode 0600
# Take file content from stdin
__file /tmp/whatever --owner root --group root --mode 644 --source - << DONE
Here goes the content for /tmp/whatever
DONE
Dependencies are expressed by setting up the require environment variable:
__directory /tmp/foobar
require="__directory//tmp/foobar" __file /tmp/foobar/baz
Access to paths and files within types is given by environment variables like $__object.
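The double slash in require="__directory//tmp/foobar" looks odd at first. As far as I can tell from the example, the object name is simply the type name and the object id joined with a slash, and here the id itself starts with a slash. The helper object_name below is hypothetical and only illustrates that reading.

```shell
#!/bin/sh
# Sketch only: build the object name used in the require variable.
# object_name is a hypothetical helper, not a cdist command.
object_name() {
    # $1 = type name, $2 = object id; joined with a single slash
    printf '%s/%s\n' "$1" "$2"
}

object_name __directory /tmp/foobar   # the id itself starts with a slash
object_name __package zsh
```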
Stages of execution
cdist execution consists of several stages:
STAGE 1: TARGET INFORMATION RETRIEVAL
In this stage information is collected about the target host using so called explorers. Every existing
explorer is run on the target and the output of all explorers is copied back into the local cache.
The results can be used by manifests and types.
STAGE 2: RUN THE INITIAL MANIFEST
The initial manifest, which should be used for mappings of hosts to types, is executed. This stage
creates objects in a cconfig database that contains the objects as defined in the manifest for the specific
host. In this stage, no conflicts may occur, i.e. no object of the same type with the same id may be
created, if it has different parameters.
STAGE 3: OBJECT INFORMATION RETRIEVAL
Every object is checked whether its type has explorers and if so, these are executed on the target
host. The results are transferred back and can be used in the following stages to decide what changes
need to be made on the target to implement the desired state.
STAGE 4: RUN THE OBJECT MANIFEST
Every object is checked whether its type has an executable manifest. The manifest script may generate
and change the created objects. In other words, one type can reuse other types. For instance the object
__apache/www.example.org is of type __apache, which may contain a manifest script,
which creates new objects of type __file. The newly created objects are merged back into the existing
tree. No conflicts may occur during the merge. A conflict would mean that two different objects try
to create the same object, which indicates a broken configuration.
STAGE 5: CODE GENERATION
In this stage for every created object its type is checked for executable gencode scripts. The gencode
scripts generate the code to be executed on the target on stdout. If the gencode executables fail, they
must print diagnostic messages on stderr and exit non-zero.
STAGE 6: CODE EXECUTION
For every object the resulting code from the previous stage is transferred to the target host and
executed there to apply the configuration changes.
STAGE 7: CACHE
The cache stores the information from the current run for later use.
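The conflict rule of stages 2 and 4 (the same type and id may not be created twice with different parameters) can be sketched as follows. The function create_object and its on-disk layout are hypothetical and far simpler than what cdist actually stores.

```shell
#!/bin/sh
# Sketch only: refuse to create the same object twice with different
# parameters. create_object and the on-disk layout are hypothetical.
db=$(mktemp -d)

create_object() {
    # $1 = type, $2 = object id (no slashes in this sketch), $3 = parameters
    dir="$db/$1/$2"
    if [ -f "$dir/params" ]; then
        if [ "$(cat "$dir/params")" = "$3" ]; then
            return 0        # identical redefinition is harmless
        fi
        echo "conflict: $1/$2 already defined with other parameters" >&2
        return 1
    fi
    mkdir -p "$dir"
    printf '%s\n' "$3" > "$dir/params"
}

create_object __package zsh "--state present"
create_object __package zsh "--state present"   # same parameters: accepted
create_object __package zsh "--state absent" || echo "rejected"
```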
PATHS
$HOME/.cdist
The standard cdist configuration directory relative to your home
directory. This is usually the place you want to store your site
specific configuration.
cdist/conf/
The distribution configuration directory. This contains types and
explorers to be used.
confdir
Cdist will use all available configuration directories and create a
temporary confdir containing links to the real configuration
directories. This way it is possible to merge configuration
directories. By default it consists of everything in $HOME/.cdist
and cdist/conf/. For more details see cdist(1)
confdir/manifest/init
This is the central entry point. It is an executable (+x bit set)
shell script that can use values from the explorers to decide which
configuration to create for the specified target host. Its intent
is to define the mapping from configurations to hosts.
confdir/manifest/*
All other files in this directory are not directly used by cdist,
but you can separate configuration mappings, if you have a lot of
code in the conf/manifest/init file. This may also be helpful to
have different admins maintain different groups of hosts.
confdir/explorer/<name>
Contains explorers to be run on the target hosts, see
cdist-explorer(7).
confdir/type/
Contains all available types, which are used to provide some kind
of functionality. See cdist-type(7).
confdir/type/<name>/
Home of the type <name>. This directory is referenced by the
variable __type (see below).
confdir/type/<name>/man.text
Manpage in Asciidoc format (required for inclusion into upstream)
confdir/type/<name>/manifest
Used to generate additional objects from a type.
confdir/type/<name>/gencode-local
Used to generate code to be executed on the source host
confdir/type/<name>/gencode-remote
Used to generate code to be executed on the target host
confdir/type/<name>/parameter/required
Parameters required by type, \n separated list.
confdir/type/<name>/parameter/optional
Parameters optionally accepted by type, \n separated list.
confdir/type/<name>/parameter/default/*
Default values for optional parameters. Assuming an optional
parameter name of foo, its default value would be read from the
file confdir/type/<name>/parameter/default/foo.
confdir/type/<name>/parameter/boolean
Boolean parameters accepted by type, \n separated list.
confdir/type/<name>/explorer
Location of the type specific explorers. This directory is
referenced by the variable __type_explorer (see below). See
cdist-explorer(7).
confdir/type/<name>/files
This directory is reserved for user data and will not be used by
cdist at any time. It can be used for storing supplementary files
(like scripts to act as a template or configuration files).
out/
This directory contains output of cdist and is usually located in a
temporary directory and thus will be removed after the run. This
directory is referenced by the variable __global (see below).
out/explorer
Output of general explorers.
out/object
Objects created for the host.
out/object/<object>
Contains all object specific information. This directory is
referenced by the variable __object (see below).
out/object/<object>/explorers
Output of type specific explorers, per object.
asteven commented
on Jan 12
Just thinking out loud. Instead of the endless case esac if os then else what if a type would have
an internal API? Maybe in form of shell functions? Then there could be a default implementation,
e.g. in __some_type/lib/default
And if some $os doesn't like that it can create its own implementation in __some_type/lib/$os
The gencode-* would then just call shell functions which write to stdout.
e.g.
__some_type/lib/default:
_add_user() {
printf 'gpasswd -a "%s" "%s"\n' "$1" "$2"
}
_remove_user() {
printf 'gpasswd -d "%s" "%s"\n' "$1" "$2"
}
__some_type/lib/netbsd:
_add_user() {
printf 'usermod -G "%s" "%s"\n' "$1" "$2"
}
_remove_user() {
printf 'usermod ;;# "%s" "%s"\n' "$1" "$2"
}
__some_type/gencode-remote:
. "$__type/lib/default"
if [ -f "$__type/lib/$os" ]; then
. "$__type/lib/$os"
fi
...
case "$state_should" in
present)
for group in $(comm -13 "$__object/explorer/group" "$__object/files/group.sorted"); do
_add_user "$group" "$user"
done
;;
absent)
for group in $(comm -12 "$__object/explorer/group" "$__object/files/group.sorted"); do
_remove_user "$group" "$user"
done
;;
esac
One could even implement a kind of 'call super' using an approach like
this.
No more stinking spaghetti code.
telmich commented
on Jan 14, 2017
Great idea! But I suggest using $__type/files/lib instead
References
- cdist on Hacker News
https://news.ycombinator.com/item?id=3422678
- Reddit discussions
http://www.reddit.com/r/programming/comments/gvhqo/cdist_a_zero_dependency_shell_based_configuration/
and
http://www.reddit.com/r/linux/comments/gvi29/cdist_a_zero_dependency_shell_based_configuration/
- cdist related discussions on Twitter -
http://topsy.com/www.nico.schottelius.org/software/cdist/
- Cia Development Statistics - http://cia.vc/stats/project/cdist
- cdist development at https://github.com/telmich/cdist
- cdist mailinglist http://l.schottelius.org/mailman/listinfo/cdist
- Sans/ETH website http://sans.ethz.ch/projects/cdist/
- OMA Browser http://omabrowser.org/about.html
- Puppet bootstrap via cdist -
https://groups.google.com/group/puppet-users/browse_thread/thread/e1b1ede3ad3b0a8e/98f6b2c9d78032e8
- cdist type manpage -
http://www.nico.schottelius.org/software/cdist/man/latest/man7/cdist-type.html
- Why cdist requires Python 3.2 on the source host -
http://www.nico.schottelius.org/blog/cdist-python-3.2-requirement/
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org
was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP)
without any remuneration. This document is an industrial compilation designed and created exclusively
for educational use and is distributed under the Softpanorama Content License.
Original materials copyright belong
to respective owners. Quotes are made for educational purposes only
in compliance with the fair use doctrine.
Last modified: March, 12, 2019