Ansible in a box or .iso
At my $NEWJOB I'm on the team that installs the products we sell on the clients' machines. Most of the time the client has bought appliances from us, so they come with the installer and some tools for setting them up. Because of the type of product we sell, the customer might have bought anywhere from 3 to 12 or more nodes that will form a cluster, and sometimes they're spread over several data centers.
The system needs a frontend network and a backend one, and the nodes come with two high speed NICs (typically 10Gb), two low speed ones (1Gb), and a BMC/IPMI interface. The typical setup is to bond both high speed NICs and then build two VLANs on top. The atypical setup is whatever the client came up with. One client bonded each of the high speed NICs with one of the low speed ones in primary/backup mode, giving two physical networks. Another one does everything through a single interface with no VLANs. This should give you an idea of how disparate the networking setups can be, so the networking has to be custom made for each client.
Our first step is to connect to the nodes and configure networking. The only preconfigured interface is the BMC/IPMI one, which asks for an IPv4 address via DHCP. So we connect to the BMC interface. This involves connecting via HTTP to a web interface that runs within the IPMI subsystem, then downloading a Java application that gives us a virtual KVM, so we can use the computer as if we had just connected a keyboard and a monitor to it.
For those who don't know (I didn't before I started this new position), the IPMI/BMC system is a mini computer fully independent of the main system, which boots as soon as the node has power connected to it, even if the node itself is not powered on. You can turn the machine on and off, divert the KVM as I mentioned before, and more, as you'll see. If you're surprised to find out you have more than one computer in your computer, just read this.
Once connected to the node, we run a setup script and feed it all the networking info: static IPs, gateways, DNS servers, timezone, etc. All this for each node. By hand. Slow, error prone, boring.
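To make the rest concrete, here is a hypothetical sketch of what the playbook side of such automation could look like. The file names, group name and values are all made up, but `hostname` and `timezone` are real Ansible modules (and among the ones the trimmed image keeps, as you'll see below):

```yaml
# inventory.ini (hypothetical): one host per node
# [nodes]
# node1
# node2

# playbook.yaml (sketch): run on each node with --connection local
- hosts: nodes
  become: yes
  tasks:
    - name: set the node's hostname
      hostname:
        name: "{{ inventory_hostname }}"
    - name: set the timezone
      timezone:
        name: Europe/Madrid
```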
Let's automate this. The simplest tool I can think of is Ansible. In fact, I also think it's perfect for this. But there's a catch: there's no Ansible installed on the node, there is no way Ansible will be able to talk KVM-implemented-as-a-Java-app-ese, and again, there's no networking yet, so no ssh or any other remote access. But most modern IPMI systems have an extra feature: virtual devices. You can upload iso images and IPMI will present them to the host as a USB cd reader with media inside.
So today's trick involves creating an iso image with ansible on it that can run on the target system. It's surprisingly easy to do. In fact, it would be as easy as creating a virtualenv, installing ansible, adding the playbooks and such, and creating an iso image from that, if it were not for the fact that the image has to be less than 50MiB (we have seen this limit on Lenovo systems). Ansible alone is 25MiB of source code, and compiled into .pyc files that doubles. So the most difficult part is trimming it down to size.
Of course, we first get rid of all the .py source code. Well, not all of it: modules and module utils in ansible are loaded from the .py files, so we have to keep those. I can also get rid of pip, setuptools and wheel, as I won't be able to install new stuff, for two reasons: one, this is going to be a read only iso image, and two, remember, networking is not set up yet :) Also, ansible is going to be run locally (--connection local), so paramiko is gone too. Next come all the modules I won't be using (cloud, clustering, database, etc). There are a couple more details, so let's just look at the script we currently use:
```bash
#! /bin/bash
# this is trim.sh

set -e

while [ $# -gt 0 ]; do
    case "$1" in
      -a|--all)
        # get rid of some python packages
        for module in pip setuptools pkg_resources wheel; do
            rm -rfv "lib/python2.7/site-packages/$module"
        done
        shift
        ;;
    esac
done

# cleanup
find lib -name '*.py' -o -name '*.dist-info' | egrep -v 'module|plugins' | xargs rm -rfv

for module in paramiko pycparser; do
    rm -rfv "lib/python2.7/site-packages/$module"
done

ansible_prefix="lib/python2.7/site-packages/ansible"

# trim down modules
for module in cloud clustering database network net_tools notification \
              remote_management source_control web_infrastructure windows; do
    rm -rfv "$ansible_prefix/modules/$module"
done

# picking some by hand
find $ansible_prefix/module_utils | \
    egrep -v 'module_utils$|__init__|facts|parsing|six|_text|api|basic|connection|crypto|ismount|json|known_hosts|network|pycompat|redhat|service|splitter|urls' | \
    xargs -r rm -rfv

find $ansible_prefix/module_utils/network -type d | egrep -v 'network$|common' | xargs -r rm -rfv

find $ansible_prefix/modules/packaging -type f | \
    egrep -v '__init__|package|redhat|rhn|rhsm|rpm|yum' | xargs -r rm -v

find $ansible_prefix/modules/system -type f | \
    egrep -v '__init__|authorized_key|cron|filesystem|hostname|known_hosts|lvg|lvol|modprobe|mount|parted|service|setup|sysctl|systemd|timezone' | \
    xargs -r rm -v
```
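Dropping the .py sources (outside modules and module_utils) works because Python happily imports from the compiled bytecode alone. A minimal demonstration, using python3 and throwaway files (python2 behaves the same, it just writes the .pyc next to the source by default):

```shell
# compile a module to bytecode, delete the source, and import anyway
tmp=$(mktemp -d)
printf 'ANSWER = 42\n' > "$tmp/mod.py"
python3 -m compileall -q -b "$tmp"   # -b writes mod.pyc next to mod.py, legacy style
rm "$tmp/mod.py"
PYTHONPATH="$tmp" python3 -c 'import mod; print(mod.ANSWER)'   # 42
rm -r "$tmp"
```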
Notice that if I were even more space constrained (and it could be possible, if we find another IPMI implementation with a smaller staging space) I could go further and make the venv use the Python installed on the system instead of the one copied into the venv.
Now, the next step is to fix the venv so it's runnable from any place. The first step is to make it relocatable. This fixes all the scripts in bin to use /usr/bin/env python2 instead of the hardcoded path to the python binary copied into the venv. One thing I never understood is why it doesn't go a step further and also declare VIRTUAL_ENV as relative to the path where bin/activate resides. In any case, I do an extra fix with sed and I'm done.
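For illustration, this is what that sed fix does to the relevant line of bin/activate (the original path shown here is made up). The replacement is single quoted, so the literal $(pwd) lands in the file and only expands when activate is sourced:

```shell
tmp=$(mktemp -d)
printf 'VIRTUAL_ENV="/home/builder/venv"\nexport VIRTUAL_ENV\n' > "$tmp/activate"
sed -i -e 's/VIRTUAL_ENV=".*"/VIRTUAL_ENV="$(pwd)"/' "$tmp/activate"
head -1 "$tmp/activate"   # VIRTUAL_ENV="$(pwd)"
rm -r "$tmp"
```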
The last step is just to create the iso image. It's been ages since I last generated one by hand, and the resulting command line (which I simply stole from running k3b) turned out more complex than I expected (what happened to sensible defaults?). Here are the interesting parts:
```bash
#! /bin/bash

set -eu

ansible-playbook --syntax-check --inventory-file inventory.ini playbook.yaml
./check_config.py
./trim.sh --all

# make relative
/usr/bin/python2 -m virtualenv --relocatable .

# try harder
# this doesn't cover all the possibilities where bin/activate might be sourced
# from, but in our case we have a wrapper script that makes sure we're in a sane place
sed -i -e 's/VIRTUAL_ENV=".*"/VIRTUAL_ENV="$(pwd)"/' bin/activate

genisoimage -sysid LINUX -rational-rock -joliet -joliet-long \
    -no-cache-inodes -full-iso9660-filenames -disable-deep-relocation -iso-level 3 \
    -input-charset utf-8 \
    -o foo.iso .
```
We threw in some checks on the syntax and contents of the playbook (it's annoying to find a bug when running on the target machine, then have to come back, generate a new iso, upload it, mount it, etc). It is possible that you would also like to exclude more stuff from your working directory, so just create a build dir, copy over your files (maybe with rsync --archive --update --delete) and run genisoimage there.
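A minimal sketch of that staging step, using throwaway directories so it can run anywhere; in the real workflow the source is your working dir, the target is the build dir, and genisoimage then runs inside it (the file names here are invented):

```shell
src=$(mktemp -d); build=$(mktemp -d)
touch "$src/playbook.yaml" "$src/notes.scratch"
# copy everything except what we don't want on the image
rsync --archive --update --delete --exclude 'notes.scratch' "$src/" "$build/"
ls "$build"   # playbook.yaml
rm -r "$src" "$build"
```

The trailing slash on the source matters to rsync: it means "the contents of src", not the directory itself.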
This method produces an iso image 26MiB big that works both in virtual machines, with which I developed this solution, and on some IPMI systems, like the Lenovo ones I mentioned before. Unluckily I couldn't get my hands on many different systems that have IPMI and are not being used for anything else.
One final note about sizing. If you run du on your working/staging directory to see how far you are from the limit, use --apparent-size, as the iso format packs files better than generic filesystems (in my case I see 26MiB apparent vs 46MiB 'real'; the difference is due to block sizes and internal fragmentation).
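The difference is easy to reproduce: a file occupies whole filesystem blocks regardless of how many bytes it actually holds. A tiny demonstration (the on-disk number depends on your filesystem's block size, commonly 4KiB):

```shell
tmp=$(mktemp -d)
printf 'x' > "$tmp/tiny"                         # 1 byte of content
du --block-size=1 --apparent-size "$tmp/tiny"    # 1: roughly what the iso will pack
du --block-size=1 "$tmp/tiny"                    # e.g. 4096: what the filesystem spends
rm -r "$tmp"
```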