Recently I had to do something that sounds very simple to any good SysAdmin:
create a disk image with a booting Debian installation, from a script, with no
human interaction. The idea is to later install our software on it. Those who
want to test our software would just need to download the image and boot it in any
virtual machine they have: `qemu`, `virtualbox`[^1], you name it.

So the process could be thought of like this: create a disk image, partition it,
install Debian, install a bootloader, profit! Let's try to tackle the steps
separately, looking at different approaches[^2]:

A disk image is simply a big enough file: 1GiB, 10GiB, whatever you want. A
string of 1Gi zero bytes should be enough:

```bash
dd bs=$((1024*1024)) count=1024 if=/dev/zero of=stable.img
```

Now, that file is using 1GiB of space, but we're not sure we're going to use
it all, so it's kind of a waste of space. Luckily, Linux is able to handle sparse
files: files that do not reserve all the file system blocks that would normally
be needed, only those where data is written. So for instance, a way to create a
1GiB (almost) empty sparse file is this:

```bash
dd bs=1 count=1 seek=$((1024*1024*1024)) if=/dev/zero of=stable.img
```

That is, we write a 0 at the end of a 1GiB file[^3], but even if the file is so
big, it's actually using one file system block (4096 bytes, according to
`dumpe2fs`).
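To see the difference between the size the file claims and the space it really takes, here is a quick sketch. It assumes GNU `stat` (the `-c` format option) and a file system that supports sparse files; the scratch file name is made up for the example.

```bash
# Scratch directory so the sketch does not touch a real image (assumption).
workdir=$(mktemp -d)
img="$workdir/sparse.img"

# Same command as above, pointed at the scratch file.
dd bs=1 count=1 seek=$((1024*1024*1024)) if=/dev/zero of="$img" 2>/dev/null

# %s is the apparent size in bytes; %b is the number of allocated blocks
# and %B the size in bytes of each of those blocks (GNU stat).
apparent=$(stat -c %s "$img")
allocated=$(( $(stat -c %b "$img") * $(stat -c %B "$img") ))

echo "apparent size:  $apparent bytes"
echo "allocated size: $allocated bytes"

rm -rf "$workdir"
```

The second number should be tiny compared to the first one; `du` and `ls -l` show the same disagreement.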

A simpler, or maybe more-intuitive-when-you-read-it[^4] alternative is to use
[a tool that comes in `qemu-utils`](http://pierre.palats.com/scratch/index.php?post/2008/12/15/Automatically-creating-a-disk-image-with-partition-and-bootloader):

```bash
qemu-img create -f raw stable.img 1G
```

That was easy. Now, how do we partition it? The first answer is the obvious
one, `fdisk`, but it is not scriptable. So we look for alternatives, and one that
comes to mind is `parted`: it is designed with scriptability in mind, it should
be perfect!

Almost. `parted` needs a partition table signature in the MBR[^5] and it has no
way to create one. This is surprising, to say the least, but a little more (ab)use of
`dd` can save the day. It's just a matter of writing the bytes `0x55 0xaa`[^8] in
the last two bytes of the first sector of the image. Until recently a disk
sector was always 512 bytes, so:

```bash
echo -e "\x55\xaa" | dd bs=1 count=2 seek=510 of=stable.img conv=notrunc
```
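A way to double check that the two bytes landed where expected, and that the file kept its size, is to read them back with `od`. This is only a sketch on a small scratch image (1MiB, a made-up size); it uses `printf` with the octal escapes `\125\252`, which are the same bytes as `0x55 0xaa`, and GNU `stat` for the size.

```bash
# Scratch image so the real stable.img is left alone (assumption).
img=$(mktemp)
dd bs=1 count=1 seek=$((1024*1024 - 1)) if=/dev/zero of="$img" 2>/dev/null

# Octal \125 \252 is 0x55 0xaa; printf does not add a trailing newline.
printf '\125\252' | dd bs=1 count=2 seek=510 of="$img" conv=notrunc 2>/dev/null

# Dump bytes 510 and 511 as hex, then strip the padding od adds.
signature=$(od -A n -t x1 -j 510 -N 2 "$img" | tr -d ' \n')
size=$(stat -c %s "$img")

echo "signature: $signature, size: $size bytes"
rm -f "$img"
```

This prints `signature: 55aa, size: 1048576 bytes`.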

The `notrunc` is so `dd` doesn't truncate the image to be 512 bytes long (it
took me a while to figure that out). Now to `parted`:

```bash
parted -s stable.img mkpart primary ext2 0 1G
Warning: The resulting partition is not properly aligned for best performance.
```

But this will raise a problem that I'll mention later, at the proper time. So
instead, and given that we're gonna install another package anyway, we're gonna
use `sfdisk`[^6][^7]:

```bash
sfdisk -D stable.img <<EOF
,,L,*
;
;
;
EOF
```

The next step is to format the partition inside the image. If this were a
partition image we could simply apply `mkfs.ext2` (or whatever filesystem type
you want) to the file, because the filesystem would start from the beginning of
the file. But as this is a disk image, the partition starts at an offset from
the beginning:

```bash
sfdisk --list stable.img
Disk stable.img: cannot get geometry

Disk stable.img: 130 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

    Device Boot Start     End   #cyls    #blocks   Id  System
stable.img1   *      0+    129     130-   1044193+  83  Linux
stable.img2          0       -       0          0    0  Empty
stable.img3          0       -       0          0    0  Empty
stable.img4          0       -       0          0    0  Empty
```

The `0+` in the third column tells us that the partition doesn't start exactly
at the beginning of cylinder 0, which would mean it starts where the MBR is. It
does start in cylinder 0, but on the second head. According to the CHS reported by
`sfdisk`, there are 63 sectors per track, so we just need to skip that many
sectors: 63x512=32256 bytes. Similarly, `130-` in the fifth column means that the
partition does not fill its 130 cylinders completely, which is exactly what
`parted` was complaining about above[^9].
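The arithmetic behind those numbers, as a small sketch using only the geometry that `sfdisk` reported (the variable names are mine):

```bash
bytes=512     # bytes per sector
sectors=63    # sectors per track, as reported by sfdisk
heads=255     # heads per cylinder, as reported by sfdisk

# The partition starts on the second head, so skip one full track.
offset=$(( sectors * bytes ))
# One cylinder holds heads * sectors sectors.
bytes_per_cylinder=$(( heads * sectors * bytes ))

echo "partition offset:   $offset"
echo "bytes per cylinder: $bytes_per_cylinder"
```

This prints 32256 and 8225280; the second value matches the `Units = cylinders of 8225280 bytes` line in the listing.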

To fix the alignment we will have to do it the other way around: instead of
discovering the CHS from the image size, we'll compute the image size from some
desired CHS and a minimum image size. This can be done like this:

```bash
# img_size is expected to hold the minimum image size, in bytes
# bytes per sector
bytes=512
# sectors per track
sectors=63
# heads per cylinder
heads=255
# bytes per cylinder is bytes*sectors*heads
bpc=$(( bytes*sectors*heads ))
# number of whole cylinders that fit in the minimum size
cylinders=$(( img_size/bpc ))
# round the size up to a whole number of cylinders
img_size=$(( (cylinders+1)*bpc ))
qemu-img create -f raw stable.img $img_size
```
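As a worked example of that computation, here is what it gives for a minimum size of 1GiB (a value picked to match the earlier examples; `min_size` is a name I made up so the input is not overwritten):

```bash
bytes=512; sectors=63; heads=255
bpc=$(( bytes * sectors * heads ))

# Hypothetical minimum size: 1GiB, as in the earlier examples.
min_size=$(( 1024 * 1024 * 1024 ))

cylinders=$(( min_size / bpc ))          # whole cylinders that fit
img_size=$(( (cylinders + 1) * bpc ))    # one more, so the result is above min_size

echo "$cylinders whole cylinders fit, the image will be $img_size bytes"
echo "leftover bytes past the last full cylinder: $(( img_size % bpc ))"
```

130 whole cylinders fit in 1GiB, so the image ends up being 131 cylinders, or 1077511680 bytes, with no leftover. Note that this always adds one cylinder, even when the minimum size is already an exact multiple.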

So we will have to somehow tell `mkfs.ext2` about the partition offset inside
the disk image. We can use something that we have been using unknowingly: loopback
devices. Who hasn't mounted an ISO-9660 image in the past? We used something like
this:

```bash
mount -o loop debian-505-i386-CD-1.iso /mnt
```

This is more or less equivalent to:

```bash
losetup /dev/loop0 debian-505-i386-CD-1.iso
mount /dev/loop0 /mnt
```

Good thing is, we can tell `losetup` to simulate the start of the device some
bytes inside the file. And given that everything is a file in Linux, we can even
chain loop devices, such as:

```bash
losetup /dev/loop0 stable.img
losetup -o 32256 /dev/loop1 /dev/loop0
```

Now `/dev/loop0` points to the disk image and `/dev/loop1` points to the
partition. There's really not much option here, so we skip to the formatting
part, which is even more straightforward:

```bash
mkfs.ext2 /dev/loop1
```
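The 32256 passed to `losetup` above is hardcoded from the `sfdisk` listing. A possible way to avoid that, sketched below, is to read the start sector of the first partition straight from the MBR: the partition table begins at byte 446, and each 16-byte entry stores its start sector as a little-endian 32-bit number 8 bytes in. The helper name is hypothetical and it assumes 512-byte sectors.

```bash
# Print the byte offset of the first partition of an MBR-partitioned image.
# Hypothetical helper; assumes 512-byte sectors.
partition_offset() {
    # First table entry starts at byte 446; its start sector is the
    # little-endian 32-bit value at byte 446 + 8 = 454.
    set -- $(od -A n -t u1 -j 454 -N 4 "$1")
    echo $(( ($1 + ($2 << 8) + ($3 << 16) + ($4 << 24)) * 512 ))
}

# Demo on a scratch file standing in for a disk image: an empty sector whose
# first entry says "start at sector 63" (octal \077).
img=$(mktemp)
dd bs=512 count=1 if=/dev/zero of="$img" 2>/dev/null
printf '\077\000\000\000' | dd bs=1 seek=454 of="$img" conv=notrunc 2>/dev/null

offset=$(partition_offset "$img")
echo "$offset"
rm -f "$img"
```

The demo prints 32256. On the real image it could be used as `losetup -o "$(partition_offset stable.img)" /dev/loop1 /dev/loop0`.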

Now to install Debian in this beast. Here we won't be exploring much either, but
I will explain a couple of tricks I learned to complete this task successfully.
The tool of choice is `debootstrap`, which is able to install packages in a
directory as if it were the root partition, so we will need to mount it first:

```bash
mount /dev/loop1 mnt
```

In my case I will need to install several packages besides the base install:

```bash
debootstrap --arch i386 --include=cdbs,debhelper,libsqlite3-dev,\
libssl-dev,libgstreamer-plugins-base0.10-dev,libgmp3-dev,build-essential,\
linux-image-2.6-686,grub-pc stable mnt
```

Notice that the base set of packages includes neither a kernel nor a boot
loader, because these are normally installed by the Debian Installer, so I added
them to the list of packages. But this is not the only thing that the installer
does (and that there is no way to repeat except by hand): it also sets up the
environment, users, apt config (from the sources used to install) and more. We will
have to set those by hand.

Before running anything else, which will run under `chroot`, we will
also need to set up some of the virtual filesystems that are present on a normal
GNU/Linux setup; namely, `/dev`, `/dev/pts` and `/proc`. We will reuse the
host's ones, using the ability to mount one directory on another:

```bash
mount -o bind /dev/ mnt/dev
mkdir -p mnt/dev/pts
mount -o bind /dev/pts mnt/dev/pts
mount -o bind /proc mnt/proc
```

Some minimal config needed includes:

```bash
# apt
echo "deb http://http.us.debian.org/debian stable         main" >  mnt/etc/apt/sources.list
echo "deb http://security.debian.org       stable/updates main" >> mnt/etc/apt/sources.list
# otherwise perl complains during installation that it can't set the locale
# actually we will have to do some little more than just this; see below
echo "en_US.UTF-8 UTF-8" > mnt/etc/locale.gen
# when installing the kernel, if this setting is not present, it thinks the
# bootloader is not able to handle initrd images[^10]
echo "do_initrd = Yes" > mnt/etc/kernel-img.conf
```

So we use this basic config to take the installation a bit further:

```bash
chroot="chroot mnt"
# compile the locales as per /etc/locale.gen
$chroot locale-gen
# download package definitions
$chroot apt-get update
# resolve virtual packages and finish the setup of packages
$chroot apt-get -f -y --force-yes install
# while we're at it, install upgrades
$chroot apt-get -y --force-yes upgrade
```

The last step is to install a bootloader. Here we have several options. `lilo` was
the first Linux bootloader, started in 1992. Even if it can
boot almost any operating system from lots of filesystems, one of its
main drawbacks is its static nature: it reads a config file, compiles the bootloader
and installs it. After that you can't change anything (except for adding more
boot parameters to the kernel), so if you wrote something wrong and your system
does not boot, it's hard to recover. Also, if you change anything in the config file,
you have to compile and install the bootloader again.

The second and third options are the two
flavors of `grub`, the GNU GRand Unified Bootloader. The first iteration of `grub`,
`grub1` or `grub-legacy` as it is called now, is no longer developed or
supported, but a lot of people still use it for its simplicity and power. First
developed in 1999, it has the ability to read the config file at boot time, and
it lets you edit it and read the filesystems before booting. Its successor, `grub2`
or `grub-pc`, is even more modular and flexible, but it takes time to relearn.

Even with these last two options, I couldn't manage to reliably get a booting
image. To be fair, I managed to do it with `grub-pc`, but my script had to work
on a machine that boots with `grub-legacy`. Installing both at the same time is
impossible, and I need to use the host's bootloader because I can't reliably
fake the devices in a `chroot`ed environment, and using a virtual machine was
impossible because the image doesn't boot yet! Talk about chicken and eggs...
For the record, here's how I managed to make it work with `grub-pc`:

```bash
grub-install --root-directory=mnt/ --no-floppy --modules 'part_msdos ext2' /dev/loop0
```

So, I needed to find a bootloader that could be installed on the host machine
without changing the actual bootloader in use. Luckily I talked to a friend and
sysadmin guru, Ignacio Sánchez, who pointed me to `extlinux`, which is part
of the `syslinux` family of bootloaders. This family also includes `isolinux`,
well known for having booted the ISO images of most distributions for years.
I knew about the latter two, and I even used `syslinux` in a company I worked for
two years ago, on a floppy disk (!!!) used to boot the old firewall[^12] and on another
set of diskettes for two diskless thin clients. `extlinux` is
the youngest of the family, and it is able to read and boot from `extX` partitions.
The config file looks like a very simple `lilo.conf`:

```bash
default Hop
timeout 30

label Hop
    kernel /boot/vmlinuz-2.6.28-5
    append initrd=/boot/initrd.img-2.6.28-5 root=UUID=e3447f08-f8b2-4c25-93e4-76420c467384 ro
```

The UUID can be obtained with this command:

```bash
blkid /dev/loop1
```
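Since the whole point is to script this, the config file can be generated instead of written by hand. The function below is a hypothetical helper, not something `extlinux` ships. It assumes the config is named `extlinux.conf` and lives in the directory that `extlinux` gets installed into, and that the kernel and initrd follow the naming used in the example above.

```bash
# Write a minimal extlinux config. The three arguments are placeholders:
# target directory, filesystem UUID and kernel version.
write_extlinux_conf() {
    conf_dir=$1 uuid=$2 version=$3
    mkdir -p "$conf_dir"
    cat > "$conf_dir/extlinux.conf" <<EOF
default Hop
timeout 30

label Hop
    kernel /boot/vmlinuz-$version
    append initrd=/boot/initrd.img-$version root=UUID=$uuid ro
EOF
}

# Demo in a scratch directory with the values from the example above.
# On the real image the UUID would come from something like:
#   blkid -o value -s UUID /dev/loop1
demo=$(mktemp -d)
write_extlinux_conf "$demo/boot/syslinux" e3447f08-f8b2-4c25-93e4-76420c467384 2.6.28-5
cat "$demo/boot/syslinux/extlinux.conf"
```

With the image mounted, the first argument would be `mnt/boot/syslinux`.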

Installing it actually consists of two steps. The first is installing MBR code that
boots from the partition marked as bootable[^11]. The `syslinux` family comes with
such MBR code, so we use it:

```bash
dd if=/usr/lib/extlinux/mbr.bin of=stable.img conv=notrunc
```
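That `dd` has no `count`, so it copies the whole of `mbr.bin` over the start of the image. It only works because the boot code is short enough to stop before the partition table, which starts at byte 446 of the first sector. A defensive sketch that checks the size first could look like this; `install_mbr` is a made-up helper and the demo uses stand-in files, not the real `mbr.bin`.

```bash
# Copy boot code into an image, refusing anything that would reach the
# partition table at byte 446. Hypothetical helper, not part of syslinux.
install_mbr() {
    mbr=$1 image=$2
    size=$(wc -c < "$mbr")
    if [ "$size" -gt 446 ]; then
        echo "boot code is $size bytes, it would overwrite the partition table" >&2
        return 1
    fi
    dd if="$mbr" of="$image" conv=notrunc 2>/dev/null
}

# Demo with stand-ins: 440 bytes of fake boot code and a one-sector image
# that already carries the 0x55 0xaa signature.
work=$(mktemp -d)
dd bs=440 count=1 if=/dev/zero of="$work/fake-mbr.bin" 2>/dev/null
dd bs=512 count=1 if=/dev/zero of="$work/disk.img" 2>/dev/null
printf '\125\252' | dd bs=1 seek=510 of="$work/disk.img" conv=notrunc 2>/dev/null

install_mbr "$work/fake-mbr.bin" "$work/disk.img"
signature=$(od -A n -t x1 -j 510 -N 2 "$work/disk.img" | tr -d ' \n')
echo "signature after install: $signature"
```

The signature is still `55aa` after the copy, and the image is still one full sector long.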

We're almost there. The second step, installing `extlinux` itself, is really straightforward:

```bash
extlinux --heads $heads --sectors $sectors --install mnt/boot/syslinux
```

All that remains is to unmount and tear down the loop devices in reverse order:

```bash
umount mnt/
losetup -d /dev/loop1
losetup -d /dev/loop0
```

Sometimes you need to wait a couple of seconds between these commands, because
they seem to be asynchronous. Otherwise you'll get errors that the device is
still busy, because the previous command has finished, but the async process in
the kernel has not.
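Rather than sprinkling `sleep` calls with guessed durations, one option is a small retry wrapper. This is just a sketch; `retry` is a name I made up and the one second pause is arbitrary.

```bash
# Run a command until it succeeds, at most $1 times, sleeping one second
# between attempts. Hypothetical helper.
retry() {
    attempts=$1
    shift
    while ! "$@"; do
        attempts=$(( attempts - 1 ))
        if [ "$attempts" -le 0 ]; then
            return 1
        fi
        sleep 1
    done
}

# Intended use for the teardown above (needs root, so left commented out):
#   retry 5 umount mnt/
#   retry 5 losetup -d /dev/loop1
#   retry 5 losetup -d /dev/loop0
```

It returns as soon as the command succeeds, and fails if the last attempt still reports an error.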

The image as it is is bootable with `qemu` and `virtualbox`, but if you want to
make it bootable in other, closed virtual machines, you must convert it to `vmdk`.
`qemu-utils` to the rescue again:

```bash
qemu-img convert -O vmdk stable.img stable.vmdk
```

I have lots more things to mention, but this post has gotten long enough as it
is. Mostly they were references to the sites I got info from, but I know that if
I try to clean it up I will put it off for another month or so and probably
forget about it.

[^1]: Currently this needs an image conversion.

[^2]: Of course, I strongly recommend checking the manpages of the mentioned
    tools.

[^3]: Quick, what is the actual size of the file? You can answer with powers of 2
    if it makes it easier for you :)

[^4]: I have the tendency to write as-understandable-as-possible code; that is,
    I know that I'll have to read and try to understand it 6 months after I wrote
    it, so I try to make it as readable as possible. That includes using long
    options when I invoke tools in scripts and, of course, sensible class and
    variable names.

[^5]: Master Boot Record, the first sector in a disk.

[^6]: You will have to read `sfdisk`'s manpage to understand what's all that.

[^7]: `sfdisk` has a neat trick: you can dump the partition table from one disk
    and pipe it to an `sfdisk` affecting another disk, effectively copying the
    partition scheme. It comes in very handy when adding disks to a raid setup.

[^8]: Technically we're marking it as a `MSDOS` type partition table.

[^9]: Notice that it only complains about the end bound, not the beginning bound.

[^10]: One interesting note: even if above I told `debootstrap` to install a
     Linux kernel, it actually hasn't. The package `linux-image-2.6-686` is a
     virtual one, and `debootstrap` seems not to resolve these, but it
     doesn't complain either.

[^11]: See that `Boot` column in the output of `sfdisk` at the beginning of the
     post? And the `*` in the first and only partition? That shows it as bootable.
     This is an old relic from the times when operating systems relied on a dumb
     MBR code to boot. And now we're using exactly that to load a bootloader.

[^12]: **Really** old; we're talking about a Cyrix 486DX2 at 50MHz with 16MB of
       RAM and 4 NICs, all of them ISA, two of them still sporting 10Base2
       connectors and configurable via jumpers. We really didn't need anything
       bigger since the ADSL line was merely 2.5Mib/s.

