Create a disk image with a booting running Debian

Marcos Dione

2010-11-26 12:10

Recently I had to do something that sounds very simple to any good SysAdmin: create a disk image with a booting Debian installation, from a script, with no human interaction. The idea is to later install our software on it. Those who want to test our software would just need to download the image and boot it in any virtual machine they have: qemu, virtualbox¹, you name it.

So the process could be thought of like this: create a disk image, partition it, install Debian, install a bootloader, profit! Let's try to tackle the steps separately, looking at different approaches²:

A disk image is simply a file big enough: 1GiB, 10GiB, whatever you want. A string of 1Gi of 0s should be enough:

dd bs=$((1024*1024)) count=1024 if=/dev/zero of=stable.img

Now, that file is using 1GiB of space, but we're not sure we're going to use it all, so it's kind of a waste of space. Luckily, Linux is able to handle sparse files: files that do not reserve all the file system blocks that would normally be needed, only those where data is written. So for instance, a way to create a 1GiB (almost) empty sparse file is this:

dd bs=1 count=1 seek=$((1024*1024*1024)) if=/dev/zero of=stable.img

That is, we write a 0 at the end of a 1GiB file³, but even if the file is so big, it's actually using one file system block (4096 bytes, according to dumpe2fs).
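To see the difference between the apparent size of a file and the space actually allocated for it, something like the following sketch can help. It assumes GNU coreutils (`truncate`, `stat -c`) and works on a scratch file whose name, `sparse-demo.img`, is made up for the example; `truncate` is only a shortcut here, not the method used in the rest of this post.

```shell
#!/bin/sh
# Sketch: compare the apparent size of a sparse file with its allocated space.
# Assumes GNU coreutils; sparse-demo.img is a throwaway name for illustration.
workdir=$(mktemp -d)
img="$workdir/sparse-demo.img"

# create an (almost) empty 1GiB file without writing any data blocks
truncate -s 1G "$img"

# apparent size in bytes, the one ls -l reports
apparent=$(stat -c %s "$img")
# bytes actually allocated: number of blocks times the size of each block
allocated=$(( $(stat -c %b "$img") * $(stat -c %B "$img") ))

echo "apparent size: $apparent bytes"
echo "allocated:     $allocated bytes"

rm -r "$workdir"
```

On a filesystem that supports sparse files the second number should be far smaller than the first.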

A simpler, or maybe more-intuitive-when-you-read-it⁴ alternative is to use a tool that comes in qemu-utils:

qemu-img create -f raw stable.img 1G

That was easy. Now, how do we partition it? The first answer is obvious, namely fdisk, but it is not scriptable. So we look for alternatives, and one that comes to mind is parted: it is designed with scriptability in mind, it should be perfect!

Almost. parted needs a partition table signature in the MBR⁵, and I found no way to make it create one. This is at least surprising, but a little more (ab)use of dd can save the day. It's just a matter of writing the bytes 0x55 0xaa⁸ in the last two bytes of the first sector of the image. A disk sector has, until recently, been just 512 bytes long, so:

echo -e "\x55\xaa" | dd bs=1 count=2 seek=510 of=stable.img conv=notrunc
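Reading the two bytes back is a cheap way to confirm that the signature landed at offset 510 and that the image kept its size. The sketch below works on a scratch file (the name `sig-demo.img` is invented for the example) and uses `printf` with octal escapes instead of `echo -e`, on the assumption that the script might run under a shell whose `echo` does not understand `\x`.

```shell
#!/bin/sh
# Sketch: write the 0x55 0xaa signature and read it back with od.
# sig-demo.img is a scratch file created only for this example.
workdir=$(mktemp -d)
img="$workdir/sig-demo.img"

# a small empty "disk": 2048 sectors of 512 bytes
dd if=/dev/zero of="$img" bs=512 count=2048 2>/dev/null

# octal 125 is 0x55 and octal 252 is 0xaa
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null

# dump 2 bytes starting at offset 510, as hex, without the address column
signature=$(od -A n -t x1 -j 510 -N 2 "$img" | tr -d ' \n')
# the size must be unchanged, thanks to conv=notrunc
size=$(wc -c < "$img")

echo "signature: $signature"
echo "image size: $size bytes"

rm -r "$workdir"
```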

The notrunc is there so dd doesn't truncate the image to be 512 bytes long (it took me a while to figure that out). Now to parted:

parted -s stable.img mkpart primary ext2 0 1G
Warning: The resulting partition is not properly aligned for best performance.

But this will raise a problem that I'll mention later, at its proper time. So instead, and given that we're gonna install another package anyways, we're gonna use sfdisk⁶⁷:

sfdisk -D stable.img <<EOF
,,L,*
;
;
;
EOF

The next step is to format the partition inside the image. If this were a partition image we could simply apply mkfs.ext2 (or whatever filesystem type you want) to the file, because the filesystem would start from the beginning of the file. But as this is a disk image, the partition starts at an offset from the beginning:

sfdisk --list stable.img
Disk stable.img: cannot get geometry

Disk stable.img: 130 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

    Device Boot Start     End   #cyls    #blocks   Id  System
stable.img1   *      0+    129     130-   1044193+  83  Linux
stable.img2          0       -       0          0    0  Empty
stable.img3          0       -       0          0    0  Empty
stable.img4          0       -       0          0    0  Empty

The 0+ in the third column tells us that the partition doesn't start exactly at the beginning of cylinder 0; that would mean it starts where the MBR is. It does start in cylinder 0, but at the second head. According to the CHS geometry reported by sfdisk, there are 63 sectors per track, so we just need to skip that many sectors: 63x512=32256 bytes. Likewise, the 130- in the fifth column means that the partition is slightly shorter than 130 whole cylinders, that is, it does not end on a cylinder boundary, which is exactly what parted was complaining about above⁹.
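The same offset can also be read straight from the partition table instead of being derived from the CHS geometry. In an MSDOS partition table the first entry starts at byte 446 of the MBR, and the 4 bytes at offset 8 within the entry hold the first sector of the partition, in little-endian order. The sketch below assumes a little-endian host (so that `od -t u4` decodes the field correctly) and 512-byte sectors; `partition_offset` is a helper name made up for this example.

```shell
#!/bin/sh
# Sketch: byte offset of the first partition, read from the MBR.
# Assumes an MSDOS partition table, 512-byte sectors and a little-endian host.
partition_offset () {
    # the first entry starts at byte 446; its start sector field is 8 bytes in
    start_sector=$(od -A n -t u4 -j 454 -N 4 "$1" | tr -d ' \n')
    echo $(( start_sector * 512 ))
}

# usage, once the image has been partitioned:
#   partition_offset stable.img
```

If the partition really starts at sector 63, as the sfdisk output suggests, this should print 32256.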

To fix the alignment we will have to do it the other way around: instead of discovering the CHS from the image size, we'll compute the image size from some desired CHS and a minimum image size. This can be done as such:

# minimum image size wanted, in bytes (1GiB here)
img_size=$((1024*1024*1024))
# bytes per sector
bytes=512
# sectors per track
sectors=63
# heads per cylinder
heads=255
# bytes per cylinder is bytes*sectors*heads
bpc=$(( bytes*sectors*heads ))
# number of whole cylinders that fit in the minimum size
cylinders=$(( img_size/bpc ))
# round the size up to a whole number of cylinders
img_size=$(( (cylinders+1)*bpc ))
qemu-img create -f raw stable.img $img_size
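Wrapped as a function, the same computation can be checked with a couple of sizes. This is only a sketch: `align_to_cylinder` is a name made up here, and the geometry is the 255 heads, 63 sectors per track one reported by sfdisk above.

```shell
#!/bin/sh
# Sketch: round a minimum size in bytes up to a whole number of cylinders.
# Geometry assumed: 512 bytes/sector, 63 sectors/track, 255 heads/cylinder.
align_to_cylinder () {
    bpc=$(( 512 * 63 * 255 ))
    # always adds one extra cylinder, as the snippet above does
    echo $(( ($1 / bpc + 1) * bpc ))
}

# 1GiB holds 130 whole cylinders, so the result is 131 cylinders
align_to_cylinder $(( 1024 * 1024 * 1024 ))
```

For 1GiB this prints 1077511680, that is, 131 cylinders of 8225280 bytes each.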

So we will have to somehow tell mkfs.ext2 about the partition offset inside the disk image. We can use something that we have been using unknowingly: loopback devices. Who hasn't mounted an ISO-9660 image in the past? We used something like this:

mount -o loop debian-505-i386-CD-1.iso /mnt

This is more or less equivalent to:

losetup /dev/loop0 debian-505-i386-CD-1.iso
mount /dev/loop0 /mnt

Good thing is, we can tell losetup to simulate the start of the device some bytes inside the file. And given that everything is a file in Linux, we can even chain loop devices, such as:

losetup /dev/loop0 stable.img
losetup -o 32256 /dev/loop1 /dev/loop0

Now /dev/loop0 points to the disk image and /dev/loop1 points to the partition. There aren't really many options here, so we skip to the formatting part, which is even more straightforward:

mkfs.ext2 /dev/loop1

Now to install Debian on this beast. Here we won't be exploring much either, but I will explain a couple of tricks I learned to complete this task successfully. The tool of choice is debootstrap, which is able to install packages in a directory as if it were the root partition, so we will need to mount the partition first:

mount /dev/loop1 mnt

In my case I will need to install several packages besides the base install:

debootstrap --arch i386 --include=cdbs,debhelper,libsqlite3-dev,\
libssl-dev,libgstreamer-plugins-base0.10-dev,libgmp3-dev,build-essential,\
linux-image-2.6-686,grub-pc stable mnt

Notice that the base set of packages includes neither a kernel nor a boot loader, because these are normally installed by Debian Installer, so I added them to the list of packages. But this is not the only thing that the installer does (and that there is no way to repeat other than by hand): it also sets up the environment, users, apt config (from the sources used to install) and more. We will have to set those up by hand.

Before running anything else, which will run under chroot, we will also need to set up some of the virtual filesystems that are present on a normal GNU/Linux setup; namely, /dev, /dev/pts and /proc. We will reuse the host's ones, using the ability to mount one directory on another:

mount -o bind /dev/ mnt/dev
mkdir -p mnt/dev/pts
mount -o bind /dev/pts mnt/dev/pts
mount -o bind /proc mnt/proc

Some minimal config needed includes:

# apt
echo "deb http://http.us.debian.org/debian stable         main" >  mnt/etc/apt/sources.list
echo "deb http://security.debian.org       stable/updates main" >> mnt/etc/apt/sources.list
# otherwise perl complains during installation that it can't set the locale
# actually we will have to do some little more than just this; see below
echo "en_US.UTF-8 UTF-8" > mnt/etc/locale.gen
# when installing the kernel, if this setting is not present, it thinks the
# bootloader is not able to handle initrd images¹⁰
echo "do_initrd = Yes" > mnt/etc/kernel-img.conf

So we use this basic config to take the installation a bit further:

chroot="chroot mnt"
# compile the locales as per /etc/locale.gen
$chroot locale-gen
# download package definitions
$chroot apt-get update
# resolve virtual packages and finish the setup of packages
$chroot apt-get -f -y --force-yes install
# while we're at it, install upgrades
$chroot apt-get -y --force-yes upgrade

The last step is to install a bootloader. Here we have several options. lilo was the first Linux bootloader, which was started in 1992. Even if it can boot almost any operating system from lots of filesystems, one of its main drawbacks is its static nature: it reads a config file, compiles the bootloader and installs it. After that you can't change anything (except for adding more boot parameters to the kernel), so if you wrote something wrong and your system does not boot, it's hard to recover. Also, if you change anything in the config file, you have to compile and install the bootloader again.

The second and third options are the two flavors of grub, the GNU GRand Unified Bootloader. The first iteration of grub, grub1 or grub-legacy as it is called now, is no longer under development or supported, but a lot of people still use it for its simplicity and power. First developed in 1999, it has the ability to read the config file at boot time, and it lets you edit it and read the filesystems before booting. Its successor, grub2 or grub-pc, is even more modular and flexible, but it takes time to relearn.

Even with these last two options, I couldn't manage to reliably get a booting image. To be fair, I managed to do it with grub-pc, but my script had to work on a machine that boots with grub-legacy. Installing both at the same time is impossible, and I need to use the host's bootloader because I can't reliably fake the devices in a chrooted environment, and using a virtual machine was impossible because the image doesn't boot yet! Talk about chicken and eggs... For the record, here's how I managed to make it work with grub-pc:

grub-install --root-directory=mnt/ --no-floppy --modules 'part_msdos ext2' /dev/loop0

So, I needed to find a bootloader that could be installed from the host machine without changing the actual bootloader in use. Luckily I talked to a friend and sysadmin/guru, Ignacio Sánchez, who pointed me to extlinux, which is part of the syslinux family of bootloaders. This family also includes isolinux, famously known for booting the iso images of most distributions for years. I knew about the latter two, and I even used syslinux on a floppy disk (!!!) at a company I worked for two years ago, to boot the old firewall¹², plus another set of diskettes for two diskless thin clients. extlinux is the youngest of the family, and it is able to read and boot from extX partitions. The config file looks like a very simple lilo.conf:

default Hop
timeout 30

label Hop
    kernel /boot/vmlinuz-2.6.28-5
    append initrd=/boot/initrd.img-2.6.28-5 root=UUID=e3447f08-f8b2-4c25-93e4-76420c467384 ro

The UUID can be obtained with this command:

blkid /dev/loop1
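Since the UUID changes every time the filesystem is created, the config file is probably better generated than written by hand. A possible sketch follows; `write_extlinux_conf` and its three arguments are invented for this example, and the `-s UUID -o value` options shown in the usage comment make blkid print only the UUID.

```shell
#!/bin/sh
# Sketch: generate extlinux.conf for a given kernel version and root UUID.
# write_extlinux_conf is a hypothetical helper, not part of extlinux.
write_extlinux_conf () {
    dir=$1 version=$2 uuid=$3
    mkdir -p "$dir"
    cat > "$dir/extlinux.conf" <<EOF
default Hop
timeout 30

label Hop
    kernel /boot/vmlinuz-$version
    append initrd=/boot/initrd.img-$version root=UUID=$uuid ro
EOF
}

# usage (as root, with the partition attached to /dev/loop1 and mounted):
#   uuid=$(blkid -s UUID -o value /dev/loop1)
#   write_extlinux_conf mnt/boot/syslinux 2.6.28-5 "$uuid"
```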

Installing it actually consists of two steps. The first is installing MBR code that boots from the partition marked as bootable¹¹. The syslinux family comes with such MBR code, so we use it:

dd if=/usr/lib/extlinux/mbr.bin of=stable.img conv=notrunc

We're almost there. Installing extlinux is really straightforward:

extlinux --heads $heads --sectors $sectors --install mnt/boot/syslinux

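All that remains is to unmount everything, including the bind mounts set up earlier, and dismantle the loop devices in reverse order:

umount mnt/proc
umount mnt/dev/pts
umount mnt/dev
umount mnt/
losetup -d /dev/loop1
losetup -d /dev/loop0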

Sometimes you need to wait a couple of seconds between these commands, because they seem to be asynchronous. Otherwise you'll get errors that the device is still busy, because the previous command has finished, but the async process in the kernel has not.
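Rather than sprinkling sleep calls between the commands, a small retry helper can absorb those delays. The following is a sketch; `retry` is a made-up name, and the number of attempts and the one-second pause are arbitrary choices.

```shell
#!/bin/sh
# Sketch: run a command up to N times, pausing one second between attempts.
retry () {
    attempts=$1
    shift
    while ! "$@"; do
        attempts=$(( attempts - 1 ))
        # give up once all the attempts have been used
        [ "$attempts" -gt 0 ] || return 1
        sleep 1
    done
}

# usage (as root):
#   retry 5 umount mnt/
#   retry 5 losetup -d /dev/loop1
#   retry 5 losetup -d /dev/loop0
```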

The image as it is is bootable with qemu and virtualbox, but if you want to make it bootable in other, closed virtual machines, you must convert it to vmdk. qemu-utils to the rescue again:

qemu-img convert -O vmdk stable.img stable.vmdk

I have lots more things to mention, but this post has got long enough as it is. Mostly they were references to the sites I got info from, but I know that if I try to clean it up I will procrastinate for another month or so and probably forget about it.

¹ Currently this needs an image conversion.
² Of course, I strongly recommend checking the manpages of the mentioned tools.
³ Quick, what is the actual size of the file? You can answer with powers of 2 if it makes it easier for you :)
⁴ I have the tendency to write as-understandable-as-possible code; that means, I know that I'll have to read and try to understand it 6 months after I wrote it, so I try to make it as readable as possible. That includes using long options when I invoke tools in scripts and, of course, sensible class and variable names.
⁵ Master Boot Record, the first sector in a disk.
⁶ You will have to read sfdisk's manpage to understand what all that means.
⁷ sfdisk has a neat trick: you can dump the partition table from one disk and pipe it to a sfdisk affecting another disk, actually copying the partition scheme. It comes in very handy when adding disks to a raid setup.
⁸ Technically we're marking it as an MSDOS-type partition table.
⁹ Notice that it only complains about the end bound, not the beginning bound.
¹⁰ One interesting note: even if above I told debootstrap to install a Linux kernel, it actually hasn't. The package linux-image-2.6-686 is a virtual one, and debootstrap seems not to resolve these, but it doesn't complain either.
¹¹ See that Boot column in the output of sfdisk at the beginning of the post? And the * in the first and only partition? That shows it as bootable. This is an old relic from the times when operating systems relied on a dumb MBR code to boot. And now we're using exactly that to load a bootloader.
¹² Really old; we're talking about a Cyrix 486DX2 at 50MHz with 16MB of RAM, 4 NICs, all of them ISA, two of them still donning 10Base2 connectors and configurable via jumpers. We really didn't need anything bigger since the ADSL line was merely 2.5Mib/s.