Porting big codebases to Android

For the last year and a half I worked in research. Among other things, the position involved porting a programming language (two, actually, Hop and Bigloo) to the Android platform. I had already written something about it, but this time I want to show my high-level impressions of the platform. What follows is part of a report I wrote at the end of that job, which includes the things I wanted to say.

The Android port can be viewed as four separate subtasks. Hop is software developed in the Scheme language; more particularly, it must be compiled with the Bigloo Scheme compiler, which in turn uses gcc for the final compilation. That means that we needed to port Bigloo to the platform first, not because we were planning to use the compiler on the platform, but because we needed the Bigloo runtime libraries ported to Android, as Hop and any other program compiled with Bigloo use them. The other three subtasks, which are discussed later, are porting Hop itself; developing libraries to access devices and other features present in the platform; and, for reasons we'll see later, making the port work with threads.

When we started to investigate how to port native code to the platform we found that there wasn't much support. At first the only documentation we could find was blog posts by people trying to do it by hand. They were using the compiler provided in Android's source code to compile static binaries that could be run on the platform. Because Bigloo uses dynamic libraries to implement platform-dependent code and modules, we aimed to find a way to compile things dynamically. After 3 or 4 weeks we found a wrapper written in Ruby that managed all the details of calling gcc with the proper arguments. With this we should be able to port anything that uses gcc as the compiler, just like Bigloo does. At the same time, the first version of Android's NDK (Native Development Kit) appeared, but it wasn't easy to integrate into our build system.

(Note: Actually I think most of the problems we faced doing this port stem from this. The NDK forces you to write a new set of Makefiles, but our hand-made build system and build hierarchy made that a big effort. It also means maintaining a parallel build system, while it should not be so crazy to expect a cleaner way to integrate the toolchain into an existing build system, not only a hand-made one like ours, but also the most common ones, like autotools, cmake, etc.)

Even with the proper compiler, we found several obstacles related to the platform itself. First of all, Bigloo relies heavily on the C and pthread libraries to implement low-level functionality. Bigloo can use either glibc, GNU's implementation, or µlibc, an implementation aimed at embedded applications. Bigloo also relies on Boehm's Garbage Collector (GC) for its memory management. The C library implementation in Android is neither glibc nor µlibc, but an implementation developed by Google for the platform, called Bionic. This version of the C library is tailored to the platform's needs, with little to no regard for native application development.

The first problem we found is that GC compiled fine with Bionic, but the applications that used GC did not link: there was a missing symbol that is normally defined in the libc, but that Bionic did not define. We tried cooperating with the GC developers, and we tried inspecting a Mono port to Android, given that this project also uses GC, hoping to find a solution that could be useful for everyone, but in the end we just patched our sources to fake the symbol with a value that remotely made sense.

We also found that Bionic's implementation of pthreads is not only incomplete, but also has some glitches. For instance, in our build system, we test the existence of a function like everybody else: we compile a small test program which uses it. With this method we found at least one function that is declared but never defined. That means that Bionic's headers declare that the function exists, but the library never implements it. Another example is a function that is both declared and defined, while the constants normally used when calling it are not defined at all.
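
(A minimal sketch of that kind of check, written in Python rather than taken from our build system. The compiler, header and function are whatever the caller passes in; `pthread_cancel` shows up below only as a placeholder, not as the function we actually tripped over. Compiling and linking in two separate steps is what tells "declared but never defined" apart from "not there at all".)

```python
import os
import subprocess
import tempfile


def probe_source(header, func):
    """Return a C program that forces a link-time reference to func."""
    return (
        f"#include <{header}>\n"
        "typedef void (*fn_t)(void);\n"
        "int main(void) {\n"
        "    /* volatile keeps the compiler from dropping the reference */\n"
        f"    fn_t volatile p = (fn_t) {func};\n"
        "    return p == 0;\n"
        "}\n"
    )


def classify(cc, header, func, flags=()):
    """Return 'undeclared', 'declared-only' or 'ok' for func under compiler cc."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        obj = os.path.join(tmp, "probe.o")
        exe = os.path.join(tmp, "probe")
        with open(src, "w") as handle:
            handle.write(probe_source(header, func))
        # Step 1: compile only. Fails when the header is missing
        # or does not declare func.
        compiled = subprocess.run([cc, *flags, "-c", src, "-o", obj],
                                  capture_output=True)
        if compiled.returncode != 0:
            return "undeclared"
        # Step 2: link. Fails when no library actually defines func.
        linked = subprocess.run([cc, *flags, obj, "-o", exe],
                                capture_output=True)
        if linked.returncode != 0:
            return "declared-only"
        return "ok"
```

With a cross compiler on the PATH, a call would look like `classify("some-cross-gcc", "pthread.h", "pthread_cancel")`, where the compiler name is hypothetical.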

Also, because most of the tests must also be executed to report the peculiarities of each implementation, we had to change our build system to be able to execute the produced binaries in the Android emulator.

Google also decided to implement their own set of tools, again trimmed down to the needs of the platform, instead of using old and proven ones, like Busybox. This means that some tools behave differently, with no documentation about it, so we mostly had to work around these differences every time a new one appeared.

All in all, we spent two and a half months just getting Bigloo to run on Android, setting aside one problem: Boehm's GC, which uses its own build system, detected that the compiler declared that it did not support threads, and refused to compile with threads enabled. This meant that Bigloo itself could not be compiled with pthreads support.

With this caveat in mind, we tackled the second subtask, porting Hop itself. This still raised problems with the peculiarities of the platform. We quickly found that the dynamic linker wasn't honoring the LD_LIBRARY_PATH environment variable, which we were trying to use to tell the system where to find the dynamic libraries.

The Android platform installs new software using a package manager. The package manager creates a directory in the SD card that is only writable by the application being installed. Within this directory the installer puts the libraries declared in the package. Bigloo, besides the dynamic libraries, requires some additional files that initialize the global objects. These files are not extracted by the installer, so we had to write a frontend in Java that opens the package and extracts them by hand. But the installer creates the directory for the libraries in such a way that the application cannot write to it later.

Also, we found that the dynamic linker works for libraries linked at runtime, but not for libraries loaded with dlopen(), so we also had to rewrite a great part of our build system for both Bigloo and Hop to produce static libraries and binaries. This also required disabling the dynamic loading of libraries and, with it, their automatic initialization, so we had to initialize them by hand.

To add more unexpected work, the Android package builder, provided with the SDK, ignores hidden files, which Bigloo uses to map Scheme module names to dynamic libraries. We had to work around this feature in the unpacking algorithm.
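
(A sketch of the kind of workaround this calls for, not our actual Java code. Since an Android package is a zip archive, one option is to store each hidden file under a visible name at packaging time and restore the leading dot while unpacking. The `dot-` prefix, the directory prefix and the file names are all made up for illustration.)

```python
import io
import os
import posixpath
import zipfile

HIDDEN_PREFIX = "dot-"  # hypothetical marker added at packaging time


def restore_name(entry):
    """Turn 'dir/dot-name' back into 'dir/.name'; leave other names alone."""
    head, base = posixpath.split(entry)
    if base.startswith(HIDDEN_PREFIX):
        base = "." + base[len(HIDDEN_PREFIX):]
    return posixpath.join(head, base)


def unpack(archive, prefix, dest):
    """Extract the entries stored under prefix into dest.

    archive is a path or a file-like object holding the zip data.
    Returns the relative names that were written, hidden names restored.
    """
    written = []
    with zipfile.ZipFile(archive) as package:
        for info in package.infolist():
            if info.is_dir() or not info.filename.startswith(prefix):
                continue
            relative = restore_name(info.filename[len(prefix):])
            target = os.path.join(dest, *relative.split("/"))
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with package.open(info) as source, open(target, "wb") as out:
                out.write(source.read())
            written.append(relative)
    return written
```

A real unpacker would also want to reject entries whose names contain `..`, which this sketch does not do.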

Then we moved on to improving the friendliness of the frontend. So far, we could install Hop on the platform, either on a phone or in the emulator, but we could only run it in the emulator, because we were using a shell that runs as root in the emulator, but as an ordinary user on a real device. This user, for the reasons given above, cannot even get into Hop's install dir. Even though Android has a component interface that allows applications to use components from other apps, none of the terminal apps we found at that time declared the terminal itself as a reusable component. We decided to use the code from the most popular one, which was based on a demo available in Android's source code, but not installed on actual devices. We had to copy the source code and trim it down to our needs.

Having a more or less usable Hop package for Android, we decided to try and fix the issue we mentioned before: GC didn't compile with threads enabled. This meant that we couldn't use the pthreads library, which is very useful for Hop. Hop uses threads to serve several requests at the same time. Bigloo implements two thread APIs, one based on pthreads and another which implements fair threads. Hop is able to use 5 different request schedulers, but works better with the one based on pthreads.

For these reasons we decided to focus on getting GC to use threads on the Android platform. GC's build system tests for the existence of a threading platform by checking the thread model declared by gcc. The gcc version provided with Android's SDK declares a 'single thread model', but we couldn't find out what this means in terms of the code produced by gcc or how it could affect GC's execution.
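
(For reference, gcc reports this in a `Thread model:` line of its `-v` output. A small sketch of reading it, assuming only that line format; the function names are mine and this is not GC's actual configure code.)

```python
import re
import subprocess


def parse_thread_model(verbose_output):
    """Extract the value of the 'Thread model:' line, or None if absent."""
    match = re.search(r"^Thread model:\s*(\S+)", verbose_output, re.MULTILINE)
    return match.group(1) if match else None


def thread_model(cc="gcc"):
    """Ask the given compiler for its thread model."""
    # gcc writes the -v banner to stderr; stdout is included just in case.
    result = subprocess.run([cc, "-v"], capture_output=True, text=True)
    return parse_thread_model(result.stderr + result.stdout)
```

On a typical desktop Linux gcc this gives `posix`; the toolchain we were using gave `single`.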

(Note: we didn't manage to make GC compile with threads.)

With a threadless Hop running, we had to add code to the server so that it could talk to the frontend while, at the same time, serving the requests from a web client. After several attempts to attack this problem, we decided that the best solution was to make this interface another service served by Hop. This meant fewer modifications to Hop itself, but a bigger one to the frontend we already had.

During these changes we found a problem with JNI. The terminal component we imported into our code uses a small C library for executing the application inside it (normally a shell, in the original code, but Hop in our case), which is accessed from Java using JNI. The original Term application exported this class as com.android.term.Exec, but our copy exported it as fr.inria.hop.Exec. Even with this namespace difference JNI got confused and tried to use the Exec class from the original Term app. This is just another example of how hard the platform is to work with. We found that the community support is more centered around Java and that very few people know about JNI, the NDK or any other native-related technologies. We couldn't find an answer to this problem, so we worked around it by renaming the class.
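
(For context: when native methods are resolved by name, JNI derives the C symbol from the fully qualified class name and the method name. Below is a sketch of that rule, ignoring overloaded methods and characters outside the Basic Multilingual Plane. The method name `createSubprocess` is used only as an illustration. By this rule the two classes map to different symbols, so the naming rule alone does not explain the confusion.)

```python
def mangle(part):
    """Escape one component of a JNI native method symbol."""
    out = []
    for ch in part:
        if ch in "./":
            out.append("_")        # package separator
        elif ch == "_":
            out.append("_1")       # literal underscore
        elif ch == ";":
            out.append("_2")
        elif ch == "[":
            out.append("_3")
        elif ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            out.append("_0%04x" % ord(ch))  # any other character, as hex
    return "".join(out)


def jni_symbol(class_name, method_name):
    """Build the short (non-overloaded) symbol name for a native method."""
    return "Java_" + mangle(class_name) + "_" + mangle(method_name)


original = jni_symbol("com.android.term.Exec", "createSubprocess")
renamed = jni_symbol("fr.inria.hop.Exec", "createSubprocess")
```

Here `original` comes out as `Java_com_android_term_Exec_createSubprocess` and `renamed` as `Java_fr_inria_hop_Exec_createSubprocess`.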

So that's it. I can provide the technical details for most of the assertions I made above, but that would make this post unreadably long. If you have any questions about them, just contact me.