Should I declare defeat on the research topic of API migration?

Of course, I won't, but perhaps I should! Then, I could turn to lower-hanging fruits in research, which I first need to spot, which I can't though because I am a bit obsessed with API migration (and admittedly some other stuff such as megamodeling). Sigh!

It was around 2004 that I became interested in API migration and I have talked about it here and there ever since. Perhaps I am thinking that talking about a difficult problem of interest helps in discovering the solution of the problem, or at least a sensible path to go. Wishful thinking so far!

In theory, the objective of API migration made a lot of sense while I was on the XML team at Microsoft because there are obviously way too many XML APIs. In practice, nothing happened on this front because I didn't understand automated API migration well enough back then. Add to this that API migration is something that is potentially risky for the API provider and the API migrator. So you need to mash up a rocket scientist and top politician to succeed. I am not yet there.

Back in academia, it took until like 2009 that we had a useful and publishable effort on API migration (see the SLE 2009 paper); just a year later another one (see the ICSM 2010 paper). I kept on working with Thiago in 2010-2012, but our efforts on language support for wrapper- and transformation-based migration hit sort of a brick wall. At least, for now, we take some rest. We have submitted another API migration paper, it's about an advanced technique for automated testing in wrapper development. This research is also backed up by additional wrapper studies.

So we haven't failed, by no means, but we are depressingly just at the stage of wrapper designers and engineers: we understand how to design wrappers (using patterns, for example), how to test wrappers (on the grounds of sophisticated test-data generation and contract assertions), what API differences to expect, how to spot them, and how to respond to them. We would like to be at the stage of language-based API migrators.

What am I supposed to do when a research effort hasn't made the progress that I expected years back when I was too naive? Rather than bailing out, I am going to do two things: a) I am going to compile a talk that deeply analyses what I have learned and what I think could/should be done; b) I am going to compile a funding application so that focused research efforts can target the interesting topic of API migration.

As to the talk, I am looking forward a visit of Suraj C. Kothari at Iowa State University in Ames next week, and here is the plan for this talk. (The trip to Ames is a trip during the trip because I am going to Ames during a trip to Omaha. From a recursion-theoretic point of view, I am obviously interested in carrying out a trip during the trip during the trip. This is certainly a good exercise in trying to understand the difference between left- and right-associativity.)


Title: API migration is a hard problem!

Slides: [.pdf]

Abstract: API migration refers to software migration in the sense of software reengineering: the objective is to eliminate an application's dependencies on a given API and make it depend instead on another API. Hence, we may speak of original API versus replacement API. In principle, migration can be achieved by a wrapping approach (such that the original API is re-implemented in terms of the replacement API so that the original implementation becomes obsolete and the application itself does not need to be changed) or by a transformation approach (such that the code of the application is rewritten so that the references to the original API are replaced by references to the replacement API). A degenerated case of API migration would be API upgrade or downgrade where the two APIs are essentially versions of each other with an effective relationship between the versions such that the wrapper or the transformation for migration can be derived from a suitably recorded, inferred, or specified relationship. The synthesis of a transformation or a wrapper is considerably more involved when the APIs at hand do not relate in such an "obvious" manner, i.e., when they have been developed more or less independently. The two APIs still serve the same domain (e.g., GUI or XML programming), but they differ in terms of features, protocols, contracts, type hierarchy, and other aspects. In this talk, I provide insight into such differences and explain existing, often primitive (laborious) migration techniques, which are mostly focused on wrapping. I use a number of case studies for empirical substantiation. I conclude with an outlook in terms of the challenges ahead with indications as to the techniques and methods to be used or developed. Program analysis must provide the heavy lifting to make progress on the hard problem of API migration.

Acknowledgement: This is joint work with (in alphabetic order) Thiago Tonelli Bartolomei (University of Waterloo, Canada), Krzysztof Czarnecki (University of Waterloo, Canada), Tijs van der Storm (CWI, Amsterdam, The Netherlands). I also acknowledge joint work within the Software Languages Team on the related subject of API (usage) analysis; special thanks are due to Ekaterina Pek.



  1. This comment has been removed by the author.

  2. Well, most of the people with whom I had any conversation about API migration outside the circle of you, Tijs, Krzysztof and Thiago, readily admitted that it is a fascinating topic, but also just as quickly became disinterested and bored when they learned that the currently available research attempts are about wrappers and not "real" transformers.

  3. The issues underlying these wrappers are by no means boring or trivial or non-scientific as far as I can tell, but I can understand that scientists often like to push for a different kind of abstraction level. So we are the plumbers trying to get this topic off the ground.

  4. Hi Ralf - Huiqing Li and I have been working on this in the context of Wrangler, our tool for refactoring Erlang programs.

    The idea is this: if API A is replaced by B then for "old" code to use B, we can write an adapter module M, implementing A in terms of B. I guess this is what you call a wrapper. What we then do is generate a set of transformations from M that factor the adaptation code into client modules, so that these modules then use the new interface directly.

    Of course, the new code can be ugly, but in many cases the adapter function completely disappears from the module. We've tried it out with the regular expressions modules in the Erlang library, where substantially change,d including indexing string positions from a different point.