3/14/2012

Should I declare defeat on the research topic of API migration?

Of course, I won't, but perhaps I should! Then I could turn to lower-hanging fruit in research, which I would first need to spot, which I can't, though, because I am a bit obsessed with API migration (and admittedly some other stuff such as megamodeling). Sigh!

It was around 2004 that I became interested in API migration, and I have talked about it here and there ever since. Perhaps I am thinking that talking about a difficult problem of interest helps in discovering the solution to the problem, or at least a sensible path to take. Wishful thinking so far!

In theory, the objective of API migration made a lot of sense while I was on the XML team at Microsoft because there are obviously way too many XML APIs. In practice, nothing happened on this front because I didn't understand automated API migration well enough back then. Add to this that API migration is potentially risky for both the API provider and the API migrator. So you need to mash up a rocket scientist and a top politician to succeed. I am not there yet.

Back in academia, it took until 2009 before we had a useful and publishable effort on API migration (see the SLE 2009 paper); just a year later came another one (see the ICSM 2010 paper). I kept on working with Thiago in 2010-2012, but our efforts on language support for wrapper- and transformation-based migration hit sort of a brick wall. At least for now, we are taking some rest. We have submitted another API migration paper; it is about an advanced technique for automated testing in wrapper development. This research is also backed up by additional wrapper studies.

So we haven't failed, not by any means, but we are depressingly just at the stage of wrapper designers and engineers: we understand how to design wrappers (using patterns, for example), how to test wrappers (on the grounds of sophisticated test-data generation and contract assertions), what API differences to expect, how to spot them, and how to respond to them. We would like to be at the stage of language-based API migrators.

What am I supposed to do when a research effort hasn't made the progress that I expected years back when I was too naive? Rather than bailing out, I am going to do two things: a) I am going to compile a talk that deeply analyses what I have learned and what I think could/should be done; b) I am going to compile a funding application so that focused research efforts can target the interesting topic of API migration.

As to the talk, I am looking forward to a visit with Suraj C. Kothari at Iowa State University in Ames next week, and here is the plan for this talk. (The trip to Ames is a trip during the trip because I am going to Ames during a trip to Omaha. From a recursion-theoretic point of view, I am obviously interested in carrying out a trip during the trip during the trip. This is certainly a good exercise in trying to understand the difference between left- and right-associativity.)

Regards,
Ralf

Title: API migration is a hard problem!

Slides: [.pdf]

Abstract: API migration refers to software migration in the sense of software reengineering: the objective is to eliminate an application's dependencies on a given API and make it depend instead on another API. Hence, we may speak of original API versus replacement API. In principle, migration can be achieved by a wrapping approach (such that the original API is re-implemented in terms of the replacement API so that the original implementation becomes obsolete and the application itself does not need to be changed) or by a transformation approach (such that the code of the application is rewritten so that the references to the original API are replaced by references to the replacement API). A degenerate case of API migration would be API upgrade or downgrade where the two APIs are essentially versions of each other with an effective relationship between the versions such that the wrapper or the transformation for migration can be derived from a suitably recorded, inferred, or specified relationship. The synthesis of a transformation or a wrapper is considerably more involved when the APIs at hand do not relate in such an "obvious" manner, i.e., when they have been developed more or less independently. The two APIs still serve the same domain (e.g., GUI or XML programming), but they differ in terms of features, protocols, contracts, type hierarchy, and other aspects. In this talk, I provide insight into such differences and explain existing, often primitive (laborious) migration techniques, which are mostly focused on wrapping. I use a number of case studies for empirical substantiation. I conclude with an outlook in terms of the challenges ahead with indications as to the techniques and methods to be used or developed. Program analysis must provide the heavy lifting to make progress on the hard problem of API migration.
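
To make the two approaches a bit more tangible, here is a minimal sketch in Python. It is not taken from the papers or the talk. It assumes a tiny slice of a DOM-style interface as the "original API" and Python's xml.etree.ElementTree as the "replacement API"; the class WrappedElement, the two functions, and the sample document are hypothetical names made up for illustration. The two commented contract differences (missing attributes, descendants versus self) are the kind of difference a wrapper has to absorb.

```python
import xml.etree.ElementTree as ET


class WrappedElement:
    """Hypothetical wrapper: a tiny slice of a DOM-style interface
    (the "original API") re-implemented on top of ElementTree
    (the "replacement API")."""

    def __init__(self, node):
        self._node = node  # an xml.etree.ElementTree.Element

    def getAttribute(self, name):
        # Contract difference: DOM returns "" for a missing attribute,
        # whereas ElementTree's get() returns None by default.
        return self._node.get(name, "")

    def getElementsByTagName(self, name):
        # Contract difference: DOM searches descendants only, whereas
        # ElementTree's iter() also yields the element itself.
        return [WrappedElement(n) for n in self._node.iter(name)
                if n is not self._node]


def total_salaries(company):
    """Application code written against the original, DOM-style API.
    Under the wrapping approach it stays unchanged."""
    return sum(float(e.getAttribute("salary") or 0)
               for e in company.getElementsByTagName("employee"))


def total_salaries_transformed(company):
    """The same function after a (here: manual) transformation that
    rewrites the calls to use ElementTree directly."""
    return sum(float(e.get("salary", 0))
               for e in company.iter("employee") if e is not company)


# Hypothetical sample document, made up for this sketch.
DOC = ('<company><employee salary="100"/><employee/>'
       '<dept><employee salary="50"/></dept></company>')
root = ET.fromstring(DOC)
print(total_salaries(WrappedElement(root)))
print(total_salaries_transformed(root))
```

Both calls compute the same total. The wrapper hides the contract differences in one place, whereas the transformed code has to deal with them at every call site; which of the two is preferable presumably depends on the case at hand.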

Acknowledgement: This is joint work with (in alphabetical order) Thiago Tonelli Bartolomei (University of Waterloo, Canada), Krzysztof Czarnecki (University of Waterloo, Canada), and Tijs van der Storm (CWI, Amsterdam, The Netherlands). I also acknowledge joint work within the Software Languages Team on the related subject of API (usage) analysis; special thanks are due to Ekaterina Pek.

Resources:



3/13/2012

More than you ever wanted to know about grammar-based testing

Preamble: Ever since 1999 +/- 100 years, I have been working (sporadically, intensively) on grammar-based testing. The latest result was our SLE'11 paper on grammar comparison (joint work with Bernd Fischer and Vadim Zaytsev). I have tried previously to compile a comprehensive slide deck on grammar-based testing, also with coverage on this blog, but this was relatively unambitious. With the new SLE'11 results at hand and with the firm goal of pushing grammar-based testing more into CS education (in the context of both formal language theory and software language engineering), I have now developed an uber-comprehensive slide deck with awesome illustrations for the kids. If you are reading this post ahead of the lecture and you are still planning to attend, then you are well advised to bring brains and coffee. You may also bring a body bag, in case you pass out or worse. As it happens, this is "too much stuff" for a regular talk, lecture, or any reasonable format for that matter. I will run a first "user study" on this slide deck in a class on formal language theory in Omaha this Thursday; thanks to Victor Winter's trust in the survivability of this stuff, or why would he share his class with me otherwise? As a last resort and an exercise in adaptive talking, I am just going to skip major parts based on the (missing) background of my audience. To summarize, if I get hit by a bus today, then all the grammar-based testing stuff is documented for mankind. (That's what Victor said.)

Title of the lecture: Quite an introduction to grammar-based testing

Slides of the lecture: [.pdf]

Elevator pitch for the lecture: Grammars are everywhere; resistance is futile. (More verbosely: If it feels like a grammar (after due consideration and subject to a Bachelor+ degree in CS), then it's probably just one. Just because some grammars mask themselves as object models, schemas, ducks, and friends, you should not move over to the dark side.) Seriously, non-grammars are cool, but life is short, so we need to focus. (I am sort of focusing on grammars and I am not even @grammarware.) Now, even grammars and grammar-based software components have problems, and testing may come to the rescue. Perhaps you think you know what's coming, but you don't have a clue.

Abstract of the lecture: Depending on where you draw the line, grammar-based testing was born, proper, in 1972 with Purdom's article on sentence generation for testing parsers. Now, computer scientists were really obsessed with parsers and compilers in the last millennium, and much work followed in the seventies, eighties, and early nineties. Burgess' survey on the automated generation of test cases for compilers summarized this huge area in 1994. Why would you want to test a compiler? It could suffer from regressions along evolution; it could differ from another compiler that serves as reference; it could fail to comply with the language specification (perhaps even the grammar in there); it could break when being stressed; it could simply miss some important case. Non-automated testing really does not suffice in these cases. You cannot possibly (certainly not systematically) test a grammar-based software component other than by generating test data (in fact, test cases) automatically, unless the underlying grammar is trivial. Grammar-based testing suddenly becomes super-important when much software turns out to be grammar-based (other than parsers and compilers): virtual machines, de-/serialization frameworks, reverse and re-engineering tools, embedded DSL interpreters, APIs, and what have you. Such promotion of grammar-based testing to the horizon of software engineering was perhaps first pronounced by Sirer and Bershad's paper on using grammars for software testing in 1999. Grammar-based testing is not straightforward, by any means, in several dimensions. For instance, coverage criteria for test-data generation must be convincing in terms of "covering all the important cases" and "scaling for non-trivial grammars". Also, all the forms of grammars in practice are "impure" more often than not; think of semantic constraints represented in different ways. Related to the matter of semantics, any automated test-data generation approach relies on an automatic oracle, and getting such an oracle is never easy. This lecture is going to present a certain view on grammar-based testing, which is heavily influenced by the speaker's research and studies. In addition to the speaker's principal admiration of grammars and grammar-based software, the reason for such obsession with grammar-based testing is that this domain is so exciting in terms of combining formal language theory, (automated) software engineering, and declarative programming. This lecture is an attempt to convey important techniques and interesting challenges in grammar-based testing.
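
For readers who want something concrete before the lecture, here is a minimal sketch in Python of test-data generation from a grammar plus an automatic oracle. It is not Purdom's algorithm and not any coverage criterion from the papers; it simply enumerates all sentences up to a bound on the height of the derivation tree. The toy grammar, the function names, and the choice of Python's built-in eval() as the reference implementation are all assumptions made for this illustration.

```python
from itertools import product

# Hypothetical toy grammar: nonterminal -> list of alternatives;
# any symbol that is not a key of the dictionary is a terminal.
GRAMMAR = {
    "expr": [["num"],
             ["(", "expr", "+", "expr", ")"],
             ["(", "expr", "*", "expr", ")"]],
    "num": [["1"], ["2"]],
}


def sentences(symbol, depth):
    """All sentences derivable from `symbol` by derivation trees of
    height at most `depth` (exhaustive bounded enumeration)."""
    if symbol not in GRAMMAR:
        return {symbol}          # terminals derive themselves
    if depth == 0:
        return set()             # no budget left for a nonterminal
    result = set()
    for alternative in GRAMMAR[symbol]:
        parts = [sentences(s, depth - 1) for s in alternative]
        result |= {"".join(combo) for combo in product(*parts)}
    return result


def evaluate(text):
    """System under test: a hand-written evaluator for the toy language."""
    def expr(i):
        if text[i] != "(":
            return int(text[i]), i + 1
        left, i = expr(i + 1)
        op = text[i]
        right, i = expr(i + 1)
        value = left + right if op == "+" else left * right
        return value, i + 1      # skip the closing parenthesis
    value, end = expr(0)
    assert end == len(text), "trailing input"
    return value


def run_tests(depth):
    """Differential testing: Python's own eval() acts as the oracle."""
    cases = sorted(sentences("expr", depth))
    failures = [c for c in cases if evaluate(c) != eval(c)]
    return len(cases), failures


print(run_tests(3))
```

The number of generated sentences grows very quickly with the depth bound, even for this trivial grammar, which hints at why coverage criteria that scale for non-trivial grammars matter. The oracle is easy here only because a trusted reference evaluator happens to exist.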

Bio of the speaker: As earlier this week. (Nothing much has happened very recently.)

Acknowledgement: The presented work was carried out over several years in collaboration with (in alphabetical order) Bernd Fischer (University of Southampton, UK), Jörg Harm (akquinet AG, Hamburg, Germany), Wolfram Schulte (MSR, Redmond, WA, USA), Chris Verhoef (Vrije Universiteit, Amsterdam, NL), and Vadim Zaytsev (CWI, Amsterdam, NL).

Related papers by the speaker (and collaborators):

Related patent:

Have fun!

Ralf

3/08/2012

Technical space travel for developers, researchers, and educators

The inevitable has happened.
I have committed myself to giving the first major talk on 101companies (not counting the AOSD 2011 tutorial, which described an early view on the universe).
This outing talk happens to be at the CS Department at the University of Nebraska at Omaha, as I will be visiting Victor Winter for the next two weeks.

Speaker: Ralf Lämmel (University of Koblenz-Landau)

Acknowledgement: Joint work with Jean-Marie Favre, Thomas Schmorleiz, and Andrei Varanovich.

Title: Technical space travel for developers, researchers, and educators

Abstract: A technical space is a technology and community context in computer science and information technology. For example, the technical space of XMLware deals with data representation in XML, data modeling with XML Schema, and data processing with XQuery, XSLT, DOM, and LINQ to XML. Likewise, the technical space of tableware deals with data representation in a relational database, data modeling according to the relational model or the ER model, and data processing with SQL and friends. There are various other, not necessarily orthogonal technical spaces: Javaware, grammarware, objectware, lambdaware, serviceware, etc. How can we easily travel between spaces such that software products may involve multiple spaces? How can we deal reasonably with the plethora of technologies and languages in computer science and information technology? How can we profoundly experience the universe in a scientifically and educationally relevant manner? We approach these questions in the emerging 101companies project for space-traveling developers, researchers, and educators on the grounds of a wiki, a source-code repository, and an ontology.
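
As a small taste of what travel between two spaces can look like, here is a minimal sketch in Python. It is not code from the 101companies repository. It assumes Python's xml.etree.ElementTree as a stand-in for XMLware processing and the sqlite3 module as a stand-in for tableware; the sample company, the table layout, and the function names are hypothetical and made up for illustration. The same question (what is the total of all salaries?) is answered once per space.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data, made up for this sketch.
XML = """<company name="Acme">
  <employee name="Ann" salary="100"/>
  <employee name="Bob" salary="50"/>
</company>"""


def total_xmlware(root):
    """Total salaries in the XML space, via ElementTree."""
    return sum(float(e.get("salary")) for e in root.iter("employee"))


def to_tableware(root):
    """Travel to the relational space: map employees to rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE employee (name TEXT, salary REAL)")
    db.executemany(
        "INSERT INTO employee VALUES (?, ?)",
        [(e.get("name"), float(e.get("salary")))
         for e in root.iter("employee")],
    )
    return db


def total_tableware(db):
    """Total salaries in the relational space, via SQL."""
    (total,) = db.execute("SELECT SUM(salary) FROM employee").fetchone()
    return total


root = ET.fromstring(XML)
db = to_tableware(root)
print(total_xmlware(root), total_tableware(db))
```

The mapping in to_tableware is where the travel happens; for richer data (nesting, references, mixed content) such a mapping is presumably much less obvious than it is for this flat example.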

Slides: [.pdf]