9/14/2010

Dealing comfortably with the confusion of tongues

That's the title of my invited talk at CHOOSE Forum 2010.

See the abstract below.

I just realized that I am going to meet some esteemed colleagues there: Jean Bezivin, Jean-Marie Favre, Uwe Zdun.


Abstract: Yahweh brought the confusion of tongues upon the people in the plain of Shinar because he did not like their efforts of building the Tower of Babel with its top in the heavens. In IT, we brought the confusion of tongues upon ourselves with the continuous aggregation of ideas for programming (languages), modeling (languages), domain concepts and domain-specific languages, design patterns, APIs, and so on. All this diversity makes a nontrivial software system look like the Tower of Babel—in terms of the involved complexity. That is, a nontrivial software system involves multiple technical spaces, technologies, bridges between them, and harmful amounts of software asbestos. How can IT personnel acquire the relevant skills more effectively? How would we want to prepare an undergraduate in CS for such toughness? These are the questions that we address in this talk. We put to work a combo of technology modeling, metamodeling, and megamodeling—plus a suite of well-designed examples in any lingo that we could think of. (Contrary to this abstract, the talk is going to be very technical; target audience: software engineers.)

Acknowledgment: This is joint work with Dragan Gasevic and Jean-Marie Favre. Further, I acknowledge contributions by some of my students—most notably Thomas Schmorleiz.

9/13/2010

(Mega)modeling Software Language Artifacts


GPCE/SLE 2010 tutorial

Jean-Marie Favre (OneTree Technologies, Luxembourg)
Dragan Gašević (Athabasca University, Canada)
Ralf Lämmel (University of Koblenz, Germany)

Description

Modern software is typically made of heterogeneous sets of software artifacts including, for instance, databases, programs, transformations, grammars, models and metamodels, compilers, interpreters, formats, ontologies, frameworks, APIs, schemas, configuration files, makefiles, etc. In practice, particular languages, tools, implementations, and standards are used, such as SQL DDL, Saxon, XSLT, Java, Hibernate, XSD, OWL, DOM, Antlr, UML, XMI, Ecore, Awk, and so on. In the absence of a conceptual framework, it is difficult to understand the relationships between these software artifacts, if any. The goal of this tutorial is to provide such a framework, showing that the similarities and relationships between techniques can be modeled at a high level of abstraction and, even more importantly, that recurring patterns occur in such models. Some of these patterns, for instance those involving "bridges" between technologies, would be really difficult to grasp without a proper conceptualization. As a result, software engineers and researchers usually find it hard to understand the intricacies of technologies that are outside their area of expertise, and it is more than likely that they do not realize the analogies that exist between heterogeneous technologies. This tutorial aims to unveil these recurring patterns and to show participants coming from different horizons how to model the technologies they design or work with in a uniform way and how to situate them in the overall software language landscape.

In the first part of the tutorial, the notions of software languages and technical spaces are briefly presented with a special emphasis on their unifying character. Then fundamental relations such as RepresentationOf and ElementOf are introduced, forming the basis of a (mega)modeling framework. Recurrent patterns based on these relations are then presented; they make it possible to describe, for instance, the "conformance" relation between, let's say, a program and a grammar, an XML file and an XSD schema, or a UML model and its metamodel. More complex patterns such as bridges between technologies (e.g., XML <==> Relational, OO <==> XML, etc.) are defined following the same approach. Though this notion of bridges seems easy to grasp informally at first sight, it often leads to a rather large and complex set of technologies that are hard to understand and compare without an appropriate framework.
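
To make the relations in the paragraph above a little more concrete, here is a minimal Haskell sketch of a megamodel as a list of relation instances. It is not taken from the tutorial material. The type names (`Entity`, `Relation`, `Megamodel`), the sample artifacts, and in particular the rule that derives conformance from ElementOf and RepresentationOf are assumptions made for illustration.

```haskell
import Data.List (nub)

-- Entities of a tiny megamodel. All names here are illustrative assumptions.
newtype Entity = Entity String deriving (Eq, Show)

-- The two fundamental relations named in the tutorial description.
data Relation
  = ElementOf Entity Entity        -- an artifact is an element of a language
  | RepresentationOf Entity Entity -- a definition represents a language
  deriving (Eq, Show)

type Megamodel = [Relation]

-- Derived "conformance" pattern (an assumption of this sketch):
-- x conforms to d if x is an element of some language l
-- and d is a representation of that same language l.
conformsTo :: Megamodel -> [(Entity, Entity)]
conformsTo mm = nub
  [ (x, d)
  | ElementOf x l <- mm
  , RepresentationOf d l' <- mm
  , l == l'
  ]

-- Hypothetical artifacts from two technical spaces.
example :: Megamodel
example =
  [ ElementOf (Entity "order.xml") (Entity "OrderDocuments")
  , RepresentationOf (Entity "order.xsd") (Entity "OrderDocuments")
  , ElementOf (Entity "Main.java") (Entity "Java")
  , RepresentationOf (Entity "java.g") (Entity "Java")
  ]
```

Evaluating `conformsTo example` in GHCi pairs the XML file with its schema and the program with its grammar. The same derived rule covers both technical spaces, which is the kind of analogy the paragraph describes.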

In the second part of the tutorial, the use of the (mega)modeling framework is illustrated through its application in three different technical spaces: Grammarware, Modelware, and Ontologyware. Concrete examples of varying degrees of complexity are provided in each case, again with an emphasis on similarities between technical spaces. The hope of this approach is that it should be possible for someone with some knowledge of one technical space (let's say grammarware) to significantly improve his or her comprehension of another space (let's say ontologyware) by virtue of analogy. It is our belief that the (mega)modeling approach, by raising the level of abstraction and focusing on essential software language concepts, makes it possible both to better understand complex structures involving many heterogeneous software artifacts and to better apprehend new technologies coming from other spaces.

One of the objectives of the tutorial is to show that bridges can be successfully established between heterogeneous technical spaces such as Modelware, Ontologyware, and Grammarware, and in particular to go beyond traditional divisions of fields of expertise. Since this tutorial is directly inscribed into a "community engineering" perspective, we believe that having three presenters coming from different horizons is the best way to emphasize the integrative aspect of the mega-modeling approach.

Presenters

  • Jean-Marie Favre is a software anthropologist and a software language archeologist. He is principal scientist at One Tree Technologies. He has published numerous papers and coedited a book (in French), Beyond MDA: Model Driven Engineering. He has given tutorials and keynotes at more than a dozen international events and summer schools and has organized various national and international events. His research interests include software language engineering, software linguistics, software evolution and reverse engineering, model driven engineering, and research 2.0.
  • Dragan Gašević is a Canada Research Chair in Semantic Technologies and an Associate Professor in the School of Computing and Information Systems at Athabasca University. His research interests include semantic technologies, software language engineering, technology-enhanced learning, and service-oriented architectures. He has (co-)authored numerous research papers and is a lead author of the book "Model Driven Engineering and Ontology Development." He has given tutorials at many well-known conferences such as WWW, ISWC, and CAiSE.
  • Ralf Lämmel is Professor of Computer Science at University of Koblenz-Landau. In his career, he also served at Microsoft Corp., Free University of Amsterdam, Dutch Center for Mathematics and Computer Science (CWI), and University of Rostock. Ralf Lämmel is generally interested in the combination of software engineering and programming languages. Together with the other tutorial speakers and further researchers, he is one of the founding fathers of the SLE conference. He is also one of the founding fathers of the summer school series GTTSE (Generative and Transformational Techniques in Software Engineering).

Content

The tutorial will be divided into several parts: first introducing the key issues in modeling heterogeneous software language artifacts, then presenting the conceptual framework for mega-modeling, and finally giving examples of application in different technical spaces.


Provisional outline

  • Introduction and objectives (5')
  • Technical Spaces and Software Languages (5')
  • Megamodeling fundamentals (10')
  • Application to Grammarware (10')
  • Application to Ontologyware (10')
  • Application to Modelware (10')
  • Synthesis and conclusion (10')

Relevance to GPCE/SLE attendees

Because of its emphasis on software language artifacts, this tutorial should be of interest to SLE attendees. Since compilers, transformations, and generators are used extensively in generative programming, the tutorial should also attract GPCE attendees. We believe that providing a conceptual and unified approach to mega-modeling and showing its application across various technical spaces could improve cross-fertilization between communities.

9/12/2010

The essence of "The essence of functional programming"


If "code is the essence", then this blog post is concerned with the essence of "The essence of functional programming" --- because the post matches a code distribution that is derived from P. Wadler's paper of just that title.

The code distribution has two branches: "origin", which stays close to the original code of the paper, and "overhaul", which uses Haskell's monad/monad transformer library to show some of the examples in modern style. This code was initially prepared for a Channel9 lecture about monads. The code is available through the SourceForge project developers; browse the code here; see the slide deck for the Channel9 lecture here.

Here is a more complete reference to Wadler's paper:

P. Wadler: The essence of functional programming
POPL 1992

Henceforth, we refer to this paper as "the paper".
Recovery and modularization effort due to Ralf Lämmel.


Disclaimer

The code distribution has not been approved, in any form, by the original author (P. Wadler). The author of the code distribution (R. Lämmel) takes all responsibility for any imprecision or misconception of this recovery and elaboration effort. Comments are welcome, as usual. Also, the particular way of distributing (modularizing) the code by means of heavy cpp usage is not meant to suggest any useful style of functional programming. It is merely used as a convenience for organizing the code such that code reuse is explicitly captured.


The paper's code vs. this code distribution

The branch "origin" is meant to stay close to the code in the paper. In particular, Haskell's monad (transformer) library is used only in branch "overhaul". We identify the following deviations (in branch "origin"):
  • Most importantly, the original paper did not organize the code in any way that would explicitly represent code reuse. In contrast, the code distribution leverages cpp to this end. This approach may be viewed as an ad-hoc cpp-based product-line-like approach.
  • We use the type class Show for "showing" values and for "getting out of the monads".
  • We added a few samples; they are labeled with code comments.
  • Minor forms of name mangling and refactoring are not listed here.
  • We renamed a few operators to better match the current Haskell library:
unit... -> return
bind... -> (>>=)
error... -> fail
zero... -> mzero
plus... -> mplus
out... -> tell
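
As an illustration of the renaming, the following sketch defines a small error monad twice over: once with paper-style operator names carrying a monad-specific suffix, and once by exposing the same operations under the library names. It is not code from the distribution. The type `E`, its constructors, and the `halve` functions are made up for this example, and placing `fail` in `MonadFail` assumes a recent GHC (base 4.13 or later), whereas in 2010 `fail` was still a method of `Monad`.

```haskell
-- A small error monad. Type and constructor names are illustrative.
data E a = Success a | Failure String deriving (Eq, Show)

-- Paper-style operator names, with suffix E for this particular monad.
unitE :: a -> E a
unitE = Success

bindE :: E a -> (a -> E b) -> E b
bindE (Success a) k = k a
bindE (Failure s) _ = Failure s

errorE :: String -> E a
errorE = Failure

-- Library-style names: the same operations, exposed as class instances.
instance Functor E where
  fmap f m = m `bindE` (unitE . f)

instance Applicative E where
  pure = unitE                      -- unit...  -> return (pure)
  mf <*> mx = mf `bindE` \f -> mx `bindE` \x -> unitE (f x)

instance Monad E where
  (>>=) = bindE                     -- bind...  -> (>>=)

instance MonadFail E where
  fail = errorE                     -- error... -> fail

-- The same hypothetical function in both naming styles.
halvePaper :: Int -> E Int
halvePaper n
  | even n    = unitE (n `div` 2)
  | otherwise = errorE ("odd: " ++ show n)

halve :: Int -> E Int
halve n
  | even n    = return (n `div` 2)
  | otherwise = fail ("odd: " ++ show n)
```

The remaining renamings (zero, plus, out) follow the same idea for the list and writer monads and are not shown here.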


How to understand and run the code?

One should probably have modest knowledge of monads. For instance, one may read the paper. The easiest approach towards understanding the code distribution is to study the final interpreters in subdirectory "cache". (These interpreters were assembled from pieces, but this may be less important initially.) With two exceptions, these are all complete Haskell 98 programs (modulo TypeSynonymInstances and FlexibleInstances for the sake of Show) with a main function with test cases corresponding to the samples in the paper. The interpreters correspond to scenarios in the paper, and there is a corresponding comment at the beginning of each file to identify relevant sections in the paper. A complete list of all interpreters follows below.

If you want to better understand the structure of the interpreters and their relationships, you may want to have a look at the primary sources that are preprocessed via cpp. Each interpreter is assembled through cpp #includes; see the files in subdirectory "templates". A more profound description of our ad-hoc cpp-based product-line-like approach follows below.


List of interpreters in the order of occurrence in the paper
  • Baseline.hs: This is a non-monadic, CBV interpreter. This is the (non-monadic) baseline for all experiments. That is, we will turn this interpreter into monadic style, instantiate the monad parameter, perform data extensions and selective code replacements. We will also vary CBV vs. CBN, and we will use CPS eventually.
  • CBV.hs: This is the scheme of a monadic-style CBV interpreter. This file is incomplete as it lacks a concrete monad. (Don't run this file.) Some of the following interpreters complete this scheme. Such completion typically involves data extension and selective code replacement.
  • Identity.hs: This is CBV.hs completed by the identity monad.
  • Errors.hs: This is an interpreter with error messages. To this end, we instantiate the monad parameter of our monadic-style CBV interpreter with the error monad, and we apply selective code replacement so that error messages are actually thrown by the interpreter.
  • Positions.hs: This is an elaboration of Errors.hs so that position information is maintained and accordingly used in the error messages.
  • State.hs: This is an interpreter with a reduction count for function applications including additions. To this end, we instantiate the monad parameter with the state monad, where the state models the reduction count. We also need to apply selective code replacement so that function applications are actually counted. We also perform a data extension to provide a language construct for reading the reduction count within the interpreted language.
  • Output.hs: This is an interpreter with output. To this end, we instantiate the monad parameter with the writer monad, so that output is aggregated as a string. We also perform a data extension to provide a language construct for producing output within the interpreted language.
  • Nondeterminism.hs: This is an interpreter with nondeterministic choice. To this end, we instantiate the monad parameter with the list monad, so that nondeterminism is modeled with a multiplicity of result values. We also perform a data extension to provide language constructs for the failing computation (that produces the empty list of results) and for nondeterministic choice.
  • Backwards.hs: This is a variation on State.hs with backwards propagation of state.
  • CBN.hs: This is the scheme of a monadic-style CBN interpreter. This file is incomplete as it lacks a concrete monad. (Don't run this file.) Some of the following interpreters complete this scheme.
  • StateCBN.hs: This is a variation on State.hs with CBN.
  • NondeterminismCBN.hs: This is a variation on Nondeterminism.hs with CBN.
  • CPS.hs: This is a non-monadic, CBV and CPS interpreter.
  • Callcc.hs: This is a monadic-style interpreter (CBV) that uses the continuation monad to provide "call with current continuation" (callcc). We perform a data extension to provide callcc as a language construct within the interpreted language.
  • ErrorsCPS.hs: This is a variation on Errors.hs which leverages the continuation monad and uses the Answer type to plug errors into the interpreter.
  • StateCPS.hs: This is a variation on State.hs which leverages the continuation monad and uses the Answer type to plug state into the interpreter.
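
For readers who want a feel for the monadic-style CBV scheme before opening the files, here is a condensed, single-file sketch in the spirit of CBV.hs, Identity.hs, and State.hs. It is not the code of the distribution. In particular, the type class `Tick` is an assumption of this sketch that stands in for the cpp-based selective code replacement, and the `Counter` monad is hand-written so that only the base library is needed.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

type Name = String

data Term = Var Name | Con Int | Add Term Term | Lam Name Term | App Term Term

-- Semantic domain, parameterized by the monad m.
data Value m = Wrong | Num Int | Fun (Value m -> m (Value m))

type Env m = [(Name, Value m)]

showValue :: Value m -> String
showValue Wrong   = "<wrong>"
showValue (Num n) = show n
showValue (Fun _) = "<function>"

-- Hook that is called at every addition and application.
-- (Assumption of this sketch; the distribution swaps code via cpp instead.)
class Monad m => Tick m where
  tick :: m ()

-- Monadic-style CBV interpreter.
interp :: Tick m => Term -> Env m -> m (Value m)
interp (Var x) env   = return (maybe Wrong id (lookup x env))
interp (Con n) _     = return (Num n)
interp (Add t u) env = do { a <- interp t env; b <- interp u env; add a b }
interp (Lam x b) env = return (Fun (\v -> interp b ((x, v) : env)))
interp (App t u) env = do
  f <- interp t env
  a <- interp u env  -- CBV: the argument is evaluated before the call
  apply f a

add :: Tick m => Value m -> Value m -> m (Value m)
add (Num a) (Num b) = tick >> return (Num (a + b))
add _ _             = return Wrong

apply :: Tick m => Value m -> Value m -> m (Value m)
apply (Fun k) a = tick >> k a
apply _ _       = return Wrong

-- Instantiation 1: the identity monad; ticks do nothing.
instance Tick Identity where
  tick = return ()

-- Instantiation 2: a minimal state monad that counts reductions.
newtype Counter a = Counter { runCounter :: Int -> (a, Int) }

instance Functor Counter where
  fmap f (Counter g) = Counter (\s -> let (a, s') = g s in (f a, s'))

instance Applicative Counter where
  pure a = Counter (\s -> (a, s))
  Counter f <*> Counter g = Counter $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad Counter where
  Counter g >>= k = Counter (\s -> let (a, s') = g s in runCounter (k a) s')

instance Tick Counter where
  tick = Counter (\s -> ((), s + 1))

-- Sample term: (\x -> x + x) (10 + 11)
term42 :: Term
term42 = App (Lam "x" (Add (Var "x") (Var "x"))) (Add (Con 10) (Con 11))

runPlain :: Term -> String
runPlain t = showValue (runIdentity (interp t []))

runCounted :: Term -> (String, Int)
runCounted t = let (v, n) = runCounter (interp t []) 0 in (showValue v, n)
```

In GHCi, `runPlain term42` gives `"42"` and `runCounted term42` gives `("42",3)`: two additions and one application are counted. The interpreter itself is unchanged between the two runs; only the monad instance differs.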

Code organization of the distribution

Directories:

  • cache: ready-to-run interpreters
  • templates: cpp templates for interpreters
  • monads: monads used in the interpreters
  • types: syntactic and semantic domains
  • functions: functions making up the interpreters
  • terms: sample terms for the interpreters
  • baselines: interpreter outputs for regression testing

Some types and functions require variation for the different monads and also in terms of the choice CBV vs. CBN. The subdirectory cba (read as call-by-"any") hosts definitions that are shared by CBV, CBN, and CPS. (Here, we should note that the CPS experiments use CBV (not CBN).) Within directories types and functions, the subdirectories cbv and cbn host definitions specific to CBV or CBN respectively. Monad-specific variations are hosted in yet deeper subdirectories with the (one-letter) name of the corresponding monad; see, for example, functions/cbv/E.

Comments, questions, and contributions are welcome.

Ralf

9/10/2010

Bulk mailing for your conferences not appreciated

Dear Dr. Nagib Callaos,

For some years now, my mailbox has been flooded by CFPs for your many conferences such as WMSCI 2011 (for which you are General Chair etc.). My email address is rlaemmel@gmail.com. Please remove me from your bulk mailing system. (I have tried your instructions for unsubscribing in the past, obviously without success.)

Keep up the good work.

Thanks,
Ralf Lämmel

PS: It is not easy to set up a mail filter that addresses this problem reliably over the years. Or does anyone have an idea as to how to stretch gmail's filter mechanism? I can think of a rule that checks whether a) we face a CFP or some other form of conference ad, and b) certain keywords (names) occur on the website of the conference. Now, gmail, please provide us with that expressiveness :-)