Thoughts on a very semantic wiki

Preamble

101wiki started as a boring MediaWiki installation to document software systems in the chrestomathy ‘101’. Semantic wiki extensions were quickly adopted; eventually our team developed a full-blown proprietary semantic wiki, more or less from scratch. We have now also rehosted it and given it a new look. (BTW, the 101companies brand name is now all gone. It's now just '101' really.)


https://101wiki.softlang.org/


The biggest mistake we (me!) made in said project ‘101’ is that we had only very loose specs for system implementation and system documentation; we had no proper process for checking and accepting contributions either. Thus, the 101wiki content was always a big mess and it still is. This problem is so serious that, a few years ago, we switched to discouraging contributions; instead, we deal with what we have and add content only when absolutely necessary. However, we depend on the 101wiki content for teaching; we also use it as a linked data hub for software language engineering-related research projects such as MetaLib, MegaLib, and YAS.


With a small group of people, we are now starting a significant content and ontology-modeling push, which will hopefully lead to some islands of sanity on 101wiki. In what follows, I am going to describe the rationale for what’s emerging.


Feedback more than welcome.



Semantic wiki basics


  • Typed links: Property names are used to qualify (to ‘type’) links. For instance, we use ‘sameAs’ to express that a 101wiki entity (page) is the same as some entity (page) elsewhere. Also, we use ‘uses’ to express that a contribution (a system implementation) uses some language or technology. We tend to relate 101wiki entities (pages) to Wikipedia resources. See here for a list of 101wiki’s properties.
  • Typed pages: We organize pages in ‘namespaces’ such as 'Language', 'Technology', or 'Contribution'. We use namespace names as prefixes/qualifiers of page names. For instance, we say ‘Language:Java’ rather than ‘Java_(Programming language)’ on Wikipedia. The fact that Java is a programming language is taken care of by a semantic property. That is, Java is declared to be an instance of 'OO programming language' which is a subtype of 'Programming language'. See here for a list of 101wiki’s namespaces.
  • Bits of content management: We expect that the structure of pages can vary, in our case, depending on the namespace (the ‘type’) of page. That is, there are different sections that may be used and each type of section may come with certain expectations regarding its content. For instance, a ‘headline’ is a section that should be used by any 101wiki page while a ‘motivation’ is (currently) only expected by a page for a system 'feature'. See here for a list of 101wiki’s sections.
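
To make the naming and classification scheme above a bit more tangible, here is a minimal Python sketch. It is not 101wiki's implementation: the names parse_page_name, is_instance_of, INSTANCE_OF, and SUBTYPE_OF are hypothetical, and the two dictionaries hold only the Java example from the text.

```python
from typing import NamedTuple, Optional


class PageName(NamedTuple):
    namespace: str
    title: str


def parse_page_name(qualified: str) -> PageName:
    """Split a namespace-qualified name such as 'Language:Java'."""
    namespace, sep, title = qualified.partition(":")
    if not sep or not namespace or not title:
        raise ValueError(f"expected 'Namespace:Title', got {qualified!r}")
    return PageName(namespace, title)


# Hypothetical excerpt of the classification data (illustration only).
INSTANCE_OF = {"Language:Java": "OO programming language"}
SUBTYPE_OF = {"OO programming language": "Programming language"}


def is_instance_of(page: str, concept: str) -> bool:
    """Follow 'instanceOf' once, then walk up the subtype chain."""
    current: Optional[str] = INSTANCE_OF.get(page)
    while current is not None:
        if current == concept:
            return True
        current = SUBTYPE_OF.get(current)
    return False
```

With this, parse_page_name("Language:Java") yields the namespace 'Language' and the title 'Java', and is_instance_of("Language:Java", "Programming language") holds because of the subtype step, without the page name having to say so.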

For instance, here is most of the content of 101wiki's page for the Haskell programming language:

Content for https://101wiki.softlang.org/Language:Haskell

In fact, we show the metadata section of the Haskell page separately:

Metadata for https://101wiki.softlang.org/Language:Haskell

That is, Haskell is also located on haskell.org and Wikipedia. We use 'sameAs' to express that these are all resources describing the Haskell language. There is also an 'instanceOf' property to express that Haskell is a functional programming language. 'Inbound' properties are also shown to help the user realize what other pages relate to Haskell.
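
The metadata just described can be read as a set of typed links (subject, property, object). The following Python sketch shows that reading and how 'inbound' properties fall out of it. The Link type, the two helper functions, the exact URLs, and the contribution page name are my assumptions for illustration; they are not copied from the actual page.

```python
from typing import List, NamedTuple


class Link(NamedTuple):
    subject: str  # the page carrying the metadata
    prop: str     # the property name that 'types' the link
    obj: str      # a 101wiki page, a concept, or an external URI


# Illustrative data only; the real page may list different targets.
LINKS = [
    Link("Language:Haskell", "sameAs", "https://www.haskell.org/"),
    Link("Language:Haskell", "sameAs",
         "https://en.wikipedia.org/wiki/Haskell_(programming_language)"),
    Link("Language:Haskell", "instanceOf", "Functional programming language"),
    # Hypothetical contribution page that points at Haskell.
    Link("Contribution:someHaskellSystem", "uses", "Language:Haskell"),
]


def outbound(page: str, links: List[Link]) -> List[Link]:
    """Links declared on the page itself."""
    return [link for link in links if link.subject == page]


def inbound(page: str, links: List[Link]) -> List[Link]:
    """Links declared elsewhere that point at the page."""
    return [link for link in links if link.obj == page]
```

Inbound links are never stored on the Haskell page; they are computed by looking at everyone else's outbound links.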

Semantic wiki self-description

  • Link types are to be declared on the wiki itself: This means, in our case, that there is a type (a ‘namespace’) of properties. It also means that there are ‘meta-properties’ dealing with the properties of properties. That is, each property, just like in the Semantic Web, has a domain and a range.
  • Page types are to be declared on the wiki itself: This means, in our case, that there is a type (a ‘namespace’) of namespaces. It also means that there are ‘meta-properties’ dealing with the properties of (pages as members of) namespaces. That is, each namespace associates with mandatory and optional sections and properties. Accordingly, there is also a type (a ‘namespace’) of sections.
  • Link endpoint types are to be declared on the wiki itself: This means, in our case, that there is a type (a ‘namespace’) of types. There is basically a type for each 101wiki namespace, but there are additional types such as ‘String’ for string-typed properties, ‘URI’ for reaching out of 101wiki, and ‘Any’ to refer to the union of all 101wiki namespaces.

For instance, these are the properties for the namespace of languages:

Metadata for https://101wiki.softlang.org/Namespace:Language

That is, the namespace relates to the concept of 'software language'. Each page in the namespace must have a 'headline' as well as a section with metadata; it may have sections 'details', 'quote', and 'illustration'. The metadata must at least exercise the 'instanceOf' property for classification. The 'exemplifiedBy' property at the bottom of the figure is a bit special; we discuss it just below.
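
The expectations for the language namespace can be read as a small schema against which a page is checked. Here is a hedged Python sketch of such a check. NamespaceSpec and validate_page are hypothetical names, and I assume the metadata section is simply called 'metadata'; the wiki's actual checker may work quite differently.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class NamespaceSpec:
    mandatory_sections: FrozenSet[str]
    optional_sections: FrozenSet[str]
    mandatory_properties: FrozenSet[str]


# Transcribed from the description of the language namespace above;
# the section name 'metadata' is an assumption.
LANGUAGE = NamespaceSpec(
    mandatory_sections=frozenset({"headline", "metadata"}),
    optional_sections=frozenset({"details", "quote", "illustration"}),
    mandatory_properties=frozenset({"instanceOf"}),
)


def validate_page(sections: Iterable[str], properties: Iterable[str],
                  spec: NamespaceSpec) -> List[str]:
    """Return a list of problems; an empty list means the page conforms."""
    used_sections, used_properties = set(sections), set(properties)
    problems = []
    for name in sorted(spec.mandatory_sections - used_sections):
        problems.append(f"missing mandatory section: {name}")
    allowed = spec.mandatory_sections | spec.optional_sections
    for name in sorted(used_sections - allowed):
        problems.append(f"unexpected section: {name}")
    for name in sorted(spec.mandatory_properties - used_properties):
        problems.append(f"missing mandatory property: {name}")
    return problems
```

A page with sections 'headline', 'metadata', and 'details' that uses 'instanceOf' and 'sameAs' passes. Extra properties such as 'sameAs' are tolerated in this sketch because only the mandatory ones are checked.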

Semantic wiki quality monitoring

Given how much messy content there is on 101wiki, and given how difficult it still is to agree on the semantics of page and link types, we are starting to use one magic property, ‘exemplifiedBy’, to designate 101wiki pages that are reasonably representative of a type (a namespace, a property, a section, etc.). The team can consult these exemplars when trying to migrate more legacy content to an emerging 'metamodel'. The metadata for the property is mind-boggling.

Metadata for https://101wiki.softlang.org/Property:exemplifiedBy

That is:

  • The page describing the property is linked to the notion of Exemplar.
  • Subjects of the property may be a namespace, section, or property page. That is, these kinds of pages can be 'exemplified'.
  • Objects of the property may be pages in 'any' namespace. This is a bit weakly typed because we expect, of course, that an exemplar for a namespace should be a page in that namespace. (So basically 101wiki's type system is not powerful enough to capture all details.)
  • It so happens that the property page for 'exemplifiedBy' itself is a feature page for the property; see 'this exemplifiedBy this'.
  • We also see how the use of the property is documented in the 'metamodel' of the namespaces namespace, section, and property. 
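
The gap between the declared typing of 'exemplifiedBy' and what we actually expect can be spelled out in a few lines. The Python sketch below checks the declared domain first and then adds the stronger constraint that the declared range 'Any' cannot express. The function names are hypothetical, and the extra check is my reading of the intent, not something the wiki enforces today.

```python
from typing import List, Tuple

# Declared domain of 'exemplifiedBy': these kinds of pages can be exemplified.
EXEMPLIFIED_BY_DOMAIN = {"Namespace", "Section", "Property"}


def split_page(page: str) -> Tuple[str, str]:
    """Split 'Namespace:Language' into ('Namespace', 'Language')."""
    namespace, _, title = page.partition(":")
    return namespace, title


def check_exemplified_by(subject: str, obj: str) -> List[str]:
    """Check one 'subject exemplifiedBy obj' link; return problems found."""
    problems = []
    subject_ns, subject_title = split_page(subject)
    obj_ns, _ = split_page(obj)

    # Declared typing: the subject must be in the domain. The range is
    # 'Any', so every 101wiki page is accepted as the object.
    if subject_ns not in EXEMPLIFIED_BY_DOMAIN:
        problems.append(f"{subject} is not a namespace, section, or property page")

    # Extra constraint beyond the declared types (assumed intent):
    # an exemplar for a namespace should live in that namespace.
    if subject_ns == "Namespace" and obj_ns != subject_title:
        problems.append(f"{obj} is not a page in namespace {subject_title}")
    return problems
```

For example, 'Namespace:Language exemplifiedBy Language:Haskell' passes both checks, and so does the self-referential 'Property:exemplifiedBy exemplifiedBy Property:exemplifiedBy', whereas an exemplar from another namespace is flagged only by the second check.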


Acknowledgments

I take responsibility for the content mess on 101wiki, but I would like to acknowledge some people who have contributed or are contributing to 101 in a significant way, despite my epic failure. Hopefully this acknowledgment will not be used against them :-)

  • Andrei Varanovich (former developer and content author)
  • Thomas Schmorleiz (former developer)
  • Kevin Klein (the incredible current developer)
  • Marcel Heinz (current content author and ontologist)
  • Johannes Härtel (current content author and data miner)
  • Hakan Aksu (current content author and educator)
  • Wojciech Kwasnik (the team's logo artist acknowledged here)


The logo of '101': it hints at the Tower of Babel and at how the project hopefully illuminates the knowledge area of software languages, technologies, and concepts on the grounds of an advanced chrestomathy approach.

Regards,
Ralf
