Robert Glaser edited this page May 15, 2017 · 28 revisions

Technical Concept

This paper describes the goals, architecture and implementation of iQvoc - the web-based open source vocabulary management framework.

1. Introduction

iQvoc is a web-based vocabulary management framework which provides both an intuitive user interface and Semantic Web interoperability.

iQvoc supports vocabularies that are common to many knowledge organization systems, such as:

  • Thesauri
  • Taxonomies
  • Classification schemes
  • Subject heading systems

iQvoc provides comprehensive functionality for all aspects of managing such vocabularies:

  • multilingual display and navigation in any web browser
  • editorial control for approved versions
  • publishing the vocabulary in the Semantic Web
  • easy customization according to users' needs
  • import of existing vocabularies from a SKOS representation

1.1 Goals

The goals for developing iQvoc were:

  1. Create an application that incorporates the features listed in chapter 1.
  2. Be able to use the complete application as an extensible framework.

1.2 Target audience

This document targets both decision makers and engineers who want insight into the architectural decisions we made to achieve the goals listed above. It describes the current state of iQvoc and therefore serves as part of the project's technical documentation.

2. Constraints

The only hard constraints for the (continuous) development of iQvoc were:

  • Runnable on the JVM
  • Deployable in Apache Tomcat >= 6.0
  • Using a relational Oracle database

These constraints largely originate from the projects iQvoc is being used in by a specific customer. However, iQvoc is not limited to the JVM as a runtime environment and allows different relational storage backends.

3. Context

iQvoc is actively developed by innoQ Deutschland GmbH and is used in several projects. For example, the German Federal Environment Agency (Umweltbundesamt) employs iQvoc in the public thesaurus UMTHES.

4. History & Approach

This chapter describes the history of iQvoc and its iterative development into a generic framework.

iQvoc started as a research project to evaluate the conceptual challenges and technological possibilities of publishing linked data using a state-of-the-art web application framework. Based on our expertise with Ruby and Rails and the initial lack of technical constraints, we chose Ruby on Rails as the framework. This left us free to choose a different production runtime environment and persistence layer later, once the corresponding requirements emerged. In practice, JRuby allowed us to develop the application in our environment of choice while respecting the production environment (listed in chapter 2).

Version 1.0 was closely tailored to the requirements of UMTHES (Umweltthesaurus). In fact the UMTHES system was a single application strongly lacking generalization and modularization - at this point it was impossible to reuse generic logic and components for other vocabulary implementations in a practical manner.

This situation led us to a consolidation phase wherein we extracted and refactored the iQvoc core logic into a separate component. While establishing a clean split between core logic and customer extensions we made large parts of the core configurable. By providing a central configuration we were able to avoid hacks that otherwise would have been necessary to overload specific core functionality within a special customer extension. The result of these efforts was iQvoc 2.0.

Before open-sourcing iQvoc we introduced some major API changes and feature extensions which led us to a version bump to 3.0 for the initial public release. These changes mainly consisted of:

  • Extraction of SKOS-XL support into a separate component
    SKOS-XL (SKOS Extension for Labels) support was tied to the iQvoc core from the beginning because UMTHES required it. Technically speaking, SKOS-XL elevates labels to first-class entities, alongside concepts. SKOS-XL labels can have their own relations between each other and therefore their own URIs. We decided that SKOS-XL should not be core functionality, which led to the extraction of iqvoc_skosxl into a separate library.
  • SKOS importer
    iQvoc is able to import standard and valid SKOS data.

5. Architecture

In this chapter we document the important architecture choices that enabled us to release and maintain iQvoc as a generic framework.

The core schema is closely tailored to the SKOS (Simple Knowledge Organization System) standard. Vocabulary items can be created as concepts. Concepts can be assigned different names by using so-called labels, i.e. they are labeled. Concepts can be assigned notes of different types, e.g. a definition of what the concept is about. Concepts (as well as collections) can be grouped in collections.

iQvoc employs SKOS in a relational model design. It is in the very nature of the Semantic Web to associate and connect concepts in a vast number of ways. In order to support this we developed configurable relation types.

Example: If one wants to extend iQvoc's standard SKOS concept relations, a new relation class inheriting from the base Concept::Relation class can be implemented and configured. The core configuration provides hooks for every existing relation.
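The following is a minimal sketch of that idea in plain Ruby, not iQvoc's actual code. Only the name Concept::Relation comes from the text above; MyVocabulary::PartOf, RelationConfig and the predicate method are hypothetical names chosen for illustration, and the real classes are ActiveRecord models rather than plain objects.

```ruby
module Concept
  # Simplified stand-in for the base relation class named in the text.
  class Relation
    attr_reader :owner, :target

    def initialize(owner, target)
      @owner = owner
      @target = target
    end

    # Predicate used when the relation is rendered (assumed default).
    def self.predicate
      'skos:related'
    end

    def to_s
      "#{owner} #{self.class.predicate} #{target}"
    end
  end
end

module MyVocabulary
  # A custom relation type supplied by the embedding application (hypothetical).
  class PartOf < Concept::Relation
    def self.predicate
      'my:partOf'
    end
  end
end

# Hypothetical configuration hook: the core only iterates over the registered
# classes and never references a custom class directly.
module RelationConfig
  def self.relation_classes
    @relation_classes ||= [Concept::Relation]
  end
end

RelationConfig.relation_classes << MyVocabulary::PartOf

RelationConfig.relation_classes.each do |relation_class|
  puts relation_class.new(':wheel', ':car')
end
```

The point of the sketch is the direction of the dependency: the extension registers itself with the configuration, so the core needs no changes when a new relation type is added.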

5.1 Goals

5.1.1 Retrospective

In the beginning the primary goal whilst building iQvoc was to develop a thesaurus editor with certain features for one customer project - it is important to recall this as iQvoc is now much more.

The main goals at the beginning were:

  • Publishing and editing of one specific thesaurus
    The application should be able to let users collaborate in editing the managed thesaurus. An editing workflow should offer simplified versioning of thesaurus terms and collaboration.
  • Deep integration into the Semantic Web
    Supporting SKOS incorporates support for concept representations in different RDF (Resource Description Framework) formats. We wanted to be able to implement the various RDF views in a concise and DRY (Don't Repeat Yourself) way. With that came the requirement to support the importing of standard SKOS data in different RDF formats into an iQvoc instance.

5.1.2 Generalization

While finishing the customer project and achieving the goals listed in the retrospective (5.1.1), we received more requests for custom thesauri and vocabularies. This led us to generalize the architecture of iQvoc, remove any customer- or project-specific components and restructure it as a hybrid of a stand-alone editing application and a classic framework:

  • iQvoc as a framework
    We wanted to be able to reuse the same code over and over again - the typical case for a framework. With several thesaurus and vocabulary projects for customers in sight, copying parts of the core logic or all of it was not an option. Vocabulary applications embedding iQvoc should still remain customizable. Because of the inherent complexity of abstraction and generalization that comes with creating a software framework, this goal had the biggest impact on our architectural design decisions.
  • iQvoc as a stand-alone application
    Apart from the need to reuse iQvoc as a framework for applications employing vendor-specific customizations, we wanted the software to be usable as a stand-alone application for cases that do not require modifications or extensions of core functionality. This may also be convenient for quick production or demo setups as well as sample instances.

5.2 Conceptual Domain Model

iQvoc's domain model is closely tailored to the SKOS (Simple Knowledge Organization System) standard. SKOS provides a model to represent concept schemes like thesauri and vocabularies within the context of the Semantic Web.

Taken from the SKOS Primer:

The fundamental element of the SKOS vocabulary is the concept. Concepts denote ideas or meanings that are the units of thought [Willpower Glossary] which underly the KOSs used in a number of applications. As such, concepts exist in the mind as abstract entities which are independent of the terms used to label them.

As the text implies, the concept is the central entity of a vocabulary, whereas labels are used to name concepts.

Concepts can also have

  • relations of many kinds to other concepts
  • different kinds of notes (like descriptions) which attach information
  • matches: references to concepts in other concept schemes
  • assignments to collections which can be used for grouping and/or organization. Collections can contain both concepts as well as collections - thus concepts can be organized in a hierarchical way.
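To make these relationships concrete, here is a small sketch using plain Ruby structs. All names (SkosSketch, identifier, the example URLs) are assumptions for illustration; iQvoc's real model classes are ActiveRecord models with typed associations. The sketch shows how nesting collections yields a hierarchy that can be flattened back into its concepts.

```ruby
module SkosSketch
  # Simplified, hypothetical value objects; not iQvoc's model classes.
  Label   = Struct.new(:value, :language)
  Note    = Struct.new(:kind, :value)
  Concept = Struct.new(:identifier, :labels, :notes, :matches)

  Collection = Struct.new(:name, :members) do
    # Collect all concepts, descending into nested collections.
    def concepts
      members.flat_map do |member|
        member.is_a?(Collection) ? member.concepts : [member]
      end
    end
  end
end

forest = SkosSketch::Concept.new(
  'forest',
  [SkosSketch::Label.new('Forest', 'en'), SkosSketch::Label.new('Wald', 'de')],
  [SkosSketch::Note.new(:definition, 'Land dominated by trees')],
  ['http://example.org/other-scheme/woodland'] # match into another scheme
)
river = SkosSketch::Concept.new('river', [SkosSketch::Label.new('River', 'en')], [], [])

habitats  = SkosSketch::Collection.new('Habitats', [forest])
landscape = SkosSketch::Collection.new('Landscape', [habitats, river])

puts landscape.concepts.map(&:identifier).inspect
```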

For a detailed description of the SKOS model, please refer to the SKOS standard reference. The attachments contain a detailed domain model for SKOS itself.

Because of the many typed relations between models the domain model graphic only shows a simplified schema of iQvoc's model classes.

Class diagram

Comparison of iQvoc's model schema to standard SKOS:

  • There is no thesaurus class. An iQvoc installation is self-contained and represents a thesaurus instance. Multiple thesauri can be managed by installing multiple iQvoc instances. Connecting concepts can be done with matches.
  • Concept groups and schemes can be implemented by using collections.

5.3 Infrastructure

5.3.1 Server-side

On the server-side iQvoc is based on Ruby on Rails - it is compatible with a variety of SQL databases. Due to its relational database schema it is not compatible with non-relational databases such as graph-, document- or key-value stores.

The model layer is implemented using Rails' ActiveRecord ORM. Many model classes make heavy use of Single Table Inheritance (STI). E.g. the different kinds of a concept's notes are stored in the table notes. Some standard note types are: Definitions, Change notes and Editorial notes. Lots of the given classes share all of the attributes and don't employ significant amounts of custom logic so STI comes up as a suitable pattern.
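The sketch below imitates the Single Table Inheritance pattern in plain Ruby so that it runs without Rails. It is an illustration of the idea, not ActiveRecord's implementation: the in-memory NOTES_TABLE, the StiSketch namespace and the class names are assumptions. What it shares with real STI is the mechanism: one table, a type column naming the class, and subclasses that add no columns of their own.

```ruby
module StiSketch
  # One shared "table": every row has the same columns plus a type column.
  NOTES_TABLE = [
    { type: 'Definition',    language: 'en', value: 'Land dominated by trees' },
    { type: 'ChangeNote',    language: 'en', value: 'Label corrected' },
    { type: 'EditorialNote', language: 'en', value: 'Check with domain expert' }
  ].freeze

  class Note
    attr_reader :language, :value

    def initialize(language:, value:)
      @language = language
      @value = value
    end

    # Instantiate the subclass named in the row's type column.
    def self.instantiate(row)
      StiSketch.const_get(row[:type]).new(language: row[:language], value: row[:value])
    end

    # Query the shared table; subclasses only see rows of their own type.
    def self.all
      rows = NOTES_TABLE
      rows = rows.select { |row| row[:type] == name.split('::').last } unless self == Note
      rows.map { |row| instantiate(row) }
    end
  end

  # Subclasses share all attributes and add little or no logic of their own.
  class Definition < Note; end
  class ChangeNote < Note; end
  class EditorialNote < Note; end
end

puts StiSketch::Note.all.map { |note| note.class.name }.inspect
puts StiSketch::Definition.all.map(&:value).inspect
```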

As an authentication library we chose Authlogic because of its unobtrusive approach. Authorization is implemented using the CanCan library.

HTML rendering uses Rails' ERB template language. Additionally we implemented RDF rendering in both Turtle and RDF/XML using a DSL which was extracted into the open source project IqRdf.

As iQvoc provides RDF rendering in multiple formats it can be easily connected to triple stores like Virtuoso.
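For readers unfamiliar with Turtle, the following sketch shows what rendering a concept in that format amounts to. The to_turtle helper is hypothetical and deliberately naive; it is not IqRdf's DSL, and the example.org prefix and concept names are assumptions.

```ruby
# Hypothetical helper, not IqRdf's API: renders one subject with its
# predicate/object pairs as a single Turtle statement.
def to_turtle(subject, properties)
  lines = properties.map { |predicate, object| "  #{predicate} #{object}" }
  "#{subject}\n#{lines.join(" ;\n")} ."
end

turtle = to_turtle(':forest', [
  ['a', 'skos:Concept'],
  ['skos:prefLabel', '"Forest"@en'],
  ['skos:broader', ':habitat']
])

puts '@prefix skos: <http://www.w3.org/2004/02/skos/core#> .'
puts '@prefix : <http://example.org/> .'
puts turtle
```

A DSL such as IqRdf additionally takes care of escaping, prefixes and alternative serializations, which this sketch leaves out.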

5.3.2 Client-side

iQvoc's user interface employs progressive enhancement by making use of jQuery, providing a variety of JavaScript widgets to simplify navigation and data entry:

  • treeview provides a dynamic tree navigation of hierarchical constructs
  • datepicker simplifies entry of dates
  • autocomplete provides in-place suggestions when entering references
  • jit-rgraph visualizes concept and label relations

5.3.3 Main components

The main components are:

A detailed dependency graph can be found in the attachments. Please keep in mind that it is only a snapshot taken at the time this document was created or last updated.

5.4 Modularization

Strong use of modularization enables us to develop and maintain a full-featured core software whilst being able to extend it easily with new features or replace core logic. This chapter outlines some of the important modularization techniques we used.

5.4.1 Rails Engines

Rails 2.3 introduced a very powerful new feature called "Rails Engines". By using engines one can embed one Rails application into another. The entire iQvoc architecture is based on this technique. Rails 3 elevated the engine feature to a more advanced level and enabled complete "mountable apps".

There are two ways the iQvoc core system can be used:

  • Stand-alone
    The iQvoc source code can be cloned in order to set up a vocabulary instance.
  • As an engine
    If a vocabulary requires deeper customizations and/or extensions, iQvoc can be mounted into a separate vocabulary application. Every model, controller, view or route that the iQvoc core system provides is then available within that host application.

In order to be able to run iQvoc as both a stand-alone application and a mountable engine, we had to use a small hack:

# lib/iqvoc.rb
unless Iqvoc.const_defined?(:Application)
  require File.join(File.dirname(__FILE__), '../config/engine')
end

This works because the constant Iqvoc::Application is only available when iQvoc is booted as a stand-alone Rails application.
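The check itself can be tried outside of Rails. The sketch below uses a hypothetical DemoFramework module in place of Iqvoc and reports which boot path the guard would take, before and after an Application constant exists.

```ruby
module DemoFramework
  # Mimics the guard above: the engine is only loaded when no
  # Application constant has been defined in the framework's namespace.
  def self.boot_mode
    const_defined?(:Application) ? :stand_alone : :engine
  end
end

puts DemoFramework.boot_mode # Application is not defined yet

# A stand-alone Rails application would define this constant itself,
# typically in config/application.rb.
module DemoFramework
  class Application; end
end

puts DemoFramework.boot_mode
```

Note that Module#const_defined? also searches Object when called on a module, so a top-level constant named Application would satisfy the check as well.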

5.5 APIs

iQvoc provides several interfaces that allow the integration with the core framework.

5.5.1 Configuration

The core configuration is implemented as a standard Ruby module and is the most important part of the framework API. Independent of the given iQvoc setup (stand-alone or engine, see 5.4.1), we encourage leaving the standard configuration module lib/iqvoc.rb as is. Default configuration options are implemented as standard module attributes and can be overridden.

Example

# config/initializers/iqvoc.rb
require 'iqvoc'
Iqvoc::Concept.base_class_name = 'Concept::MyNamespace::Base'

The configuration module provides several hooks that can be used to configure classes to use for domain models and model relations. More information can be found in the configuration source documentation. rdoc documentation is not yet available at this point.
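How a module attribute with an overridable default can work is sketched below in plain Ruby. Only the attribute name base_class_name is taken from the example above; VocabConfig, DefaultConcept, CustomConcept and the base_class resolver are assumptions, and the real module may be implemented differently.

```ruby
# Hypothetical stand-in for the configuration module.
module VocabConfig
  module Concept
    class << self
      attr_writer :base_class_name

      # Default value that an initializer may override.
      def base_class_name
        @base_class_name ||= 'DefaultConcept'
      end

      # Resolve the configured name to a class only when it is needed,
      # so the class does not have to be loaded at configuration time.
      def base_class
        Object.const_get(base_class_name)
      end
    end
  end
end

class DefaultConcept; end
class CustomConcept < DefaultConcept; end

puts VocabConfig::Concept.base_class # default
VocabConfig::Concept.base_class_name = 'CustomConcept'
puts VocabConfig::Concept.base_class # overridden
```

Storing the class name as a string rather than the class itself is one plausible reason for the _name suffix in the real option: an initializer can then set it before the referenced class has been loaded.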

5.5.2 Assets

iQvoc uses the Rails asset pipeline feature to organize and package its core images, stylesheets and JavaScript files. If iQvoc is being used as a framework, it is important to respect some defaults in order to inject custom assets while keeping the framework's core assets intact. More information on using the core framework asset API is documented here.

5.5.3 Internationalization

iQvoc uses standard Rails I18n to translate UI strings and parts of the source code like model and attribute names. If iQvoc is being used as a framework and the configuration API is used to extend it with custom classes, these can be translated by using standard Rails patterns in the application.

5.6 Dependency Management

Dependencies are managed with Bundler.

5.7 Software Configuration Management

Git is used for source code control. The open source code is hosted on GitHub.

5.8 Versioning & Release

iQvoc uses Semantic Versioning in order to deliver a consistent and understandable versioning schema of the open source code.

Every release has its own Git tag and is also pushed to RubyGems, so iQvoc can be installed via the standard Ruby package management system when it is to be used as a framework: gem install iqvoc.
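Semantic Versioning pays off when an embedding application constrains the framework version. The sketch below uses RubyGems' own Gem::Requirement and Gem::Version classes to show which releases a pessimistic constraint accepts; the version numbers are illustrative and do not refer to actual iQvoc releases.

```ruby
require 'rubygems'

# "~> 3.0" accepts any 3.x release but rejects the next major version,
# which under Semantic Versioning may contain breaking API changes.
requirement = Gem::Requirement.new('~> 3.0')

%w[3.0.0 3.5.1 4.0.0].each do |version|
  accepted = requirement.satisfied_by?(Gem::Version.new(version))
  puts "#{version}: #{accepted ? 'accepted' : 'rejected'}"
end
```

The same constraint syntax is what a Gemfile entry for the framework would use.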

5.9 Setup

iQvoc is designed to maintain a single thesaurus in a single application. It is not multi-tenant-capable in the sense of managing multiple thesauri within one application. This requirement can instead be met by setting up multiple iQvoc applications; concepts can be linked and referenced between the instances by using matches (cf. 5.2).

The detailed steps and requirements for setting up iQvoc in different environments are explained here.

5.10 Import

Standard SKOS RDF data can be imported into an iQvoc application. For this purpose iQvoc provides a CLI-based importer and, in the latest core version, a web frontend. Detailed instructions on how to use the import CLI can be found here.

6. Quality Assurance

iQvoc provides a test suite consisting of:

  • Unit tests
    Ensure business logic implemented in the model layer does what it is expected to do.
  • Integration tests
    We use integration tests in which the site is browsed within a headless WebKit browser. This ensures full-stack testing from the outside in by replaying important user workflows in an automated way.

Resolving dependencies as well as the execution of the test suite and the migration of the database schema are part of a continuous integration process that runs on the open source distributed build platform Travis CI.

7. Risks & Complexities

7.1 Configuration

Because iQvoc is a generic framework, it faces the typical questions of framework design:

  • Convention or configuration?
  • How many configuration options are provided?

The flexible configuration described in 5.5.1 is one of the most complex parts of the iQvoc core. Configurable model associations make class definition code rather complex and hard to read. Additionally, model references throughout the code become rather implicit.

These things could be a potential risk for code maintainability.

7.2 Editing Workflow

iQvoc features a full-fledged editing and publishing workflow for concepts based on roles and permissions. As described in chapter 5.3.1 we use CanCan to define permissions for roles. We chose a very pragmatic approach for roles: each user can have exactly one role, which is stored directly in the user record. The permissions are defined with a DSL that CanCan provides in a so-called ability file.

  • When viewing an individual concept or label, editors can choose to edit the respective entry by creating a new version - alternatively, new entries can be created via the user's dashboard
  • Creating a new version creates a private revision and locks the entry, preventing concurrent edits by other users
  • After editing an entry, the editor can propose the changes for publication
  • Proposed changes appear in the publishers' dashboard where they can be reviewed and publication can be approved
  • Upon publication, the updated version replaces the public version
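The steps above can be read as a small state machine. The following plain-Ruby sketch models them under simplifying assumptions: the Entry class, its state names and the role symbol :publisher are hypothetical, permissions are reduced to a single role check instead of CanCan abilities, and iQvoc's actual implementation differs.

```ruby
# Hypothetical model of the workflow described in the list above.
class Entry
  class WorkflowError < StandardError; end

  attr_reader :published, :draft, :locked_by, :state

  def initialize(published)
    @published = published
    @draft = nil
    @locked_by = nil
    @state = :published
  end

  # Creating a new version opens a private draft and locks the entry.
  def branch(editor)
    raise WorkflowError, "locked by #{locked_by}" if locked_by

    @draft = published.dup
    @locked_by = editor
    @state = :in_edit
  end

  # Only the lock owner may change the draft.
  def edit(editor, value)
    raise WorkflowError, 'not the lock owner' unless locked_by == editor

    @draft = value
  end

  # The editor proposes the draft for publication.
  def propose(editor)
    raise WorkflowError, 'not the lock owner' unless locked_by == editor

    @state = :in_review
  end

  # Only publishers may approve; the draft then replaces the public version.
  def publish(user_role)
    raise WorkflowError, 'publisher role required' unless user_role == :publisher
    raise WorkflowError, 'nothing proposed' unless state == :in_review

    @published = draft
    @draft = nil
    @locked_by = nil
    @state = :published
  end
end

entry = Entry.new('Forrest')
entry.branch('alice')
entry.edit('alice', 'Forest')
entry.propose('alice')
entry.publish(:publisher)
puts entry.published
```

Keeping the public version untouched until publish is called is what allows readers to see a stable vocabulary while edits are in progress.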

Creating versioning logic and state machines is non-trivial and can lead to bugs. We tried to ensure the software contains as few bugs as possible in the integration of the editing workflow by extending the test suite covering these sections.

8. Attachments

8.1 Standard SKOS diagram

Standard SKOS class diagram

8.2 Dependency Graph

Dependency graph