This document proposes a standard extension and an
interface for non-standard inferences for DIG
2.0. The proposal is currently under construction.
Non-standard inferences (NSIs) are increasingly recognised as a
useful means of realising new kinds of applications, and the number of
applications that use them in their framework is
growing. To implement such applications it is desirable to
reuse existing implementations of NSIs. An extension of the DIG
interface to NSIs would give easy access to these implementations.
There are different NSIs investigated and developed by different
research groups. This extension proposal focuses on the NSIs studied
at the TU Dresden. However, the extensions for NSIs are not restricted
to this set of inferences. For an overview of the non-standard
inferences covered here, please see [NSIs]. So far,
all of the NSIs proposed here are the ones implemented in the system
SONIC. We
try to propose a general scheme for NSIs, but at this stage the
proposal is still quite tailored to this system.
The terminologies acceptable as input for most of the NSIs covered
here are currently restricted to unfoldable (i.e. acyclic and with
unique definitions) terminologies and to a set of concept constructors
that is in most cases restricted to the set of sub-boolean operators
plus (unqualified) number restrictions.
We have a couple of technical requirements for the environment in
which our NSI extension operates. The NSI component works as a DIG
server for the application and as a DIG client, since tests for
satisfiability and subsumption are needed during the computation of
the NSIs as well as access to told information to retrieve the concept
definitions of concept names. To avoid duplicate storing and
classifying of the KB and to ensure that the NSI component works on
the identical KB as the application, the NSI component must use
the same instance of the DL reasoner as the application. This
way it is guaranteed that the NSI component always works on the
current version of the KB even in case the application has
re-submitted an edited version of the KB.
In its current implementation the SONIC NSI component is a stand-alone
application that is invoked by the application and connects to the
same instance of the DIG DL reasoner as the application. It gets the
port number etc. from the application. For the forthcoming version of
DIG our NSI component also needs a connection to a component that
implements the told information access interface as specified by the
proposal for accessing told data. We assume here that one
connection suffices to access both, the told information component and
the core DIG reasoner, which can either be realised by a core DIG
reasoner supporting the told information access interface or by the
middle-ware reference implementation of the DIG group.
Besides the information for the connection the NSI component needs the
URI of the knowledge base that is loaded into the reasoner. Thus the
URI of the KB must be transmitted as an identifier.
In contrast to standard reasoning tasks, most NSIs return complex
concept descriptions and not only Boolean values. Furthermore the NSIs
take into account the "target DL" for which they are applied. For
example concept approximation "translates" concept descriptions from
one DL L1 to another DL L2. Thus the NSI
extension schema must provide means to communicate the respective DL.
To sum up, our NSI component needs the following information:
- in each ask:
  - Port number under which the told information
    access component and the standard DL reasoner run.
  - Machine name where the told information
    access component and the standard DL reasoner run; default is
    localhost.
  - URI of the knowledge base as ID of the KB in
    use.
- in some queries:
  - the target DL for the NSI.
The operators offered by our NSI extension are listed and explained
below. First we give an abstract pseudo-LISP style syntax
to illustrate their use. These inference services will be
offered as ask statements in the XML schema. The whole NSI extension
schema can be found here.
- Approximation
Intuitively, the approximation of a concept description written
in a DL L1 is a translation into a
typically less expressive DL L2 with a
minimal loss of information, for example computing
ALEN-approximations of ALCN-concept descriptions. The following
ask statement gets the approximation in the target DL
w.r.t. the current TBox:
(approximation <concept description>, <target DL>)
It returns a concept description in the target DL.
- Least Common Subsumer (LCS)
The LCS of a set of concept descriptions is a concept
description that subsumes all input concepts and is the
least one w.r.t. subsumption to do so. The most
expressive DL for which the LCS inference is currently
available is ALEN. The following ask statement queries for the
LCS w.r.t. the current TBox:
(lcs <concept description>, <concept description>+, <target DL>)
It returns a concept description.
- Good Common Subsumer (GCS)
The GCS of a set of concept descriptions is a concept
description that subsumes all input concepts, but does not need
to be the least one w.r.t. subsumption. Furthermore, the
implementation performs the computation w.r.t. a background terminology. The approach
supports a (usually expressive) background terminology that is
extended by the (usually not so expressive) user
terminology. The most expressive DL for which the GCS is
available is ALE user terminology w.r.t. an ALC background
terminology.
The following ask statement queries for the GCS w.r.t. the
current TBox:
(gcs <concept description>, <concept description>+, <target DL>)
It returns a concept description.
- Find Matching Concepts
This operation has one parameter: a concept
pattern. Roughly speaking, a concept pattern is a concept
description where variables can appear at the places where
concept names can appear in usual concept descriptions. The
variables that appear in the concept patterns must be
distinguished from concept names in the DIG standard by an
extra tag.
(Find_Matching_Concepts <concept pattern>)
Returns a set of concept names
of concepts defined in the knowledge base which can be matched
with the concept pattern, i.e., the variables in the concept
pattern can be substituted by concept descriptions such that
the resulting concept description is equivalent to the
concepts in the returned set.
- Find Matcher
This operation has two parameters: a concept description and a
concept pattern.
Matching of a concept description C and a concept
pattern D answers the question whether and how the
variables in the concept pattern D can be assigned
to concept descriptions s.t. a concept description is obtained
that is equivalent w.r.t. the current TBox to the concept
description C .
(Find_Matcher <concept description>, <concept pattern>)
It returns a (possibly empty) set of assignments, i.e., pairs
of the variables and concept descriptions. Each assignment
yields a concept description that is equivalent w.r.t. the
TBox to the input concept description, if substituted in the
concept pattern.
- Minimal Rewriting
The minimal rewriting of a concept description C1
w.r.t. a TBox is a concept description C2, which is
equivalent to C1 and of minimal size. So, intuitively, a
minimal rewriting returns a more compact form of a concept
description. Our implementation realises a heuristic for
minimal rewriting that returns a small, but not necessarily the
smallest, rewriting for ALE-concept descriptions.
(minimal_Rewriting <concept description>)
It returns a smaller concept description equivalent to the
input concept description.
This list is certainly not complete, and the proposal is open to
extension by other NSIs such as explanation.
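To make the Find Matcher result more concrete, the following is a minimal sketch of how one assignment is substituted into a concept pattern. The tuple representation of concepts, the `Var` class, and all concept and role names are assumptions made for illustration; they are not part of the proposal. The sketch covers only the substitution step; deciding whether the result is equivalent to the input concept description w.r.t. the TBox still requires a DL reasoner.

```python
class Var:
    """A variable occurring in a concept pattern (hypothetical representation)."""

    def __init__(self, name: str) -> None:
        self.name = name


def substitute(pattern, assignment):
    """Replace every variable in `pattern` by its assigned concept description.

    A concept is a name (str), a Var, or a tuple (constructor, arg, ...),
    mirroring the pseudo-LISP syntax used above.
    """
    if isinstance(pattern, Var):
        return assignment[pattern.name]
    if isinstance(pattern, tuple):
        head, *args = pattern
        return (head, *(substitute(arg, assignment) for arg in args))
    return pattern  # a concept name or role name stays unchanged


# Pattern: Person AND (SOME hasChild X), with X a variable.
pattern = ("and", "Person", ("some", "hasChild", Var("X")))
# One assignment, mapping the variable X to a concept description.
assignment = {"X": ("and", "Doctor", "Female")}

result = substitute(pattern, assignment)
```

Each assignment returned by Find Matcher would be applied in this way; the service guarantees that the substituted pattern is equivalent to the input concept description.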
Our NSI extension schema mainly focuses on the two following aspects:
- Ask statements for the above mentioned inferences. Here the use
of variables in the concept patterns and the use of the target DL
are the most notable extensions.
- Response statements for the queries. Since the core DIG schema
does not yet supply a response schema, our proposal is the first in
that direction and, of course, under construction.
Most queries require concept descriptions as input or produce
concept descriptions as output. These are to be specified according to
the DIG
core schema. The whole XML schema for the NSI extension can be
found here.
The ask statements use parts of the DIG
core schema and provide the group nsiQueries with elements that
are likely to be useful for other NSIs or inference services that are
not yet incorporated in DIG or one of its extensions.
The root of NSI requests is the tag <asks>. Thus all NSI queries
are children of this element. A request may contain several
queries. As stated in the Requirements,
the NSI reasoner needs some information about other components. This
information is provided by the following attributes of the asks
element:
- port - The port number on which the told information
access component (and the DIG 2.0 standard reasoner) is
listening.
- machineName - The host name/IP of the machine
that is running the told information access component (and the DIG
2.0 standard reasoner); defaults to localhost.
- kb - The name/identifier (URI) of the knowledge
base (TBox).
It is necessary to pass this information in each NSI request, since
the DIG protocol is stateless. Each request contains at least one
query. The schema for an NSI request is the following:
<xsd:element name="asks">
<xsd:complexType>
<xsd:sequence>
<xsd:group ref="nsiQueries" minOccurs="1" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="machineName" type="xsd:string" use="optional" default="localhost"/>
<xsd:attribute name="port" type="xsd:int" use="required"/>
<xsd:attribute name="kb" type="xsd:string" use="required"/>
</xsd:complexType>
</xsd:element>
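As an illustration of the request envelope, here is a minimal Python sketch that builds an <asks> element with the three attributes from the schema above. Namespaces are omitted for brevity, and the `concept` child element, the port, and the KB URI are hypothetical placeholders: concept descriptions would actually be written with whatever elements the DIG core language schema defines.

```python
import xml.etree.ElementTree as ET


def make_asks(port: int, kb: str, machine_name: str = "localhost") -> ET.Element:
    """Build the <asks> root element of an NSI request.

    `port` and `kb` are required by the schema; `machineName` is optional
    and defaults to "localhost".
    """
    asks = ET.Element("asks")
    asks.set("machineName", machine_name)
    asks.set("port", str(port))
    asks.set("kb", kb)
    return asks


# Example values only; the port and KB URI are made up for this sketch.
request = make_asks(8080, "http://example.org/kb/family")

# A request must contain at least one query from the nsiQueries group.
query = ET.SubElement(request, "minimalRewriting", {"id": "q1"})
# Hypothetical stand-in for a concept description from the DIG core schema.
ET.SubElement(query, "concept", {"name": "Parent"})

xml_text = ET.tostring(request, encoding="unicode")
```

Because the connection data travels with every request, the NSI component does not have to remember anything between requests.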
Each subelement of <asks> is an element of the group
"nsiQueries". The elements of nsiQueries are elements that represent
"ask" query types for the NSI services. Although currently limited to
the ones available in SONIC, the idea is to specify query types that
might be useful for other NSIs or even other services. Before turning
to the schema, we would like to explain the idea behind the new types
and their naming:
- basicTargetDLClassAsk: Covers kinds of
queries that have a concept description (class) and a target DL as
input.
- findAssignmentsAsk: Covers kinds of queries
that return an assignment. In our case an assignment is a set of
pairs, each consisting of a variable and a concept description. This
type might also be of interest for the
forthcoming conjunctive query proposal. In contrast to the DIG
naming scheme, this name refers to the return value instead of
the input value.
- findAssignableConceptsAsk: Covers all kinds
of queries that return a variable assignment to concepts. In
contrast to the DIG naming scheme, this name refers to the return
value instead of the input value.
- basicTargetDLClassSequenceAsk: Covers kinds
of queries that have a sequence (rather a set) of concept
descriptions and a target DL as input.
These types are used in the following way in the schema:
<xsd:group name="nsiQueries">
<xsd:choice>
<xsd:element name="approximation" type="basicTargetDLClassAsk"/>
<xsd:element name="findMatcher" type="findAssignmentsAsk"/>
<xsd:element name="findMatchingConcepts" type="findAssignableConceptsAsk"/>
<xsd:element name="lcs" type="basicTargetDLClassSequenceAsk"/>
<xsd:element name="gcs" type="basicTargetDLClassSequenceAsk"/>
<xsd:element name="minimalRewriting" type="ask:basicClassAsk"/>
</xsd:choice>
</xsd:group>
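The mapping defined by this group can be transcribed into a small lookup table, for example to check on the client side whether a query needs a targetDL attribute before it is sent. The following sketch copies the names from the schema; the helper `requires_target_dl` is a hypothetical convenience function and not part of the proposal.

```python
# Transcription of the nsiQueries group: query element name -> schema type.
NSI_QUERY_TYPES = {
    "approximation": "basicTargetDLClassAsk",
    "findMatcher": "findAssignmentsAsk",
    "findMatchingConcepts": "findAssignableConceptsAsk",
    "lcs": "basicTargetDLClassSequenceAsk",
    "gcs": "basicTargetDLClassSequenceAsk",
    "minimalRewriting": "ask:basicClassAsk",
}

# Types whose definition declares a required targetDL attribute.
TYPES_WITH_TARGET_DL = {"basicTargetDLClassAsk", "basicTargetDLClassSequenceAsk"}


def requires_target_dl(query_name: str) -> bool:
    """Return True if the named query must carry a targetDL attribute."""
    try:
        return NSI_QUERY_TYPES[query_name] in TYPES_WITH_TARGET_DL
    except KeyError:
        raise ValueError(f"unknown NSI query: {query_name!r}") from None
```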
Next, we present the type for each NSI
service that is provided by our extension. Note that in the following we
use notions from the "base schemas" of core DIG 2.0, referenced
with the prefixes ask and lang.
All types defined here extend basicAsk or basicClassAsk
from the core
DIG 2.0 ask schema.
Since most NSIs need a target DL as input, the queries must specify
it. The target DL should be specified in string format in
the attribute targetDL. We propose to choose
this string according to the DL community's naming convention for DLs,
such as "EL" or "ALEN". An alternative would be to specify the
DL in a separate schema that is then used as a URI in the NSI
extension. However, if the core DIG 2.0 schema provides a standard
for naming DLs, we will use that standard in our extension. In
addition, as in the "standard" ask in the core DIG, each query has an
attribute id for identification in the
response.
A.1 Ask statement for Approximation
The subelement of this tag is the concept description for which the
approximation has to be computed; the target DL is given in the attribute
targetDL.
<xsd:complexType name="basicTargetDLClassAsk">
<xsd:complexContent>
<xsd:extension base="ask:basicClassAsk">
<xsd:attribute name="targetDL" type="xsd:string" use="required"/>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
A.2 Ask statements for LCS and GCS
These two queries have the same type:
basicTargetDLClassSequenceAsk. The
subelements of this tag are (at least two) concept descriptions. The
user must also state the target DL for the LCS or GCS in the attribute
targetDL.
<xsd:complexType name="basicTargetDLClassSequenceAsk">
<xsd:complexContent>
<xsd:extension base="ask:basicAsk">
<xsd:sequence>
<xsd:group ref="lang:description" minOccurs="2" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="targetDL" type="xsd:string" use="required"/>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
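A client could enforce the "at least two concept descriptions" constraint before sending an LCS or GCS query. The following is a minimal sketch under stated assumptions: namespaces are omitted, and `named_concept` with its `concept` element is a hypothetical stand-in for concept descriptions from the DIG core language schema.

```python
import xml.etree.ElementTree as ET
from typing import Sequence


def named_concept(name: str) -> ET.Element:
    """Hypothetical stand-in for a concept description element."""
    return ET.Element("concept", {"name": name})


def sequence_ask(
    name: str, query_id: str, target_dl: str, descriptions: Sequence[ET.Element]
) -> ET.Element:
    """Build an <lcs> or <gcs> query of type basicTargetDLClassSequenceAsk."""
    if name not in ("lcs", "gcs"):
        raise ValueError(f"not a sequence query: {name!r}")
    # Mirrors minOccurs="2" on the description group in the schema.
    if len(descriptions) < 2:
        raise ValueError("at least two concept descriptions are required")
    ask = ET.Element(name, {"id": query_id, "targetDL": target_dl})
    ask.extend(descriptions)
    return ask


lcs_query = sequence_ask(
    "lcs", "q2", "ALEN", [named_concept("Mother"), named_concept("Father")]
)
```

A validating parser on the server side would reject a query with a single description anyway; checking early merely saves a round trip.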
A.3 Ask statement for Find Matcher
The subelements of this tag are a concept description followed by a
concept pattern.
<xsd:complexType name="findAssignmentsAsk">
<xsd:complexContent>
<xsd:extension base="ask:basicAsk">
<xsd:sequence>
<xsd:group ref="lang:description"/>
<xsd:group ref="conceptPattern"/>
</xsd:sequence>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
A.4 Ask statement for Find Matching Concepts
The subelement of this tag is a concept pattern.
<xsd:complexType name="findAssignableConceptsAsk">
<xsd:complexContent>
<xsd:extension base="ask:basicAsk">
<xsd:group ref="conceptPattern"/>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
A.5 Ask statement for Minimal Rewriting
The only subelement of this tag is the concept description that is
going to be rewritten, thus we can use "basicClassAsk" from the ask
schema and don't need an extension for this service.
Open question: should a tag nsiResponse be introduced for grouping responses?
Since the core DIG does not yet have a response schema this proposal
is quite preliminary and will evolve as the DIG response schema
evolves.
Every individual response has to include the
id of the query it refers to. Most of the
services that the NSI reasoner provides return (a set of) concept
descriptions. These (sets of) concept descriptions should be handled
according to the core DIG 2.0 schema. Thus we
do not propose a formal and complete schema for this kind of response.
Instead we just give a proposal of a valid XML document for this kind of
response in the example
section.
The only reasoning service that does not return a set of concept
descriptions is "Find Matcher". This service returns a (possibly
empty) set of matchers, i.e., assignments, which are pairs of
variables and concept descriptions. So, we need a schema definition for a set
of assignments, which is then refined to a set of matchers.
<xsd:complexType name="basicAssignment">
<xsd:sequence>
<xsd:element name="variable" type="lang:named"/>
<xsd:group ref="lang:description"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="basicAssignmentsResponse">
<xsd:sequence>
<xsd:element name="assignment" type="basicAssignment" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attributeGroup ref="ask:basicAskAttributes"/>
</xsd:complexType>
<xsd:element name="matchers" type="basicAssignmentsResponse"/>
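To show how a client might consume such a response, here is a minimal Python sketch that parses a <matchers> document into pairs of variable name and concept description. The sample document is invented for illustration: it assumes that the type lang:named carries a `name` attribute, that ask:basicAskAttributes includes `id`, and it uses a hypothetical `concept` element in place of a real DIG concept description. Namespaces are omitted.

```python
import xml.etree.ElementTree as ET

# Invented sample response; element and attribute details are assumptions.
SAMPLE_RESPONSE = """
<matchers id="q3">
  <assignment>
    <variable name="X"/>
    <concept name="Doctor"/>
  </assignment>
  <assignment>
    <variable name="Y"/>
    <concept name="Female"/>
  </assignment>
</matchers>
"""


def parse_matchers(xml_text: str):
    """Return (query id, list of (variable name, description element))."""
    root = ET.fromstring(xml_text.strip())
    pairs = []
    for assignment in root.findall("assignment"):
        # Per basicAssignment: a variable followed by one concept description.
        variable, description = list(assignment)
        pairs.append((variable.get("name"), description))
    return root.get("id"), pairs


query_id, pairs = parse_matchers(SAMPLE_RESPONSE)
```

An empty <matchers> element yields an empty list, which corresponds to the case where no matcher exists.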
The complete definition of our proposal of NSI extension schema can be
found here.
This page
illustrates the use of the inferences by providing sample KBs, queries
and the responses.
- Architecture of core DIG and stand-alone DIG extensions
- Referencing of DLs
Should the targetDL be a string or a URI?
- Clever naming scheme for response involving variables, so that
it can be re-used by the forthcoming "conjunctive query" extension.
- Error Messages
The NSI extension needs different error messages and codes than the
core part of DIG. For example, error codes should be
reserved for the following kinds of messages:
- Inference not supported for this concept language.
- TBox is not unfoldable.
- The concept pattern given has no variable.
Can some of them be shared with other extensions?
- [NSIs] Using
Non-standard Inferences in Description Logics — what does
it buy me?
- [GCS] Computing
the GCS w.r.t. a Background terminology.
Anni-Yasmin Turhan
Last modified: Thu Sep 7 15:23:17 CEST 2006