2.7 Digital Dictionaries and Thesauri
This project will rely heavily on dictionary and thesaurus data. There are several digital
encoding formats, sources, and services openly available. In this section we identify
those which are most likely to be of use for the project.
There is some terminology we need to be aware of before continuing.
Antonym: Term that possesses the opposite meaning of another term
Most basic lexical form of a term e.g.
, but not
Area of identifiable knowledge that can be associated with other entities and
Lemma: Canonical form of a term
Vocabulary of terms usually including descriptions, sometimes restricted to a
specific area of interest
Metadata: Data about data
Morphology: Structure and form of words
Data model that describes concepts (usually within a specific knowledge-
Semantic relationships are described as related, broader, and narrower
Semantics: Meaning of a term or set of terms (sentence)
Synonym: Term which has the same or similar meaning as another term
Synset: Set of synonyms
Term: Word or phrase
SKOS (Simple Knowledge Organisation System) is a format used to encode data about
concepts. Miles, a key member of the SKOS community, describes SKOS as allowing us to:
"- identify concepts with URIs
- label concepts with literals (e.g. `love'@en), symbols, sounds? other?
- document concepts with definitions, examples, scope notes, history notes, editorial notes...
- semantically relate concepts
- organise concepts into concept schemes, and into smaller meaningful groupings (`arrays')
- use concepts to subject-index documents"
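The capabilities listed above can be illustrated with a minimal sketch in plain Python. It models a handful of SKOS statements as (subject, predicate, object) triples and looks up labels and broader concepts. The SKOS and RDF namespace URIs are the published ones; the `http://example.org/thesaurus/` namespace, the sample concepts, and the helper functions are hypothetical and serve only as illustration.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/thesaurus/"  # hypothetical namespace for this sketch

# URIs are plain strings; a literal is modelled as a (text, language) pair.
triples = {
    # identify a concept with a URI and place it in a concept scheme
    (EX + "love", RDF_TYPE, SKOS + "Concept"),
    (EX + "love", SKOS + "inScheme", EX + "emotions"),
    # label and document the concept
    (EX + "love", SKOS + "prefLabel", ("love", "en")),
    (EX + "love", SKOS + "definition", ("A strong feeling of affection", "en")),
    # semantically relate it to other concepts
    (EX + "love", SKOS + "broader", EX + "emotion"),
    (EX + "love", SKOS + "related", EX + "affection"),
    (EX + "emotion", RDF_TYPE, SKOS + "Concept"),
    (EX + "emotion", SKOS + "prefLabel", ("emotion", "en")),
}


def objects(subject, predicate):
    """Return every object stated for a subject/predicate pair, in a stable order."""
    found = (o for s, p, o in triples if s == subject and p == predicate)
    return sorted(found, key=repr)


def label(concept, lang="en"):
    """Return the preferred label of a concept in the given language, or None."""
    for text, language in objects(concept, SKOS + "prefLabel"):
        if language == lang:
            return text
    return None


def broader_labels(concept):
    """Return the preferred labels of all directly broader concepts."""
    return [label(c) for c in objects(concept, SKOS + "broader")]
```

Under these assumptions, `label(EX + "love")` yields the English label of the concept and `broader_labels(EX + "love")` yields the labels of its parent concepts. A real project would more likely use an RDF library than hand-built tuples.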
It is worth noting that, at the time of writing, no final SKOS specification has been
published. The current specification was written in 2005 and is a working draft [MB05].
Central to SKOS is RDF (Resource Description Framework). RDF is a metadata model
that builds on XML and URI technology and provides a format in which data can be
encoded and optionally referenced using URIs.
RDF is typically written in XML (RDF/XML) and is intended to provide a logical, structured
format that is easily processed by machines. RDF attempts to provide a semantic view of data
so that machines can understand how resources relate to one another. RDF documents are not
designed to be viewed directly on the World Wide Web [HK07, W3S07, BM04].
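To make the XML encoding concrete, the following sketch builds a small RDF/XML document with Python's standard `xml.etree.ElementTree` module and reads the statements back out as triples. The RDF and SKOS namespace URIs are the published ones. The `http://example.org/thesaurus/` namespace and the functions `build_concept_document` and `read_triples` are assumptions made for illustration, and the reader handles only the simple document shape produced here, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
EX = "http://example.org/thesaurus/"  # hypothetical namespace for this sketch

# Ask ElementTree to use the conventional prefixes when serialising.
ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)


def build_concept_document(concept_uri, pref_label, broader_uri, lang="en"):
    """Build an rdf:RDF element describing one SKOS concept."""
    root = ET.Element(f"{{{RDF}}}RDF")
    concept = ET.SubElement(
        root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": concept_uri}
    )
    label = ET.SubElement(concept, f"{{{SKOS}}}prefLabel", {XML_LANG: lang})
    label.text = pref_label
    ET.SubElement(
        concept, f"{{{SKOS}}}broader", {f"{{{RDF}}}resource": broader_uri}
    )
    return root


def expand(tag):
    """Turn ElementTree's '{namespace}local' tag into a full URI."""
    return tag[1:].replace("}", "", 1)


def read_triples(root):
    """Extract (subject, predicate, object) triples from the shape built above."""
    triples = []
    for node in root:
        subject = node.get(f"{{{RDF}}}about")
        triples.append((subject, RDF + "type", expand(node.tag)))
        for prop in node:
            obj = prop.get(f"{{{RDF}}}resource") or prop.text
            triples.append((subject, expand(prop.tag), obj))
    return triples


document = build_concept_document(EX + "love", "love", EX + "emotion")
xml_text = ET.tostring(document, encoding="unicode")
statements = read_triples(ET.fromstring(xml_text))
```

Here `xml_text` holds the serialised document, in which every resource is referenced by URI, and `statements` holds the same information as a flat list of triples, which is the form in which a machine would typically reason about how resources relate.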