A common problem in terminology work is that the importance and indeed the very nature of terminology is poorly understood. Thus many people simply have no idea at all of what it is, while others, searching for an explanation of some sort, end up associating it with "thermal science" and hence radiators(1). Related professions in the communications field, such as translation and technical writing, will often be aware of the word without having precise knowledge of what it entails (cf. Chapter 3.1: "Actors and Working Conditions" for a more detailed discussion of this point).
In fact, terminology is a many-faceted subject being, depending on the perspective from which it is approached and the affiliations of the person discussing it:
To avoid confusion during its work, in particular when talking to non-specialists, the POINTER Project adopted a pragmatic definition of the word. In the context of this document and the POINTER Terms of Reference, therefore, "terminology" (or, in the plural, "terminological resources") has been defined as:
Three major points need to be made here:
In addition, the word "structured" needs some explanation: it should be noted that, in practice, terminological collections may well contain not only well structured standardised terms and concepts, but also innovative, vague and unstructured conceptual and linguistic information.
This basic definition of terminology is supplemented in this Final Report by two other terms:
and
One particular area of confusion highlighted by the POINTER Project is that of the differences between terminology and lexicology, and terminography and lexicography. Not only many non-specialists, but even many individuals working in such fields as language engineering and translation frequently confuse these concepts, and it is hoped that the explanations given below will contribute to a clearer understanding of the distinctions between these fields of activity.
While lexicology is the study of words in general, terminology is the study of special-language words or terms associated with particular areas of specialist knowledge. Neither lexicology nor terminology is directly concerned with any particular application. Lexicography, however, is the process of making dictionaries, most commonly of general-language words, but occasionally of special-language words (i.e. terms). Most general-purpose dictionaries also contain a number of specialist terms, often embedded within entries together with general-language words. Terminography (or, often misleadingly, "terminology"), on the other hand, is concerned exclusively with compiling collections of the vocabulary of special languages. The outputs of this work may be known by a number of different names - often used inconsistently - including "terminology", "specialised vocabulary", "glossary", and so on.
The work and objectives of lexicographers and terminographers are in many ways complementary, but there are a number of important differences which need to be noted.
Dictionaries are word-based: lexicographical work starts by identifying the different senses of a particular word form. The overall presentation to the user is generally alphabetical, reflecting the word-based working method. Synonyms (different form, same meaning) are therefore usually scattered throughout the dictionary, whereas polysemes (related but different senses) and homonyms (same form, different meaning) are grouped together.
While a few notable attempts have been made to produce conceptually-based general-language dictionaries (or "thesauri"), the results of such attempts are bound to vary considerably according to the cultural and chronological context of the author.
By contrast, high-quality terminologies are always in some sense concept-based, reflecting the fact that the terms which they contain map out an area of specialist knowledge in which encyclopaedic information plays a central role. Such areas of knowledge tend to be highly constrained (e.g. "viticulture"; "viniculture"; "gastronomy"; and so on, rather than "food and drink"), and therefore more amenable to a conceptual organisation than is the case with the totality of knowledge covered by general language. The relations between the concepts which the terms represent are the main organising principle of terminographical work, and are usually reflected in the chosen manner of presentation to the user of the terminology. Conceptually-based work is usually presented in the paper medium in a thesaurus-type structure, often mapped out by a system of classification (e.g. UDC) accompanied by an alphabetical index to allow access through the word form as well as the concept. In terminologies, synonyms therefore appear together as representations of the same meaning (i.e. concept), whereas polysemes and homonyms are presented separately in different entries.
In the electronic medium, similar considerations apply in principle to the organisation of entries with reference to synonyms and polysemes/homonyms. However, the retrieval of data still operates at present largely through the term (or a component of the term) rather than through the concept. Conceptually-based solutions for the representation and retrieval of data are being sought in the techniques of artificial intelligence.
Work organised conceptually may also be presented alphabetically, whereas the converse, i.e. the presentation of work originally organised according to the form of the word in a thesaurus-type structure, is highly problematic.
In dictionaries, related but different senses (or "polysemes") of the same word form are usually presented within one entry, e.g. bridge (of a violin, crossing a river, over a gap in teeth); unrelated different senses ("homonyms") of the same word form are normally presented as separate head words or entries, e.g. pupil (of the eye) and pupil (in a school). Synonym relations are not always made explicit in dictionaries, and the division of word forms into different senses tends to vary considerably between dictionaries. This lack of clear division into senses reflects the "slippery" nature of general-language words, compared to the more precise nature of terminological meaning.
In terminologies, homonyms and polysemes within the same subject field are treated as separate entries (because the definition of the concept is different), e.g. in Automotive Engineering, emission (the process of emitting exhaust gases) and emission (the exhaust gases themselves). Homonyms and polysemes of other subject fields are excluded. Synonyms, on the other hand, are always included as part of the same entry (being alternative representations of the same concept), e.g. automotive catalyst, catalytic converter.
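The concept-based organisation of entries can be sketched in code. The following Python fragment is only an illustrative model, not a format defined by the POINTER Project: the class name ConceptEntry, the identifiers C1 to C3, the two helper functions and the placeholder definition given for the catalytic converter are all assumptions introduced for this example. It shows synonyms sharing one entry, the two senses of "emission" held as separate entries, and an alphabetical index derived from the concept-based data.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ConceptEntry:
    """One terminological entry: a single concept plus every term naming it."""
    concept_id: str            # hypothetical identifier, not from the report
    subject_field: str
    definition: str
    terms: Tuple[str, ...]     # synonyms live together in the same entry


ENTRIES = [
    # Same word form, two concepts: two separate entries.
    ConceptEntry("C1", "Automotive Engineering",
                 "the process of emitting exhaust gases", ("emission",)),
    ConceptEntry("C2", "Automotive Engineering",
                 "the exhaust gases themselves", ("emission",)),
    # Two word forms, one concept: one entry. The definition is a placeholder.
    ConceptEntry("C3", "Automotive Engineering",
                 "device that treats exhaust gases (placeholder definition)",
                 ("automotive catalyst", "catalytic converter")),
]


def alphabetical_index(entries: List[ConceptEntry]) -> Dict[str, List[str]]:
    """Map each term form, in alphabetical order, to the concepts it names."""
    index = defaultdict(list)
    for entry in entries:
        for term in entry.terms:
            index[term].append(entry.concept_id)
    return dict(sorted(index.items()))


def find_by_component(entries: List[ConceptEntry],
                      component: str) -> List[ConceptEntry]:
    """Return entries with at least one term containing the given component."""
    needle = component.lower()
    return [e for e in entries if any(needle in t.lower() for t in e.terms)]


if __name__ == "__main__":
    for term, concept_ids in alphabetical_index(ENTRIES).items():
        print(term, "->", ", ".join(concept_ids))
```

In this sketch the alphabetical index is computed from the concept-based entries, which mirrors the observation that conceptually organised work can also be presented alphabetically. The function find_by_component imitates retrieval through the term or a component of the term.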
The "headwords" or rather "entry terms" in terminologies are all open-class words, i.e. nouns (the vast majority), some adjectives, verbs and adverbs. The headwords in general-language dictionaries cover all word classes, including so-called grammatical words such as modal auxiliaries (e.g. can, must), prepositions (e.g. on, with), articles (e.g. the, an), certain adverbs (e.g. very), and so on. In terminologies, such words may appear as a component of the term or be shown as a part of the term's phraseology (i.e. the usual pattern of its immediate linguistic environment), but never as independent entry terms.
Dictionaries of the general language are descriptive in their orientation, arising from the lexicographer's observation of usage. Terminologies may also be descriptive in certain cases (depending on subject field and/or application), but prescription (also: "normalisation" or "standardisation") plays an essential role, particularly in scientific, technical and medical work where safety is a primary consideration. Standardisation is normally understood as the elimination of synonymy and the reduction of polysemy/homonymy, or the coinage of neologisms to reflect the meaning of the term and its relations to other terms. Terminologies - the outcome of this work, often in electronic form as termbases - are then the principal means of dissemination. In other words, in certain circumstances, terminologists may attempt to regulate language (in this case, the vocabularies of special languages), whereas lexicographers describe the words of general language.
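The elimination of synonymy can be illustrated with a very small sketch. It assumes, purely for the sake of the example, that a standardising body has declared "catalytic converter" the preferred term and "automotive catalyst" a deprecated synonym; the report itself makes no such decision, and the names PREFERRED and standardise are hypothetical.

```python
import re

# Hypothetical standardisation decision (deprecated synonym -> preferred term).
# Which synonym is preferred is an assumption made only for this example.
PREFERRED = {
    "automotive catalyst": "catalytic converter",
}


def standardise(text, preferred=PREFERRED):
    """Replace each deprecated synonym in text with its preferred term."""
    for deprecated, term in preferred.items():
        # Whole-word, case-insensitive match of the deprecated form.
        pattern = re.compile(r"\b" + re.escape(deprecated) + r"\b",
                             re.IGNORECASE)
        # A function is used as the replacement so the term is taken literally.
        text = pattern.sub(lambda match, t=term: t, text)
    return text


if __name__ == "__main__":
    print(standardise("Check the automotive catalyst before the emission test."))
```

Real prescriptive work involves expert agreement on concepts and definitions; the string replacement above only illustrates the end result of preferring one term over its synonyms.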
Lexicographers have at their disposal a number of "style labels" which aim to distinguish between, for instance, informal, slang, or vulgar expressions, archaisms, and so on. Terminologists also need to distinguish between different communicative situations, although in a rather different way. While traditional terminology work is concerned mainly with the terms which characterise communication between subject experts, a broader view also incorporates less abstract levels of communication, e.g. between technicians, or between expert and layperson (such as doctor-patient; lawyer-client). In high-quality terminography, such variants must also be labelled or assigned to a particular source in order to identify the appropriate communicative context for their use.
The following table summarises the above comparison:
Table 1: Comparison between Lexicography and Terminography
A large majority of documents today are designed for specialist communication (including business and commercial texts). They are thus written in specialist language, 30-80% of which (depending on the particular domain and type of text in question) is composed of terminology(2). In other words, terminology (which as we have seen may also include non-linguistic items such as formulae, codes, symbols and graphics) is the main vehicle by which facts, opinions and other "higher" units of knowledge are represented and conveyed. Sound terminology work reduces ambiguity and increases clarity - in other words, the quality of specialist communication depends to a large extent on the quality of the terminology employed, and terminology can thus be a safety factor, a quality factor and a productivity factor in its own right.
The communication of specialist knowledge and information, whether monolingual or multilingual, is thus inextricably bound up with the creation and dissemination of terminological resources and with terminology management in the widest sense of the word. This process is not restricted to science and engineering, but is also vital to law, public administration, and health care, to quote just three examples. In addition, terminology plays a key role in the production and dissemination of documents, and in workflow.
Terminology as an academic discipline offers concepts and methodologies for high-quality, effective knowledge representation and transfer. These methodologies can be used both by language specialists and by domain specialists after appropriate training. In addition, they form the basis for an increasing number of tools for the identification, extraction, ordering, transfer, storage and maintenance of terminological resources and other types of knowledge.
Terminological resources are also valuable in many other ways: as collections of names or other representations, as the object of standardisation and harmonisation activities, and as the input (or output) of a wide range of applications and disciplines, whether human or machine-based (see the Figure below). The range of applications to which terminology is of direct relevance was a primary motivating factor at the inception of the POINTER Project with its brief to analyse the situation of terminology in Europe, and to make concrete suggestions for a future infrastructure and activities.
This wide range of applications and products is all the more important given the current technological and political developments in Europe. The last few decades have been characterised by the exponential spread and implementation of the concept of "globalisation". Although international activities and multinational trade existed well before this period, a new quality has recently emerged. Not only are raw materials sourced, and products sold, on a supranational scale, they are now increasingly developed, manufactured, marketed and sold for a global audience. Global competition and global co-operation - both of which presuppose global communication - are now common concepts. In the cultural arena, too, we can trace the development of what is often called the "global village", with greatly increased social and cultural contact, both active and passive(3).
At the same time, rapid technological development in general, and the rise of whole new fields and industries in particular, have led to shorter and shorter innovation cycles and to an exponential growth in knowledge and the need for its rapid and effective communication. Thus the total amount of specialist knowledge is currently thought to be doubling every five to fifteen years, depending on the area concerned(4).
This explosion in communication has been facilitated and driven by the computing and telecommunications revolutions, which have provided cheap processing power and new technologies for document processing. Vast databases can now be processed efficiently, and their contents transported effortlessly across national and geographical boundaries. Information is now commonly regarded as a fourth production factor alongside property, labour and capital. The number of intangible products is increasing rapidly, in contrast to the number of tangible ones. The practical effects of this can be seen, among other things, in the vast increase in the creation, capture, processing, storage, archiving, retrieval and subsequent evaluation of documents. For example, the Danzin Report [Danz 92] estimated that the European economies (calculated before the latest enlargement of the European Union in January 1995) would spend 650 million ECU on this in 1994. Equally, the number of major different subject fields (or "domains") for which terminology exists is estimated at anything from several hundred to many thousand, depending on the degree of detail of the classification system used(5). In turn, each of these domains contains between several hundred and over ten million (e.g. chemistry) terms, again depending on the granularity of the system. The number of terms in each of the highly developed languages is commonly estimated at 50 million, excluding product names, which account for roughly another 100 million terms.
A point to be remembered here is that specialist (and indeed general) communication is normally an iterative and multilinear process, since knowledge is generally created in an evolutionary process and in several different places at once. Thus potential sources of uncertainty and misunderstanding arise in the form of homonyms (i.e. words that are used to denote more than one concept) and synonyms (i.e. more than one word for the same concept). This problem is becoming particularly acute with the strong tendency to interdisciplinarity in important modern scientific disciplines such as biotechnology, environmental science and materials science (it is a paradox that in this age of increasing specialisation science is becoming more and more interdisciplinary). At the same time, the risks involved in failing to communicate unambiguously and in a timely manner have often increased dramatically (two classic examples of this are the aerospace and environmental industries). For all these reasons, content-based information management is a prerequisite for improving the efficiency of communication.
In addition, it should be borne in mind that communication is not solely monolingual, especially not within Europe. In fact, there is a clear trend at the moment towards an increased awareness of multilingual issues, despite the predominance or at least leading role of English in the technical, business, economic, political and - to a lesser extent - cultural fields.
One factor influencing this trend is the concern of a number of national and regional governments to ensure the long-term viability of their official languages in the face of competition from English and to ensure equal access for all citizens and social and economic groupings to new ideas and other information. Other significant factors are product liability and similar consumer protection legislation, as well as a more general wish among enterprises in particular to increase efficiency by improving internal and external communication and information flows. In addition, consumer goods manufacturers in particular are discovering the competitive advantage which products can achieve (especially in saturated or highly competitive markets) when localised into the languages spoken by their target groups.
The importance of these developments for a multilingual political federation such as Europe with its eleven official working languages and countless lesser-used ones(6) cannot be overemphasised. In fact, the European Commission sees itself as living in what it calls the Multilingual Information Society(7). Europe's dual position as a world player (and the original home of three world languages) and a multilingual collection of states means that effective multilingual communication on a vast scale is a prerequisite for both internal and external success. To quote only one statistic: the European Commission alone already has more than one million pages of text translated per year. Add to this the appropriate national figures for both the private and public sectors, and it soon becomes apparent that multilingual communication is already big business(8). However, it is equally clear that new, automatic methods and tools for multilingual information management (i.e. ones that go beyond current language-neutral ideas such as workflow, imaging and electronic document management) are urgently required if communication across linguistic, sector, regional and domain boundaries is to be optimised. Since a great deal of this - specialist - communication relies on the vocabulary of a vast number of subject fields to convey its content, readily-accessible, up-to-date terminology will play an increasingly important role in (multilingual) information management in the 21st century.
Criterion                       Lexicography                            Terminography
Variety of language:
  general language              YES                                     NO
  special language              (YES as special-purpose lexicography)   YES
Subject matter:
  broad areas of knowledge      YES                                     NO
  delimited domain              (RARE)                                  YES
  use of classification system  NO                                      YES
Method of working:
  word-based                    YES                                     NO
  concept-based                 (RARE EXAMPLES ONLY)                    YES
Presentation to user:
  alphabetical                  YES                                     (YES if reorganised)
  thesaurus-type structure      (RARE)                                  YES
Headword/entry term:
  closed class                  YES                                     NO
  open class                    YES                                     YES
Presentation of entries:
  polysemes/homonyms            PRESENTED TOGETHER                      PRESENTED SEPARATELY
  synonyms in same entry        PRESENTED SEPARATELY                    PRESENTED TOGETHER
Orientation:
  Prescriptive                  NO                                      YES (largely depending on domain)
  Descriptive                   YES                                     YES
WHY TERMINOLOGY?
Figure 1: Terminology Applications and Products
TERMINOLOGY AND THE MULTILINGUAL INFORMATION SOCIETY
Figure 2: Terminology: A Key Discipline for the Information Society
2. Thus, for example, patent applications and technical standards have an extremely high percentage of terms (even though the same term may be repeated many times), while general business correspondence will have a lower one.
3. For a discussion of the subject, see [Wal 95].
4. UNESCO estimates
5. Thus if the figure is calculated on the basis of those subject fields for which professional qualifications can be obtained, it would rise to over 10,000 according to the catalogue of professions in Europe, or 55,000 according to the training course database at the German Bundesanstalt für Arbeit (Federal Employment Service).
6. c. 200 across Europe.
7. e.g. in the announcement of its Multilingual Information Society (MLIS) Programme on 8 November 1995. [MLIS 95]