CONTENTS


  1. EXECUTIVE SUMMARIES OF PHASE II REPORTS

  1. WP 2 : ANALYSIS - INTERNATIONAL LEVEL

  1. General

See Final Report, Chapter 3.4.

  1. Latin countries

Origin : Union Latine


An intergovernmental organisation grouping 32 Latin States, Union Latine has been in contact for more than ten years with institutions working in terminology, translation, lexicography and the language industries, among other areas. Its work within the scope of the POINTER project has concentrated on Latin countries outside the European Union, especially in Eastern Europe (Romania, Moldavia) and in Latin America, and, within Europe, on international organisations and multinational enterprises. The languages concerned are thus French, Italian, Portuguese and Spanish, among the official languages of the European Union, as well as Romanian and Catalan.

The internationalisation of science and technology, and the emergence of new markets - in particular the new common markets emerging on the continent of America, NAFTA and MERCOSUR - represent a challenge to which terminologists can only respond by collecting terminology more efficiently and by harmonising existing resources.

On this subject, the European Union should not underestimate the important work being carried out beyond its borders, not only for strategic reasons (industries that make use of language processing gain larger market shares), but especially for reasons linked with the collection and harmonisation of terminology.

As far as Latin languages are concerned, a considerable amount of work is done outside the European Union's borders in organisations, associations, and national or regional commissions (such as the French language office in Quebec, Canada's public works and governmental departments, the Swiss Federal Chancery, TermRom in Bucharest, the Chisinau TermRom, Uruterm in Uruguay, the Mercosur terminology commissions for three Latin-American countries, the Colombian terminology Association, Cubaterm in Cuba, among others), in intergovernmental organisations (such as Union Latine) or networks (Rint for French-speaking communities, REALITER for all Latin countries and Latin languages, RITerm for Spanish and Portuguese speaking countries), or within international institutions which partly deal with terminology (UNO, UNESCO, OIT, ITU, FAO, NATO, OPS, etc.).

These activities have been carried out beyond the European Union's borders, very often in a similar way to those developed in Europe (RITerm-BD, REALITER, terminology data banks for Mercosur) and in mutual co-operation with institutions and researchers in European countries. But the actual mechanisms of institutionalised terminological co-operation between the European Union and the above-mentioned entities still have to be created.

The following would be highly recommended:

  1. WP 3 : ANALYSIS - EUROPEAN LEVEL

  1. T 3.1 : European R&D activities

Origin : University of Surrey


  1. Terminology infrastructure and the information superhighway

This study was undertaken to investigate the potential of the emerging global information networks, e.g. the World Wide Web, the Internet and so on, for promoting terminology. This promotion involves: increasing the awareness of terminology work through these networks, exploring the availability of terminological resources, like glossaries and dictionaries created by specialist communities on these networks, evaluating the potential of the global information networks for disseminating terminology resources, and investigating the use of these networks for broadcasting distance-learning materials related to terminology training.

We have created 'electronic' pages of information about the POINTER project itself on the World Wide Web. These pages not only provide information about the aims and objectives of the project but also provide access to details of the participating organisations and to some of the reports that have been produced by the POINTER consortium members. We have found a profusion of terminology resources on the Internet: from glossaries and specialist dictionaries dealing with aerospace, medicine, software engineering, food and drink and leisure activities, to the EU's EURODICAUTOM term base, which has recently been made accessible. We have created a software system that can provide access to a number of term bases for storing and retrieving terms: the Term Baazar system demonstrates how terminology can be treated as an electronic commodity and, potentially, can be traded using electronic 'cash'. Finally, we have created prototypical terminology learning resources, specifically chapters of a book dealing with terminology (and artificial intelligence) which were semi-automatically converted into a hypertext. This hypertext is now available on the Internet.

The present situation can be summarised by noting that there is a considerable amount of terminology-related activity on the emerging global communication network. Equally importantly, the terminology community can use a number of facilities for disseminating term bases, for providing access to terminology management systems, and for using the Internet as a vehicle for distance learning. The main problem is the lack of awareness in the terminology community of the opportunities that the global communication networks provide and, indeed, the community lacks skills and know-how that are essential for exploiting the networks. The solutions include greater awareness of the potential of these networks, through seminars and workshops and demonstrator projects. Problems which need to be resolved include copyright and data protection, and the organisation of 'electronic trade' across national frontiers.

  1. A model for the training and accreditation of terminologists in the European Union

The POINTER consortium regards the training of terminologists and the promotion of the profession as an important aspect of terminology infrastructure in Europe. To this end, the POINTER consortium investigated a career development model for the training of terminologists such that the experience they gain at work can be accredited together with their academic qualifications. This accreditation takes into account the various aspects of terminology work: terminology acquisition, organisation and application, and terminology education and research. The syllabuses of eight European organisations involved in terminology training were studied and were related to the above-mentioned aspects of terminology work. These considerations have resulted in a matrix model for terminology training that specifies the entry requirements to the profession in terms of both formal and experiential training and suggests how a terminologist progresses from novice to expert and on to senior management positions.

The current situation in terminology training is as follows: (1) there are a large number of terminology courses (often as an annex to translation or other types of courses) together with some in-house training (especially in large corporations); (2) many different types of organisations undertake terminology activities; (3) there is a degree of consistency of approach across different countries; (4) there is informal accreditation across corporations (within the same sectors, and across sectors in large corporations) and accreditation by professional bodies in some European countries and in the UN and other large organisations; and (5) there are a number of EU-funded initiatives which currently fund terminology training directly or indirectly; for instance, the LEONARDO, ERASMUS, DELTA and COPERNICUS programmes fund technology transfer and teaching and learning projects that envisage training in terminology and the use of terminology.

Amongst the principal problems identified were: (a) a need for coherence in training, which varies across institutions as well as nationally; (b) the absence of a career development path for terminologists; and (c) a neglect of mobility: movement is neither cross-sectoral, nor multilingual, nor transnational.

The solutions we suggest fall into the following categories:

  1. Specialist terms in general language dictionaries

The European Commission is concerned to optimise the use of linguistic resources. Since up to 40% of entries in some general-language (LGP) dictionaries may be concerned with specialised vocabulary - or terminology - LGP dictionaries, which are widely available, may also be regarded as a terminology resource. Terms included in such dictionaries tend to be those which are used and encountered by both experts and laypeople.

The Research Network Group within the POINTER project investigated the problems and opportunities related to the use of existing LGP dictionaries, particularly bilingual dictionaries and monolingual learners' and advanced dictionaries, subject-specific handbooks and other relevant encyclopaedic material; the focus of our deliberations was on English, German and Dutch. The conclusions of this report, however, are equally relevant to other languages.

The use of the above-mentioned lexicons and knowledge sources poses two problems: what is the form of the entry, and how amenable is this to re-use for other purposes? Analysis of various dictionary entries demonstrates that the extraction of terminological data from currently-available LGP dictionaries (both monolingual and bilingual) is problematic from a number of different points of view, including the inconsistent and imprecise use of subject-field labels, the absence of adequate pragmatic information, and varying definitional practices. Terms are also often deeply nested in entries, even as sub-senses of polysemous headwords. The unsatisfactory use of subject-field labels is of particular importance for the automatic extraction of data.
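To make the subject-field label problem concrete, the following is a minimal sketch of label-based extraction, not a description of any tool used in the study. It assumes a hypothetical entry format in which a sense may begin with a label in round brackets, and a hypothetical alias table (`LABEL_ALIASES`) that maps inconsistent label spellings to one canonical field. Senses with no label, or with a label the table does not know, cannot be assigned reliably, which is the difficulty described above.

```python
import re
from collections import defaultdict

# Hypothetical alias table: inconsistent label spellings -> canonical field.
LABEL_ALIASES = {
    "med": "medicine", "med.": "medicine", "medical": "medicine",
    "comp": "computing", "comput.": "computing",
    "law": "law", "jur.": "law",
}

# Assumed convention: a label in round brackets opens the sense, e.g. "(Med.) ...".
LABEL_PATTERN = re.compile(r"^\(([^)]+)\)\s*(.*)$")


def extract_terms(entries):
    """Group senses by canonical subject field.

    `entries` maps a headword to a list of sense strings. Senses without
    a label go under "unlabelled"; unrecognised labels are kept under
    "unknown:<label>" so that they can be reviewed by hand.
    """
    by_field = defaultdict(list)
    for headword, senses in entries.items():
        for sense in senses:
            match = LABEL_PATTERN.match(sense.strip())
            if match:
                raw_label = match.group(1).strip().lower()
                field = LABEL_ALIASES.get(raw_label, "unknown:" + raw_label)
                definition = match.group(2)
            else:
                field, definition = "unlabelled", sense.strip()
            by_field[field].append((headword, definition))
    return dict(by_field)


# Invented sample entries, for illustration only.
sample = {
    "virus": ["(Med.) an infective agent", "(comp) a self-replicating program"],
    "mouse": ["a small rodent", "(Comput.) a pointing device"],
    "tort": ["(jur.) a civil wrong"],
}
result = extract_terms(sample)
```

In this sketch the three spellings "Med.", "comp" and "Comput." only resolve because the alias table happens to list them; a real dictionary would need a much larger table or an established classification system.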

Solutions are likely to be medium-term rather than short-term, involving the more widespread use of standards for the representation of lexicons and the consistent use of established classification systems. Of particular interest are the interchange standards that encourage exchange of lexical and terminological resources, such as TIF and other emerging standards produced by ISO, as well as standards that encourage exchange across applications, particularly machine-translation systems and document management systems. It is essential that the research carried out in these areas, for example, the R&D efforts sponsored by the EU as manifested in MULTILEX, GENELEX, MULTEXT, and TRANSTERM, is properly archived and articulated in a manner that is comprehensible to terminology and lexical resource developers.

The R&D Network concluded with the following recommendations:

  1. T 3.2 : European institutions

See Final Report, Chapter 3.3.

  1. T 3.3 : Analysis of selected subject fields in all European languages and countries

Origin : CL Servicios Lingüísticos


There are many different types of multilingual resources currently available on the market, such as dictionaries, glossaries, lexica, etc., and everyone agrees that their quality varies enormously. However, no system or procedure for their evaluation has ever really been defined and adopted by actors in the market.

Defined by CL within the framework of task T.3.3., the following evaluation methodology and system have clearly confirmed the great heterogeneity of quality between existing terminological resources. They also point to quality differences, which were already thought to exist, between monolingual, bilingual and multilingual resources. Furthermore, a case study carried out in certain domains possessing a large number of terminological resources served to confirm the results obtained on a general level.

Finally, an in-depth analysis, explaining the different criteria established for this evaluation, enables the reasons for the disparity in quality of terminological resources to be identified.

  1. T 3.4 : Analysis of TMSs (Terminology Management Systems) in Europe

Origin : University of Surrey


Substantial activity can be noted in the field of terminology management systems (TMSs): as many as 60 such systems are reported in the literature. Despite the fact that many of these systems are university/laboratory prototypes and have yet to be marketed, there are still a number of good European products on the market which can already make a considerable contribution to the efficient management of terminology within and across institutions and linguistic boundaries.

During the investigation, a set of criteria was established including technical, conceptual, linguistic and commercial factors: these criteria are crucial to the management of terminology. They can also be used as an evaluation metric to assess the price/performance of a TMS and to help terminology users to determine the relevance of a TMS to their own organisation. Based on this set of criteria, a representative selection of TMSs was analysed. The results indicate that current TMSs, while operative, can be substantially improved.

The POINTER study has identified a number of problems. First, terminology exchange across organisations and across languages. Second, problems related to validation and verification. Third, problems related to the user-interface, especially that of localisation and customisation. Fourth, problems in extracting terminology from text corpora. Fifth, the need for using better computing techniques for storing and retrieving terms, including multimodal methods and techniques.

Amongst the solutions identified by the study, the most important is that there is an urgent need to define, adopt and refine existing standards for dealing with different writing systems, for marking up terminology data using SGML, and for encoding linguistic data. It is important for the TMS developer to interact with other sectors of language engineering, particularly machine translation and information retrieval, and for the other sectors to use terminology systematically. The facilities for using terminology databases across hardware platforms and across linguistic and geographical boundaries will lead to the creation of a terminology market place: the developments in local and global computing networks will lead to and support this development. The TMS-based solutions for validating and verifying terminology depend upon the development of protocols for these tasks. However, in the meantime it is important to use or develop tools for checking for duplications, tools for facilitating access for experts to term bases, and so on.
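As an illustration of the kind of duplicate-checking tool mentioned above, here is a minimal sketch under stated assumptions. The record layout (`id`, `lang`, `term`) and the normalisation rule (Unicode NFKC, case folding, hyphens treated as spaces, whitespace collapsed) are hypothetical choices made for this example; they are not taken from any TMS analysed in the study.

```python
import unicodedata
from collections import defaultdict


def normalise(term):
    """Reduce a term to a comparison key.

    Assumed rule: NFKC normalisation, case folding, hyphens treated as
    spaces, runs of whitespace collapsed to a single space.
    """
    text = unicodedata.normalize("NFKC", term).casefold()
    text = text.replace("-", " ")
    return " ".join(text.split())


def find_duplicates(records):
    """Return {(lang, key): [ids]} for records sharing language and key."""
    groups = defaultdict(list)
    for record in records:
        key = (record["lang"], normalise(record["term"]))
        groups[key].append(record["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Invented sample term base, for illustration only.
termbase = [
    {"id": 1, "lang": "en", "term": "Term bank"},
    {"id": 2, "lang": "en", "term": "term  bank"},
    {"id": 3, "lang": "en", "term": "term-bank"},
    {"id": 4, "lang": "fr", "term": "banque de terminologie"},
    {"id": 5, "lang": "en", "term": "termbase"},
]
duplicates = find_duplicates(termbase)
```

Here records 1 to 3 are reported as one group, while "termbase" is not, because the rule only merges spelling variants that differ in case, hyphenation or spacing. Whether such candidates are true duplicates remains a decision for a terminologist.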

The POINTER consortium recommends the following: First, there is an urgent need to establish and disseminate criteria for evaluating TMSs. Second, it is vital to develop and promote an awareness of standards - especially those promulgated by international bodies - for terminology exchange and for TMS developers to build tools for facilitating the use of these standards. Third, TMS developers should keep abreast of developments in related subjects such as language engineering and computing science, as well as in terminology research. These recommendations can be pursued by an ELRA-associated body or through projects supported by the EU.

  1. WP 4 : ANALYSIS - NATIONAL AND REGIONAL LEVELS

  1. Belgium

Origin : BJL Consult


The survey, which benefited from the results of three other recent surveys in which the terminology resources and requirements of Belgium were pinpointed and analysed in detail, led to the following conclusions:

Two projects are particularly worth considering:

  1. Belgoterm. This terminological database, which both provides translations and can be used as an encyclopaedia, was compiled in accordance with the most stringent terminological standards.
  2. A dictionary of Greek and Latin generators elaborated at the CTB could be used to create neologisms in the different European languages.

The following recommendations may be made:

  1. Denmark

Origin : DTG


Within the EU, Danish is a minority language spoken by 5,000,000 people. As internationalisation has increased, a growing interest in and demand for qualified terminology work has emerged in Denmark. At present there is no superordinate terminology body in Denmark. Terminology work is coordinated by the Danish Terminology Group.

As a consequence, problems exist not only nationally, where coordination and quality assurance measures are impeded, but also internationally, since there is no single institution in Denmark which can serve as a point of reference for foreign institutions, companies and bodies. One recent consequence is that the EU, lacking a single coordinating terminology centre in Denmark, has begun to allocate assignments with a terminological scope to different institutions instead. This development hardly raises the general level of terminology work in Denmark.

An increase in the quality of terminology work in Denmark depends on the following four factors, as pointed out by the Danish Board of Technology under the Ministry of Education and Research in Denmark:

  1. France

Origin : CTN


The survey of terminological activities in France is essentially an analytical update of previous enquiries. Indeed, all the French POINTER partners had already carried out surveys of their own, and the present study owes much to this essential groundwork. It should be stressed that translation issues are more fully covered here than other aspects of language industries, notably terminology in knowledge management.

Certain hopeful signs are apparent in France in 1995: more competent, computer-literate graduates in terminology are coming onto the labour market and creating an awareness in many firms of what terminology can contribute; TMSs are better known and often used to great satisfaction; some firms are now self-sufficient in terminology; official efforts at coordinating terminology initiatives have created dynamic associations and a climate favourable to the formation of a European network.

Most of what was holding terminology back in the recent past, however, still has the same negative effect. The situation is even worse than before in some businesses, as the recession has caused cutbacks which have hit terminology. The number of languages adequately treated is limited to the "golden four" (French, English, with German and Spanish less well catered for), and the domains covered are those of international interest (aerospace, telecommunications, transport). Published specialised dictionaries concentrate on the same languages, and users complain that they are out of date by the time they appear. Information is hard to come by or sift through on dictionaries, TMSs, training and legal issues. Inside companies, terminology is hampered by a negative image, managers quickly calculating how much it costs, but unable or unwilling to take account of the savings that terminology makes. Smaller firms and isolated translators are still largely cut off from terminology services. Uncertainty about copyright laws and questions of confidentiality hinder exchanges and limit enthusiasm for a network.

The time is ripe to found a European terminology body that complements existing structures and directly reflects the preoccupations of those in industrial terminology. The possibilities opened up by electronic networks (Internet, WWW) should be fully exploited to bridge terminology's information gap. It would be a pity not to capitalise on the goodwill generated by the present survey or disappoint the hopes that it has inspired.

  1. Germany

Origin : DIT


There is a general lack of high-quality, up-to-date terminological resources and literature in Germany and, even more importantly, of information and distribution channels for finding out what is already available. This applies in particular to lesser-used languages such as Greek, Italian and the Eastern European languages (in such cases, English is often used as a 'relay language'). However, the same problem also affects innovative and even mainstream areas in major languages such as Spanish and French, and even English. Standards are not always used, or even known about.

The strategic and practical value of terminology is almost always insufficiently recognised, particularly at management level. Many terminological activities have been cut back or frozen in the drive to concentrate on 'core business'. In many cases, short-termist (as opposed to planned) outsourcing is leading to the fragmentation and loss of central terminological resources. This is particularly - but not exclusively - true in the East, where many companies have been wound up and funding of tertiary education has been reduced. Some well-functioning corporate termbases and corporate language concepts were found, but these represent a minority of interviewees.

Knowledge of the methodology and procedures behind systematic terminology work is generally limited, even in large enterprises. While translators may be sensitised to general language issues, they often have no formal training unless recently qualified. The same applies to technical writers. Domain experts (e.g. in standardisation bodies) and other groups are almost always totally untrained. Vocational training is therefore seen as a key issue.

The quality (both process- and contents-oriented) of terminological resources varies widely. In addition, among those organisations actively practising some form of quality control, no common validation standards/procedures exist.

There is a widespread readiness among all groups to join an efficient terminology network. Given the limited resources available to terminologists, any such network and provision of resources must not involve providers in significant effort. However, many existing resources were created without any thought of reuse and would therefore require some degree of reworking. The quality assurance and validation of the terminology provided in such a network are seen as key issues for success.

  1. Italy

Origin : AssITerm


The terminological milieu in Italy.

Terminological activity in Italy

Needs, Constraints, Potentiality

  1. Netherlands

Origin : TopTerm


In the Netherlands, there is a plethora of terminographical and lexicographical sources (dictionaries, vocabularies, classification systems, etc.) in a great number of fields. The Dutch Language Union as a national organisation concerned with the Dutch language has a real appreciation for the importance of terminology. Also, in the Netherlands there are a few organisations that as part of their mission are concerned with the collection, implementation, validation, etc. of terminological data. These organisations have a great impact in their respective subject fields both on a national and international level.

Although the importance of terminology is being increasingly recognized, there is still a lack of knowledge in the vast field of terminology work with regard to terminology theory, working methods, principles, standards, validation procedures, etc. In addition, there is some resistance among translators to switching to computerized procedures because for them it involves money, time, training, etc.

From these weaknesses, it follows that there are great opportunities in the area of education, such as for special courses in computerized terminology work, the dissemination of terminological information and for networking.

The biggest threat is the lack of knowledge and appreciation of the role of terminology by top management (i.e. those who control the budget for the execution of terminological projects). This can lead to no or inadequate funding and other obstacles such as overprotection of terminological data.

Terminological activities in the Netherlands are extremely fragmented. Thus, at the national level as well as in the European context, existing centres dealing with specialized terminology should be supported, and new ones created where none exist. For such a multiplicity of terminological centres to be useful to the wider public, strong cooperation through exchange of information, networking, and so on will be required. Also, at a national level, a top centre of information is required. At the same time, Eurodicautom should be strengthened as financial resources permit.

  1. Nordic region

Origin : TNC


  1. Nordic countries

The Nordic region consists of Denmark, Finland, Iceland, Norway and Sweden; the languages within the area are Danish, Faroese, Finnish, Greenlandic, Icelandic, Norwegian, Sami and Swedish. Some of the languages are closely related, while others are unrelated.

Ever since their foundation, the terminology institutions in the Nordic countries have cooperated in some form, sometimes bilaterally, sometimes multilaterally. Nordterm was founded in 1976, and cooperation has been more active since then.

The purpose of Nordterm is to be a Nordic forum and network in the field of terminology. Nordterm shall:

The field of activity of Nordterm includes terminological research, practical terminology work, terminological education, and other activities that concern terminology.

The nucleus of Nordterm is composed of the terminology institutions in the Nordic countries but a large number of other organizations and individuals also take part in Nordterm activities.

The executive body of Nordterm is the Nordterm Steering Committee. Terminological activities are carried out by Working Groups and Project Groups, of which there are currently five.

  1. Finland

Finland's terminological landscape has the following important features:

  1. Iceland

Iceland's terminological landscape has the following important features:

  1. Norway

The terminological landscape in Norway has the following important features:

  1. Sweden

Sweden's terminological landscape has the following important features:

  1. Portugal

Origin : ILTEC


To achieve as clear as possible an overview of existing terminological resources in Portugal, ILTEC surveyed both the most relevant institutions and companies and a significant number of freelances (in total, 267 surveys of terminological activities and 70 detailed questionnaires were sent). Although there was considerable interest expressed verbally, in practice it was difficult and time-consuming to get hold of real data. Personal contact was required and the Portuguese group had to explain both POINTER and ILTEC's role before institutions felt comfortable answering the questionnaire.

Results

Problems

Solutions

Recommendations

  1. Spain

  1. General


Origin : TERMCAT


The present situation in Spain concerning terminology can be summarized as follows:

  1. Region of Madrid and Cádiz


Origin : CINDOC


The surveyed groups differ greatly from one another. These differences relate to:

Surveyed groups complain of:

Groups work on their own, without knowing about each other's activities. They have little contact with one another, although they are willing to cooperate and are very interested in becoming members of a network.

Despite these negative aspects, terminological activities are developing steadily thanks to improvements in telecommunications technology and to better and cheaper hardware and software. Terminological work is becoming more and more important for institutions and enterprises: many projects have been completed, many are under way and some are planned.

In addition, Terminology has reached university level and is taught as a subject in some postgraduate schools. This should help Terminology to increase its social impact.

  1. Basque country


Origin : UZEI


  1. Switzerland

Origin : CRB


Switzerland's traditional demand for a language services industry catering for both domestic and foreign markets has led to the emergence in the last decade of a number of public and private sector terminology databases mainly for the support and rationalization of translation work. Most collections in Switzerland are trilingual, and often quadrilingual, with German and French as the dominant languages, and English prominent in export oriented industry and commerce.

In view of the widespread multilingual competence amongst Swiss citizens, it is not surprising to find high standards of quality exacted from terminologists (and translators). Qualified support from (multilingual) specialists is widely available, and it is not uncommon for specialists themselves to find their way into translation and terminology.

Despite isolated instances of terminology exchange, there is as yet no formal infrastructure in Switzerland. Interest in exchange is high and a limited number of potential network nodes exist, though there are some doubts as to how a decentralised network can maintain standards of quality and guarantee methodological uniformity so as to maximize the rationalizing effect of exchange. The current economic turbulence in Switzerland is curbing terminological activity in some quarters, and may also affect the readiness (and ability) to look very far into the future.

At the same time, there is little doubt that terminology will continue to play an important role in the ongoing processes of international standardization and harmonization, and there are signs, in the context of the information society, that the potential of terminology in areas beyond translation support will soon be more fully realized.

  1. United Kingdom and Ireland

Origin : University of Surrey


The results reported here concerning terminology practice in the British Isles (UK and Ireland) are based principally on questionnaire returns (response rate c. 10%), supplemented by telephone interviews and further research.

Overall, there is a low awareness in all areas of terminology and its role in communication, particularly among companies which are not concerned with translation. Whilst some companies and private translators maintain an up-to-date record of the terminology they encounter, others consider terminology to be irrelevant to their work, even though a large amount of their work is concerned with communication and documentation. There seems to be very little knowledge of the benefits of creating and maintaining an efficient terminology management system.

The most frequently-used language is clearly English, followed by Germanic and Romance languages. Translation is the most common application. The most prominent domains are in the general area of applied sciences, medicine and technology (UDC 6), followed by social sciences, economics, trade, etc. (UDC 3).

Printed dictionaries remain the most popular reference source, but standards are not widely used. Although there is evidence that users of terminology are beginning to use databases to store and retrieve their own terminologies, there is little use of terminology management systems and no evidence of any degree of elaboration for terminological entries. There is clearly a need not only for information on available resources but also on how to compile terminologies.

Terminology work, where it exists, seems to be conducted largely on an ad hoc basis with little strategic planning or use of personnel on a co-operative basis. The need for quality assurance often seems to be tempered by commercial time constraints, and no substantial evidence was found of the use of terminology standards. The use of text encoding standards is just beginning to emerge. Terminological data does not at present seem to be exchanged between users or user groups. While a technical solution would facilitate such a re-use of resources, other problems remain such as copyright, cost and payment mechanisms. Under half the sample expressed a willingness to join a terminology exchange network.

On the basis of these results, the following recommendations can be made:

  1. WP 5 : ANALYSIS - STANDARDISATION AND COPYRIGHT

  1. T 5.1 : Standardisation

Origin : Infoterm


Present situation

Problems

Solutions

  1. T 5.2 : Copyright

Origin : GOTA


The problems posed in creating and running a European exchange network for terminological data are both technical and legal.

From the legal point of view, the difficulties which producers of terminological data confront are mainly due to the electronic media which allow them to store large volumes of data, and to the large number of people who can have access to the network.

The law should be able to keep pace with current changes in technology, so as to answer the need to protect the intellectual assets of firms and to fulfil its regulatory function.

Existing legal systems (copyright, contract law) and those in the process of being worked out (a proposal for a European Directive on the legal protection of databases) answer these needs more or less effectively.

Protection by copyright is difficult to put into practice, since most linguistic information cannot be appropriated.

The proposed European Directive on the protection of databases has opened up an interesting avenue by establishing a system for protecting the content of databases, independent of the means of distribution (paper, on-line, CD-ROM...).

This economic law, similar to the laws of "unfair competition", allows the database producer to oppose the partial or total extraction of the content by downloading or reproduction, and also forbids the commercial reuse of the content provided that it is not protected by copyright.

The most effective form of protection at present is protection through contract law, as it enables producers of terminology resources to set precisely the conditions under which those resources are made available and used.

Moreover, in the context of data exchange, this solution through contract law sets up "fair competition" between the different partners of the networks. At the same time it allows the terminologist to define his needs to the jurist with precision and the jurist to interpret them by working out the contractual clauses.

Technical solutions for protection can accompany legal solutions. In most cases the technical means for limiting access to the exchange network of terminological data could be used and would add to the contractual arrangements limiting access to information.

These contractual arrangements can usefully be supplemented, at the international level, by professional collective arrangements, for example a code of ethics.

  1. WP 6 : CASE STUDIES

  1. T 6.1 : IT&T (Information Technologies and Telecommunication)

Origin : INT


This case study aims to provide a Europe-wide survey of the information available on resources, needs and terminology work in the Telecommunications sector and, to a lesser degree, in Information Technology. A panorama of terminology in IT&T is presented in order to evaluate needs and gaps in terminology resources in the area, with a view to proposing a future infrastructure.

Terminology is becoming a concern for industry as technical writing, documentation and translation (even if mostly subcontracted) grow in importance.

Terminology is to the fore in areas of high-tech research. Generally speaking, in the field of IT&T, firms seem to prefer to do their own terminology work rather than outsource, for obvious reasons of confidentiality.

The IT&T domain seems to include the three following types of technical terminologies.

While there is no doubt that confidential terminology could not be put on any open network, the first two types could be made accessible. If this separation appears clear to us, it is not so cut and dried for businesses; either they do not see the question or they prefer to ignore it.

A general caution should be expressed as regards a European network: a lot of projects exist (Lingo, Ernest...), all involving the World Wide Web. In spite of the many advantages the Web offers, precautions should be taken regarding quality control and validation.

Barriers

One of the most important problems concerns confidentiality.

Once the problems of confidentiality are set aside, there remains the question of convincing a certain type of person of the commercial interest at stake. Although it is obvious to translators and terminologists that terminology is important, those in charge of sales who work on product description, for example, are much less convinced of this. Their main interest is an immediate return. It is far from sure that they will be interested in exchanging terminologies, even when their own tools are up and running and satisfy their needs.

The hurdle of confidentiality and competition once more raises its head for the terms of one domain: which terminology should one choose between two competitors in the IT&T field?

Some of those asked cannot commit themselves without referring to those in authority. Some declare prudently that a term network would interest them, but they can only answer for themselves.

Opportunities

Future terminology work will have to cope with multimedia requirements as users' expectations rise. It will be common practice to provide a glossary, a dictionary or a term list in electronic form. These will be built as documents allowing links to, and the automatic insertion of, other media developed to support the contents of the entry (such as graphics, still pictures, video clips and sound). Naturally, this scenario presupposes the availability at reasonable cost not only of the technical base (multimedia PC, high-speed data transfer) but also of the relevant products. These are clearly emerging, as the CD-ROM-based dictionaries and encyclopedias already available show. As soon as the development costs of such products decrease and the necessary development tools have become an intrinsic part of office software packages, more such products will be created. Coupled with the availability of very fast data transfer (ATM technology) and such techniques as public hyperlinks to public databases (of pictures, videos, sounds, texts) and private links to private databases (for the new product, concept, machinery, etc.), a future glossary may end up being a collection of data organized for a special purpose and including a multitude of links to a wide range of data on the "universal" network, dynamically linked into the document whenever it is activated for display.

Recommendations

In fact, to be credible and recognized, a term network should be a real knowledge base, guaranteeing quality (thus the idea of a terminology quality "label") and enhancing the company's image. At present, industrialists try to respect the standards set by the relevant authorities just to be credible on the market, in spite of the cost implied. The term label should thus be presented in the same light as a standard. This would make it possible not only to present terminology as a commercial product, with a market value, which can be exchanged, sold and reused as well, but also to ensure the homogeneity of the terminology on the network, and achieve a hitherto elusive harmonisation.

Terminological activities within the telecommunications domain have to be developed by collecting and validating already existing data, and have to be based upon numerous standardisation efforts. Moreover, the creation of terminological data should be encouraged. These actions have to be developed within the framework of European linguistic programmes and should involve market actors, in order to favour the dissemination of resources and an awareness of the market.

Networking/infrastructural measures/forms

The creation of a workshop or network in Telecommunications and Information Technology should be integrated within a European terminological clearinghouse.

The network created should be politically and economically independent, with no link to any country or specific company.

More transdepartmental cooperation is needed in the development of terminology data.

Aim of the network:

  1. T 6.2 : Environment
  2. General


Origin : DTG


The situation regarding work in Environmental Terminology is characterized by the fact that it is a new and at the same time very problematic knowledge area. Contributions have been made individually and at several levels, but there is a lack of overarching planning and coordination.

It is recommended

  1. Denmark and other Nordic countries


Origin : DTG


Environmental problems know no boundaries or borders.

A few years ago, however, the attitude among politicians and in public administration was characterized by a focus on solving the problems merely on a local scale. This has changed, and today much of the effort in environmental work goes into attempts to reduce pollution which crosses borders.

Environmental policy is therefore increasingly becoming an international task. As a consequence, communication between the Nordic countries and other countries is crucial.

This communication relates not only to the language of politicians but also to the specialized language of technicians. Nordic companies do business in pollution-reducing equipment worth billions of Danish kroner, and in order to ensure the quality of communication in this connection it is vitally important to give the highest priority to terminology work in the field of the environment.

  1. T 6.3 : Labour law and social security

Origin : CL Servicios Lingüísticos


Differences in levels of social protection between the member countries of the European Union pose a barrier to free circulation.

Since its creation, the European Community has been confronted with the social protection of its citizens and the rights of its working population. Although this aspect has always been approached from an essentially economic point of view, the Treaty of Rome added the social focus, notably in articles 48 to 58, in which competencies concerning the free circulation of workers were delegated to the Community. Very soon after this, at the beginning of the seventies, these were supplemented by Council Regulations 1408/71 and 574/72 regarding the application of social security schemes to employed persons, self-employed persons and members of their families moving within the Community. Since then, changes in the legislation of each Member State have been taken into account through the regular updating of these regulations. These steps by the Community form the key to the harmonisation and co-ordination of social policies.

Finally, mention should be made of the adoption by the Council, in 1992, of two Recommendations regarding "common criteria concerning the guarantee of minimal resources within social protection systems" (92/441/EEC) and the "convergence of social protection objectives and policies" (92/442/EEC), OJ L 245.

The fact that the European Union has defined objectives to orientate the policies of Member States towards convergence should considerably strengthen the exchange of information, but are there adequate, or indeed any, terminological infrastructures to assemble this information and disseminate it?




  1. ON-LINE TERMINOLOGY RESOURCES

Lists and order

Both lists are sorted by country and database name.

Key to field codes

INFORMATION
[INFOPROV] information provider within POINTER
[INFODATE] date of last update
[INFOPUBL] indication on confidentiality
[INFONOTE] note re. information provided, incl. restrictions re. confidentiality of specific data fields

DATABASE
[DBNAME] name of the database
[DBNAME2] other name of the database
[DBNOTE] information re. the database not provided elsewhere

DATABASE CHARACTERISTICS
[DBXCHFMT] interchange format(s)
[DBSBJFLD] subject field(s)
[DBDATE] date of last update of the database
[DBNBTERM] number of terms in the database
[DBLANG] language(s)
[DBMED] storage medium / access
[DBDISTR] indication on distribution
[DBDSTCND] conditions of distribution (sale / exchange / free)
[DBCHNOTE] information re. the database characteristics not provided elsewhere

ORGANISATION
[ORGSHORT] name of the organisation
[ORGNAME] other name of the organisation
[ORGSUB1] relevant subdivision of organisation
[ORGSUB2] deeper level
[ORGSUB3] deeper level
[ORGLEVEL] geopolitical level of the organisation
[ORGNOTE] information not provided elsewhere

CONTACT DETAILS
[CONTACT] name of most relevant contact person
[TEL] telephone number of contact
[FAX] fax number of contact
[EML] e-mail address of contact
[ADR_1] first line of postal address
[ADR_2] additional line of address
[ADR_3] additional line of address
[COUNTRY] country name
[CONTNOTE] information re. contact person not provided elsewhere
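
As an illustration only, the field codes in the key can be treated as a flat record format. The sketch below assumes, hypothetically, that each record is stored as one "[CODE]value" pair per line; the layout of the actual lists is not specified here, and the function names and sample values are invented for the example. It parses such records and sorts them by country and database name, as the lists are said to be ordered.

```python
import re

# Field codes copied from the key; the group names are the key's own headings.
FIELD_GROUPS = {
    "INFORMATION": ["INFOPROV", "INFODATE", "INFOPUBL", "INFONOTE"],
    "DATABASE": ["DBNAME", "DBNAME2", "DBNOTE"],
    "DATABASE CHARACTERISTICS": [
        "DBXCHFMT", "DBSBJFLD", "DBDATE", "DBNBTERM", "DBLANG",
        "DBMED", "DBDISTR", "DBDSTCND", "DBCHNOTE",
    ],
    "ORGANISATION": [
        "ORGSHORT", "ORGNAME", "ORGSUB1", "ORGSUB2", "ORGSUB3",
        "ORGLEVEL", "ORGNOTE",
    ],
    "CONTACT DETAILS": [
        "CONTACT", "TEL", "FAX", "EML", "ADR_1", "ADR_2", "ADR_3",
        "COUNTRY", "CONTNOTE",
    ],
}
KNOWN_CODES = {code for codes in FIELD_GROUPS.values() for code in codes}

# Assumed line layout: "[CODE]value" (an assumption, not taken from the report).
TAG_LINE = re.compile(r"^\[([A-Z0-9_]+)\]\s*(.*)$")


def parse_record(text):
    """Turn a block of '[CODE]value' lines into a dict keyed by field code."""
    record = {}
    for line in text.splitlines():
        match = TAG_LINE.match(line.strip())
        if not match:
            continue  # skip blank lines and group headings
        code, value = match.groups()
        if code not in KNOWN_CODES:
            raise ValueError(f"unknown field code: {code}")
        record[code] = value.strip()
    return record


def sort_records(records):
    """Order records by country, then by database name."""
    return sorted(
        records,
        key=lambda r: (r.get("COUNTRY", ""), r.get("DBNAME", "")),
    )


# Invented sample data, for illustration only.
sample = parse_record("[DBNAME]EXAMPLE-TERMBANK\n[DBLANG]en, fr\n[COUNTRY]Exampleland")
print(sample["DBNAME"], sample["COUNTRY"])
```

Rejecting unknown codes is a design choice of the sketch: it catches mistyped tags early, at the cost of refusing records that use codes outside the key.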



  1. TERMINOLOGY MANAGEMENT SYSTEMS AND EVALUATION CRITERIA

  1. INTRODUCTION

The developments in terminology management systems have been monitored by a number of authors over the last decade or so. Initially, these monitoring efforts covered a range of activities in terminology as a whole, including aspects of translation, terminology record format, terminology projects and so on. However, in the last five years such monitoring efforts have focused almost exclusively on Terminology Management Systems (TMSs). These extensive surveys include those conducted by Mayer (1990), Freigang, Mayer & Schmitz (1991), and Blanchon (1994). Related studies such as the EU-sponsored research projects GLOSSASOFT and EAGLES also consider computer-assisted terminology from different perspectives and sometimes in broader contexts.

Hvalkof (1985)

Hvalkof's Etude comparative des données terminologiques des banques de terminologie DANTERM, B.T.Q., EURODICAUTOM, Q.F.L. et SIEMENS is one of the first comparative studies of terminological data bases. The author compares six systems on mainframe computers: DANTERM (Handelshøjskolen i København, Denmark), BANQUE DE TERMINOLOGIE DU QUEBEC (Canadian Government; referred to as B.T.Q.), EURODICAUTOM (EC-Commission, Luxembourg & Brussels), NORMATERM (AFNOR, France), LEXIS (Bundessprachenamt, Germany; referred to as Q.F.L.), and TEAM (Siemens, Germany; referred to as SIEMENS). The study pays particular attention to the possibilities of exchanging terminology between the different data bases. The comparison contains an extensive list of database fields, which goes beyond the scope of this chapter. This study predates the emergence of what are now known as TMSs.

Mayer (1990)

In 'Terminologieverwaltungssysteme für Übersetzer: Ergebnisse einer Untersuchung' Mayer presents the results of an analysis of eight TMSs available on the European market in spring 1990. The study focuses on TMSs designed for use at the translator's workplace. The individual systems are: TermTracer, Trados (Germany); Termex, Eurolux Computers (Luxembourg); Superlex, C. Blowers (Germany); Profilex, H. Gabriel (Germany); Multiterm, Trados (Germany); Term-PC, Siemens (Germany); Term-Lidas, Software Design (Germany); Cat, Syntec (Germany). Three of these programs were beta versions.

The author distinguishes four main topics which form the basic structure of the analysis: technical description, characteristics of the terminological entry, access to terminology, and other functions/utilities. The criteria dealt with under the four topics are selected with a view to meeting the needs of translation-oriented terminology. Particular attention is given not only to the different look-up features, but also to the interaction between TMSs and word-processing programs. Mayer notes that none of the analysed systems fulfils the demands of the different translation environments. The author foresees a development towards a multifunctional approach to the translator's environment.

Freigang, Mayer and Schmitz (1991)

These authors have analysed 17 TMSs and describe six CD-ROMs containing linguistic or terminological information which were available on the market in winter 1990/91. Following the analysis of Mayer 1990, the authors focus on translation-oriented TMSs, but also include software for the terminologist's workstation. The depth of the analysis varies in relation to the nature and quantity of information available to the authors. Besides standard hard- and software-related criteria, the major aspects of the analysis cover the interaction between TMS and word-processing software, data exchange facilities, the availability of machine-readable glossaries, and the structure of the terminological entry. The authors distinguish three main types of entry structure: freely definable, definable with restrictions, and fixed.
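
The three types of entry structure can be pictured with a small sketch. It is not drawn from any of the surveyed systems: the class and field names are hypothetical, and the "restriction" is modelled, purely as an assumption, as a cap on the number of user-defined fields.

```python
from dataclasses import dataclass, field
from enum import Enum


class EntryStructure(Enum):
    FREELY_DEFINABLE = "freely definable"
    RESTRICTED = "definable with restrictions"
    FIXED = "fixed"


class SchemaError(Exception):
    """Raised when a field may not be added to the entry structure."""


@dataclass
class EntrySchema:
    structure: EntryStructure
    preset_fields: list = field(default_factory=list)  # fields shipped with the system
    max_user_fields: int = 0  # only consulted for RESTRICTED (hypothetical rule)
    user_fields: list = field(default_factory=list)

    def all_fields(self):
        return self.preset_fields + self.user_fields

    def add_field(self, name):
        """Add a user-defined field if the entry structure type allows it."""
        if name in self.all_fields():
            raise SchemaError(f"field already exists: {name}")
        if self.structure is EntryStructure.FIXED:
            raise SchemaError("fixed entry structure: fields cannot be added")
        if (self.structure is EntryStructure.RESTRICTED
                and len(self.user_fields) >= self.max_user_fields):
            raise SchemaError("restricted entry structure: field limit reached")
        self.user_fields.append(name)


# Usage: a restricted schema accepts one extra field, then refuses a second.
schema = EntrySchema(EntryStructure.RESTRICTED, ["term", "definition"], max_user_fields=1)
schema.add_field("context")
print(schema.all_fields())
```

In practice a restriction could equally concern field types or names rather than their number; the cap is used here only because it is the simplest rule to show.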

The authors appeared hopeful about the growth and impact of TMSs, especially on the European marketplace. They regarded the establishment and implementation of standards as necessary in order to overcome technical and semantic problems concerning the exchange of terminological data. They were convinced that the efficiency and capacity of TMSs would be improved, and that they would develop towards knowledge bases. Thus, terminological data would be collected, managed and used not only by translators but by a variety of users or user groups (documentation specialists, librarians, engineers, technical authors and others). The availability of carefully collected and reliable terminology would become a necessary prerequisite for high-quality translations in specialised fields. Compiling and distributing such terminology would become an additional service offered by language departments, translation agencies and free-lance translators. The report concludes by stating that routines must be developed that will successfully examine large amounts of terminological data with regard to form and content.

Blanchon (1994)

Blanchon's 'Logiciels de terminologie' is the latest overview of tools for computer-assisted terminology work. The author has analysed 69 TMSs, 7 computer-assisted translation systems with a TMS component, and 7 term-extraction tools, as well as several electronic dictionaries on diskette or CD-ROM. The analysis is based on the results of a questionnaire sent to the software developers/distributors. The programs are analysed or described in the light of criteria similar to those used in the studies mentioned before. The main goal was to present a comprehensive market overview. The author concludes by pointing out that the main innovation in the systems surveyed relates to ergonomic and presentational factors, as well as the occasional use of graphics and hypertext links. A tendency to integrate terminology modules in workbenches for translators is also noted, although many desiderata remain.

Related studies

The Linguistic Research and Engineering (LRE) project GLOSSASOFT has attempted to develop methods and guidelines for globalizing software construction. Within this framework the consortium considers, among other things, existing tools offered by language technology. Besides brief descriptions of some TMSs, the GLOSSASOFT reports also contain short sections on evaluation.

EAGLES (Expert Advisory Group on Language Engineering Standards) is another LRE project launched in February 1993 in order to foster the provision of standards for the development, exploitation and evaluation of large-scale language resources. Within this context, the EAGLES group will also look at translation aids like electronic terminology banks including electronic monolingual and multilingual dictionary access systems.

In the present appendix we present a résumé of the survey conducted by Blanchon, which has been augmented by the POINTER consortium, followed by a summary of the evaluation criteria for TMSs developed in POINTER.

  1. OVERVIEW OF TERMINOLOGY MANAGEMENT SYSTEMS (TMSS)

System Hardware Platform Interface Language(s) Originators / (Price) Further Reference
ALETHGT Unix GSI-ERLI
AQUILA PC French SITE / 8500 FF
ASCOM-TDB* Unix French ASCOM-TDB
AUTOLEX PC French Traductix / 2000 $Can
BATEM* PC French Jean Baudot BAUDOT, Jean (1988): BATEM: une minibanque de terminologie, Terminogramme N°46.
BELGOTERM HERMANS, Adrien (1994): La banque terminologique Belgoterm. In: Meta, XXXIX, 1, 1994.
CAT PC German SynTec / 40000 DM (approx.)
CATS PC German AUCOM / 950 DM Schmitt, Peter A. (1987): Computer statt Kartei. Terminologiearbeit mit Mikrocomputer. In: Lebende Sprachen 2/87, 56-65.

Schmitt, Peter A. (1992): CATS/FASTERM: Ein Beitrag zur rechnerunterstützten Übersetzung und Terminologiearbeit. In: Forschungsmagazin der Johannes Gutenberg-Universität Mainz, Sonderausgabe 1992, 4-20.

Schmitt, Peter A. (1994): Translationsorientierte Terminographie auf dem PC. Ein neuer Weg von der Terminologiedatenbank zum Fachwörterbuch. In: Fischer, Ingeborg; Freigang, Karl-Heinz; Mayer, Felix; Reinke, Uwe (Hrsg.)(1994), 31-62.

Zouroufidou, Theodora (1992): Terminologische Eintragsstrukturen in Diplomarbeiten und deren Umsetzung im Terminologiesystem CATS. Diplomarbeit, Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.

CDS ISIS CIPRIANO, A. (1992): Industrial Engineering Terminology. In: Actes de TAMA '92, Applications terminologiques et microordinateurs, organisé par TermNet les 5 et 6 juin 1992 à Avignon, Paris, 1992.

ROSENDAHL, S.(1993): Terminologieverwaltung mit CDS-ISIS. Diplomarbeit, Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.

CODE* PC, Unix, Mac English Douglas Skuce

COGNITERM Skuce & Meyer MEYER, Ingrid; BOWKER, Lynne; ECK, Karen (1991): Constructing a Knowledge-Based Term Bank: Fundamentals and Implications. In: Actes du symposium international Terminologie et documentation dans la communication spécialisée, Infoterm, Secrétariat d'Etat du Canada, Montréal, 1991.

MEYER, Ingrid; BOWKER, Lynne; ECK, Karen (1992): COGNITERM: An Experiment in Building a Terminological Knowledge Base. In: Proceedings of the Fifth Euralex International Congress, Tampere, Finland, 4-9 August 1992.

MEYER, Ingrid; BOWKER, Lynne (1993): Beyond 'Textbook' Concept Systems: Handling Multidimensionality in a New Generation of Term Banks. In: Schmitz, Klaus-Dirk (ed.)(1993): TKE'93: Terminology and Knowledge Engineering. Proceedings of the Third International Congress on Terminology and Knowledge Engineering, 25-27 August 1993, Cologne. Frankfurt: Indeks.

MEYER, Ingrid (forthcoming): Pour avoir une vue d'ensemble de la salle de bal ... Nouvelles perspectives dans une base de connaissances terminologiques. In: Actes des troisièmes journées scientifiques Traductique-TA-TAO, organisées par l'AUPELF-UREF du 30 septembre au 2 octobre 1993 à Montréal.

COMPLEX / RAILLEX Cvirn, Hermann (1992): Untersuchung der Software 'Complex/Raillex'. Diplomarbeit, Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.
CONCEPT & TERM* PC English Christian Quist
CONTEXT* PC English Consense
DESKI PC French Newtech System / 0-500 F
DICOBASE* PC, Unix Lingaware
DICOTERM PC German, French Renato Reinau / 1200 FS
DIKI / SIL PC French ANACO
EDIBASE COUTROT, Francois (1992): EDIBASE: progiciel de documentation à usage terminologique. In: Actes de TAMA '92, Applications terminologiques et microordinateurs, organisé par TermNet les 5 et 6 juin 1992 à Avignon, Paris, 1992.
ERITERM Ericsson Language Services
EUROGLOT / E'GLOT WINDOWS PC French, English, Dutch Linguistic Systems
FAOTERM PC English, French, Spanish FAO
GAT PC, Unix Corporate Technology
GESTORLEX TEXTware
GLOBEDISK PC German, English Electronic Publishing / 198 DM
HYPERTEPA PC, Mac Finnish Olli Nykänen

INGESP PC Spanish Centersoft / 180 $
INK TEXT TOOLS Donnelley Language Solutions
JURITERM SNOW, Gérard (1993): JURITERM - logiciel de recherche terminologique. In: L'actualité terminologique, Vol. 26, N°1, 1993.
KEYLEX PC German, French, English CAP Debis / 199-399 DM
KEYTERM WINDOWS / UNIX PC, Unix French, English, German CAP Debis / Unix: 20000-100000 DM; PC: 2000 DM Albrecht, Monika (1993): Terminologieverwaltung mit Keyterm. Diplomarbeit, Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.
KONSULT PC French André Kaisserlian / demonstration available
LATTER QUIRION, Jean (1992): Trois larrons en foire : TERMIUM, PUBLICIEL, et LATTER. In: L'actualité terminologique, Vol. 25, N°3, 1992.

QUIRION, Jean (forthcoming): La terminotique au Secrétariat d'Etat du Canada. In: Actes des troisièmes journées scientifiques Traductique-TA-TAO, organisées par l'AUPELF-UREF du 30 septembre 1993 au 2 octobre 1993 à Montréal.

LC-TOP PC German, French, English Softex / 345 DM
LE DICO PC French, English Dictionarian Systems / 600 F
LE LEXICALISTE Unix French, English SITE / 1000 F for demo (deducted from purchase price)
LE TERMINOLOGUE PC, Mac French, English Les Editions de Lanaudière / 995 $Can
LEXBASE PC English Micro-Aid / 250 £
LEXIKON PC English, Afrikaans National Terminology Services / free in exchange for data
LEXITERM Mac French Société Hizkia / 3900 F
LEXITHEQUE PC French La Maison du dictionnaire / 1500 F

LEXM Mac French, Basque Société Hizkia
LEXPRO PC, Mac, Unix French, English La Maison du dictionnaire / 4800 F
LINGTOOLS PC, Unix German (other languages on demand) Sietec Systemtechnik / 3900 DM
LINGUA-PC* PC French, German Service central de terminologie du Canton de Berne DE BESSE, Bruno; BERNEGGER, Jurg (1991): Lingua-PC, outil de gestion terminologique du canton de Berne. In: Terminologies nouvelles, N° 5, 1991.
MC4 PC French Terminformatique / 6000 FF HENNING, Jean-Michel (1992): MC4: un outil pour la terminologie. In: Actes de TAMA '92, Applications terminologiques et microordinateurs, organisé par TermNet les 5 et 6 juin 1992 à Avignon, Paris, 1992.
MULTITERM / M'TM WINDOWS PC French, English, Spanish, German, Dutch La Maison du Dictionnaire (Paris), Eurolux (Luxembourg), Trados (Germany) / 6200 FF DOS: Heyn, Matthias; Heid, Ulrich (1992): Multiterm 2: Eine konzeptorientierte multilinguale Terminologiedatenbank unter DOS. In: Le langage et l'homme.

Schrenner, Sibylle (1992): Terminologische Eintragsstrukturen in Multiterm. Diplomarbeit, Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.

Szylowicki, Ilona (1991): Untersuchungen des Einsatzes elektronischer Werkzeuge für den Übersetzer am Beispiel von 'Translator's Workbench'. Diplomarbeit, Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.

WINDOWS: HEYN, Matthias (1992): A New Terminological Database within a Graphical Environment: Multiterm for Windows. In: Actes de TAMA '92, Applications terminologiques et microordinateurs, organisé par TermNet les 5 et 6 juin 1992 à Avignon, Paris, 1992.

Thon, Jörg (1993): Terminologieverwaltung mit Multiterm für Windows. Diplomarbeit, Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.

SEYBOLD, Michael (1995): Terminologieverwaltung unter Windows: Eine vergleichende Untersuchung der Terminologieverwaltungssysteme TermStar, MultiTerm 95, TermISys, Termbase. Diplomarbeit, Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.

NEOLOG* PC, Mac French Charles Zama
OMNITERM TINOS
PHENIX PC SITE
PROFILEX PC German, English, French Horst Gabriel / 1425 DM
PROTERM PC French Les logiciels Tradulog / 399-650 $Can
STATION GENELEX Unix GSI-ERLI
SUPERLEX / SUPERLEX WINDOWS PC German, English, French Chris Blowers / 999 F

SYSTEM QUIRK PC, Unix English InKE Ltd (phone: +44 1483 295744) / price on application HOLMES-HIGGIN, Paul (1995): The Quirk Experiments. PhD Dissertation, Guildford, University of Surrey.
TEDI / POOH Lothar Rostek
TERMBASE PC Gerhard Freibott
TERMBASE PC German, English V. Srinivasan / 2500-3500 DM
TERMEX / TERMEX WINDOWS PC, Mac French, English, German, Dutch, Greek Eurolux Computer or La Maison du Dictionnaire / 2740 FF Brändle, Diana (1992): Terminologische Eintragsstrukturen mit Termex/MTX. Diplomarbeit, Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.

DE SCHAETZEN, Caroline (1987): Terminologie en langues africaines et gestionnaires de données textuelles. In: Le langage et l'homme, vol XXII, fasc.2, 1987.

Hartmann, Christine; Jilleck, Dagmar (1989): Untersuchungen zur inhaltlichen Strukturierung von Terminologieverwaltungssystemen. (= Arbeitsbericht 11, DFG-Projekt 'Übersetzerarbeitsplatz'.) Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.

MELBY, Alan (forthcoming): MTX/TERMEX: gestion d'information terminologique. In: Actes des troisièmes journées scientifiques Traductique-TA-TAO, organisées par l'AUPELF-UREF du 30 septembre 1993 au 2 octobre 1993 à Montréal.

Stoll, Cay-Holger (1989): A concept-oriented approach to terminology work on PC. In: META 34, 3/89, 615-628.

STOLL, Cay-Holger (1991): Tour specialist for language processing. In: Actes du colloque Terminologie et enseignement des langues, organisé par l'AEPLV les 31 janvier et 1er février 1991 à Cergy-Pontoise, La Tilv éditeur, Paris 1991.

STOLL, Cay-Holger: MTXR - Management of Information. In: Actes de TAMA '92, 2ème symposium TermNet 'Application terminologiques et microordinateurs', Avignon 5-6 juin 1992.

Wright, Sue Ellen (1990): Design for terminological management. In: The ATA chronicle: 6/90 (I), 14-16; 7/90 (II), 13-15; 8/90 (III), 16-17; 9/90 (IV), 15; 10/90 (V), 18-19; 11-12/90 (VI), 31.

TERMISTI* PC French, English (and others) ISTI BLAMPAIN, D.; PETRUSSA, P.; VAN CAMPENHOUDT, M. (1991): A la recherche d'écosystèmes terminologiques. In: Actes du colloque 'L'environnement traductionnel. La station de travail du traducteur de l'an 2001', AUPELF, Mons, 25 - 27 avril 1991, Montréal, 1992.

MERTEN, Pascaline (1992): Apport des relations notionnelles à la description terminologique. In: Actes de TAMA '92, 2ème symposium TermNet 'Application terminologiques et microordinateurs', Avignon 5-6 juin 1992.

VAN CAMPENHOUDT, Marc (forthcoming): Les relations notionnelles expérimentées dans les micro-glossaires Termisti: du foisonnement à la régularité. In: Actes des troisièmes journées scientifiques Traductique-TA-TAO, organisées par l'AUPELF-UREF du 30 septembre 1993 au 2 octobre 1993 à Montréal.

TERMISYS PC German, English Köller Informationssysteme / 200-600 DM SEYBOLD, Michael (1995): Terminologieverwaltung unter Windows: Eine vergleichende Untersuchung der Terminologieverwaltungssysteme TermStar, MultiTerm 95, TermISys, Termbase. Diplomarbeit, Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.

TERM-LIDAS PC, Unix German Software Design / 3500 DM (approx.)
TERM-PC PC, Unix German, English Sietec Systemtechnik / 3000 DM Borke, Claudia (1992): Terminologieverwaltung mit Term-PC und Term-Trans. Diplomarbeit, Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.

Hartmann, Christine; Jilleck, Dagmar (1989): Untersuchungen zur inhaltlichen Strukturierung von Terminologieverwaltungssystemen. (= Arbeitsbericht 11, DFG-Projekt 'Übersetzerarbeitsplatz'.) Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.

Hohnhold, Ingo (1990): Terminographie auf Term-PC. In: MDÜ - Mitteilungsblatt für Übersetzer und Dolmetscher 4/36, 1990, 4-5.

Vollnhals, Otto (1989): Term-PC - the terminology system. In: TermNet News 26, 1989, 20-23.

TERMIS BURDET, Claude-Alain (1991): Terminology and the Management of Information, some practical solutions delivered by the notional inference engine of Termis. In: Actes du symposium international Terminologie et documentation dans la communication spécialisée, Infoterm, Secrétariat d'Etat du Canada, Montréal, 1991.
TERMS Digital
TERMSTAR SEYBOLD, Michael (1995): Terminologieverwaltung unter Windows: Eine vergleichende Untersuchung der Terminologieverwaltungssysteme TermStar, MultiTerm 95, TermISys, Termbase. Diplomarbeit, Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.
TFS Large IBM systems Leonard Cantor
TMS PC Monsieur Bodart / 3000-4000 DM PÖLKNER, Birgit (1993): Terminologieverwaltung mit TMS. Diplomarbeit, Fachrichtung 8.6, Universität des Saarlandes, Saarbrücken.
TOPTERM PC Toulouse Verlag / 100 DM
TRANSLEXIS IBM Germany
TWIN PC English Siemens Nixdorf / 980 DM
VALTER Unix VTKK Government Systems
VCH-TRANSDICT PC German VCH Verlagsgesellschaft / 3900 DM
VERTALER Sun Data Service
WB4000 MONTANA
WHOTERMPC English OMS / 500 $ 'WHOTERM Expansion'. In: Language International, Vol.5, N° 2, 1993 .
WILD-CAT EICISOFT
WORDBOXPC Toulouse Verlag / 320 DM

Sources:

The preceding table was compiled from the results of a survey co-ordinated by Elisabeth Blanchon (1) and a survey reported in the POINTER document Aspects of Terminology Infrastructure in Europe: Volume I - Analysis of Terminology Management Systems in Europe (2). Entries in the table for hardware platform, interface language and price have been left blank where the relevant information was not readily accessible.

(1) TermNet News - Issues 46/47 (1994) and 48 (1995).

(2) Khurshid Ahmad, Karl-Heinz Freigang, Felix Meyer, Uwe Reinke, Margaret Rogers, Klaus-Dirk Schmitz.

  1. EVALUATION CRITERIA FOR TERMINOLOGY MANAGEMENT SYSTEMS (TMSs)

Criteria / Comments
Hardware and Software requirements
  • Hardware requirements: platform (e.g. PC / Macintosh / Unix), RAM (minimum / recommended), hard disk capacity (minimum / recommended), peripheral devices (colour screen, mouse)
  • Software requirements: operating system (e.g. TSR / MS-Windows)
  • Underlying data model ('flat' / relational / object-orientated / semantic network)
  • Multitasking mechanisms
  • Multi-user / network capabilities
The extent to which these technical features are crucial for selecting a TMS depends very much on the existing or planned electronic data processing equipment in a given work environment.
User Interface
  • Installation routine
  • User interface language(s)
  • Type of user interface (menu driven, graphical user interface; command keys, function keys, dialogue boxes, icons, mouse control)
  • Operating manual
  • On-line help (context-sensitivity)
  • Tutorials (on-line)
The abilities and experience of users should be considered when evaluating the interface language(s), degree of user-friendliness, help and tutorial features and accompanying documentation offered by a TMS.

The layout of information on the screen and the potential for 'customisation' are important for the tasks the TMS is expected to perform.

Terminological Aspects

Data Management:

  • Max. number of - databases / dictionaries; simultaneously accessible databases; languages per database; entries per dictionary.
  • Max. size of entry

Entry Model:

  • Predefined vs. definable entry structures
  • Relations between data categories
  • Relations between entries
  • Restrictions on entry structure
The choice of a TMS for a particular situation depends on the number of languages envisaged, the degree of elaboration required, the language direction (for bi- or multilingual terminologies), and the types of relation required within and between entries (e.g. 'synonym autonomy').

Input of Information

Validation / Control

  • (Automatic) checking for duplicates
  • Spell checking
  • Consistency control (obligatory data categories, templates)
  • Controlled input (for categories with restricted input options)

Terminology Extraction

  • Languages available
  • Single terms, complex terms, compounds, phraseology.
  • Supported text formats (RTF, SGML, etc.)
In addition to manual input, some systems provide facilities for importing data from other programs and for computer-assisted terminology extraction from texts. Another key issue concerning input is validation and control: a TMS should warn the user of duplicates, check spelling and control data input for certain categories so as to enhance the consistency of a term base.
Retrieval of Information

Look-up features

  • Access via term: complete term, beginning of term, case sensitivity, manual truncation, restricted search, Boolean operators, fuzzy search, free-text search.
  • System responses: closest match, hitlists for fuzzy searches, error messages.
  • Logging of terms not found.
  • Browsing: alphabetical, chronological, entry ID, conceptual.
  • Access via other fields: entry number, term in context or definition.
  • Control of simultaneous access across network.

Selection of information

  • Selection for: search, import, export, printing
  • Data categories available for selection
  • Selection criteria: Boolean operators, strings, mathematical operators (< , > etc.)

Views on information

  • Fixed vs. definable (extent to which structure of layout can be specified)
  • Alphabetical sorting (correctness for non-Latin character sets)

Security of Information

  • Access rights definable for: Database, Entry, Field
  • Types of access restriction: No access, read only
Facilities for retrieval should match the needs of the envisaged users.

Exchange of Information

Printing

  • Direct printer interface or printing via export to / interaction with word-processing and/or DTP programs.
  • Supported printers.
  • Alphabetical sorting.
  • Selection of information.

Import and Export of Data

  • Supported character sets (ASCII, ANSI, etc.)
  • Supported (quasi-)standards for data interchange (MARTIF, MATER, MicroMater, etc.)
  • Supported formats of other TMSs.
  • Selection of information.
  • Features to export to (i) other TMSs (e.g. user-definable export routines) and (ii) word-processing / DTP programs.
  • Routines for validation and control of imported data.

Interaction with Other Applications

  • Word-processing programs: access to TMS from WP; paste from TMS to WP (TL term only / other data elements); method of pasting (direct / via buffer or clipboard); paste term from WP to TMS.
  • Translators' Workbenches: in addition to interaction with WPs - automatic term recognition; (automatic) term substitution; (automatic) term extraction.
  • MT-Programs: criteria to be developed.
In order to produce printed dictionaries, glossaries and term lists a TMS should have export routines which allow for the specification of layout information for the document to be printed.

Data interchange facilities should support standard character sets (such as ASCII and ANSI); support of 32-bit character sets (as defined in ISO 10646), which are not yet common in application programs, is desirable for languages with non-Latin alphabets.

Moreover, individual structures and formats of other TMSs should be supported as well. Import routines should provide mechanisms for the validation and control of imported data.

Depending on the type of user the ability of a TMS to interact with other software applications may become a crucial factor in its evaluation.

TMSs may interact with standard word-processing programs, translators' workbenches, MT systems and Artificial Intelligence (AI) programs such as expert systems.

The TMS should enable direct access from the word processor, and the text being edited in the word processor should remain visible. Additionally, it is desirable for information to be passed between the TMS and the word processor via 'cutting and pasting'. When part of a translators' workbench, a TMS should offer features such as automatic term recognition, (automatic) substitution of source language terms by target language equivalents and computer-assisted term extraction.

Current MT systems tend to require explicit linguistic descriptions that the human user does not need, which is why hardly any of the present commercial TMSs are designed to provide entries for the dictionaries of MT systems.

Additional tools
  • Related programs
  • Fonts
  • Dictionaries
Some TMSs available on the market include additional modules for building up and managing classification systems, concept systems, or thesauri. Other programs offer special fonts for non-Latin character sets or a range of electronic dictionaries for both general and special languages. The applicability of these tools is very much dependent upon the users' requirements.
Commercial aspects
  • Prices (single user / multi-user / site licenses, educational and mass discounts, update price).
  • Support (availability and costs): installation, service, training courses, hot-line.
When comparing the prices of different TMSs, it is necessary to assess whether important auxiliary tools, such as import-export routines, character sets or classification modules, are included in the price.

For multi-user and network environments, a multiple software license can be a cheaper alternative to buying several single copies of the program.

Source:

These criteria were developed in the course of the POINTER project and are reported in more detail in Aspects of Terminology Infrastructure in Europe: Volume I - Analysis of Terminology Management Systems in Europe (Khurshid Ahmad, Karl-Heinz Freigang, Felix Meyer, Uwe Reinke, Margaret Rogers, Klaus-Dirk Schmitz).

  1. SURVEYS ON TERMINOLOGY AND TERMINOLOGY WORK

Lists

Key to field codes

INFORMATION
[INFOPROV] information provider within POINTER
[INFODATE] date of last update
[INFOPUBL] indication on confidentiality
[INFOLANG] language in which the information is provided
[INFONOTE] note re. information provided
SURVEY
[SURVTITL] title or name of the survey
[SURVAUTH] author(s) of the survey
[SURVDATE] date of the survey
[SURVGOAL] goals of the survey
[SURVTARG] target group(s)
[SURVCOVE] geographical coverage of the survey
[SURVCONT] contents of the survey
[SURVUTIL] utilisation of the survey
[SURVAVAI] availability of the results of the survey (in general)
[SURVQUAV] availability of the questionnaire used to the provider of the information
[SURVREAV] availability of the results to the provider of the information
[SURVNOTE] information re. the survey not provided elsewhere




  1. THE POINTER TRAINING MODEL

This Appendix provides selected extracts from the Phase II Workpackage 3.1 report on a model for the accreditation of terminology training [Ahmad et al. 1995].

  1. CORE TASKS AND SUB-TASKS

The areas of activity which we consider to be central to terminology, i.e. so-called Core Tasks, are shown below:


Core tasks, showing also sub-tasks:

Terminology Acquisition
  • Corpus Creation
  • Text Analysis
  • Importing Terms
  • Creating Termbanks
  • Evaluating Termbanks
  • Verification
  • Validation

Terminology Organisation
  • User Requirements
  • Record Format Development
  • Conceptual Modelling
  • Systems Procurement
  • Modification/Updating
  • Exporting Terms

Terminology Application
  • Document Management
  • Translation
  • Technical Writing
  • Localisation / Internationalisation
  • Liaison

Terminology Education, Training and Research
  • General Awareness Training
  • Terminology in Academic Courses
  • Research


Terminological activities do, of course, reach beyond these core tasks. Such activities and areas of interest include: lexicography; legal procedures (e.g. copyright); information systems; information science; and language planning (sociolinguistics). These are not dealt with in this Appendix.

  1. LEVELS OF PROFESSIONAL DEVELOPMENT

We propose to recognise eight levels of professional development:


0 Basic entry (no prior training)
1 Standard entry (some informal experience)
2 Practitioner with basic training
3 Practitioner with more extensive training
4 Practitioner with extensive training and experience
5 Experienced practitioner with supervisory skills
6 Expert Practitioner or middle-ranking Manager
7 Senior specialist or Manager

Each level can be described in a way generic to the entire model, although further criteria specify the requirements to reach a given level in a particular area of work.

  1. ILLUSTRATION OF INTERACTION OF TWO CORE TASKS AND LEVELS OF PROFESSIONAL DEVELOPMENT

This model of professional development can be illustrated by focusing on two core tasks, Acquisition and Organisation, sketched against the eight professional levels mentioned above.

Table 1 describes the progression of a typical terminologist from a trainee (Levels 0, 1) to a practising terminologist (Levels 2, 3, 4 and 5) and finally to a managerial level terminologist (Levels 6, 7). Each core task in Table 1 is split into named sub-tasks.



Level / Acquisition / Organisation

TRAINEE TERMINOLOGIST

0: Structured, closely supervised environment
  • Acquisition: Create corpus (copy texts, etc.)
  • Organisation: --

1: Structured environment
  • Acquisition: Create corpus; create termbanks (e.g. input terms)
  • Organisation: --

PRACTISING TERMINOLOGIST

2: Practitioner with basic training
  • Acquisition: --
  • Organisation: Modify terms; export terms (e.g. copy terms)

3: Practitioner with extensive training
  • Acquisition: Create corpus; analyse texts; import terms; create termbanks
  • Organisation: Modify terms; export terms

4: Practitioner with extensive training and experience
  • Acquisition: Create corpus; analyse texts; import terms; evaluate termbanks; verify termbanks
  • Organisation: Elicit user requirements; procure hardware/software; modify terms; export terms

5: Experienced Practitioner
  • Acquisition: Text analysis; evaluate termbanks; verify terms; validate terms
  • Organisation: Elicit user requirements; develop record format; procure hardware/software; modify terms

MANAGERIAL LEVEL TERMINOLOGIST

6: Experienced Practitioner (Middle Management)
  • Acquisition: Oversee and integrate Sub-Tasks performed by others
  • Organisation: Develop record format; conceptual modelling; procure hardware/software

7: Senior Specialist/Manager
  • Acquisition: Oversee and integrate Sub-Tasks performed by others
  • Organisation: Develop record format; conceptual modelling

Table 5 / 1 : A Detailed Model of two Core Tasks with Levels

Each cell in this table can be further elaborated as illustrated below (example given for the basic level trainee):



Task: Terminology Acquisition
Sub-Task: Corpus Creation
Level: 0
Cell Ref.: TA C0

Academic qualifications

Educated to the age of 18+. Should demonstrate an intermediate level of numeracy and literacy; a working knowledge of computers is a distinct advantage as is knowledge of some foreign language(s).

Terminology-specific skills and experience

No experience of terminology work expected. However, experience of clerical work would be advantageous.

Duties

(1) Working within a structured and closely-supervised environment.
(2) The location and transfer of previously-specified documents from paper and electronic sources to a corpus; adhering to the appropriate structuring principles. Photocopying; scanning; use of Internet browsing software.
(3) Thorough documentation of texts entered to corpus.

On-going training to be provided

(1) Should be encouraged to initiate own searches for texts in a given domain.
(2) Courses in any relevant (human) languages, software tools and operating systems.
(3) Instruction in standardisation issues.
(4) Develop understanding of commercial, administrative or industrial activities and terminology of employing organisation.



  1. POSSIBLE INTERACTIONS BETWEEN TASKS AND EXTANT SYLLABUSES FOR TERMINOLOGY TRAINING

In order to assess the compatibility of extant syllabuses with the matrix model for terminology training accreditation, an attempt was made to locate various components of a selection of the syllabuses investigated in the POINTER training survey in a summary matrix of the core tasks.

The terminology syllabuses of a wide range of institutions were reviewed:

University of Surrey, UK (S); Universitat Pompeu Fabra, Spain (PF); University of Hildesheim, Germany (HI); University of the Saarland, Germany (SA); South Denmark Business School, Kolding (SD); Swiss Federal Chancellery, Bern (FC); Université de Paris Nord, France; Topterm (TOP). In Table 2 below, an attempt has been made to relate aspects of these syllabuses to the Levels of Professional Development outlined in 5.2 above.

Levels 0 and 1 are designated "Introductory" (I); Levels 2, 3, 4 and 5 are designated "Medium" (M); and Levels 6 and 7 correspond to "Advanced" (A).
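
The grouping of levels into bands can be expressed as a small lookup. The following Python fragment is only an illustrative sketch: the function name band_for_level is hypothetical and does not come from the POINTER report; the band boundaries are those stated in the preceding sentence.

```python
def band_for_level(level: int) -> str:
    """Map a level of professional development (0-7) to its syllabus band.

    Bands as stated in the text: 0-1 Introductory (I),
    2-5 Medium (M), 6-7 Advanced (A).
    """
    if level not in range(8):
        raise ValueError(f"level must be between 0 and 7, got {level!r}")
    if level <= 1:
        return "I"
    if level <= 5:
        return "M"
    return "A"


if __name__ == "__main__":
    for lvl in range(8):
        print(lvl, band_for_level(lvl))
```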

Terminology Acquisition

(Corpus creation, text analysis, incorporating terms, creating termbanks, evaluating termbanks, verification, validation).

Terminology Organisation

(User requirements, record format development, conceptual modelling, systems procurement, modification/updating).

Terminology Application

(Document management, translation, technical writing, localisation / internationalisation).

Terminology Education, Training and Research

(General awareness training, terminology in academic courses, research).

General Background Education for Terminology
University of Surrey

(MA in Translation Studies - 1 year)

I, M (S8, S9 - elaboration of terminology)

I, M, A (S10, S11, S12 - text analysis)

I, M, A (S13, S14, S15 - Term Banks)

I (S3 - key notions of lexical resources)

I, M (S5,S6 - a concept based approach to terminology.

S7 - dealing with bilingual terminology in a concept-based approach)

I, M (S4 - the translator and terminology)

M (S13 - standardisation of terms)

M (Research by dissertation) I (S1, S2 - specialist subjects and special languages)
University of Hildesheim

(four year diploma in applied linguistics)

I (HII - text linguistics)

I (HIII.4 - unambiguity)

I (HIII.3 - concept and designation. HIII.6 - equivalence. HIII.7 - dictionaries) I (HIII.1 - terminology and translation)

I (HIII.5 - terminology standardisation. HIII.9 - practical terminology. HIII.10 - information and documentation)

I (HI - methods and problems of translation)

I (HIII.2 - language for special purpose).

I(HIV - computer assisted translation.)

Saarland University

(Translation / interpretation diploma)

I (SA2- working with TMSs, SA3 - survey of existing ones) I, M (SA4 - entry structures, SA7 - fundamentals of terminography, SA8 - databank models) I, M (SA1 - terminology management in the translation environment, SA6 - use of machine readable glossaries.) M, A (SA7 - SA12)
Southern Denmark Business School

(Training seminar and one-day follow up)

M (SD4 - the computerisation of NORDTERM. DANTERM classification). I (SD3 - terminology and IT. SD5 - terminology and information and documentation. SD6 - terminology and standardisation.) M (Research by case study). I (SD1 - special language, SD2 - terminology and specialised communication)
Swiss Federal Chancellery

(Short course, 1/2 weeks)

I (FC9 - termbanks). I, M (FC3 - concept and designation. FC4 - the terminological entry. FC10 - terminology classification). I, M (FC6 - working procedure in terminology. FC7 - particular problems of legal terminologies). I (FC5 - Practical exercise in the terminological evaluation of parallel texts. FC8 - Glossary compilation practical). I (FC1 - fundamentals of terminology. FC2 - special language).
Universitat Pompeu Fabra (Terminology Programme) I, M (PFII and Option 4 - term data bases) I, M (PFIV and Option 7 - lexicography) I, M (PFIII and Option 3 - Neology and standardisation) I, M (PFI)
Universitat Pompeu Fabra

(PhD Courses in Applied Linguistics)

A (Methodology of terminology creation) A (Lexicography and terminology) A (Rules and standards) A
University of Barcelona

(Master in Applied Linguistics)

I, M (B7-B8, sources of information, B17 - terminotics). I (B4 - B6, conceptual bases, designation conceptual systems, B15 - needs analysis) I(B16, neology and standardisation) I (A1-A3, social, political, scientific frame)
Université de Paris-Nord

(Langues étrangères appliquées) (LEA)

I (Training in the all-round use of a TMS and text analysis) I, M (Courses orientated towards technical translation and business/commerce). I (Practical terminological analysis)

M (Research projects in 3rd & 4th years)

Topterm v.o.f, Amsterdam I (TOP7 - Nomenclature) I (TOP1 - Introduction to terminology, TOP2 - Terminology theory, TOP5 - Computerised terminology work, TOP6 - Aspects of terminology management within an organisation) I (TOP3 - Terminography) I (TOP4 - Special Topics)

Table 5 / 2 : The Relationship between Tasks and Selected Syllabuses
  1. FOUR MATRICES SUMMARISING TASKS AND LEVELS OF PROFESSIONAL DEVELOPMENT

For each Core Task we delineate the Sub-Tasks which can be reasonably executed for a given level of background experience and training. Tables 3, 4, 5 and 6 show a rectangular matrix for terminology acquisition, terminology organisation, terminology application, and terminology education, training and research tasks respectively. In order to ensure movement within and between matrices, cell descriptions must be flexible enough to allow for a wide range of experiences and qualifications to be recognised. Movement is envisaged along the following paths:

The lists of core tasks, sub-tasks and levels of professional development constitute the basis of the matrices presented below. Each Core Task has its own matrix with a column for each of its Sub-Tasks. Each matrix has eight rows representing the levels of professional development. Only the relevant cells of the matrix are filled. The code entered in each of the relevant cells reflects the Task, the Sub-Task and the Level.
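
The coding scheme just described (Task, Sub-Task, Level) can be sketched in code. The following Python fragment is an illustration only: the names CellRef, parse_cell_ref and format_cell_ref are hypothetical and not part of the POINTER report. It assumes that a code consists of a task abbreviation, a space, a sub-task abbreviation of one or more letters and a level number from 0 to 7, as in 'TA C0' or 'TAP TW4'.

```python
from typing import NamedTuple


class CellRef(NamedTuple):
    task: str      # e.g. "TA" (Terminology Acquisition)
    sub_task: str  # e.g. "C" (Corpus Creation) or "VA" (Validation)
    level: int     # level of professional development, 0-7


def parse_cell_ref(code: str) -> CellRef:
    """Split a cell reference such as 'TA C0' into task, sub-task and level."""
    parts = code.split()
    if len(parts) != 2:
        raise ValueError(f"expected '<task> <sub-task><level>', got {code!r}")
    task, rest = parts
    sub_task = rest.rstrip("0123456789")   # letters before the level digit
    digits = rest[len(sub_task):]          # trailing level digits
    if not (task.isalpha() and sub_task.isalpha() and digits):
        raise ValueError(f"malformed cell reference: {code!r}")
    level = int(digits)
    if not 0 <= level <= 7:
        raise ValueError(f"level must be between 0 and 7, got {level}")
    return CellRef(task, sub_task, level)


def format_cell_ref(ref: CellRef) -> str:
    """Rebuild the printed form of a cell reference."""
    return f"{ref.task} {ref.sub_task}{ref.level}"


if __name__ == "__main__":
    print(parse_cell_ref("TA VA5"))
```

Keeping the three components separate makes it straightforward to list, for example, every cell of one task at a given level.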

Level         Corpus    Text      Importing  Creating   Evaluating  Verification  Validation
number        Creation  Analysis  Terms      Termbanks  Termbanks
0 (18+ yrs)   TA C0     --        --         --         --          --            --
1 (21+ yrs)   TA C1     --        --         TA T1      --          --            --
2 (25+ yrs)   --        --        --         TA T2      --          --            --
3 (30+ yrs)   TA C3     TA A3     TA I3      TA T3      --          --            --
4 (35+ yrs)   TA C4     TA A4     TA I4      --         TA E4       TA V4         --
5 (40+ yrs)   --        TA A5     --         --         TA E5       TA V5         TA VA5
6 (45+ yrs)   --        --        --         --         --          --            TA VA6
7 (50+ yrs)   --        --        --         --         --          --            TA VA7

Table 5 / 3 : Matrix for Terminology Acquisition (TA)

Level    User          Record Format  Conceptual  Systems      Modification/  Exporting
number   Requirements  Development    Modelling   Procurement  Updating       Terms
0        --            --             --          --           --             --
1        --            --             --          --           --             --
2        --            --             --          --           TO M2          TO E2
3        --            --             --          --           TO M3          TO E3
4        TO U4         --             --          TO S4        TO M4          TO E4
5        TO U5         TO R5          --          TO S5        TO M5          --
6        --            TO R6          TO C6       TO S6        --             --
7        --            TO R7          TO C7       --           --             --

Table 5 / 4 : Matrix for Terminology Organisation (TO)

Level    Document    Translation  Technical  Localisation /        Liaison
number   Management               Writing    Internationalisation
0        --          --           --         --                    --
1        --          TAP T1       TAP TW1    TAP L1                --
2        --          TAP T2       TAP TW2    TAP L2                --
3        --          TAP T3       TAP TW3    TAP L3                --
4        --          TAP T4       TAP TW4    TAP L4                TAP LI4
5        TAP D5      TAP T5       TAP TW5    TAP L5                TAP LI5
6        TAP D6      TAP T6       TAP TW6    TAP L6                TAP LI6
7        TAP D7      TAP T7       TAP TW7    TAP L7                --

Table 5 / 5 : Matrix for Terminology Application (TAP)

Level    General Awareness  Terminology in    Research
number   Training           Academic Courses
0        --                 --                --
1        ETR G1             --                --
2        ETR G2             --                ETR R2
3        ETR G3             ETR A3            ETR R3
4        --                 ETR A4            ETR R4
5        --                 ETR A5            ETR R5
6        --                 ETR A6            ETR R6
7        --                 ETR A7            ETR R7

Table 5 / 6 : Matrix for Terminology Education, Training and Research


  1. THE POINTER QUALITY MATRIX

  1. INTRODUCTION

A key element in terminological activity is the evaluation of the quality of information. In a multilingual framework, this evaluation ascertains the level of coherence of equivalents added and ensures the effective reuse of data. The survey performed has therefore been carried out on both existing multilingual documents and monolingual references, since the latter may serve as a basis for the creation of multilingual resources.

The present appendix is a summary of the survey and its analysis which was carried out by CL on a large sample of existing terminological resources with the aim of evaluating their quality.

The results of this survey have formed one basis for the recommendations drafted by the POINTER Consortium in respect of the quality evaluation of existing terminological resources.

  1. COMPILATION OF THE EVALUATION TABLE

CL's first task was to compile an evaluation table that includes the main criteria necessary to allow the assessment of the quality of terminological resources.

The table comprises the following elements:

  1. general identification information about the resources;
  2. information about the languages and fields present;
  3. elements pertaining to the content, structure and methodology used in the creation of the resources.

The criteria taken into account are those generally accepted as fundamental principles by all actors in the market.

A scale of points was defined in which marks can be added to or subtracted from a score, according to predetermined criteria. Each terminological resource is therefore given a score which enables it to be classified into one of eight quality categories.

A sample table, the corresponding scale of points and the list of eight quality categories defined are shown in this Appendix.

In order to delimit the survey's field of application, a list of 28 domains was drafted, covering the main areas of terminological activity (see 6.7).

  1. THE SAMPLE

General Information

The survey was carried out on over two hundred dictionaries, lexica, glossaries and other terminological resources available on the market. These resources were selected from among references available both internally within CL or on the CL international network and in various technical and scientific documentation centres and libraries. Our objective was to obtain a good standard sample of terminological resources, commonly available.

The first finding in the analysis is the great disparity between the domains selected in terms of the quantity of existing terminological resources. The volume of resources available was sometimes 10 to 20 times greater in one domain than in another.

In terms of the number of languages present, the distribution is as follows:

Type of resource        monolingual  bilingual  multilingual
Resource quantity (%)   20.8         41.6       37.6

  1. OVERVIEW OF THE RESULTS

An initial quality analysis in terms of the eight quality categories defined gives the following result:

Category                A    B    C    D     E     F     G    Z
Resource quantity (%)   0    1.4  8.1  25.9  33.6  19.7  8.5  2.8

This initial global analysis illustrates the generally poor quality of available resources. Only about one third of the resources studied (35.4%) fall in the four above-average categories (A to D), and only 1.4% attain the categories of good or excellent (A and B). No resource completely fulfils the established criteria (A = 0%).
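
The cumulative shares quoted above follow directly from the percentages in the category table. As a minimal check, the following Python sketch adds them up; the variable and function names are illustrative and not part of the survey.

```python
# Percentage of resources per quality category, as given in the table above.
shares = {"A": 0.0, "B": 1.4, "C": 8.1, "D": 25.9,
          "E": 33.6, "F": 19.7, "G": 8.5, "Z": 2.8}


def cumulative_share(table, categories):
    """Sum the shares of the given categories, rounded to one decimal place."""
    return round(sum(table[c] for c in categories), 1)


above_average = cumulative_share(shares, "ABCD")   # categories A to D
good_or_excellent = cumulative_share(shares, "AB")  # categories A and B
total = cumulative_share(shares, shares)            # all eight categories

if __name__ == "__main__":
    print(above_average, good_or_excellent, total)
```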

An analysis according to the number of languages present gives the following percentage breakdown:

Category                A    B    C     D     E     F     G     Z
Monolingual resource    0    2.3  18.6  41.9  20.9  16.3  0     0
Bilingual resource      0    2.3  3.5   23.2  32.5  37.2  13.9  5.8
Multilingual resource   0    0    7.8   18.2  42.8  22.1  7.8   1.2

The table illustrates that monolingual documents are of better quality than bilingual or multilingual ones. More than 60% of monolingual references fall into the above-average categories, whilst only about a quarter of the bilingual and multilingual resources reach this level.

These results also highlight the great disparity in quality between bilingual documents. A closer analysis reveals that this variation is due to the different methodologies used in the creation of resources: a lexicographical approach vs. a terminological approach.

The global lack of quality of multilingual resources, with three quarters falling below average, can be explained by the absence of a coherent methodology employed in the creation of resources.

  1. ANALYSIS BY CRITERION

Methodology

Few terminological resources provide explanation of the methodology used in their compilation. Only 22.4% contain a methodological presentation proper, excluding prefaces and other introductory remarks.

The very principle of analysis of "concepts", which is elementary in terminology, is very often missing in these resources. In almost 90% of cases there is no conceptual organisation of information, the majority simply being ordered alphabetically.

Organisation

The lack of conceptual organisation of information is often further highlighted by a systematic absence of domains referring back to a clearly defined terminological tree. Only 15% of the resources analysed indicate the domain for each concept present.

Furthermore, in cases where cross references between concepts within the resources were found to exist, their coherence is often poor. The survey reveals that approximately 40% of cross referencing is unsystematic and incoherent. This terminological incoherence is all the more serious when evaluated in conjunction with the methodological problems of lack of conceptual organisation of information indicated above.

Content

The content of terminological resources is generally insufficient or offers little information to aid in its ultimate validation. 33% of resources analysed contain no bibliographies, either for each concept or for the document as a whole.

Furthermore, although 60% of the resources were created by specialists in the domain, only 30% were actually validated by a group of specialists.

Finally, in 41% of the resources, the absence of a definition for certain entries increases their terminological weakness. Where the resources do contain definitions, however, over 50% are unclear or lack concision.

  1. EVALUATION TABLES

Dictionary Identification

Title
Author (s)
Editor / Town / Country
Publication year and number
ISBN number

Evaluation Table: Medium and Availability

Dictionary (1)
Paper / CD-ROM
Data base/ network access
Other medium (specify)
Copyright
Free distribution / Selling / Internal use
Free / Selling / Internal
Corresponding Eurodicautom collection(s) (2)

Evaluation Table: Languages

Languages                            VE  DF  CO  NT  RF
German
English
Danish
Spanish
Finnish
French
Greek
Italian
Dutch
Portuguese
Swedish
Regional language of the EU (1)
Other language outside the EU (1)

X = cover 100 %; O = partial cover

VE = vedette (headword); DF = definition; CO = context; NT = note; RF = reference

  • Simultaneous X in VE, DF, and X or O in NT, RF in each existing language = 5 points
  • Simultaneous X in VE, DF, and X or O in NT, in each existing language = 3 points
  • Simultaneous X in VE, DF, and X or O in NT, for at least one language = 1 point

(1) = specify language(s)

Evaluation table: the data

Criterion                                          Answer  Points
Presentation of the methodology                            YES: +2
Bibliography                                               YES: +2
Standardised terminology                                   YES: +1
Number of concepts
Presentation: conceptual organisation                      YES: +2
Presentation: alphabetical classification                  0
Mentioned (sub-)domains                                    YES: +2
1 entry = 1 concept                                        YES: +2
1 entry = 1 term                                           YES: -2
Cross reference coherence                                  YES: +2
Translated DF/NT                                           YES: -1
Original DF/NT                                             YES: +2
Cards without DF                                           YES: -2
Clear and concise DF/NT                                    YES: +1
Presence of terms not pertaining to the domain             YES: -2
Mentioned source language                                  YES: +1
Author = International organisation                        0
Author = National organisation                             0
Author = Public                                            0
Author = Private                                           0
Author = Domain organisation / expert                      YES: +2
Author = Other (specify)                                   0
Participation of / validation by various experts           YES: +2

Answer: YES, NO or ND (information not available)
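
The scale of points can be read as a simple additive scheme. The following Python fragment is a minimal sketch under stated assumptions, not the procedure actually used in the survey: the criterion keys and the function name data_score are hypothetical, criteria scored 0 in the table are omitted, and answers of NO or ND are assumed to contribute no points because the table only states values for YES. The points for language coverage (5, 3 or 1) defined in the languages table are not included here.

```python
from typing import Mapping

# Points awarded when the answer is YES (values from the table above;
# the key names are illustrative only).
POINTS_IF_YES = {
    "methodology_presented": 2,
    "bibliography": 2,
    "standardised_terminology": 1,
    "conceptual_organisation": 2,
    "sub_domains_mentioned": 2,
    "one_entry_one_concept": 2,
    "one_entry_one_term": -2,
    "cross_reference_coherence": 2,
    "translated_df_nt": -1,
    "original_df_nt": 2,
    "cards_without_df": -2,
    "clear_concise_df_nt": 1,
    "terms_outside_domain": -2,
    "source_language_mentioned": 1,
    "author_domain_expert": 2,
    "validated_by_experts": 2,
}

VALID_ANSWERS = {"YES", "NO", "ND"}


def data_score(answers: Mapping[str, str]) -> int:
    """Add up the points of all criteria answered YES.

    Assumption: NO and ND (information not available) score 0.
    """
    total = 0
    for criterion, answer in answers.items():
        if criterion not in POINTS_IF_YES:
            raise KeyError(f"unknown criterion: {criterion}")
        if answer not in VALID_ANSWERS:
            raise ValueError(f"answer must be YES, NO or ND, got {answer!r}")
        if answer == "YES":
            total += POINTS_IF_YES[criterion]
    return total


if __name__ == "__main__":
    print(data_score({"methodology_presented": "YES",
                      "bibliography": "YES",
                      "cards_without_df": "YES"}))
```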

Reliability: Calculation of the Points

Sum of the points        Category
equal or superior to 25  A
between 20 and 24        B
between 15 and 19        C
between 10 and 14        D
between 5 and 9          E
between 0 and 4          F
between -5 and -1        G
less than -6             Z
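
The classification can be sketched as a threshold lookup. The Python fragment below is illustrative only; the function name category_for_score is hypothetical. Note that the printed scale assigns scores from -5 to -1 to G and scores "less than -6" to Z, which leaves a score of exactly -6 unassigned; this sketch assumes, without confirmation from the source, that -6 also belongs to Z.

```python
# Lower bound of each category, from best to worst (values from the scale above).
LOWER_BOUNDS = [
    (25, "A"),
    (20, "B"),
    (15, "C"),
    (10, "D"),
    (5, "E"),
    (0, "F"),
    (-5, "G"),
]


def category_for_score(score: int) -> str:
    """Return the quality category (A-G or Z) for a total score.

    Assumption: every score below -5 is treated as Z, including -6,
    which the printed scale does not assign to any category.
    """
    for lower_bound, category in LOWER_BOUNDS:
        if score >= lower_bound:
            return category
    return "Z"


if __name__ == "__main__":
    for s in (26, 12, 0, -3, -8):
        print(s, category_for_score(s))
```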

  1. LIST OF THE DOMAINS AND THEIR ASSOCIATED CODES

Code      Domains
T.3.3.a   Agriculture & farm-produce: mechanisation and techniques
T.3.3.b   Agriculture & farm-produce: agricultural products
T.3.3.c   Agriculture & farm-produce: farm-produce industry
T.3.3.d   Insurances
T.3.3.e   Biology
T.3.3.f   Chemistry / toxic products
T.3.3.g   Accountancy, entitlements, balances, plans
T.3.3.h   Labour law (// WP T6.3)
T.3.3.i   Economy, finances and stock exchange
T.3.3.j   Electrical engineering
T.3.3.k   Energy: coal
T.3.3.l   Energy: electricity, nuclear energy
T.3.3.m   Energy: oil
T.3.3.n   Environment (// T6.2)
T.3.3.ñ   Hydrography / hydrology
T.3.3.o   Steel industry
T.3.3.p   Paper industry
T.3.3.q   Textile industry
T.3.3.r   Computer science (// T6.1)
T.3.3.s   Medicine
T.3.3.t   Social protection (// WP T6.3)
T.3.3.u   Telecommunication (// WP T6.1)
T.3.3.v   Air transport
T.3.3.w   Railway transport
T.3.3.x   Sea transport
T.3.3.y   Road transport
T.3.3.z   Transport: infrastructures
T.3.3.&   Fishing: mechanisation and techniques
T.3.3.#   Sociology
T.3.3.ç   General technical dictionary

  1. LIST OF THE RESOURCES ANALYSED

Code | Title | Author | Editor / Town / Country | Publ. year | ISBN number | Reliability
T.3.3.a | Diccionari de maquinària agrícola | Martí i Ferrer, Robert | Curial Edicions Catalanes, Barcelona, Spain | 1994 | 84-7256-893-8 | D
T.3.3.a | Diccionario técnico, técnica agrícola | Abd-El-Wahed & Kames, Klaus | Editorial Científico-Técnica, La Habana, Cuba | 1981 |  | F
T.3.3.a | Landmaschinen und Geräte | Steinmetz, H. | H. Steinmetz, Betzdorf/Sieg, Germany | 1982, 4th edition | ND | E
T.3.3.a | Machinisme & équipements agricoles | CEMAGREF | CEMAGREF-DICOVA + La Maison du Dictionnaire, Paris, France | 1990, 3rd edition | 2-85608-034-0 | E
T.3.3.a | Vocabulary of Forest Management | International Union of Forestry Research Organisations | IUFRO Secretariat, Vienna, Austria | 1990 | 3-7040-1055-3 | E
T.3.3.b | Dictionnaire d'histoire et de géographie agraires | Fénelon, Paul | Conseil International de la Langue Française, Paris, France | 1991, 2nd edition | 2-85319-210-5 | E
T.3.3.b | Plants and plant products of economic importance | FAO Terminology Bulletin, No. 25 | Food and Agriculture Organization of the United Nations (FAO) | 1983 | ND | D
T.3.3.b | Potato terms: trilingual dictionary of the potato | van Loon, C.D. & van der Heij, D.G. | Pudoc, Wageningen, The Netherlands | 1989 | 90-220-0962-9 | F
T.3.3.c | Dictionnaire des industries alimentaires | Clément, J.-M. | Masson, Paris, France | 1978 | 2-225 46 079-5 | D
T.3.3.c | Dictionnaire encyclopédique d'agrométéorologie | Parcevaux, S. de | Conseil international de la langue française, Paris, France | 1990 | 2-85319-218-0 | D
T.3.3.c | Weinbautechnik | Steinmetz, H. | H. Steinmetz, Betzdorf/Sieg, Germany | 1981 | ND | E
T.3.3.ç | Diccionario de electrónica y técnica nuclear | Markus, John | Marcombo, Boixareu Editores, Barcelona, Spain | 1984 | 84-267-0003-9 | E
T.3.3.ç | Diccionario de Electrónica, Informática y Centrales Nucleares | Mataix, Mariano | Marcombo, Boixareu Editores, Barcelona, Spain | 1978 | 84-267-0350-X | D
T.3.3.ç | Diccionario enciclopédico de términos técnicos en tres volúmenes | Collazo, Javier L. | McGraw-Hill Book Company | 1985, 3rd edition | 0-07-079172-4 | E
T.3.3.ç | Dictionnaire des industries | Joly, Hubert & others | Conseil International de la Langue Française, Paris, France | 1986 | 2-85319-158-3 | E
T.3.3.ç | Dictionnaire technique français-espagnol | Mink, H. | Editorial Herder S.A., Barcelona, Spain | 1989, 3rd edition | 84-254-1372-9 | G

T.3.3.ç | Dictionnaire technique général anglais-français | Belle-Isle, J.-Gérald | Beauchemin, Montreal, Canada | 1977, 2nd edition | 0-7750-0448-0 | E
T.3.3.ç | Dizionario enciclopedico scientifico e tecnico inglese italiano, italiano inglese |  | Nicola Zanichelli S.p.A., Bologna, Italy | 1985, 6th edition |  | D
T.3.3.ç | Energy Dictionary | World Energy Council (WEC) |  | 1992 | 2-909832-00-7 | C
T.3.3.ç | Glossaire Nouvelles techniques de transport | Bureau de terminologie, Division Traduction, Affaires générales, Commission des Communautés européennes | Bureau de terminologie, Division Traduction, Affaires générales, Commission des Communautés européennes, Brussels, Belgium | 1977, 2nd edition |  | D
T.3.3.ç | New polytechnic dictionary of Spanish and English Language, first volume English-Spanish | Beigbeder Atienza, Federico | Ediciones Díaz de Santos, S.A., Madrid, Spain | 1988 | 84-86251-72-9 | G
T.3.3.ç | New polytechnic dictionary of Spanish and English Language, second volume Spanish-English | Beigbeder Atienza, Federico | Ediciones Díaz de Santos, S.A., Madrid, Spain | 1988 | 84-86251-73-7 | G
T.3.3.ç | Technological Dictionary - Dictionnaire technologique - Technologisches Wörterbuch, vol. 1 | Feutry, Michel, Mertz de Mertzenfeld, Robert & Dollinger, Agnès | Maison du dictionnaire, Paris, France | 1976 | 2-85608-000-6 | G
T.3.3.ç | Vocabulario científico y técnico | Real Academia de ciencias exactas, físicas y naturales | Espasa Calpe S.A., Madrid, Spain | 1990, 2nd edition | 84-239-5987-2 | C
T.3.3.ç | Vocabulário Técnico | Buecken, Francisco J. | Edição Melhoramentos, São Paulo, Brazil | 1997, 5th edition | ND | G
T.3.3.d | Diccionario Mapfre de seguros | Castelo Matrán, Julio | Editorial Mapfre, S.A., Madrid, Spain | 1988 | 84-7100-170-5 | D
T.3.3.e | Dictionary of Biology | Hale, W.G. & Margham, J. P. | Collins, London, UK | 1988 | 0-00-434351-4 | F
T.3.3.e | Dictionnaire de génétique | Sournia, Jean-Charles | Conseil International de la Langue Française (CILF), Paris, France | 1991 | 2-85319-231-8 | E
T.3.3.e | Glossaire de biotechnologie | Commission des Communautés européennes (CCE), Unité de Terminologie | Elsevier applied science publishers, London, UK | 1990 | 1-85160-569-2 | D
T.3.3.e | Lexico Biologikon kai iatrikon oron, anglo-heleniko | Patargias, Z. A., Sekefis, K. E., Sekefi-Patargia, K. & Margaritis, L. X. | Lifzefs Bibliopoleio, Athens, Greece | 1992 |  | F

T.3.3.f | Compendium de terminologie chimique et lexique anglais-français | Richer, Jean-Claude | Groupe Communication Canada, Ottawa, Canada | 1993 | 0-660-94192-9 | E
T.3.3.f | Chemistry and chemical engineering: thematic and annotated picture dictionary | Hitzke, Joachim-Charles | Hitzke, Illkirch Graffenstaden, France | 1993 | 2-9501518-7-6 | E
T.3.3.f | Dictionary of chemical terminology in five languages: English, German, French, Polish and Russian | Kryt, Dobromila | Kryt & Wydawnictwa Naukowo-Techniczne, Warsaw, Poland | 1980 | 0-444-99788-1 | F
T.3.3.f | Terminologie des substances polluantes |  | Parlement Européen-Direction de la traduction et de la terminologie, Luxembourg, Luxembourg | 1984 | ND | E
T.3.3.g | Diccionario de derecho, economía y política, inglés-español, español-inglés | Lacasa Navarro, Ramón & Díaz de Bustamante, Isidro | Editorial Revista de Derecho Privado (EDERSA), Madrid, Spain | 1986, 2nd edition | 84-7130-306-X | G
T.3.3.g | Diccionario de términos jurídicos inglés-español, español-inglés | Alcaraz Varó, Enrique & Hughes, Brian | Editorial Ariel, S.A., Barcelona, Spain | 1992 | 84-344-1579-8 | D
T.3.3.g | Diccionario Jurídico Espasa |  | Espasa Calpe, S.A., Madrid, Spain | 1994 | 84-239-5988-0 | D
T.3.3.g | Diccionario terminológico de economía, comercio y derecho, inglés-español, español-inglés, en 17 vol. | Muñiz Castro, Emilio G. | Editorial Fontenebro, Collado Villalba (Madrid), Spain & Area Editorial S.A. / Expansión | 1992 | 84-87606-30-X | G
T.3.3.g | Dicionário de Contabilidade | Lopes de Sá, A. & Lopes de Sá, Ana M. | Editora Atlas, São Paulo, Brazil | 1983, 7th edition | ND | D
T.3.3.g | Dictionary of accounting | Collin, P.H. & Joliffe, Adrian | Peter Collin Publishing | 1992 | 0-948549-27-0 | F
T.3.3.g | Dictionary of German-English Accounting and Business Terms | Arthur Andersen & Co. GmbH | Fachverlag für Wirtschafts- und Steuerrecht Schäffer GmbH & Co. KG, Stuttgart, Germany | 1981, 2nd edition | ND | E
T.3.3.g | Dictionnaire comptable fiscal et financier | Saxcé, Frank de | Cabinet Saxcé, Paris, France | 1986, 1st edition | 2-9501213-0-6 | E
T.3.3.g | Dictionnaire des termes juridiques et commerciaux | Mouthier, Annie | Editions De Vecchi, Paris, France | 1991, 1st edition | 2-7328-0102-X | E
T.3.3.g | Dictionnaire fiduciaire fiscal | Villeguérin, Yves-Robert de La | La Villeguérin Editions, Paris, France | 1995, 10th edition | 2-86521-243-2 | C
T.3.3.g | Droit commercial | Bouilly, Michel | Hachette, Paris, France | 1992, 1st edition | 2-01-019027-0 | D

T.3.3.g | Glossaire Français/Russe de termes statistiques, Volume I Répertoires d'entreprises | Office statistique des Communautés européennes | Office des publications officielles des Communautés européennes, Luxembourg, Luxembourg | 1994, 1st edition | 92-826-9308-2 | E
T.3.3.g | Glossaire Français/Russe de termes statistiques, Volume III Comptabilité d'entreprise | Office statistique des Communautés européennes | Office des publications officielles des Communautés européennes, Luxembourg, Luxembourg | 1994, 1st edition | 92-826-9312-0 | E
T.3.3.g | Glossaire Français/Russe de termes statistiques, Volume IV Commerce extérieur | Office statistique des Communautés européennes | Office des publications officielles des Communautés européennes, Luxembourg, Luxembourg | 1994, 1st edition | 92-826-9314-7 | E
T.3.3.g | Glossaire Français/Russe de termes statistiques, Volume V Comptabilité nationale | Office statistique des Communautés européennes | Office des publications officielles des Communautés européennes, Luxembourg, Luxembourg | 1994, 1st edition | 92-826-9316-3 | E
T.3.3.g | Le nouveau plan comptable 1982 | Groupe Guy Gendrot, Experts comptables | Centre de Librairie et d'Editions techniques, Paris, France | 1983, 1st edition | 2-85354-526-1 | E
T.3.3.g | Lexique de comptabilité | Lassègue, Pierre | Editions Dalloz, Paris, France | 1993, 3rd edition | 2-247-01532-8 | D
T.3.3.g | Lexique de la fiscalité - Taxation Glossary | Bernard, Yolande | Centre d'édition du gouvernement du Canada, Approvisionnements et Services, Ottawa, Canada | 1990, 3rd edition | 0-660-55541-7 | E
T.3.3.g | Lexique fiscal | Barilari, André & Drapé, Robert | Editions Dalloz, Paris, France | 1992, 2nd edition | 2-247-01487-9 | D
T.3.3.g | Vocabulaire relatif à la déclaration de revenus | Perron, Madeleine | Les Publications du Québec, Québec, Canada | 1989, 2nd edition | 2-551-08259-5 | C
T.3.3.h | Dictionary of Employment Law | Selwyn, Norman | Butterworth & Co Ltd., London, UK | 1985 | 0-406-20790-0 | C
T.3.3.h | Dictionnaire de Droit social | Frankl, Friedrich | Max Hueber Verlag, Munich, Germany | 1970 | ND | F
T.3.3.h | Employment Terminology / Terminologie relative à l'emploi | ILO | ILO | ND |  | G
T.3.3.h | European Employment & Industrial Relations Glossary: Belgium | Blanpain, Roger | Sweet and Maxwell Limited, London, England / Office for Official Publications of the European Communities | 1992 | 0-421-44860-1 | D
T.3.3.h | European Employment & Industrial Relations Glossary: France | Lyon-Caen, Antoine | Sweet and Maxwell Limited, London, England / Office for Official Publications of the European Communities, Luxembourg | 1993 | 0-421-44870-9 | D

T.3.3.h | European Employment & Industrial Relations Glossary: Germany | Weiss, Manfred | Sweet and Maxwell Limited, London, England / Office for Official Publications of the European Communities, Luxembourg | 1992 | 0-421-44830-X | D
T.3.3.h | European Employment & Industrial Relations Glossary: Greece | Kravaritou, Yota | Sweet & Maxwell, London, UK | 1994 | 92-826-2607-5 | D
T.3.3.h | European Employment & Industrial Relations Glossary: Ireland | Von Prondzynski, F. & Richards, Wendy | Sweet and Maxwell, London, UK | 1994 | 0-421-44900-4 | D
T.3.3.h | European Employment & Industrial Relations Glossary: Italy | Treu, Tiziano | Sweet and Maxwell Limited, London, England / Office for Official Publications of the European Communities, Luxembourg | 1991 | 0-421-44820-0 | D
T.3.3.h | European Employment & Industrial Relations Glossary: Spain | Martín Valverde, Antonio | Sweet and Maxwell Limited, London, England / Office for Official Publications of the European Communities, Luxembourg | 1991 | 0-421-44840-7 | C
T.3.3.h | European Employment & Industrial Relations Glossary: United Kingdom | Terry, Michael & Dickens, Linda | Sweet and Maxwell Limited, London, England / Office for Official Publications of the European Communities, Luxembourg | 1991 | 0-421-44850-4 | D
T.3.3.h | Glossary of labour and the trade union movement | European Commission | European Commission, Brussels, Belgium | 1983 | ND | F
T.3.3.h | Handlexikon Arbeitsbeziehungen in der Bundesrepublik Deutschland | Weiss, Manfred | Amt für Amtliche Veröffentlichungen der Europäischen Gemeinschaften, Luxembourg / Walhalla Fachverlag, Regensburg, Germany | 1993 | 3-8029-7430-1 | D
T.3.3.h | Sozialrecht und Arbeitsschutz | Raschke, Ulrich | Erich Schmidt Verlag, Berlin, Germany | 1987 | 3-503-02654-1 | F
T.3.3.h | Vocabulaire des conventions collectives | Pétrin, Hélène | Les Publications du Québec, Québec, Canada | 1991 | 2-551-14810-3 | D
T.3.3.i | A Economia e o Economês | Macedo, Marcos Letayf | Editora Lemi, Belo Horizonte, Brazil | 1981, 2nd edition | ND | D
T.3.3.i | Bankfachwörterbuch der SBG | Schweizerische Bankgesellschaft | Schweizerische Bankgesellschaft, Switzerland | 1986, 1st edition | ND | E

T.3.3.i | Bankfachwörterbuch der SBG | Schweizerische Bankgesellschaft | SBG, Zurich, Switzerland | 1987 | ND | E
T.3.3.i | Diccionario bilingüe de economía y empresa, inglés-español, español-inglés | Lozano Irueste, José Mª | Ediciones Pirámide, S.A., Madrid, Spain | 1989 | 84-368-0456-2 | E
T.3.3.i | Diccionario de Economía español-inglés en ocho pequeños tomos | Tamames, Ramón | Alianza Editorial, S.A. / Cinco Días, Madrid, Spain | 1992 | 84-206-6313-1 | E
T.3.3.i | Diccionario empresarial Stanford, inglés-francés-español | Álvarez Carmona, Isabel, Barallat López, Luis, Elosúa De Juan, Marcelino, Hernando García, Paloma & Huerta de Soto, Jesús | LID, Editorial Empresarial, S.L., Madrid, Spain | 1993, 6th edition | 84-88717-02-4 | F
T.3.3.i | Dicionário Bancário Português-Inglês | Correia da Cunha, A. | Publicações Europa-América, Mem Martins, Portugal | 1984 |  | Z
T.3.3.i | Dicionário de Economia | Cotta, Alain | Publicações Dom Quixote | 1978, 4th edition | ND | E
T.3.3.i | Dictionnaire commercial et financier, français-anglais-russe | Gavrichina, K. S., Sazonov, M. A. & Gavrichina, I. N. | VIKRA, Moscow, Russia | 1993 | 5-900455-55-6 | G
T.3.3.i | Dictionnaire d'économie et sciences sociales | Capul, Jean-Yves & Garnier, Olivier | Hatier, Paris, France | 1994, 2nd edition | 2-218-05936-3 | D
T.3.3.i | Dictionnaire économique de l'anglais et du français, Volume I, Le système bancaire | Banque de France | Economica, Paris, France | 1992, 1st edition | 2-7178-2279-8 | B
T.3.3.i | Dictionnaire économique de l'anglais et du français, Volume II, Crédit, Taux d'intérêt | Banque de France | Economica, Paris, France | 1992, 1st edition | 2-7178-2358-1 | B
T.3.3.i | Dictionnaire économique et social | Brémond, Janine & Gédélan, Alain | Hatier, Paris, France | 1990, 5th edition | 2-218-02591-4 | C
T.3.3.i | Dictionnaire fiduciaire financier | Villeguérin, Erik de La | La Villeguérin Editions, Paris, France | 1991, 2nd edition | 2-86521-156-8 | D
T.3.3.i | English-Russian Dictionary of Economics and Finance | Anikin, Andrei V. | The School of Economics Press, St. Petersburg, Russia | 1993 | 5-900428-05-2 | F
T.3.3.i | Glossaire fiscal | Munter, M., Bauduin, C. & Hublart, C. | European Commission, Brussels, Belgium | 1983 | ND | D
T.3.3.i | Harrap's Dictionary of Business & Finance |  | Harrap Books Ltd, Bromley, Kent, England | 1990 | 0-245-60060-4 | E

T.3.3.i | Harrap's Glossary of Commercial & Industrial Terms, Spanish-English, English-Spanish |  | Harrap Books Ltd, Bromley, Kent, England | 1990 | 0-245-60018-3 | G
T.3.3.i | International Tax Glossary | The International Bureau of Fiscal Documentation | IBFD Publications BV, Amsterdam, The Netherlands | 1992, 2nd edition | 90-70125-60-9 | E
T.3.3.i | Lamont's Glossary, A Guide for Investors | Lamont, Barclay | Lamont & Partners Ltd., London, UK | 1988, 3rd edition | ND | E
T.3.3.i | Le dictionnaire technique de la bourse et des marchés financiers | Villeneuve, Jeanne France de | Soficom Editions, Paris, France | 1993, 1st edition | 2-9504950-2-8 | D
T.3.3.i | Le Lexique bilingue d'analyse financière | Chavanne, Philippe & Van Oordt, Hendrick | Editions Accent-International, Paris, France | 1993, 1st edition | 2-907816-09-8 | E
T.3.3.i | Lessico bancario UBS | Unione di Banche Svizzere | UBS, Zurich, Switzerland | 1984 | ND | E
T.3.3.i | Lexique bancaire UBS | Union de Banques Suisses | Union de Banques Suisses, Switzerland | 1988, 1st edition | ND | E
T.3.3.i | Lexique commercial | CIDA | La Maison du Dictionnaire, Paris, France | 1973 | 2-85273-001-4 | E
T.3.3.i | Lexique de banque et de bourse | Sousi-Roubi, Blanche | Editions Dalloz, Paris, France | 1990, 3rd edition | 2-247-01092-X | D
T.3.3.i | Lexique multilingue des affaires | Gibb, John B. | La Maison du Dictionnaire, Paris, France | 1994, 1st edition | 2-85608-061-8 | F
T.3.3.i | Longman Dictionary of Business English | Adam, J.H. | York Press, Essex, UK | 1982 | 0-582-55552-3 | D
T.3.3.i | Management Dictionary | Sommer, Werner & Schoenfeld, Hanns-Martin | Walter de Gruyter, Berlin, Germany | 1978, 4th edition | 3-11-004863-9 | F
T.3.3.i | Pons Fachwörterbuch Wirtschaft | Collin, P.H., Janssen, S., Kornmüller, A. & Livesey, R. | Ernst Klett Verlag für Wissen und Bildung, Stuttgart - Dresden, Germany | 1993, 2nd edition | 3-12-517930-0 | E
T.3.3.i | Reuter Glossary: International Economic & Financial Terms | REUTER | Longman, London, UK | 1989 | 0-582-04286-0 | D
T.3.3.i | Tacis Dictionary of Economic and Management Terms | European Commission | European Commission, Brussels, Belgium | 1994, 1st edition | 84-88361-01-7 | C
T.3.3.i | The Penguin Business Dictionary | Greener, Michael | Penguin Books, London, UK | 1987 | 0-14-051214-4 | E

T.3.3.i | The Penguin Dictionary of Economics | Bannock, Graham, Baxter, R.E. & Davis, Evans | Penguin Books, London, UK | 1992, 5th edition | 0-14-051255-1 | E
T.3.3.i | The Penguin Dictionary of Finance | Bannock, Graham & Manser, William | Penguin Books, London, UK | 1989 | 0-14-051195-4 | E
T.3.3.i | UBS Dictionary of Banking and Finance | Union Bank of Switzerland | Union Bank of Switzerland, Switzerland | 1985, 1st edition | ND | E
T.3.3.i | Wirtschaftsenglisch Wörterbuch | van Bernem, Theodor | R. Oldenbourg Verlag GmbH, Munich, Germany | 1994, 3rd edition | 3-486-22829-3 | F
T.3.3.j | Dictionnaire anglais-français des termes relatifs à l'électronique, l'électrotechnique, l'informatique et aux applications connexes | Piraux, Henri | Eyrolles, Paris, France | 1989, 16th edition | ND | D
T.3.3.j | Dictionnaire d'électronique et d'électrotechnique allemand-français et français-allemand | Schroeder, Wolfgang & Kind, F.W. | Maison du dictionnaire, Paris, France | 1985, 2nd edition | 2-85608-020-0 | F
T.3.3.j | Dictionnaire des composants électroniques | Commission ministérielle de terminologie des composants électroniques | Dunod, Paris, France | 1994 | 2 10 00 1293 2 | D
T.3.3.j | Dictionnaire encyclopédique d'électronique | Fleutry, Michel | La Maison du Dictionnaire, Paris, France | 1991 | 2-85608-043-X | D
T.3.3.l | A Dictionary of Nuclear Power and Waste Management with Abbreviations and Acronyms | Lau, Foo-Sun | Research studies press Ltd, London, UK | 1987 | 0-86380-051-3 | D
T.3.3.l | Diccionario Nuclear | Alonso Santos, Agustín, Barrachina Gómez, Miguel, Caro Manso, Rafael, Cerrolazo Asenjo, José Angel, Granados González, Carlos Enrique, López Rodríguez, Manuel, Palacios Sunico, Luis & Pedro Herrera, Francisco de | Publicaciones Científicas de la Junta de Energía Nuclear, Madrid, Spain | 1979 | 84-500-3077-3 | D
T.3.3.l | Réacteurs à eau pressurisée - Ilots nucléaires. Lexique français-anglais et anglais-français | Framatome | AFNOR, Paris-La Défense, France | 1989 | 2-12-376211-3 | F

T.3.3.m | Dictionnaire de l'offshore pétrole et gaz | Whitehead, Harry | SCM Publications, Neuilly sur Seine, France | 1976 | 2-901133-04-5 | E
T.3.3.m | Dictionnaire du forage et des puits - Dictionary of Drilling and Boreholes | Moureau, Magdeleine & Brace, Gerald - Institut français du pétrole | Editions Technip, Paris, France | 1990 | 2-7108-0592-8 | G
T.3.3.m | Dictionnaire du pétrole | Barbier, Yves | Editions SCM, Paris-La-Défense, France | 1980 |  | E
T.3.3.m | Dictionnaire technique du pétrole - Dictionary of Petroleum Technology | Moureau, Magdeleine & Brace, Gerald - Institut français du pétrole | Editions Technip, Paris, France | 1979, 2nd edition | 2-7108-0361-5 | G
T.3.3.m | Oil and Gas Field Dictionary in 8 Languages | Chaballe, Masuy, Vandenberghe, Salem & Malekian, F. | Farhangan Publications, Tehran, Iran | 1990 | ND | Z
T.3.3.m | Onshore/Offshore Oil and Gas Multilingual Glossary | Commission of the European Communities (CEC) | Graham & Trotman Ltd, London, UK | 1979 | 0-86010-184-3 | E
T.3.3.m | Petroleum Terms | Zubini, Fabio | Edizioni Italo Svevo, Trieste, Italy | 1992 |  | D
T.3.3.n | Dictionary of Dangerous Pollutants, Ecology, and Environment | Tver, David F. | Industrial Press Inc., New York, United States | 1981, 1st edition | 0-8311-1060-0 | F
T.3.3.n | Dictionnaire de l'environnement : les termes normalisés |  | AFNOR, Paris-La Défense, France | 1994 | 2-12-473012-6 | D
T.3.3.n | Dictionnaire de l'environnement avec index anglais-français | Conseil International de la Langue Française (CILF) & Institut d'Etudes Internationales de la Communication sur l'Environnement (COMUVIR) | CILF, Paris, France | 1992, 3rd edition | 2-85319-243-1 | F
T.3.3.n | Environment: Reverse Index (French-English) to the English-French Glossary, TERM/40 | Languages Service, Terminology and Technical Documentation Section, UNO | United Nations Office (UNO), Geneva, Switzerland | 1991 | ND | G
T.3.3.n | NSCA Environmental Glossary | National Society for Clean Air (NSCA), Dunmore, Jane & Gilbert, P.M. | Jane Dunmore, Brighton, United Kingdom | 1985 | 0 903474 29 8 | E

T.3.3.n | The Environmental Dictionary and Regulatory Cross-reference | King, James J. | Wiley & Sons, New York, USA | 1995, 3rd edition | 0-471-11995-4 | E
T.3.3.n | Vocabulaire d'écologie | Daguet, Philippe, Godron, Michel, David, P. & Riso, J. | Agence de coopération culturelle et technique (ACCT), Conseil international de la langue française (CILF) et CNRS-Centre d'études phytosociologiques et écologiques, Paris, France | 1979 | 2-85319-063-3 | D
T.3.3.n | Vocabulaire de l'environnement | Ternisien, Jean A. & others | Conseil international de la langue française (CILF), Paris, France | 1976 | 2-85319-025-0 | E
T.3.3.ñ | Dictionnaire français d'hydrologie de surface avec équivalents en anglais, espagnol et allemand | Roche, Marcel F. | Masson, Paris, France | 1986 | 2-225-80739-6 | D
T.3.3.ñ | International Glossary of Hydrology | United Nations Educational, Scientific and Cultural Organization (UNESCO) & World Meteorological Organization (WMO) | WMO-OMM-BMO, Geneva, Switzerland | 1992, 2nd edition | 92-3-002745-6 | D
T.3.3.o | Glossaire des normes de l'acier | Commission des Communautés européennes (CCE), Bureau de Terminologie | CCE, Brussels, Belgium | 1978 | 92-825-0833-1 | D
T.3.3.o | Quelques mots sur l'acier... Lexique à l'usage des utilisateurs | Orlandi, M.-C. | Editions de Physique, Les Ulis, France | ND |  | E
T.3.3.o | Stahleisen Wörterbuch - Dictionnaire fer et acier | Verein Deutscher Eisenhüttenleute & Centre de recherches métallurgiques | Verlag Stahleisen mbH, Düsseldorf, Germany | 1991, 3rd edition | 3-514-00336-X | F
T.3.3.p | Diccionario terminológico iberoamericano de celulosa, papel, cartón, y sus derivados | Asenjo, José Luis, Barbadillo, Pedro & Glez. Monfort, Pilar | Instituto Papelero Español, Madrid, Spain | 1992 | 84-604-3169-X | D
T.3.3.p | Dictionnaire papetier français-anglais-allemand-espagnol |  | La Papeterie, Paris, France | 1966 |  | F
T.3.3.p | Dictionnaire technique du papier et des encres | Faudouas, Jean-Claude | Eyrolles, Paris, France | 1991 |  | E

T.3.3.p | Vocabulaire des papiers et des cartons anglais-français | Côte, Normand | Office de la langue française, Québec, Canada | 1983 | 2-551-04426-X | E
T.3.3.q | A Small Dictionary of Textile Terms | Barnett, June W. |  | 1987 | 0-9624960-0-6 | E
T.3.3.q | Diccionari de la indústria tèxtil | Cervera i Caminal, Anna, Mumbrú i Laporta, Josep, Pont i Puntigam, M. Rosa & Taló i Rovira, Joan | Edicions de la Universitat Politècnica de Catalunya, Barcelona, Spain | 1994 | 84-7653-468-X | C
T.3.3.q | Diccionario terminológico textil - Textile Glossary - Lexique textile | Hirsch, Pierre & Estany, Manuel | Estany, Barcelona, Spain | 1980 | 84-300-2484-0 | F
T.3.3.q | Lexique de l'industrie textile | Institut Textile de France | Office de la langue française, Québec, Canada | 1974 | ND | F
T.3.3.r | Diccionario comentado de terminología informática | Aguado de Cea, Guadalupe | Editorial Paraninfo, Madrid, Spain | 1994 | 84-283-2060-9 | C
T.3.3.r | Diccionario de computación, inglés-español, español-inglés | Freedman, Alan | McGraw-Hill/Interamericana de España, S.A., Aravaca (Madrid), Spain | 1993, 5th edition | 84-481-0028-X | E
T.3.3.r | Diccionario de informática y tecnologías afines, inglés-español | Ginguay, Michel | Masson, S.A., Barcelona, Spain | 1985, 2nd edition | 84-311-0375-2 | G
T.3.3.r | Diccionario de informática, inglés-español-francés, español-inglés-francés, francés-inglés-español | Nania, Georges A. | Paraninfo, S.A., Madrid, Spain | 1985 | 84-283-1413-6 | F
T.3.3.r | Diccionario de términos de proceso de datos | Salto Dolla, Angel | Paraninfo, Madrid, Spain | 1971 |  | F
T.3.3.r | Diccionario Oxford de informática, inglés-español, español-inglés | Mendizábal Allende, Blanca de | Ediciones Díaz de Santos, S.A., Madrid, Spain | 1985, 1st edition | 84-86251-28-1 | E
T.3.3.r | Dictionary of computer science, English-Italian-German, Italian-English, German-English | Vollnhals, Otto | Gruppo Editoriale Jackson s.r.l., printed by S.p.A. Alberto Matarelli, Milan, Italy | 1982 |  | F
T.3.3.r | Dictionnaire anglais-français d'informatique - Bureautique - Télématique - Micro-informatique | Ginguay, Michel | Masson, Paris, France | 1992, 11th edition | 2-225-82776-1 | Z
T.3.3.r | Dictionnaire bilingue d'informatique - anglais / français français / anglais | Wiard, Alain & Virgatchik, Ilya | Nouvelles éditions Marabout, Alleur, Belgium | 1985 | 2-501-00689-5 | Z
T.3.3.r | Dictionnaire de la micro-informatique - français anglais | FRANTERM | Nathan, Paris, France | 1984 | 2-904994-05-X | D

T.3.3.r | Dictionnaire français-anglais d'informatique - Bureautique - Télématique - Micro-informatique | Ginguay, Michel | Masson, Paris, France | 1993, 6th edition | 2-225-84035-0 | Z
T.3.3.r | Harrap's Informatique Dictionnaire | Camille, Claude & Dehaine, Michel | Harrap, London, UK | 1985, 3rd edition | 0 245-541195-0 | Z
T.3.3.r | Illustrated Dictionary of Robotics | Paley, Sergey M. | La Maison du Dictionnaire, Paris, France | 1993 | 2-85608-052-9 | E
T.3.3.r | MacMillan Dictionary of Data Communications | Sippl, Charles J. | MacMillan Press, London, UK | 1985, 2nd edition | 0-333-37083-X | F
T.3.3.r | The GUI Guide - International Terminology for the Windows Interface | Microsoft | Microsoft Press, Redmond (Washington), United States | 1993 | 1-55615-538-7 | E
T.3.3.s | Diccionario de medicina | Ruiz Torres, F. | Editorial Alhambra S.A., Madrid, Spain | 1959, 1st edition |  | F
T.3.3.s | Diccionario enciclopédico ilustrado de medicina | Dorland | Interamérica, Mc Graw Hill, Madrid, Spain | 1984, 26th edition | 84-7605-412-2 | D
T.3.3.s | Diccionario médico europeo |  | Grass Ediciones - S.A., Barcelona, Spain | 1991 | 84-7714-039-1 | G
T.3.3.s | Diccionario terminológico de las ciencias médicas | Navarro - Beltrán Iracet, Estanislao | Salvat Editores S.A., Barcelona, Spain | 1984, 12th edition | 84-345-1401-X | F
T.3.3.s | Dictionnaire anglais-français des termes de médecine - English-French dictionary of medical terms | Delamarre, Jean & Delamarre-Riche, Thérèse | Maloine, Paris, France | 1992, 3rd edition | 2-224-02049-X | G
T.3.3.s | Dictionnaire français de médecine et de biologie | Manuila, A., Manuila, L., Nicole, M. & Lambert, H. | Masson et Cie, Paris, France | 1970 | ND | D
T.3.3.s | Elsevier's Encyclopaedic Dictionary of Medicine - Part A: General Medicine, in Five Languages | Dorian, Angelo Francis | Elsevier science publishers B.V., Amsterdam, The Netherlands | 1987 | 0-444-42823-2 | E
T.3.3.s | Medical Dictionary | Veillon & Nobel | Editorial Científico - Medicina, Madrid, Spain | 1964, 4th edition |  | F
T.3.3.s | Reallexikon der Medizin | Urban & Schwarzenberg Verlag | Urban & Schwarzenberg Verlag, Munich, Germany | 1966 | ND | D
T.3.3.t | Dictionnaire des personnes âgées, de la retraite et du vieillissement | Sournia, Jean-Charles | Franterm, Paris, France | 1984 | 2-904994-04-1 | C

T.3.3.t | Dictionnaire fiduciaire social | Signoretto, Fabrice, Gervais, Isabelle, Desset, Claude & Fricotté, Lisiane | La Villeguérin Editions, Paris, France | 1992, 11th edition | 2-86521-191-6 | E
T.3.3.t | Employment Promotion and Social Security / Promotion de l'emploi et sécurité sociale | ILO | ILO, Geneva, Switzerland | 1988 |  | F
T.3.3.t | Lexique de la protection sociale | Beau, Pascal & Beau, Roger | Dalloz, Paris, France | 1986 | 2-247-00683-3 | D
T.3.3.t | Pension Fund terms / Terminologie de la caisse des pensions | ILO | ILO, Geneva, Switzerland | 1988 |  | F
T.3.3.t | Services sociaux et services de santé / Health and Social Services | Thibault, Monique | Fédération canadienne des municipalités / Bureau des traductions, Ottawa, Canada | 1981 | 0-660-50812-5 | E
T.3.3.t | Social Security Terminology / Terminologie relative à la sécurité sociale | ILO | ILO, Geneva, Switzerland | 1993 |  | F
T.3.3.t | Terminologie de la charte sociale européenne / Terminology of the European Social Charter | Bureau de terminologie du Parlement européen (BTP) | Bureau de terminologie du Parlement européen, Luxembourg | 1982 |  | F
T.3.3.t | Terminologie de la Sécurité sociale | European Parliament | European Parliament, Luxembourg | 1974, 2nd edition | ND | E
T.3.3.t | Terminologie de la sécurité sociale / Terminology of Social Security | Bureau de terminologie du Parlement européen (BTP) | Bureau de terminologie du Parlement européen, Luxembourg | 1986 |  | F
T.3.3.t | Vocabulaire de la retraite | Fortin, Jean-Marie | Les Publications du Québec, Québec, Canada | 1993 | 2-551-15249-6 | D
T.3.3.t | Vocabulaire des assurances sociales | Desrosier, Georges & Boulay, Jacques | L'Editeur officiel du Québec, Québec, Canada | 1980, 2nd edition | 0-7754-2274-6 | E
T.3.3.t | Vocabulaire des pensions / Pensions Vocabulary | Collier, Linda | Centre d'édition du gouvernement du Canada, Ottawa, Canada | 1990 | 0-660-55784-3 | F
T.3.3.u | Ausgewählte Stichworte zur Telekommunikation | Ascom Hasler AG | Sauerländer AG, Aarau, Switzerland | 1988 | 3-7941-3131-2 | E
T.3.3.u | Dictionnaire anglais-français des télécommunications | Luca, Johanne de | Masson, Paris, France | 1998 | 2-225-81063-X | Z
T.3.3.u | Dictionnaire de l'OSI français-anglais | Association française de normalisation (AFNOR) | Association française de normalisation (AFNOR), Paris-La Défense, France | 1995 | 2-12-481611-X | D

T.3.3.u | Dictionnaire de télédétection aérospatiale | Paul, Serge, Alouges, Aimé, Bonneval, Henri & Pontier, Louis | Masson, Paris, France | 1982 | 2-225-75889-1 | D
T.3.3.u | Dictionnaire des télécommunications | Odier, Antoine & Zenneki, Mohamed | Nouvelles éditions Marabout, Alleur, Belgium | 1992 | 2-501-01674-2 | E
T.3.3.u | Dictionnaire du multimédia / audiovisuel - informatique - télécommunications | Notaise, Jacques, Barda, Jean & Dusanter, Olivier | Association française de normalisation (AFNOR), Paris-La Défense, France | 1995 | 2-12-465007-6 | D
T.3.3.u | ERITERM A Five-Language Telecommunications Dictionary | Ericsson Telecom AB | Ericsson Telecom AB, Stockholm, Sweden | 1992 | 91-630-0488-7 | G
T.3.3.u | International Dictionary of Telecommunications | Langley, Graham | Pitman Publishing Ltd, London, England | 1986, 2nd edition | 0-917845-04-8 | E
T.3.3.u | Telecommunications Dictionary | Rehann, Jens Peter & EL.ET.O | Glossima, Thessaloniki, Greece | 1995 | 960-7615-00-X | G
T.3.3.v | Comprehensive Aeronautic and Space Encyclopedia | Velasco Sales, José |  | 1993 |  | E
T.3.3.v | Diccionario aeronáutico civil y militar | Velasco Sales, José | Editorial Paraninfo, S.A., Madrid, Spain | 1994 | 84-283-2078-0 | G
T.3.3.v | Dictionnaire du transport aérien | Cambournac, Pascal | Presses de l'institut du transport aérien (ITA) | 1993 | 2-908537-08-7 | D
T.3.3.v | ICAO Lexicon | International Civil Aviation Organization | International Civil Aviation Organization, Montreal, Canada | 1980, 5th edition |  | E
T.3.3.v | Lexicon of terms used in connexion with international civil aviation | International Civil Aviation Organization | International Civil Aviation Organization, Montreal, Canada | 1971, 3rd edition |  | E
T.3.3.v | Multilingual Aeronautical Dictionary/Dictionnaire aéronautique multilingue | Advisory Group for Aerospace Research and Development (AGARD) | Technical Editing and Reproduction Ltd., London, England | 1980 | 92-835-01666-7 | C
T.3.3.w | Eisenbahn: Technik-Wörterbuch English, Deutsch, Französisch, Russisch | Dannehl, Adolf | Brandstetter, Wiesbaden, Germany | 1983 | 3-87097-119-3 | E
T.3.3.w | Lexique général des termes ferroviaires | Union internationale des chemins de fer | Gasser AG, Chur, Switzerland | 1988, 4th edition | ND | F
T.3.3.x | Dictionnaire maritime thématique anglais et français | Bruno, André & Mouilleron-Bécar, Claude | Masson, Paris, France | 1991 | 2-225-82098-8 | E
T.3.3.x | Dictionnaire technique de la marine anglais-français et français-anglais | Dobenik, Richard H. & Hartline, Gregory W. | Maison du dictionnaire, Paris, France | 1989 | 2-85608-031-6 | F

T.3.3.x | Elsevier's Maritime Dictionary in Three Languages: English, French and Arabic | Bakr, M. | Elsevier science publishers B.V., Amsterdam, The Netherlands | 1987 | 0-444-42737-6 | G
T.3.3.y | Dictionary for Automotive Engineering | Coster, Jean de | Saur, Munich, Federal Republic of Germany | 1982 | 3-598-10430-8 | E
T.3.3.y | Dictionary of Public Transport | N.D. Lea Transportation Research Corporation | Alba Buchverlag GmbH & Co KG, Düsseldorf, Germany | 1981 | 3-87094-783-7 | E
T.3.3.y | Dictionnaire de l'automobile | Editions techniques pour l'automobile et l'industrie | ETAI, France | 1982 |  | D
T.3.3.y | Dictionnaire illustré, Technique automobile | Blok, Cz. & Jezewski, W. | ETAI, France | 1981, 2nd edition | 2-7268-8005-3 | E
T.3.3.z | Diccionari de carreteres | Generalitat de Catalunya, Departament de Política Territorial i Obres Públiques | Generalitat de Catalunya, Barcelona, Spain | 1991, 1st edition | 84-393-1870-3 | C
T.3.3.z | Diccionario Técnico Vial de la AIPCR | Asociación Española Permanente de los Congresos de Carreteras (AEPCC) | AEPCC, Madrid, Spain | 1984 | 84-398-2430-0 | D
T.3.3.z | Dizionario Tecnico Stradale | AIPCR | Abete Grafica S.p.A., Rome, Italy | 1982, 5th edition | ND | E
T.3.3.z | Les Routes | FRANTERM | Franterm, Paris, France | 1984 |  | C
T.3.3.z | Lexique AIPCR des techniques de la route et de la circulation routière | Machu, C. | AIPCR, Paris, France | 1991 | ND | F
T.3.3.z | Straßentechnisches Wörterbuch | AIPCR | A&C Welchert GmbH, Detmold, Germany | 1982, 5th edition | ND | F
T.3.3.z | Technical Dictionary of Road Terms | PIARC | Dorel, Paris, France | 1982 | ND | E
T.3.3.# | Diccionari de sociologia | TERMCAT, Centre de terminologia | Fundació Barcelona, Spain | 1992 | 84-88169-03-5 | C
T.3.3.# | Dictionnaire des termes de la sociologie | Hermans, Ad | Marabout, Alleur, Belgium | 1991 | 2-501-01540-1 | E
T.3.3.& | Dictionnaire des engins de pêche | George, Jean-Paul & Nédélec, Claude | IFREMER / Ouest-France, Rennes, France | 1991 | 2-7373-0838-0 | E

T.3.3.& | Fishing gear Glossarium | Specialized service "Terminology and computer applications", CEC, Luxembourg | Office des publications officielles des Communautés européennes, Brussels, Belgium - Luxembourg, Luxembourg | 1987 | 92-825-6941-1 | D
T.3.3.& | Glossary of inland fishery terms - Glossaire des termes utilisés dans le domaine des pêches intérieures | Leopold, M. | Département des Pêches de la FAO | 1978 | ND | E
T.3.3.& | Multilingual dictionary of fishing vessels and safety on board | Terminology Unit of the Translation Service, CEC, Luxembourg | Office for Official Publications of the European Communities, Brussels - Luxembourg | 1992, 2nd edition | 92-825-9786-5 | D

  1. THE POINTER SURVEY QUESTIONNAIRE


POINTER


(Proposals for an Operational Infrastructure for Terminology in Europe)

Interviewee


Enterprise name (where appropriate):

Contact person:

Function :

Address :

Post code :

City :

Country :

Telephone :

Fax :

E-mail :

Field of activity :

Current total staff :

Turnover (gross sales) in 1994 (please state currency) :

Does your organisation perform any terminology work?


Yes  No 

IF SO :


Type of work already produced/in progress/planned:

Number and position of permanent/ temporary staff involved in terminology :

Please describe the steps in your terminology creation/documentation process:

How do you go about checking and validating your terminology, and who does this ?

Please indicate if you do terminology work in :

Fr 
En 
Es 
It 
De 
Pt 
Ne 
Sv 
No 
Da 
El 
Ch 
Ru 
Ar 

Other :

Is your terminology designed for a particular application (please supply term record)?

What software and hardware do you use?

What sort of terminology produced by other organisations do you use, and how do you obtain it?

How has your terminology work changed/evolved over the past five years ?

Please give details of any terminology needs not being met

Are you interested in joining a terminology exchange network ?

Conditions of exchange for your terminology:

Exchange  Sale  Free distribution 

Further details of conditions :

IF NOT :


Why not? (Please list all types of constraints, e.g. internal and external, technical, organisational, monetary...)

Do you authorise us to store and reuse this information for this survey ? (The information will not be used for commercial purposes)

Yes  No 
  1. THE POINTER HOME PAGE AND TERM BAZAAR

  1. INTRODUCTION

Each specialised area of human activity, including science, technology, entertainment, finance and commerce, government and administration, religion and belief, is characterised by the idiosyncratic manner in which human languages are adapted for the purposes of that activity. For instance, extant words and neologisms are used to name the objects and processes specific to the activity. The specialist community also uses grammatical constructs in ways in which they are not used in everyday discourse. The focus of our discussion, however, will be on the specialist words or, to be precise, terms.

Specialist terminologies have a strong diachronic bias, in that it is essential to know whether a term is a neologism, an established and currently used term, or an obsolete term. Here one can, perhaps, discern a life cycle: from inception and birth (coinage), through growth (currency) and maturity (verification and validation), to death (obsolescence). Standardisation depends crucially on the continued coinage of terms, and has a symbiotic relationship with the attempts of a term to establish itself. It is likewise linked symbiotically with efforts to verify and validate a term or indeed a collection of terms. (Symbiotic relationships are shown as double-headed arrows in the figure below.)

Figure 8 / 1 : The life cycle of a term

Between the stages of coinage and currency, there is the additional phase of the dissemination of the term. If the term is not effectively disseminated, and thus acknowledged within the specialist community, then it cannot be labelled as a term.

The dissemination of terminology is therefore a crucial stage in the development, or life cycle, of a term. Traditionally, terminology has been disseminated through specialist (paper-based) dictionaries, manuals, handbooks and glossaries and, increasingly, through locally available terminology databases. Each of these outlets is limited in the amount of information it can carry, in the size of its target audience and in the ease with which the dictionaries or term bases can be accessed.

Internet and the Communications 'Revolution'

What is the Internet?

The Internet originated in the late 1960s from the idea of interconnecting four computers to ensure that, at any given time, at least one of the four systems was operating. This network of computers grew steadily throughout the 1970s and 1980s, and has expanded explosively in the last few years with the dawn of the information and communications revolution. As there is no governing body which rules the Internet, nor any central computer which controls it, there is no real way of knowing how many computers, or people, are connected to it, but recent estimates suggest that over 30 million people worldwide use the Internet in one way or another.

Although, strictly speaking, the Internet is the set of computers linked to the network, its real interest and usefulness lie in the information which flows across it.

The Internet offers various facilities to its users, including document search and electronic mail, and individuals have developed programs to harness Internet resources. These programs include Gopher (a text-based menu system for accessing text files and other information resources), Telnet (a text-based system enabling users to log into remote computers), File Transfer Protocol (FTP, a method for transferring files from a remote computer to a local computer), Internet Relay Chat (IRC, a method for communicating with other users in real time across the Internet), Email (a method of sending electronic mail to other users on the Internet) and the World Wide Web.

What is the World Wide Web?

The World Wide Web, also known as WWW, W3, or the Web, is an ambitious and innovative project which aims to offer as many Internet resources as possible via a simple, user-friendly front end. The front end which the WWW uses claims to be both intuitive and easy to use, and achieves this by the use of Hypertext based information resources.

One of the major advantages of using this common front end to view Internet resources is that its consistency means that more people will be able to access the information and use it comfortably, thus creating a more efficient environment for knowledge dissemination.

Currently there are about 6 million pages, or articles, on the World Wide Web, coming from locations all over the world and relating to a very wide range of domains. Given that these documents are so varied in subject and place of origin, there is a growing need to give users access to information on the terminologies involved. There are many terminologically based resources on the Web, and these will be discussed in more detail later in this paper.

  1. CURRENT ON-LINE TERMINOLOGY RESOURCES

Easy access to the Internet has encouraged people with special interests to communicate with each other across the globe. These interest groups range from aeronautics to zoology, from astrology to Zen Buddhism, from people interested in the latest on the stock market to whisky drinkers and traders. Some interest groups comprise professionals, such as academics, engineers, and people working in finance and industry, whilst others comprise hobbyists. There are also conspiracy theorists and national groups. Almost the whole of humanity, or at least that part of it living in the West, is represented on the Internet.

Much of the Internet communication is text-based, occasionally sprinkled with pictures, diagrams, music or speech.

Given that there are specialist groups on the Internet communicating with each other through the medium of writing, it should not be surprising that, in addition to communicative text, one finds terminology collections. A 'trawl' through the Internet shows that these collections vary from fully elaborated term bases that have been validated by experts, like the European Union's EURODICAUTOM, to expert-produced glossary lists, for example NASA's aeronautics and aerospace glossary. These collections were produced for or by large organisations that employ terminologists and documentation experts. But not-so-well-endowed specialist groups have also published (or rather broadcast) their terminology collections on the Internet: the UK Imperial College of Science and Technology has produced a 'Free Dictionary of Computing', and there is an elaborate terminology collection produced by the specialist martial arts group 'CyberDojo' for use by its community throughout the Internet-accessible world.

We discuss the above mentioned resources in turn.

  1. Term Bases

Eurodicautom

Eurodicautom was one of the first internationally accessible term bases, and is primarily a 'database of official and technical terms' although it does contain various generic terms also.

Eurodicautom's term base stores a wealth of information on various subjects, and offers the user an attractive front end to access the term base.

The following diagrams show, first, the query page with 'ordinateur' as the entry term, and a choice of source language of 'French' and target of 'English', and second, a section of the result of this query.


Figure 8 / 2 : Eurodicautom querying the term 'ordinateur'


Figure 8 / 3 : Result of the query from Eurodicautom
  1. Thesauri

NASA Aeronautic Terms

The NASA terminology collection comprises 17,455 terms relating to aeronautics. These terms are presented in a hierarchical network format: with each term the user is given a list of connected terms, categorised as narrower terms, broader terms or related terms. Those terms which are at the top of the hierarchy, i.e. the terms which do not have any broader terms, include a definition and the source from which the definition was taken.

Upon entry to the NASA thesaurus, the user is given the opportunity to choose any letter, and then given all the entries in the term base starting with that letter. Although this may look cumbersome and untidy at first, it does make it easy to browse through the term base, observing what is actually available.
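The structure described above (broader, narrower and related terms, definitions on top-level terms, and browsing by initial letter) can be pictured with a minimal Python sketch. It is an illustration only: the class name, field names and sample entries are hypothetical and are not taken from the NASA collection.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ThesaurusEntry:
    """One thesaurus term with its hierarchical and associative links."""
    term: str
    broader: List[str] = field(default_factory=list)
    narrower: List[str] = field(default_factory=list)
    related: List[str] = field(default_factory=list)
    definition: Optional[str] = None          # only top terms carry one
    definition_source: Optional[str] = None

    def is_top_term(self) -> bool:
        # A top term is one with no broader terms.
        return not self.broader


def browse_by_letter(entries, letter):
    """Return, in alphabetical order, all terms starting with the given letter."""
    wanted = letter.lower()
    return sorted(e.term for e in entries if e.term.lower().startswith(wanted))


# Invented sample data, for illustration only.
ENTRIES = [
    ThesaurusEntry("aircraft",
                   narrower=["airships", "balloons", "helicopters"],
                   definition="(placeholder definition)",
                   definition_source="(placeholder source)"),
    ThesaurusEntry("airships", broader=["aircraft"], related=["balloons"]),
    ThesaurusEntry("balloons", broader=["aircraft"], related=["airships"]),
    ThesaurusEntry("helicopters", broader=["aircraft"]),
]
```

Calling browse_by_letter(ENTRIES, "a") lists the entries filed under 'a', which mirrors the letter-by-letter entry page described above.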

  1. Dictionaries

The 'Free On-Line Dictionary Of Computing'

The UK Imperial College (University) of Science, Technology and Medicine has made available The Free On-Line Dictionary of Computing on the World Wide Web. The dictionary holds several hundred terms, with definitions, and suggested related terms.


Figure 8 / 4 : The result of a query to the Free On-Line Dictionary of Computing

The Prostate Dictionary

The CoMed Communications Internet Health Forum provides the Prostate Dictionary on the World Wide Web. The dictionary holds 244 terms concerned with prostate cancer, each with a comprehensive definition which includes references (and links) to other terms.

The Internet Health Forum provides many other Cancer related information sources, connected to the term base, and also offers readers the opportunity of adding their own contributions.


Figure 8 / 5 : The Prostate Dictionary
  1. Glossaries

CyberDojo Terminology Database

The 'CyberDojo Terminology Database' is an example of one of the smaller term bases available on-line, and includes a number of terms and definitions relating to the karate-based sport of 'CyberDojo'. The term base not only includes definitions of various terms but, where appropriate, also displays photographs of CyberDojoists performing various moves and kicks to explain the term more clearly. Obviously, for someone unfamiliar with the sport, a picture of someone kicking in the air is much easier to understand than a verbal explanation that the term means kicking about 3 feet into the air whilst pivoting the body 45 degrees to the right and punching the left fist into the air, etc.

The use of graphics in a term base is one particular advantage of the World Wide Web, and one which is increasingly being exploited.


Figure 8 / 6 : The CyberDojo Terminology Database

Whisky Terminology

This terminology collection has been made available by the University of Edinburgh and contains a wealth of information pertaining to whisky and its distillation. It is another example of a smaller, more specialised term base, containing only 24 terms. Since its domain is much smaller than the 'Information Technology' or 'Aeronautics' domains tackled by the term bases mentioned previously, however, it is likely to cover that domain more thoroughly.

  1. Terminology Collections and their Formats

The above term collections emphasise different linguistic and conceptual facets of the terms recorded in the collections.

Each collection of terms, or term base, is presented in a different manner, whether to make the information easier to access, to make retrieval of the terminology more efficient, or to offer more sophisticated facilities.

The first format in which a term base is presented is the simple text file approach. Here, upon entry to the term base, the user is presented with a list, or glossary, of terms, each with definitions and possibly related terms. One of the advantages of the World Wide Web is that these definitions are not restricted to textual information, but can be in the form of pictures, sounds, or even moving images. The definitions can also contain links to other terms, by the use of hyperlinks as mentioned earlier in the paper.

The next level of term base format is the approach in which the user is given a list of terms and, again using hyperlinks, the opportunity to click on a preferred term to elaborate it. Each elaborated term may now include definitions, related terms, hyperlinks to other terms, or any of the other multi-media utilities mentioned previously.

Currently, one of the most popular formats for the presentation of term bases is the searchable index of the terms in the database. When the user enters the term base system, they are presented with a form in which to enter a term (or partial term), and the matching entry is then displayed. A partial term is used when the user does not know the whole term: wildcards can be used to search for certain strings within a term. For example, the '*' wildcard stands for any number of letters, so the user can search the database for 'comp*' to find all the terms beginning with the letters comp, e.g. computer, computational, computer screen, etc.
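The '*' wildcard described above can be sketched in a few lines of Python. This is a minimal illustration rather than the mechanism used by any of the term bases discussed: the function names and the sample term list are hypothetical, and the case-insensitive matching is an assumption.

```python
import re


def wildcard_to_regex(pattern: str) -> "re.Pattern":
    """Translate a query in which '*' means 'any number of characters'.

    Everything except '*' is matched literally. Ignoring case is an
    assumption of this sketch.
    """
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts), re.IGNORECASE)


def search(terms, pattern):
    """Return the terms that match the whole wildcard pattern, in list order."""
    regex = wildcard_to_regex(pattern)
    return [t for t in terms if regex.fullmatch(t)]


# Invented sample term list.
TERMS = ["computer", "computational", "computer screen", "compiler", "keyboard"]
```

With this list, search(TERMS, "comp*") returns every term except 'keyboard', and a query without a wildcard, such as search(TERMS, "computer"), returns only the exact term.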

The amount of information given about a term also varies widely: some term bases simply offer definitions of terms, or related terms, while others give a full ontological description, as well as information about the grammar of the term, the date of entry or origin, the domain of the term, the country and language of the term (mainly in multilingual term bases) and a context in which the term can be used.

  1. The advantages of using the Internet

The terminology collections on the Internet really are more than the conventional collections one may find on the computer systems belonging to individuals and organisations. For one thing, the terms, and in some cases the definitions, are usually marked up by the terminology providers using hypertext mark-up facilities, thereby increasing the cross-referencing capability of the collection as compared with a conventional terminology collection. More importantly, perhaps, the Internet provides a retrieval channel for accessing the terms from the term bases: in conventional terminology management systems such facilities cost extra, and users have to undergo training in order to use them.

  1. TERMINOLOGY MANAGEMENT IN A GLOBAL AND DISTRIBUTED ENVIRONMENT

The above discussion shows the retrieval power of the Internet for accessing terminology of a range of domains. We would now like to explore further how the Internet can be used to provide a range of terminology management activities covering almost the entire life cycle of a term: from its coinage by experts or enthusiasts, and subsequent usage by some members of the specialist domain, to its inclusion in terminology collections, including term bases or specialist collections and finally to any changes in the usage or definition of the term to its obsolescence.

  1. Terminology Retrieval, Extraction and Exchange through the Internet

Recall that the Internet serves principally to provide a communications channel: using the Internet, one can access databases and text files produced by others and provide others with one's own databases and text files. Increasingly, it has become possible to access software tools at remote sites and to use these tools to manipulate data locally or at other sites. Our exploration of Internet-based terminology management will have three distinct aspects: technical, conceptual and commercial.

Technology: Term base access and term retrieval

First, we show how a typical user, that is, one without much experience of browsing through the Internet, can be helped in such a browsing exercise by a system that provides friendly access to a host of terminology collections.

Terms in texts and text analysis for extracting terms

The second aspect relates to the use of facilities for browsing term bases in conjunction with an existing terminology management system, System Quirk. Users of this coupled system will be able to create their own term collections, register their opinions of terms in extant term bases and, indeed, edit the contents of a term base, provided they can obtain the consent of the managers of the term bases they are using.

System Quirk can not only retrieve terms and add terms to existing collections, but can also help in the organisation and analysis of (specialist) texts. The Internet can now help in the acquisition of texts, especially texts newly produced by the members of a given specialist domain, in which there is a greater probability of finding neologisms. The use of System Quirk together with Internet-accessible text therefore provides an opportunity for extracting neologisms and checking their use which did not exist before the advent of the Internet.

Terminology 'Trade' and Electronic cash

The third aspect of our exploration deals with the exchange of terminology across the different formats of the term bases and with the ways in which the terminology providers, if they so wish, may charge for the use of their terminology collections.

Electronic Cash for Terminology: The consolidation of Internet-related services has meant that services and products are being provided through the Internet. The user of an Internet-mediated commodity (or commodities) has to pay for it, and can arrange payment by mail, whether 'snail mail' or e-mail. Furthermore, one can buy electronic cash: this is achieved by sending money to the relevant organisations and receiving electronic codes in lieu. These codes can then be used in transactions as money.

  1. Term Bazaar

Bazaar n. 1. market in an oriental country

2. a fund-raising sale of goods, esp. for charity

3. a large shop selling fancy goods etc.

[ Pers. bāzār, probably through Turk. and It. ]

(source - The Reader's Digest Oxford Wordfinder - 1993)

The Term Bazaar is a program under development at the University of Surrey for the POINTER project. It is accessed via the World Wide Web, and allows users to search through various term bases and, using hyperlinks, to jump between terms, definitions of terms, and sources (of definitions).


Figure 8 / 7 : The 'Term Bazaar' page

The Term Bazaar aims to facilitate the exchange of terminology across the World Wide Web. By definition, exchange must be bi-directional: users must be able not only to access terminology but also to enter terminology into the system.

  1. Terminology Storage

The main function of the Term Bazaar is the facility to search the term base(s) for terms, and to retrieve information about these terms. When the user enters a term in the 'Entry' field of the Term Bazaar page and presses the 'Enter' button, the program searches the selected term base for the term.

If the term is found and has only one instance, it is displayed with various fields, including Country, Language, Definition and German Equivalent (if one is available), together with several internal fields: the date on which the term was entered, the initials of the relevant terminologist, and its R/A/G status (a code which signifies how far the term has progressed on the validation scale).

If there is more than one instance of a term, then the user is given a choice of the possible terms, where each term is hyperlinked to the full elaboration of it. If there are no terms found, then the user is informed of this, appropriately.

When searching for terms the user can also enter a '*' wildcard, to search for entries featuring a certain pattern. For instance, if the user runs a query on 'comp*' the system will search for all the terms starting with comp. The user can also enter the query '*', which will list all the terms in a term base. When the system has matched all the terms relevant to the query, it presents them to the user in the same way as a term with more than one entry. If, however, the wildcard query recovers only one term, then that term is presented to the user in the usual manner.
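The three outcomes described in this section (no match, a single record, or a list of choices) can be sketched as follows. This is a hedged illustration and not the Term Bazaar code: the function names, the record layout and the sample entries are hypothetical. The sketch uses Python's standard fnmatch module, which interprets '?' and '[...]' as well as '*'; only '*' is described in the text.

```python
from fnmatch import fnmatchcase


def find_entries(term_base, query):
    """Return every entry whose term matches the (possibly wildcard) query."""
    pattern = query.lower()
    return [e for e in term_base if fnmatchcase(e["term"].lower(), pattern)]


def respond(term_base, query):
    """Decide what to show: a 'not found' notice, one record, or a choice list."""
    hits = find_entries(term_base, query)
    if not hits:
        return ("not_found", query)
    if len(hits) == 1:
        return ("record", hits[0])       # shown with all its fields
    return ("choice_list", hits)         # each choice links to its full record


# Invented sample term base.
TERM_BASE = [
    {"term": "computer", "language": "English"},
    {"term": "computer", "language": "Dutch"},
    {"term": "compiler", "language": "English"},
    {"term": "keyboard", "language": "English"},
]
```

A wildcard query that happens to match a single entry, such as 'key*' here, falls through to the ordinary single-record case, as the text describes.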

  1. Terminology Representation

Hyperlinks to Definitions

One of the fields displayed to the user when querying a term is the definition of the term. In the term entry itself this is given as a hyperlink to a full definition of the term. The full definition is not shown with each entry, as one definition may be linked with many terms: for instance, the terms VDU and Visual Display Unit may be entered as different terms (with VDU being an abbreviation of Visual Display Unit), but they will share the same definition.

Each full definition displays a set of fields relevant to the definition, including the source from which the definition was taken, the date on which the definition was entered, and the actual definition itself.

Marking up definitions

When the user is reading through a definition of a term, it is possible that they will come across other terms which are new to them, so the system automatically marks up all the terms in the definition which are also in the term base, so the user can easily jump to a new term.
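One way to picture this automatic mark-up is the Python sketch below. It is an assumption-laden illustration, not the Term Bazaar implementation: the function names and the '/term/...' address scheme are hypothetical, and matching longer terms first and ignoring case are design choices of the sketch.

```python
import html
import re


def term_url(term: str) -> str:
    """Hypothetical address scheme for a term entry."""
    return "/term/" + term.replace(" ", "_")


def mark_up(definition: str, known_terms, link_for=term_url) -> str:
    """Wrap every known term found in the definition in an HTML hyperlink."""
    if not known_terms:
        return html.escape(definition)
    # Longest terms first, so 'computer screen' wins over 'computer'.
    ordered = sorted(known_terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    canonical = {t.lower(): t for t in known_terms}
    pieces, pos = [], 0
    for match in pattern.finditer(definition):
        # Escape the plain text between links, then emit the link itself.
        pieces.append(html.escape(definition[pos:match.start()]))
        target = link_for(canonical[match.group(0).lower()])
        pieces.append('<a href="{}">{}</a>'.format(
            html.escape(target, quote=True), html.escape(match.group(0))))
        pos = match.end()
    pieces.append(html.escape(definition[pos:]))
    return "".join(pieces)


# Invented sample data.
KNOWN_TERMS = ["computer", "computer screen", "keyboard"]
```

Applied to a sentence mentioning 'computer screen' and 'keyboard', mark_up links each phrase once and leaves the rest of the text unchanged apart from HTML escaping.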

Hyperlinks from definitions to sources

Each definition entry has an obligatory source, from which the definition was acquired. This source has been made into a hyperlink, so that the user can jump from the definition to the source entry and find out what type of text the source was (a book, manual, glossary, etc.) and what the full title of the text was.

Hyperlinks to foreign language equivalents

For every term with a foreign language equivalent, a hyperlink is created to allow the user to jump from the text straight to the foreign equivalent term.
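The chain of links from term to definition and from definition to source, with one definition shared by several terms, can be sketched as a small data model. The Python below is only an illustration: the class names, identifiers and sample values are hypothetical and are not drawn from the Term Bazaar database, and foreign language equivalents are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    source_id: str
    text_type: str      # e.g. book, manual, glossary
    title: str


@dataclass(frozen=True)
class Definition:
    definition_id: str
    text: str
    source_id: str      # obligatory link to the source entry
    date_entered: str


@dataclass(frozen=True)
class Term:
    term: str
    definition_id: str  # several terms may point at one definition


# Invented sample data.
SOURCES = {"S1": Source("S1", "glossary", "(hypothetical glossary title)")}
DEFINITIONS = {
    "D1": Definition("D1", "(placeholder definition text)", "S1", "1995-01-01"),
}
TERMS = [Term("VDU", "D1"), Term("Visual Display Unit", "D1")]


def definition_of(term_name):
    """Follow the term-to-definition link; None if the term is unknown."""
    for entry in TERMS:
        if entry.term == term_name:
            return DEFINITIONS[entry.definition_id]
    return None


def source_of(definition):
    """Follow the definition-to-source link."""
    return SOURCES[definition.source_id]
```

Because both VDU and Visual Display Unit hold the same definition identifier, the definition text is stored once, and a correction to it is seen from either term.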

Customisation and Localisation

For each of the entry types, Term, Definition, and Source, the user is given the option of personalising the fields presented to them. By clicking on the 'fields' link while viewing a term, the user is offered a list of all available fields for a term, with a corresponding checkbox by each one. If the checkbox is checked (i.e. it displays a cross) then that field is displayed. Similarly, if the user is viewing a definition or a source and chooses 'fields', then they will have the option to change the fields displayed for definition or source, respectively.

When viewing a term, a user may only be interested in a little bit of information about it, for instance the short grammar and the German equivalent, so by selecting only these fields, the user does not have to read through pages of information about the term, but can just display the relevant information. If a user customises the fields displayed, and stops working on Term Bazaar, when they use it next, the fields will still be as chosen.
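The field selection and its persistence between sessions can be sketched as below. How the Term Bazaar actually stores a user's choices is not described here, so the JSON preferences file, the function names and the field lists are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path

# Field names per entry type; an illustrative list based on the fields
# mentioned in the text, not the actual Term Bazaar schema.
ALL_FIELDS = {
    "Term": ["Country", "Language", "Short Grammar", "Definition",
             "German Equivalent", "Date Entered", "Terminologist", "Status"],
    "Definition": ["Source", "Date Entered", "Definition"],
    "Source": ["Text Type", "Title"],
}


def _read(prefs_file: Path) -> dict:
    if prefs_file.exists():
        return json.loads(prefs_file.read_text(encoding="utf-8"))
    return {}


def save_selection(prefs_file: Path, entry_type: str, checked) -> None:
    """Store the checked fields for one entry type, in display order."""
    prefs = _read(prefs_file)
    prefs[entry_type] = [f for f in ALL_FIELDS[entry_type] if f in checked]
    prefs_file.write_text(json.dumps(prefs), encoding="utf-8")


def selected_fields(prefs_file: Path, entry_type: str):
    """Fields to display; every field is shown until the user customises."""
    return _read(prefs_file).get(entry_type, ALL_FIELDS[entry_type])


def render(record: dict, fields) -> dict:
    """Keep only the selected fields of a record."""
    return {f: record[f] for f in fields if f in record}


# Invented sample record and a two-session walk-through.
RECORD = {"Country": "UK", "Language": "English", "Short Grammar": "noun",
          "German Equivalent": "Computer", "Status": "G"}

with tempfile.TemporaryDirectory() as tmp:
    prefs = Path(tmp) / "prefs.json"
    default_fields = selected_fields(prefs, "Term")
    save_selection(prefs, "Term", {"Short Grammar", "German Equivalent"})
    # A later session reads the same file and gets the same selection.
    fields = selected_fields(prefs, "Term")
    view = render(RECORD, fields)
```

After the walk-through, view holds only the short grammar and the German equivalent, which corresponds to the example given in the paragraph above.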

  1. Terminology Retrieval

Language Selection

At any time during the use of Term Bazaar, the user can alter the current working language. Certain terms can be found in more than one language (for instance, the word 'Computer' can be found in English, German and Dutch), but if English has been chosen as the working language, then the English term is displayed. The Dutch and German equivalents are still shown as fields, and if one of them is chosen the working language changes accordingly.

A user may be interested in knowing which languages a term is used in, and so can choose the language option 'Any'. If a term is found in more than one language then the various choices are offered to the user in the form of a list.
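The effect of the working language and of the 'Any' option can be sketched as a filter over the matching entries. This Python fragment is illustrative only: the function name, the record layout and the sample data are hypothetical.

```python
ANY = "Any"


def lookup(term_base, term, working_language=ANY):
    """Return entries for a term, restricted to the working language.

    With the 'Any' option, entries in every language are returned so
    that they can be offered to the user as a list.
    """
    hits = [e for e in term_base if e["term"].lower() == term.lower()]
    if working_language != ANY:
        hits = [e for e in hits if e["language"] == working_language]
    return hits


# Invented sample data, following the 'Computer' example in the text.
TERM_BASE = [
    {"term": "Computer", "language": "English"},
    {"term": "Computer", "language": "German"},
    {"term": "Computer", "language": "Dutch"},
]
```

With English as the working language the lookup yields a single entry, whereas the 'Any' option yields one entry per language.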

Selecting Term Bases

Initially the user is offered the ITTB lexicon (Information Technology Term Base) to view and search through, but can change the lexicon by using the lexicon option box on the Term Bazaar.

As with the personalisation of fields, when the user leaves Term Bazaar and returns to it later, they are offered the last lexicon they were working with.

  1. Terminology Entry

By choosing the new term link, the user can enter some of the basic details of a new term, which will then be entered into the term base. At the University of Surrey we have developed a quality control mechanism for monitoring the entry of new terms: each term is given a status of Red, Amber or Green. Red is for terms which have just been entered by a terminologist and have been neither verified nor validated by the experts; Amber terms have been internally verified and validated, either by an expert or by an experienced terminologist in the discipline; Green terms have been fully verified and validated and are ready for use.

For the purpose of Term Bazaar a new category, N (for New), has been created. This makes it possible to find all the new (N) terms entered into a Term Bazaar by external users, to check them and to upgrade them to Red.

New terms can be displayed by any user, but the term itself appears to flash, to warn the user that the term has not been validated or checked.
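The status scale described in this section can be sketched as an enumeration with an upgrade step. The Python below is a hedged illustration and not the University of Surrey implementation: the names are hypothetical, and the rule that a term moves up exactly one level at a time is an assumption of the sketch.

```python
from enum import Enum


class Status(Enum):
    NEW = "N"      # entered by an external user, not yet checked
    RED = "R"      # entered, neither verified nor validated
    AMBER = "A"    # internally verified and validated
    GREEN = "G"    # verified, validated and ready for use


_SCALE = [Status.NEW, Status.RED, Status.AMBER, Status.GREEN]


def upgrade(status: Status) -> Status:
    """Move a term one step up the validation scale."""
    position = _SCALE.index(status)
    if position == len(_SCALE) - 1:
        raise ValueError("Green terms are already fully validated")
    return _SCALE[position + 1]


def needs_warning(status: Status) -> bool:
    """New terms are displayed with a warning (flashing in the Term Bazaar)."""
    return status is Status.NEW


def pending_review(entries):
    """Collect the externally entered terms that still have to be checked."""
    return [e for e in entries if e["status"] is Status.NEW]


# Invented sample entries.
ENTRIES = [
    {"term": "term bazaar", "status": Status.NEW},
    {"term": "computer", "status": Status.GREEN},
]
```

A terminologist's review queue is then simply pending_review(ENTRIES), and a checked term is moved on with upgrade.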

  1. USING THE INTERNET FOR THE TEACHING OF TERMINOLOGY

Alongside the dissemination of terminology through term bases, the Internet can be used with equal effectiveness as a teaching facility. Once again, the use of the Internet overcomes limits on the amount of material that can be made available and greatly increases the size of the audience that benefits.

Below is an example of a draft of a book on 'Terminology and Artificial Intelligence' (by K. Ahmad and M. Rogers of the University of Surrey) which has been made available via the World Wide Web. The paper offers an introduction to Terminology and Artificial Intelligence, in a format which is easy to follow and read.


Figure 8 / 8 : The contents of the 'Terminology and Artificial Intelligence' paper

The use of the Internet for teaching is becoming increasingly popular, especially for distance learning courses. The UK-based Open University now offers a large amount of learning material across the Internet, together with the opportunity for students to communicate with each other, overcoming problems which had previously occurred.


Figure 8 / 9 : A section of the 'Terminology and Artificial Intelligence' paper
  1. POINTER WWW PAGE

The POINTER WWW page aims to disseminate a wide range of information pertaining to the project. Currently there is information about the various members of the POINTER project, and their organisations. Furthermore, the WWW page provides access to the various reports and surveys produced for the project. The 'Term Bazaar', developed for the project, can also be accessed. The current address for the page is:

http://www.surrey.ac.uk/MCS/AI/pointer/

Once the project has ended, it is expected that the pages will continue to grow, including all the major reports published from the project, and will serve as a focal point for further discussions and projects concerned with an infrastructure for terminology in Europe.

  1. RECOMMENDATIONS AND CONCLUSIONS

Our recommendations fall into two categories: strategic and tactical. The strategic recommendation relates to the recent Bangemann-inspired debate on the Information Society, and particularly to the recommendation that, whilst market forces will be the principal determiners of the shape of the Information Society, the EU should set up demonstrator projects to highlight the potential of the global information networks. Terminology dissemination and management, especially in the multilingual context, is an ideal area of application in this respect. The tactical recommendation relates to the promotion of collaboration between the lexica and corpora community and the terminology community in exploiting Internet resources.

The 'Term Bazaar' could be seen as another on-line term base, although it does boast several significant differences. Many of the current on-line term bases are made up of flat text files which are either viewed as such or, in some cases, allow the user the option of viewing the file as flat text or having some keyword search mechanism which jumps to the keyword in the text file. This limits the flexibility of the system, and does not allow the user any scope for customisation, except in the sense of the keyword search.

By linking the Term Bazaar to a database-based term base, the user is able to search for terms (or partial terms, using the wildcard facility), given a specific language or domain. In fact, the Term Bazaar should not itself be seen as a term base, but as a platform-independent front end to a term base. The Term Bazaar program is not limited to one type of term base, and with minor adjustments could easily cope with a different style of term base, or a variety of different term bases.

Terminology databases can also improve the searching of documents, in at least three different ways:

First, search programs rely mainly on so-called term indexes or glossaries of terms. Such an arrangement does not systematically focus on the usage-related features of specialist terms, whereas term bases contain synonyms, deprecated terms and acronyms and, above all, usually contain standardised terminology.

Second, a term base is based on a model of human knowledge. The implementation of a term base requires a well-developed data model which is essentially an abstraction of the model of knowledge. Term indexes do not contain these elaborate relationships and cannot be grounded as thoroughly as a term base because a term index is simply a list

Third, work in knowledge-based term bases, wherein a knowledge representation scheme is used to represent the extensive terminological data, will help in the inference of new facts from old and will automatically check the pre-stored data in term-relationships. Used in conjunction with a search facility, a knowledge-orientated term base will help in the realisation of an expert system for retrieving documents.
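As a rough illustration of inferring new facts from stored term relationships, the following sketch follows genus-species links to derive every broader concept of a term and rejects circular links. The relation table and the function name ancestors are hypothetical; no particular knowledge representation scheme from the project is implied.

```python
# Hypothetical is-a (genus-species) links stored in a term base.
BROADER = {
    "nuclear fission": "nuclear reaction",
    "nuclear reaction": "physical process",
    "chain reaction": "nuclear reaction",
}


def ancestors(term, broader=BROADER):
    """Infer every broader concept of a term by following is-a links."""
    found = []
    seen = {term}
    current = broader.get(term)
    while current is not None:
        if current in seen:
            # A circular chain signals inconsistent pre-stored data.
            raise ValueError(f"cycle in is-a links at {current!r}")
        found.append(current)
        seen.add(current)
        current = broader.get(current)
    return found


print(ancestors("nuclear fission"))
```

A document search could then be widened from a term to its inferred broader concepts, which a flat term index cannot support.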

Compilation of Various Glossaries and Dictionaries Available on the Internet

NB - Due to the dynamic nature of the Internet, these addresses are subject to change.

General Purpose

Term Bazaar

http://www.surrey.ac.uk/MCS/AI/pointer/bazaar.html

Eurodicautom

http://www.uni-frankfurt.de/~kurlanda/Eurodicautom.html

Roget Thesaurus

gopher://gopher.slu.se/7waissrc%3a/wais-dbs/Linguistic/roget-thesaurus

World Wide Web Acronym and Abbreviation Server

http://www.ucc.ie/info/net/acronyms/acro.html

Oxford Dictionary of Familiar Quotes

telnet://info.rutgers.edu/

The Devil's Dictionary

http://www.vestnet.no/cgi-bin/devil

Dictionary of Roadie Slang

http://searider.jpl.nasa.gov/~gms/text/slang.html

Science and Technology

NASA Terminology Collection

http://www.sti.nasa.gov/nasa-thesaurus.html

Free On-Line Dictionary of Computing

http://wombat.doc.ic.ac.uk/

Software Engineering Glossary

http://dxsting.cern.ch/sting/glossary-intro.html

Unix and Internet Dictionary

http://www.nnsg.net/kadow/answers.html

Hacker's Dictionary

gopher://gopher.slu.se/7waissrc%3a/wais-dbs/computers-and-software/gloss-and-dicts/jargon

Leisure

CyberDojo Terminology Database

http://cswww2.essex.ac.uk/cgi-bin/search/Web/karate/CyberDojo/terminology.html:CD_terminology

Dan's Poker Dictionary

http://www.universe.digex.net/~kimberg/pokerdict.html

Football Terminology

http://www.atm.ch.cam.ac.uk/Sports/terms.html

Food and Drink

Whisky Glossary

http://www.dcs.ed.ac.uk/staff/jhb/whisky/glossary.html

Beer Glossary

http://www.pathfinder.com/@fSiS@9GaMQAAQDs5/twep/LittleBrown/Best_Beers/Best_Beers_Glossary1.html

Glossary of Coffee Terminology

http://www.cappuccino.com/glos.html

Miscellaneous

Dog Term Glossary

http://pasture.ecn.purdue.edu:1111/Dogs/glossary.html

Vascular Plants glossary

http://155.187.10.12/glossary/glossary.html

Aquarium glossary

http://www.actwin.com/fish/glossary.html

Commerce

Real Estate and Mortgage Glossary

http://www.homebuyer.com/realestate/common.dir/glossary.html

Credit, Financial and Legal Glossary

http://www.teleport.com/~richh/glossary.html

  1. THE REUSABILITY OF GENERAL LANGUAGE DICTIONARY RESOURCES FOR BUILDING TERMBASES

  1. INTRODUCTION

The European Commission is concerned to optimise the use of linguistic resources. Since up to 40% of entries in some general-language (LGP) dictionaries may be concerned with specialised vocabulary - or terminology - LGP dictionaries, which are widely available, may also be regarded as a potential terminology resource. Terms included in such dictionaries tend to be those which are used and encountered by both experts and laypeople.

The Research Network Group within the POINTER project investigated the problems and opportunities related to the use of existing LGP dictionaries, particularly bilingual dictionaries and monolingual learners' and advanced dictionaries, subject-specific handbooks and other relevant encyclopaedic material; the focus of our deliberations was on English and German with some input from Dutch. The conclusions of the report (Ahmad et al. 1995) are, however, equally relevant to other languages.

The use of the above-mentioned lexicons and knowledge sources poses two problems: what is the form of the entry, and how amenable is this to re-use for other purposes? Analysis of various dictionary entries demonstrates that the extraction of terminological data from currently-available LGP dictionaries (both monolingual and bilingual) is problematic from a number of different points of view, including the inconsistent and imprecise use of subject-field labels, the absence of adequate pragmatic information, and varying definitional practices. Terms are also often deeply nested in entries, even as sub-senses of polysemous headwords. The unsatisfactory use of subject-field labels is of particular importance for the automatic extraction of data.

Solutions are likely to be medium-term rather than short-term, involving the more widespread use of standards for the representation of lexicons and the consistent use of established classification systems. Of particular interest are the interchange standards that encourage exchange of lexical and terminological resources, such as TIF and other emerging standards produced by ISO, as well as standards that encourage exchange across applications, particularly machine-translation systems and document management systems. It is essential that the research carried out in these areas, for example the EU-sponsored R&D efforts manifested in MULTILEX, GENELEX, MULTEXT and TRANSTERM, is properly archived and articulated in a manner that is comprehensible to terminology and lexical resource developers.

  1. SUBJECT FIELD VARIATION IN GENERAL LANGUAGE DICTIONARIES

The dictionaries used in our study belong to the 'college' dictionary variety with the exception of the monolingual Duden in 8 volumes. In all, five dictionaries have formed the basis of our conclusions; each dictionary is referred to by the abbreviated title shown in brackets:

A survey of the subject fields in the five dictionaries shows variation not only in the number of fields listed (usually in an Abbreviations section for various labels) but also in the inventory of subject fields for each dictionary:

Dictionary      Coverage                  No. of subject fields listed    % subject fields occurring in other dictionaries
COD             Monolingual (Br. En)      74                              78.4
Duden           Monolingual (De)          179                             73.7
Langenscheidt   Monolingual (De)          48                              100.0
Collins         Bilingual (En <--> De)    89                              85.2
Oxford Duden    Bilingual (En <--> De)    107 (en) 135 (de)               80.0

Table 9 / 1 : Number of subject fields listed under 'Abbreviations' in five general-language dictionaries
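The final column of Table 9 / 1 can be read as a set-overlap measure. The sketch below shows one way such a percentage might be computed. The report does not state how labels were matched across dictionaries or languages, so exact matching of labels already expressed in one language is an assumption here, and the two sample inventories are tiny illustrative subsets rather than the full lists.

```python
def shared_percentage(name, inventories):
    """Percentage of one dictionary's labels that occur in any other inventory."""
    own = inventories[name]
    others = set().union(
        *(labels for other, labels in inventories.items() if other != name)
    )
    return round(100 * len(own & others) / len(own), 1)


# Tiny illustrative subsets of the English label inventories, not the full lists.
SAMPLE = {
    "COD": {"Anatomy", "Botany", "Conchology", "Phrenology", "Zoology"},
    "Oxford Duden (en)": {"Alchemy", "Anatomy", "Botany", "Zoology"},
}

for dictionary in SAMPLE:
    print(dictionary, shared_percentage(dictionary, SAMPLE))
```

With real inventories, near-synonymous labels (for example adjectival and nominal forms of the same field) would need to be reconciled before the sets are compared, otherwise the overlap is understated.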

The subject field inventories in the five dictionaries (Tables 9 / 2 to 9 / 7) show considerable overlap, suggesting a certain consensus, although it is hard to establish whether such agreement is derivative in some cases (for Duden and Oxford Duden this is clearly likely to be the case).

Aeronautics                 Anatomy
Anthropology                Antiquities, Antiquity
Archaeology                 Architecture
Arithmetic                  Astrology
Astronautics                Astronomy
Biblical                    Bibliography
Biochemistry                Biology
Botany                      Chemistry
Church                      Cinematography
Conchology                  Criminology
Crystallography             Demography
Ecclesiastical              Ecology
Economics                   Electricity
Engineering                 Entomology
Geography                   Geology
Geometry                    Greek History
History                     Horology
Horticulture                Mathematics
Mechanics                   Medicine
Meteorology                 Military
Mineralogy                  Music
Mythology                   National
Nautical                    Obstetrics
Ornithology                 Palaeography
Parliament; Parliamentary   Pathology
Pharmacy; Pharmacology      Philology
Philosophy                  Phonetics
Photography                 Phrenology
Physiology                  poetical
Politics                    Psychology
Religion                    Rhetoric
Roman Catholic Church       Roman History
Science                     Shakespeare
Sociology                   Stock Exchange
Television                  Theatre, Theatrical
Theology                    Typography
University                  Veterinary
Zoology

Table 9 / 2 : Inventory of subject fields in Concise Oxford Dictionary of Current English (8th edition) (1990)

Akustik               Handwerk (Gerberei, Böttcherei, Bäckerei, usw.)  Religion
Anatomie              Hauswirtschaft            Rentenversicherung
Anthropologie         Heraldik                  Rundfunk
Arbeitsrecht          Hochfrequenztechnik       Rundfunktechnik
Arbeitswissenschaft   Hochschulwesen            Schiffahrt
Architektur           Holzverarbeitung          Schiffbau
Astrologie            Hotelwesen                Schriftwesen
Astronomie            Hüttenwesen               Schülersprache
Bakteriologie         Imkersprache              Schulwesen
Ballett               Informationstechnik       Seemannssprache
Ballistik             Jagdwesen                 Seewesen
Bankwesen             Jägersprache              Sexualkunde
Bautechnik            Kartenspiel               Soldatensprache
Bauwesen              Kaufmannssprache          Sozialpsychologie
Bergbau               Kerntechnik               Sozialversicherung
Bergmannsprache       Kindersprache             Soziologie
Betriebswissenschaft  Kino                      Sport (Boxen, Fußball, Reiten, usw.)
bildende Kunst        Kirchensprache            Sportmedizin
Biochemie             Kochkunst                 Sprachwissenschaft
Biologie              Kommunikationsforschung   Sprengtechnik
Bodenkunde            Kosmetik                  Statistik
Börsenwesen           Kraftfahrzeugtechnik      Steuerwesen
Botanik               Kraftfahrzeugwesen        Stilkunde
Buchbinderei          Kunstwissenschaft         Straßenbau
Buchführung           Kybernetik                Studentensprache
Buchwesen             Landwirtschaft            Tabakindustrie
Bürowesen             Literaturwissenschaft     Technik
Chemie                Malerei                   Textilindustrie
Datenverarbeitung     Mathematik                Theater
Dichtkunst            Mechanik                  Theologie
Diplomatie            Medizin                   Tiermedizin
Druckersprache        Meereskunde               Tierzucht
Druckwesen            Metallbearbeitung         Touristik
Eisenbahnwesen        Metallurgie               Uhrmacherei
Elektronik            Meteorologie              Verfassungswesen
Elektrotechnik        Militär                   Verhaltensforschung
Fernsehen             Mineralogie               Verkehrswesen
Fernsprechwesen       Mode                      Vermessungswesen
Fertigungstechnik     Münzkunde                 Versicherungswesen
Film                  Musik                     Verslehre
Finanzwesen           Mythologie                Verwaltung
Fischereiwesen        Nachrichtentechnik        Viehzucht
Fliegersprache        Nachrichtenwesen          Völkerkunde
Flugwesen             Naturwissenschaft[en]     Völkerrecht
Forstwesen            Optik                     Volkskunde
Fotografie            Pädagogik                 Waffentechnik
Frachtwesen           Paläontologie             Wasserbau
Funktechnik           Pharmazie                 Wasserwirtschaft
Funkwesen             Philatelie                Werbesprache
Gartenbau             Philosophie               Winzersprache
Gastronomie           Phonetik                  Wirtschaft
Gaunersprache         Physik                    Wohnungswesen
Geldwesen             Physiologie               Zahnmedizin
Genealogie            Politik                   Zahntechnik
Genetik               Polizeiwesen              Zeitungswesen
Geographie            Postwesen                 Zollwesen
Geologie              Prähistorie               Zoologie
Gewerbesprache        Psychoanalyse
Gießerei              Psychologie
graphische Technik    Raumfahrt
Handarbeiten          Rechtssprache

Table 9 / 3 : Inventory of subject fields in DUDEN - Das große Wörterbuch der deutschen Sprache. In 8 Bänden (1993-95)

Architektur
Astronomie
Bankwesen [banking]
Biologie
Botanik
Chemie
Eisenbahn [railroad]
Elektrizität, Elektrotechnik [electricity, electrical engineering]
Elektronische Datenverarbeitung [electronic data processing]
evangelisch, evangelische Religion [protestant, protestant religion]
Fernmeldewesen, Telegraf, Telefon
Fernsehen
Film
Fotographie
Gastronomie, Küche [gastronomy, cooking]
Geographie
Geologie
Geometrie
katholisch, katholische Religion
Kraftfahrzeuge [motor vehicle]
Landwirtschaft
Linguistik, Sprachwissenschaft
Literaturwissenschaft
Luftfahrt
Mathematik
Medizin
Meteorologie
Militär, Kriegswesen
Musik
Pädagogik
Pharmazeutik
Philosophie
Physik
Politik
Postwesen
Psychologie
Recht
Religion
Rundfunk [broadcasting]
Seefahrt [seafaring]
Sozialwissenschaft, Soziologie
Sport
Technik, Technologie
Theater
Verkehrswesen
Verwaltung, Bürokratie [administration, bureaucracy (written language, formal)]
Wirtschaft, Volkswirtschaftslehre [economics]
Zoologie

Table 9 / 4 : Inventory of subject fields in Langenscheidts Großwörterbuch Deutsch als Fremdsprache (1993)

Sachbereichsangaben         Field labels
Verwaltung                  administration
Landwirtschaft              agriculture
Anatomie                    anatomy
Archäologie                 archaeology
Architektur                 architecture
Kunst                       art
Astrologie                  astrology
Astronomie                  astronomy
Kraftfahrzeuge              automobiles
Luftfahrt                   aviation
Kindersprache               (baby talk)
biblisch                    biblical
Biologie                    biology
Botanik                     botany
Hoch- und Tiefbau           building
Kartenspiel                 (cards)
Chemie                      chemistry
Schach                      (chess)
Handel                      commerce
Kochen und Backen           cooking
kirchlich                   ecclesiastical
Volkswirtschaft             economics
Elektrizität                electricity
Mode                        (fashion)
Finanzen                    finance
Angeln/Fischerei            (fishing)
Forstwesen                  forestry
Fußball                     football
Geographie                  geography
Geologie                    geology
Heraldik                    heraldry
Geschichte                  history
Gartenbau                   horticulture
Jagd                        hunting
Industrie                   industry
Versicherungswesen          insurance
Rechtswesen                 law
Sprachwissenschaft          linguistics
Literatur                   pertaining to literature
Mathematik                  mathematics
Maß                         (measure)
mechanisch                  mechanical
Medizin                     medicine
Meteorologie                meteorology
Metallurgie, Hüttenkunde    metallurgy
militärisch                 military
Bergbau                     mining
Mineralogie                 mineralogy
Straßenverkehr              motoring and transport
Musik                       music
Mythologie                  mythology
nautisch, Seefahrt          nautical, naval
Nationalsozialismus         Nazism
Optik                       optics
Ornithologie, Vogelkunde    ornithology
Parlament                   parliament
Pharmazie                   pharmacy
Philosophie                 philosophy
Phonetik, Phonologie        phonetics, phonology
Photographie                photography
Physik                      physics
Physiologie                 physiology
poetisch                    poetic
Dichtung                    pertaining to poetry
Politik                     politics
Presse                      (press)
Psychologie, Psychiatrie    psychology, psychiatry
Rundfunk                    radio
Eisenbahn                   railways
Religion                    religion
Schulwesen                  school
Naturwissenschaften         science
Bildhauerei                 sculpture
Handarbeit                  sewing
Skisport                    skiing
Sozialwissenschaften        social sciences
Raumfahrt                   space flight
Börse                       Stock Exchange
Landvermessung              surveying
Technik                     technology
Nachrichtentechnik          telecommunications
Textilien                   textiles
Theater                     theatre
Fernsehen                   television
Typographie, Buchdruck      typography and printing
Hochschulwesen              university
Tiermedizin                 veterinary medicine
Zoologie                    zoology

Table 9 / 5 : Inventory of subject fields in German/English/German Dictionary (2nd edition) (1991)

Administration, Administrative  Aeronautics
Agriculture                     Alchemy
Anatomy                         Anglican Church
Anthropology                    Antiquity
Archaeology                     Architecture
Astrology                       Astronautics
Astronomy                       Bacteriology
Bibliography                    Biochemistry
Biology                         Bookkeeping
Botany                          Chemistry
Cinematography                  Commerce, Commercial
Communication Research          Construction
Dentistry                       Diplomacy
Dressmaking                     Ecclesiastical
Ecology                         Economics
Education                       Electricity
Ethnology                       Ethology
Football                        Gastronomy
Genealogy                       Geography
Geology                         Geometry
Graphic Arts                    Heraldry
History, Historical             Horology
Horticulture                    Hydraulic Engineering
Information Science             International Law
Journalism                      Linguistics
Literature                      Magnetism
Management                      Mathematics
Mechanical Engineering          Mechanics
Medicine                        Metalwork
Metaphysics                     Meteorology
Military                        Mineralogy
Motor Vehicles                  Mountaineering
Music                           Mythology
Natural Science                 Nautical
Nuclear Engineering             Nuclear Physics
Numismatics                     Oceanography
Ornithology                     Paleontology
Parapsychology                  Parliament
Pharmacy                        Philately
Philosophy                      Phonetics
Photography                     Physics
Physiology                      poetical
Politics                        Prehistory
Prosody                         Psychology
Railways                        Religion
Research                        Rhetoric
Roman Catholic Church           School
Science                         Shipbuilding
Social Services                 Sociology
Soil Science                    Stock Exchange
Surveying                       Telephony
Television                      Theology
University                      Veterinary Medicine
Woodwork                        Zoology

Table 9 / 6 : Inventory of subject fields in Oxford Duden German Dictionary (1990): English field labels used in the Dictionary / Im Wörterverzeichnis verwendete englische Sachbereichsangaben

Altes Testament           Anatomie
Anthropologie             Archäologie
Architektur               Astrologie
Astronomie                Bauwesen
Bergmannssprache          biblisch
bildende Kunst            Biologie
Bodenkunde                Börsenwesen
Botanik                   Buchführung
Buchwesen                 Bürowesen
chemisch                  christlich
Datenverarbeitung         dichterisch
Druckersprache            Druckwesen
Eisenbahn                 elektrisch
Elektrotechnik            Energieversorgung
Energiewirtschaft         evangelisch
Fernsehen                 Fernsprechwesen
Finanzwesen               Fischereiwesen
Fliegersprache            Flugwesen
Forstwesen                Fotografie
Frachtwesen               Funkwesen
Gastronomie               Gaunersprache
Genealogie                Geographie
Geologie                  Geometrie
Handarbeit                Handwerk
Hauswirtschaft            Heraldik
historisch                Hochschulwesen
Holzverarbeitung          Imkersprache
Informationstechnik       Jagdwesen
Jägersprache              Jugendsprache
juristisch                katholisch
Kaufmannssprache          Kindersprache
Kochkunst                 Kraftfahrzeugwesen
Kunstwissenschaft         Landwirtschaft
Literaturwissenschaft     Luftfahrt
marxistisch               Mathematik
Mechanik                  Medizin
Meereskunde               Metallbearbeitung
Metallurgie               Meteorologie
Militär                   Mineralogie
Mittelalter               Münzkunde
Musik                     Mythologie
nationalsozialistisch     Naturwissenschaft
Neues Testament           Pädagogik
Paläontologie             Parapsychologie
Parlament                 Pharmazie
Philatelie                Philosophie
Physiologie               Polizeiwesen
Postwesen                 Prähistorie
Psychologie               Raumfahrt
Rechtssprache             Rechtswesen
Religion                  Rhetorik
römisch-katholisch        Rundfunk
Schülersprache            Schulwesen
Seemannssprache           Seewesen
Sexualkunde               Soldatensprache
Sozialpsychologie         Sozialversicherung
Soziologie                Sprachwissenschaft
Steuerwesen               Studentensprache
Textilwesen               Theologie
Tiermedizin               Verhaltensforschung
Verkehrswesen             Vermessungswesen
Versicherungswesen        Völkerkunde
Völkerrecht               Volkskunde
Werbesprache              Winzersprache
Wirtschaft                Wissenschaft
Zahnmedizin               Zeitungswesen
Zollwesen                 Zoologie
Zusammenschreibung

Table 9 / 7 : Inventory of subject fields in Oxford Duden German Dictionary (1990): German labels used in the Dictionary/Im Wörterverzeichnis verwendete deutsche Sachbereichsangaben

It can be seen from Tables 9 / 2 to 9 / 7 that the subject fields are in many cases drawn very broadly, e.g. Biology; Engineering; Geography (COD), whereas in terminology work the focus tends to be on strictly-delimited domains or sub-domains. However, some fields in the general-language dictionaries are very specialised, e.g. Böttcherei ('barrel-making') as a sub-field of Handwerk ('crafts') (Duden). Some subject fields are also related but not clearly delineated, e.g. Zahnmedizin and Zahntechnik (Duden). Furthermore, certain labels characterise varieties of language rather than subject fields, particularly in the Duden and the Oxford Duden dictionaries, e.g. Jägersprache ('hunting language'), Rechtssprache ('legal language'), Schülersprache ('school language'). Yet these language varieties or registers may also have their counterpart in the actual subject field itself: Jagdwesen ('hunting'), Rechtswesen ('law'), and Schulwesen ('secondary education'). Again, the boundaries are blurred, in this case by confusing the linguistic and the conceptual levels.

The application of the subject field labels in the dictionary entries is also inconsistent: labels are often omitted for particular senses within a polysemous entry even when a suitable label is available. This would hinder a systematic identification of terms within a particular, if broadly-defined, domain.

Furthermore, subject-field labels are expanded on an ad hoc basis. The label Billiards etc. (sic) is given for the fifth sense of bridge in the COD, although it does not appear in the list of subject field labels provided. There is therefore broad variation in the degree of specificity in subject-field delimitation, e.g. Engineering versus Billiards etc., which is not necessarily evident from the list of labels.

Even when the sense of a headword is confined to a particular subject-field, the entry may still be sub-divided into polysemes according to a lexicographical approach. For instance, the entry for head in Collins which is labelled NAUT has three sub-senses: Bug ('bow'); Topp ('mast'); Pütz ('toilet'). This can be regarded as a consequence of the very broad subject fields employed in general-language dictionaries.

Finally, the allocation of words to sense groupings (or perhaps subject fields) can also be seen to vary between dictionaries, e.g. head (of beer) is grouped together with head (of plant) in Collins, but is given as a separate sense in Oxford Duden. While such inconsistency is not surprising, it nevertheless makes the task of acquiring terminology from such sources more problematic.

  1. RE-USE OF LGP DATA FOR TERMINOLOGICAL PURPOSES

A serious consideration in the re-use of linguistic resources is the automatic extraction of data: in the present case this concerns the feasibility of extracting terminological data from LGP dictionaries. The crux of the matter resides in the strategies employed when associating a certain term with a terminological domain or subject field. Langenscheidt chooses to note special terminology when, quite simply, 'special language terms have found their way to general language' (cf. [Gö/Hae/Well 93] p. xix) -- what exactly this means remains unclear. Duden, on the other hand, claims to distinguish exactly (cf. Vol. 1, p. 20ff) between general language and special languages (Drosdowski 1993-95) while, however, 'not indicating the terminological domain of a term when its definition clearly indicates the domain to which it belongs'. The attempted differentiation between the register of a domain and the domain itself is a further complication. It appears that the degree of specificity striven for by Duden is counter-productive. When viewed from the perspective of re-usability for Terminological Knowledge Bases, however, both Duden and Langenscheidt exhibit inconsistencies in their treatment of special terms that seem to render automatic information extraction impossible, given present lexicographical practice.
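The difficulty can be made concrete with a small sketch of label-driven extraction. The entry structure, the glosses and the list of announced labels below are invented for illustration and do not reproduce any of the dictionaries studied; the point is only that a sense left unlabelled is invisible to this kind of extraction, and that ad hoc labels fall outside the published list.

```python
# Hypothetical machine-readable entries; labels and glosses are illustrative only.
ENTRIES = {
    "head": [
        {"label": "Naut", "gloss": "bow"},
        {"label": "Naut", "gloss": "mast top"},
        {"label": None, "gloss": "foam on beer"},
    ],
    "bridge": [
        {"label": "Billiards etc.", "gloss": "rest for a cue"},
        # A nautical sense left unlabelled, as often happens in practice.
        {"label": None, "gloss": "platform from which a ship is steered"},
    ],
}

# Labels announced in the (hypothetical) front-matter list of abbreviations.
LISTED_LABELS = {"Naut", "Mus", "Med"}


def extract_by_label(entries, wanted):
    """Collect (headword, gloss) pairs whose sense carries the wanted label."""
    return [
        (headword, sense["gloss"])
        for headword, senses in entries.items()
        for sense in senses
        if sense["label"] == wanted
    ]


def unlisted_labels(entries, listed):
    """Find labels used in entries but absent from the published label list."""
    used = {s["label"] for senses in entries.values() for s in senses if s["label"]}
    return used - listed


print(extract_by_label(ENTRIES, "Naut"))
print(unlisted_labels(ENTRIES, LISTED_LABELS))
```

Extraction by the label Naut returns the two labelled senses of head but misses the unlabelled nautical sense of bridge, which is the kind of gap that makes fully automatic extraction unreliable.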

  1. VARIATION OF DEFINITIONS ACROSS GENERAL LANGUAGE DICTIONARIES AND HANDBOOKS

In any terminology, the definition plays a key role in structuring the knowledge of the domain and in linking the concepts - and hence the terms - in relations which reflect this structure, both hierarchical (e.g. genus-species; part-whole) and non-hierarchical (e.g. cause-effect; material-product). Definitions of concepts (terms) within the same domain are therefore related in a rich network of interconnections. Typically, in general-language dictionaries, the definitions of terms within a particular domain or subject field are not systematically organised. This lack of systematicity is all the harder to detect because of the alphabetical presentation of the data. However, while lexicographical practice may vary, certain principles of defining have been identified. Zgusta's (1971) principles of defining are as follows

Landau (1989) has suggested that there are differences between the way general words and scientific words are defined. He claims that general words are defined on the basis of citations illustrating actual usage: the meanings are extracted from a body of evidence. The meanings of scientific entries, on the other hand, are imposed on the basis of expert advice. The experts may have sources apart from their own knowledge and experience, but their sources are encyclopaedic rather than lexical, that is, they are likely to consist of authoritative definitions composed by other experts whose concern is maintaining the internal coherence of their discipline rather than faithfully recording how terms are used. Their goal is ease and accuracy of communication between those versed in the language of science (Landau 1989:20).

Zgusta's principles for defining can be used as a basis for assessing whether or not a dictionary entry can be imported into a term data base solely on the basis of how the entry has been elaborated. In order to evaluate how Zgusta's principles can be put into operation we analysed the term nuclear fission in a number of dictionaries and handbooks. The languages covered were English, German and Dutch. Nuclear fission is an interesting term in that it refers to a theoretical concept, that of the nucleus of an atom splitting into its constituents spontaneously or otherwise. But fission simultaneously refers to a practical proposition, that of massive controlled or uncontrolled release of energy as in nuclear reactors or in the inappropriately-named atomic bomb.

A number of dictionaries and handbooks (see below) were consulted in order to evaluate the definitions of nuclear fission. This evaluation was guided by Zgusta's principles; we were also interested in other terms used in the definition of nuclear fission, especially those which may have to be looked up by a novice, a layperson or a translator.

Nuclear fission, that is the spontaneous or deliberate break-up of the nucleus of an atom, is a term which has been used in both general- and special-language texts since the end of the Second World War. In the following, we have looked at the definition of fission, a deprecated term for nuclear fission, in the Collins CoBuild Dictionary:

Fission: Nuclear fission is the splitting of the nucleus of an atom to produce a large amount of energy or cause a large explosion.

Terms need defining: nucleus; atom.

Not all nuclei can be fissioned for the production of energy or for purposes of causing an explosion.
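The first of these checks, listing the terms a definition relies on but does not itself explain, can be approximated mechanically. The sketch below is an assumption-laden illustration: the set of domain terms and the set of headwords already defined are hypothetical inputs that a terminologist would have to supply, and only single-word terms are handled.

```python
import re


def terms_needing_definition(definition, domain_terms, defined_headwords):
    """List domain terms used in a definition that have no entry of their own."""
    words = set(re.findall(r"[a-z]+", definition.lower()))
    return sorted(
        term
        for term in domain_terms
        if term in words and term not in defined_headwords
    )


definition = (
    "Nuclear fission is the splitting of the nucleus of an atom to "
    "produce a large amount of energy or cause a large explosion."
)
# Hypothetical inputs: candidate domain terms and headwords already defined.
domain_terms = {"nucleus", "atom", "neutron", "energy"}
defined_headwords = {"energy"}

print(terms_needing_definition(definition, domain_terms, defined_headwords))
```

The second observation, that the definition overgeneralises to all nuclei, is a matter of content and cannot be detected by a check of this kind.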

The following dictionaries and handbooks show a similar kind of omission to a greater or to a lesser extent:

Source: Pennick, A M., and Adamson, A L. (1984). 'Nuclear Energy'. In (Ed.) J P Quale. Kempes Engineers Year Book 1984 (89th Edition). London: Morgan-Grampian Book Pub. Co. pp F1/1-F1/21.
Missing terms: beta decay; gamma ray; neutron shielding; gamma shielding

Source: Procter, Paul M. (Ed.) (1982). Longman New Universal Dictionary. Harlow (Essex, UK): Longman Publishers.
Missing terms: atomic nucleus. Splitting into how many fragments?

Source: The American Heritage Dictionary of the English Language (1992). Boston (Mass, USA): Houghton Mifflin Co.
Missing terms: electron volts. Perhaps the best and most accurate definition.

Source: Lenz, W. (1987). 'Dampferzeugungsanlagen (Steam generating plants)'. In (Eds.) W. Beitz and K.-H. Kuttner. Taschenbuch für den Maschinenbau/Dubbel. Heidelberg: Springer Verlag. p L 8 (section 1.2.1).
Missing terms: thermisch; Spaltneutronen; Kettenreaktion

Source: van Dale: Groot woordenboek Engels-Nederlands & van Dale: Groot Nederlands-Engels (1986). Utrecht/Antwerpen: Van Dale Lexicografie bv.
Missing terms: (atoom) splitsing?



Studies of terms from different domains (e.g. unification from Computing and neoplasticism from Art Criticism) showed similar results.

Since definitions are, as we have seen, crucial to the coherence of a terminology, issues such as those raised by Zgusta are clearly of some importance. In general, terminological practice coincides with that of lexicography, except that lexicography fails to stress the importance of systematicity, presumably because of the more diffuse nature of general language as compared to special languages. A further factor is that the lexicographical principles as outlined by Zgusta are not necessarily followed in practice, thereby decreasing the potential value of such definitions for terminological purposes.

  1. CONCLUSIONS

In this section, a number of problems have been identified in connection with the identification, presentation and representation of terms in general-language dictionaries. These relate to the fluid boundaries between general and special languages, the absence of clear pragmatic information, the inconsistent, sometimes ad hoc, and imprecise use of subject-field labels, the occurrence of terms as senses, or even sub-senses, of polysemous entries (where the canonical headword is therefore neither a word nor a term), and varying definitional practices. All this means that the re-use of such data, e.g. as the basis for further elaboration, is not straightforward, particularly if automatic processing is intended. The solutions are likely to be medium-term rather than short-term, arising from the availability of machine-readable dictionaries in which data is more consistently represented. The use of standards such as the MULTILEX MLEXD representation format in the construction of lexicons is to be encouraged: in this way, a number of dictionaries can be produced from the same source for different purposes (including different human user groups, machine processing, and so on). The compilation of terminologies, or nascent terminologies, from such lexicons can then be regarded in the same way as the construction of any other dictionary subset of the original data. Some effort also needs to be directed towards a more consistent use of subject-field labels, e.g. according to an established classification system.
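A more consistent use of labels could start from a mapping of each dictionary label to a shared domain, with register labels kept distinct from subject-field labels. The sketch below assumes such a mapping; the English domain names are placeholders and are not drawn from any established classification system.

```python
# Hypothetical mapping: dictionary label -> (shared domain, is_register_label).
LABEL_MAP = {
    "Jagdwesen": ("hunting", False),
    "Jägersprache": ("hunting", True),
    "Rechtswesen": ("law", False),
    "Rechtssprache": ("law", True),
    "Schulwesen": ("school", False),
    "Schülersprache": ("school", True),
}


def group_by_domain(labels, label_map=LABEL_MAP):
    """Group labels under shared domains and report labels with no mapping."""
    grouped = {}
    unmapped = []
    for label in labels:
        mapped = label_map.get(label)
        if mapped is None:
            # Ad hoc labels surface here instead of being silently dropped.
            unmapped.append(label)
        else:
            domain, _is_register = mapped
            grouped.setdefault(domain, []).append(label)
    return grouped, unmapped


print(group_by_domain(["Jägersprache", "Jagdwesen", "Billiards etc."]))
```

Keeping the register flag alongside the domain preserves the distinction between the linguistic and the conceptual level while still allowing both kinds of label to feed the same domain subset.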

The terminological data included in general-language dictionaries is likely to be most relevant to those areas where lay and expert interests overlap. While certain domains may therefore not be covered in general-language dictionaries, it may also be expected that only subsets of terms from more accessible domains will be included, defined by their usage in communicative situations which are not exclusively expert-to-expert (i.e. field-internal). Pragmatic labelling is, however, often deficient in distinguishing between levels of usage.

  1. SELECTED TOPICS IN TERMINOLOGY RESEARCH - TERMINOLOGY ACQUISITION, INTERCHANGE AND TERMINOLOGY KNOWLEDGE BASES

Whether or not terminology is an independent research discipline is a moot point, but a number of strands of research work can be found within and outside terminology which would be of relevance to terminologists and terminology users. These exemplary topics reflect the dependence of terminology on other disciplines and how other disciplines can be affected by terminology:

  1. TERMINOLOGY ACQUISITION

One of the problems which impedes successful specialist communication is the lack of sufficiently-elaborated, timely and accurate terminologies, both monolingual and multilingual. The compilation of terminologies is predicated in the first instance on the identification and acquisition of terms in delineated areas of knowledge or domains. Traditional manual methods of working are highly labour-intensive and slow. Research has therefore been focusing on ways of automating or semi-automating this stage of terminological work.

  1. Scientific writing: Discovery and terminology

Terminology can be found in the discourse of specialist communities, i.e. in texts. Research on the structure and functioning of special-language texts is therefore of considerable importance in considering the ways in which terms behave in text, as a basis for the (semi-)automatic acquisition of terms.

Michael Halliday has used the notions he has developed in systemic linguistics together with his predications about language as a 'social semiotic', to explore the historical relationships between science, language and literature (Halliday & Martin 1993). Halliday's explorations are of significant import for terminology science in that he has commented on the language used by scientists and has argued that technical terms play an important role in the creation of 'a discourse of organised knowledge'. For Halliday and Martin the uniqueness of scientific language lies in the 'lexicogrammar' or wording of the language.

Scientists adapt and innovate using the lexicogrammatical resources of their first language, or those of their second (or third) language, in order to express abstract and concrete ideas, to simplify and summarise complex facts, to explicate and exemplify, to contradict and reinforce, to categorise and state exceptions to rules. And, in the execution of all these complex tasks - tasks which are executed through the medium of written text - scientists coin new terms, suppress the use of some extant terms, restrict or expand the scope of terms, borrow from other disciplines and from other languages. Terminology, as a collection of vocabulary items and as a science of terms, plays a central role in scientific endeavour.

According to Halliday, scientific English 'is English with special probabilities attached: a form of English in which certain words, and more significantly certain grammatical constructions, stand out as more highly favoured, while others correspondingly recede and become less highly favoured, than in other varieties of English' (1993:4). Halliday notes that scientists use certain 'grammatical resources' in order 'to create a discourse that moves forward by logical and coherent steps, each building on what has gone before' through 'nominal elements' and 'verbal elements' (1993:64). The 'nominal elements' form technical taxonomies and summarise and package the representation of processes, while 'verbal elements' relate 'nominalised processes', not only externally to each other but also internally to the author's interpretation of them (1993:64). Halliday identifies a number of phrases which have superficial resemblance to Cruse's 'lexical semantic frames' (Cruse 1986), and which govern and help to extract information from scientific texts. Certain types of linguistic expression can therefore be identified as enjoying a potentially-high degree of 'termhood', identifiable through statistical probabilities and/or linguistic forms and patterns with the potential also to indicate semantic relations.
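One simple way of turning 'special probabilities' into a measurable indicator is to compare how often a word occurs in a special-language sample with how often it occurs in a general-language sample. The ratio below is an assumption chosen for illustration, not a measure proposed by Halliday or prescribed in this report, and the two token lists are toy data.

```python
from collections import Counter


def relative_frequency_ratio(special_tokens, general_tokens):
    """Ratio of a word's relative frequency in special vs general text."""
    special = Counter(special_tokens)
    general = Counter(general_tokens)
    n_special = len(special_tokens)
    n_general = len(general_tokens)
    ratios = {}
    for word, count in special.items():
        # Add-one smoothing keeps words unseen in general text from dividing by zero.
        general_rel = (general[word] + 1) / (n_general + 1)
        ratios[word] = (count / n_special) / general_rel
    return ratios


special_text = "the nucleus splits and the nucleus emits neutrons".split()
general_text = "the cat sat on the mat and the dog slept".split()

ratios = relative_frequency_ratio(special_text, general_text)
for word in sorted(ratios, key=ratios.get, reverse=True):
    print(word, round(ratios[word], 2))
```

Words with a high ratio are candidates for closer inspection; function words such as 'the' score low because they are equally common in both samples. Linguistic patterns would still be needed to confirm termhood.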

Frawley's position on the role of scientific discourse is an even stronger one in that he regards science primarily as discourse: 'science is a sign system, a method of creating representations of the world and of institutionalising these representations into coherent systems of extended talk: science is discourse, a discourse that 'precedes all factual knowledge: discourse is the first knowledge of science' (1986: 68-9). And, as this discourse is, perhaps, the sole vehicle for disseminating the ideas of scientists, then it is the discourse that 'allows the innovation to be an innovation' (1986:78). The case for using text as a source of terms is therefore a strong one.

  1. Corpus-based Terminology Acquisition

While it is now commonplace for lexicographers - especially in English-language publishing - to be supported by computer tools integrated into a 'workstation', the semi-automatic acquisition of terms remains largely in the area of research. So-called terminology management systems which are commercially available do not contain programs for the acquisition of terms, which is still mainly carried out manually - by 'scanning' text, whilst simultaneously seeking other linguistic and conceptual data. However, the solution is not to adopt wholesale the tools (and techniques) which have been developed for lexicography, since general-language and special-language texts exhibit different characteristics in a number of respects. The task of the terminographer is of a different nature from that of the lexicographer. Lexicographers, for instance, are not primarily concerned with identifying what a 'word' is (they are interested in all words in a text). For terminographers, however, identifying terms among other words is a major question, although linguistic and conceptual pointers play a role (e.g. compounding and collocational patterns). Certain notions can, however, be usefully borrowed from the pioneering work on lexicography. One such notion is that of 'lexical evidence'.

Pioneers of 'lexical evidence' in corpus linguistics include Randolph Quirk, Jan Svartvik and John Sinclair, whose work in collecting and analysing large corpora of texts for studying a given language has had considerable influence on how dictionaries are compiled. The structure of the texts in these text corpora is sometimes explored and exploited to refine the 'evidence', mainly by using a tagged corpus, or structure is indirectly referred to through a classification of these texts according to their genre and register.

It is now almost a commonplace that large bodies of text stored in machine-readable form - or electronic 'corpora' - provide both a high quantity and a high quality of evidence to the general-purpose lexicographer, not only in the task of identifying and distinguishing between polysemous words, but also in exploring the interaction between sense and structure (see, for instance: Sinclair 1987; 1991; Svartvik 1992). But the use of special-language corpora in terminology work has received considerably less attention to date, despite the suggestion that 'systematic terminology compilation is now firmly corpus-based' (Sager 1990:130). However, a number of aspects connected with corpus-based terminology are now being dealt with in the research literature, including corpus design, terminology management, term identification and translation-oriented terminology (Ahmad & Rogers 1992a; 1992b; Holmes-Higgin, Griffin, Hook & Abidi 1993; Ahmad, Davies, Fulford & Rogers 1994; Rogers & Ahmad 1994; Meyer & Mackintosh 1995a; Meyer & Mackintosh 1995b; Ahmad & Rogers forthcoming).

Indeed, it can be argued that the use of corpora, particularly electronic corpora, is even better motivated in terminology than in general-purpose lexicography as a source of 'evidence' for the compilation of terminologies. Our reasons relate mainly to the notion that special-language texts deal with semantically-restricted domains and that these texts are produced largely by members of closely-defined discourse communities in order to disseminate their knowledge of these domains. Such restrictions can be said to reduce the degree of lexical and syntactic variation in text (Lehrberger 1982) and therefore lead to what might be described as a more objective kind of evidence for the terminologist than the much more variable texts of general language. It is not the case that there is no variation in special-language texts, simply that there is less of it than in general-language texts.

The evidence which can be provided by special-language texts will achieve a greater consensus on the basis of smaller quantities of primary data than the evidence gleaned from larger quantities of general-language texts. Furthermore, terminologists are less able to use other sources of evidence such as introspection or extant lexicons (Sinclair 1985) as a source of data than lexicographers working on general language. 'Observation', i.e. the gathering of primary data from texts, therefore becomes even more important in terminology.

A key research issue in the (semi-)automatic acquisition of terms from text, alongside the identification of terms, is that of corpus design. If a corpus is randomly collected and lacks organisation, then output from automatic processing of the texts it contains - even straightforward operations such as concordancing, word lists and indexes - will be extremely difficult to interpret. In lexicography, a common principle of organisation at the highest level of text categorisation is that of imaginative versus informative text genres. It is clear that a different typology of texts has to be developed for the special-language texts from which terminology can be acquired. The question of corpus size is also an issue: it is generally agreed that for terminological purposes special-language corpora need not aspire to the millions (or hundreds of millions) of words now generally expected in a general-language corpus (Ahmad & Rogers forthcoming).

The notion of 'special language' can be understood as a series of levels which are conventionally described on a scale of 'abstraction', from theoretical-academic (primarily written) through workshop-technical (also spoken), to the language of distribution, sales and marketing (see, for instance: Fluck 1985:21). At the lowest level of abstraction, the boundaries between special language and general language become increasingly blurred. So we might expect there to be greater variation in use here than at the highest level of abstraction. Since these different levels of abstraction in special language are likely to be realised in different text types, e.g. academic journal articles and scholarly monographs at the most specialised level, manuals at the middle level, and advertising and journalistic literature at the lowest level, an optimal evaluation of the evidence which is gleaned by processing the corpus depends on a careful structuring of the corpus, incorporating in its design the feature 'text type'.

Scientific writing can be distinguished not only by the profusion of scientific terms, a large majority of which can be classified as nominal expressions, but by the preponderance of agentless passives (Svartvik 1966), the marked nominalisation of verbs (Halliday and Martin 1993:64), and by the very low frequency of personal pronouns. It is at the lexical level that the text types can be distinguished from other text types in general language, like newspaper editorials, short stories and novels, personal letters and so on.

A related issue here is that of corpus re-usability. In the lexicographic literature, there is much discussion of tailoring a corpus to a particular purpose and structuring the corpus accordingly (e.g. Engwall 1994). This usually involves establishing a hierarchy of categories. However, recent work on corpus-based terminology work has shown that it is possible to re-use corpora for different purposes by creating a 'virtual corpus' as a subset of the texts available in the generic corpus, thereby obviating the need for a pre-set hierarchy (Ahmad, Holmes-Higgin & Abidi 1994).
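The idea of a 'virtual corpus' can be illustrated with a minimal sketch in Python. It assumes that each text in the generic corpus carries a few descriptive attributes; the attribute names (domain, text_type, language), the sample texts and the function virtual_corpus are hypothetical and are not taken from the Surrey system. A virtual corpus is then simply the subset of texts that match the criteria of the task in hand, so that no fixed hierarchy of categories has to be decided in advance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Text:
    """One text in the generic corpus (attribute names are assumptions)."""
    title: str
    domain: str
    text_type: str
    language: str


def virtual_corpus(texts, **criteria):
    """Return the texts whose attributes match every given criterion."""
    return [
        t for t in texts
        if all(getattr(t, name) == value for name, value in criteria.items())
    ]


# Invented sample texts, for illustration only.
generic_corpus = [
    Text("Catalyst ageing study", "automotive", "journal article", "en"),
    Text("Workshop manual", "automotive", "manual", "en"),
    Text("Car advertisement", "automotive", "advertising", "en"),
    Text("Reactor safety report", "nuclear", "journal article", "en"),
]

# Two different 'virtual corpora' drawn from the same generic corpus.
academic_automotive = virtual_corpus(
    generic_corpus, domain="automotive", text_type="journal article"
)
all_automotive = virtual_corpus(generic_corpus, domain="automotive")
```

The same generic corpus can thus serve several purposes, each selecting its own subset by text type, domain or any other recorded attribute.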

Once the corpus is assembled, particular techniques can be devised for actually acquiring terms. Researchers have identified a number of ways in which this can be done in both tagged and untagged corpora. It must be borne in mind, however, that tagging (e.g. for word class or subsets of word classes) is not fully automatic and that taggers are usually designed for general-language texts and are not available for many languages. Working with untagged corpora is therefore likely to be the most promising way forward for terminology in the immediate future.

The best-known statistically-based work, with lexicographers working on general-language texts in mind, is that of Kenneth Church and his colleagues in the United States (e.g. Church, Gale, Hanks & Hindle 1990; Church & Hanks 1990; Church, Gale, Hanks, Hindle & Moon 1994). As they rightly point out, the machine processing of large corpora through concordancing can produce massive volumes of relatively undifferentiated material which is hard to interpret. They have therefore developed tools which are statistically based. For terminological purposes, however, these tools hold little immediate attraction since they appear to need tagged corpora of over one million words in order to produce interesting results.

As already indicated, corpus-based work for terminological purposes is still quite rare. One approach has been under development at the University of Surrey since the late 1980s. This approach uses untagged corpora and relies inter alia on statistical patterns, linguistic cues and differences between special-language and general-language texts. Let us look briefly at one particular technique for acquiring terms.

The most frequently-occurring words in any text belong to the closed-class category. Closed-class words have a largely or wholly grammatical role and include articles, pronouns, prepositions, auxiliary and modal verbs, and conjunctions. The membership of the closed-class word-family is fixed or limited; neologisms are not normally added. The open-class category, on the other hand, comprises words whose membership is in principle indefinite or unlimited. New words and senses are continually being added to this set as new inventions, ideas, and so on, emerge. But there are differences in the distribution of these categories in general language and special language texts.

A comparison of word frequencies between a general-language corpus, for example the Lancaster-Oslo/Bergen (LOB) Corpus of British English (c.1961), and a specialist corpus of automotive engineering texts shows the difference in behaviour of open-class items in the two corpora (see Table 1). The six most frequent words in both corpora are the same closed-class words, comprising over 15% of the total words in each corpus. The first ten words in both corpora are still closed-class words and comprise just under 25% of the total words of each corpus:

          Surrey's automotive engineering corpus        Lancaster-Oslo/Bergen corpus
          (369,751 words)                               (1,013,737 words)

Rank      Word form   Rel. freq. (%)   Word class       Word form   Rel. freq. (%)   Word class
1         the         7.15             closed           the         6.74             closed
2         of          3.34             closed           of          3.53             closed
3         and         2.36             closed           and         2.75             closed
4         to          2.23             closed           to          2.64             closed
5         in          2.10             closed           a           2.23             closed
6         a           1.92             closed           in          2.10             closed
7         is          1.33             closed           that        1.12             closed
8         for         1.09             closed           is          1.10             closed
9         with        0.86             closed           was         1.05             closed
10        on          0.75             closed           it          1.04             closed
11        as          0.68             closed           for         0.92             closed
12        be          0.66             closed           he          0.89             closed
13        are         0.64             closed           I           0.75             closed
14        by          0.62             closed           as          0.72             closed
15        that        0.62             closed           with        0.71             closed
16        emission    0.59             open             be          0.71             closed
17        this        0.58             closed           on          0.69             closed
18        at          0.56             closed           his         0.62             closed
19        engine      0.56             open             at          0.60             closed
20        vehicle     0.51             open             by          0.57             closed
21        system      0.48             open             not         0.54             closed
22        car         0.48             open             had         0.54             closed
23        catalyst    0.46             open             this        0.52             closed
24        it          0.45             closed           but         0.49             closed
25        which       0.44             closed           from        0.46             closed

Table 10 / 1 : A contrastive analysis of a special-language and a general-language corpus

The first two open-class words in the LOB corpus do not appear until rank 53 (the reporting verb said) and rank 62 (the common noun time) respectively. In the special-language corpus, however, six open-class words, all nouns, appear between ranks 16 and 23, and all are potentially key terms of the domain: emission, engine, vehicle, system, car and catalyst.
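The ranking behind this comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Surrey software: the tokeniser is a crude regular expression, the closed-class word list is a short hand-made sample rather than a complete inventory, and the function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: a small, deliberately incomplete sample of closed-class words.
CLOSED_CLASS = {
    "the", "of", "and", "to", "in", "a", "is", "for", "with", "on",
    "as", "be", "are", "by", "that", "this", "at", "it", "which",
}


def ranked_frequencies(text):
    """Return (rank, word form, relative frequency in %) in rank order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    # Sort by descending count; ties are broken alphabetically.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, word, 100.0 * count / total)
        for rank, (word, count) in enumerate(ordered, start=1)
    ]


def first_open_class(ranked, closed_class=CLOSED_CLASS):
    """Return (rank, word) of the highest-ranked word not in the closed-class list."""
    for rank, word, _ in ranked:
        if word not in closed_class:
            return rank, word
    return None


sample = ("The catalyst reduces the emission of the engine "
          "and the emission of the vehicle.")
ranked = ranked_frequencies(sample)
print(first_open_class(ranked))
```

Run over a whole corpus rather than one sentence, the same two steps yield a ranked word list and the rank at which the first open-class candidate appears.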

If we compare the relative frequency of the first six open-class words in the automotive engineering corpus with their relative frequency in the LOB Corpus, we find that the ratio of the two, the co-efficient of relative frequency, is some guide to the quantitative differences between special-language texts and general-language texts. The very high values of this co-efficient for some words indicate that these words are, perhaps, used almost exclusively in, say, automotive engineering. The high-frequency open-class words identified in Table 1 have a large co-efficient of relative frequency, ranging between 16 and infinity. Terms such as emission and catalyst, together with related terms like autocatalyst, converter, and hydrocarbon(s), have zero frequency in the LOB Corpus but a finite frequency in the automotive engineering corpus; hence the 'co-efficient of weirdness' for these terms, when computed by comparison against the LOB corpus, is infinity (Table 2).




                Surrey Automotive Engineering       Lancaster-Oslo/Bergen
                corpus (369,751)                    corpus (1,013,737)

Word            Absolute freq.   Relative freq.     Absolute freq.   Relative freq.     Co-efficient of weirdness
                (a)              (b)                (c)              (d)                (b/d)
autocatalyst    27               0.01               0                0.00               Infinity
car             1,790            0.48               272              0.03               17.89
catalyst        1,700            0.46               0                0.00               Infinity
control         1,517            0.41               199              0.02               20.88
emission        2,194            0.59               0                0.00               Infinity
engine          2,083            0.56               70               0.01               81.10
hydrocarbon     140              0.04               0                0.00               Infinity
hydrocarbons    290              0.08               0                0.00               Infinity
system          1,795            0.48               298              0.03               16.33
vehicle         1,884            0.51               20               0.00               258.5

Table 10 / 2 : The preponderance of open-class words in special-language literature

(Figures in columns 'b' and 'd' are to two decimal places)
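As a hedged illustration of how such a co-efficient might be computed, the following Python sketch divides the relative frequency of a word in the special-language corpus by its relative frequency in the general-language corpus. The function name and the treatment of a zero denominator as infinity are assumptions made for the sketch; the counts and corpus sizes are those printed in the table. Values computed in this way from the printed counts may differ slightly from the printed co-efficients, whose exact derivation is not given.

```python
import math

AUTOMOTIVE_SIZE = 369_751   # words in the automotive engineering corpus
LOB_SIZE = 1_013_737        # words in the LOB corpus


def weirdness(special_count, special_size, general_count, general_size):
    """Ratio of relative frequencies (special-language / general-language).

    Assumption: a word absent from the general-language corpus but present
    in the special-language corpus is given an infinite co-efficient.
    """
    special_rel = special_count / special_size
    general_rel = general_count / general_size
    if general_rel == 0:
        return math.inf if special_rel > 0 else math.nan
    return special_rel / general_rel


# (absolute frequency in automotive corpus, absolute frequency in LOB)
counts = {
    "car": (1790, 272),
    "catalyst": (1700, 0),
    "engine": (2083, 70),
}

for word, (a, c) in counts.items():
    print(word, weirdness(a, AUTOMOTIVE_SIZE, c, LOB_SIZE))
```

A co-efficient well above unity marks a candidate term, a value near unity marks a word that behaves alike in both corpora, and a value well below unity marks a word that the specialist texts tend to avoid.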

In contrast, for the most frequently occurring closed-class words, like the, of, and, to, a and in, which comprise just under 20% of both the LOB Corpus and the automotive engineering corpus, the co-efficient of relative frequency is close to unity. For other closed-class words, like we, what, and would for example, the co-efficient of relative frequency is far less than unity: it appears that scientists in particular and specialists in general tend to 'suppress' the use of certain closed-class words. This suppression is as much an idiosyncrasy of the specialist texts as is the preponderance of nominals in such texts: a kind of weirdness, a departure from the norm, a departure from the general language of everyday usage. Techniques have also been developed to deal with the problem of lemmatisation in term identification, i.e. detecting patterns for canonical forms rather than particular word forms such as singular and plural forms of the noun.

Future work in term acquisition is likely to focus on the identification and exploitation of semantic relations between terms in text, and on the identification of multi-word terms and term boundaries. A corpus-based approach to terminology acquisition will also help to enrich the linguistic elaboration of terms including their phraseology.

  1. LANGUAGE-INFORMED STANDARDS FOR TERMINOLOGY INTERCHANGE

The identification and elaboration of terms of a specialist enterprise involves specialist writers, terminologists, and standardisation experts. The organisation of terms in terminology data bases further involves computing scientists. As most specialist enterprises are the result of the efforts of a multi-lingual community, it is hardly surprising that the identification, elaboration and organisation of terms does and should include translators and interpreters.

Each of the actors, that is the specialist writer, terminologist, standards person, translator, interpreter and computing scientist, brings to bear the influence of his or her discipline on the identification, acquisition and elaboration of terms and the organisation of terminology data bases. The writer will stress the linguistic and communicative aspects of a term; the standards person will emphasise the need for systematic classification, Platonic or otherwise, for individual terms and for terminology data bases; the translator will highlight interlanguage aspects (however defined); the computing scientist will focus on how to simplify the descriptions of all other actors, so that each individual term and its associated data and, indeed, the entire terminology data base can be stored efficiently and be retrieved on demand. This means that, despite the best efforts of all concerned, each actor, and ultimately the end user, has to be aware of the assumptions of every other actor. One symptom of such tacit understanding is the polysemous nature of certain attributes of terms, and here we are thinking of attributes like 'NOTE' and 'SCOPE'. Another symptom is the existence of variants for describing the same field: 'headword', 'main term', 'lemma', etc. This tacit understanding is also rooted in the fact that some of the attributes of a term refer to philosophical conundrums like 'synonymy' and 'equivalent words'; in a work-a-day context, the actors (and the users) often have no problems when dealing with synonyms or equivalents, but the underlying computational models of most of the current terminology data bases cannot cope with these philosophical conundrums. No wonder that, when dealing with the organisation of terminology in a term base, the national and international standardisation bodies invariably and inevitably have an annexe/appendix that defines the 'terminology of terminology'.

Given that an ever-growing number of individuals and enterprises are formally or informally involved in collecting and organising terminology data bases, on different computing systems, using idiosyncratic linguistic conventions and conceptual schemes, and generally relying on varying degrees of tacit understanding, it is not difficult to understand that using terminology (terms and/or term databases) developed by others is not easy, except when both parties (the giver and the receiver, exporter and importer) have identical computing systems, conventions, schemes and so on.

Due to the strategic importance of terminology, and the fact that it is not easy to import or export terms or term bases, terminology interchange has become an important issue in terminology research.

The emergence of ISO standards like ISO 8879 - the Standard Generalised Markup Language (SGML) standard (see, for instance, Goldfarb 1990) - has enabled the users of different text processing and word processing systems to exchange with each other not only the 'natural text' - essentially letters of an alphabet together with numerals, punctuation marks, etc. - but also additional information that is interspersed among the natural text and that provides, for example, information about how the text is organised in paragraphs, how it is paginated, and type font related data such as style, font size, and so on. Using the SGML standard, data more complex than natural text, like figures, artwork, tables and references to other documents, can be identified and exchanged across different text processing and word processing systems.

The SGML standard and associated software help individual users in 'marking up' a document more generally than a specific word or text processing system can. The major users of SGML are by and large document publishers, e.g. the US Department of Defense, European Union organisations, dictionary publishers, encyclopaedia vendors and so forth.

The terminology community has shown keen interest in using SGML for interchanging terminology. Reinke (1993) has discussed the possibility of the emergence of a 'standard interchange format for terminographic data'. The author discusses three major difficulties encountered in such an interchange: (i) hardware- and software-related difficulties, including problems related to non-standard character sets and different record layouts used in different terminology management systems; (ii) data categories and data elements, particularly the polysemous and vague use of nominals for describing categories; and (iii) the treatment of homographs, variants and 'doublets'. Reinke succinctly discusses the history of various terminology interchanges and briefly describes the proposed SGML standard.

SGML, however, is a publishing layout standard and relies heavily on how we describe a given document - terminology record formats and consequently terminology data bases on a computer system. Certainly, SGML gets away from the cryptic coding of data used in interchanging magnetic tapes containing termbases (the earlier MATER and MicroMATER standards). But what is required before SGML encoding of terms is the articulation of the linguistic and conceptual attributes of terms. Once articulated, an SGML schema - or document type definition in SGML parlance - for describing the linguistic features and conceptual categories related to a term can be devised.

In the next two sections, we describe recent work in lexicography and in knowledge representation (in artificial intelligence) that helps in the articulation of such linguistic and conceptual information associated with a term.

  1. Lexical interchange standards and reusable lexica

Computer-based dictionaries and dictionary-based computers

A dictionary of a language is an archive of a language and to a limited extent a repository of a people's culture: the archive and the repository can be used for teaching and learning the language and for writing and reading written texts and for speaking and comprehending the spoken form of the language. One may argue that a good dictionary comprises data relating to 'words-in-use', say a learners' dictionary or an advanced dictionary for current usage.

In order to communicate a wealth of information to the dictionary user, the lexicographer uses a variety of graphetic information: abbreviations for grammatical, semantic and pragmatic data, different font sizes for the definition as compared to the entry, bold and italic lettering to distinguish (limited) grammatical information, iconic symbols, like arrows to indicate hyponymy, pictures for elaborating on a definition and so on. The interlinkages between an entry and its definition and amongst entries are indicated by specialist type fonts and by the use of different kinds of parentheses. More recently, excerpts from 'representative' text corpora of the language have been used to provide further data about how to use a given entry and to elaborate upon meaning by association: these excerpts again are laid out in a different font and style from the other components of an entry.

Consider the scenario where the lexicographer has to organise his or her dictionary on a computer system. The use of computers for storage, updating and retrieval of data and information requires making the data as explicit as possible to the machine. Mere typographic mark-up of data, for instance, marking the components of an entry such that some are capitalised whilst others are italicised and still others are displayed, either on a visual display screen or on paper, is tantamount to using the computer as a photocopier: simply the reproduction of data on paper from paper. The problem will be further exacerbated if the encoding of data involves distinguishing the linguistic sign of an entry, for example, the headword, definition, grammatical category etc., as slot names in a table, from the values attached to these attributes, i.e. the linguistic content, as the 'fillers' for these slots. The use of computers for lexical storage and retrieval is crucially dependent on how a computer expert understands and consistently simulates, in howsoever limited a fashion, the cognitive processes that underpin the working methods of a lexicographer and the retrieval strategies of dictionary users.

Word processing software, machine-assisted translation systems, 'smart' document management systems and dialogue systems are proliferating, and all require dictionaries in machine-readable format, suitably represented on a computer system. Moreover, dictionary publishers produce a variety of dictionaries, almost all based on a common archive. In order for a publisher to achieve a quick turnaround, computers are used to store the 'common archive' or dedicated lexical data base, as well as programs that can extract data from this archive.

Unfortunately, dictionary publishers and lexicographers, both paper-based and computer-based, use different representations to build this archive. The net result is that there are many different variants of the same 'common archive': the tags used by, say, the Oxford, Collins, Longman and American Heritage dictionaries are all different. Although many publishers use data base management systems for this 'common archive', the full potential of these systems is not realised.

The use of a common data base management system requires building a data-oriented representation of the dictionary, the so-called data model, that helps, in a limited way, a computer expert to understand the various attributes associated with each entry, say, and the associated values of each of the attributes. The data model is an abstraction, literally a simplification, of a lexical entry on the one hand and of the dictionary as a whole on the other hand for the purposes of storage and retrieval. There are few dictionary publishers that really look into this question in any detail, and the result is that extensive revisions of the programs are required if and when the lexicographers wish to add more attributes to an entry, or attach more or fewer values to a given attribute. This is a symptom of a complex of shortcomings: under-representation of data, that is the use of an encoding scheme rather than of a representation schema, and a failure to consistently describe the linguistic/lexical properties of words and phrases.

MULTILEX Project and its contribution to lexical standards

The MULTILEX project has produced a lexical interchange and description format, MLEXD, that is 'language informed', computationally plausible and multifunctional.

MLEXD is 'language informed' in that it is based on authentic data and so-called lexicon-driven theories. This theoretical framework is characterised by a feature-oriented representation of linguistic information. It is possible to use procedures for inferring new information from the pre-stored and represented data by using unification, inheritance and subsumption. The MLEXD approach thus has a synergy with related grammar formalisms such as Generalised Phrase Structure Grammar, Head-driven Phrase Structure Grammar, Categorial Unification Grammar, and Lexical Functional Grammar.
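The operations named here can be illustrated with a generic sketch in Python. It is not the MLEXD procedure itself: feature structures are modelled, as an assumption, as nested dictionaries with atomic values, and the feature names (cat, agr, number, person, lemma) are invented for the example. Unification merges two compatible structures and fails on conflicting values; subsumption tests whether one structure is at least as general as another.

```python
def unify(a, b):
    """Unify two feature structures; return the merged structure, or None on conflict."""
    result = dict(a)
    for feature, value in b.items():
        if feature not in result:
            # New information from b is simply added.
            result[feature] = value
        elif isinstance(result[feature], dict) and isinstance(value, dict):
            # Both values are themselves feature structures: unify recursively.
            merged = unify(result[feature], value)
            if merged is None:
                return None
            result[feature] = merged
        elif result[feature] != value:
            # Conflicting values: unification fails.
            return None
    return result


def subsumes(general, specific):
    """True if every feature of `general` occurs in `specific` with a compatible value."""
    for feature, value in general.items():
        if feature not in specific:
            return False
        other = specific[feature]
        if isinstance(value, dict) and isinstance(other, dict):
            if not subsumes(value, other):
                return False
        elif value != other:
            return False
    return True


# Hypothetical feature structures for illustration.
noun_template = {"cat": "noun", "agr": {"number": "sg"}}
entry = {"lemma": "catalyst", "agr": {"person": 3}}

combined = unify(noun_template, entry)
clash = unify({"agr": {"number": "sg"}}, {"agr": {"number": "pl"}})
```

In this sketch the combined structure carries the information of both inputs, which is one simple way in which new information can be inferred from pre-stored data, while the clash shows how inconsistent descriptions are rejected.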

This feature-oriented representation has helped in the storage and retrieval of a collection of graphic (orthographic), phonological and morphological (GPM) features for describing the complexity of the so-called form aspects of the linguistic sign. The GPM feature data is linked with syntactic and semantic features that might be regarded as representing the combinatory and the content aspects of the linguistic sign. Hence, MLEXD has two main components: the form-related Graphic, Phonological and Morphological Unit and the content-related Lexical Unit. The respective features of each of these units can have attributes that are irrelevant, optional or mandatory depending on the language in question: gender is not a relevant feature of the English noun, whilst it is relevant for, say, Spanish, French and German nouns.
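A language-dependent feature status of this kind can be sketched as follows. The sketch is illustrative only and does not reproduce the MLEXD format: the status table, the language codes and the function check_noun_entry are hypothetical, and treating gender as mandatory (rather than merely relevant) for Spanish, French and German is an assumption made for the example.

```python
from enum import Enum


class Status(Enum):
    IRRELEVANT = "irrelevant"
    OPTIONAL = "optional"
    MANDATORY = "mandatory"


# Assumed status of noun features per language (illustrative values only).
NOUN_FEATURE_STATUS = {
    "en": {"gender": Status.IRRELEVANT, "number": Status.OPTIONAL},
    "es": {"gender": Status.MANDATORY, "number": Status.OPTIONAL},
    "fr": {"gender": Status.MANDATORY, "number": Status.OPTIONAL},
    "de": {"gender": Status.MANDATORY, "number": Status.OPTIONAL},
}


def check_noun_entry(language, features):
    """Return a list of problems found in a noun entry for the given language."""
    problems = []
    for feature, status in NOUN_FEATURE_STATUS[language].items():
        if status is Status.MANDATORY and feature not in features:
            problems.append(f"missing mandatory feature: {feature}")
        elif status is Status.IRRELEVANT and feature in features:
            problems.append(f"irrelevant feature present: {feature}")
    return problems


print(check_noun_entry("en", {"number": "sg"}))
print(check_noun_entry("es", {"number": "sg"}))
```

Keeping the status table separate from the entries means that a new language can be added by extending the table rather than by changing the checking program.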

A number of organisations have adopted the MLEXD approach and data description for building bilingual computer-based dictionaries. However, MLEXD provides scant guidance about terminology, and terminology interchange formats generally do not emphasise the linguistic/lexical properties of words and phrases. An interface between MLEXD and terminology interchange formats (e.g. ISO 12200 - MARTIF - and the Text Encoding Initiative proposals) will be of considerable import to translators and technical authors. It is in this context that the efforts of another EU-sponsored project, TRANSTERM, are of considerable import.

  1. Intelligent Representation of terminology and terminology knowledge bases

A terminology data base of a specialist discipline contains substantial amounts of the knowledge of that discipline. This knowledge has to be encoded (in the tables of a relational data base, for example) before it can be stored in a data base. The encoding methods (and techniques) reflect the architecture and the demands of the software systems that are used to build the terminology data base. This stored knowledge is decoded and decrypted before it can be used by its end-users. The knowledge in a terminology data base is in the 'eyes of the beholder': a human being browsing through terms describing the animal kingdom, for example, will bring to bear his or her prior knowledge of mammals in order to appreciate that 'a cat is a mammal and because it is a mammal it has hair'. This 'knowledge' is encoded in the 'DEFINITION' (or 'NOTE' or 'SCOPE') field of a term base: thus, for a terminology data base, 'a cat is a mammal' is a string of 13 characters (not counting spaces) divided into 5 words and 'mammals have hair' is another string of 3 words comprising 15 characters! Unless very cumbersomely encoded, most of the currently available terminology management systems (TMSs) cannot make the simple inference, based on the two strings of characters mentioned above, that because a cat is a mammal it has hair. If the data associated with individual terms could be represented more explicitly, such that a TMS could make some of the inferences which human beings make whilst browsing through a terminology data base, then such a terminology base would act as a pro-active base of knowledge: the so-called terminology knowledge base (TKB). The difference between a terminology data base (TDB) and a terminology knowledge base can be appreciated by noting that whilst knowledge can be regarded as interpreted information, information is processed data. Thus a TKB comprises processed and interpreted data, or has facilities to interpret the information, whereas a TDB can only process data.
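The contrast between the two views can be made concrete with a small Python sketch. It is a toy illustration, not a description of any existing TMS: the relation tables IS_A and HAS and the function has_property are hypothetical. In the data base view the definitions are strings about which only surface facts can be computed; in the knowledge base view the same content is held as explicit relations, so that the inference about the cat can be drawn mechanically.

```python
# Term data base view: definitions are opaque strings.
definitions = ["a cat is a mammal", "mammals have hair"]


def surface_facts(definition):
    """Return (number of words, number of characters excluding spaces)."""
    words = definition.split()
    return len(words), sum(len(word) for word in words)


# Knowledge base view: the same content as explicit relations (assumed encoding).
IS_A = {"cat": "mammal"}          # concept -> superordinate concept
HAS = {"mammal": {"hair"}}        # concept -> properties stated for it


def has_property(concept, prop):
    """Check a property on the concept itself or on any of its superordinates."""
    seen = set()
    while concept is not None and concept not in seen:
        if prop in HAS.get(concept, set()):
            return True
        seen.add(concept)              # guard against accidental cycles
        concept = IS_A.get(concept)    # move up to the superordinate, if any
    return False


# No single string states that a cat has hair ...
stated_in_one_string = any("cat" in d and "hair" in d for d in definitions)
# ... but the explicit relations allow the inference to be made.
inferred = has_property("cat", "hair")
```

The point of the sketch is that the inference requires nothing more than an explicit record of the relations which the human reader supplies tacitly.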

Limitations of the data models and terminology data bases

Currently available terminology management systems comprise facilities for managing large volumes of terminological data, that is collections of terms and their associated data. This data management currently uses the methods and techniques of data base management systems (DBMS). Usually a proprietary DBMS is used and in some (rare) cases a TMS vendor will provide an in-house DBMS.

The efficient storage and retrieval of data into and from a DBMS depends crucially on how the data was modelled, i.e., simplified such that it can be stored in a database. The modelling is focused on optimising the use of the software and hardware systems and focused on avoiding inconsistencies. A given model allows the organisation of data into hierarchies or networks or relational tables, the latter being one of the popular data modelling techniques in the 1970s. More recently, object-oriented models have been used in order to experiment with the storage and retrieval of terminology collections; in an object-oriented system, for example, each term, and its relationship with others, will be represented as a computational object so that these objects communicate with each other. Object-orientation allows for the description of abstract notions, like shared properties of a number of different objects through some kind of inheritance mechanism - a computer program that helps a user to automatically transfer the properties of a superordinate term, say, to its subordinates and its instances.

The move away from simple files storing terminology collections towards relational databases is an acknowledgement that the data associated with a term is quite complex. This is true whether the data subset is administrative data, like date of entry, terminologist's name, documentation source, or whether it is linguistic data, like grammatical information, foreign language equivalents, deprecated terms, or whether it is conceptual data including concept-oriented information, semantic relations (synonymy, antonymy, part-whole taxonomy, etc.), ontological relations of causality, instrumentality, etc. The complexity lies in the fact that the linguistic and conceptual data involve, respectively, knowledge of language and knowledge of the subject domain, together with some encyclopaedic or general knowledge of the world at large.
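The three subsets of data named above can be pictured as a nested record. The sketch below is one possible layout, not a description of any particular term base: every class name, field name and sample value is an assumption made for illustration, and the administrative values are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AdministrativeData:
    date_of_entry: str
    terminologist: str
    source: str


@dataclass
class LinguisticData:
    grammar: str
    # language code -> foreign language equivalent
    equivalents: Dict[str, str] = field(default_factory=dict)


@dataclass
class ConceptualData:
    definition: str
    # relation name -> related terms
    relations: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class TermEntry:
    term: str
    administrative: AdministrativeData
    linguistic: LinguisticData
    conceptual: ConceptualData


# Placeholder values throughout; none of them come from a real term base.
entry = TermEntry(
    term="neuron",
    administrative=AdministrativeData(
        date_of_entry="(placeholder date)",
        terminologist="(placeholder name)",
        source="(placeholder source)",
    ),
    linguistic=LinguisticData(grammar="noun", equivalents={"fr": "neurone"}),
    conceptual=ConceptualData(
        definition="a nervous system cell whose function is impulse conduction",
        relations={"ako": ["nervous system cell"]},
    ),
)
```

The administrative fields are self-contained, whereas the linguistic and conceptual fields point outwards to other terms and to knowledge of the domain, which is where the complexity described above arises.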

Data models, and associated data structures, may be good for abstracting descriptions and encoding certain well defined data, but when it comes to the representation of knowledge, these models and structures cannot cope. Data models and data structures like relational tables cannot, in principle, cope with the interrelationships, interdependencies and exceptions to relations and dependencies that are implicit in terminology collections. The use of object-oriented models and systems is a natural move towards storing terminology collections. But what is really required is a formalism or scheme in which knowledge can be explicitly represented.

Representation is about making things explicit, about resolving ambiguities and above all, particularly in the context of artificial intelligence, about creating on a computer system a surrogate of a class of things that exist in the real world. This surrogate should not contain ambiguities, either lexical or structural; it should help explicate shared knowledge, since it is not possible to share knowledge between a human and a machine in the same way as is possible between humans; and it should be content-addressable and heavily cross-referenced.

There are a number of schemata for representing terms - the building blocks of the domain - using knowledge representation formalisms such as frames and semantic networks. These schemata are a means of allowing end-users to visualise the rich interconnectivity of terms. The formalisms also have in-built reasoning algorithms for inferring new information from pre-stored data.

Knowledge representation is a sub-discipline of artificial intelligence (AI), and is particularly concerned with the representation of human knowledge on a computer system through the development of knowledge representation formalisms. These formalisms were developed for organising aspects of human knowledge so that this knowledge can be accessed and applied through the use of computer programs. Conventionally, human beings organise knowledge and it is human beings that access and apply the organised knowledge. This ability for organising, accessing and applying knowledge, efficiently and in a timely manner, is a hallmark of intelligent behaviour. Researchers in knowledge representation build computer programs to simulate this aspect of human intelligence.

The best practical example of the use of semantic networks in the construction of a terminology base is that of the Unified Medical Language System (UMLS), which is part of a 'long term effort to build an increasingly intelligent automated system that understands biomedical terms and their interrelationships and uses this understanding to help users to retrieve and organise information from machine-readable sources' (Unified Medical Language System 1991). One component of UMLS is the UMLS Semantic Network, a knowledge source that is used for categorising all concepts stored in a term bank containing over 67,000 medicine-related concepts together with over 220,000 terms, plus terms from the International Classification of Diseases, the American Medical Association, the US Library of Congress, and SNOMED, the Systematised Nomenclature of Medicine. The UMLS is currently available on a trial basis from the US National Institutes of Health.

A number of authors have discussed how to build a terminology knowledge base using knowledge representation formalisms. Ahmad, Rogers and Thomas (1987) have proposed the use of propositional logic, through the logic programming language PROLOG, in building terminology knowledge bases. Skuce (1993) has argued for a frame-oriented approach to the representation of specialist knowledge. Wettengel (1993) has briefly outlined the relevance of an advanced semantic network language, KL-ONE, for building terminology bases. Ahmad (1995) has discussed the use of conceptual graphs for building intelligent terminology bases.

In order to elaborate further on the notion of terminology knowledge bases we will present two case studies. The first case study shows how frames can be used in building a terminology knowledge base, whilst the second shows how the same task can be accomplished through the use of Sowa's conceptual graphs.

Frames and terminological knowledge bases

A frame is essentially a special-purpose data structure for representing "common concepts and situations." The concepts and situations can be (hierarchically) organised in a network of nodes and relations, where the topmost nodes represent "general concepts" and the lower nodes represent specific instances (the specialisation of the general concept). A frame is a knowledge representation formalism based on the idea of a frame of reference (Minsky 1968, 1975). A frame carries with it a set of slots that can represent objects that are normally associated with the subject of the frame. The slots can then point to other slots or frames. That gives frame systems the ability to carry out inheritance and simple kinds of data manipulation.

Consider the following frames of reference for the two main types of the cells that comprise typical animal nervous systems: the neurons and the neuroglial cells:

Slot         Filler
frame name   Nervous System Cell
ako          cell
location     nervous system
function
material

Slot         Filler               Slot         Filler
frame name   Neuron               frame name   Neuroglial Cell
ako          Nervous System Cell  ako          Nervous System Cell
location                          location
function     impulse conduction   function     ancillary
material                          material
The functions of a generic nervous system cell are not defined, and neither is the material that comprises these cells. However, note that the frames for the neuron and the neuroglial cell specify the functions of these cells. The inheritance mechanism that any frame-based system has will help the system to infer that both of these are kinds of cells which are usually found in the central nervous system: the location of the neuron and of the neuroglial cell is not specified explicitly, but the inheritance mechanism will again help in automatically 'filling' in this slot with the value 'central nervous system'.
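The slot-filling behaviour described above can be sketched as follows. This is a minimal illustration and not the BABYLON frame language or any other real frame system: the class `Frame` and its method `filler` are assumptions introduced here. The fillers are taken as they appear in the frame tables, so the inherited location is the table's filler 'nervous system'.

```python
class Frame:
    """A frame: a name, an optional 'ako' (a-kind-of) parent, and slots."""

    def __init__(self, name, ako=None, **slots):
        self.name = name
        self.ako = ako
        # A slot whose filler is None is present but unfilled.
        self.slots = slots

    def filler(self, slot):
        """Return the local filler, or inherit one along the ako chain."""
        frame = self
        while frame is not None:
            value = frame.slots.get(slot)
            if value is not None:
                return value
            frame = frame.ako
        return None


# The generic 'cell' frame is not described in the text; it is included
# here only as the root of the ako chain.
cell = Frame("cell")

nervous_system_cell = Frame(
    "Nervous System Cell", ako=cell,
    location="nervous system", function=None, material=None,
)
neuron = Frame(
    "Neuron", ako=nervous_system_cell,
    location=None, function="impulse conduction", material=None,
)
neuroglial_cell = Frame(
    "Neuroglial Cell", ako=nervous_system_cell,
    location=None, function="ancillary", material=None,
)

# The location slot is empty locally and is filled by inheritance.
neuron_location = neuron.filler("location")
```

Asking for `neuron.filler("function")` returns the locally stored filler, whereas `neuron.filler("material")` returns nothing at all, since no frame in the chain fills that slot.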

The above frame descriptions can be used in the development of a terminological knowledge base. Gillam and Ahmad (1996) have used the BABYLON workbench for building a terminology knowledge base comprising knowledge of visual neurons. BABYLON, a low-cost (c. ECU 40/£35) collection of programs that can be used for building knowledge-based systems, was developed at the German National Research Center for Computer Science (Christaller et al 1992). The above frames, together with many others, were represented inside a BABYLON knowledge base: a hierarchy of these frames is shown below.

Figure 10 / 1 : Figure showing a hierarchical organisation of nervous system cells

Conceptual Graphs and terminology knowledge bases

Conceptual graphs, originally due to Sowa (1984), are a theoretically well-grounded variant of semantic networks. They form a knowledge representation language grounded in linguistics, psychology and philosophy on the one hand, and in data structures and data processing techniques on the other. A conceptual graph consists of concept nodes and relation nodes. The concept nodes represent entities, attributes, states, and events. The relation nodes show how the concepts are interconnected.
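The two kinds of node can be sketched as a small data structure. This is a simplified illustration under stated assumptions: the classes `Concept`, `Relation` and `ConceptualGraph` are hypothetical, every relation is restricted to exactly two concepts, and the printed form is only modelled on Sowa's bracketed linear notation. The sample content reuses the neuron frame from the previous case study.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Concept:
    """A concept node: an entity, attribute, state or event."""
    label: str


@dataclass(frozen=True)
class Relation:
    """A relation node linking a source concept to a target concept."""
    label: str
    source: Concept
    target: Concept


@dataclass
class ConceptualGraph:
    relations: List[Relation] = field(default_factory=list)

    def add(self, source, label, target):
        """Add one relation node together with its two concept nodes."""
        self.relations.append(Relation(label, Concept(source), Concept(target)))

    def concepts(self):
        """Return every concept node mentioned by any relation."""
        nodes = set()
        for relation in self.relations:
            nodes.update((relation.source, relation.target))
        return nodes

    def linear_form(self):
        """Render each relation as '[concept] -> (relation) -> [concept]'."""
        return [
            f"[{r.source.label}] -> ({r.label}) -> [{r.target.label}]"
            for r in self.relations
        ]


graph = ConceptualGraph()
graph.add("Neuron", "function", "impulse conduction")
graph.add("Neuron", "location", "nervous system")
```

Calling `graph.linear_form()` lists one line per relation node, and `graph.concepts()` collects the three distinct concept nodes that the two relations connect.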

The terminology management system, System Quirk, comprises a conceptual graphs editor (CGE) as a potential, knowledge-based alternative for representing terminological data. The CGE has facilities for creating, storing, retrieving, modifying and deleting conceptual graphs. It has an in-built knowledge base of conceptual relationships. The CGE displays conceptual graphs in the graphical notation used by Sowa, on a 'canvas' which is extendible and includes scrollbars.

The CGE enables a user to build a new graph from pre-stored canonical graphs and enables the user to extend an existing conceptual graph. The build function applies formation rules automatically to canonical graphs in the knowledge base to generate a graph for a concept given by the user. The extend function allows the user to extend an existing graph at a given concept node.
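The idea of extending a graph at a given concept node can be illustrated with a deliberately simplified sketch. It is not the CGE's algorithm and does not implement Sowa's formation rules in full: graphs are reduced to sets of (concept, relation, concept) triples, the function names `concepts_of` and `extend` are assumptions, and the 'join' is a plain union of graphs that share the chosen concept. The sample triples are illustrative.

```python
def concepts_of(graph):
    """Return the set of concept labels used anywhere in a graph."""
    return {c for source, _, target in graph for c in (source, target)}


def extend(graph, concept, canonical_graphs):
    """Join onto `graph` every canonical graph that mentions `concept`."""
    if concept not in concepts_of(graph):
        raise ValueError(f"concept {concept!r} is not a node of the graph")
    result = set(graph)
    for canonical in canonical_graphs:
        if concept in concepts_of(canonical):
            result |= canonical
    return result


# An existing graph and a small, hypothetical store of canonical graphs.
rod_graph = {("rod", "location", "retina")}
canonical_graphs = [
    {("retina", "part-of", "eye")},
    {("neuron", "function", "impulse conduction")},
]

# Extending at 'retina' pulls in only the canonical graph that shares it.
extended = extend(rod_graph, "retina", canonical_graphs)
```

After the call, `extended` holds the original rod triple plus the retina triple; the neuron graph is left out because it does not share the node being extended.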

The CGE comes into its own when it displays the myriad of semantic relations, as the relational nodes of the graph, that are used in defining and elaborating the terms of the nervous system cells. The figure below shows a description of rods, a kind of photoreceptor neuron found in the retina, in terms of concept nodes and relations: the former are shown in square boxes (for example, horizontal cells, retina, membrane, dendrite etc.) whereas the relational nodes (for instance, location, attributes etc.) are the circles in the diagram:

Figure 10 / 2 : Figure showing a description of the photoreceptor cells using System Quirk's conceptual graph editor

The conversion of a definition into its equivalent knowledge representation form, either frames or conceptual graphs, is not a straightforward task. However, it is a task that is of strategic importance for building pro-active terminology management systems. With this strategic task in mind, work is currently being carried out at the University of Surrey, the University of Ottawa and a number of other institutions on the automatic conversion of a terminology data base, particularly the definitions of terms, from a data-base oriented format into a knowledge-base oriented format.

  1. CONCLUSION

Much in the way a language-informed design of a lexicon makes it a reusable resource that can be of value to a range of users, a knowledge-informed design of a terminology collection will render a terminology data base, or more precisely a terminology knowledge base, of value to a range of users. Currently, a terminology data base is usually used by pro-active users interested in looking up data associated with a term. A TKB will encourage the use of systematically organised and represented terminology in information retrieval, in technical writing and in the teaching and learning of sciences, where a suitably endowed TMS will offer help based on the inferences drawn by the computer system. A terminology data base that has its linguistic information and specialist-domain knowledge explicitly represented will be ideal for machine translation (MT) applications. The currently available MT systems do not provide any facility for helping their users in dealing with the interaction of specialist knowledge and linguistic information. Indeed, one can argue that this interaction underpins the whole of the translation process, and any support that could be provided, for example through a language-informed terminology knowledge base, would certainly improve the performance of MT systems.

Terminology management experts can certainly contribute to the four major areas of activities in knowledge representation as described by Patel-Schneider (1991):

The terminology management experts can bring to bear their extensive knowledge of structuring concepts and the words and phrases that go with these structures. The terminology management community can benefit from the embodiment of formalisms: whilst a range of views is expressed about conceptual structures and conceptual primitives, it is not possible to evaluate any of these notions in any objective sense. The development of computer-based formalisms for representing, rather than coding, terminology collections will help in assessing how such formalisms cope with semantic 'data'. Much of the data that is used to test the efficacy of knowledge representation formalisms is not "taken from natural language discourse in communicative environments but is elicited in experimental settings" (Reiger 1988:34).