GENERAL FRAMEWORK


IMPACT AND AWARENESS


Present Situation

As has been seen in Section 1: "The Importance of Terminology", terminology represents a vital aspect of monolingual and multilingual information management, a significant economic factor, and an established academic discipline. The findings of the POINTER Project demonstrate that thousands of people across Europe use terminological resources in their daily work. Examples of typical professional groups dependent on terminology are: translators, technical writers, information managers, domain specialists such as engineers, lawyers, doctors, teachers, language engineers, cooks, firemen, civil servants, etc. Lexicographers are interested in terminology as a means of guaranteeing the completeness and consistency of their dictionaries. Information and documentation staff engaged in multilingual thesaurus construction are interested in terminology as a means of classification, while artificial intelligence specialists use terminology as the basis for the conceptual structure of their knowledge-based systems, or to control the vocabulary of those systems.

As regards resources, an extremely large number of groups and individuals across Europe are engaged in producing glossaries, reference works, term lists, databases, and other terminological resources for their own internal use or for dissemination to others. These activities extend far beyond the commercial publishing sector, important though this is. At an infrastructural level, some countries have set up one or more official bodies to promote the interests of terminology, while at the micro-economic level a number of enterprises have instituted corporate language policies and, in some cases, even terminology departments. Indeed, in a number of cases, companies have made large amounts of terminology available via on-line services and/or other channels free of charge and without restriction, in order to increase knowledge of their activities, standardise usage and cement their positions as market leaders. Often, though not exclusively, these enterprises come from the IT&T sector, with its acute awareness of the value of information - two examples are Microsoft and SAP.

Looking at awareness in Europe as a whole, the following points may be observed.

Firstly, terminology and terminology work are generally better anchored as concepts and proportionally better developed in those countries which have a coherent and proactive general language policy. This is often the case in traditionally bilingual or multilingual areas (Switzerland is one good example here), in areas with a strong sense of a separate cultural and linguistic identity (e.g. France, Catalonia and the Basque Country, or South Tyrol), or in areas which are otherwise sensitised to the need for multilingual communication(1).

Secondly, the degree of terminological activity occurring in the private sector varies as a function of the size of the organisation concerned, and the area in which it is active. Generally speaking, the larger the company, the more likely it is to engage in (formalised) terminological work, and to have a corporate language policy. Conversely, SMEs often do not have differentiated functions, procedures or even, in many cases, an awareness of any need. Equally, exporters dependent on translation to place their products, and enterprises with research facilities, are more likely to be aware of the need for, and be involved in, terminological activities. Those who have not been sensitised in this way may well fail to see the problem, or to diagnose any symptoms correctly. However, a few cases were also found of management awareness even in smaller organisations resulting in a "disproportionate" level of terminological activity.

Problems

Nevertheless, it must be clearly stated that, in general, the strategic, commercial and practical value of terminology is basically unrecognised. This applies at all levels of activity (national, regional, European and international) in all sectors and types of organisation (i.e. both public and private), and in all branches of trade and industry. In fact, to a surprising degree, terminology can be said to be "invisible". At a very basic level, many people have not even heard of the concept, while many others have no idea of its importance or of the basic methods associated with it, even though in practice they may be users or even creators. Still others, including many "experts", find it difficult to present a coherent case for terminological resources and terminology work, or to list the benefits and back them up with facts and figures.

One reason for this is that, in practice, terminology work is very often a part-time activity, i.e. terminology appears as an (often unrecorded) component of another activity such as translation, technical or other writing, standardisation, research and development, marketing, etc. Only a small minority of organisations (whether in the private or in the public sector) have appointed terminologists, and of these, a number have since disbanded the relevant departments because of subsequent cost pressures. Where terminology work is performed as part of another activity, it often suffers from the problem that the latter is itself not the core business - and hence not the first priority - of the organisation concerned. On the contrary, most communication functions such as translation and technical writing are equally regarded as additional and unwelcome cost factors. Given today's highly competitive economic environment, there is a strong feeling that these must either be reduced to a minimum or alternatively passed on to the (internal or external) customer in all cases.

More broadly, terminology and related disciplines have an image problem to overcome. The - often subconscious - argumentation seems to be that since everyone speaks at least one language, everyone is a language expert, and hence services in this area are "nice to have" rather than essential. As a result, terminologists (like translators and technical writers) generally occupy a subordinate position in corporate hierarchies, and have to fight constantly for funding. In this context the lack of costing and cost/benefit models discussed at length in Chap. 2.2 "Economic Aspects of Terminology and Terminology Work" is a particular handicap. In practice, this lack of investment means that there may well be no chance of increasing productivity by introducing modern technology and methods, of validating existing work as a prelude to reutilisation, or even of researching and obtaining the most reliable external resources and hence improving quality.

Linked to this is the point that the number of terminologists with formal academic or vocational qualifications in the subject is relatively small, although growing(2). As a result, many people simply have not met a "proper" terminologist, a problem compounded by the lack of data on the industry and market as a whole, and the poor dissemination of that information which is available. Moreover, the "amateurs" performing the work in practice often have little or no training in terminological theory, methods, resources and tools, unless they are recent graduates of related professions such as translation.

Another problem negatively affecting terminology work within enterprises in particular is the fragmentation of language workflows and policies. This is currently being exacerbated by the general trend towards decentralisation and the reduction in corporate overheads and hence facilities. Thus in a number of cases policies on terminology and language in general were not enforced across departmental or divisional boundaries, either because of inertia or because of active opposition. Equally, subcontractors complained in some cases that they were unable to have the work that they had done incorporated into corporate repositories or procedures, either because of a lack of time and resources to do so, or because of the "not invented here" syndrome. Conversely, there is also a distinct tendency to "sit on" the results of terminology work, either because they represent a presumed (departmental or personal) competitive advantage, or because of uncertainties concerning redistribution (e.g. copyright, confidentiality). These problems are exacerbated by the more general lack of market mechanisms for distribution and resale, and of easy technical solutions for exchanging existing resources.

Solutions

The results of the POINTER Project indicate a severe discrepancy between the generally extremely poor level of awareness of terminology and its demonstrable importance as a cultural and economic factor. As a result, concerted measures are needed to redress the situation. These must act at a number of different levels:

Recommendations

Leading on from this, the following specific recommendations can be made (unless otherwise stated, they should be performed by all relevant organisations and individuals within the terminology field in Europe):

General/Infrastructural Measures


European Awareness and Publicity Campaign


Awareness within the Terminology Community


In addition to the general measures mentioned above, specific actions should be directed at improving information flows and the level of knowledge within the terminology community (including for this purpose the many "part-time" terminologists already identified). The most important of these measures are:

Publicity for POINTER Results


As has already been mentioned, the POINTER Project represents a new quality in the organisation and knowledge of the terminology community within Europe. It is therefore important that the results of the Project are disseminated as widely as possible. In addition to long-term infrastructural and other measures proposed in this document, it is important to keep debate and commitment alive in the shorter term, and to increase the circle of participants even further. The POINTER Consortium therefore recommends that it perform the following publicity measures specifically related to the POINTER Project.

ECONOMIC ASPECTS OF TERMINOLOGY AND TERMINOLOGY WORK


Present Situation

The POINTER Project results show a gradual increase in awareness amongst enterprises and other organisations (as both creators and users) of the vital importance of economic and financial aspects of terminology and terminology work(3). However, it is clear that there is a wide gap between the expectations of business and financial decision-makers on the one hand, and the arguments and analyses presented so far by those involved in terminology work on the other, particularly as regards three fundamentally interrelated key issues: costs, prices and benefits. This section therefore concentrates on these crucial subjects, embedding them in their overall economic and business environment, and provides recommendations for further action to be taken by the emerging terminology services industry as a whole, as well as by individuals and organisations involved in terminology work. Other topics discussed include billing models for on-line services, confidentiality and ownership of terminology, and general problems.

Today's corporate environment presents a fragmented picture, with a retreat from terminology work in some enterprises offset by increased investment in others. On the one hand, many terminological activities are currently being cut back or frozen in the drive to concentrate on "core business", while short-termist (as opposed to planned) outsourcing is leading to the fragmentation and loss of central terminological resources. One of the reasons for this is the inability of many in-house terminology services to quantify the cost and benefits of the service they provide, often coupled with a fear that if what they perceive to be the "real" cost of terminology were to become known within the organisation, redeployment - or even unemployment - would follow immediately.

This trend should, however, be balanced against the existence of a small number of well functioning corporate termbases and (total or partial) corporate language concepts. In a number of areas, and most particularly in the telematics industries (e.g. Microsoft, IBM, SAP, Novell), there is even a trend on the part of large companies towards releasing terminology at low cost, or free of charge, to suppliers, customers and the more general public. Such enterprises have a clear vision of the tangible benefits of such terminological activities (although they, too, often appear to have no clear idea of the costs involved), viewing the widespread dissemination of their terminological resources as an aid to competitive advantage (e.g. application/platform standardisation, general corporate image) and greater efficiency of communication. This trend is also mirrored, to a greater or lesser extent, in other industries faced with rapid technological change and a customer base which is becoming increasingly demanding with respect to the quality and quantity of information directed towards it (for example the financial services industry). Indeed, it is possible to distinguish the early signs of another specific trend, with forward-looking, IT-driven enterprises and sectors using terminology as a specific tool for leveraging market gains, whilst many representatives of the more traditional industries now suffering economic decline, particularly in their domestic markets, apparently believe that terminology work has no concrete impact on their economic performance. This trend should be monitored closely to provide the terminology services industry with more detailed information about its own current and future potential market base.

However, there are a number of areas, for instance machine and computer-aided translation, and lexicography, where the economic benefits of terminology and terminology work are highly appreciated, and where costs have been classified to the extent required in these sectors, even though the question of profitability for the terminology creator, rather than the publisher, is sometimes doubtful(4). This chapter concentrates on the bulk of terminology work - where less data is available - and where visible benefits are less clear-cut.

The "mission statement" in Action Line 1 of the new Multilingual Information Society (MLIS) programme recognises that "the work in the field of terminology covers a vast range of activities with important implications for trade, science, the cultural sector and technology, and the implementation of Community decisions, directives and regulations". The efforts of the POINTER Consortium have not only identified a number of economic and financial problems which could hinder the development of the proposed infrastructure, but have also highlighted cost-effective and readily achievable solutions which can be implemented in the near term.

Costs

Problems


In the area of general terminology work in the private and public sectors, terminology creators are increasingly faced with a situation where management - and users - expect that terminological resources are sufficiently mapped and defined to generate understandable and acceptable cost analyses and cost-benefit statements, in the same way as any other item of operating income/expense. The urgent need to be able to produce accurate costing information was identified many years ago by Frederick Brooks in his book "The Mythical Man-Month"(5): "It is very difficult to make a vigorous, plausible, and job-risking defence of an estimate that is derived by no quantitative method, supported by little data, and certified chiefly by the hunches of the managers."

The first problem that has to be solved is one which does not affect terminology alone, but all resources based to a greater or lesser extent on the product of purely intellectual work supported by an increasing number of tools: how can "information" be valued in terms of its cost and benefits? At the same time, as some enterprises and other organisations seek ways to recover (still unspecified) costs, there is now also pressure to price terminological resources to allow them to be offered on the marketplace as commercial products. From a business point of view, exploding(6) input cost components to the degree necessary to state the overall cost to the producer is a useful decision support process to achieve a viable pricing policy. From a marketing point of view, it is difficult to justify a particular market price if the vendor's mark-up is unknown. The requirement to establish the cost of terminology is therefore driven by both the internal and external environments.

Another problem where costs are involved is that there is often an explicit unwillingness by terminology creators to disclose the cost of their data (where known) or even to assist in its analysis. This stems either from a misplaced idea of competitive advantage (discussed later), or from a fear that because they are unable to provide any indicators as to the cost-benefit of their services, such activities will be prime candidates for elimination during enterprise restructuring. Coupled with the fact that few of those involved in terminology have the business skills to develop convincing benefit arguments, such developments are forcing many in-house services, in particular language services which also create terminological data, to cut back on the volume of terminology work being carried out, thus reducing their ability to deliver a quality service.

The same problem also affects terminology service providers who are faced with a need to estimate costs for terminology services (in particular terminology creation) to be provided to direct customers, often as part of multinational projects. The experience of one enterprise cannot be transferred simply to another, and generic costing models which can be readily adapted to the needs of a specific company would enable the terminology services industry to function more successfully in a competitive, cost-conscious environment.

Statements are occasionally heard from industry and public-sector organisations that "our terminology costs DEM/FRF/etc. xxx per term", and whilst these may indeed be relatively accurate for the specific enterprise in individual instances, they cannot provide any generally applicable cost indicators. The true cost of terminology is bound to be affected by a variety of factors, including available resources, the number of terms and languages to be documented, the creation processes involved, data maintenance requirements and the status of the subject field as established or newly emerging. Whilst a relatively low cost may well be expected in small language service enterprises producing small collections of terminology on an ad-hoc basis, the cost of highly prescriptive terminology (in particular for standardisation purposes) can be substantial, especially in small countries or markets. Even in terms of labour alone, the effort required to standardise a single term as part of international committee work may run into thousands of ECU.

However, it should also be remembered that there are no standardised methods of terminology creation. Much terminology is created not for the market, but in the market, where the cost of terminology is often not a significant issue, but merely one of a number of factors. This is clearly in contrast to the interests of those individuals and enterprises dedicated to the terminology creation process.

Solutions


The solutions suggested are outline indications of the possible approaches which could be adopted. Much work still has to be undertaken before any of these models can be put into productive operation, and the objective of this section is to stimulate the development of practical and realistic solutions to the problem of terminology costing which can be implemented by terminology service providers in the private and public sectors throughout Europe.

The most logical approach to developing costing models for terminology would be to attempt to base such models around estimating techniques used in other sectors. Because of its nature, R&D would appear at first sight to be a suitable candidate, but because almost all R&D effort is now project-driven, such techniques cannot simply be transferred to the scenario of continuous terminology work. Another sector from which help might be expected is that of software development. If a terminology project involves developing a fixed number of term entries, each with a fixed number of fields of predefined lengths, then a costing model derived from either established mathematical models (e.g. COCOMO) or bottom-up (e.g. task-based/experience-based) approaches may be suitable: the elaboration of such a model must, however, be tailored to the needs and constraints of the individual development environment. This still leaves us with the problem of how to tackle the bulk of terminology work, i.e. on-going termbank population, frequently covering a variety of subject areas and often being performed as an adjunct to other activities.
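As a rough illustration of the bottom-up (task-based) approach mentioned above for a fixed-scope project, the following Python sketch multiplies the number of term entries by an assumed effort per field. The names FieldTask and estimate_fixed_scope_cost, the field list, the minutes-per-field figures and the hourly rate are all hypothetical placeholders rather than POINTER findings; any real model would have to be calibrated to the individual development environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldTask:
    """One field of a term entry and the assumed effort needed to fill it."""
    name: str
    minutes_per_entry: float  # hypothetical, experience-based figure


def estimate_fixed_scope_cost(entries: int, tasks: list[FieldTask],
                              hourly_rate: float) -> dict[str, float]:
    """Bottom-up estimate: entries x (sum of per-field effort) x labour rate."""
    if entries < 0 or hourly_rate < 0:
        raise ValueError("entries and hourly_rate must be non-negative")
    minutes_per_entry = sum(task.minutes_per_entry for task in tasks)
    total_hours = entries * minutes_per_entry / 60.0
    total_cost = total_hours * hourly_rate
    return {
        "hours": total_hours,
        "total_cost": total_cost,
        "cost_per_entry": total_cost / entries if entries else 0.0,
    }


# Illustrative figures only (assumptions, not measured data).
EXAMPLE_TASKS = [
    FieldTask("term", 2.0),
    FieldTask("definition", 10.0),
    FieldTask("context", 5.0),
    FieldTask("source", 3.0),
]

if __name__ == "__main__":
    print(estimate_fixed_scope_cost(entries=500, tasks=EXAMPLE_TASKS,
                                    hourly_rate=40.0))
```

Such a calculation only holds where the number of entries and the fields per entry are fixed in advance; it says nothing about continuous termbank population.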

One of the most pragmatic approaches to the problem of developing a costing model for continuous terminology work undertaken by dedicated terminology services centres around the defined terminology life-cycle. The life-cycle describes the workflows and processes involved in terminology creation, dissemination, updating, etc. Another source of terminology life-cycle information is the terminology validation procedure, which must identify and verify the processes involved in the creation of terminology. Such procedures are currently planned or under development (e.g. for terminology, written and spoken language resources as part of the cross-sector ELRA project, and in INTERVAL for terminology) and these will provide valuable input for the development of a costing model which, whilst not universally applicable, will give terminology creators a core framework which they can adapt to meet their own specific requirements. One of the most important tasks will be to survey successful terminology service providers to establish the workflows and processes involved, rather than their views on costs (which are likely to be regarded as confidential in any case). Once it is possible to identify a number of processes which appear to be common to the creation elements of the terminology life-cycle, these can then be defined as cost components. Whilst fixed costs are relatively easy to establish (provided that the cost centre involved is willing to acknowledge the applicability of fixed costs to terminology work), variable costs will need more careful scrutiny. This approach should also allow the opportunity (hidden) cost of low-quality/missing terminology to be calculated, as well as providing a basis for establishing the cost-benefit of high-quality terminology.
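The idea of treating life-cycle processes as cost components, each with a fixed and a variable part, might be sketched as follows. This is a minimal Python illustration under stated assumptions: the phase names in PHASES, the function lifecycle_cost and all figures are hypothetical, and a usable list of processes would first have to be established by surveying terminology service providers.

```python
# Hypothetical life-cycle phases; a real list would come from provider surveys.
PHASES = ("creation", "validation", "dissemination", "updating")


def lifecycle_cost(fixed: dict[str, float],
                   variable_rate: dict[str, float],
                   volume: dict[str, int]) -> dict[str, float]:
    """Cost per phase = fixed cost + variable rate x entries processed.

    fixed         -- fixed cost per phase for the period (e.g. tools, overheads)
    variable_rate -- assumed cost per entry handled in each phase
    volume        -- number of entries handled in each phase during the period
    Phases missing from a dictionary are treated as zero.
    """
    costs: dict[str, float] = {}
    for phase in PHASES:
        costs[phase] = (fixed.get(phase, 0.0)
                        + variable_rate.get(phase, 0.0) * volume.get(phase, 0))
    costs["total"] = sum(costs[phase] for phase in PHASES)
    return costs


if __name__ == "__main__":
    # Illustrative figures only (assumptions, not survey data).
    print(lifecycle_cost(
        fixed={"creation": 1000.0, "dissemination": 200.0},
        variable_rate={"creation": 5.0, "validation": 2.0, "updating": 1.5},
        volume={"creation": 100, "validation": 50, "updating": 20},
    ))
```

Separating the fixed and variable parts per phase keeps the structure adaptable: a given terminology creator can replace the phases and rates without changing the calculation itself.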

Another, more theoretical, model proposed uses "Transaction Cost Theory"(7) to anticipate the benefits of implementing terminology controls and a coherent terminology policy. The proponents of this theory argue that the core of all performance exchange processes is the exchange of information, and that terminology work is a means of reducing the volume and costs of information exchange. Under ideal circumstances, the process of performance exchange is preceded by terminology work aimed at standardisation and harmonisation, and also taking account of linguistic usage(8). It is claimed that this model can reduce the communication requirement for all subsequent transactions, and thus cut transaction costs substantially, and may even be the key to making many transactions possible in the first place.
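The claim that prior terminology work reduces the cost of each subsequent transaction can be given a simple break-even reading. The Python sketch below is not part of Transaction Cost Theory as cited; it merely assumes, for illustration, a one-off cost of terminology work and a constant communication cost per transaction before and after that work. The function name and all parameters are hypothetical.

```python
import math
from typing import Optional


def break_even_transactions(upfront_cost: float,
                            cost_before: float,
                            cost_after: float) -> Optional[int]:
    """Number of transactions after which up-front terminology work pays off.

    upfront_cost -- assumed one-off cost of standardisation/harmonisation work
    cost_before  -- assumed communication cost per transaction without it
    cost_after   -- assumed communication cost per transaction with it
    Returns None if the work yields no saving per transaction.
    """
    saving = cost_before - cost_after
    if saving <= 0:
        return None
    return math.ceil(upfront_cost / saving)


if __name__ == "__main__":
    # Illustrative figures only.
    print(break_even_transactions(10000.0, 50.0, 30.0))
```

In practice the saving per transaction is unlikely to be constant, so a figure of this kind is at best an order-of-magnitude indicator.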

Benefits

Problems


Although few people will actually deny that there are benefits of terminology and terminology work, it has proven remarkably difficult in the past to qualify these in such a way as to make them meaningful and attractive to decision-makers in the private and public sectors. To quantify these benefits, in terms of weighting them on some sort of scale, does not appear a feasible solution, as such a process would necessarily be subjective and therefore open not only to abuse, but also to justifiable rejection. It should, however, be possible to assess some of the value components of terminology, which can then be factored in to generate cost-benefit statements. In turn, these can provide an indication of the overall economic benefit of the terminology involved. A number of problems must first be solved, including the following:

Solutions


To produce the classic cost-benefit statement, benefits must be viewed in conjunction with the cost of the service or product involved. It therefore follows that any analysis of benefits must be closely linked to the examination of costs. Ideally, such activities should be performed concurrently, and harmonised to ensure continuous feedback.

Based on the terminology life-cycle defined above for costs, value components in the processes and workflows need to be identified and integrated into a benefits matrix. This generic matrix should be capable of forming an integral part of any costing model developed, and should also be readily adaptable to meet the needs of specific terminology creators. Examples of value components which should be taken into account in this matrix include:

The examination of overall cost-benefit issues would undoubtedly be undertaken most effectively by studying a variety of "real-life" terminology life-cycles in collaboration with independent terminology service providers, SMEs, industrial/IT enterprises, research establishments and public-sector administrations. A blend of practical and academic input would enable common factors and trends to be identified and integrated into the generic matrix.

Because terminology is frequently seen as almost solely translation-oriented, repositioning it as "multilingual information management" will help make its value clearer to enterprises. This in turn will make it easier to embed terminology within the information environment and facilitate market research to obtain more detailed market-oriented data.

Pricing

Problems


There are those who claim that terminology is so expensive that it must be distributed free of charge, but this view pays little regard to the desire of creators to recoup the cost of terminology creation, as well as the often irresistible pressure in today's enterprises to participate in internal enterprise markets: an in-house service providing a "terminology hotline" may now have to charge for this service in addition to project-related contributions. Whilst some generic terminology resources will certainly be distributed free of charge or at cost price (for instance via ELRA and ELDA, the European Language Resources Distribution Agency), the number of terminological collections which will be available at "cost plus mark-up" is bound to increase, for example as a result of Commission policy with regard to its own resources. The selling price of terminology is heavily dependent on the existence of sufficiently accurate costing models. Other factors frequently hindering the development of coherent pricing policies are the lack of substantiated market data and the lack of business and management skills among terminologists. Once these components have been resolved, pricing policies taking into account benefit arguments, anticipated market, etc. can be developed.

Solutions


As pricing is inextricably bound up with costs and benefits, it will only be possible to develop realistic pricing policies once these other problems have been resolved satisfactorily. It must also be remembered that standardised unit costs and prices can only apply to standardised terminological units within a collection: if the elements of a terminology collection are not uniform, this must be taken into account in the pricing model.

The elaboration of pricing policy forms a core element of the terminology marketing process, and experienced members of the terminology services industry must be in a position to provide consultancy services blending in-depth knowledge of the factors involved in terminology creation with state-of-the-art management and marketing skills. In turn, such consultants need to be able to benefit from a Europe-wide support and training network. Business skills training for terminology creators and service providers will also make a vital contribution to the implementation of viable pricing policies.

The market research activities proposed for other areas must also include surveys on customer price acceptance. This in turn depends heavily on the ability to explain the benefits of terminology.

On-line Services/Billing

Problems


Several models are currently under development for the provision of language resources in on-line networks. For example, POINTER has a home page on the World Wide Web at the University of Surrey, and the RELATOR Linguistic Resources Server (available at three sites) could provide a platform for the distribution of language resources by ELRA. Whilst some resources will undoubtedly be made available free of charge, a charge will be levied for others. There is a widespread belief, particularly in the academic community, that resources on the Internet/WWW should be free of charge as a matter of principle. However virtuous and idealistic such a view may be, it overlooks simple economic realities and the desire of many resource-owners and developers (including the European Commission) to achieve a return on the often substantial investments of private and public funds in such resources. This does not mean that terminology with a commercial value will not be distributed free of charge (e.g. Microsoft, Novell, etc.), for instance to achieve specific market gains in the areas of corporate image or standardisation. Rather, both sectors - "freeware" and "against payment" - will co-exist in future. As an information resource, terminology also has an economic value, and once matters of costing and pricing have been settled to the satisfaction of the owner, the option of distribution/sale via on-line services - as opposed to print or CD-ROM - becomes increasingly attractive. There is also an apparently widespread, but nonetheless mistaken, belief that product liability issues can be avoided if a product or service (in this case terminology) is distributed free of charge. This is not the case: any liability involved does not depend on whether a product or service has been the subject of a pecuniary transaction.

One problem which has still to be solved is that of a watertight payment mechanism for on-line distribution. Even today, credit/charge cards are strongly advocated in some quarters as the most efficient method for guaranteeing payment for on-line services. However, this approach ignores two problems which - taken together - demand a different approach to this question.

Availability

Credit/charge card usage varies widely from country to country, and the situation in Europe is particularly diverse. The main problem, however, is one of customer base. If an on-line service provider is selling to the general public, then payment via plastic card may be a viable option once technical/security problems have been resolved. In the case of language resources, and in this instance terminology, however, the prime target market is likely to be corporate and institutional users. Few financial directors will be willing to allow credit cards to be used at the cardholder's discretion for the purchase of language resources which may even be required to be entered as an asset in the company's accounts. However, freelance translators and other small businesses may well prefer this option if they are willing to accept the security risks. This means that the credit/charge card option is not feasible as the sole solution.

Solutions


The most workable solution for billing on-line services will use "intermediate" technology already available. Although it may perhaps be slightly less easy to use (and certainly less newsworthy) than the credit card-based approach or paradigms involving real-time on-line debiting, it must offer the greatest degree of security and reliability. Where financial transactions are involved, it should be remembered that the most successful models combine proven technologies with progressive concepts: where money is at stake, prudence demands that risks be minimised.

Some promising solutions must be ruled out - at least for the time being - because of their limited applicability. The new generation of Minitel systems in France, for instance, uses smart card technology residing on the terminal itself to effect payment. It is conceivable that such a system could be implemented on a global scale, but it could still face the problem of corporate acceptance. Other possible systems, built on the secure electronic multibanking technologies already implemented around DES/RSA security systems (for instance BCS/Multicash in Germany and many other European countries, or the IBOS international banking consortium), would be inappropriate because such customer-bank/interbank networks depend on point-to-point transmission and have yet to be adapted to the different operating and transport environment of the Internet. The model proposed here therefore relies on existing software technologies to ensure secure financial transactions.

The language resources server incorporates a metering system. The metering data (for example per "unit" of a resource) is passed - via an appropriate security gateway - to an accounting package residing off-line on another system. The accounting software then calculates the charges to be debited to the accounts of the users (buyers) and the amounts to be credited to the accounts of the resource owners. Statements are sent to the users and resource owners at regular intervals (e.g. monthly), with the respective bank accounts being debited and credited via the regular banking system. Such a model, based on (global) contracts between the server operator and the user on the one hand, and the operator and the provider on the other, is certainly not innovative and will doubtless disappoint those who believe that the Net is a suitable medium for all financial transactions. The administrative workload involved is unlikely to be substantially greater than that of any 100% on-line system. However, the proposed model is not only secure and reliable, it is also easy and inexpensive to implement, and efficient and cost-effective to operate; above all, it can be installed anywhere in Europe today. True implementation of the quick and cheap European cross-border funds transfer mechanisms demanded by the Commission will only enhance the attractiveness of this model to corporate and institutional users and resource owners.
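
The accounting step of this model can be illustrated with a minimal sketch. It is not a specification of any existing system: the record layout, the account and resource names, and the prices are all hypothetical, and the sketch assumes that the full charge for each unit is credited to the resource owner, with no operator fee. It shows only how metering data might be turned into the periodic debits and credits described above.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class MeterRecord:
    """One metering entry passed from the server to the accounting package."""
    user: str        # account of the buyer (hypothetical identifier)
    resource: str    # identifier of the resource consulted
    units: int       # number of "units" of the resource retrieved


# Hypothetical price list: resource -> (owner account, price per unit).
PRICES = {
    "auto-glossary": ("owner-a", Decimal("0.05")),
    "legal-termbase": ("owner-b", Decimal("0.20")),
}


def settle(records, prices):
    """Return (debits per user, credits per owner) for one billing period."""
    debits = defaultdict(Decimal)
    credits = defaultdict(Decimal)
    for rec in records:
        owner, unit_price = prices[rec.resource]
        charge = unit_price * rec.units
        debits[rec.user] += charge     # amount to appear on the user's statement
        credits[owner] += charge       # amount to appear on the owner's statement
    return dict(debits), dict(credits)


# Metering data collected during one period (illustrative values only).
records = [
    MeterRecord("user-1", "auto-glossary", 40),
    MeterRecord("user-1", "legal-termbase", 10),
    MeterRecord("user-2", "legal-termbase", 5),
]
debits, credits = settle(records, PRICES)
```

Because the calculation runs off-line on batches of metering records, no payment details need to travel over the network; the resulting totals would simply feed the statements and the bank transfers made through the regular banking system.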

Confidentiality/Ownership

Problems


In many enterprises and organisations, much is made of the allegedly confidential nature of terminology developed in-house. They may classify terminology as public, "semi-public" (defined here as terminology which is "less widespread and can be made available on a network under the conditions laid out by the provider"(10)) and confidential. In practice, however, this argument is highly overrated: very little terminology is actually confidential, in the sense that its disclosure outside the organisation will do real damage to its economic interests. In addition, any confidentiality is strictly limited in time(11).

Another argument frequently advanced is that terminology work should not be outsourced due to its confidential nature. Again, this is at best overstated, at worst a thinly disguised attempt to justify the continued existence of in-house services: most companies regularly outsource highly confidential areas (e.g. legal advice, tax and accounting services, advertising campaign planning, market research, product/service research and development) and often publish at least some of the results. The tendency to treat much terminology as confidential hampers information flows and transactions within the enterprise, and may well account for substantial - albeit hidden - costs and efficiency losses.

Another factor frequently observed is the tendency of terminology creators and/or holders in an organisation to keep the existence of their terminology hidden from the rest of the organisation. There are certainly a number of reasons for this, but the prime motive is probably a lack of appreciation within the organisation of the efforts and benefits involved in terminology work, itself a result of the inability of many practitioners to demonstrate the benefits of terminology work in focused business terms. In such cases, the co-ordination deficits within organisations, frequently coupled with a lack of language/terminology policy, disrupt organisational information flows and affect costs and efficiency. This is a particular occurrence of what is actually a much more general problem - "information hiding".

Another problem linked to the areas of confidentiality and ownership is that of copyright and IPR. The general uncertainty as to what is actually covered by copyright, IPR and moral rights, coupled with the implications of new legislation such as the Database Directive, is not only proving to be a barrier to the effective distribution of all forms of terminological and other language resources, it also represents potentially significant cost and financial risk factors with a consequent impact on cost-benefit and pricing.

Solutions


A solution to the disruptive effect not only of alleged terminology confidentiality, but also of "hidden" terminological resources would be to institute corporate terminology audits(12) to identify all terminological resources, terminology creation and distribution processes within a particular organisation. The terminology audit - which might be integrated into a wider corporate knowledge audit - can provide answers to the following questions:

The results of a terminology audit would help streamline and optimise monolingual and multilingual information flows in an organisation, with associated internal and external cost and efficiency gains.

The more general copyright and IPR problems will only be solved by concerted action with the support of the Commission. One of the ELRA workpackages addresses these issues, and all actors in the terminology sector should be encouraged to participate actively in this area, taking into account, for example, the findings of the 1995 KnowRight Conference(13) and the follow-up measures agreed during this conference.

General

Problems


One more general problem that has a significant economic impact on the vast majority of actors in the terminology sector is the disproportionate effort and input cost that small enterprises and institutions must expend to participate in projects co-funded by the European Commission. Given the restricted financial and labour resources available to these SMEs and small institutions, which make up the bulk of active participants in terminology work, the preparation of project proposals is felt by many to be a severe burden, since they are unable to invest the same amount of resources as large enterprises and public-sector institutions, which are better equipped to absorb the costs involved. This lack of differentiation is in itself a barrier to the development of an effective infrastructure for terminology in Europe, as many SMEs and small institutions which could make a very effective contribution are deterred from doing so by the excessive cost. This situation must be remedied as a matter of urgency.

Solutions


The current Commission project proposal procedures effectively prevent many organisations from proposing and/or participating in European projects. We are therefore urging the Commission to reduce the workload and costs involved in project proposals for SMEs and small institutions, enabling them to bid for projects on equal terms with large enterprises and institutions.

Recommendations

In terms of general economic efficiency, it is vital that support be provided in the following areas, addressed in more detail elsewhere in this document:

THE LEGAL FRAMEWORK


Present Situation

The current legal framework for the creation and dissemination of terminological resources and tools is inconsistent and incoherent at the European level, between individual national legal systems, and often even within individual countries. In addition, no legislation has been specifically developed to cover terminology (or, indeed, other language resources such as lexicographic data) at any of these levels. Instead, a significant number of different legal mechanisms apply in most legislatures to terminology, the most important being the various intellectual property rights, the law of unfair competition (and in particular the new sui generis right proposed by the European Commission's Database Directive), contract law, and product liability legislation. This situation is considered as one of the major obstacles to the development of effective distribution and dissemination mechanisms for terminology, and has led to considerable confusion and uncertainty in practice. Although considerable work has been done both within the terminology community and outside it on clarifying the issues involved(14), more needs to be done both to produce complete clarity and to publicise the solutions. In addition, any on-line distribution network will have to address a number of other, more general legal issues such as data protection and security legislation.

Intellectual Property Rights


Intellectual property rights (IPRs) are defined in the Macquarie Australian Dictionary as "the rights of creative workers in literary, artistic, industrial and scientific fields"(15). They are designed to allow the authors of original work in the intellectual field to control the dissemination of their ideas, thus helping ensure a return on investment and hence promoting innovation: "The two fundamental characteristics [...] are that property is created and security is given"(16).

Intellectual property rights are protected by a number of legal mechanisms, the most important of which are copyright, trademark law, patent law, and the law governing unfair competition and trade secrets. Since terminological data (like specialist texts) may combine linguistic and non-linguistic representations (e.g. technical drawings and mathematical symbols), and since it is characterised by complex data structures, a single collection (and its recording, storage, processing and distribution) may well be covered by multiple intellectual property rights. This problem will increase as the move to multimedia leads to the incorporation of other types of information, and particularly graphics, on a regular basis. However, the main IPR normally cited in the context of terminology is copyright.

Since the emphasis and justification for protecting intellectual property lie in the latter's innovative value, objects to be protected under it have to pass a test of originality. In practice, this poses a problem for many terminological resources, since individual words (although not necessarily definitions) are traditionally ruled to be in the public domain(17). In addition, the identity and size of the smallest protectable unit have not been finally and satisfactorily resolved. A further unwelcome complication is that a single collection may contain both copyrightable/copyrighted and non-copyrightable/non-copyrighted material.

An exception to the rule that single words cannot be protected by IPRs is formed by the subset of company and product names, slogans, etc. which are used for marketing (communications) purposes, and to protect market position. Such terms can be protected by trademarks or other so-called "notorious" (i.e. established) marks - a mechanism which explicitly associates a commercial interest with the intellectual/innovative one. A grey area exists where trademarks are adopted as generic product names or even into general language (e.g. "to hoover", or "Sellotape"); in this case, some manufacturers will complain while others will regard such usage as a competitive advantage. However, while many termbases may contain a few such words which are (or may be) protected by trademark legislation, the majority of their header entries will not fall under this ruling.

However, IPR protection may well be available to termbases in which definitions and context material are provided, especially if these and the collections they form part of are designed to have a prescriptive function. In this case, there is often an explicit wish to quote published reference material unchanged, as an objective (or at least external) authority. Thus in standardisation, it is particularly important to refer to standard - and hence already copyrighted - material. While much of such practice will be covered by the principle of fair use, extensive citation, including the cumulative use of individual passages, may present a problem(18).

A further open question is what constitutes originality in a multilingual context. At least one trend in the current debate on copyright going on within the translation community holds that any translation - even an incorrect one (!) - of any unit of information into another language represents a creative and original act and is therefore protected under authors' rights(19). Such a human-centric view of the translation process is bound to come under pressure when confronted with the (unoriginal) realities of machine translation. However, the problem - and the uncertainty it causes - are too important to be decided by the vagaries of case law (which is often extremely contradictory in this area), or by time. Rapid expert clarification would be the preferable solution.

A last uncertain point concerns the assignment by the terminologist of any IPRs that may apply. While employees normally assign the rights associated with their labours to their employers by virtue of their contract, the position of contractors and freelances is not always so clear (unless specifically regulated by contract). This is one of the main uncertainties preventing groups such as translation agencies from releasing their sometimes considerable terminological resources.

Database Directive


Given the uncertainty surrounding copyright and other intellectual property rights, it is fortunate that other mechanisms are, or will shortly be, available. First and foremost there is the European Commission's proposed Database Directive, which is based on a mechanism similar to that used in the law of unfair competition, and which aims at protecting economic interest rather than originality of content. The protection offered is independent of the means of dissemination used (e.g. paper, on-line, CD-ROM, or network). It allows the producer or rights holder to prevent the unfair partial or total extraction of the database contents by downloading or reproduction, as well as the reuse for commercial purposes of the contents. However, this right will only be granted if the resource in question is not protected by copyright, and it is also limited in its period of application. In addition, of course, it has yet to be implemented.

Contract Law


The use of contract law to govern the creation and dissemination of terminological resources is both widespread and increasing, since it allows the parties to the contract to define precisely the conditions under which terminology may be disseminated or reused. Contract law protection may exist either independently of or in addition to any IPR protection; the contract itself does not create any property rights but rather represents an agreement by the signatories to behave in the ways laid down in the contract. As such, it traditionally only binds direct parties to the agreement, which in the context of dissemination for large-scale reuse can prove a potential problem (once a "leak" has occurred in a network, secondary unauthorised distribution may be very difficult to stop in practice). However, the combined use of well formulated contracts, sophisticated access profiles and usage monitoring techniques, target group education (e.g. via a code of good practice) and an appropriate willingness to take legal action against offenders should reduce this problem to a manageable size. (A similar process is to be observed in the software industry.)

Product Liability


A rather different type of problem is posed by product liability legislation. It would seem highly probable that language resources are subject to product liability legislation, and the EC Directive on Product Liability in particular. Given the potentially fatal implications of terminological mistakes in, for example, the aerospace, environmental and pharmaceuticals sectors, this idea does not automatically appear out of place. It would further seem to be the case that this liability applies irrespective of whether such resources are sold, distributed at cost, or offered free of charge. Disclaimers of responsibility are ineffective, except possibly to deter the more timid of users from initiating proceedings. If this is the case, then resource providers are also liable even if their data comes from outside the European Union, since the "importer" is the first and last point in the chain for European purposes. This development would seem to be logically consistent with national rulings that the laws governing defamation and obscenity also apply to on-line networks.

The link between product liability in other areas and terminology is important, too. One way of reducing both the likelihood that situations giving rise to liability will occur, and the damages if they do, is to introduce a quality management system. Such systems are strongly dependent on documentation and communication processes, which in turn depend on controlled and unambiguous terminology. To this extent, product liability legislation has a positive side, too.

Data Protection and Security


Although this was not the subject of the POINTER study (which, in fact, concentrated solely on copyright), the vastly increased opportunities for data manipulation raise a number of legal issues concerning data protection and security. This is made more acute by the current problems surrounding encryption, which is one of the most attractive methods of transporting commercially sensitive information (in the context of a terminology network, this could refer either to terminological data itself or to metering and billing data). The attitudes and policies of the various national governments towards encryption techniques (up to and including total bans) are causing uncertainty about a process viewed by the business community as being of vital importance in an information society. Resolution of these problems is therefore a matter of urgency.

Last but not least, the legal situation creates even more complexity when applied in practice. The fragmentation and uncertainty of the overall legal position, the number of potentially applicable rights and laws and the lack of legal skills among terminologists and their managers, mean that there is great uncertainty and ignorance about what laws apply, to what, and to whom. Furthermore, the problem of applying the legal situation (whatever it may be) to existing data is even greater than applying it to the creation of new resources, since the ad hoc and hurried nature of much terminology work, and the still relatively common practice of omitting source information, means that it is often difficult even to establish ownership or other rights after the event.

This state of affairs is obviously a major obstacle to the creation and in particular to the distribution of terminological resources (and to a lesser extent of information on them) at the European level. Without being assured of adequate protection for their intellectual assets and investments, and without clear knowledge of their potential liabilities, very few serious resource owners will be prepared to make their collections available to a wider public.

Problems

The major legal problems facing the widespread (electronic) dissemination of terminology within a network arise from the fact that it is now extremely easy to store large amounts of data in an easily distributable and manipulatable form, and to give large numbers of people access to it. In addition, the development of cheap and reliable reproduction and manipulation techniques (such as the combination of photocopying, scanning, optical character recognition and text processing) has considerably weakened the position of authors and effectively removed their ability to control the dissemination of their work. As has been shown above, these developments have not yet been adequately reflected in IPR legislation, which arose at a time when controlling the physical form of a document offered an effective mechanism for protecting its intellectual contents. Similarly, product liability legislation originally developed out of a physical product model, which was extended to cover such aspects as documentation, and hence translation. There is thus no mature and concerted body of ideas and rulings on the degree of protection that can be offered to, or to what extent liability can be incurred by, information products in general.

Another complicating factor is the existence of significant variations between IPR legislation and jurisprudence both at the national level (e.g. between the Anglo-American concept of copyright and the continental European "droit d'auteur"), and between national, European and international instances. Also, some legislatures make a distinction between information (i.e. terminology) produced by state organisations "for the public good", and that produced by private sector organisations on a commercial basis. These factors have led to the development of different policies and rights, and to corresponding uncertainty and an inability in practice to enforce them.

Last but not least, the question of the smallest protectable unit has considerable implications for terminology, affecting not only copyright but also licensing and payment issues and models, and product liability. Parallel SGML tagging is one solution to the problem although it is unlikely to find general acceptance as it is extremely resource-intensive and hence expensive. A dual approach which allows both tagging at the field level (e.g. to determine authorship) and at the level of a set of fields (e.g. to establish copyright) needs to be adopted.
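
The dual approach can be pictured with a small data-structure sketch. This is an illustration under stated assumptions rather than a proposed format: the class names, field names and sample entry are hypothetical, and plain Python objects stand in for whatever markup (SGML or otherwise) an implementation would actually use. Each field carries its own author tag, while a separate tag covers a set of fields and records the rights holder.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    """A single data field of a term entry, tagged with its author."""
    name: str      # e.g. "term", "definition", "context"
    value: str
    author: str    # field-level tag: who wrote this field


@dataclass
class RightsGroup:
    """A tag covering a set of fields, e.g. one copyright statement."""
    field_names: frozenset
    rights_holder: str


@dataclass
class TermEntry:
    fields: list = field(default_factory=list)
    groups: list = field(default_factory=list)

    def authors(self):
        """Field-level view: map each field name to its author."""
        return {f.name: f.author for f in self.fields}

    def rights_holders(self, field_name):
        """Group-level view: rights holders whose tag covers a field."""
        return [g.rights_holder for g in self.groups
                if field_name in g.field_names]


# Hypothetical entry: the bare term is untagged at group level,
# while definition and context share one copyright tag.
entry = TermEntry(
    fields=[
        Field("term", "brake caliper", author="terminologist-1"),
        Field("definition", "Assembly that presses the pads against the disc.",
              author="terminologist-2"),
        Field("context", "Inspect the brake caliper for leaks.",
              author="terminologist-2"),
    ],
    groups=[RightsGroup(frozenset({"definition", "context"}), "Example Ltd")],
)
```

In this sketch the individual term carries an author tag but falls under no copyright group, which mirrors the point made earlier that single words are normally in the public domain while definitions and context material may be protected.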

Solutions

The aim of all work on the legal framework for terminology must be to facilitate effective distribution of resources, i.e. it must provide efficient protection against unauthorised distribution/reproduction on the one hand, and security for resource providers and resellers making terminology available on the other.

One positive conclusion to be drawn from the POINTER Project's research into copyright is that - at the conceptual level - current and planned legal provisions would seem to be able to cope both with the challenge of modern technology and with the specific nature of terminology. It is therefore not necessary to create a specific "island solution" for terminology - on the contrary, any similarities between this area and other areas of language engineering need to be exploited wherever possible. What is required, however, is adaptation to ensure that the specific needs and nature of terminological resources and terminology creators are taken into account. In addition, legal mechanisms can (and in future normally will) be supplemented by technical applications for information protection, and by flanking measures such as codes of good practice, to yield workable, practical solutions that can be applied to the many and varied types of terminology.

Last but not least, specific work needs to be done (by joint working groups of lawyers and other specialists) on the underlying legal issues and models involved in the transition to a virtual society. Care should be taken to ensure that solutions are co-ordinated among all areas of the linguistic and information communities facing similar problems (e.g. via ELRA, KnowRight). The emphasis in all cases should be on the rapid provision of practical tools and guidelines for both producers and users, in order to guarantee the growth of the emerging language resources industry. Once developed, these solutions must be actively promoted via awareness campaigns. In addition, accurate and timely information must be provided on a regular basis and in a longer-lasting form (e.g. via ELRA and the proposed ETIS).

Recommendations


1. Thus awareness of terminology and other language issues is increasing in Finland and other countries which have recently joined the EU due to the new status of the national languages concerned as official EU languages, and a vast increase in the need for translation, e.g. of specialist legislation, etc. as a result of entry.

2. To give just one example: Union Latine has calculated that, in France, more than 100 students a year graduate from 4 year (or longer) degree courses with significant terminology input.

3. e.g. [DTT 94]

4. For a detailed discussion of cost and profitability in lexicographical projects, see [Stell 95] and [Kalfon 95]

5. [Brooks 82]

6. When costs are "exploded", they are broken down into successive levels of detail by process, workflow and individual component.

7. For further information, see [Pic/Rei 94], [Koll 94] and [Schwetz 94]

8. For an interesting example of this, refer to the discussion of the CEDEFOP conference terminology project described in Chapter 3.3.

9. Where timeliness is concerned, attention is drawn to the enabling role of ever-improving tools and methods. These can be of major advantage in meeting timeliness requirements, as well as reducing costs for those operating in the market.

10. See Phase II Workpackage 6.1 Report, page 30

11. In general, terminology has no confidentiality or creative value, but merely a commercial value. An excellent example of this is the Danish electronic corpus of texts built by Dr. K. Ahmad at the University of Surrey in under two hours by downloading non-copyrightable material from the Internet. The corpus was then used for terminology extraction.

12. These could be based around the "language audit" models already implemented in several European countries. Language audits are conducted by both in-house teams and external consultants.

13. See [Brunn/Sint 95]

14. e.g. the DTT Symposion [DTT 92], the work of the GOTA Group, the Infoterm Draft Guide to Terminology Agreements and the KnowRight Conference [Brunn/Sint 95]

15. quoted in [Work Gr 93]

16. ibid.

17. cf. [Wright 95] in [Brunn/Sint 95]

18. ibid.

19. cf. Communication by a French lawyer during a podium discussion, ASTII Conference, October 1995