IMPRESS – IMPRession Evidence & Serial-crime profiling System

Previous Research and Track Records – Sheffield, Surrey, & Strathclyde Universities

NLP Group, Dept. of Computer Science, University of Sheffield

The Sheffield NLP group is an RAE-2001 5*
group in a 5 grade Department of Computer Science (DCS), and together with the
DCS Speech group and the Information Retrieval group in the 5* Information
Studies Department, Sheffield has the largest language and information research
grouping (about 70 people with seven professors) in England, and possibly
Britain. Active research areas relevant to this proposal include:

The EPSRC-supported GATE infrastructure for NLE R&D is being used extensively
at Sheffield and abroad (Cun 02, http://gate.ac.uk/).
Dialogue modelling and the modelling of belief structures in time is
another important area of work (Wil 02 and EU-projects AMITIES, COMIC and FaSIL).
Multilingual and multimodal information extraction from the web, newswires,
scientific journals, legal and medical documents has been successfully carried
out (see Hum 00 and May 02, supported by past EPSRC projects GATE GR/M13473 and GATE1
GR/M31699/01, and EU projects NAMIC, AVENTINUS, MUMIS
and UK Government project MUSE). The EPSRC 5-university IRC AKT on Knowledge
Management Project (GR/N15764/01)
Of direct
relevance to the present proposal are the joint Sheffield-Surrey EPSRC SOCIS
project (GR/M89676) and
the UK Government MUSE project, which have linked extended lexical material
directly to the retrieval of structured images (crime images in SOCIS) and
video segments (football sequences in MUSE) retrieved from structured
commentaries (see Pas 02, Pas 03). Sheffield has
also emerged as a leading centre in the development of adaptive IE, having
developed algorithms for wrapper-induction IE, obtaining excellent experimental
results on publicly available corpora (Cir 01). The Amilcare adaptive IE system
is emerging as a reference IE system in the Semantic Web field and is part of
the EPSRC AKT IRC Consortium. Sheffield
has also had considerable experience, in the EU COMIC 5FP project, in adaptive
dialogue management, particularly dialogue content resulting from the fusion of
multi-modal inputs (e.g. speech, text, writing, gesture and vision) into a
single communicative representation.

Yorick Wilks is
Professor of Computer Science at the University of Sheffield and Director of
ILASH, the Institute of Language, Speech and Hearing. He has published numerous
articles and seven books in the areas of Artificial Intelligence and NLP. He is
also (three times) a member of the EPSRC College of Computing, a Fellow of the
European and American Association for Artificial Intelligence, on advisory
committees for the National Science Foundation, and on the boards of some
fifteen AI-related journals. He created the DIDEROT IE system for ARPA in the
US in 1990-93 and was the principal investigator of the GATE / LaSIE / ECRAN /
MUSE / MUMIS / NAMIC IE projects at Sheffield.

Centre for Knowledge Management, Dept. of Computing, University of Surrey

The Centre is within an RAE-2001 6 rated Unit of Assessment (Elec. Eng). The Centre currently has 7 academics, 4 RAs and 20 PhD students. The Centre is amongst the most active terminology and knowledge acquisition centres in Europe. Active research areas relevant to this proposal include:

The
Centre has developed a knowledge management system, System Quirk (with grants
from the EPSRC, EU, and DTI), for investigating the link between
different modalities of communication: namely, text, image, and time-serial
data (Ahm 01a, Gil 02, Sal 98, www.surrey.ac.uk/Quirk). The system has been downloaded by over 900 organisations
worldwide.

In the EPSRC-supported SOCIS
project, images and collateral captions were fused so that one (say, an
image) could be retrieved from the other (the collateral text): terms were automatically extracted from
the collateral texts and used to index images automatically – descriptions of
crime-scene images provided by investigators were used to index crime-scene
images (Scene of Crime Information System – SOCIS, GR/M89041/01). Four UK Police Forces have
evaluated the web-enabled SOCIS system and the results have been encouraging enough for SOLCARA PLC
to explore the commercial viability of the system (www.computing.surrey.ac.uk/SOCIS).

Qualitative,
opinion-related, data is being extracted from (financial) news wires and
correlated with time-serial data (share price movement) to generate buy and
sell signals (EU-IST GIDA Project No. 2000-31123, EU-IST Project ACE
No. 22271).

In a current EPSRC project (Television in Words – TiWO,
GR/R67194/01) audio
descriptions of moving images for the visually impaired are being processed
by exploring narrative structures of films (Sal 03 & www.computing.surrey.ac.uk/TIWO).

The automatic extraction of the conceptual structure of a domain
by examining texts of the domain (EU-IST SALT Project 1999-10951, ESPRIT-LE Projects Interval No. 4002 and Transterm No. 62-055) is being used to track the emergence of concepts in semiconductor
technology, in artificial intelligence and in health care (Alt 02).

Text categorisation is a key interest of the Centre: automatic
terminology extraction is used to facilitate text categorisation (Ahm 01b).

Work
directly related to this project concerns the automatic construction of a
thesaurus of terms, i.e. conceptually organised terms, within sub-specialisms
(forensic science → crime scene photography → footwear impression) (Ahm 03a); this is an example of
data mining on textual sources. The use
of multi-net neural classifier
systems, each classifier specializing in a specific task, will be crucial for
collating images and texts from different sources, as in the case of impression
evidence about a single individual from different scenes of crime (Ahm 03b). This classifier is based on an earlier data
mining project sponsored under a TCS grant (TCS 1940).

Khurshid
Ahmad is Professor of Artificial Intelligence at the University of Surrey and
Head of Department. He has published
over 100 articles and two books in the area of computer-assisted learning and
terminology extraction. He is a member
of the EPSRC College of Peer Review. He
is responsible for the development of System Quirk and is the Principal
Investigator of GIDA and SOCIS,
and of SALT, INTERVAL, and TRANSTERM in the past. He has served as
a visiting professor at the Copenhagen Business School.

Andrew
Salway is Lecturer in Multi-media systems.
He is the Principal Investigator on the EPSRC-sponsored TIWO project. He has given invited lectures in Japan,
Canada, Australia and in the UK on the relationship between moving images and
their textual description. He works
closely with the BBC, ITFC, Tate Gallery, the Royal National Institute for
the Blind and the Banff Centre for New Media (Canada).

Chris
Handy is Tutor in Information Extraction and has previously worked for Surrey
Police. His research is on how forensic
scientists examine, classify and report images. He works in close co-operation with the Met’s Crime Academy and
is exploring the exploitation of the SOCIS system with Solcara PLC with help from the DTI. He
is currently

Forensic Science Unit, University of Strathclyde

The
Forensic Science Unit (FSU) at the University of Strathclyde established the
first UK postgraduate degree course in the forensic sciences about 30 years
ago. The FSU plays a key role in
teaching and research in the area of forensic science. Research in the FSU is focussed toward
forensic issues such as drug profiling and crime scene reconstruction. The FSU
is a founder member of the European Network of Forensic Science Institutes
(ENFSI) and its research is funded by the BBSRC, the EPSRC and pharmaceutical
companies. Academics within the FSU are
practitioners and are authorised for the forensic examination of drugs, dyes,
documents (inks) and criminalistics (physical evidence).

Dr Adrian Linacre is a
Senior Lecturer in the FSU. He
specialises in the application of DNA analysis to forensic examination,
especially in cases of drug trafficking.
He has pioneered the use of grass as evidential material. His research informs the course on DNA
profiling he has established within the FSU.
Dr Linacre has published over 25 papers in international journals and
written 7 book chapters. Dr Linacre is
secretary and treasurer for the Competence Assurance Project of ENFSI, with the
work funded by a grant from the joint action programme of the EU for
co-operation between law enforcement agencies in the Union (EU Joint
Action OISIN 96/636/JAI).

CASE FOR SUPPORT

Introduction

The
SOCIS project provided us with a wholly novel way to link the structured
information in pictures and captions together directly. In IMPRESS we intend to build on this by
investigating a range of data mining (DM) techniques applied to this result, so
as to identify patterns that can be linked directly to individuals, in this
case persistent offenders whose patterns of offence have not been obvious to
investigators. The initial success of
multimedia computing was the integration of text, image, video and audio data
at the level of the bit-stream so that they could be stored, accessed and
processed by the same system. However,
integration at higher levels of abstraction remains an unsolved problem; cf.
the EPSRC’s ‘grand challenge’ of capturing and storing digital human
memories. One aspect of this problem is
the fusion of information from heterogeneous sources, including different kinds
of media, different coding formats, different languages and different points of
view. Crime detection
and prevention is an interesting scenario for exploring these problems. Scene-of-crime officers gather records of
impression evidence (photographs and textual descriptions) which on their own
make little sense, but taken together can be interpreted by investigators to
solve a case. The explanation of the
chain of events and individual(s) that resulted in the crime scene relies on
credible and robust evidence. Accomplished
investigators learn not only to identify individual items of impression
evidence, but learn to correlate the different items collected in different
places/times/modalities. This
assimilation of disparate data to produce one significant item of information
– credible, robust evidence – is an abstraction worthy of the grand
challenge. The IMPRESS project will
attempt to mimic the behaviour of an accomplished investigator.

In
SOCIS we dealt only with the integration of visual and textual information
related to items of impression evidence and their description. IMPRESS will extend the integration of
multimedia crime-scene information at three distinct but interrelated levels
incorporating both impression evidence and Modus Operandi (MO) reports. First, by using established information
extraction techniques we will link the MOs – lengthier, more interpretive free
texts. Second, our focus will be on
crimes where both the MO and impression evidence are available. In this connection, IMPRESS will address the
integration of descriptions and MOs given by different members of the same
police force. Third, IMPRESS will
address the integration of information gathered by different police
forces. Each of the three levels
depends crucially on maintaining and continually updating the inventory of
concepts and terms, and the relationship amongst terms. IMPRESS will not only be able to learn
idiosyncratic words and image fragments, but will learn to correlate the words
and images.

The
development of the IMPRESS system will incorporate findings from these three
strands of research in order to integrate heterogeneous crime scene information
into common machine-executable representations – a task, in this context, akin
to data preparation, which is a crucial precursor to data mining. The project will go on to evaluate a range
of data mining techniques in the IMPRESS system for identifying habitual
criminals from impression evidence and MOs across many crime scenes.

SoCIS
established strong co-operative links with police forces and software companies
and, having favourably evaluated the SoCIS system, they are keen to participate
in the IMPRESS project: accompanying this proposal are 9 letters of support
from 5 police forces, a software company and a university department of
forensic science who will form the IMPRESS Round Table (including three letters
from different departmental heads in the Metropolitan Police). The Round Table will serve a number of
functions including: (i) provision of multimedia crime scene data; (ii) user
requirements and feedback for the IMPRESS system; and, (iii) dissemination of
project results. Five police forces
have already supplied impression evidence for the SoCIS project and this data
will be used in IMPRESS along with MOs from the West Midlands Police; IMPRESS
complements the West Midlands FLINTS system. The distributed nature of data
related to habitual criminals is an ideal test case for EPSRC’s GRID for
supporting e-Science activities and Internet II emerging in the USA.

Background

Property related offences, especially theft
offences, criminal damage, and burglary, accounted for over 75% of all the 5.5
million offences reported in the UK in 2001/02; many of these crimes are
committed by habitual criminals. Large volumes of data related to the modus
operandi (MO) are being collected: the West Midlands Police Force collects 3,000
free-text MOs daily, amounting to 1 million items per year. The Force is currently using an intelligent
workflow system, FLINTS, to combine ‘forensic and physical evidential “hits”’
to display links between criminals and the evidence. FLINTS produces a profile of offenders and of crimes
committed. There are evidence tracking
systems that deal with the movement of crime related exhibits (cf. LOCARD™ Evidence Tracking System) from ‘crime scene through to court’; this
tracking is again performed through a class description, much like that used by
freight handling organisations. The
Metropolitan Police (North London Branch) has developed a profiling system
based on a detailed categorization of crime scenes and suspects/convicted
criminals. There are image-oriented
workflow systems, used mainly by US Police Forces, that help in the
categorization of images according to their intrinsic visual features. FLINTS, LOCARD and imaging workflow systems
can all be used as systems for building the profile of a habitual
criminal.

A future workflow system should be able to
process and fuse together the impression evidence in the two modalities, free
text descriptions and images. Given the
large volume of impression evidence, it is important for such a system to learn
to fuse the different items of impression evidence in a coherent whole. The workflow system should use the
high-performance/high-speed and secure, broad bandwidth data/communications
networks (e.g. Internet II or the GRID).

There are problems related to the variance in the ways different
Police Forces describe and image a scene of crime and indeed there are
variations within a large Force.
Nevertheless, there are similarities in descriptions as manifested by
terminology, reflecting the conceptual structure of forensic science and
criminalistics on the one hand, and the specialized nature of the images of
impressions on the other. There is a need to have a
holistic view of the impression evidence: different types of evidence and two
different modalities. Training of
forensic scientists and officers should reflect this.

An intelligent workflow system for impression evidence, that
pro-actively fuses, and learns to fuse, descriptions and images, will require
active co-operation between the end-users (the Police Forces), the academic
researchers in information extraction, in text and image mining, and in Grid
technologies, and software vendors specialized in working with the Police
Forces.

There
are a number of projects currently sponsored by the EPSRC in areas as diverse
as Psychology, Medicine and Computing (Crime VUS – GR/N09701/01, IXI –
GR/S21526/01, Spectral Retrieval – GR/N33348/01, Freedom To Forget – GRA91364/01), which focus on the
analysis and management of images with some support for language engineering
techniques. In other projects (MIAS – GR/R83972/01, GR/M66233/01) one of the important
research questions is the use of (manually-created) ontologies and thesauri.
The IMPRESS project will benefit from lessons learned in these projects about
the management of image repositories and inform them in return about image-text
interactions and how adaptive learning systems can be beneficial. The research work undertaken
already in building image management systems that beneficially use texts collateral
to the image has been documented (Sri 00).
The key question of inter-indexer variability has not been extensively
discussed (Eak 99). Multiple classifier
systems have been used to deal with properties of an individual image (colours,
shapes, texture) and there is some work on relating images to sound (Rol 02).

Aims and objectives of the technology research

The project’s technological thrust is to build on the achievements in
this domain of the SOCIS project: the direct linking of data in the forensic
domain drawn from both text and image fields, in a way we believe to be
original. The aim is then to consolidate that investment and research
achievement by extending it to the IMPRESS system so that a range of machine
learning techniques, applied to the multi-modal SOCIS data, will allow the
emergence of data clusters that correspond not only, as in other systems, to geographical
areas, but to individuals seen as complex constructions of correlated traits
that will enable that individual to be identified, located and caught. The system will be embellished with a learning
component that learns to correlate the textual descriptions and image features
of individual impressions, and subsequently learns to correlate different types
of impression evidence. Our objectives
are:

1. To develop a computational method for fusing
information from different types of impression evidence, and from accompanying
descriptions and MOs provided by experts; and to investigate how the link
between images and texts can be learnt
by machines.

2. To investigate the similarities and differences
in how experts describe impression evidence and articulate Modi Operandi (MOs);
and to automatically generate a domain ontology from their texts.

3. To develop a system that will take the fused
information and the domain ontology, and then apply existing data mining
techniques in order to detect patterns relating to serial offenders; and to seek
early exploitation of the IMPRESS system by demonstrating prototypes to the
Round Table throughout.

4. To incorporate computer-based evidence methods
within the forensic science teaching curriculum.

The excitement and novelty of the research challenges to be addressed

Our
approach is different from the FLINTS system (West Midlands Police), which
helps to visualise highly structured data.
Our effort will be more general, both as regards techniques and as
regards data (in that it extends over both text and image data) starting with
Kohonen maps (a neural computing method) over all features covered by SOCIS in
an effort to develop individual profiles over data automatically. The information extraction techniques
developed in SOCIS will be used in IMPRESS to process the Modus Operandi data
(free natural-language text) accessible to FLINTS. Furthermore, the storage and processing of images of impression
evidence, together with the automatic correlation with the linguistic
description of the images, will result in an exciting and novel
impression-evidence management system.
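
The Kohonen-map starting point mentioned above can be illustrated with a minimal sketch. It is not the SOCIS or IMPRESS implementation: the grid size, learning schedule and three-value feature vectors are assumptions chosen for illustration, and the vectors are made-up stand-ins for whatever numeric features would be derived from impression evidence and its description.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(samples, rows=3, cols=3, epochs=200, lr0=0.5, seed=0):
    """Train a small rows x cols Kohonen map on equal-length feature vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # One randomly initialised weight vector per grid node.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs          # shrinks linearly towards 0
        lr = lr0 * decay                      # learning rate for this epoch
        radius = max(radius0 * decay, 0.5)    # neighbourhood width on the grid
        for x in rng.sample(samples, len(samples)):  # shuffled pass
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                influence = math.exp(-grid_d2 / (2.0 * radius ** 2))
                for i in range(dim):
                    # Pull the node towards the sample, scaled by grid proximity.
                    w[i] += lr * influence * (x[i] - w[i])
    return weights


# Hypothetical, made-up feature vectors for two groups of similar impressions.
group_a = [[0.10, 0.20, 0.10], [0.15, 0.10, 0.20], [0.05, 0.15, 0.10]]
group_b = [[0.90, 0.80, 0.90], [0.85, 0.90, 0.80], [0.95, 0.85, 0.90]]

som = train_som(group_a + group_b)
nodes_a = {best_matching_unit(som, x) for x in group_a}
nodes_b = {best_matching_unit(som, x) for x in group_b}
print("group A nodes:", nodes_a)
print("group B nodes:", nodes_b)
```

In a sketch of this kind, records that land on the same or neighbouring map nodes are candidates for belonging to the same profile; whether that holds for real impression evidence is exactly what the proposed evaluation would need to establish.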
This novel system will complement existing systems like FLINTS.

Forensic
scientists perform a crucially important real-world task where heuristics
abound and much knowledge is personal knowledge; in some respects, the forensic
scientists behave like other diagnosticians that have been investigated in the AI
literature and in other respects they behave like hypothesis-makers as in law
and cognate subjects. The literature on
content management, knowledge management and multimedia systems constantly
refers to the intellectual challenge of dealing with a mixture of perceptual and
cognitive modalities. Information
fusion, where it helps in the enterprise of forensic intelligence, will throw
light on intelligent information processing in human beings on the one hand and
will help in the development of robust systems on the other. We believe our
approach to building repositories of structured information from otherwise
unstructured data (descriptions comprising strings of text, and images
comprising pixel clusters) is different from that of conventional prescriptive
or highly theoretical approaches in the literature for building knowledge
bases; we have used pre-existing semi-structured repositories of knowledge
including free text, lexical and terminological resources, to maximally exploit
the content in them with our integrated extraction techniques. The proposed research will lead to a unique
combination of advanced and developing technologies enabling the fusion of
information and extraction of key facts from this fusion. An understanding of how experts describe
visual information will ground the technological advances; both in terms of
their expert knowledge and of the special language they use to articulate their
descriptions. Such understanding will
be informed by, and may in turn inform, the training of forensic scientists.

Relevance of research to the TCPD vision, and other beneficiaries

The
proposed research will address problems of crime detection by the intelligent
management of crime scene data, offender identification by predictive
techniques considering crime patterns and person detection by novel data
acquisition and processing techniques.
These problems have been identified in the TCPD Vision (Nov 2001). The three Universities in the project will
help undertake an investigation of the variation in the way in which impression
evidence is described, stored and retrieved.
Strathclyde will provide a forensic science framework for describing and
linking impression evidence and will use the results to inform their teaching
and learning programmes. We will raise
awareness of research in our specialised areas within the forensic science and
the police community through novel manners of interaction and dissemination.

What
we propose is in some respects blue sky research: text mining appears blue
sky in that our premise is that we can process natural language texts
despite language’s deep grounding in human nature and culture, but by carefully
targeting a specialist enterprise such difficulties can be alleviated, as we
and others have demonstrated in dealing with texts in science, medicine,
engineering, and the arts. Computer
systems that can learn patterns, textual or image patterns, and indeed that can
learn to correlate patterns in different modalities, also appear to be blue sky research:
again, it has been demonstrated that by carefully selecting categories of
images and texts, systems can indeed endeavour to learn characteristics of
images and texts. The variation in the
perception of images and texts is an open question in cognitive sciences:
training and experience can reduce this variance.

Nature of the research team and its ability to deliver the research project aims

The specific research advance from the existing EPSRC SOCIS cooperation
(Universities of Surrey and Sheffield, EPSRC Grant No. GR/M89041/01) has
resulted in an integrated working prototype that links the searching of
structured images to the searching of structured captions; this bringing
together of meaningful structures in both language and vision has been a holy
grail of artificial intelligence for over thirty years and we believe we have
made real progress as partners in SOCIS, rather than just using one modality
(e.g. language) as an index for searching another (e.g. pictures or
video). The current work in two
separate projects (Surrey’s TIWO and Sheffield’s MUSE)
was planned in the light of experience gained from SOCIS. Sheffield and Surrey together have brought
extensive and evaluated experience in information extraction, a widely
distributed architecture for modular natural language processing applications
of this sort, and techniques for building and integrating ontologies with
semantic structural representations.
Work has also been successfully carried out in neural network based
learning of images and texts that describe these images. Strathclyde FSU has made original
contributions to the areas of forensic science and criminalistics. The Universities of Surrey and Strathclyde have begun
negotiations about a joint MSc course in Forensic Information Systems. The three universities have established an
excellent rapport with major Police Forces in the UK, and indeed a significant
proportion of their recent work has benefited from such an interaction. This is manifested by expressions of
support; the Manager of the Kent Police’s Forensic Sci. Service and
President-elect of the UK Forensic Science Society (2003-04) has agreed to
chair the project management committee, the IMPRESS Round Table.

In the SOCIS project, Surrey and Sheffield relied on the scene-of-crime
officers for understanding how they interpret images focusing mainly on
footwear impressions and tool-marks.
This input will be broadened and formalized by the Forensic Science Unit, Univ.
of Strathclyde. The Unit will organise
the input of the expertise for the IMPRESS system, and will lead the evaluation
of the IMPRESS system at regular intervals during the lifetime of the
project. The Met’s Crime Academy, part
of the Met’s Specialist Crime Directorate, is helping Surrey in understanding the
effect of training of forensic professionals on the variation in their
description of scene-of-crime images.

Detailed Work Plan

The work will be split into
five work packages. As part of each work package each partner will plan and
manage the work to be completed, prepare progress reports and participate in
meetings (both technical and Round Tables).

Work Package 1: Domain Modelling (10 person months: Surrey 4, Strathclyde 6)

An understanding of the
domain and knowledge of the needs of the users will be made explicit through
techniques of knowledge acquisition such as brainstorming, structured
interviews and case studies. Domain
specific data models such as the National Intelligence Model will be
investigated to determine their utility.

Milestone: User Requirements Specification (Month 6)

Work Package 2: Terminology and Ontology Building (12 person months: Surrey 6, Strathclyde 6)

This work package will test
the hypothesis that experts share a special language and will examine how their
descriptions of impression evidence and their articulations of MOs vary, both
within and across police forces. The
automated generation of a domain ontology will be evaluated as a means of
alleviating problems caused by such variance for multimedia information integration. Methods developed in SOCIS, for automatically
building glossaries and conceptual structures from text corpora, will be
incorporated in IMPRESS. The SOCIS modules will be enhanced to build the
terminology of emerging sub-domains, like ear and clothes impressions,
by using search engines as text providers. The SOCIS text corpus (a collection
of texts with 0.75 million words) will be enhanced by the 1 million plus MO
reports (c. 10 million words) provided mainly by the West Midlands Force. The
SOCIS image repository, provided by the Met’s Crime Academy, will be used to
focus on the key visual features essential for linking crime scenes. This
repository will be expanded under the guidance of Strathclyde and the
Forces. An ontology of the domain will
be semi-automatically constructed based on a domain-specific corpus and tuned
in with the NIM and PITO Common Data Model. This ontology will help address the
issue of terminology variation amongst the police forces by acting as a
translation filter as well as generating term clusters for information
extraction and retrieval purposes.

Milestone: Automatic terminology/ontology component (Month 10)

Work Package 3: Text-Image Data Mining and Multi-modal Data Fusion (16 person months: Surrey 16, Sheffield 12)

Information extraction based
methods for text mining, developed in the SOCIS project, will be tested on
large volume data provided by West Midlands and other Forces. Image and text
analysis techniques developed in SOCIS will be evaluated and expanded further.
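
As a rough illustration of the text-based indexing of images described for SOCIS, the sketch below extracts terms from collateral captions and builds an inverted index from terms to image identifiers. It is a simplified stand-in, not the SOCIS, GATE or System Quirk method: the stopword list, the captions, the image identifiers and the overlap-count ranking are all hypothetical.

```python
import re
from collections import defaultdict

# Small illustrative stopword list (an assumption, not a SOCIS resource).
STOPWORDS = {
    "a", "an", "and", "at", "by", "from", "in", "near",
    "of", "on", "the", "to", "was", "were", "with",
}


def extract_terms(text):
    """Return single words plus two-word terms from the stopword-filtered text."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return set(tokens) | set(bigrams)


def build_index(captions):
    """Map each extracted term to the set of image ids whose caption contains it."""
    index = defaultdict(set)
    for image_id, caption in captions.items():
        for term in extract_terms(caption):
            index[term].add(image_id)
    return index


def search(index, query):
    """Rank image ids by how many query terms their captions share."""
    scores = defaultdict(int)
    for term in extract_terms(query):
        for image_id in index.get(term, ()):
            scores[image_id] += 1
    # Highest overlap first; ties broken by image id for a stable order.
    return sorted(scores, key=lambda image_id: (-scores[image_id], image_id))


# Made-up captions standing in for investigators' descriptions.
captions = {
    "img001": "Footwear impression in soil near the rear window",
    "img002": "Tool mark on the rear door frame",
    "img003": "Partial footwear impression on kitchen floor",
}

index = build_index(captions)
print(search(index, "footwear impression"))  # ['img001', 'img003']
print(search(index, "rear door"))            # ['img002', 'img001']
```

A real system would replace the stopword filter with proper terminology extraction and would combine the text score with image features; the sketch only shows how extracted terms can retrieve images from their descriptions.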
Components of GATE and QUIRK will be adapted for the IMPRESS prototype for text
mining, and image analysis toolboxes, e.g. MATLAB, will be used for image
analysis. Multi-modal data fusion is the integration of disparate data from a variety of
sources and of
differing modalities into a formal framework. The fusion of this data can be
greater than the sum of the parts, providing a rich, homogeneous data
representation. A variety of existing multi-net neural systems and adaptive IE
systems will be evaluated to produce a representation of the data acquired in
WPs 1 and 2.

Milestones: Text-Image data mining components (Month 12), Multi-modal data fusion component (Month 16).

Work Package 4: System Development (33 person months: Surrey 18, Sheffield 9, Strathclyde 6)

A system will be developed
to extract patterns of similar criminal behaviour across crime scenes in order
to identify serial criminals. The IMPRESS system will be based on the SOCIS
web-based system and will be enhanced to include image analysis, multi-modal
processing and a learning capability, and GRID-enablement. The system will
incorporate results from WPs 2 and 3, along with existing data mining
techniques. The prototype will be continually tested and will be released to
the various Police Forces. Standard software engineering methodologies will be
adhered to with a continuous prototyping approach adopted, allowing domain
experts to evaluate, and provide feedback on, system design and functionality.

Milestones: Prototype I (Month 16), Prototype II (Month 20)

Work Package 5: Round Table Meetings and Knowledge Dissemination (13 person months: Surrey 4, Sheffield 3, Strathclyde 6)

IMPRESS will follow three dissemination routes. First, journal publications and peer-reviewed conferences in the areas of information extraction/NLP, data mining, multimedia systems and the bi-annual meetings of the UK Forensic Science Society. Second, Strathclyde, in collaboration with Surrey and Sheffield, will produce a training programme that can be delivered either at universities teaching forensic science or at recognised training establishments. The third, crucial route is the IMPRESS Round Table. A round table (RT), comprising forensic professionals and associated software houses across the UK, will be formed at the outset of the project. The RT will be chaired by a leading forensic professional and will meet at quarterly intervals for the validation of knowledge gathered, and for user testing and evaluation of the IMPRESS prototype. SOLCARA will be on the RT to address questions related to the commercial potential of the IMPRESS system. The Round Table will be expanded to include other Police Forces and software houses in the UK.

Milestones: Web Site (Month 1), Training Programme (Month 22), Evaluation report/Final Report (Month 24)

Justification of Resources

The
University of Surrey will appoint one RA and one PhD student, Sheffield will
appoint an RA, and Strathclyde one PhD student. In addition funding is
requested for a part time Knowledge Dissemination Officer for editorial and
clerical work. The main focus of the Research
Assistants’ work will be on designing and implementing the IMPRESS system. The RAs will first evaluate, and
subsequently implement a system comprising data mining, adaptive information
extraction, neural computing, data fusion and image processing techniques. The existing SoCIS software will be used as
a starting point. The Project Students will undertake domain modelling,
terminology and ontology construction.
Funding is requested for travel to allow the collaborating Universities
to meet to discuss the ongoing work and to enable the regular Round Table
meetings to be held at each University in turn. Funding for technical support
and library provision is requested.

Evidence of connectivity and explanation of the value of the proposed Collaboration
We have formed a consortium that includes five police forces (Kent Constabulary, the Metropolitan Police and the Metropolitan Police Crime Academy, Strathclyde Police, Surrey, West Midlands), a software systems house (Solcara) specializing in police computer systems, and a university forensic science department (London Metropolitan University). The consortium will act both as a source of domain knowledge and as one method of knowledge dissemination. Additional dissemination will be achieved by employing a Knowledge Dissemination Officer, who will provide secretarial, editorial and presentation support for the dissemination of information to interested parties. Solcara PLC will be involved in exploring the commercial potential of the IMPRESS system.

Outline of management structure
The proposed consortium includes UK police forces, three Universities,
and a software systems developer specializing in crime prevention and detection
technologies. The project will be
coordinated by a Round Table chaired by an experienced forensic science
practitioner from one of the five police forces. The Round Table will monitor progress against the
work plan and will suggest changes or modifications accordingly. The universities together will produce quarterly
progress reports for the duration of the project. The software houses will
explore possible exploitation routes with the support of the Universities.

Arrangements for take up (IPR ownership)
During the life of the project, IPR ownership will rest jointly with the Universities
of Surrey, Sheffield and Strathclyde.
This could be transferred, in whole or in part, subject to discussion,
to any interested parties on completion of the project.

References
Ahm 01a: Ahmad, K. (2001). ‘The Role of Specialist Terminology in Artificial Intelligence and Knowledge Acquisition’. In S-E. Wright & G. Budin (Eds.), Handbook of Terminology Management. Amsterdam: John Benjamins Pub. Co. pp. 809-844.
Ahm 01b: Ahmad, K., Vrusias, B. & Ledford, A. (2001). Choosing Feature Sets for Training and Testing Self-Organising Maps: A Case Study. Neural Comp & App, Volume 10, pp. 56-66.
Ahm 02a: Ahmad, K., Bale, T. & Casey, M. (2002). Connectionist Simulation of Quantification Skills. Connection Science, Vol. 14 (No. 3), pp. 165-201.
Ahm 02b: Ahmad, K., Vrusias, B. & Tariq, M. (2002). Co-operative Neural Networks and Integrated Classification. Proc. 2002 Int. Joint Conf. on Neural Networks. Piscataway: IEEE Press. pp. 1546-1551.
Ahm 03a: Ahmad, K., Tariq, M., Vrusias, B. & Handy, C. (2003). Corpus-Based Thesaurus Construction for Image Retrieval in Specialist Domains. Proc. 25th European Conf. on Inf. Retrieval Research (ECIR-03, Pisa, Italy). LNCS 2633. Heidelberg: Springer-Verlag. pp. 502-510.
Ahm 03b: Ahmad, K., Casey, M. & Vrusias, B. Combining Multiple Modes of Information using Unsupervised Neural Classifiers. Proc. MCS 03. LNCS 2709. Heidelberg: Springer-Verlag.
Alt 02: Al-Thubaity, A. & Ahmad, K. (2002). Tracking the Knowledge of Emergent Domains. Proc. 6th Int. Conf. on Inf. Visualisation (London). Los Alamitos: IEEE Comp. Press. pp. 685-690.
Cir 01: F. Ciravegna (2001). (LP)², an Adaptive Algorithm for Information Extraction from Web-related Texts. In Proc. IJCAI-2001 Workshop on Adaptive Text Extraction and Mining, Seattle, 2001.
Cir 02: F. Ciravegna, A. Dingli, Y. Wilks, D. Petrelli (2002). Timely and Non-Intrusive Active Document Annotation via Adaptive Information Extraction. In Proc. of ECAI Workshop on Semantic Authoring, Annotation and Knowledge Markup, Lyon, France, 2002.
Clo 02: P. Clough, R. Gaizauskas, S. Piao and Y. Wilks (2002). METER: MEasuring TExt Reuse. Proceedings of the Association for Comp. Linguistics, July 2002.
Cun 02: H. Cunningham (2002). GATE, a General Architecture for Text Engineering. Journal of Computers and the Humanities, Vol. 36, pp. 223-254.
Eak 99: Eakins, J.P. & Graham, M.E. Content-based Image Retrieval: A Report to the JISC Technology Applications Programme. Image Data Research Institute Newcastle, Northumbria.
Gil 02: Gillam, L. (Ed.) (2002). Workshop on Financial News: Making Money in the Financial Services Industry. Int. Conf. on Terminology and Knowledge Eng. (August 2002, Nancy, France).
Hum 00: K. Humphreys, G. Demetriou and R. Gaizauskas (2000). Two Applications of Information Extraction to Biological Science Journal Articles: Enzyme Interactions and Protein Structures. In Proc. Pacific Symposium on Biocomputing, Honolulu.
May 02: D. Maynard, H. Cunningham, K. Bontcheva and M. Dimitrov (2002). Adapting A Robust Multi-Genre NE System for Automatic Content Extraction. In Proc. of the 10th Int. Conf. on Art. Int.: Methodology, Systems, Applications (AIMSA 2002).
Pas 02: Pastra, K., Saggion, H. & Wilks, Y. Extracting relational facts for indexing and retrieval of crime-scene photographs. Knowledge-Based Systems, Elsevier Science (forthcoming).
Pas 03: Pastra, K., Saggion, H. & Wilks, Y. Intelligent Indexing of Crime-Scene Photographs. IEEE Intelligent Systems, Special Issue on "Advances In Natural Language Processing", vol. 18 (1), pp. 55-61, 2003.
Rol 02: F. Roli, J. Kittler (Eds.). Multiple Classifier Systems: Proceedings (LNCS), Third International Workshop, MCS 2002, Cagliari, Italy, June 24-26, 2002.
Sal 03: Salway, A., Graham, M., Tomadaki, E. & Xu, J. (2003). ‘Linking Video and Text via Representations of Narrative’. AAAI Spring Symposium on Intelligent Multimedia Knowledge Management, Palo Alto, 24-26 March 2003.
Sal 98: Salway, A. & Ahmad, K. (1998). Talking Pictures: Indexing and Representing Video with Collateral Texts. In D. Hiemstra, F. de Jong and K. Netter (Eds.), Twente Workshop on Lang. Tech. in Multimedia Info. Retrieval, December 7-8, 1998. Enschede: Univ. Twente. pp. 85-94.
Sri 00: Srihari, R.K. & Zhang, Z. Show&Tell: A Semi-Automated Image Annotation System. IEEE MultiMedia 7(3): 61-71 (2000).
Ste 01: M. Stevenson and Y. Wilks. The Interaction of Knowledge Sources in Word Sense Disambiguation. Journal of Comp. Linguistics, 2001.

IMPRESS Detailed Work Plan – Gantt Chart
WP1: Domain Modelling. Milestone: User Requirements Specification
WP2: Terminology and Ontology Building. Milestone: Automatic terminology/ontology component
WP3: Text-Image Data Mining and Multi-modal Data Fusion. Milestones: Text-Image data components, Multimodal data fusion component
WP4: System Development. Milestones: Prototype I (Month 16), Prototype II (Month 20)
WP5 (a): Knowledge Dissemination. Milestones: Web Site (Month 1), Training Programme (Month 22), Evaluation Report/Final Report (Month 24)
WP5 (b): Round Table Meetings