Image-Text Combinations
Dr Andrew Salway (University of Surrey) and Dr Radan Martinec (University of the Arts, London)
The combination of different media types is a defining characteristic of multimedia, yet much research has concentrated on understanding the semantics of media types individually. Recently, systems have been developed to process correlated image and text data for tasks including multimedia retrieval, fusion, summarization, adaptation and generation. We argue that the further development and more general application of such systems require a better computational understanding of image-text combinations. In particular, we need to know more about the correspondence between the semantic content of images and texts, and the ways in which images and texts are used together.
We are developing a new area of multimedia research focused on image-text combinations. From a computational perspective, our aim is to develop a general theory to be applied in the development of multimedia systems that process correlated image and text data. Our theoretical framework describes how visual and textual information combine in terms of semantic relations between images and texts. Our classification of image-text relations is grounded in aesthetic and semiotic theory and has been developed with a view to automatic classification. We are currently evaluating the application of these image-text relations in a variety of multimedia systems.
The following technical report summarizes our work as of late 2004:
Salway and Martinec (2005), Some Ideas for Modelling Image-Text Combinations, Dept. of Computing Technical Report CS-05-02, University of Surrey.
A paper will appear in the journal Visual Communication, October 2005.