SuperinformationhighwayS and IntelliMedia 2000+: bringing together humanities, science, and engineering

Paul Mc Kevitt
Center for PersonKommunikation (CPK)
Institute of Electronic Systems (IES)
Aalborg University
Fredrik Bajers Vej 7-A5, DK-9220, Aalborg Ø, DENMARK
pmck@kom.auc.dk


1 Introduction
A major force is driving the Humanities and the Sciences/Engineering towards each other in the area of integrated language and vision processing by machines: SuperinformationhighwayS. This force is the ability to have information in text, voice, sound, graphic and video forms available within minutes at local and remote sites through interfaces like Netscape (Note 1) and search engines like AltaVista (Note 2). People will be able to pose queries for information about, say, stocks and shares, good restaurants in a city, or their bank account by speaking the query to the machine. In turn, they will be able to direct the machine's graphical display of the information it presents in response. Visual information comes in many formats, from diagrams to videos, as does language information, both natural and formal. The Sciences/Engineering are more concerned with methods for transmitting, processing, representing and retrieving information across networks, while the Humanities are more concerned with the information itself. Slethei (1998) also makes this point about the closing of the gap between the two cultures, especially with respect to spoken dialogue systems (http://www.hd.uib.no/AcoHum/abs/Slethei.htm).

The area of MultiMedia is growing rapidly internationally, and it clearly has different meanings from different points of view. MultiMedia can be separated into at least two areas: (1) traditional MultiMedia and (2) Intelligent MultiMedia (IntelliMedia). The former is what people traditionally think of as MultiMedia, encompassing the presentation of text, voice, sound and video/graphics, possibly with touch and virtual reality linked in. However, the computer has little or no understanding of the meaning of what it is presenting. IntelliMedia, which involves the computer processing and understanding perceptual input from speech, text and visual images and reacting to it, is much more complex. It draws on technologies from the Engineering side (spoken language processing, natural language processing, image processing, Computer Science and Artificial Intelligence) and from the Humanities side (Linguistics, Cognitive Science, Psychology and studies of the mind) (see Mc Kevitt 1994, 1995/1996, 1997). This is the newest area of MultiMedia research. It has seen an upsurge over the last two years, and most universities internationally do not have all the necessary expertise locally. Traditional and Intelligent MultiMedia education and research are found in the Science/Engineering and Humanities/Humanistic Computing Departments at Aalborg University, Denmark.

2 IntelliMedia 2000+
The Institute for Electronic Systems at Aalborg University, Denmark has expertise in the area of IntelliMedia and has already established an initiative called IntelliMedia 2000+, funded by the Faculty of Science and Technology (FaST). IntelliMedia 2000+ coordinates research on the production of a number of real-time research demonstrators exhibiting examples of IntelliMedia applications, and education in the form of a new Master's degree in IntelliMedia. An important emphasis is the integration of research and education in IntelliMedia. IntelliMedia 2000+ is coordinated from the Center for PersonKommunikation (CPK), which has a wealth of experience and expertise in spoken language processing, one of the central components of IntelliMedia, and also in radio communications, which would be useful for mobile applications (CPK Annual Report, 1998). More details on IntelliMedia 2000+ can be found on WWW: http://www.kom.auc.dk/CPK/MMUI/. IntelliMedia 2000+ involves four research groups from three Departments within the Institute for Electronic Systems: Computer Science (CS), Medical Informatics (MI), Laboratory of Image Analysis (LIA) and Center for PersonKommunikation (CPK). They focus on platforms for integration and learning, expert systems and decision taking, image/vision processing, and spoken language processing/sound localisation respectively. The first two groups provide a strong basis for methods of integrating semantics and conducting learning and decision taking, while the latter two focus on the two main input/output components of IntelliMedia: vision and speech/sound.

3 Education
Teaching is a large part of IntelliMedia 2000+ and two new courses have been initiated: (1) MultiModal Human Computer Interaction, and (2) Readings in Advanced Intelligent MultiMedia. MultiModal HCI includes traditional HCI, which teaches methods for developing optimal interfaces through the layout of buttons, menus and form filling on screens, but also covers advanced interfaces using spoken dialogue and gesture. The course on Readings in Advanced Intelligent MultiMedia is new and includes active learning, where student groups present state-of-the-art research papers and invited guest lecturers present their research from IntelliMedia 2000+. A new Master's Degree (M.Eng./M.Sc.) has been established and incorporates the courses just mentioned as core modules of a one-and-a-half-year course on IntelliMedia taught in English. Each semester has a theme associated with it and involves both project work and courses. Semester I focusses on Basic methods, Semester II on Advanced methods and Semester III on a Master's Thesis in Intelligent MultiMedia. The latter semester has no courses. The Master's course is open to non-Danish and Danish students. All courses are given in English and the thesis can be written in English or Danish. Each student is graded according to internationally recognised grading schemes. More details can be found on WWW: http://www.kom.auc.dk/ESN/masters.

The emphasis on group-organised and project-oriented education at Aalborg University (Kjaersdam and Enemark 1994) is an excellent framework in which IntelliMedia, an inherently interdisciplinary subject, can be taught. Most courses involve students doing project work in groups in the unique Aalborg style. Each semester the students work together in groups of three to four on self-chosen projects, and this has proven to give students better opportunities after their education. Approximately 50% of the courses have individual examinations, and all courses can be examined as part of an oral examination based on the prepared project report. Groups can even design and implement a smaller part of a system which has been agreed upon between a number of groups. It is intended that there be a tight link between the education and research aspects of IntelliMedia 2000+, so that students can avail of the software demonstrators and platforms developed and can also become involved in developing them. The Master's course is now in its second year with over 20 students, half of whom are from abroad, and a number of student projects related to IntelliMedia 2000+ have already been completed (Bakman et al. 1997a, 1997b, Nielsen 1997, Tuns and Nielsen 1997). Currently five student groups are enrolled in the Master's, conducting projects on multimodal interfaces, a pool-game trainer, a virtual steering wheel, audio-visual speech recognition, and face recognition. Occasionally, a Lifelong Learning course is given for returning students of Aalborg University who wish to continue their education. This course is a compression of the core IntelliMedia courses.

4 CHAMELEON
The results from the four research groups of IntelliMedia 2000+ have hitherto been developed largely within the groups themselves. Our goal, however, was to establish collaboration among the groups in order to integrate their results into IntelliMedia demonstrator systems and applications. Some results would be integrated in the short term, since some of the technology-based modules are already available, and others in the longer term as new results become available. The demonstrator would be a single platform called CHAMELEON, with a general architecture of communicating agent modules that process inputs and outputs from different modalities, each of which could be tailored to a number of application domains. The first prototype of the CHAMELEON software and hardware platform has now been developed. It demonstrates that existing software modules for (1) distributed processing and learning, (2) decision taking, (3) image processing, and (4) spoken dialogue processing can be interfaced to a single platform and act as communicating agent modules within it.

CHAMELEON is independent of any particular application domain and the various modules can be distributed over different machines. Most of the modules are programmed in C++ and C. CHAMELEON demonstrates that (1) agent modules can receive inputs, particularly in the form of images and spoken dialogue, and respond with the required outputs, (2) individual agent modules can produce output in the form of semantic representations, (3) the semantic representations can be used for effective communication of information between different modules, and (4) various means of synchronising the communication between modules can be tested to produce optimal results. More details on CHAMELEON are found in Broendsted et al. (1998) and Mc Kevitt (1998) (http://www.hd.uib.no/AcoHum/abs/McKevitt-demo.htm).
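The idea of agent modules exchanging semantic representations can be illustrated with a small sketch. It is written in Python for brevity rather than in the C++ and C of the actual modules, and everything in it is an assumption made for illustration: the Frame structure, the Hub class, the module names and the restaurant query are hypothetical and are not taken from the CHAMELEON implementation. The sketch shows only the general pattern: a speech module emits a frame, a dialogue module interprets it and emits a new frame, and a display module acts on that frame.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Frame:
    """Hypothetical semantic representation passed between modules."""
    sender: str                      # module that produced the frame
    intention: str                   # e.g. "query" or "instruction"
    content: Dict[str, str] = field(default_factory=dict)


# A handler consumes one frame and returns (frame, recipient) pairs to forward.
Handler = Callable[[Frame], List[Tuple[Frame, str]]]


class Hub:
    """Hypothetical hub that routes frames and logs them in arrival order."""

    def __init__(self) -> None:
        self.modules: Dict[str, Handler] = {}
        self.log: List[Frame] = []

    def register(self, name: str, handler: Handler) -> None:
        self.modules[name] = handler

    def post(self, frame: Frame, to: str) -> None:
        # Record the frame, deliver it, then forward whatever the module emits.
        self.log.append(frame)
        for reply, recipient in self.modules[to](frame):
            self.post(reply, recipient)


def dialogue_manager(frame: Frame) -> List[Tuple[Frame, str]]:
    # Turn a spoken query into an instruction for the display module.
    if frame.intention == "query":
        instruction = Frame(
            sender="dialogue",
            intention="instruction",
            content={"show": frame.content["topic"],
                     "where": frame.content["city"]},
        )
        return [(instruction, "display")]
    return []


rendered: List[str] = []


def display(frame: Frame) -> List[Tuple[Frame, str]]:
    # Stand-in for graphical output: record what would be shown.
    rendered.append(f"showing {frame.content['show']} in {frame.content['where']}")
    return []


hub = Hub()
hub.register("dialogue", dialogue_manager)
hub.register("display", display)

# Stand-in for the speech module: the frame a recogniser might emit.
hub.post(Frame("speech", "query", {"topic": "restaurants", "city": "Aalborg"}),
         to="dialogue")
print(rendered)  # ['showing restaurants in Aalborg']
```

Because the modules in this sketch see only frames and never each other's internals, a handler could in principle be replaced by a proxy to a process on another machine, which is one way to read the claim that modules can be distributed. The log of frames kept by the hub is likewise one simple place where different synchronisation strategies could be compared.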

5 Conclusion
SuperinformationhighwayS are forcing the merging of the Humanities and Sciences/Engineering in terms of processing, integrating, representing and accessing information in multiple modalities, including at least text, voice, sounds and images/videos (Intelligent MultiMedia). Information from many cultures will be input in the form of natural and formal speech and language, with images ranging from simple diagrams right up to videos. The Humanities will be concerned more with the content of the information being passed, while the Sciences/Engineering will be more concerned with processing, representation and transmission. As Horgan (1996) points out, much of the future of science for 2000+ will lie in the integration and engineering of existing theories, models and systems, with convergence. Aalborg University is well equipped in terms of research expertise and education to contribute to IntelliMedia 2000+, which will be important for the future of international computing and media development. An important emphasis is the integration of research and education in IntelliMedia. We believe IntelliMedia will also throw light on the numerous developments in Computer and Cognitive Science (CS) (O Nuallain 1995 and O Nuallain et al. 1997). IntelliMedia 2000+ (http://www.kom.auc.dk/CPK/MMUI/) will ensure the position of Denmark and Europe in the construction of the future of SuperinformationhighwayS.


Acknowledgements
We take this opportunity to acknowledge support from the Faculty of Science and Technology, Aalborg University, Denmark and from the European Union (EU) under the ESPRIT (OPEN-LTR) Project 24 493. Paul Mc Kevitt would also like to acknowledge the British Engineering and Physical Sciences Research Council (EPSRC) for their generous support under grant B/94/AF/1833 for the Integration of Natural Language, Speech and Vision Processing (Advanced Fellow), and LIMSI-CNRS, Orsay, France, where he was a Visiting Professor whilst completing this abstract.

Notes:
1 Netscape is a trademark of Netscape Communications Corporation.
2 AltaVista is a trademark of Digital Equipment Corporation.
3 Paul Mc Kevitt is also a British Engineering and Physical Sciences Research Council (EPSRC) Advanced Fellow at the Department of Computer Science, University of Sheffield, for five years under grant B/94/AF/1833 for the Integration of Natural Language, Speech and Vision Processing.

7 References
Broendsted, T., P. Dalsgaard, L.B. Larsen, M. Manthey, P. Mc Kevitt, T.B. Moeslund, K.G. Olesen (1998) A platform for developing Intelligent MultiMedia applications. Technical Report R-98-1004, Center for PersonKommunikation (CPK), Institute for Electronic Systems (IES), Aalborg University, Denmark, May.
Bakman, Lau, Mads Blidegn, Thomas Dorf Nielsen, and Susana Carrasco Gonzalez (1997a) NIVICO - Natural Interface for VIdeo COnferencing. Project Report (8th Semester), Department of Communication Technology, Institute 8, Aalborg University, Denmark.
Bakman, Lau, Mads Blidegn, and Martin Wittrup (1997b) Improving human computer interaction by adding speech, gaze, tracking and agents to a WIMP based environment. Project Report (9th/10th Semester), Department of Communication Technology, Institute 8, Aalborg University, Denmark.
Baekgaard, Anders (1996) Dialogue management in a Generic Dialogue System. Proceedings of the Eleventh Twente Workshop on Language Technology (TWLT), Dialogue Management in Natural Language Systems, 123-132. Twente, The Netherlands.
Dalsgaard, Paul and A. Baekgaard (1994) Spoken language dialogue systems. In Prospects and Perspectives in Speech Technology: Proceedings in Artificial Intelligence, Chr. Freksa (Ed.), 178-191, September. Muenchen, Germany: Infix.
Horgan, John (1996) The end of science: facing the limits of knowledge in the twilight of the scientific age. Reading, Mass.: Addison-Wesley (Helix Books).
Mc Kevitt, Paul (1994) Visions for language. Proceedings of the Workshop on Integration of Natural Language and Vision processing, Twelfth American National Conference on Artificial Intelligence (AAAI-94), Seattle, Washington, USA, August, 47-57.
Mc Kevitt, Paul (Ed.) (1995/1996) Integration of Natural Language and Vision Processing (Vols. I-IV). Dordrecht, The Netherlands: Kluwer-Academic Publishers.
Mc Kevitt, Paul (1997) SuperinformationhighwayS. In "Sprog og Multimedier" (Speech and Multimedia), Tom Broendsted and Inger Lytje (Eds.), 166-183, April 1997. Aalborg, Denmark: Aalborg Universitetsforlag (Aalborg University Press).
Mc Kevitt, Paul (1998) CHAMELEON and the IntelliMedia WorkBench: integrating research from the humanities, science and engineering. In WWW and printed Proceedings of the International Conference on The Future of the Humanities in the Digital Age: problems and perspectives for humanities education and research. University of Bergen, Bergen, Norway, September (http://www.hd.uib.no/AcoHum/abs/McKevitt.htm).
Nielsen, Joergen (1997) Distributed applications communication system applied on IntelliMedia WorkBench. Project Report (8th Semester), Department of Medical Informatics and Image Analysis (MIBA), Institute 8, Aalborg University, Denmark.
O Nuallain, Sean (1995) The search for mind: a new foundation for cognitive science. Norwood, New Jersey: Ablex Publishing Corporation.
O Nuallain, Sean, Paul Mc Kevitt and Eoghan Mac Aogain (Eds.) (1997) Two sciences of mind: readings in cognitive science and consciousness. "Advances in Consciousness Research" (AiCR 9). USA: John Benjamins.
Slethei, Kolbjørn (1998) Can education bridge the gap between the two cultures? In WWW and printed Proceedings of the International Conference on The Future of the Humanities in the Digital Age: problems and perspectives for humanities education and research. University of Bergen, Bergen, Norway, September (http://www.hd.uib.no/AcoHum/abs/McKevitt.htm).
Tuns, Nicolae G. and Thomas Dorf Nielsen (1998) Experimenting with phase web as AI support in the CHAMELEON system. Project Report (9th semester), Department of Computer Science, Institute 8, Aalborg University, Denmark.