Task: Semantic analysis of text is still in its embryonic stages: most efforts at deep analysis have been abandoned in favour of shallow parsing and statistical analysis. What is needed is an approach to semantic analysis that rests on a fairly general theory of semantics, together with computational processing techniques for applying it to text. The task is to investigate different ways of defining the semantics of text and to establish computational methods for identifying them automatically.
Task: Information Retrieval (IR) methods have proven to be a reliable technique for classifying documents. Many different IR methods are available, yet there is only limited knowledge about the conditions under which each performs best. In addition, greater linguistic knowledge helps the performance of an IR classifier. Furthermore, research generally shows that combinations of IR classifiers tend to be more accurate than any single classifier.
Project Aim: To construct a suite of IR classifiers that incorporate supplementary linguistic knowledge, and to determine the scenarios in which each performs best and how effective combined classifiers are.
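One simple way of combining classifiers is majority voting, sketched below in Python. This is only an illustration under stated assumptions: the keyword-based classifiers, the labels and the function names (majority_vote, keyword_classifier) are hypothetical stand-ins, not the classifiers the project would build, and voting is just one of several possible combination schemes.

```python
from collections import Counter
from typing import Callable, Sequence

# A classifier is modelled here as any function from a document to a label.
Classifier = Callable[[str], str]


def keyword_classifier(keyword: str, label: str, default: str) -> Classifier:
    """Build a toy classifier (hypothetical): predict `label` if `keyword` occurs."""
    def classify(document: str) -> str:
        return label if keyword in document.lower() else default
    return classify


def majority_vote(classifiers: Sequence[Classifier], document: str) -> str:
    """Return the label chosen by the most classifiers.

    Ties are resolved in favour of the label that was voted for first.
    """
    votes = [clf(document) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Three toy classifiers standing in for real IR classifiers.
classifiers = [
    keyword_classifier("goal", "sport", "other"),
    keyword_classifier("match", "sport", "other"),
    keyword_classifier("election", "politics", "other"),
]

print(majority_vote(classifiers, "A late goal decided the match"))
```

In this toy setting a document mentioning only "election" would be outvoted by the two sport classifiers and labelled "other", which illustrates why the conditions under which a combination helps need to be established empirically.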
Task: A number of machine learning techniques are popular in the field of computational linguistics for tackling the problems of part-of-speech tagging, syntactic analysis and Word Sense Disambiguation (WSD). It would be valuable to make a comparative appraisal of the different techniques reported in the literature and to apply some of the more up-to-date machine learning methods to the same problems.
Project Aim: To appraise the different methods of supervised and unsupervised learning, apply them to part-of-speech tagging, syntactic analysis and Word Sense Disambiguation, and build generic implementations of the methods suited to rapid deployment in HLT systems.
Task: One study of verbs has produced a classification of about 300 classes. The study lists the words belonging to each class and the grammatical structures associated with it. The challenge is to implement programs that can identify each class and the elements in a sentence that match the corresponding structure of that verb class.
Project Aim: The aim of the project is to design a system that enables one to specify the grammatical structure of a verb class, its word members and other relevant lexical information and subsequently to tag a text with appropriate structural markers. Identifying semantic contexts in which the verb class is used will hone the accuracy of a verb analysis system. The system will be tested on dialogue from psychotherapy interviews. The verb classes have been set up as a resource in an XML file for this project.
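As a rough illustration of how such an XML resource might be used, the Python sketch below loads a small set of verb classes and marks the tokens of a text that belong to one. Everything specific here is an assumption: the element and attribute names (verbclasses, class, member, id, frame), the two sample classes and the function names are hypothetical and do not reflect the actual schema of the project's XML file. The sketch matches surface forms only, with no lemmatisation or structural matching.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema and sample data, invented for illustration only.
SAMPLE_XML = """
<verbclasses>
  <class id="transfer" frame="NP V NP PP-to">
    <member>give</member>
    <member>lend</member>
  </class>
  <class id="communication" frame="NP V NP S">
    <member>tell</member>
  </class>
</verbclasses>
"""


def load_verb_classes(xml_text):
    """Map each member verb to the (class id, frame) pair of its class."""
    root = ET.fromstring(xml_text)
    lookup = {}
    for cls in root.findall("class"):
        for member in cls.findall("member"):
            verb = member.text.strip().lower()
            lookup[verb] = (cls.get("id"), cls.get("frame"))
    return lookup


def tag_verbs(text, lookup):
    """Pair every whitespace-separated token with its verb class id, or None."""
    tagged = []
    for token in text.split():
        word = token.strip(".,;:!?").lower()  # ignore surrounding punctuation
        entry = lookup.get(word)
        tagged.append((token, entry[0] if entry else None))
    return tagged


lookup = load_verb_classes(SAMPLE_XML)
for token, verb_class in tag_verbs("I tell him to give it back.", lookup):
    print(token, verb_class)
```

A fuller system would also need to handle inflected forms such as "gave" or "telling", and to check the sentence against the frame recorded for each class rather than looking up single words.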
Task: Every new language technology system tends to design its own lexical database, with the result that there are hundreds of systems that are incompatible with each other. This creates a serious impediment to aggregating the knowledge held in many systems and must be addressed by developing a knowledge-sharing metalanguage.
Task: There are a number of architectures for LT systems. Most rely on a sequential pipeline architecture that processes data progressively from the simple to the complex stages. More recent approaches have used agent-based philosophies in which agents communicate the results of their processing. We are interested in developing a transaction-based model that makes decisions in a just-in-time manner and according to probabilistic criteria. We believe this architecture will allow for massive parallelisation of the processing tasks as well as significant interaction between processing functions. Experiments in parallelising the solution on both clusters and grids need to be conducted.