Honours projects 2009

Projects supervised by Josiah Poon

I am offering projects in the areas of Data Mining, Information Extraction and Multi-lingual Search:

  1. Development of a Simple Interaction Mining Toolkit
  2. Finding Contingency of an Interaction Set
  3. Efficient Interaction Mining on Dataset with High Dimension but Little in Quantity
  4. Efficient Mining of Interaction Set and Contingency in Tandem
  5. Email Butler
  6. Personalized Conference Reminder
  7. Multi-lingual Search Platform

If you are interested in any of these projects, feel free to contact me by email (josiah AT it DOT usyd DOT edu DOT au) or in person.

Interaction Mining

Interaction Mining is a new mining paradigm that I am working on. It is in some ways similar to Association Rule Mining (ARM). In contrast to ARM, this new approach is parameter-free and the interaction set is goal-specific. It also assumes that the values of the variables lie on a lattice.

Efficient Interaction Mining on Dataset with High Dimension but Little in Quantity
Interaction mining is believed to be useful in analysing the observational dataset from a microarray experiment. However, in this particular domain, the number of attributes (genes), m, is far greater than the number of cases, n, i.e. m >> n. This setting is quite different from the conventional assumption of mining algorithms that n > m. The aim of this project is to come up with a solution that addresses this specific condition.

Skills needed: Sound Java programming skills; COMP5318 will be an advantage, but is not necessary

Suitable majors: Databases, Computer Science, Health Informatics

Efficient Mining of Interaction Set and Contingency in Tandem
We have been keeping the mining of interaction sets and the mining of their contingencies as two separate, independent processes. The aim of this project is to integrate these two processes into a single efficient step.

Skills needed: Sound Java programming skills; COMP5318 will be an advantage, but is not necessary

Suitable majors: Databases, Computer Science

Information Extraction

Personalized Conference Reminder
Academic staff frequently receive reminders by email or from the Web. Some of these notices require action to be taken, e.g. the due date of a paper, the announcement of a special seminar, or a meeting appointment. The current way to remind oneself of an event is to copy-and-paste the information into the Calendar in Outlook or to re-enter all the information into the Datebook on a PalmPilot. This is not only tedious but also error-prone (e.g. jotting down the wrong date or time). The aim of this project is to study and implement a prototype so that a person simply clicks on the message (email or web page) and the appropriate event information of a conference (e.g. date, time, venue, conference name, paper due date) is automatically extracted into the person's scheduler without re-typing.

The aim of this project is to study and implement a prototype that extracts information according to the user's research interests.
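
As a rough illustration of the extraction step described above, the following Python sketch pulls a venue and two dates out of a plain-text notice. It assumes that each field appears as "Label: value" on its own line and that dates are written like "15 March 2009". The message text, the field labels and the function names are all invented for illustration and are not part of any existing system. Real emails and web pages are far less regular than this.

```python
import re
from datetime import date

# Hypothetical notice text, invented for illustration only.
MESSAGE = """Call for Papers: Example Workshop on Data Mining
Venue: Sydney, Australia
Paper due date: 15 March 2009
Conference date: 20 July 2009
"""

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Assumed date layout: day, full English month name, four-digit year.
DATE_PATTERN = re.compile(r"(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4})")

# Assumed field labels: output key -> (label in the message, is it a date?).
FIELDS = {
    "venue": ("Venue", False),
    "paper_due": ("Paper due date", True),
    "conference_date": ("Conference date", True),
}


def parse_date(value):
    """Return a date for text like '15 March 2009', or None if it does not fit."""
    match = DATE_PATTERN.fullmatch(value.strip())
    if match is None:
        return None
    day, month_name, year = match.groups()
    try:
        return date(int(year), MONTHS.index(month_name) + 1, int(day))
    except ValueError:
        return None  # e.g. '31 February 2009'


def extract_event(text):
    """Collect the labelled fields found in text; missing fields are omitted."""
    event = {}
    for key, (label, is_date) in FIELDS.items():
        match = re.search(rf"^{re.escape(label)}:\s*(.+)$", text, re.MULTILINE)
        if match is None:
            continue
        value = match.group(1).strip()
        if is_date:
            value = parse_date(value)
            if value is None:
                continue  # unreadable date: leave it out rather than guess
        event[key] = value
    return event


if __name__ == "__main__":
    for key, value in extract_event(MESSAGE).items():
        print(key, "->", value)
```

A fixed set of labels is the simplest possible starting point. The references below describe approaches that learn such patterns from examples instead of hard-coding them.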

References:

Training Agents to Recognize Text by Example, Henry Lieberman, Bonnie A. Nardi, Proceedings of the Third International Conference on Autonomous Agents (Agents'99)

Hierarchical Wrapper Induction for Semistructured Information Sources, Ion Muslea, Steven Minton, Craig A. Knoblock, Proceedings of the Third International Conference on Autonomous Agents (Agents'99)

Suitable majors: Databases, Software Engineering, Networking