Honours/MIT Student Projects 2003 [Projects Supervised by Josiah
Poon]
The following are some suggestions for
honours projects. The list is not exhaustive, and I welcome anyone
to come and discuss their own ideas.
You can contact me by email at josiah@it.usyd.edu.au
(preferred) or come to my office, G83.
Learning User Interests using an Artificial Life Approach
Description: In the age of information explosion, precious time is wasted
in eliminating junk information. At the same time, a lot of our
time is spent in searching and locating relevant and interesting
information. It would be ideal to employ smart personal assistants
to help us filter information based on an understanding of
our preferences and current interests. Unfortunately, not
everyone is wealthy enough to hire a human assistant. A framework
has been developed to learn and adapt to a user's evolving interests
according to his/her past behaviour. The system is grounded in an
evolutionary computing paradigm (artificial life). The proposed
framework, called GENIE, continuously learns the user's behaviour and interests.
A non-intrusive approach is adopted in this work: the user model
is constructed implicitly, without the user's active involvement.
This project aims to develop a prototype using the GENIE framework.
The framework caters for learning in both short-term and long-term
memory.
Key Areas:
user model, incremental learning, reinforcement, evolutionary algorithms
Text Mining on Financial News
Description:
Most financial predictions are made by crunching numbers
from previous days. However, some of the factors that contribute
to the rise or fall of, say, a share price are not numeric;
they can be a crisis in the Middle East or the election of a president
in the U.S. These political factors cannot be expressed in a traditional
database. This project aims to predict the movement
of a financial instrument from various text documents, e.g. newspapers,
financial reports etc. The movements of different financial indicators
reported in historical documents are used to predict the rise or fall of
the financial indicator on the following day.
Description:
[Scenario] John was a research student. He planned to submit a few
papers to different conferences. He had just visited a web page that
contained information about a conference. He was interested and
wanted to put the conference title, conference dates and submission due
date into the DateBook in his PalmPilot. His browser had a programmable
button, for which he had previously demonstrated how the system could
find this kind of information. The only thing he had to do now
was to press the button to extract these details. However, the due
date in this conference announcement page was presented slightly differently
from what the system had been taught. John intervened and showed
where the due date was. On top of extracting this missing information,
the system also generated an additional rule to handle this new
situation. [End Scenario].
The aim of this project is to study and to
implement a prototype that extracts information according to the
user's examples.
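The scenario above can be illustrated with a minimal Python sketch. It is only one possible reading of "extraction by example": each demonstration becomes a rule made of the text that precedes the demonstrated value on its line. The class name ExampleExtractor, its methods and the sample pages are all hypothetical and are not part of the project description.

```python
import re


class ExampleExtractor:
    """Hypothetical sketch: learns one extraction rule per user demonstration."""

    def __init__(self):
        # field name -> list of compiled patterns, in the order they were learned
        self.rules = {}

    def demonstrate(self, field, page, value):
        """The user points at `value` in `page`; the text before it on the
        same line becomes the rule. (An empty prefix would match any line,
        which a real system would have to guard against.)"""
        start = page.index(value)
        line_start = page.rfind("\n", 0, start) + 1
        prefix = page[line_start:start]
        pattern = re.compile(re.escape(prefix) + r"(.+)")
        self.rules.setdefault(field, []).append(pattern)

    def extract(self, field, page):
        """Try each learned rule in turn; return None if none of them applies."""
        for pattern in self.rules.get(field, []):
            match = pattern.search(page)
            if match:
                return match.group(1).strip()
        return None


extractor = ExampleExtractor()

# First demonstration, on a page John has already seen.
page_a = "Conference: AI 2003\nSubmission deadline: 1 June 2003\n"
extractor.demonstrate("due_date", page_a, "1 June 2003")

# A new page words the due date differently, so the existing rule fails.
page_b = "Conference: KDD 2003\nPapers due: 15 July 2003\n"
before = extractor.extract("due_date", page_b)

# John intervenes once; the system adds a second rule for this situation.
extractor.demonstrate("due_date", page_b, "15 July 2003")
after = extractor.extract("due_date", page_b)
```

In this sketch rules are only ever added, never generalised or merged; deciding how to generalise from several demonstrations is part of what the project would need to study.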
Key Areas:
text extraction, programming by example/demonstration, XML
PhotoSensitive: Adaptive Multimedia Presentation
Description:
In traditional photography, we normally choose the best photo to display
in a photo frame. The selected photo stays in the frame for quite
a long time before it is replaced by a newer picture.
With the increasing popularity of digital cameras and scanners,
paper-based photos now have a digital incarnation. The digital
version of these pictures provides greater flexibility: a different
picture can be displayed at different times throughout the day. The goal of this project
is to look for opportunities to exploit this flexibility. In this initial stage, pictures are manually
assigned labels so that the retrieval mechanism can make use of them.
Candidate pictures are retrieved according to sensory
data from the environment. These data can be sound, light, temperature,
movement, barometer readings etc. Here are some very crude examples:
1. if the room temperature is high, select pictures related to
activities such as the beach, outdoor BBQs etc.,
2. if the sound level is high, select pictures related to parties
etc.
In other words, the choice of images
is sensitive to the environment. A selection scheme is defined
to choose the most appropriate picture from the set of candidate pictures
for display. Although the scenario uses photo display as an example,
the general scheme should be applicable to choosing music and other
multimedia presentations.
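The two crude rules and the selection step can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's design: the thresholds (28 degrees Celsius, 70 dB), the sensor field names, the label vocabulary and the "most matching labels wins" selection scheme are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Photo:
    name: str
    labels: frozenset  # manually assigned labels, as in the initial stage


# Each rule pairs a condition on the sensor readings with the labels it favours.
# The thresholds are assumed values chosen for illustration only.
RULES = [
    (lambda s: s["temperature_c"] > 28, {"beach", "bbq"}),
    (lambda s: s["sound_db"] > 70, {"party"}),
]


def wanted_labels(sensors):
    """Collect the labels favoured by every rule whose condition holds."""
    labels = set()
    for condition, rule_labels in RULES:
        if condition(sensors):
            labels |= rule_labels
    return labels


def select_photo(photos, sensors):
    """Retrieve candidates that share a label with the current context, then
    pick the one with the most matching labels (one possible scheme)."""
    wanted = wanted_labels(sensors)
    candidates = [p for p in photos if p.labels & wanted]
    if not candidates:
        return None
    return max(candidates, key=lambda p: len(p.labels & wanted))


photos = [
    Photo("beach.jpg", frozenset({"beach", "outdoor"})),
    Photo("party.jpg", frozenset({"party"})),
    Photo("snow.jpg", frozenset({"snow"})),
]

hot_and_quiet = select_photo(photos, {"temperature_c": 32, "sound_db": 40})
cool_and_loud = select_photo(photos, {"temperature_c": 20, "sound_db": 85})
cool_and_quiet = select_photo(photos, {"temperature_c": 20, "sound_db": 40})
```

Because the rules only map sensor readings to labels, the same structure could in principle select music or other media that carry labels, which is the generality the description hopes for.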
Key Areas:
meta-data, context-awareness, planning
Java Documentation Helper
Description:
One of the difficulties encountered by a beginner Java programmer
is finding the appropriate class(es). The current arrangement of
the Java documentation requires you to understand its structure
before you can search it. If you don't know that such
a thing exists, you don't even know what
to ask or how to ask it. Another problem for
a non-Java programmer is that the person doesn't have the correct
vocabulary to formulate a search. The aim of the project is to (1)
parse the Java documentation and create a corresponding metadata
description and (2) translate and map a user's non-Java-oriented
queries to the metadata.
Key Areas:
meta-data, learning, Java
OnTAP: Online Teaching Assistant Project
Description:
Learning software development is not just about writing programs
in a certain computer language. Students also have to learn to manage
the process so that a quality system can be produced. COMP1001 and
COMP1002 are the two foundation units of a computing degree
in our department. Regular submission of project plans is a crucial
assessment component. However, tutors
generally just check whether a plan has been submitted, without giving much feedback.
Students do not know whether they are on the right track or whether they have
missed some important tasks. A simple system has been built that
1. performs simple keyword
extraction on the submitted plan,
2. compares the extracted information
with the task model and the schedule,
3. automatically constructs a just-in-time response
from templates and sends it to the students as feedback.
However, the prototype has to be further enhanced
in the area of workflow technology (particularly non-process-based
workflow) and needs a more sophisticated information extraction
module. Thorough user testing is required.
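The three steps of the existing prototype (keyword extraction, comparison with the task model and schedule, templated feedback) can be sketched in Python as below. The actual system's data structures are not described here, so the task model, week numbers, keywords and feedback templates in this sketch are all assumptions made for illustration.

```python
import re

# Hypothetical task model: task name -> (keywords that signal it, week it is due).
TASK_MODEL = {
    "requirements": ({"requirements", "specification"}, 2),
    "design": ({"design", "architecture"}, 4),
    "testing": ({"test", "testing"}, 6),
}


def extract_keywords(plan_text):
    """Step 1: very simple keyword extraction (lower-cased alphabetic words)."""
    return set(re.findall(r"[a-z]+", plan_text.lower()))


def missing_tasks(plan_text, current_week):
    """Step 2: compare the plan with the task model and the schedule.
    A task is reported if it is due by `current_week` but none of its
    keywords appears in the plan."""
    words = extract_keywords(plan_text)
    return [
        task
        for task, (keywords, due_week) in TASK_MODEL.items()
        if due_week <= current_week and not (keywords & words)
    ]


def build_feedback(plan_text, current_week):
    """Step 3: construct a response from (assumed) templates."""
    missing = missing_tasks(plan_text, current_week)
    if not missing:
        return "Your plan covers every task expected by week {}.".format(current_week)
    return "By week {} your plan should also mention: {}.".format(
        current_week, ", ".join(missing)
    )


plan = "Weeks 1-2: gather requirements. Week 3: design the classes."
feedback_week_4 = build_feedback(plan, 4)
feedback_week_6 = build_feedback(plan, 6)
```

Exact keyword matching of this kind is easy to fool, which is one reason the description calls for a more sophisticated information extraction module.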
Key Areas:
information extraction, task modelling, software development, project
plan
Coevolution of Feature Selection & Test Cases & Learning
Description:
The quality of the knowledge acquired in a supervised learning
process depends upon what training data is provided to the learner
as well as what features have been considered salient in the
process. However, the selection of training data and test data usually
relies on the insights of the research scientists. Finding the right
combination usually depends on the experience of the people involved;
it has more or less become an art and a myth. This project is to
develop algorithms that enable a feedback loop between the selection
process and the learning, so that each informs and guides the
other. The algorithms are then compared with some existing approaches.
References: Hillis, W. D. (1990). Co-evolving parasites improve simulated
evolution as an optimization procedure. Physica D, 42, pp. 228-234.
radioWeb
Description:
Even though more and more information on the web is moving
to multimedia, the majority of it is still text-based. In the future
(or now), we can access the web using
wireless technology while we are on the move. We do not want to
read text while we are bumping along. We just want to listen
to the information while we are jogging (just as we listen to
a CD/mp3 player). We can use some of the latest developments from W3C regarding
voice, e.g. VoiceXML or SALT, but what about the numerous
legacy pages that have been in use? A large company may have the
resources to redesign all these pages, but a humble (and poor) academic
does not have the money or the spare effort to convert all the pages.
The aim of this project is to build a prototype that enables a user
to navigate web pages (including legacy pages) and to listen to
the required information as if he/she were using a radio. And
that is where the name comes from: radioWeb.
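One small piece of such a prototype, turning a legacy HTML page into short text segments that a speech synthesiser could read out one at a time, might look like the following Python sketch. It uses only the standard library html.parser module; the class name, the choice of block-level tags and the sample page are assumptions, and the speech output and navigation parts are not shown.

```python
from html.parser import HTMLParser


class ReadableTextExtractor(HTMLParser):
    """Hypothetical sketch: collects the readable text of a page as segments."""

    SKIPPED = {"script", "style"}  # content that should never be read aloud
    BLOCKS = {"title", "h1", "h2", "h3", "p", "li", "div", "tr", "br"}

    def __init__(self):
        super().__init__()
        self.segments = []
        self._buffer = []
        self._skip_depth = 0

    def _flush(self):
        # Join the buffered text, normalise whitespace, and store it if non-empty.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.segments.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.BLOCKS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCKS:
            self._flush()

    def handle_data(self, data):
        if not self._skip_depth:
            self._buffer.append(data)


def page_to_segments(html):
    """Return the page's readable text as a list of short segments."""
    parser = ReadableTextExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()  # keep any trailing text that follows the last block tag
    return parser.segments


legacy_page = (
    "<html><head><title>News</title><style>p {color: red}</style></head>"
    "<body><h1>Headline</h1><p>First <b>bold</b> paragraph.</p>"
    "<script>alert('x')</script><p>Second one.</p></body></html>"
)
segments = page_to_segments(legacy_page)
```

Splitting on block-level tags gives the listener natural units to skip forwards or backwards through, much like changing stations or tracks; which units work best for navigation is a question the project itself would have to answer.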