Statistical Natural Language Processing (COMP5046)
UNIT OF STUDY
This unit introduces computational linguistics and the statistical techniques and algorithms used to automatically process natural languages (such as English or Chinese). It will review the core statistics and information theory, and the basic linguistics, required to understand statistical natural language processing (NLP).
Statistical NLP is used in a wide range of applications, including information retrieval and extraction, question answering, machine translation, and document classification and clustering. This unit will explore state-of-the-art approaches to the key NLP sub-tasks, including tokenisation, morphological analysis, word sense disambiguation, part-of-speech tagging, named entity recognition, text categorisation, and phrase structure and Combinatory Categorial Grammar parsing.
Students will implement many of these sub-tasks in labs and assignments. The unit will also investigate the annotation process that is central to creating training data for statistical NLP systems. Students will annotate data as part of completing a real-world NLP task.
Our courses that offer this unit of study
- Bachelor of Arts (Honours)
- Bachelor of Computer Science and Technology (Honours)
- Bachelor of Computer Science and Technology (Honours) (Advanced)
- Bachelor of Information Technology
- Bachelor of Information Technology and Bachelor of Arts
- Bachelor of Information Technology and Bachelor of Commerce
- Bachelor of Information Technology and Bachelor of Laws
- Bachelor of Information Technology and Bachelor of Medical Science
- Bachelor of Information Technology and Bachelor of Science
- Bachelor of Liberal Arts and Science (Honours)
- Bachelor of Medical Science (Honours)
- Bachelor of Science (Advanced Mathematics) (Honours)
- Bachelor of Science (Advanced) (Honours)
- Bachelor of Science (Honours)
- Graduate Certificate in Information Technology
- Graduate Certificate in Information Technology Management
- Graduate Diploma in Computing
- Graduate Diploma in Health Technology Innovation
- Graduate Diploma in Information Technology
- Graduate Diploma in Information Technology Management
- Master of Health Technology Innovation
- Master of Information Technology
- Master of Information Technology Management
- Master of Information Technology and Master of Information Technology Management
Further unit of study information
Lecture 2 hrs/week; Laboratory 1 hr/week.
Through-semester assessment (50%) and final exam (50%).
Christopher D. Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing, 1999.
Faculty/department permission required?
Unit of study rules
Prerequisites and assumed knowledge
Knowledge of an object-oriented (OO) programming language
Study this unit outside a degree
If you wish to undertake one or more units of study (subjects) for your own interest but not towards a degree, you may enrol in single units as a non-award student.
If you are from another Australian tertiary institution, you may be permitted to undertake cross-institutional study in one or more units of study at the University of Sydney.