Course Description
This is an especially exciting time to study Information Extraction (IE), a fundamental research area in Natural Language Processing (NLP) that aims to enable computers to automatically process large amounts of free text. This course teaches the core IE concepts and techniques that students need to develop automatic text processing applications. Students will consolidate and practice their NLP knowledge and skills by working on real projects.
The course will begin with lectures introducing the basics of natural language processing and machine learning, and then move on to reading papers on several sub-disciplines of IE, including named entity recognition, relation extraction, event extraction, coreference resolution and sentiment analysis.
Course Goal
Through this course, students will gain solid theoretical knowledge and enough practical experience to design and develop their own text processing applications in the future.
Evaluation Metrics
This course emphasizes critical paper reading and practical system development. You are required to present at least one research paper, read all the papers, write short paper summaries (two paragraphs, at most 1 page) and actively participate in class discussions. You will work on an NLP project either by yourself or with one classmate (a team of at most 2 people). The project will be evaluated twice: once in the middle of the semester and once at the end of the semester. By the midterm, you should have built a working system. In the second half of the term, you will work on further improving the performance of your system. Specifically, the following score breakdown will be used:
Paper presentations: 25%
Paper summaries: 10%
Class participation: 10%
Mid-term Project: 25%
Final-term Project: 30%
The grading policy is as follows:
90-100: A
80-89: B
70-79: C
60-69: D
<60: F
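As an illustration only, the score breakdown and grading scale above can be sketched as a short calculation. The function and key names are hypothetical, each component is assumed to be scored on a 0-100 scale, and fractional totals are assumed to fall into the band below the next threshold (for example, 89.5 counts as a B); the syllabus itself does not specify rounding.

```python
# Weights taken from the score breakdown above (percent of the final score).
WEIGHTS = {
    "paper_presentations": 25,
    "paper_summaries": 10,
    "class_participation": 10,
    "midterm_project": 25,
    "final_project": 30,
}


def final_score(component_scores):
    """Combine per-component scores (each assumed to be 0-100) into a weighted total."""
    missing = set(WEIGHTS) - set(component_scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a 0-100 total to a letter grade using the grading policy above.

    Assumption: no rounding is applied, so a total such as 89.5 is a B.
    """
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"


# Made-up example scores, not real student data.
example = {
    "paper_presentations": 90,
    "paper_summaries": 80,
    "class_participation": 100,
    "midterm_project": 85,
    "final_project": 88,
}
total = final_score(example)
print(total, letter_grade(total))  # prints: 88.15 B
```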
Presentations and Discussions
Two or more papers will be presented and discussed in each class. Papers will be presented by different students, one paper per student. Each student will present the gist of a paper in 20 minutes or less and lead the discussion after all the papers have been presented.
Paper presentations should aim to address the following aspects:
- Problem definition and motivation.
- Possible downstream applications (as described in the paper, plus your own thoughts).
- Proposed solution/algorithm/method.
- Strengths of the proposed method.
- Weaknesses of the proposed method.
- Your ideas for improving the proposed method.
Everyone is expected to read all the papers, write short paper summaries and participate actively in the discussions.