CPSC 633: Machine Learning (Spring 2016)
Professor:
Dr. Thomas R. Ioerger
Office: 322C HRBB
Email: ioerger@cs.tamu.edu
Office hours: Wed, 3:15-4:15, or by appointment
Class Time: Tues/Thurs, 3:55-5:10
Room: 108 CHEN
Course WWW page: http://www.cs.tamu.edu/faculty/ioerger/cs633-spr16/index.html
Textbook: Machine Learning. Tom Mitchell (1997). McGraw-Hill.
Teaching Assistant: TBD
office hours: TBD
Goals of the Course:
Machine learning is an important sub-area of AI and is broadly applicable to many areas of Computer Science. Machine learning can be viewed as a set of methods for making systems adaptive (improving performance with experience), or alternatively, for augmenting the intelligence of knowledge-based systems via rule acquisition. In this course, we will examine and compare several different abstract models of learning, from hypothesis-space search, to function approximation (e.g. by gradient descent), to statistical inference (e.g. Bayesian), to the minimum description-length principle. Both theoretical issues (e.g. algorithmic complexity, hypothesis-space bias) and practical issues (e.g. feature selection; dealing with noise and preventing overfitting) will be covered.
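As one concrete illustration of the function-approximation view (a sketch for intuition only, not a course assignment), the following few lines fit a linear model to made-up data by gradient descent on squared error:

```python
# Minimal sketch: fit y ~ w*x + b by gradient descent on mean squared error.
# The data, learning rate, and iteration count are illustrative assumptions.

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]   # roughly y = 2x + 1, with a little noise

w, b = 0.0, 0.0
lr = 0.01

for step in range(2000):
    # gradients of mean squared error with respect to w and b
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * dw
    b -= lr * db

print(w, b)   # should approach w ~ 2, b ~ 1
```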
Topics to be Covered:
- decision trees, rule learning
- empirical methods - algorithm evaluation and comparison
- Bayesian classifiers (Naive Bayes, Gaussian models)
- nearest-neighbor classifiers
- feature selection/weighting
- perceptrons/linear discriminants
- neural networks
- support vector machines
- Bayesian inference
- ensemble methods
Prerequisites
CPSC 420/625 - Introduction to Artificial Intelligence
We will be relying on core concepts from AI, especially heuristic search algorithms, optimization, and propositional logic. Either the graduate or undergraduate AI class (or a similar course at another university) will count as satisfying this prerequisite.
In addition, the course will require some background in analysis of
algorithms (big-O notation), and some familiarity with probability and
statistics (e.g. standard deviation, confidence intervals, linear
regression, Binomial distribution).
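As a rough calibration of the statistics background expected (and a preview of the empirical-evaluation material in Ch. 5), here is a small sketch of a 95% confidence interval for a classifier's error rate, using the usual normal approximation to the Binomial; the error counts are made up for illustration:

```python
import math

# Illustrative numbers (not from any course dataset): a classifier makes
# 32 errors on a test set of 200 examples.
errors, n = 32, 200
p = errors / n                      # observed error rate
se = math.sqrt(p * (1 - p) / n)     # standard error (normal approx. to Binomial)
z = 1.96                            # z-value for a 95% confidence interval
print(f"error = {p:.3f}, 95% CI = [{p - z*se:.3f}, {p + z*se:.3f}]")
```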
Projects and Exams
There will be four or five programming projects and a final exam. The main work for the class will consist of programming projects in which you will implement and test your own versions of several learning algorithms. These will not be group projects, so you will be expected to do your own work. Several datasets will be provided for testing your algorithms (e.g. for accuracy).
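To give a sense of what testing for accuracy involves, here is a minimal sketch of an evaluation harness (a hold-out split plus a majority-class baseline); the file name, CSV format, and baseline classifier are illustrative assumptions, not a project specification:

```python
import csv, random

# Sketch of an evaluation harness: hold out part of a dataset for testing
# and report classification accuracy. The file name and format (label in the
# last column) are hypothetical, and the majority-class "classifier" is only
# a baseline placeholder for whatever algorithm a project implements.

def load_dataset(path):
    with open(path) as f:
        rows = [row for row in csv.reader(f) if row]
    return [(row[:-1], row[-1]) for row in rows]   # (features, label) pairs

def majority_class(train):
    labels = [label for _, label in train]
    return max(set(labels), key=labels.count)

def accuracy(test, predict):
    correct = sum(1 for x, y in test if predict(x) == y)
    return correct / len(test)

examples = load_dataset("some_dataset.csv")        # hypothetical file
random.shuffle(examples)
split = int(0.7 * len(examples))
train, test = examples[:split], examples[split:]

guess = majority_class(train)                      # always predict the majority label
print("baseline accuracy:", accuracy(test, lambda x: guess))
```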
Your grade at the end of the course will be based on a weighted average of points accumulated during the semester: 50% for projects and 50% for the final exam. The cutoffs will be at most 90% for an A, 80% for a B, and 70% for a C.
The late policy for homework and projects is incremental: -5% per day, up to a maximum penalty of -50%. A project turned in any time before the end of the semester can therefore still earn up to 50% of its credit (minus any points marked off).
Schedule
Date | Week / Notes | Topic | Reading
Tues, Jan 19 | first day of class | types of machine learning; core concepts | Ch. 1
Thurs, Jan 21 | | Version Spaces, inductive bias | Ch. 2, slides
Tues, Jan 26 | week 2 | Decision Trees | Ch. 3, slides
Thurs, Jan 28 | | pruning | (Mingers, 1989), slides
Tues, Feb 2 | week 3 | |
Thurs, Feb 4 | | Rule Induction | Ch. 10.1-10.3, slides
Tues, Feb 9 | week 4 | Empirical Evaluation | Ch. 5, slides
Thurs, Feb 11 | | (class cancelled) |
Tues, Feb 16 | week 5 | cross-validation, t-tests | slides
Thurs, Feb 18 | | Neural Networks | Ch. 4, slides
Tues, Feb 23 | week 6 | |
Thurs, Feb 25 | Project 1 due | |
Tues, Mar 1 | week 7 | Instance-Based Learning | Ch. 8 (8.1-8.2)
Thurs, Mar 3 | | NTgrowth, PCA | slides
Tues, Mar 8 | week 8 | Feature Selection | slides
Thurs, Mar 10 | | |
Mar 14-18 | Spring Break | |
Tues, Mar 22 | week 9 | Support Vector Machines | (Burges, 1998), slides
Thurs, Mar 24 | Project 2 due | |
Tues, Mar 29 | week 10 | Bayesian Learning (hMAP, hML, MSE, MDL, BOC) | Ch. 6
Thurs, Mar 31 | | Expectation Maximization |
Tues, Apr 5 | week 11 | guest lecture by Dr. James Caverlee: Machine Learning Applications to Social Media |
Thurs, Apr 7 | | Naive Bayes algorithm; Bayesian networks |
Tues, Apr 12 | week 12 | more on feature weighting | slides
Thurs, Apr 14 | | HMMs | Rabiner (1989), slides
Tues, Apr 19 | week 13 | ensemble classifiers: bagging | Breiman (1996), slides
Thurs, Apr 21 | | boosting | Freund and Schapire (1996), slides
Tues, Apr 26 | week 14 | computational learning theory | Ch. 7, slides
Thurs, Apr 28 | Project 3 due | |
Tues, May 3 | class cancelled (last day of class); no office hours this week | |
Fri, May 6 | review session: 113 HRBB, 4:00-5:00 | |
Mon, May 9 | Final Exam: 1:00-3:00pm, 108 CHEN | |
Academic Integrity Statement and Policy
Aggie Code of Honor: An Aggie does not lie, cheat or steal, or tolerate those who do.
see: Honor Council Rules and Procedures
Americans with Disabilities Act (ADA) Policy Statement
The Americans with Disabilities Act (ADA) is a federal anti-discrimination statute that provides comprehensive civil rights protection for persons with disabilities. Among other things, this legislation requires that all students with disabilities be guaranteed a learning environment that provides for reasonable accommodation of their disabilities. If you believe you have a disability requiring an accommodation, please contact Disability Services, currently located in the Disability Services building at the Student Services at White Creek complex on west campus or call 979-845-1637. For additional information, visit http://disability.tamu.edu.
Links
- other good books on ML:
- Duda, Hart, and Stork
- Bishop
- links related to Instance-Based Learning and Feature Selection
- decision tree papers
- (Quinlan, 1986) - Induction of Decision Trees
- (Mingers, 1989) - An Empirical Comparison of Selection Measures for Decision-Tree Induction
- (Mingers, 1989) - An Empirical Comparison of Pruning Methods for Decision Tree Induction
- (Buntine and Niblett, 1992) - A Further Comparison of Splitting Rules for Decision-Tree Induction
- (Murthy, Kasif, and Salzberg, 1994) - A System for Induction of Oblique Decision Trees
- (Quinlan, 1996) - Improved Use of Continuous Attributes in C4.5
- (Cohen, 1995) - Fast Effective Rule Induction (RIPPER)
- (Brunk and Pazzani) - the algorithm for Rule Post-Pruning is discussed in section 5; see also section 4.2 in (Furnkranz, 1997), A Comparison of Pruning Methods for Relational Concept Learning
- papers on class imbalance
- Overview of Bayesian Inference - by Jun Liu and Charles Lawrence
- An Introduction to MCMC Methods for Machine Learning (Andrieu...Jordan, 2003) - overview of Bayesian sampling methods
- Chapter on Decision Trees from Tom Mitchell's Machine Learning book
- UCI Machine Learning Repository
- chi-square test
- KD-trees (Andrew Moore)
- wrappers (Kohavi and Sommerfield, 1995)
- Relief (Kira and Rendell, 1992)