The Course Project

To learn machine learning techniques, best practices, and theory, you really need to apply machine learning to the real world. While synthetic or mocked data is often studied in class, a set of real-world applications (yours and your fellow students') will ground your understanding and better prepare you for using machine learning in your future career.

About Project Deliverables

Each deliverable has a certain number of points associated with it:

  • (1) Possible Direction
  • (1) Task Definition
  • (2) Test Set
  • (3) Training Set
  • (3) Feature Analysis
  • (3) Exploration
  • (3) Project Reflection

There are 15 total points in the project. Remember that for an A in the course you will need at least 12 of these points declared satisfactory.

(1) Possible Direction (4 March)

Your first deliverable for the Course Project will be a short 'verbal presentation' of a Possible Direction, covering:

  1. The question you will teach a machine to answer.
  2. The dataset you expect to draw examples from.

You should have checked these choices for feasibility before settling on a dataset & question pair. This is not binding -- we will present these “Possible Directions” in class, and you can join a classmate's direction (with or without making it a group project).

For more information, please see: Choosing a Dataset and Task

(1) Task Definition (12 March)

For this deadline, your task definition will be written and binding. By now you will have heard about different projects from your peers, and you may even be working on the same task as another student. That is fully acceptable: it leads to interesting comparisons and a shared understanding of the problem that can develop later.

For more information, please see: Choosing a Dataset and Task

(2) Test Set (26 March)

This checkpoint will look somewhat different for students who have 'natural labels' in their dataset, e.g., predicting the category of news articles that already come with a category -- here you are working on a theoretical future prediction, using past labels that are already available.

I have natural labels.

For these students, you will focus on getting this data into Python, joining tables as needed, and then checking your work on a handful of data points (manually inspecting to make sure your automatic processing has worked). You will want to be ready to train some kind of classifier soon after this checkpoint.


  • (1) Some initial setup of train/validate/test splits.
  • (1) Thoughts on task definition or other data sources that you might pursue later.
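As one possible sketch of this setup, the snippet below joins two toy tables and carves out a 60/20/20 train/validate/test split. It assumes pandas and scikit-learn; all table names, column names, and data are made up for illustration -- substitute your own:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in tables: articles and their natural category labels.
articles = pd.DataFrame({"article_id": range(100),
                         "length": [50 + i for i in range(100)]})
labels = pd.DataFrame({"article_id": range(100),
                       "category": ["sports", "news"] * 50})

# Join on the shared key; an inner join silently drops any article
# with no matching label, so compare lengths before and after.
df = articles.merge(labels, on="article_id", how="inner")

# Manually inspect a handful of rows to confirm the join worked.
print(df.sample(5, random_state=0))

# 60% train / 20% validate / 20% test: first split off the test set,
# then split the remainder (0.25 of the remaining 80% is 20% overall).
rest, test = train_test_split(df, test_size=0.2, random_state=42)
train, validate = train_test_split(rest, test_size=0.25, random_state=42)
print(len(train), len(validate), len(test))  # 60 20 20
```

Fixing the `random_state` values makes the splits reproducible across runs, which matters once you start comparing models against the same held-out test set.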

I need to make my own labels.

By this point, you will have collected (as fairly as possible) at least 100 labels for your problem. This is a large enough collection to be useful, but small enough that if you realize your task definition needs to change or accommodate something in the actual data, you can quickly patch any labels that need changing. A brief reflection on what you learned while performing this annotation will be required.


  • (1) >= 100 labels for test set.
  • (1) Thoughts on task definition.

(3) Training Set (Wednesday, 14 April)

I have natural labels.

For these students, this checkpoint is about evaluating your setup. Some example questions you might ask: (1) Should I group data by week to make the time series more reliable? (2) What are the opportunities for train/test splitting on this data? (3) Should I be using K-fold cross-validation?


  • (1) Out-of-the-box model performances. Do you need a linear or nonlinear model?
  • (1) Is this dataset large enough? You likely cannot expand it, but you can still analyze to determine whether more labels would lead to better models or not.
  • (1) Thoughts on feature design.

Note that if you have pre-cooked features, e.g., from Kaggle, you still have feature-design options: here you will be looking to see whether you have incomplete data, whether you can impute the missing values or how you might otherwise encode them, and whether any features should be combined.
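One possible sketch of the missing-value options just mentioned, using scikit-learn's SimpleImputer (the tool choice and the toy data are assumptions, not requirements): impute each gap with the column mean, and append indicator columns so the model can still "see" which values were originally absent.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy feature matrix with missing values marked as np.nan.
X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [4.0, np.nan]])

# strategy="mean" fills each gap with that column's mean;
# add_indicator=True appends a 0/1 missingness column per
# feature that had any missing values.
imputer = SimpleImputer(strategy="mean", add_indicator=True)
X_filled = imputer.fit_transform(X)
print(X_filled)
# [[1.  2.  0. 0.]
#  [2.5 3.  1. 0.]
#  [4.  2.5 0. 1.]]
```

Whether mean-imputation is appropriate depends on your data; for some tasks a sentinel value or dropping the rows is the better encoding, which is exactly the analysis this checkpoint asks for.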

I am making my own labels.

By this point, you will have collected at least 300 additional labels for your problem. Your understanding of the task, and of what makes it solvable for you as a human, will be improving, and you will likely be working on feature extractors. Choose any basic machine-learning model to explore the learning curve of your problem; even if this curve indicates that more labels would improve performance, you will likely focus on the model from this point onward.


  • (1) >= 300 labels for training set.
  • (1) Learning Curve
  • (1) Thoughts on feature design.
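A learning curve like the one asked for above can be sketched with scikit-learn's `learning_curve` helper: fit the same simple model on growing fractions of the training data and watch the validation score. If the score is still climbing at the full data size, more labels would likely help. The synthetic dataset and logistic-regression choice below are placeholders for your own data and model:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# Synthetic stand-in for your ~300 labeled examples.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Train on 10%, 32.5%, ..., 100% of the training folds,
# scoring each size with 5-fold cross-validation.
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5, shuffle=True, random_state=0)

for n, s in zip(sizes, val_scores.mean(axis=1)):
    print(f"{n:4d} examples -> mean CV accuracy {s:.3f}")
```

Plotting `sizes` against the mean train and validation scores (e.g., with matplotlib) gives the usual learning-curve picture: a large gap suggests overfitting, two low flat curves suggest the model or features are the bottleneck rather than the label count.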

(3) Feature & Data Analysis (Wednesday, 28 April)

By this point, you will have explored a handful of models, and have a better understanding of the difficulties of your task, what works well, and what does not. You will analyze feature contributions to model performances, to see if you can simplify the model, and propose your next 'Exploration' task.


  • (1) Feature Performance Analysis
  • (1) Appropriate models tried and compared
  • (1) Proposal of Exploration Task

(3) Exploration (Friday, 7 May)

This task is the most open-ended.

  • If your course project turns out to be rather solvable, you are free to pursue a related task, or a task redefinition here.
  • You may also explore implementing a particular model or subdomain of ML (e.g., deep learning, reinforcement learning, recommender systems, etc.) instead of being tied to your project description.


  • (1) Task is a logical continuation or appropriate challenge.
  • (1) Task is properly versioned / submitted.
  • (1) Task is properly evaluated.

(3) Project Reflection (Friday, 21 May)

The project reflection is, again, slightly open-ended.

If you are still working on your originally proposed task, your job here is to assemble a presentation for your corporate overlords, imploring them to: (a) ship your new model, (b) fund more research into this problem, (c) never ship your new model, etc.

If you are not still working on your originally proposed task, you may assemble a similar presentation -- but for your exploration. Alternately, we will discuss an appropriate objective.


  • (1) Presentation exists and is of appropriate length and depth.
  • (1) Recommendation is clear from presentation, and supported by data.
  • (1) Lessons-learned from the overall project are included.