Analyzing the Titanic Dataset using Bayesian Optimization

Taught by one of my favourite profs in NUS (Dr. Vincent Tan), MA4270 was, hands down, one of the best modules I took under the Dept. of Mathematics. This module explored the mathematical basis of the machine learning models we use every day.

Word of warning, though: this module was (is?) quite rigorous and requires a solid understanding of statistics. It has a computational part too, where we had to implement an SVM and perform regression from scratch (no scikit-learn)!

For our final project, we were free to choose any dataset and apply any machine learning algorithm to it to derive useful results.

We chose the Titanic dataset because it was easy to understand (a mix of categorical and numerical data) and required little data cleaning.

Our focus was on keeping the models mathematically rigorous rather than on chasing results; even so, we tuned our hyperparameters using Bayesian Optimization and achieved a good training score.
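
For anyone who has not met Bayesian Optimization before, here is a rough sketch of the idea in plain Python. It is not the project's actual code. It fits a small Gaussian process to the scores seen so far, then picks the next hyperparameter value by maximising expected improvement. Everything named here is an assumption made for illustration: `toy_validation_score` is a made-up curve standing in for a real model's validation score, and the single hyperparameter `log_c`, its search range, the kernel length scale and the noise level are all arbitrary choices.

```python
import math
import random


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalars (length scale is assumed)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def solve_upper_t(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_gp(xs, ys, noise=1e-4):
    """Fit a constant-mean GP; the small noise term keeps the kernel matrix well conditioned."""
    mu = sum(ys) / len(ys)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, [y - mu for y in ys]))
    return L, alpha, mu


def predict(xs, model, x_new):
    """Posterior mean and standard deviation of the GP at x_new."""
    L, alpha, mu = model
    k_star = [rbf(a, x_new) for a in xs]
    mean = mu + sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(1.0 - sum(t * t for t in v), 1e-12)  # rbf(x, x) == 1
    return mean, math.sqrt(var)


def expected_improvement(mean, std, best):
    """Expected improvement over the best score so far (maximisation)."""
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mean - best) * cdf + std * pdf


def bayes_opt(objective, low, high, n_init=3, n_iter=10, seed=0):
    """Maximise a 1-D objective: random starts, then EI-guided evaluations."""
    rng = random.Random(seed)
    xs = [rng.uniform(low, high) for _ in range(n_init)]
    ys = [objective(x) for x in xs]
    grid = [low + (high - low) * i / 200 for i in range(201)]  # candidate values
    for _ in range(n_iter):
        model = fit_gp(xs, ys)
        best = max(ys)
        x_next = max(grid, key=lambda x: expected_improvement(*predict(xs, model, x), best))
        xs.append(x_next)
        ys.append(objective(x_next))
    i_best = max(range(len(ys)), key=ys.__getitem__)
    return xs[i_best], ys[i_best]


def toy_validation_score(log_c):
    """Hypothetical stand-in for a validation score; peaks at log_c = 1."""
    return 0.8 - 0.05 * (log_c - 1.0) ** 2


best_log_c, best_score = bayes_opt(toy_validation_score, -3.0, 3.0)
print(f"best log_c = {best_log_c:.3f}, score = {best_score:.4f}")
```

In a real setting, `toy_validation_score` would be replaced by a function that trains the model with the given hyperparameter and returns a cross-validated score. Each of those evaluations is expensive, which is why it pays to choose the next value carefully instead of sweeping a full grid.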

Our report and code are available for anyone interested in viewing them.
