Titanic AI Data Study


Creating an AI data model to analyze the Titanic’s passenger data and predict survival with 88% accuracy and 96% precision was a truly gratifying experience. This project allowed me to immerse myself in the fascinating worlds of data science and machine learning, all while uncovering insights from one of history’s most iconic events.

The journey began with the Titanic dataset, a rich repository of information about the passengers and their fates. Armed with this data, I set about building a predictive model in Python (within an Anaconda environment), relying on several key libraries that significantly facilitated the process:

  • SciPy 1.1.0: This library offered an array of statistical functions and scientific routines that played a crucial role in processing and manipulating the data effectively.

  • NumPy 1.14.3: NumPy’s array operations and mathematical functions were instrumental in carrying out numerical computations efficiently, a vital aspect of data modeling.

  • Matplotlib 2.2.2: Matplotlib was a key asset in visualizing the dataset, helping me gain valuable insights into the distribution of variables, relationships between features, and potential patterns.

  • Pandas 0.23.0: Pandas proved indispensable for data manipulation and exploration. Its DataFrame structure simplified the task of preprocessing and cleaning the Titanic dataset.

  • Sklearn 0.20.0: Scikit-learn (imported as sklearn) provided a robust platform for implementing machine learning algorithms and evaluating their performance.
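To make the preprocessing step concrete, here is a minimal sketch of how pandas can clean the Titanic data before modeling. The DataFrame below is an illustrative stand-in with the real dataset’s column names (the actual project worked on the full Kaggle file); the median fill and the `Sex` encoding are typical choices, not necessarily the exact steps used in the project.

```python
import numpy as np
import pandas as pd

# Toy stand-in for a few rows of the Titanic train.csv
# (column names match the real dataset; values are illustrative).
df = pd.DataFrame({
    "Pclass":   [3, 1, 3, 1, 2],
    "Sex":      ["male", "female", "female", "female", "male"],
    "Age":      [22.0, 38.0, np.nan, 35.0, np.nan],
    "Fare":     [7.25, 71.28, 7.92, 53.10, 13.00],
    "Survived": [0, 1, 1, 1, 0],
})

# Two common preprocessing steps for this dataset:
# fill missing ages with the median, and encode Sex numerically.
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})

X = df[["Pclass", "Sex", "Age", "Fare"]]  # feature matrix
y = df["Survived"]                        # target labels
```

With the features numeric and complete, `X` and `y` can be passed straight into any scikit-learn estimator.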

The models I chose for this project were diverse, each with its own strengths and weaknesses. Here are the models I employed, along with their respective accuracy scores:

  • Logistic Regression (LR): Achieving an accuracy of 88.6%, this model proved to be a solid performer, offering reliable predictions.

  • Linear Discriminant Analysis (LDA): With an accuracy of 93.7%, LDA showcased its ability to discern patterns in the Titanic dataset effectively.

  • K-Nearest Neighbors (KNN): This model attained an accuracy of 95.2%, demonstrating the power of proximity-based classification in this context.

  • Classification and Regression Trees (CART): CART achieved an accuracy of 96.2%, making it one of the top-performing models in this analysis.

  • Naive Bayes (NB): With an accuracy of 94.9%, Naive Bayes proved to be a reliable choice for probabilistic classification.

  • Support Vector Machine (SVM): Achieving an accuracy of 88.5%, SVM demonstrated its effectiveness in distinguishing between survivors and non-survivors, adding further depth to my analysis.
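Scores like those above are typically obtained by spot-checking each algorithm with cross-validation. The sketch below shows one way to do that with scikit-learn; it uses synthetic data from `make_classification` as a stand-in for the preprocessed Titanic features, so its scores will not match the project’s numbers.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic binary-classification data standing in for the
# preprocessed Titanic features (Survived would be the target).
X, y = make_classification(n_samples=500, n_features=6, random_state=7)

models = [
    ("LR",   LogisticRegression(max_iter=1000)),
    ("LDA",  LinearDiscriminantAnalysis()),
    ("KNN",  KNeighborsClassifier()),
    ("CART", DecisionTreeClassifier(random_state=7)),
    ("NB",   GaussianNB()),
    ("SVM",  SVC()),
]

# 10-fold stratified cross-validation gives a mean accuracy per
# model, allowing a fair side-by-side comparison.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)
results = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    for name, model in models
}
for name, score in results.items():
    print(f"{name}: {score:.3f}")
```

Swapping the synthetic `X, y` for the cleaned Titanic features reproduces the kind of comparison summarized in the list above.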

The precision of 96% suggests that the model was adept at minimizing false positives, i.e., passengers predicted to survive who did not. That distinction matters in this context, where a survival prediction should be trustworthy.
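The difference between accuracy and precision can be made concrete with scikit-learn’s metrics. The labels below are illustrative, not the project’s actual predictions: precision counts only how many predicted survivors truly survived, while accuracy counts all correct predictions.

```python
from sklearn.metrics import accuracy_score, precision_score

# Illustrative labels: 1 = survived, 0 = did not survive.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]

# Accuracy = (TP + TN) / total predictions.
acc = accuracy_score(y_true, y_pred)

# Precision = TP / (TP + FP): of the passengers the model labeled
# as survivors, the fraction who actually survived.
prec = precision_score(y_true, y_pred)

print(f"accuracy:  {acc:.2f}")   # 8 of 10 correct -> 0.80
print(f"precision: {prec:.2f}")  # 4 of 5 predicted survivors -> 0.80
```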

Overall, this project not only honed my skills in data analysis and machine learning but also deepened my appreciation for the historical significance of the Titanic disaster. It was immensely satisfying to see the AI model provide valuable insights into the likelihood of survival based on features such as cabin location and gender, showcasing the power of data science in uncovering hidden patterns in the past.

Project Description

An AI data model built in Python to analyze the Titanic dataset and predict passenger survival, reaching 88% accuracy and 96% precision across a suite of machine learning algorithms.

Project Details

  • Client: North Metropolitan TAFE
  • Date: July 15th 2020
  • Category: AI

Project Participants

  • Developers: James Noonan
  • Manager: James Noonan
  • IT Guy: James Noonan