The Pima Indians of Arizona have the highest reported prevalence of diabetes of any population in the world. A small study analysed their medical records to assess whether the onset of diabetes can be predicted from diagnostic measures.
The dataset was downloaded from Kaggle.
The objective of this article is to build a machine learning model that predicts, from diagnostic measurements, whether a patient has diabetes. This is a binary (2-class) classification problem. The dataset contains 768 observations, each with 8 medical predictor variables (inputs) and 1 target variable (output: 0 for "no", 1 for "yes").
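As a rough illustration of this input/output layout, the sketch below separates the 8 predictors from the target using only Python's standard library. The column names are an assumption based on the commonly distributed Kaggle CSV and are not stated in this article; `load_dataset` and `class_counts` are hypothetical helper names introduced here for illustration.

```python
import csv

# Assumed column names from the commonly distributed Kaggle CSV;
# adjust them if your copy of the file uses different headers.
FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]
TARGET = "Outcome"  # assumed name of the 0/1 target column


def load_dataset(source):
    """Read CSV rows from an open text file object.

    Returns (X, y): X is a list of 8-value feature rows (floats),
    y is a list of 0/1 labels (ints).
    """
    reader = csv.DictReader(source)
    X, y = [], []
    for row in reader:
        X.append([float(row[name]) for name in FEATURES])
        y.append(int(row[TARGET]))
    return X, y


def class_counts(y):
    """Count how many observations fall into each of the two classes."""
    return {0: y.count(0), 1: y.count(1)}
```

Assuming the file is saved locally as `diabetes.csv` (a hypothetical path), it could be loaded with `with open("diabetes.csv", newline="") as f: X, y = load_dataset(f)`. Checking `len(X)`, `len(X[0])` and `class_counts(y)` is then a quick way to confirm the 768 observations and 8 predictors described above, and to see how balanced the two classes are before any modelling.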