Heart Disease Prediction Analysis

This repository contains analyses and models for predicting heart disease. It includes clustering analyses, a classification tree, and logistic regression models applied to a dataset using SPSS.

The analyses, all aimed at predicting heart failure, are:

  1. Descriptive Statistics and Missing Value Analysis:

    • Provides descriptive statistics such as mean, minimum, maximum, standard deviation, skewness, and kurtosis for key variables related to heart disease prediction.
    • Analyzes missing values and identifies extreme values for each variable.
  2. T-Test for Equality of Means:

    • Tests whether the means of two groups differ significantly for variables such as age and cholesterol level.
    • Reports t-test statistics and p-values for each variable.
  3. Regression Analysis:

    • Evaluates the regression model's performance in predicting age based on several predictors.
    • Reports R-squared, adjusted R-squared, standard error of the estimate, and Durbin-Watson statistic.
  4. Pearson Correlation Matrix:

    • Examines pairwise correlations between variables such as age, resting blood pressure, cholesterol, and heart disease.
    • Provides insights into the strength and direction of relationships between variables.
  5. ANOVA Table:

    • Analyzes the differences in means between groups for variables like age, resting blood pressure, cholesterol, and others.
    • Reports F-statistics and p-values to determine the significance of group differences.
  6. Chi-Square Test:

    • Tests for significant associations between categorical variables using Fisher's exact test.
    • Reports the p-value to assess the significance of the association.
  7. Factor Analysis:

    • Investigates the relationship between variables and identifies underlying factors.
    • Reports component loadings for each variable in the dataset.
  8. Clustering Analysis (BAVERAGE and QUICK):

    • Uses two SPSS clustering procedures, hierarchical clustering with between-groups average linkage (BAVERAGE) and k-means (QUICK CLUSTER), to group similar cases based on the heart disease prediction variables.
    • Reports cluster centers, number of cases in each cluster, and ANOVA results.
  9. Classification Tree and Logistic Regression:

    • Constructs a classification tree and a logistic regression model to predict heart disease.
    • Reports model summaries, risk estimates, classification accuracy, and coefficients for predictor variables.
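
For readers without SPSS, two of the simpler statistics listed above (the Pearson correlation from item 4 and the two-group t statistic from item 2) can be sketched in plain Python. This is a minimal illustration and not the repository's SPSS procedure: the function names `pearson_r` and `pooled_t` and the sample values are hypothetical, and the t statistic shown is the pooled-variance form, which assumes equal variances in both groups.

```python
import math
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pooled_t(group_a, group_b):
    """Independent-samples t statistic and degrees of freedom.

    Assumes equal variances in the two groups (pooled-variance form).
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    # Weighted average of the two sample variances.
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Made-up illustrative values, not taken from the repository's dataset.
ages = [40, 50, 60, 70]
resting_bp = [120, 130, 140, 150]
print("r =", pearson_r(ages, resting_bp))

t_stat, dof = pooled_t([1, 2, 3], [2, 3, 4])
print("t =", t_stat, "df =", dof)
```

The p-values that SPSS reports require the t distribution's cumulative function, which the Python standard library does not provide, so this sketch stops at the test statistic and its degrees of freedom.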