Food-Nutrient-Data-Wrangling

Introduction

In this project, I use Python as the scripting language to manipulate data from an open-source database published by Food Standards Australia New Zealand.

The main techniques in this project include:

Cleaning
Visualisation
Clustering
Correlations
Predictions

Implementation:

For this project, the following libraries need to be imported:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.spatial.distance import pdist, squareform, cdist
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

Project:

See the assignment description attached for more information.

Stage 1: Cleaning and visualisation techniques

Count the total number of foods and attributes, then calculate the median value of an attribute.
Add a new attribute to the dataset
Visualisation using boxplot

Visualisation using barplot

Visualisation using scatterplot

Visualisation using parallel co-ordinates

Visualisation using pie chart

Merge and export data to JSON format
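
The non-plotting steps of Stage 1 (counting, median, new attribute, merge, JSON export) can be sketched roughly as follows. This is a minimal illustration on a tiny made-up table, not the code in data_wrangling.py: the column names (food_id, energy_kj, protein_g, food_group) and the high_energy attribute are assumptions chosen for the example, and the real CSV files may use different names.

```python
import json
import pandas as pd

# Hypothetical miniature stand-in for the food nutrient table.
# The real columns in food_nutrient_2011_13_AHS.csv may differ.
foods = pd.DataFrame({
    "food_id": [1, 2, 3, 4],
    "food_name": ["Apple", "Bread", "Cheese", "Butter"],
    "energy_kj": [218.0, 1030.0, 1680.0, 3040.0],
    "protein_g": [0.3, 9.0, 25.0, 0.9],
})

# Count foods (rows) and attributes (columns).
n_foods, n_attributes = foods.shape

# Median value of a single attribute.
median_energy = foods["energy_kj"].median()

# Add a new attribute derived from an existing one (illustrative choice).
foods["high_energy"] = foods["energy_kj"] > median_energy

# Hypothetical second table sharing the food_id key.
groups = pd.DataFrame({
    "food_id": [1, 2, 3, 4],
    "food_group": ["Fruit", "Cereal", "Dairy", "Fats"],
})

# Merge the two tables, keeping every food even if it has no group.
merged = foods.merge(groups, on="food_id", how="left")

# Export as a JSON list of records, then parse it back to inspect it.
json_text = merged.to_json(orient="records")
records = json.loads(json_text)
```

To write the result to disk instead of a string, pass a file path as the first argument of to_json.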

Stage 2: Clustering, correlation and prediction techniques

Standardise the dataset so that the following techniques can be applied effectively
Principal component analysis (PCA)

Clustering visualisation

K-means and sum of squared errors

Correlation and Mutual Information

Prediction models: decision trees

Prediction models: K-NN
Feature generation
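
As a rough sketch of how several Stage 2 steps fit together (standardisation, PCA, k-means with sum of squared errors, and a K-NN prediction), the example below runs them on synthetic data. The two generated groups, the choice of three features, the range of k and the 70/30 split are all assumptions made for illustration; data_wrangling_p2.py may do this differently.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the nutrient data: two well-separated groups
# whose three features sit on very different scales (an assumption).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[200.0, 1.0, 5.0], scale=[20.0, 0.2, 1.0], size=(50, 3))
group_b = rng.normal(loc=[1500.0, 20.0, 30.0], scale=[100.0, 2.0, 3.0], size=(50, 3))
X = np.vstack([group_a, group_b])
y = np.array([0] * 50 + [1] * 50)

# Standardise each feature to zero mean and unit variance so that no
# single large-scale feature dominates distance-based methods.
X_std = StandardScaler().fit_transform(X)

# Project onto the first two principal components for plotting.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_std)

# Sum of squared errors (KMeans.inertia_) for k = 1..5; plotting these
# values against k gives the usual elbow curve.
sse = []
for k in range(1, 6):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_std)
    sse.append(model.inertia_)

# K-NN prediction: split first, then fit the scaler on the training
# part only so the test set does not leak into the preprocessing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
scaler = StandardScaler().fit(X_train)
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(scaler.transform(X_train), y_train)
accuracy = accuracy_score(y_test, knn.predict(scaler.transform(X_test)))
```

The same train/test pattern works for a decision tree by swapping KNeighborsClassifier for DecisionTreeClassifier, which does not require scaled inputs.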

Note:

This project is a university assignment for the Elements of Data Processing subject at The University of Melbourne, Department of Computing and Software System, Semester 1 2019.

