Codecov and Census dataset
sbrugman committed Jul 27, 2019
1 parent 76e685e commit 87b62af
Showing 10 changed files with 382 additions and 8 deletions.
6 changes: 3 additions & 3 deletions .travis.yml
@@ -20,9 +20,9 @@ install:
- pip install -e .

script:
- - if [ $TEST == 'unit' ]; then pytest --cov=./ --sanitize-with tests/sanitize-notebook.cfg tests/unit/; fi
- - if [ $TEST == 'issue' ]; then pytest --cov=./ --sanitize-with tests/sanitize-notebook.cfg tests/issues/; fi
- - if [ $TEST == 'examples' ]; then pytest --nbval --cov=./ --sanitize-with tests/sanitize-notebook.cfg examples/; fi
+ - if [ $TEST == 'unit' ]; then pytest --cov=$TEST --sanitize-with tests/sanitize-notebook.cfg tests/unit/; fi
+ - if [ $TEST == 'issue' ]; then pytest --cov=$TEST --sanitize-with tests/sanitize-notebook.cfg tests/issues/; fi
+ - if [ $TEST == 'examples' ]; then pytest --nbval --cov=$TEST --sanitize-with tests/sanitize-notebook.cfg examples/; fi
# Our well-behaved Unix-style command-line tool exits with code 0 unless an internal error occurred
- if [ $TEST == 'console' ]; then pandas_profiling -h; fi

1 change: 1 addition & 0 deletions README.md
@@ -22,6 +22,7 @@ For each column the following statistics - if relevant for the column type - are

The following examples can give you an impression of what the package can do:

+ * [Census Income](http://pandas-profiling.github.io/pandas-profiling/examples/census/census_report.html) (US Adult Census data relating income)
* [NASA Meteorites](http://pandas-profiling.github.io/pandas-profiling/examples/meteorites/meteorites_report.html) (comprehensive set of meteorite landings)
* [Titanic](http://pandas-profiling.github.io/pandas-profiling/examples/titanic/titanic_report.html) (the "Wonderwall" of datasets)
* [NZA](http://pandas-profiling.github.io/pandas-profiling/examples/nza/nza_report.html) (open data from the Dutch Healthcare Authority)
1 change: 1 addition & 0 deletions docs/index.html
@@ -42,6 +42,7 @@ <h1 id="pandas-profiling">Pandas Profiling</h1>
<h2 id="examples">Examples</h2>
<p>The following examples can give you an impression of what the package can do:</p>
<ul>
+ <li><a href="http://pandas-profiling.github.io/pandas-profiling/examples/census/census_report.html">Census Income</a> (US Adult Census data relating income)</li>
<li><a href="http://pandas-profiling.github.io/pandas-profiling/examples/meteorites/meteorites_report.html">NASA Meteorites</a> (comprehensive set of meteorite landings)</li>
<li><a href="http://pandas-profiling.github.io/pandas-profiling/examples/titanic/titanic_report.html">Titanic</a> (the "Wonderwall" of datasets)</li>
<li><a href="http://pandas-profiling.github.io/pandas-profiling/examples/nza/nza_report.html">NZA</a> (open data from the Dutch Healthcare Authority)</li>
44 changes: 44 additions & 0 deletions examples/census/census.py
@@ -0,0 +1,44 @@
from pathlib import Path

import pandas as pd
import numpy as np
import requests

import pandas_profiling

if __name__ == "__main__":
    file_name = Path("census_train.csv")
    if not file_name.exists():
        data = requests.get(
            "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"
        )
        file_name.write_bytes(data.content)

    # Names based on https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names
    df = pd.read_csv(
        file_name,
        header=None,
        index_col=False,
        names=[
            "age",
            "workclass",
            "fnlwgt",
            "education",
            "education-num",
            "marital-status",
            "occupation",
            "relationship",
            "race",
            "sex",
            "capital-gain",
            "capital-loss",
            "hours-per-week",
            "native-country",
        ],
    )

    # Prepare missing values
    df = df.replace("\\?", np.nan, regex=True)

    profile = df.profile_report(title="Census Dataset")
    profile.to_file(output_file=Path("./census_report.html"))
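
Note on the "Prepare missing values" step in census.py: the sketch below applies the same `replace` call to a small, hypothetical three-row frame to show what it is expected to do. The sample values, including the leading space before `?`, are assumptions made for illustration and are not taken from the commit. With `regex=True` and a non-string replacement, pandas appears to replace the whole cell whenever the pattern is found anywhere in it, so `" ?"` should become `NaN` while numeric columns are left untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the raw file is assumed to mark missing values as " ?".
sample = pd.DataFrame(
    {
        "age": [39, 54, 28],
        "workclass": [" State-gov", " ?", " Private"],
        "native-country": [" United-States", " ?", " Cuba"],
    }
)

# Same call as in census.py: any string cell containing "?" becomes NaN.
cleaned = sample.replace("\\?", np.nan, regex=True)

# Count the missing values per column after the replacement.
missing_per_column = cleaned.isna().sum()
print(missing_per_column)
```

Because the pattern is searched for rather than matched against the full cell, a legitimate value that happens to contain a question mark would also be turned into `NaN`; anchoring the pattern (for example `r"^\s*\?$"`) would be a stricter alternative.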
328 changes: 328 additions & 0 deletions examples/census/census_report.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion examples/meteorites/meteorites_report.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion examples/nza/nza_report.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion examples/stata_auto/stata_auto_report.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion examples/titanic/titanic_report.html

Large diffs are not rendered by default.
