Commit

x_train_bin may also be filtered (tensorflow#212)
* x_train_bin may also be filtered

Both `qnn` and `fair_nn` are trained on the binary-encoded `x_train_bin`, but this dataset is full of contradictory examples (7789 out of 11520). The doc already mentions that "binarizing will cause more collisions", but it may not be clear to some readers that the collision ratio exceeds 67.6%, and that this ratio will affect classification training.

* Don't use the binary-non-contradictory data.

* capture output so it doesn't print the arrays.
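
The collision ratio quoted in the first item can be checked with a short sketch like the one below. It is only an illustration under stated assumptions: `count_contradicting` is a hypothetical helper (not the tutorial's `remove_contradicting`), the arrays are toy data rather than the downscaled MNIST images, and `THRESHOLD = 0.5` is assumed here.

```python
import collections

import numpy as np


def count_contradicting(xs, ys):
    """Count examples whose binarized image also appears with another label.

    Hypothetical helper for illustration; not part of the tutorial.
    """
    labels_by_image = collections.defaultdict(set)
    for x, y in zip(xs, ys):
        # Use the raw bytes of each image as a hashable key.
        labels_by_image[x.tobytes()].add(bool(y))
    # An example is contradictory if its image maps to more than one label.
    return sum(len(labels_by_image[x.tobytes()]) > 1 for x in xs)


# Toy stand-in for the downscaled images (assumed values, not MNIST).
THRESHOLD = 0.5
x_small = np.array(
    [[0.9, 0.1], [0.6, 0.2], [0.1, 0.8], [0.7, 0.4]], dtype=np.float32)
y = np.array([True, False, True, True])

# Same binarization as in the notebook cell touched by this commit.
x_bin = np.array(x_small > THRESHOLD, dtype=np.float32)

n_bad = count_contradicting(x_bin, y)
ratio = n_bad / len(x_bin)
print(f"{n_bad} of {len(x_bin)} examples are contradictory ({ratio:.1%})")
```

In the toy data three of the four images collapse to the same binary pattern while carrying both labels, so binarizing alone is enough to create collisions that were not present in the continuous-valued inputs.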

Co-authored-by: Mark Daoust <[email protected]>
tiancheng2000 and MarkDaoust authored May 5, 2020
1 parent 2187b36 commit a78c79a
Showing 1 changed file with 23 additions and 0 deletions.
23 changes: 23 additions & 0 deletions docs/tutorials/mnist.ipynb
@@ -466,6 +466,29 @@
"x_test_bin = np.array(x_test_small > THRESHOLD, dtype=np.float32)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "SlJ5NVaPojhU"
},
"source": [
"If you were to remove contradictory images at this point you would be left with only 193, likely not enough for effective training."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {},
"colab_type": "code",
"id": "1z8J7OyDojhW"
},
"outputs": [],
"source": [
"_ = remove_contradicting(x_train_bin, y_train_nocon)"
]
},
{
"cell_type": "markdown",
"metadata": {
