
Commit

Merge pull request apache#156 from cristiancrc/patch-2
Update evaluation.html.md.erb
pferrel committed Oct 12, 2015
2 parents bad760a + 4805eb0 commit e8e0550
Showing 1 changed file with 14 additions and 14 deletions.
28 changes: 14 additions & 14 deletions docs/manual/source/templates/recommendation/evaluation.html.md.erb
@@ -38,13 +38,13 @@ mandatory parameter,
2. the `EngineParamsGenerator`, which contains a list of engine params to test
against (a minimal sketch follows below).
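For orientation, an `EngineParamsGenerator` is typically a small object that fills in `engineParamsList` with the variations to test, usually defined in the template's Evaluation.scala. The following is a hedged sketch patterned after the stock recommendation template; the parameter classes (`DataSourceParams`, `DataSourceEvalParams`, `ALSAlgorithmParams`) and their fields are assumptions about your template, not part of this commit.

```scala
// Hedged sketch only: class and field names are assumed to match the
// stock recommendation template and may differ in your code base.
package org.template

// Newer versions use org.apache.predictionio.controller instead.
import io.prediction.controller.{EngineParams, EngineParamsGenerator}

object EngineParamsList extends EngineParamsGenerator {
  // Base engine params shared by every variation under test.
  private[this] val baseEP = EngineParams(
    dataSourceParams = DataSourceParams(
      appName = "MyApp1",
      evalParams = Some(DataSourceEvalParams(kFold = 5, queryNum = 10))))

  // Three ALS variations; `pio eval` trains and scores each of them.
  // ALSAlgorithmParams arguments: (rank, numIterations, lambda, seed).
  engineParamsList = Seq(
    baseEP.copy(algorithmParamsList = Seq(("als", ALSAlgorithmParams(10, 20, 0.01, Some(3L))))),
    baseEP.copy(algorithmParamsList = Seq(("als", ALSAlgorithmParams(10, 40, 0.01, Some(3L))))),
    baseEP.copy(algorithmParamsList = Seq(("als", ALSAlgorithmParams(20, 20, 0.01, Some(3L)))))
  )
}
```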
The following command kickstarts the evaluation
workflow for the recommendation template (replace "org.template" with your package).

```
$ pio build
...
$ pio eval org.template.RecommendationEvaluation \
  org.template.EngineParamsList
```

You will see the following output:
@@ -100,7 +100,7 @@ Metrics:

The console prints out the evaluation metric score of each set of engine params, and finally
pretty-prints the optimal engine params. Among the 3 sets of engine params we evaluate,
the best Precision@k has a score of ~0.1521.


## The Evaluation Design
@@ -109,7 +109,7 @@ We assume you have read the [Tuning and Evaluation](/evaluation) section. We
will cover the evaluation aspects which are specific to the recommendation
engine.

In recommendation evaluation, the raw data is a sequence of known ratings. A
rating has 3 components: user, item, and a score. We use the $k$-fold method for
evaluation: the raw data is sliced into a sequence of (training, validation)
data tuples.
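To make the slicing concrete, here is a hedged sketch of the $k$-fold split using plain Scala collections; the template performs the equivalent split on Spark RDDs inside `readEval`, so this is only an illustration of the idea.

```scala
// Illustration only: each rating lands in exactly one validation fold
// and serves as training data in the remaining k - 1 folds.
case class Rating(user: String, item: String, rating: Double)

def kFoldSplit(ratings: Seq[Rating], k: Int): Seq[(Seq[Rating], Seq[Rating])] = {
  val indexed = ratings.zipWithIndex
  (0 until k).map { fold =>
    val training   = indexed.collect { case (r, i) if i % k != fold => r }
    val validation = indexed.collect { case (r, i) if i % k == fold => r }
    (training, validation)
  }
}
```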
@@ -126,7 +126,7 @@ using the known rating of a user.
There are multiple assumptions we have to make when we evaluate a
recommendation engine:

- Definition of 'good'. We want to quantify whether the engine is able to recommend
items which the user likes; to do so, we need to define what is meant by 'good'. In this
example, we have two kinds of events: 'rate' and 'buy' (a sketch of this mapping
follows the list). The 'rate' event is
associated with a rating value which ranges from 1 to 4, and the 'buy'
@@ -138,7 +138,7 @@ above the threshold is considered 'good'.
data contains ratings for all user-item tuples. In contrast, for a system containing
1000 items, a user may only have rated 20 of them, leaving 980 items unrated. There
is no way for us to tell with certainty whether the user likes an unrated product.
When we examine the evaluation result, it is important for us to keep in mind
that the final metric is only an approximation of the actual result.

- Recommendation affects user behavior. Suppose you are an e-commerce company and
@@ -158,7 +158,7 @@ behavior is homogenous.
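As referenced in the first assumption above, here is a hedged sketch of how observed events could be mapped to a binary notion of 'good' (relevant). The event names mirror this example, but the mapping of 'buy' to the maximum rating is an assumption for illustration, not text taken from this commit.

```scala
// Hedged sketch: decide whether an event marks an item as 'good' for a user.
sealed trait UserEvent
case class Rate(user: String, item: String, rating: Double) extends UserEvent
case class Buy(user: String, item: String) extends UserEvent

def isGood(event: UserEvent, ratingThreshold: Double): Boolean = event match {
  // An explicit rating counts as 'good' only at or above the threshold.
  case Rate(_, _, rating) => rating >= ratingThreshold
  // Assumption: a purchase is treated as the maximum rating, hence always 'good'.
  case Buy(_, _)          => true
}
```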

In MyRecommendation/src/main/scala/***Engine.scala***,
we define the `ActualResult` which represents the user rating for validation.
It stores the list of ratings in the validation set for a user.

```scala
case class ActualResult(
@@ -168,9 +168,9 @@ case class ActualResult(
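// The snippet above is truncated by the diff view. A hedged reconstruction of
// the stock template's ActualResult follows; the field name and the Rating
// shape are assumptions, not text copied from this commit.
case class Rating(user: String, item: String, rating: Double)

case class ActualResult(
  ratings: Array[Rating]  // the known ratings reserved for validation
) extends Serializable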

### Implement Data Generate Method in DataSource

In MyRecommendation/src/main/scala/***DataSource.scala***,
the `readEval` method reads, and selects, data from the datastore
and returns a sequence of (training, validation) data.

```scala
case class DataSourceEvalParams(kFold: Int, queryNum: Int)
@@ -292,7 +292,7 @@ to determine what the candidates know.
A good metric should be able to distinguish the good from the bad.

A way to define 'relevant' is to use the notion of a rating threshold. If the user
rating for an item is higher than a certain threshold, we say it is relevant.
However, without looking at the data, it is hard to pick a reasonable threshold.
We can set the threshold as high as the maximum rating of 4.0, but it may
severely limit the relevant set size, and the precision scores will be close to
@@ -338,12 +338,12 @@ We have two lists of parameters (lines 2 to 3): `ratingThreshold` defines what r
and `k` defines how many items we evaluate in the `PredictedResult`.
We generate a list of all combinations (line 11).
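A hedged sketch of that combination step, assuming the template's `PrecisionAtK` metric class takes `k` and `ratingThreshold` parameters; the concrete values are examples only.

```scala
// Every (ratingThreshold, k) pair becomes one metric variant to report.
// PrecisionAtK is assumed to come from the template's Evaluation.scala.
val ratingThresholds = Seq(2.0, 4.0)
val ks = Seq(1, 3, 10)

val allMetrics = for {
  t <- ratingThresholds
  k <- ks
} yield PrecisionAtK(k = k, ratingThreshold = t)
```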

These metrics are specified as `otherMetrics` (lines 9 to 11); they
will be calculated and generated on the evaluation UI.
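For intuition, here is a hedged, self-contained sketch of the Precision@k idea these metrics report: of the top k recommended items, the fraction the user actually rated at or above the threshold. The stock metric's exact normalization may differ.

```scala
// Illustration of Precision@k; None means the query produced no items
// and is excluded from the averaged score.
def precisionAtK(recommended: Seq[String],
                 relevant: Set[String],
                 k: Int): Option[Double] = {
  val topK = recommended.take(k)
  if (topK.isEmpty) None
  else Some(topK.count(relevant.contains).toDouble / topK.size)
}
```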

To run this evaluation, you can:

```
$ pio eval org.template.ComprehensiveRecommendationEvaluation \
  org.template.EngineParamsList
```

