Drop deprecated benchmarks from rules, fix llama2_70b_lora data
ShriyaPalsamudram committed Jun 13, 2024
1 parent 2ddbe8a commit 259dead
Showing 2 changed files with 13 additions and 107 deletions.
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -134,7 +134,7 @@ MLCommons project work is tracked with issue trackers and pull requests. Modify

2. Since reference benchmarks are expected to take 7-day @ 1 GPU to run, the target accuracy can be adjusted to make sure that the benchmark runtime is not too high. Some references like gpt3/stable diffusion take much longer, but these are rare exceptions.

- 3. The goal is to choose an accuracy target with low variance on the probability distribution of number of iterations. To get this, run as many runs as possible and plot accuracy vs iterations. Target accuracy should be picked from the region right after the knee point in the plot as shown below.
+ 3. The goal is to choose an accuracy target with low variance on the probability distribution of number of iterations. To get this, run as many runs as possible and plot accuracy vs iterations. Target accuracy should be picked from the region right after the knee point in the plot.
![plot](./images/target_accuracy_knee.png "Target Accuracy")

a. "Too flat" region - The reason the "too flat" region is "too flat" is because when you try to pick an accuracy by drawing a horizontal line at that accuracy, the angle the line intersects with the curve is very low. This means that the iterations needed to reach the target accuracy in this region has a very high range (In numerical analysis this is called "ill conditioned"). Thus, it would not make a good benchmark accuracy target.
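
The conditioning argument in item (a) can be checked numerically. The following is a minimal sketch, not part of this commit or of any MLCommons tooling: the function names, the saturating curve shape, and the per-run plateau offsets are all hypothetical stand-ins for real training logs. It measures how much the iterations-to-target count varies across runs for a given candidate target.

```python
import math
import statistics


def iterations_to_target(curve, target):
    """Return the first 1-based iteration where accuracy >= target, else None."""
    for i, acc in enumerate(curve, start=1):
        if acc >= target:
            return i
    return None


def spread_at_target(curves, target):
    """Return (mean, population stdev) of iterations-to-target across runs.

    Returns None if any run never reaches the target, since such a target
    is not usable as a benchmark threshold.
    """
    hits = [iterations_to_target(c, target) for c in curves]
    if any(h is None for h in hits):
        return None
    return statistics.mean(hits), statistics.pstdev(hits)


def synthetic_curve(final_acc, rate, n_iters=400):
    # Hypothetical saturating accuracy curve; real data would come from run logs.
    return [final_acc * (1 - math.exp(-rate * i)) for i in range(1, n_iters + 1)]


# Assumption: runs differ only in their plateau, as different seeds might.
curves = [synthetic_curve(0.80 + d, 0.02) for d in (-0.01, -0.005, 0.0, 0.005, 0.01)]

for target in (0.60, 0.70, 0.78):
    print(target, spread_at_target(curves, target))
```

On these synthetic curves, a target in the nearly flat part of the curve (0.78) gives a much wider spread of iteration counts than a target on the steeper part (0.60), which is the behavior item (a) describes. Where the usable region begins on real data depends on the actual curves.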