increase trims to prevent 1 in 100000 run crash
tromp committed Feb 23, 2018
1 parent 941bf31 commit 504bd51
Showing 3 changed files with 19 additions and 19 deletions.
8 changes: 4 additions & 4 deletions GPU.md
@@ -56,7 +56,7 @@ Here's a typical solver run:

$ ./cuda30 -r 2
GeForce GTX 1080 Ti with 10GB @ 352 bits x 5505MHz
- Looking for 42-cycle on cuckoo30("",0-1) with 50% edges, 128*128 buckets, 240 trims, and 128 thread blocks.
+ Looking for 42-cycle on cuckoo30("",0-1) with 50% edges, 128*128 buckets, 256 trims, and 128 thread blocks.
Using 2680MB bucket memory and 21MB memory per thread block (5392MB total)
nonce 0 k0 k1 k2 k3 a34c6a2bdaa03a14 d736650ae53eee9e 9a22f05e3bffed5e b8d55478fa3a606d
4-cycle found
@@ -82,7 +82,7 @@ bit key for the siphash function which generates the half-billion graph edges.
This key is shown after each nonce as four 64-bit hexadecimal numbers. The GPU is
responsible for generating all edges and then trimming the majority of
them away for clearly not being part of any cycle. After a default number of
- 240 trimming rounds, only about 1 in every 13450 edges survives, and the
+ 256 trimming rounds, only about 1 in every 13450 edges survives, and the
remaining 37000 or so edges are sent back to the CPU for cycle finding, using an
algorithm inspired by Cuckoo Hashing (which is where the name derives from).

@@ -92,7 +92,7 @@ To see a synopsis of all possible options, run:
SYNOPSIS
cuda30 [-b blocks] [-d device] [-h hexheader] [-k rounds [-c count]] [-m trims] [-n nonce] [-r range] [-U blocks] [-u threads] [-V threads] [-v threads] [-T threads] [-t threads] [-X threads] [-x threads] [-Y threads] [-y threads] [-Z threads] [-z threads]
DEFAULTS
- cuda30 -b 128 -d 0 -h "" -k 0 -c 1 -m 240 -n 0 -r 1 -U 128 -u 8 -V 32 -v 128 -T 32 -t 128 -X 32 -x 64 -Y 32 -y 128 -Z 32 -z 8
+ cuda30 -b 128 -d 0 -h "" -k 0 -c 1 -m 256 -n 0 -r 1 -U 128 -u 8 -V 32 -v 128 -T 32 -t 128 -X 32 -x 64 -Y 32 -y 128 -Z 32 -z 8


Most of these are for shaping the GPU's thread parallelism in the various edge generation and trimming rounds.
@@ -102,7 +102,7 @@ Here's a run that uncovers a solution:

$ ./cuda30 -n 60 -r 4
GeForce GTX 1080 Ti with 10GB @ 352 bits x 5505MHz
- Looking for 42-cycle on cuckoo30("",60-63) with 50% edges, 128*128 buckets, 240 trims, and 128 thread blocks.
+ Looking for 42-cycle on cuckoo30("",60-63) with 50% edges, 128*128 buckets, 256 trims, and 128 thread blocks.
Using 2680MB bucket memory and 21MB memory per thread block (5392MB total)
nonce 60 k0 k1 k2 k3 275f9313c78adcec c3dc47d972920e25 41f8c5d51abbf1e7 74da5cc5b52b2a0b
6-cycle found
28 changes: 14 additions & 14 deletions GPU_tuning.md
@@ -6,7 +6,7 @@ Recall the solver options:
SYNOPSIS
cuda30 [-b blocks] [-d device] [-h hexheader] [-k rounds [-c count]] [-m trims] [-n nonce] [-r range] [-U blocks] [-u threads] [-V threads] [-v threads] [-T threads] [-t threads] [-X threads] [-x threads] [-Y threads] [-y threads] [-Z threads] [-z threads]
DEFAULTS
- cuda30 -b 128 -d 0 -h "" -k 0 -c 1 -m 240 -n 0 -r 1 -U 128 -u 8 -V 32 -v 128 -T 32 -t 128 -X 32 -x 64 -Y 32 -y 128 -Z 32 -z 8
+ cuda30 -b 128 -d 0 -h "" -k 0 -c 1 -m 256 -n 0 -r 1 -U 128 -u 8 -V 32 -v 128 -T 32 -t 128 -X 32 -x 64 -Y 32 -y 128 -Z 32 -z 8

Let's look at each of these in turn.

@@ -40,12 +40,12 @@ For example,

$ ./cuda30 -h "DEADBEEF" | head -2
GeForce GTX 1080 Ti with 10GB @ 352 bits x 5505MHz
- Looking for 42-cycle on cuckoo30("ޭ??",0) with 50% edges, 128*128 buckets, 240 trims, and 128 thread blocks.
+ Looking for 42-cycle on cuckoo30("ޭ??",0) with 50% edges, 128*128 buckets, 256 trims, and 128 thread blocks.

$ ./cuda30 -h "444541440A42454546" | head -3
GeForce GTX 1080 Ti with 10GB @ 352 bits x 5505MHz
Looking for 42-cycle on cuckoo30("DEAD
- BEEF",0) with 50% edges, 128*128 buckets, 240 trims, and 128 thread blocks.
+ BEEF",0) with 50% edges, 128*128 buckets, 256 trims, and 128 thread blocks.
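
The two `-h` runs above suggest that the argument is read as pairs of hex digits, each pair giving one header byte: `DEADBEEF` becomes four non-printable bytes, while `444541440A42454546` spells out `DEAD`, a newline, and `BEEF`. A minimal sketch of that decoding follows. It is not the solver's own parsing code, and `hex_value` and `decode_hex_header` are hypothetical helper names.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper, not taken from the solver source:
// maps one hex digit to its numeric value (0..15).
static int hex_value(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  throw std::invalid_argument("not a hex digit");
}

// Hypothetical helper: turns a hex string such as "444541440A42454546"
// into the raw bytes it encodes, two hex digits per byte.
std::string decode_hex_header(const std::string& hex) {
  if (hex.size() % 2 != 0)
    throw std::invalid_argument("odd number of hex digits");
  std::string bytes;
  bytes.reserve(hex.size() / 2);
  for (std::size_t i = 0; i < hex.size(); i += 2)
    bytes.push_back(static_cast<char>(hex_value(hex[i]) * 16 + hex_value(hex[i + 1])));
  return bytes;
}
```

Under these assumptions, `decode_hex_header("444541440A42454546")` yields the nine-byte string `DEAD\nBEEF`, which matches the line break visible in the second run's output.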

-k rounds [-c counts]
------------
@@ -70,14 +70,14 @@ For example:
round 13 size 542 completed in 2 ms
round 14 size 470 completed in 3 ms
round 15 size 403 completed in 2 ms
- rounds 12 through 237 completed in 58 ms
- trimrename3 round 238 size 2 completed in 1 ms
- trimrename3 round 239 size 2 completed in 1 ms
+ rounds 12 through 253 completed in 58 ms
+ trimrename3 round 254 size 2 completed in 1 ms
+ trimrename3 round 255 size 2 completed in 1 ms
4-cycle found
282-cycle found
1006-cycle found
390-cycle found
- findcycles completed on 33234 edges
+ findcycles completed on 29180 edges
Time: 1010 ms
0 total solutions

@@ -103,14 +103,14 @@ count all 128\*128 buckets is rather slow, as witnessed by the 3x longer Time be
round 13 size 9301758 completed in 2 ms
round 14 size 8167240 completed in 3 ms
round 15 size 7230054 completed in 2 ms
- rounds 12 through 237 completed in 709 ms
- trimrename3 round 238 size 33514 completed in 1 ms
- trimrename3 round 239 size 33514 completed in 1 ms
+ rounds 12 through 253 completed in 709 ms
+ trimrename3 round 254 size 33514 completed in 1 ms
+ trimrename3 round 255 size 33514 completed in 1 ms
4-cycle found
282-cycle found
1006-cycle found
390-cycle found
- findcycles completed on 33234 edges
+ findcycles completed on 29180 edges
Time: 3356 ms
0 total solutions

@@ -128,7 +128,7 @@ back to the host.

-m trims
------------
- The number of trimming rounds. Default 240. Can be increased arbitrarily. At some point, there will be
+ The number of trimming rounds. Default 256. Can be increased arbitrarily. At some point, there will be
no edges left to trim, as all remaining edges are already part of a cycle:

$ ./cuda30 -m 1518 -k 2000 -c 128
@@ -149,8 +149,8 @@ no edges left to trim, as all remaining edges are already part of a cycle:
round 1514 size 88 completed in 0 ms
round 1515 size 88 completed in 0 ms
rounds 12 through 1515 completed in 281145 ms
- trimrename3 round 1998 size 88 completed in 0 ms
- trimrename3 round 1999 size 88 completed in 0 ms
+ trimrename3 round 1516 size 88 completed in 0 ms
+ trimrename3 round 1517 size 88 completed in 0 ms
4-cycle found
2-cycle found
16-cycle found
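
Comparing the logs in this diff, the round numbering appears to follow the trim count. With 240, 256 and 1518 trims, the two final `trimrename3` rounds are numbered `ntrims - 2` and `ntrims - 1`, and the summarized range ends at `ntrims - 3`. The sketch below only restates that observed pattern. `RoundLayout` and `layout_for` are hypothetical names, and the start of the summarized range (12) is read off the sample output rather than the solver source.

```cpp
#include <cstdint>

// Hypothetical summary of the round numbers printed in the sample logs;
// inferred from the output above, not copied from mean_miner.cu.
struct RoundLayout {
  std::uint32_t bulk_first;    // first round of the summarized range (12 in the logs)
  std::uint32_t bulk_last;     // last round of the summarized range
  std::uint32_t rename_first;  // first of the two final trimrename3 rounds
  std::uint32_t rename_last;   // last round overall
};

// Assumes ntrims >= 15 so that the summarized range is non-empty.
RoundLayout layout_for(std::uint32_t ntrims) {
  return RoundLayout{12, ntrims - 3, ntrims - 2, ntrims - 1};
}
```

For the new default, `layout_for(256)` gives a summarized range of 12 through 253 followed by rounds 254 and 255, which is what the updated sample output shows.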
2 changes: 1 addition & 1 deletion src/mean_miner.cu
@@ -223,7 +223,7 @@ struct trimparams {
u16 reportrounds;

trimparams() {
- ntrims = 240;
+ ntrims = 256;
nblocks = 128;
genUblocks = 128;
genUtpb = 8;