Skip to content

Sejudyblues/cs194-winograd

 
 

Repository files navigation

About:

  • We implemented 3x3 convolutions.
  • When using the Winograd algorithm for convolutions, we used F(2x2, 3x3), which means that the output for one tile is 2x2.

OSX setup instructions:

  • install latest XCode Command Line Tools
  • install brew
  • install armadillo brew install armadillo
  • install llvm for openmp brew install llvm

Hive setup instructions:

  • Download the source for armadillo
  • After unzipping the directory, cmake ., then make, then make install DESTDIR=~/lib

Generate A Problem

  • Compile with make
  • Create a problem file by running python3 gen_problem.py > [problem filename]

Run Naive Convolution

  • Use a file of the generated format (see above) as input for the program ./naive_convolution [input filename] [output filename]

Run Winograd Convolution implented serially

  • ./winograd [input filename] [output filename]

Run Winograd Convolution implemented in OpenMP

  • ./winograd_openmp [input filename] [output filename]

Run Winograd Convolution implemented in OpenCL

  • ./winograd_gpu [input filename] [output filename]

Flops Calculation:

  • All floating point additions and multiplications are counted as separate operations.
  • Manually counted, as Haswell architecture does not support hardware counters

Benchmarks

  • ./bench_image_size.sh
  • ./extract_perf.py BENCHMARK_FILE will extract MFlop/s and times to bench_extract_mflops.csv and bench_extract_times.csv.
  • NOTE: The benchmark creates many .in input files that may take up a lot of disk space. Make sure disk quota does not fill up during benchmarking.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C++ 71.7%
  • Python 12.6%
  • C 8.5%
  • Shell 4.3%
  • Makefile 2.9%