NCTU 2017 Computer Architecture Final Project - Part 1

Part 1: Implement convolution, ReLU, and max pooling in convLayerGPU() with CUDA, store the result in outGPU, and use NVVP (NVIDIA Visual Profiler) to analyze your code.
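As a rough illustration of what the three stages compute, here is a minimal host-side C++ sketch for a single channel. The function names convRelu and maxPool2x2, the row-major square layout, the stride-1 convolution without padding, and the 2x2 pooling window are all assumptions made for illustration. The real dimensions, strides, and memory layout are defined by the base program, and a CUDA version would map the outer loops onto threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers (not part of the base program).
// Assumed layout: single channel, row-major, square maps.

// Stride-1 convolution without padding, followed by ReLU.
// in: n x n, filt: k x k, result: (n-k+1) x (n-k+1).
std::vector<float> convRelu(const std::vector<float>& in, std::size_t n,
                            const std::vector<float>& filt, std::size_t k) {
    assert(k >= 1 && k <= n);
    assert(in.size() == n * n && filt.size() == k * k);
    const std::size_t m = n - k + 1;
    std::vector<float> out(m * m);
    for (std::size_t y = 0; y < m; ++y) {
        for (std::size_t x = 0; x < m; ++x) {
            float sum = 0.0f;
            // Dot product of the filter with the window at (y, x).
            for (std::size_t fy = 0; fy < k; ++fy)
                for (std::size_t fx = 0; fx < k; ++fx)
                    sum += in[(y + fy) * n + (x + fx)] * filt[fy * k + fx];
            out[y * m + x] = std::max(sum, 0.0f);  // ReLU: clamp negatives to 0
        }
    }
    return out;
}

// Non-overlapping 2x2 max pooling on an m x m map (m assumed even).
std::vector<float> maxPool2x2(const std::vector<float>& in, std::size_t m) {
    assert(m % 2 == 0 && in.size() == m * m);
    const std::size_t p = m / 2;
    std::vector<float> out(p * p);
    for (std::size_t y = 0; y < p; ++y) {
        for (std::size_t x = 0; x < p; ++x) {
            const std::size_t r = 2 * y, c = 2 * x;
            out[y * p + x] = std::max(
                std::max(in[r * m + c], in[r * m + c + 1]),
                std::max(in[(r + 1) * m + c], in[(r + 1) * m + c + 1]));
        }
    }
    return out;
}
```

Under these assumptions, calling convRelu on a 3x3 input with a 2x2 filter yields a 2x2 map, and maxPool2x2 reduces that map to a single value.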

Download

At the command line, type:

git clone https://github.com/s0606757/CA2017FP-Part1.git

Three subdirectories

./data

This directory contains the input data for the base program:

./data/filter.txt - Stores the filter values
./data/neuron.txt - Stores the input neuron values

./device

The program in this directory prints the device information.

usage

cd ./device
make
make run

./example

This directory contains two examples: InnerProduct and MatrixMultiplication.

usage

cd ./example/InnerProduct/
make
make run

or

cd ./example/MatrixMultiplication/
make
make run

Usage of the base program

make
make run

Evaluation

We will compare the CPU and GPU execution times and compute the speedup as

Speedup = convLayerCPU_execTime / convLayerGPU_execTime
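The formula above can be sketched in C++ as follows. The helper names timeMs and speedup are hypothetical, and the base program's own timing code is what counts for grading. One point to keep in mind when timing by hand: CUDA kernel launches are asynchronous, so the timed GPU region needs to end with a synchronization such as cudaDeviceSynchronize() before the clock is stopped.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: time one call of `fn` in milliseconds
// using a monotonic clock.
template <typename Fn>
double timeMs(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();  // for GPU work, fn should synchronize before returning
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Speedup = convLayerCPU_execTime / convLayerGPU_execTime
double speedup(double cpuTimeMs, double gpuTimeMs) {
    assert(gpuTimeMs > 0.0);
    return cpuTimeMs / gpuTimeMs;
}
```

For example, a CPU time of 200 ms and a GPU time of 50 ms give a speedup of 4. A value above 1 means the GPU version is faster.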

Grading Policy

(A) Completeness (35%)
    Your result (convLayerGPU) must be correct, i.e. pass the check (10%)
    Your design (convLayerGPU) must be faster than convLayerCPU() (20%)
    Use NVIDIA Visual Profiler to help you improve performance (the TA will check this in your report) (5%)
(B) Report (35%)
    Describe your implementation algorithm and explain your results (15%)
    Discuss the optimizations you tried and whether each one made performance better or worse (10%)
    Show how you used NVVP to find and solve performance issues (5%)
    Feedback on this part (5%)
(C) Performance Rank (30%)
    We will rank your CUDA kernels’ performance (execution time) on GTX K20C
    The fastest one will get 30% and the last one will get 1%

Other Rules

This is a team project, with 1 to 3 people per team.
Register here before the deadline.
Account list
Compress your code and report into one zip file and upload it to E3.
Name your report as: LeaderID_Report_FP1.pdf
Name your package as: LeaderID_FP1.zip
Each team only needs to upload one package to E3.
Make sure the TA can compile and run your code with “make” and “make run” on the provided server.
Using any CUDA library is forbidden in this project!!!
LATE SUBMISSIONS ARE NOT ACCEPTED!!!
Due date: 2017/10/26 (Thu) 23:50

Useful Reference

Introduction to CNN -1 Here
Introduction to CNN -2 Chinese version English version
Introduction to CUDA Here
NVVP Here
GPU Profiling Here
