Skip to content

Commit

Permalink
patches from Rob for PRK 2.12 and some trailing whitespace removal
Browse files Browse the repository at this point in the history
  • Loading branch information
jeffhammond committed Jun 1, 2015
1 parent d71a6ae commit af1d35b
Show file tree
Hide file tree
Showing 9 changed files with 122 additions and 133 deletions.
1 change: 0 additions & 1 deletion CHARM++/Stencil/stencil.C
Original file line number Diff line number Diff line change
Expand Up @@ -330,7 +330,6 @@ public:
for (int j=MAX(jstart,RADIUS); j<=MIN(n-1-RADIUS,jend); j++) {
for (int i=MAX(istart,RADIUS); i<=MIN(n-1-RADIUS,iend); i++) {

double start = temp_new(i-istart,j-jstart);
for (int jj=-RADIUS; jj<=RADIUS; jj++) {
temp_new(i-istart,j-jstart) += WEIGHT(0,jj)*temp(i-istart,j-jstart+jj);
}
Expand Down
4 changes: 2 additions & 2 deletions OPENMP/RefCount_private/private.c
Original file line number Diff line number Diff line change
Expand Up @@ -255,8 +255,8 @@ int main(int argc, char ** argv)
#else
printf("Solution validates\n");
#endif
printf("Rate (CPUPs/s): %d, time (s): %lf\n",
(int)(iterations/refcount_time), refcount_time);
printf("Rate (MCPUPs/s): %lf, time (s): %lf\n",
iterations/refcount_time*1.e-6, refcount_time);
}
exit(EXIT_SUCCESS);
}
8 changes: 4 additions & 4 deletions OPENMP/RefCount_shared/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -7,12 +7,12 @@ OPTFLAGS = $(DEFAULT_OPT_FLAGS)

#uncomment any of the following flags (and change values) to change defaults

#LONG64FLAG = -DLONG_IS_64BITS
#description: can use "long" for 64 bit integers instead of "long long"

#DEBUGFLAG = -DVERBOSE
#description: default diagnostic style is silent

DEPENDENCEFLAG = -DDEPENDENT
#description: default counter update is with inter-dependence

#NTHREADFLAG = -DMAXTHREADS=n
#description: default thread limit is 256

Expand All @@ -26,7 +26,7 @@ INCLUDEPATHS =

### End User configurable options ###

TUNEFLAGS = $(LONG64FLAG) $(DEBUGFLAG) $(NTHREADFLAG) $(USERFLAGS)
TUNEFLAGS = $(DEPENDENCEFLAG) $(DEBUGFLAG) $(NTHREADFLAG) $(USERFLAGS)
PROGRAM = shared
# objects below are the default, used by "clean," if invoked
OBJS = $(PROGRAM).o $(COMOBJS)
Expand Down
189 changes: 86 additions & 103 deletions OPENMP/RefCount_shared/shared.c
Original file line number Diff line number Diff line change
@@ -1,32 +1,32 @@
/*
Copyright (c) 2013, Intel Corporation
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
* Redistributions of source code must retain the above copyright
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.
* Neither the name of Intel Corporation nor the names of its
contributors may be used to endorse or promote products
derived from this software without specific prior written
* Neither the name of Intel Corporation nor the names of its
contributors may be used to endorse or promote products
derived from this software without specific prior written
permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
*/

Expand All @@ -36,27 +36,27 @@ NAME: RefCount
PURPOSE: This program tests the efficiency of exclusive access to a
pair of non-adjacent shared reference counters
USAGE: The program takes as input the total number of times the reference
counters are updated, and the number of threads
USAGE: The program takes as input the total number of times the reference
counters are updated, and the number of threads
involved.
<progname> <# threads><# iterations>
The output consists of diagnostics to make sure the
The output consists of diagnostics to make sure the
algorithm worked, and of timing statistics.
FUNCTIONS CALLED:
Other than OpenMP or standard C functions, the following
Other than OpenMP or standard C functions, the following
functions are used in this program:
wtime()
bail_out()
getpagesize()
HISTORY: Written by Rob Van der Wijngaart, January 2006.
*******************************************************************/

#include <par-res-kern_general.h>
Expand All @@ -68,30 +68,30 @@ HISTORY: Written by Rob Van der Wijngaart, January 2006.
#if !defined(__USE_BSD) && !defined(__USE_XOPEN_EXTENDED)
extern int getpagesize(void);
#endif
#define COUNTER1 (*pcounter1)
#define COUNTER2 (*pcounter2)

int main(int argc, char ** argv)
{
int iterations; /* total number of reference pair counter updates */
int page_fit; /* indicates that counters fit on different pages */
size_t store_size; /* amount of space reserved for counters */
s64Int *pcounter1,
double *pcounter1,
*pcounter2; /* pointers to counters */
s64Int *counter_space; /* pointer to space reserved for counters */
double cosa, sina; /* cosine and sine of rotation angle */
double *counter_space; /* pointer to space reserved for counters */
double refcounter1,
refcounter2; /* reference values for counters */
double epsilon=1.e-7; /* required accuracy */
omp_lock_t counter_lock; /* lock that guards access to counters */
double refcount_time; /* timing parameter */
int nthread_input; /* thread parameters */
int nthread;
s64Int sum_distance=0,
sum_distance2=0; /* distance and distance squared between
reference counter updates by the same thread,
summed over all threads */
double avg_distance,
avg_distance2; /* distances averaged over all threads */
int nthread;
int num_error=0; /* flag that signals that requested and obtained
numbers of threads are the same */

/*********************************************************************
** process and test input parameters
** process and test input parameters
*********************************************************************/

if (argc != 3){
Expand All @@ -112,7 +112,7 @@ int main(int argc, char ** argv)
}

omp_set_num_threads(nthread_input);

/* initialize shared counters; we put them on different pages, if possible.
If the page size equals the whole memory, this will fail, and we reduce
the space required */
Expand All @@ -121,38 +121,39 @@ int main(int argc, char ** argv)
#ifdef VERBOSE
printf("Page size = %d\n", getpagesize());
#endif
counter_space = (s64Int *) malloc(store_size+sizeof(s64Int));
while (!counter_space && store_size>2*sizeof(s64Int)) {
counter_space = (double *) malloc(store_size+sizeof(double));
while (!counter_space && store_size>2*sizeof(double)) {
page_fit=0;

store_size/=2;
counter_space = (s64Int *) malloc(store_size+sizeof(s64Int));
counter_space = (double *) malloc(store_size+sizeof(double));
}
if (!counter_space) {
printf("ERROR: could not allocate space for counters\n");
exit(EXIT_FAILURE);
}

#ifdef VERBOSE
if (!page_fit) printf("Counters do not fit on different pages\n");
else printf("Counters fit on different pages\n");
if (!page_fit) printf("Counters do not fit on different pages\n");
else printf("Counters fit on different pages\n");
#endif

pcounter1 = counter_space;
pcounter2 = counter_space + store_size/sizeof(s64Int);
(*pcounter1) = 0;
(*pcounter2) = 0;
pcounter2 = counter_space + store_size/sizeof(double);

COUNTER1 = 1.0;
COUNTER2 = 0.0;

cosa = cos(1.0);
sina = sin(1.0);

/* initialize the lock on which we will be pounding */
omp_init_lock(&counter_lock);

#pragma omp parallel reduction(+:sum_distance,sum_distance2)
#pragma omp parallel
{
int iter; /* dummy */
/* we declare everything the same type/length to avoid consversions */
s64Int oldcounter;/* previous thread value of reference counter */
s64Int newcounter;/* current thread value of reference counter */
s64Int distance; /* distance between successive counter updates by
same thread */
double tmp1; /* local copy of previous value of COUNTER1 */

#pragma omp master
{
Expand All @@ -163,10 +164,15 @@ int main(int argc, char ** argv)
printf("ERROR: number of requested threads %d does not equal ",
nthread_input);
printf("number of spawned threads %d\n", nthread);
}
}
else {
printf("Number of threads = %i\n",nthread_input);
printf("Number of counter pair updates = %i\n", iterations);
printf("Number of threads = %d\n",nthread_input);
printf("Number of counter pair updates = %d\n", iterations);
#ifdef DEPENDENT
printf("Dependent atomic counter pair update\n");
#else
printf("Independent atomic counter updates\n");
#endif
}
}
bail_out(num_error);
Expand All @@ -176,72 +182,49 @@ int main(int argc, char ** argv)
refcount_time = wtime();
}

/* the first iteration that any thread does initializes oldcounter.
We could treat this situation with a test in the main loop, but
that adds overhead to each iteration, so we keep it separate */
omp_set_lock(&counter_lock);
(*pcounter1)++;
#ifdef VERBOSE
oldcounter=*pcounter1;
#endif
(*pcounter2)++;
omp_unset_lock(&counter_lock);

#pragma omp for
/* start with iteration nthread to take into account pre-loop iter */
for (iter=nthread; iter<iterations; iter++) {
for (iter=0; iter<iterations; iter++) {
omp_set_lock(&counter_lock);
/* keep stuf within lock region as brief as possible */
#ifdef VERBOSE
distance = ((*pcounter1)++)-oldcounter;
oldcounter = (*pcounter1);
#ifdef DEPENDENT
double tmp1 = COUNTER1;
COUNTER1 = cosa*tmp1 - sina*COUNTER2;
COUNTER2 = sina*tmp1 + cosa*COUNTER2;
#else
(*pcounter1)++;
COUNTER1++;
COUNTER2++;
#endif
(*pcounter2)++;
omp_unset_lock(&counter_lock);
#ifdef VERBOSE
sum_distance += distance;
sum_distance2 += distance*distance;
#endif
}

#pragma omp master
{
#pragma omp master
{
refcount_time = wtime() - refcount_time;
}
} /* end of OpenMP parallel region */

#ifdef VERBOSE
if (iterations > 1) {
avg_distance = (double) sum_distance/(iterations-1);
avg_distance2 = (double) sum_distance2/(iterations-1);
}
#ifdef DEPENDENT
refcounter1 = cos(iterations);
refcounter2 = sin(iterations);
#else
refcounter1 = (double)(iterations+1);
refcounter2 = (double) iterations;
#endif
if ((*pcounter1) != iterations || (*pcounter1) != (*pcounter2)) {
printf("ERROR: Incorrect or inconsistent counter values "FSTR64U,
(*pcounter1));
printf(FSTR64U"; should be %d\n", (*pcounter2), iterations);
if ((ABS(COUNTER1-refcounter1)>epsilon) ||
(ABS(COUNTER2-refcounter2)>epsilon)) {
printf("ERROR: Incorrect or inconsistent counter values %13.10lf %13.10lf; ",
COUNTER1, COUNTER2);
printf("should be %13.10lf, %13.10lf\n", refcounter1, refcounter2);
}
else {
#ifdef VERBOSE
printf("Solution validates; Correct counter value of "FSTR64"\n", (*pcounter1));
printf("Solution validates; Correct counter values %13.10lf %13.10lf\n",
COUNTER1, COUNTER2);
#else
printf("Solution validates\n");
#endif
printf("Rate (CPUPs/s): %d, time (s): %lf\n",
(int)(iterations/refcount_time), refcount_time);
#ifdef VERBOSE
if (iterations > 1) {
printf("Average update distance: %lf\n", avg_distance);
printf("Standard deviation of update distance: %lf\n",
sqrt(avg_distance2-avg_distance*avg_distance));
printf("Mean and standard deviation of update distance for random locks: ");
printf("%lf, %lf\n", (double)(nthread-1), sqrt((double)(nthread*(nthread-1))));
printf(" fair locks: ");
printf("%lf, %lf\n", (double)(nthread-1), 0.0);
}
#endif
printf("Rate (MCPUPs/s): %lf, time (s): %lf\n",
iterations/refcount_time*1.e-6, refcount_time);
}

exit(EXIT_SUCCESS);
Expand Down
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# License

See COPYING for licensing information.
See LICENSE for licensing information.

# Overview

Expand Down
12 changes: 6 additions & 6 deletions README.special
Original file line number Diff line number Diff line change
Expand Up @@ -33,19 +33,19 @@ POSSIBILITY OF SUCH DAMAGE.
This files contains instructions for building and running PRKs using
the Charm++, OpenSHMEM, Grappa, or Fine-Grain MPI runtimes.

These runtimes need to be installed, and you need to specify in file
These runtimes need to be installed, and you need to specify in file
PRKROOT/common/make.defs where they are located. Note that if Charm++,
OpenSHMEM or Grappa are built on top of MPI, MPI does need to be
OpenSHMEM or Grappa are built on top of MPI, MPI does need to be
installed, even if it is not called directly.

Grappa
------
This runtime requires cmake to install. Compile and link flags
depend on the particulars of the installation and cannot be specified
uniformly. Hence, it may be necessary to obtain them from the Grappa
installation after its build from GRAPPAROOT/util/grappa.mk and insert
them into the required Makefile associated with each PRK.
The run script grappa(s)_run may not give the desired result. If
installation, after building it, from GRAPPAROOT/util/grappa.mk and
insert them into the required Makefile associated with each PRK.
The run script grappa(s)_run may not give the desired result. If
Grappa is built to use MPI, the code can be run in most cases by using
mpiexec.hydra, with the usual command line parameters.

Expand All @@ -63,5 +63,5 @@ Fine-Grain MPI version of mpiexec.hydra. This name may well clash
with the version of mpiexec.hydra supporting standard MPI. Such a
conflict may be avoided by starting a sub-shell for FG_MPI runs and
changing path variables only inside that shell. Scripts in the
PRKROOT/Run directory employ this technique automatically and do
PRKROOT/Run directory employ this technique automatically and do
not need to be modified for proper execution.
Loading

0 comments on commit af1d35b

Please sign in to comment.