A.4 Shmoos
A "shmoo plot" refers to a graphical display of test circuit patterns as two inputs (such as voltage and clock rate) vary. When writing code to identify the optimal blocking parameters for various kernels, it is useful to do similar tests by varying inputs such as the threadblock size and loop unroll factor. Listing A.3 shows the chShmooRange class, which encapsulates a parameter range, and the chShmooIterator class, which enables for loops to easily iterate over a given range.
Listing A.3 chShmooRange and chShmooIterator classes.
class chShmooRange {
public:
chShmooRange() {}
void Initialize(int value);
bool Initialize(int min, int max, int step);
bool isStatic() const{return m_min==m_max;}
friend class chShmooIterator;
int min() const{return m_min;}
int max() const{return m_max;}
private:
bool m_initialized;
int m_min, m_max, m_step;
};
class chShmooIterator
{
public:
chShmooIterator(const chShmooRange& range);
int operator*() const { return m_i; }
operator bool() const { return m_i <= m_max; }
void operator++(int) { m_i += m_step; }
private:
int m_i;
int m_max;
int m_step;
};
The command-line parser also includes a specialization that creates a chShmooRange from command-line parameters: prepend "min," "max," and "step" to the keyword, and the corresponding range is passed back. If any of the three is missing, the function returns false. The concurrencyKernelKernel sample (in the concurrency/ subdirectory), for example, takes measurements over ranges of stream count and clock cycle count. The code to extract these values from the command line is as follows.
chShmooRange streamsRange;
const int numStreams = 8;
if ( ! chCommandLineGet( &streamsRange, "Streams", argc, argv ) ) {
    streamsRange.Initialize( numStreams );
}
chShmooRange cyclesRange;
{
    const int minCycles = 8;
    const int maxCycles = 512;
    const int stepCycles = 8;
    cyclesRange.Initialize( minCycles, maxCycles, stepCycles );
    chCommandLineGet( &cyclesRange, "Cycles", argc, argv );
}
Users can specify these parameters on the command line as follows.
concurrencyKernelKernel --minStreams 2 --maxStreams 16 --stepStreams 2
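Listing A.3 declares the two classes but does not show their member definitions or a loop that uses them. The following is a minimal sketch of how the pieces might fit together, not the library's actual implementation. The bodies of both Initialize overloads, the iterator constructor, the use of m_initialized as a sanity check, and the default constructor's initializers are assumptions. The helpers findInt, getShmooRange, and countShmooPoints are hypothetical names introduced only for illustration; getShmooRange merely mimics the behavior described above for the command-line specialization.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Interface follows Listing A.3; every member definition here is an assumption.
class chShmooRange {
public:
    chShmooRange() : m_initialized(false), m_min(0), m_max(0), m_step(1) {}
    void Initialize(int value);
    bool Initialize(int min, int max, int step);
    bool isStatic() const { return m_min == m_max; }
    friend class chShmooIterator;
    int min() const { return m_min; }
    int max() const { return m_max; }
private:
    bool m_initialized;
    int m_min, m_max, m_step;
};

// Assumed: a single value is a degenerate range with min == max.
void chShmooRange::Initialize(int value)
{
    m_min = m_max = value;
    m_step = 1;
    m_initialized = true;
}

// Assumed: empty ranges and non-positive steps are rejected.
bool chShmooRange::Initialize(int min, int max, int step)
{
    if (min > max || step <= 0) return false;
    m_min = min; m_max = max; m_step = step;
    m_initialized = true;
    return true;
}

class chShmooIterator {
public:
    chShmooIterator(const chShmooRange& range)
        : m_i(range.m_min), m_max(range.m_max), m_step(range.m_step)
    {
        assert(range.m_initialized);  // assumed purpose of m_initialized
    }
    int operator*() const { return m_i; }
    operator bool() const { return m_i <= m_max; }
    void operator++(int) { m_i += m_step; }
private:
    int m_i, m_max, m_step;
};

// Hypothetical helper: finds "<name> <integer>" in argv.
static bool findInt(int* out, const std::string& name,
                    int argc, const char* const argv[])
{
    for (int i = 1; i + 1 < argc; i++) {
        if (name == argv[i]) {
            *out = std::atoi(argv[i + 1]);
            return true;
        }
    }
    return false;
}

// Hypothetical stand-in for the command-line specialization: prepends
// "min", "max", and "step" to the keyword and fails if any one is missing.
bool getShmooRange(chShmooRange* range, const char* keyword,
                   int argc, const char* const argv[])
{
    int lo = 0, hi = 0, step = 0;
    const std::string k(keyword);
    if (!findInt(&lo, "--min" + k, argc, argv) ||
        !findInt(&hi, "--max" + k, argc, argv) ||
        !findInt(&step, "--step" + k, argc, argv))
        return false;
    return range->Initialize(lo, hi, step);
}

// Visits every (streams, cycles) pair and returns how many were visited.
int countShmooPoints(const chShmooRange& streamsRange,
                     const chShmooRange& cyclesRange)
{
    int points = 0;
    for (chShmooIterator streams(streamsRange); streams; streams++) {
        for (chShmooIterator cycles(cyclesRange); cycles; cycles++) {
            // A real benchmark would time a kernel here, using
            // *streams and *cycles as its parameters.
            points++;
        }
    }
    return points;
}
```

Under these assumptions, the command line shown above yields a stream range of 2 to 16 in steps of 2 (eight values). Combined with the default cycle range of 8 to 512 in steps of 8 (64 values), the nested loop visits 512 measurement points.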