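...The report also shows which loops these two pragmas apply to. This is not hard to understand: for nested for loops, once PIPELINE is applied to the outer for loop, the inner for loops are automatically unrolled (that is, UNROLL is applied to them). ...From this case we can see that PERFORMANCE is really an automated, or "intelligent", way of choosing which pragmas to apply in order to reach the target throughput.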
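A minimal sketch of the nested-loop case described above, assuming Vitis HLS pragma syntax. The function name `row_sums` and the sizes `ROWS` and `COLS` are hypothetical and chosen only for illustration. Only the outer loop carries PIPELINE. Under the behaviour described above, the tool would unroll the inner loop on its own, as if UNROLL had been written there.

```cpp
#include <cstddef>

// Hypothetical sizes, chosen only for this illustration.
constexpr std::size_t ROWS = 4;
constexpr std::size_t COLS = 8;

// Computes the sum of each row of a small matrix.
void row_sums(const int in[ROWS][COLS], int out[ROWS]) {
    for (std::size_t i = 0; i < ROWS; ++i) {
// PIPELINE is applied to the outer loop only.
#pragma HLS PIPELINE II=1
        int acc = 0;
        for (std::size_t j = 0; j < COLS; ++j) {
            // No "#pragma HLS UNROLL" is written here. Pipelining the
            // enclosing loop is expected to unroll this loop implicitly.
            acc += in[i][j];
        }
        out[i] = acc;
    }
}
```

An ordinary C++ compiler ignores the `#pragma HLS` line (possibly with an unknown-pragma warning), so the function can still be checked in plain software before synthesis.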
loop..."). Clearly this is incompatible with "#pragma unroll ...". Richard, any thoughts on this? For comparison, Intel and IBM compilers have the following syntax: #pragma unroll, #pragma unroll(n), #pragma nounroll. GCC and MSVC don't provide an unroll pragma, though unroll optimization ...
This pragma is needed with clang -O2, as it otherwise doesn't unroll this loop at size M0 == 16.
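A minimal sketch of where such a pragma would sit, assuming clang's `#pragma unroll` spelling. The loop the comment refers to is not shown in the source, so the function `dot16` and its body are hypothetical. Only the trip count `M0 == 16` comes from the text.

```cpp
// Trip count taken from the text; everything else here is illustrative.
constexpr int M0 = 16;

// Hypothetical fixed-size dot product over M0 elements.
int dot16(const int* a, const int* b) {
    int sum = 0;
// Ask the compiler to fully unroll the loop below.
#pragma unroll
    for (int i = 0; i < M0; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

The pragma is an optimization hint and does not change the result. A compiler that does not recognize it ignores it, possibly with an unknown-pragma warning.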