[Libre-soc-dev] harmful SIMD again

Luke Kenneth Casson Leighton lkcl at lkcl.net
Fri Aug 20 12:23:29 BST 2021


https://www.reddit.com/r/programming/comments/p0yn45/three_fundamental_flaws_of_simd/h9ncoft/?utm_source=reddit&utm_medium=web2x&context=3

FUZxxl is advocating the case for SIMD by implementing this algorithm
in AVX512, SSE and NEON:
https://github.com/clausecker/pospop/blob/master/safe.go

i initially made the mistake of thinking it was a straight vectorised popcount,
but it's not: it is a positional popcount.  for count8safe there are 8
accumulators, one per bit position:
* accumulator 0 receives the number of bytes in the ENTIRE input vector
  that have bit 0 set.
* accumulator 1 receives the number of bytes in the entire input vector
  that have bit 1 set.
* and so on, up to accumulator 7 for bit 7.

    for i := range buf {
        for j := 0; j < 8; j++ {
            counts[j] += int(buf[i] >> j & 1)
        }
    }

NOT

    for i := range buf {
        for j := 0; j < 8; j++ {
            // WRONG: counts is NOT the same length as buf,
            // so it must be indexed by j (bit position), not i.
            counts[>>>>I<<<<] += int(buf[i] >> j & 1)
        }
    }
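
to make the difference concrete, here is a minimal runnable sketch of the
first (correct) loop.  the function name count8 and the three sample bytes
are assumptions made up for illustration, they are not taken from pospop
itself: the point is only that counts always has 8 entries, however long
buf is.

```go
package main

import "fmt"

// count8 is a hypothetical name for this sketch (not the pospop API).
// It adds, for each bit position j in 0..7, the number of bytes in buf
// that have bit j set.
func count8(counts *[8]int, buf []uint8) {
	for i := range buf {
		for j := 0; j < 8; j++ {
			// index by j (bit position), never by i (byte index)
			counts[j] += int(buf[i] >> j & 1)
		}
	}
}

func main() {
	var counts [8]int
	// sample input, chosen for illustration:
	// 0x01 has bit 0 set, 0x03 has bits 0 and 1 set, 0x80 has bit 7 set
	count8(&counts, []uint8{0x01, 0x03, 0x80})
	fmt.Println(counts) // prints [2 1 0 0 0 0 0 1]
}
```

three input bytes still produce exactly 8 counters, which is why this is
not an element-wise vector operation.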

turns out that SVP64 can do this algorithm in about 12 or so instructions,
which is roughly 35x fewer instructions than AVX512 or NEON need.

l.


