[Libre-soc-dev] DCT/FFT augmentations

Hendrik Boom hendrik at topoi.pooq.com
Sat Jul 3 14:37:36 BST 2021


On Sat, Jul 03, 2021 at 02:22:53PM +0100, Luke Kenneth Casson Leighton wrote:
> ---
> crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68
> 
> On Sat, Jul 3, 2021 at 1:56 PM Hendrik Boom <hendrik at topoi.pooq.com> wrote:
> 
> >     for k in range(len(Y)):                   # ydim2
> >         for i in range(len(X)):               # ydim1
> >             if X[i][k]:
> >                 for j in range(len(Y[0])):    # xdim2
> >                     result[i][j] += Y[k][j]
> >
> > we arrive at a formulation that allows collapsing the boolean
> > array into 64-bit words in the j-direction.  I suspect this
> > is also a speed-up, but one that doesn't mesh well with
> > collapsing multiple loops into a single instruction.
> 
> correct.  applying SIMD (or multi-issue execution).
> 
> actually, multi-issue would be fine.  for SIMD you could AND X[i][k]
> with the predicate bits.
> 
> however, to be honest, if we're talking multi-bit patterns (64-bit)
> then the probability of any given 64-bit pattern being zero is relatively
> small for the saving involved.

I cheated above.  Once the bits are compacted into 64-bit words,
I'm using bit-subscripts instead of word-subscripts.  So the inner
loop gets optimised to performing bit operations on entire words
instead of bit-by-bit.

The if X[i][k] still tests individual bits.
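
A minimal sketch of that packing in Python, under some assumptions
not stated in the thread: the matrices are boolean, so += is read as
OR, and each row of Y is packed into a single integer with bit j
holding Y[k][j].  The names pack_row, unpack_row and
bool_matmul_packed are hypothetical, and a Python int stands in for
one or more 64-bit words.

```python
def pack_row(bits):
    """Pack a sequence of 0/1 values into one int; bit j holds bits[j]."""
    word = 0
    for j, b in enumerate(bits):
        if b:
            word |= 1 << j
    return word


def unpack_row(word, width):
    """Inverse of pack_row: expand an int back into a list of 0/1 values."""
    return [(word >> j) & 1 for j in range(width)]


def bool_matmul_packed(X, Y):
    """Boolean matrix product; rows of Y and of the result are packed ints.

    Assumes boolean semantics, i.e. the += of the unpacked loop is an OR.
    """
    Yp = [pack_row(row) for row in Y]
    result = [0] * len(X)
    for k in range(len(Y)):
        for i in range(len(X)):
            if X[i][k]:                # still a test of one individual bit
                result[i] |= Yp[k]     # the whole j loop becomes one OR
    return result


X = [[1, 0],
     [1, 1]]
Y = [[0, 1, 0],
     [1, 0, 0]]
packed = bool_matmul_packed(X, Y)
rows = [unpack_row(w, len(Y[0])) for w in packed]
```

Here rows comes out as [[0, 1, 0], [1, 1, 0]], the same as the
unpacked triple loop with OR in place of +=.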

> 
> if these were single-bit values it'd be a different matter.

Exactly.

> 
> hmmm, what's the difference between this and the bitmatrix operations?
> https://libre-soc.org/openpower/sv/bitmanip/

Those are likely the instructions we'd use.

-- hendrik

> 
> l.
