[Libre-soc-dev] DCT/FFT augmentations
hendrik at topoi.pooq.com
Sat Jul 3 13:56:25 BST 2021
On Fri, Jul 02, 2021 at 11:27:33PM +0100, Luke Kenneth Casson Leighton wrote:
> On 7/2/21, Hendrik Boom <hendrik at topoi.pooq.com> wrote:
> > Just a note: interpreting + as 'or', and * as 'and',
> > operating on Boolean matrices,
> > and having result, X, and Y be the exact same matrix,
> > updated while being used,
> > gives the traditional Warshall transitive-closure
> > algorithm, if the loops are nested exactly in thie order.
> we'd need a triple boolean logic
> instruction to do it... oh wait! the ternary instruction!
> this can handle 3 operands in with all possible permutations of all
> possible boolean logic combinations between 3 operands.
> > And probably a truckload of read-write conflicts.
> the priority is i j on the result as inner loops, so it's ok.
Rearranging and inserting a branch,
for k in range(len(Y)): # ydim2
for i in range(len(X)): # ydim1
for j in range(len(Y)): # xdim2
result[i][j] += Y[k][j]
we arrive at a formulation that allows collapsing the boolean
array into 64-bit words in the j-direction. I suspect this
is also a speed-up, but one that doesn't mesh well with
collapsing multiple loops into a single instruction.
> read is fine, it's writes that you have to be careful about.
> my other favourite one is GF(2^N) equivalent of mul-and-add.
> this is how Rijndael MixColumns can be done in its original
> mathematical form, with GF(2^8)
> i'll make a note about the Warshall algorithm.
> thank you Hendrik.
> Libre-soc-dev mailing list
> Libre-soc-dev at lists.libre-soc.org
More information about the Libre-soc-dev