[Libre-soc-dev] SVP64 parallel map-reduce idea
Jacob Lifshay
programmerjake at gmail.com
Mon Jun 14 15:41:19 BST 2021
On Mon, Jun 14, 2021, 07:37 Jacob Lifshay <programmerjake at gmail.com> wrote:
> On Sun, Jun 13, 2021, 15:26 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
> wrote:
>
>>
>> question: does the following result in prefix sum behaviour:
>>
>> gpr(RT) = gpr(RA)
>> for j = 0 to vl-1:
>> for i = 0 to j:
>> gpr(rt+j) = OP( gpr(ra+i), gpr(rb+i) )
>>
>
> I'd have to think through if that's a prefix sum (got vaccinated yesterday
> so not feeling the best), however:
> I think it doesn't matter because it's O(n^2) -- prefix sum serial is n
> operations, tree-based prefix sum is at most O(n*log n) (when using the
> least efficient tree prefix sum, O(n) for better ones; if assuming inner
> loop elements are run in parallel: O(log n))
>
> why would you pick an algorithm that takes 4096 steps when there are
> better (faster) algorithms that take less than 1/50 as many steps (both
> serial prefix-sum and work-reduced tree prefix sum (wikipedia link above))?!
>
Oops: miscalculated. it's actually ~2000 steps for your O(n^2) algorithm
for VL=64. that doesn't change my conclusion.
Jacob
>
More information about the Libre-soc-dev
mailing list