Your proposal to parallelize the search would still take O(M·N) operations, but it would lower the constant multiplier. If we call this multiplier b, you would reduce the operation count from b·M·N to b·M·N/8 by using eight threads, provided the algorithm is 100 percent parallelizable (...
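To make that arithmetic concrete, here is a minimal sketch (the search kernel and the fixed 8-way split are my assumptions for illustration, not code from your project) that divides a linear scan across eight threads so each thread performs roughly 1/8 of the total work:

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical example: count occurrences of `needle` in `data` using 8 threads.
// The total work is still proportional to the data size; each thread just
// handles about one eighth of it.
std::size_t parallel_count(const std::vector<int>& data, int needle) {
    const unsigned kThreads = 8;
    const std::size_t chunk = (data.size() + kThreads - 1) / kThreads;
    std::atomic<std::size_t> total{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < kThreads; ++t) {
        workers.emplace_back([&, t] {
            const std::size_t begin = t * chunk;
            const std::size_t end = std::min(begin + chunk, data.size());
            std::size_t local = 0;
            for (std::size_t i = begin; i < end; ++i)   // ~1/8 of the scan per thread
                if (data[i] == needle) ++local;
            total += local;                              // combine partial results
        });
    }
    for (auto& w : workers) w.join();
    return total;
}
```

The asymptotic O(M·N) bound is unchanged; only the per-thread constant drops, and only to the extent that all eight threads actually run in parallel.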
You are running user-mode code on a preemptive OS, and you cannot literally take full control of the CPU's processing resources. The OS scheduler decides which threads are in the ready state and which run next. There is also code more privileged than yours that must run; for example, the scheduler...
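As a rough, self-contained demonstration of that point (not code from this thread): if you oversubscribe the machine with more busy threads than cores, each thread's wall-clock time stretches well beyond its share of CPU time, because the scheduler, not your code, decides when each ready thread runs.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    const unsigned cores = std::max(1u, std::thread::hardware_concurrency());
    const unsigned threads = cores * 4;              // deliberately oversubscribe the CPU
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < threads; ++i) {
        pool.emplace_back([i] {
            const auto start = std::chrono::steady_clock::now();
            volatile unsigned long long x = 0;
            for (unsigned long long n = 0; n < 200'000'000ULL; ++n)
                x = x + 1;                            // pure CPU-bound busy work
            const auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                                std::chrono::steady_clock::now() - start).count();
            std::printf("thread %u finished after %lld ms of wall time\n",
                        i, static_cast<long long>(ms));
        });
    }
    for (auto& t : pool) t.join();   // wall times vary: the OS decides when each thread runs
}
```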
One point I see right away is to change the async depth to 4 or more to parallelize the decoding loop. With this you should definitely see an increase in decoding speed, and hence a reduction in latency. I need a few details to analyze this issue better: 1. System c...
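If this is the Intel Media SDK / oneVPL decode pipeline (an assumption on my part, since the original parameters are not shown here), the async-depth change above amounts to setting the AsyncDepth field of mfxVideoParam before initializing the decoder, roughly like this sketch:

```cpp
#include <mfxvideo.h>

// Minimal sketch: raise AsyncDepth so several decode tasks can be in flight at
// once instead of the decoder processing one frame at a time. Session setup,
// header parsing, and surface allocation are omitted.
mfxStatus init_decoder(mfxSession session, mfxVideoParam& par) {
    par.AsyncDepth = 4;   // allow up to 4 decode operations queued in the pipeline
    return MFXVideoDECODE_Init(session, &par);
}
```

With AsyncDepth greater than 1, the application can submit several MFXVideoDECODE_DecodeFrameAsync calls before synchronizing with MFXVideoCORE_SyncOperation, so the decode work of consecutive frames overlaps.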