To improve the throughput of current scheduler, we can do a simple optimization by calcluating priorities in parallel. This doubles the throughput of density test, which has the default config with 3 priority funcs (the spreading one does not actually consume any computation time). It matches the expectation.