This gives a 2x speed increase compared to the existing implementation. Thanks to Steve Thomas for the initial patch and Tim Graham for finishing it. Backport of 1e4f53a6eb8d1816e51eb8bd8f95e704f6b89ead from master.