Looks like I'm doing a variation of the "obvious way", though mine interleaves the value with itself.
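
For anyone following along, here is roughly what I mean, as a minimal sketch rather than my exact code. It assumes an 8-bit input and a 16-bit result, and the name double_bits_loop is made up for illustration:

```c
#include <stdint.h>

/* Interleave an 8-bit value with itself: every input bit ends up
   twice, side by side, in the 16-bit result. */
uint16_t double_bits_loop(uint8_t v)
{
    uint16_t out = 0;
    for (unsigned i = 0; i < 8; i++) {
        unsigned bit = (v >> i) & 1u;              /* isolate bit i */
        out |= (uint16_t)((bit * 3u) << (2 * i));  /* write it to bits 2i and 2i+1 */
    }
    return out;
}
```

So 0xA5 (1010 0101) would come out as 0xCC33 (1100 1100 0011 0011).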

A table lookup would probably be best. I think I'll do it with a 16-byte table and double up each nybble, since a 512-byte table would take up too much room.
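
Something like the following is what I have in mind for the nybble table. It is only a sketch: I'm assuming an 8-bit input and a 16-bit output (which is where I get 512 bytes for the full table, 256 entries of 2 bytes each), and the names double_nibble and double_bits_table are just placeholders:

```c
#include <stdint.h>

/* 16-byte table: entry n is the nybble n with each of its 4 bits doubled. */
static const uint8_t double_nibble[16] = {
    0x00, 0x03, 0x0C, 0x0F,
    0x30, 0x33, 0x3C, 0x3F,
    0xC0, 0xC3, 0xCC, 0xCF,
    0xF0, 0xF3, 0xFC, 0xFF
};

/* Look up the high and low nybbles separately and join the two bytes. */
uint16_t double_bits_table(uint8_t v)
{
    uint16_t hi = double_nibble[v >> 4];
    uint16_t lo = double_nibble[v & 0x0F];
    return (uint16_t)((hi << 8) | lo);
}
```

That costs two lookups, a shift and an OR per byte instead of one lookup, but the table is 32 times smaller.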