Unless you have a lot of knowledge...
that doesn't boil down to anything that gives you a big win. Not when 300 million pointers take that much space.
The knowledge that it is "gap-y" could provide a space win, but only at some performance cost. You'd have to store the data in carefully compressed form and then decompress before using. For instance, you could have a table that maps every 64K block of social security numbers to a block of data holding Scott's format compressed with a standard algorithm. To look up a social security number you decompress the appropriate block, then check it for your social security number.
Well-filled stretches and gaps should both compress well. However, every lookup now involves decompressing a block of data rather than just indexing into a bitmap. This can only be recommended as a win if the data is seriously uneven and 128 MB of RAM simply can't be done.
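A minimal sketch of the idea, in Python for concreteness (the block size, zlib as the "standard kind of compression", and the class/method names are my own assumptions, not anything from Scott's post):

```python
import zlib

BLOCK_SIZE = 65536              # 64K social security numbers per block (assumed)
BYTES_PER_BLOCK = BLOCK_SIZE // 8  # each block is an 8 KB bitmap before compression

class CompressedBitmap:
    """Maps block index -> zlib-compressed 8 KB bitmap; absent blocks are gaps."""

    def __init__(self, ssns):
        raw = {}
        for ssn in ssns:
            blk, off = divmod(ssn, BLOCK_SIZE)
            buf = raw.setdefault(blk, bytearray(BYTES_PER_BLOCK))
            buf[off >> 3] |= 1 << (off & 7)   # set the bit for this SSN
        # compress each populated block; runs of set or clear bits shrink well
        self.blocks = {blk: zlib.compress(bytes(buf)) for blk, buf in raw.items()}

    def lookup(self, ssn):
        blk, off = divmod(ssn, BLOCK_SIZE)
        comp = self.blocks.get(blk)
        if comp is None:
            return False                       # the whole block is a gap
        buf = zlib.decompress(comp)            # the per-lookup cost discussed above
        return bool(buf[off >> 3] & (1 << (off & 7)))
```

The per-block dict means completely empty 64K ranges cost nothing at all, at the price of one zlib.decompress per lookup.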
Given the cost of RAM, I'd go with Scott's solution.
Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)