You've got to walk through each bottleneck in an access path and see where requests start piling up.
How fast can a program read a file, even if it's sitting on a pure RAM disk?
Hmm - I seem to recall moving data at about 300MB per second when walking through a fully cached 1GB file on an Opteron, so a full scan takes roughly 3 seconds. If on average you find what you're looking for halfway through, call it 1.5 seconds for that piece of the puzzle.
So you can serve 6 or 7 queries in about 10 seconds, which is my absolute worst case - the longest I'll wait before hitting the STOP button.
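A rough sketch of how you'd sanity-check that scan-rate number yourself: time a linear search over an in-memory buffer, the way a linear search over a fully cached file behaves. The 100 MB size is an arbitrary stand-in for the 1 GB file, not a number from the original.

```python
import time

# Back-of-envelope check of raw scan speed: search an in-memory buffer
# the way a linear search over a cached file would.
SIZE = 100 * 1024 * 1024          # 100 MB stand-in for the 1 GB file
buf = bytes(SIZE)                 # all zero bytes

t0 = time.perf_counter()
found = buf.find(b"\x01")         # worst case: scans the whole buffer (-1 = not found)
elapsed = time.perf_counter() - t0

rate = SIZE / elapsed / (1024 * 1024)
print(f"scanned at {rate:.0f} MB/s")
```

Modern hardware will come out well past that old Opteron's 300 MB/s, but the shape of the math is the same: buffer size divided by scan rate, halved for the average hit.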
But brute-force scanning like that is really a waste of activity anyway.
So then your choice is to start working through access and storage. Preferably you code a single program that loads the data into memory, has very quick access to it, and publishes a server interface (a simple TCP/IP socket, etc.) that other programs can use.
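A minimal sketch of that load-once, serve-many shape, assuming the simplest possible protocol (one newline-terminated key per connection, "yes"/"no" back). The dataset and key names here are made up for illustration; the real program would load its data from your file at startup.

```python
import socket
import threading

# Hypothetical in-memory dataset, loaded once at startup instead of
# being re-read from disk on every query.
DATA = {"alpha", "beta", "gamma"}

def serve_forever(srv):
    # Accept one connection at a time; each client sends a
    # newline-terminated key and gets back "yes" or "no".
    while True:
        conn, _ = srv.accept()
        with conn:
            key = conn.makefile().readline().strip()
            conn.sendall(b"yes\n" if key in DATA else b"no\n")

def query(port, key):
    # What a client program does instead of scanning the file itself.
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(key.encode() + b"\n")
        return c.makefile().readline().strip()

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
srv.listen()
port = srv.getsockname()[1]
threading.Thread(target=serve_forever, args=(srv,), daemon=True).start()

print(query(port, "beta"))   # yes
print(query(port, "delta"))  # no
```

Note that opening a fresh connection per query, as `query` does here, is exactly the socket-creation cost that shows up as the next bottleneck once the lookup itself gets fast.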
And then you think in terms of memory usage. You don't actually need to store the data if you can come up with a representation of its existence. You'd rather use fewer resources if possible, as long as it doesn't slow down your access.
I seem to recall this particular task lends itself to bitmapped scoreboarding. Also, with a bit of knowledge of the data type, I seem to recall only certain ranges of numbers are used.
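A minimal sketch of that bitmapped scoreboard, assuming the values are integers in a known range - the 0 to 9,999,999 range and the sample values below are invented for illustration, not from the original data.

```python
# One bit per possible value: existence only, not the values themselves.
RANGE = 10_000_000
bitmap = bytearray(RANGE // 8 + 1)    # ~1.2 MB covers the whole range

def mark(n):
    # Set the bit for value n.
    bitmap[n >> 3] |= 1 << (n & 7)

def exists(n):
    # O(1) lookup: test the bit for value n.
    return bool(bitmap[n >> 3] & (1 << (n & 7)))

for n in (42, 1_000_000, 9_999_999):  # pretend these came from the file
    mark(n)

print(exists(42))       # True
print(exists(43))       # False
```

If only certain sub-ranges of numbers are actually used, you can shrink this further by mapping just those ranges into the bitmap instead of the full span.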
So when you compress what you store and add a bit of smarts to the search method, you go from serving a couple of dozen queries a minute to a couple of thousand a second. At that point you find that network socket creation, rather than the actual query, becomes the bottleneck, so then you might move to optimizing that. Or you may find that a couple of cheap load-balanced machines covers your complete universe of possible usage, so you stop.