Post #121,610
10/17/03 5:35:05 AM
|
Dealing with a lookup table
I have a C program that needs to make big tables of sines and cosines of angles.
There are several ways to proceed.
1) Make one enormous static table that has global scope, big enough to handle the largest possible set of values. This has the advantage of speed, but it's possible that one would have overlapping simultaneous use of the table by different functions, so this is ruled out.
2) Make a static table inside each function that needs one. This bloats the code but seems like the best compromise between speed and data isolation.
3) Dynamically create the tables when they are needed. This eats running time - the overhead of mallocing the tables adds about 5% to the running time over 2) - but it's the safest method of all and results in smaller (= faster-loading) code. Speed, however, is essential - a calculation may happen hundreds of times every second, and there will be parallel calculations going on. Does the overhead of loading the code match the overhead of dynamically creating the tables? A typical table is about 200x31 doubles.
What's the "standard" way of dealing with an enormous table?
Note, the functions that need these tables will eventually live in a shared library.
-drl
|
Post #121,621
10/17/03 8:09:43 AM
|
Probably 1...but
how static is the info? If it's truly static, and all you're doing is lookups - data isolation isn't really going to be an issue. I'd either go with an external function and scope the data by file or simply go with shared memory. (It also sounds like performance is critical and I'd probably add mutexes and multithread the app.)
If the data is static with regard only to a particular function (and different functions can be used at different times) your trade-off is going to be memory use vs. access speed. Throw everything into a single large chunk of memory and access will be fast. Create the memory on the fly and memory use will be less, but you'll take a speed impact.
|
Post #121,622
10/17/03 8:09:58 AM
|
Why is simultaneous access a problem?
Is it enough to love / Is it enough to breathe / Somebody rip my heart out / And leave me here to bleed / Is it enough to die / Somebody save my life / I'd rather be Anything but Ordinary Please
-- "Anything but Ordinary" by Avril Lavigne
|
Post #121,645
10/17/03 10:51:16 AM
|
I echo the question of the other respondents:
Why would you rule out 1), if all you're doing is looking up data? (If you're also writing data, then I understand, but that wasn't what I inferred from your query.)
jb4 "There are two ways for you to have lower Prescription-drug costs. One is you could hire Rush Limbaugh's housekeeper ... or you can elect me President." John Kerry
|
Post #121,652
10/17/03 11:04:08 AM
|
Re: I echo the question of the other respondents:
Yes, I misstated it - there are 10 tables of coefficients for rather complex time series that get updated under certain conditions - not on every throw in general, but at unpredictable intervals. They are "semi-permanent", in that they live far longer than the characteristic time interval of the simulation, but actually get recalculated 10s of times per second.
-drl
|
Post #121,687
10/17/03 1:27:36 PM
|
I vote for option 4
Which is - estimate the number of tables you might have simultaneously, multiply by some safety factor, preallocate these chunks of memory (statically if you like) and then manage their use yourself. You save the mallocs while still maintaining isolation. IOW, create a pool of tables - make each member of the pool the size of the largest table.
For extra fun - build it so it dynamically allocates more tables if the pool runs out but never frees any of them - just marks them for reuse.
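A rough sketch of such a pool in C, under the assumption of a fixed pool size with a malloc fallback for overflow; the names `table_acquire`/`table_release` and the sizes are invented for illustration:

```c
#include <stdlib.h>

#define POOL_SIZE     16          /* expected simultaneous tables x safety factor */
#define TABLE_DOUBLES (200 * 31)  /* size of the largest table */

struct slot {
    int in_use;
    double data[TABLE_DOUBLES];
};

static struct slot pool[POOL_SIZE];   /* preallocated once, no malloc per use */

/* Hand out a free table from the pool; malloc only if the pool runs dry. */
double *table_acquire(void)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            return pool[i].data;
        }
    }
    /* Overflow: the "extra fun" version would keep these for reuse too;
     * this short sketch just allocates them on demand. */
    return malloc(TABLE_DOUBLES * sizeof(double));
}

/* Mark a pool member reusable; free only overflow allocations. */
void table_release(double *t)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (pool[i].data == t) {
            pool[i].in_use = 0;
            return;
        }
    }
    free(t);
}
```

The acquire/release pair never touches malloc in the common case, which is the whole point of the pool.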
In Java, you can't escape the creepy feeling.
--James Gosling
|
Post #121,758
10/17/03 9:15:28 PM
|
Re: I vote for option 4
That's an excellent idea.
This is why I am a shitty programmer - that solution is obvious in hindsight.
-drl
|
Post #121,913
10/19/03 8:53:31 PM
|
Split the two functions
One module that makes and maintains the tables, another that reads them and sends in updates. Call the first from the second when needed. The first can then control locking and memory. The non-programmer answer. thanx, bill
"You're just like me streak. You never left the free-fire zone.You think aspirins and meetings and cold showers are going to clean out your head. What you want is God's permission to paint the trees with the bad guys. That wont happen big mon." Clete questions, help? [link|mailto:pappas@catholic.org|email pappas at catholic.org]
|
Post #122,765
10/24/03 10:56:52 PM
|
Dammit, use relational tables
I remember having this kind of issue in a Fortran program in my early years. I embraced Xbase because with it this was no longer a worry as long as I had the disk space. App languages are usually shitty at large structure handling.
Simply make a table with a column for each "dimension". Example:
Table: TrigTable
-----------------
rowID (optional)
SinValue
CosineValue
theValue
Need more dimensions? No problem:
Table: 3Dspace
--------------
X
Y
Z
theValue
Simple Dimple! You do-everything-with-app-code people need a cluestick.
________________ oop.ismad.com
|
Post #122,776
10/25/03 12:20:43 AM
|
Again, read the problem description.
He's recalculating this stuff 10s of times per second. Put it in tables and he's going to pay both file and index maintenance overhead when all that's really needed is an array.
You do-everything-with-tables people need a cluestick.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #122,823
10/25/03 5:08:07 PM
|
not clear on environment of this thing
"a calculation may happen hundreds of times every second"
I wasn't sure this meant it needed "real-time" response or not. Is it a large batch job, or something like real-time image recognition?
Most desktop DB engines cache a lot of stuff in RAM, I would note. It only starts to become disk-intensive when it runs out of available RAM and can't use indexes or the natural order of rows that fit the grain of requests. But, at least it does not crash at that point.
Does he need something that "degrades gracefully", or is it predictable-speed-or-bust?
________________ oop.ismad.com
|
Post #122,826
10/25/03 5:38:08 PM
|
Re: not clear on environment of this thing
They are planets - a lot of them might be moving around at once, and you might have several simulations going at once. Every planet needs a separate set of tables of sines and cosines as functions of time for each instance. You don't want to recalculate these tables constantly - rather, calculate them for two times that are "close" with respect to the approximation and then linearly interpolate, which is much faster than doing 31*19 sines and cosines for every planet and then slogging through a perturbation table with hundreds of coefficients.
So Todd's idea was the best one, you create a pool of tables and then leave them lying around until memory needs to be reclaimed for whatever reason. This is a middle ground between just hogging a buttload of memory with static tables vs. mallocing and freeing them over and over again.
The perturbation tables come from a spectral analysis of the output of JPL's DE404 numerical integration of the Solar System - so they are extremely accurate over a specific interval. 1/4 to 1/2 second of arc in 4500 years is realistic, even for Pluto. (It's an amazing thought that the exact details of the motion of the Earth, Moon and planets over a 4500 year interval can be modeled in a few hundred K of C code.)
-drl
|
Post #122,846
10/26/03 1:52:10 AM
|
You'll have to wait 4500 yrs to know if you did it right :-)
________________ oop.ismad.com
|