
Welcome to IWETHEY!

New We have a lot of levels of caching
and have to keep them all in mind when designing. (Yes I know you were talking about instruction caching - we do something similar at the service call level).

It makes for a challenging environment but I find it a lot of fun.



"Whenever you find you are on the side of the majority, it is time to pause and reflect"   --Mark Twain

"The significant problems we face cannot be solved at the same level of thinking we were at when we created them."   --Albert Einstein

"This is still a dangerous world. It's a world of madmen and uncertainty and potential mental losses."   --George W. Bush
New Agreed
I had an interesting meeting today.

We had moved from Oracle/Solaris 8/Veritas/Xyratex to Oracle/Linux 2.4/ext3/Clariion.

Went from quad CPU Sparc 450MHz to quad CPU Opteron 2.2GHz.

I considered ext3 the iffiest part of the move, but this is RH AS3, with no decent file systems available, at least for large data.

So the programmers started complaining about incredibly poor performance a month or so after the move.

Note: I was involved in NONE of their code, other than the foundation of the project 5 years ago.

And all the coders who worked on it until 6 months ago are gone.

So anyway, the new guy, who is a brilliant Perl programmer, but knows NOTHING of large data, is bitching about the terrible performance.

None of my tests show anything wrong, so I just watch his processing.

When they moved from Sparc to Opteron, they figured they had a bunch of CPU, so they ALSO killed 6 dual Xeon compute servers and centralized their CPU intensive runs on the single box.

At the same time they were doing the Oracle work.

Driving their load average to 20.

Blaming the system.

SMACK!

Today was the day the manager of that group said he had no worries about performance anymore, that the system I designed and gave to him to use works great, and that he was sure there would be no performance problems in the future now that his people realized they were being silly. They've been running just fine for a couple of weeks now.
New Ahhh, cogs click into place.
I would have fallen over if you told me it was at 20.

How the FSCK would you ever get that high doing the compute stuff you are doing... unless they were throwing everything at it at the same time.

I have seen machines with load averages in the 200s and greater performing just fine. One was an application server that scheduled and ran multiple, multiple, multiple servlets (or computelets) per second. It was able to keep up, but you can only execute so many a second.

In any case, these things were the cause of the 200+ load average... The CPUs, Disk I/O, Memory I/O, Memory Cache, Buffering, etc... weren't even being pegged at 200+ Load Average. So many things to be executed and submitted at a time; some functions submitted 300 joblets (or whatever they called them) at a time.

The only time the app servers were that busy was during Fall Registration opening and closing. Other than that, it barely rose above 2.

This was an IBM 4 Proc P9XX system doing the work with a Gaggle of Memory and nice disk perf.
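For anyone who wants to put numbers on this, here is a rough Python sketch (not something run on either system in this thread) that divides the load average by the CPU count. The helper name `load_per_cpu` is made up for illustration. Treat the ratio as a hint only: as the 200+ example shows, a big number by itself doesn't prove the box is struggling.

```python
import os


def load_per_cpu(load_average, cpu_count):
    """Return the load average divided by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


# The figures from the thread: a load average of 20 on a quad-CPU box.
print(load_per_cpu(20.0, 4))  # 5.0

if __name__ == "__main__":
    # os.getloadavg() exists on Unix-like systems only, so guard the call.
    try:
        one_min, five_min, fifteen_min = os.getloadavg()
    except (AttributeError, OSError):
        print("load average not available on this platform")
    else:
        cpus = os.cpu_count() or 1
        print(f"1-minute load per CPU: {load_per_cpu(one_min, cpus):.2f}")
```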
--
[link|mailto:greg@gregfolkert.net|greg],
[link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwethey
[image|http://www.danasoft.com/vipersig.jpg||||]
New It's helpful to remember just what "load average" means.
Someone told me it was how many scheduled items in the kernel missed their turn in the schedule. Sounds like one of the few times that "load average" is actually highly misleading.

Wade.
Save Fintlewoodlewix
New yea, That is a good analogy...
But I have always described it as jobs waiting in the queue during the sampling interval.

Or in other words, Jobs waiting for their turn in the scheduling.

Sort of like Left turn lanes in the USA (Right turn lanes in places that drive on the wrong side of the road).

At busy intersections, left turn lanes typically build up a queue of cars to turn left. Most of the time only 3-5 get through per light. Sometimes takes 5 or 6 light cycles to get through it.
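To make the "jobs waiting in the queue, sampled over time" idea concrete, here is a small Python sketch. It assumes the Linux-style calculation, where the queue length is sampled about every 5 seconds and folded into an exponentially damped average. The function name `damped_load` and its parameters are invented for illustration; this is not the kernel's code, which uses fixed-point arithmetic.

```python
import math


def damped_load(samples, window_s=60.0, interval_s=5.0, start=0.0):
    """Fold queue-length samples into an exponentially damped average.

    samples    -- queue length seen at each sampling tick
    window_s   -- averaging window in seconds (60 for the 1-minute figure)
    interval_s -- seconds between samples (assumed to be 5 here)
    """
    decay = math.exp(-interval_s / window_s)
    load = start
    for queue_length in samples:
        # Old history fades by `decay`; the new sample fills in the rest.
        load = load * decay + queue_length * (1.0 - decay)
    return load


# One minute (12 samples) of a steady 20 jobs queued, starting from idle.
one_minute = damped_load([20] * 12)
print(round(one_minute, 2))  # 12.64
```

So a box that suddenly has 20 jobs queued takes a while to show "20" in the 1-minute figure, and the number says nothing about whether the CPUs, disks, or memory are the thing being waited on.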
     Why Apple moved to Intel - (admin) - (18)
         Nice article. Thanks. - (Another Scott) - (1)
             Size == Speed - (ChrisR)
         I'm not a programmerbloke, but... - (pwhysall) - (15)
             Obviously you are not a programmer :-P - (ben_tilly) - (14)
                 Re: Obviously you are not a programmer :-P - (pwhysall) - (1)
                     Everything winds up in cache - (ben_tilly)
                 But neither are you, at least not that low a level - (broomberg) - (11)
                     I might surprise you - (ben_tilly) - (3)
                         Latency latency latency - (broomberg) - (2)
                             Note that most macs are single-user systems -NT - (ben_tilly) - (1)
                                 No argument -NT - (broomberg)
                     Our perl programmers think that way all the time - (tuberculosis) - (6)
                         Please reread, I changed it - (broomberg) - (5)
                             We have a lot of levels of caching - (tuberculosis) - (4)
                                 Agreed - (broomberg) - (3)
                                     Ahhh, cogs click into place. - (folkert) - (2)
                                         It's helpful to remember just what "load average" means. - (static) - (1)
                                             yea, That is a good analogy... - (folkert)
