
New Hahaha he said: ....
Winders...

more like Windon't

Sorry, I'm sick of the whole Microsoft Networking Stack and its inability to even begin to perform without stupid tricks.

I've just got done (over the weekend) helping a friend diagnose a slow webserver: Linux with Apache and Tomcat and a few other things... all backed by a Windows 2008 Server with MS-SQL on it (not his choice and not his choice). The Linux, Apache Foundation, Sun Java and Microsoft software is all at the latest versions and patched to the hilt. All hardware is commensurate with the job.

Symptom: Static content served by Apache blazes. Static-like (stuff in the WAR and other contexts) content served by Tomcat blazes. Dynamic content served by Apache (widgets served by AJAX) is "ok" for speed (these are driven by stuff I don't understand yet). Tomcat Dynamic content (meat of the site) is so slow...

Watching with Wireshark on the Linux side, it appears that requests from Apache go directly to the Windows Server machine... and get a response back pretty soon. For some reason it still feels very slow. Utilization on the Linux machine is minimal, if detectable at all, except for memory use.

Continuing to watch with Wireshark, Tomcat requests to the Windows machine basically sit and wait and wait and wait, then piddle out of the Windows machine. They never time out; things are just slow and then eventually finish.

It's like there are two modes on output from the Windows machine: Slow and Slower.

Well then, we kick on Wireshark on the Windows 2008 Server; no Windows Firewall on, or other limiting products that I could see. We put the Windows machine into promiscuous listening mode. What a different view we see from the Windows machine. Fast, responsive... in fact wait... let me re-phrase that... *BLAZING* ... in fact wait... The whole system is running as expected. What?

We run it through some of the most intensive/expensive stuff we have to throw at it. Response times are sub-100ms to the browser for a complete page and all AJAX is smooth, everything. Seeing all the responses immediately from the Windows machine, we figure it was just stuck. Cool!

Turn off Wireshark and go back to regular mode on the NIC. Ummm, ooof. Back to the same old non-performing setup.

So, we turn on port mirroring on the switch for the Windows Server first and Linux Webserver second. Plug in a Linux machine and he runs Wireshark to watch everything. Sure enough, we see bad response and slow throughput from the Windows machine. Using Wireshark to turn on promiscuous mode on Windows 2008 Server again... watching from the external monitor, no response time issues.
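
For anyone wanting to repeat the before/after comparison without eyeballing Wireshark, a small timing script is one way to put numbers on it. This is only a sketch, not what was actually run here: the function names, the host name `webserver.example`, port 80 and the bare HTTP/1.0 request are all assumptions, and it measures time to the first response byte rather than a complete page.

```python
import socket
import statistics
import time


def time_request(host, port, payload, timeout=10.0):
    """Return seconds from connect start until the first response byte arrives."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
        conn.recv(1)  # blocks until the server starts answering (or closes)
    return time.perf_counter() - start


def summarize(samples):
    """Reduce a list of latencies in seconds to min/median/max in milliseconds."""
    ms = sorted(s * 1000.0 for s in samples)
    return {"min_ms": ms[0], "median_ms": statistics.median(ms), "max_ms": ms[-1]}


def measure(host, port, payload, runs=20):
    """Time `runs` sequential requests and summarize them."""
    return summarize([time_request(host, port, payload) for _ in range(runs)])


if __name__ == "__main__":
    # Hypothetical target: point this at a dynamic page on the slow site.
    request = b"GET / HTTP/1.0\r\nHost: webserver.example\r\n\r\n"
    print(measure("webserver.example", 80, request))
```

Run it once with the Windows NIC in regular mode and once with promiscuous mode on, then compare the two summaries.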

So... evidently something is fscked up on the Windows machine. The worst part is, the machine was burnt down and rebuilt about 5 times by a Microsoft sponsored and rec'd "MCSE" because he couldn't, even with Microsoft's direct help and remote control from the company itself, get it to perform. Microsoft left it with him as unresolved, but billable to the tune of $42,000 as the blame was squarely placed on the Open Source Solution being used.

Solution: Leave Windows Networking in promiscuous mode without capturing anything, thereby not causing undue stress on the machine. Everything works great.



Anyone that knows more than I do about Windows, care to explain this for me?







P.S.
They also tried running everything on the Windows 2008 machine at one time, but it beat itself to death regularly. They fixed it to work better, but as soon as they put it on the internet... even with proper precautions... IIS was compromised and the machine was taken down. They then ran Apache on a cheap Linux box... keeping Tomcat on the Windows machine... which performed poorly also. Soon they got two identical machines and ended up getting better performance, but not good. So... that is when MS was called to fix the issue.

P.P.S.
You can bet he doc'd the HELL out of what we did. The place is back billing MS and the "MCSE". Plus, since the project is now 4 months behind, lost billables are being back billed too; as proof of the losses, they had contracts to provide service and were paying penalties. As of yesterday... they are billing customers now.

P.P.P.S
The only reason they had Windows Server in the mix to begin with... Blackberry Enterprise Integration... Whatever, dudes. Whatever. It (whatever *it* is) is also running just fine on the Windows 2008 server in promiscuous mode.
New Bummer.
You have my sympathy in having to deal with Windows and Microsoft like that.

I continue to play with Linux, but I haven't been able to spend the time to figure out how to do reliable installations of software on K/ubuntu (i.e. stuff that isn't in the Ubuntu repositories), Flash is still problematic for me, etc., etc. I have managed to break my experimental Debian install, and that's another ball string.

Yeah, yeah, I know I need to spend the time. :-)

Don't let it get to you - it's job security, right?

Cheers,
Scott.
New Tales like this...
...remind me that my decision to get the fuck out of IT was a good one.
New you is out of IT?
or out of support of IT?
New IT is in my job, but my job isn't IT.
If I had to boil my job down to a single sentence, it'd be "managing the customer's expectations".
New Ah, like customer service at Verizon
--

Drew
New s/managing/lowering/g :-)
New Isn't that always what it means?
Not being sarcastic this time. No one ever talks about "managing expectations" unless they mean that they're trying to lower expectations to fit under the expected reality.
--

Drew
New IME, that is usually so.
A lot of the time it is a conversation between what is desired and what is possible. Sometimes, what is requested is idiotically easy and what can be delivered is much greater. More usually, the customer simply needs their expectations to actually be grounded in reality. That's what "managing expectations" involves. Then it's a conversation between people who think they know what they want and people who have to make something happen. One sales guy I sometimes work with has been known to say that *he* doesn't remember how the database is structured because *we* know and can just tell him when he asks. :-/

Wade.

Q:Is it proper to eat cheeseburgers with your fingers?
A:No, the fingers should be eaten separately.
New Not really
We have a very close relationship with our customer; the product (a rather intricate highway management system) is complex and continually changing due to a rolling programme of upgrades and fault responses.

The relationship is twofold - we talk to their technology people, but we also have a relationship with the operational staff. This being a government agency, left and right hands don't always communicate as well as they might, so we sometimes act as an intermediary.

Operational issues trump all; the expectations being managed are sometimes those of the technology side, with them being told by us that the operations people want feature X or have problem Y, so roadmap feature Z will either have to wait or will cost more or will require a modification.
Edited by pwhysall Jan. 12, 2010, 09:49:56 PM EST
New This sounds like something from "Office Space"
"I deal with the damn customers so the engineers don't have to. I have people skills".

Or something like that...

I will choose a path that's clear. I will choose freewill.
New One possible difference (minor to me)
I've never come across a situation like that, so the following is pure speculation...

On Windows, NICs in promiscuous mode can't bind to the "any" address. They need a specific IP address. That may open/close paths in the network stack, but I would figure that something as bad as you describe would have lit up BBs all over the place if it was a Server 2008 "feature".

It is also possible that the NIC firmware is doing something nasty. Switching to promiscuous mode does involve the hardware after all.
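
If the specific-address point is about raw sockets, it can be seen in a short Python sketch. On Windows, `SIO_RCVALL` on a raw socket asks for all packets on an interface, and the socket has to be bound to a concrete local address first. Assumptions: the function names and the IP address are hypothetical, it needs administrator rights, and this is not necessarily the path Wireshark's capture driver takes.

```python
import socket
import sys


def open_promiscuous_socket(local_ip):
    """Open a raw socket with SIO_RCVALL enabled (Windows only, needs admin)."""
    if sys.platform != "win32":
        raise RuntimeError("SIO_RCVALL is only available on Windows")
    sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_IP)
    # Bind to one concrete interface address; the wildcard ("any") address
    # is not supported for this option.
    sock.bind((local_ip, 0))
    # Ask the stack to deliver all packets seen on that interface.
    sock.ioctl(socket.SIO_RCVALL, socket.RCVALL_ON)
    return sock


def close_promiscuous_socket(sock):
    """Turn SIO_RCVALL back off and release the socket."""
    sock.ioctl(socket.SIO_RCVALL, socket.RCVALL_OFF)
    sock.close()


if __name__ == "__main__":
    # Hypothetical address: use the server's own NIC address here.
    s = open_promiscuous_socket("192.0.2.10")
    input("Receive-all is on; press Enter to turn it off again.")
    close_promiscuous_socket(s)
```

Holding the socket open without reading from it is roughly the "promiscuous without capturing" idea, though whether it has the same effect on the problem described is untested.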
New More info...
The two machines are both DL785s

The Windows machine was nearly $95K, 8 AMD Opteron 8439 SE six-core processors, 256GB PC2-6400 (64 x 4GB) Memory, 16 500GB Drives, 4 Gig network ports, a Dual Channel Fiber HBA (for future connection to a StorageWorks array that is coming soon), iLO, 6 power supplies ... the works, all official HP NICs and Drives and Memory and HBA and... you name it, it had an HP logo or sticker (same as the Linux machine)

The Linux Machine wasn't as impressive at $80K, 8 AMD Opteron 8439 SE six-core processors, 256GB PC2-6400 (64 x 4GB) Memory, 4 500GB SATA Drives, 6 network ports, iLO, 6 power supplies ... almost the works. Enough to really support the SUSE Linux very well.

You see, Microsoft called in HP engineers as well and got everything "set perfectly" to eliminate the possibility of Firmware/BIOS/mish-mash. Microsoft even got the HP Engineers to get the Linux machine "working better". Network throughput was similar for both Windows 2008 and Linux, nearly theoretical throughput.

72 special patches/hot-fixes to Windows 2008 and SQL server.

They tried special network drivers, hard configured NICs, aggregated NICs, even an old Intel 100Mb NIC. Surprisingly, even when connected through a crossover cable this problem manifested itself. They tried stripped down machines (processor/memory/extra anything removed) and got similar results.

HP washed its hands and closed the support request, saying the hardware was working as expected. As proof, they offered the perf testing and maximum raw throughput benchmarks, which were similar on both Windows and Linux.

Well... 17 days of on-site and really no results. Finally Microsoft gave its final verdict: not their problem.



All this crap happened before I started helping, Saturday morning at 10:30AM. I did all this remotely, helping via IM and phone. I could have driven 20 minutes and been there, but, I figured we would find something stupid and it wasn't worth my effort to drive. After all, it was Windows. By 3PM, it was wrapped up.

I personally cannot believe Microsoft and HP didn't find this problem. I guess good ol' troubleshooting is a lost art. It's a matter of throwing out assumptions and looking at the situation afresh; in other words, all assumptions are suspect and need to be re-examined and thought through again.

Just this Monday, I got bit by assumptions on a problem I assumed was a real problem as I "trusted/assumed" the person that helped work the project. He made an assumption that was wrong and I never caught it. I guess it happens.
New Drool....
All we have is a handful of DL380s and DL365s :)

But the HP NICs (Broadcom 1Gb's for us) in those can do strange things under Windows. Mention "teaming" and things can go seriously haywire. An individual card can respond instead of the team, sometimes with the IP address that was used to configure the box. That in turn creates major confusion on the network.
New Andrew... any comment on this?
New Well, my experience is in a rather different area.
Small business accounting systems don't need super-servers, and Windows is on the client end with Linux on the server end, and throughput demand is absurdly low, even in a busy company. Also, vendors don't put out that much effort for my class of clients.

On the other hand, I can confirm that vendor tech support is quite competent - provided only their hardware and supported software is involved. If their stuff passes diagnostics, and still doesn't work - they're totally lost.

From the dim distant past, in simpler times, I remember the words of a formerly skeptical tech support guy at Digital Research when I submitted my ingredients list for a system that kept blowing the data. "That's all perfect, it can't possibly not work!"

In that case it turned out to be some unknown code fragment in the BIOS of the Everex host (Everex was a big name at the time) that kept blowing the data, but only with the exact setup I'd installed. Everex could suggest no fix. I replaced the computer with an ALR (also a big name at the time) and all was well for 11 years. I don't remember for sure, but I think I sold the Everex to someone else and it worked fine until it died.

In a much more recent case, I had been building computers and maintaining the network for a fast growing firm. A guy named Jonathan rose to power and decided to go "all Microsoft" and "name brand" on the computers. He bought a hideously expensive Dell server - but same deal. Every once in a while it blew the data.

Something was wrong with the on-motherboard SCSI controller, but it always passed diagnostics, so nothing could be done. Finally it had a catastrophic failure, and the service contract had just lapsed.

So I ran home and in a few hours built a real cheap server out of junk and some stuff I was supposed to deliver to another company - and I bought a cheap Linksys NAS for off-line data. I never told him the NAS ran Linux or he would have had a seizure.

This cheap rig performed at least as well as the Dell and didn't eat the data. Jonathan disappeared from the face of the earth at about this time (not computer related) so management was free to have me build them a real server that worked.

But that's the bottom line. Start mixing stuff (and how are you going to build a working system any other way) and vendor tech support gets lost in the fog.

And the one thing I hate, hate, hate more than anything else is a server where everything is built onto the motherboard. You can't pull and replace until it stops failing. Even if you bypass an on-board component, and supposedly disable it, that doesn't mean it isn't still there causing trouble. These machines are too complex for anyone to comprehend, and too close to the edge of nervous breakdown for logic and diagnostics to prove anything.
Edited by Andrew Grygus Jan. 13, 2010, 02:48:13 AM EST
New Everex VS ALR
Wow!

Me too. But in a different order.

I bought the ALR dual CPU for a 30 person editorial division. Around 30 floppy disks to install SCO Xenix with the MPX extensions. It was the Compaq SystemPro rip-off.

After 2 months it went back. The vendor burned a bunch of SCSI drives and cards. SCSI was new to them; they badly terminated multiple configurations and sent them back. And then when the next set needed to be swapped (vendor burned them), I received the original burned ones back. I marked everything from that moment on.

The thing would panic on heavy IO. Or when I looked at it poorly.

When we gave up on that, we moved to the Everex Step Cube. Serial number 9. Created by hand, by loving lab techs. It was WONDERFUL.

Then they offloaded the production to a real manufacturing site. Every cube produced after that moment SUCKED. Bus was not grounded properly, boards would fry. Not to me though. Whenever I called support with a problem, 1st I had to prove my system was actually made in-house. Then they'd give me a REAL tech. And then they'd fix the problem. Easily, or at least, with a good chance of happy resolution. They conferenced in people from SCO (when it was a decent company), and got 1-off MPX patches. If I had bought a later one, I'd have had to swap the entire system out since they didn't trust them and would not waste their time with them.
New I sold a few MegaCubes . . .
. . and a whole lot of Step desktops - enough to get an award from Everex. Unfortunately most of them went to one large company.

The company had a branch office and wanted to install a Step there, but Everex refused to provide on-site service in Santa Barbara - too far from their nearest office. The company turned to another brand for all their computers.

At about this time Everex went into serious decline.
New They had very little remote physical techs
They depended on vendors like you for the hands on.

My vendor was IFFY. On the other hand, run through the typical vendors. High-end enough to play with the BLEEDING edge, inexpensive enough to want to buy the non-Compaq system-pro alternatives. This would be an aspiring vendor, someone who is willing to take a risk on Tier-2, unable (or unwilling) to qualify for Tier-1.

The vendor in my area was about an hour away. They were N Jersey, I was S Jersey. There were no others that I was allowed to talk to, since I foolishly called them 1st. I would have preferred a PA/Philly area based vendor.

At that point, a service call would cost serious money if I was paying for travel time. I wasn't. This was all included in the quite reasonable (as compared to CPQ) price. Which meant every blip on this brand-new system (ALR in this case, but same vendor for the cube) would trigger at least a day's effort on the vendor tech's part. They must have lost serious money when trying to install it over 3 months.

Did they call upon you to perform service calls on systems installed by some other vendor?

New No, they never called on me to perform service . . .
. . on any systems. They were attempting to set up their own service structure here in Los Angeles and didn't want mere dealers horning in on their contracts.
     Latest browser benchmarks on Winders. - (Another Scott) - (22)
         Hahaha he said: .... - (folkert) - (19)
             Bummer. - (Another Scott)
             Tales like this... - (pwhysall) - (8)
                 you is out of IT? - (boxley) - (7)
                     IT is in my job, but my job isn't IT. - (pwhysall) - (6)
                         Ah, like customer service at Verizon -NT - (drook)
                         s/managing/lowering/g :-) -NT - (boxley) - (3)
                             Isn't that always what it means? - (drook) - (2)
                                 IME, that is usually so. - (static)
                                 Not really - (pwhysall)
                         This sounds like something from "Office Space" - (beepster)
             One possible difference (minor to me) - (scoenye) - (8)
                 More info... - (folkert) - (7)
                     Drool.... - (scoenye)
                     Andrew... any comment on this? -NT - (folkert) - (5)
                         Well, my experience is in a rather different area. - (Andrew Grygus) - (4)
                             Everex VS ALR - (crazy) - (3)
                                 I sold a few MegaCubes . . . - (Andrew Grygus) - (2)
                                     They had very little remote physical techs - (crazy) - (1)
                                         No, they never called on me to perform service . . . - (Andrew Grygus)
         If you can live with the interface... - (pwhysall)
         This is why I won't support Windows servers. - (static)
