IWETHEY v. 0.3.0 | TODO
1,095 registered users | 0 active users | 0 LpH | Statistics
Login | Create New User
IWETHEY Banner

Welcome to IWETHEY!

New Question about tuning kernel parameters.
I've got an interesting problem with Linux that's proven tricky to track down. Basically, it's a web site running on Apache on half-a-dozen load-balanced servers. All of them talk to a MySQL database on another server. All of them are connected together via a private gigabit ethernet network, as well as their public interfaces.

Now, for various reasons, the web pages open new connections to the database at the start of the pages. (Pooled connections are a political problem, so we don't use them.) We have a few predictable times where this fails, and so I've been digging into the TCP/IP settings and the thread and open file settings in /proc.

First, TCP. The default TCP buffer maximum (sys/net/core/rmem_max and wmem_max) is 128k. In some TCP tuning guide I found, one of the first recommendations was to raise that to 16Mb (!). Not sure of the utility of such a massive increase, I raised ours to just 1Mb, but I don't know how to see if there's any benefit. In fact, nothing obvious has improved. Am I on the right track?

Second, threads. We've got MySQL setting it's own open files limit to 10240 (I was a little surprised this can be done) and it's maximum connection count is 1500. Yet it always peaks at something like 995 threads. The maximum thread count in /proc is something over 14000 and the maximum files open is almost 400000. Upping the open files limit is easy, but unfortunately requires a restart of the database service - although it's just a shutdown and a restart, it takes long enough to be noticed. Is increasing the open files limit the first thing I should try?

Thanks!

Wade.
"Insert crowbar. Apply force."
New I don't know MySQL but...
I wouldn't raise limits that it is not getting near.

I'd guess that the number of threads used is tied to how many different things it is doing at once. If the activity is relatively constant, the number of (threads, files, etc) used will also be constant. Therefore raising the limits won't help you because you aren't coming near them.

Some of those limits you said it isn't near. You haven't said how many files it has opened, but I highly doubt that it is hitting 400000 open files if you only have a maximum number of connections of 1500. I also suspect that the number of threads has to be at least as big as the number of active connections, so if you're at 995 threads max, you aren't hitting the 1500 limit.

I'd think that increasing the TCP buffer size will improve sustained throughput. But if you have a typical web application, throughput isn't the issue. Round trips are. So I would expect to see little if any improvement from that.

Without knowing more about your system, and without MySQL experience, my gut says that you want to do 2 things.

1. Create one connection per Apache child, not one per page request. Creating/closing connections takes a lot of work for a database. If you're having trouble doing that fast enough, then stop doing it so often. I don't know how hard this is to do for you though.

2. Move to a reverse proxy setup. In a reverse proxy setup the browser connects to a lightweight Apache child that connects to the Apache children that connect to the database. This means that a user behind a slow modem ties up only a lightweight process, and not an expensive one that is taking database resources. Doing that will likely drop the number of active database connections that you need by a factor of 5-10 or so. I don't know MySQL, but I know that this makes Oracle much happier. (With mod_perl it also cuts overall memory requirements for the site, which means that you need fewer webservers.)

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New That's the problem.
I *don't* know how many files it has open. Rummaging in /proc/(pid)/fd shows that threads often have the same file open multiple times - does this count? There were just over 1500 items in the

We're fairly sure something is capping connections before it hits max_connections because under sustained load, the thread count in MySQL caps below max_connections. So it's running out of some other resource before connection threads. And when that happens, we get 'cannot create thread' errors - but that doesn't say *why* the thread can't be created. :-/
2
I'd love to move to one connection per Apache process. Unfortunately, that's a political fight, as I mentioned. The web server app is written in PHP so the equivalent is called pooled connections. The problem is that the CEO has access to MySQL's show processlist; pooled connections show up as idle threads in that and Apparantly That's Bad. However, MySQL is known for being fast at connection setup and teardown. So.

The reverse-proxy arrangement sounds interesting and would be something to try for.

Thanks for the thoughts.

Wade.
"Insert crowbar. Apply force."
New You already know this, but I'll say it anyway
The problem is that the CEO has access to MySQL's show processlist; pooled connections show up as idle threads in that and Apparantly That's Bad.
First, regardless of what technical problems you're having the CEO shouldn't be involved at that level. If he is, what is he paying you guys for?

Now, assuming you're dealing with someone who should be looking at the processlist -- and since he's The Boss™, he gets to decide that he is that someone -- it might be appropriate to contrast real production problems that you can demonstrate you are actually having vs. a potential perceived inefficiency.

Is there a slow time or regular maintenance window during which you could try changing the configuration and benchmark it? If so, I'd put together a list of two or three changes you'd like to try -- like bumping up the limits you mentioned above, things you believe won't actually fix it -- and then while you're doing it also try the pooled connections.

And just so I have some frame of reference, what kind of transaction volume (page views) are you doing? I worked on a system with three fairly beefy webservers hitting a MySQL master with five slaves.
===

Purveyor of Doc Hope's [link|http://DocHope.com|fresh-baked dog biscuits and pet treats].
[link|http://DocHope.com|http://DocHope.com]
New There is a risky political solution
Go to the CEO and say, "I have a big scalability improvement that I could easily make which I've been told not to do because we're afraid that you'll misinterpret how efficiently our database is being used."

He'll get upset, but it won't be at you. And the political barrier to the solution will melt away. However the person who he gets upset at is likely to become unhappy with you. OTOH the CEO is likely to be relatively pleased with you, but that might not help you very much.

I suspect that very few people would choose that solution...

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New Heh.
It's tempting. It could even work. All we really need is for pages to be noticeably faster. And for sales to go noticeably up.

But we've got another solution in the offing. We're going to remove the CEO's access from that application. I believe we've weened him off root access on the web servers. It will be part of the effort to stop the web developers having near full reign in the production environment, too (that gives me the horrors sometimes).

Wade.
"Insert crowbar. Apply force."
New TCPOE?
How about a TCP Offload Engine?

There was a couple I found that were Linux only, but only one was available. Ping Barry about the hardware, I think he actually got one.

It might improve you connection success ratio, also make your idle connections a might bit more wantable... by jerri-engineering about the TCPOE with regard to connections and all that.

A small amount of money on hardware might just avoid the political firestorm.
--
[link|mailto:greg@gregfolkert.net|greg],
[link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwethey
Freedom is not FREE.
Yeah, but 10s of Trillions of US Dollars?
SELECT * FROM scog WHERE ethics > 0;

0 rows returned.
     Question about tuning kernel parameters. - (static) - (6)
         I don't know MySQL but... - (ben_tilly) - (5)
             That's the problem. - (static) - (4)
                 You already know this, but I'll say it anyway - (drewk)
                 There is a risky political solution - (ben_tilly) - (1)
                     Heh. - (static)
                 TCPOE? - (folkert)

I think coming back from the summit was the rather more important achievement.
79 ms