We're back to a scalability layer.

The problem with shared-instance caching is that the data can be very volatile.

This is what usually happens when I say 'caching' in this context. :-)

The sort of caching I usually have in mind in this context is the kind that, assuming a typical PHP page, stops the page hitting the database multiple times for the same piece of data. Lifetime is typically seconds and validity is local to the page.
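A minimal sketch of that page-local kind of cache, written in Python for brevity even though the thread is about PHP. The names `PageCache` and `fake_db_lookup` are hypothetical and stand in for whatever the page actually uses to reach the database; the point is only that repeated requests for the same key during one page render reach the database once.

```python
class PageCache:
    """Per-request cache: created when a page starts, discarded when it ends."""

    def __init__(self, loader):
        self._loader = loader   # function that actually queries the database
        self._values = {}       # key -> value fetched during this request
        self.db_hits = 0        # counter, present only to make the effect visible

    def get(self, key):
        # Only the first request for a key goes to the database.
        if key not in self._values:
            self.db_hits += 1
            self._values[key] = self._loader(key)
        return self._values[key]


def fake_db_lookup(user_id):
    # Hypothetical stand-in for a real query (e.g. a SELECT by user id).
    return {1: "wade", 2: "scott"}.get(user_id)


cache = PageCache(fake_db_lookup)   # one instance per page render
names = [cache.get(1), cache.get(1), cache.get(2), cache.get(1)]
```

Four lookups here cost two database hits. Because the cache object dies with the page, there is no invalidation problem to solve, which is what makes this kind of caching safe for volatile data.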

Shared-instance caching of data is a different game. You don't cache volatile data with that; you cache essentially static data. The active user list is a good example, provided users don't get added or deleted all that often. Most apps have a range of data like that: a few to a few dozen entries that don't change for days. Even if it's not much, I imagine putting it in memcached is cheaper than constantly fishing it out of a database.
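A sketch of the shared-cache pattern for near-static data, again in Python. `SharedCache` is an in-process stand-in with a memcached-like get/set/delete shape, not a real memcached client, and `load_active_users_from_db`, the `"active_users"` key and the one-hour lifetime are all assumptions made for illustration.

```python
import time


class SharedCache:
    """In-process stand-in for a shared cache such as memcached."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}   # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]   # expired: behave as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (self._clock() + ttl_seconds, value)

    def delete(self, key):
        self._items.pop(key, None)


def load_active_users_from_db():
    # Hypothetical stand-in for the real database query.
    return ["wade", "scott", "drewk"]


def get_active_users(cache, loader=load_active_users_from_db):
    users = cache.get("active_users")
    if users is None:                  # miss: fetch once, then share
        users = loader()
        cache.set("active_users", users, ttl_seconds=3600)
    return users


def on_user_added_or_deleted(cache):
    # The rare write path drops the entry so the next read reloads it.
    cache.delete("active_users")
```

The long lifetime only works because the data rarely changes; the explicit delete on the write path covers the occasional add or delete without waiting for expiry.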

On the other hand, putting PHP session information into memcached apparently works very well. I'm just about to try that.

Assume equal reads and writes.

Ouch. That makes replication much less attractive.

Instances can't be partitioned, since they are shared by all users. Instances could be wholly on separate partitions, however.

I think I need to be careful with my terminology. LJ horizontally partitioned their user database because they never joined one user's data with another user's data. Then they have a cluster database which is used to indicate which cluster a particular user's data is on.
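The arrangement described above (a global directory that says which cluster holds each user, plus per-user data that never needs joining across users) can be sketched roughly as follows. This is an illustration of the idea only, not LJ's actual code; `ClusterDirectory`, `PartitionedStore` and the plain dicts standing in for separate databases are all hypothetical.

```python
class ClusterDirectory:
    """Global lookup table: which cluster holds a given user's data."""

    def __init__(self):
        self._user_to_cluster = {}

    def assign(self, user_id, cluster_id):
        self._user_to_cluster[user_id] = cluster_id

    def cluster_for(self, user_id):
        return self._user_to_cluster[user_id]


class PartitionedStore:
    """Routes every per-user read and write to that user's cluster."""

    def __init__(self, directory, cluster_ids):
        self._directory = directory
        # Each cluster is a plain dict here; in practice it would be
        # a separate database server.
        self._clusters = {cid: {} for cid in cluster_ids}

    def _cluster(self, user_id):
        return self._clusters[self._directory.cluster_for(user_id)]

    def save(self, user_id, key, value):
        self._cluster(user_id).setdefault(user_id, {})[key] = value

    def load(self, user_id, key):
        return self._cluster(user_id)[user_id][key]

    def users_on(self, cluster_id):
        return sorted(self._clusters[cluster_id])


directory = ClusterDirectory()
directory.assign("alice", 1)
directory.assign("bob", 2)

store = PartitionedStore(directory, cluster_ids=[1, 2])
store.save("alice", "journal_title", "Alice's notes")
store.save("bob", "journal_title", "Bob's log")
```

Every operation starts with one directory lookup and then touches exactly one cluster. That only works when no query needs two users' rows at once, which is why the question of whether the data structure looks like this matters.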

Is your data structure like that? The support system I work with could probably do that because individual tickets stand alone.

Thanks for the input. I'm actually more interested in sparking discussion, and I thought this was a good example to use since it's different than things like forum software and the like.

Indeed. And I really welcome the opportunity.

I personally have been wrestling with the concepts of a scalability/database-abstraction layer and what that means in practice. I posted about it elsewhere, but my main concern with the direction I'm heading is that it seems to be taking SQL away from the application layer, which my developers aren't showing any signs of wanting to understand. Yet I can see benefits from this, because I could put different objects in different databases, or even not in a database at all, and the application code wouldn't need to know or care.

Wade.

Postscript: I have an opportunity to help optimize a quite different application that is thrashing its database. The owner of the instance has a replication setup but no knowledge of how to make the app talk to the slave for reads :-/. Unfortunately, the database handler is the ADODB one from PEAR. In other words, the *whole thing* assumes there is one database and only one database. Knobs. This is (one reason) why I don't like the PHP database library addons. And modifying it will be a non-trivial task. *sigh*

Unfortunately, the one I've written here that *does* know how to use multiple databases intelligently is technically owned by my employer...

Post #274,155 by static 11/27/06 11:56:01 PM
We're back to a scalability layer. (Text as above.)
Post #274,158 by drewk 11/28/06 12:06:04 AM

You can do that with PEAR, and still keep it away from the programmers. Create a class that extends DB.php. Wrap the connection method with your own, which checks whether you're reading or updating. If you're reading, do it from one of the slaves. The best way is to put a load balancer in front of the slaves so you can dynamically add or remove slaves without interrupting the app. Next best is round-robin or random selection within your class.

Wait, you said ADODB. I used DB, not DB_ado, but I assume the above still applies.

===
Kip Hawley is still an idiot.
===
Purveyor of Doc Hope's fresh-baked dog biscuits and pet treats: http://DocHope.com
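The wrapper idea in the post above (inspect each statement, send reads to a slave, send everything else to the master) might look something like this. It is a Python sketch of the routing logic only and does not reproduce the PEAR DB or ADODB APIs; `RoutingConnection`, `FakeServer` and the `query` method are hypothetical names, and treating "starts with SELECT" as "is a read" is a simplifying assumption.

```python
import random


class RoutingConnection:
    """Sends SELECTs to a slave and everything else to the master."""

    def __init__(self, master, slaves, choose=random.choice):
        self._master = master
        self._slaves = list(slaves)
        self._choose = choose   # random selection by default

    @staticmethod
    def _is_read(sql):
        words = sql.lstrip().split(None, 1)
        return bool(words) and words[0].upper() == "SELECT"

    def query(self, sql, params=()):
        if self._is_read(sql) and self._slaves:
            target = self._choose(self._slaves)
        else:
            target = self._master   # writes, or no slaves configured
        return target.query(sql, params)


class FakeServer:
    """Stand-in for a real database connection; records what it is sent."""

    def __init__(self, name):
        self.name = name
        self.seen = []

    def query(self, sql, params=()):
        self.seen.append(sql)
        return self.name


master = FakeServer("master")
slave1, slave2 = FakeServer("slave1"), FakeServer("slave2")
db = RoutingConnection(master, [slave1, slave2])

db.query("UPDATE tickets SET status = 'closed' WHERE id = 7")
db.query("SELECT * FROM tickets WHERE id = 7")
```

Application code keeps calling one `query` method and never learns which server answered. Two caveats a real version would have to handle: replication lag means a read issued right after a write can return stale data from a slave, and statements inside a transaction generally need to stay on the master.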
Post #274,159 by static 11/28/06 12:11:50 AM

ADODB has a reputation, mainly that it's a heavy layer for what it does. You're right: deriving a new class and using that instead makes sense. That's probably what I'll do for that other project.

Wade.
"Don't give up!"
Post #274,213 by admin 11/28/06 12:40:02 PM

Re: We're back to a scalability layer.

"Is your data structure like that? The support system I work with could probably do that because individual tickets stand alone."

There may be some opportunities, but they are probably limited.

Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
