The problem with shared-instance caching is that the data can be very volatile.
This is what usually happens when I say 'caching' in this context. :-)
The sort of caching I usually have in mind in this context is the kind that, assuming a typical PHP page, stops the page hitting the database multiple times for the same piece of data. Lifetime is typically seconds and validity is local to the page.
Shared-instance caching of data is a different game. You don't cache volatile data with that; you cache essentially static data. The active user list is a good example, provided users don't get added or delete all that often. There is a range of data in most apps like that that is a few to a few dozen entries and doesn't change in days. Even if it's not much, I imagine putting it in memcached is cheaper than constantly fishing it off a database.
On the other hand, putting PHP session information into memcached apparantly works very well. I'm just about to try that.
Assume equal reads and writes.
Ouch. That makes replication much less attractive.
Instances can't be partitioned, since they are shared by all users. Instances could be wholly on separate partitions, however.
I think I need to be careful with my terminology. LJ horizontally partitioned their user database because they never joined one user's data with another user's data. Then they have a cluster database which is used to indicate which cluster a particular user's data is.
Is your data structure like that? The support system I work with could probably do that because individual tickets stand alone.
Thanks for the input. I'm actually more interested in sparking discussion, and I thought this was a good example to use since it's different than things like forum software and the like.
Indeed. And I really welcome the opportunity.
I personally have been wrestling with the concepts of a scalability/database-abstraction layer and what that means in practice. I posted about it elsewhere, but my main concern with the direction I'm heading is that it seems to be taking SQL away from the application layer which my developers aren't showing any signs that they want to understand. Yet I can see benefits from this because I could put different objects in different databases or even not in a database and the application code doesn't need to know or care.
Wade.
Postscript: I have an opportunity to help optimize a quite different application that is thrashing its database. The owner of the instance has a replication setup but no knowledge how to make the app talk to the slave for reads :-/. Unfortunately, the database handler is the ADODB one from PEAR. In other words, the *whole thing* assumes there is one database and only one database. Knobs. This is (one reason) why I don't like the PHP database library addons. And modifying it will be a non-trivial task. *sigh*
Unfortunately, the one I've written here that *does* know how to use multiple databases intelligently is technically owned by my employer...