Post #273,817
11/23/06 11:21:37 PM
|
Not so fast, bubba boy.
There's a database handler, true, and it does all the switching between servers that are in a replicated arrangement. It's not quite as full-featured as any of the PEAR classes, but then, it's not tied to their method of doing things either (there *are* disadvantages of copying each other's API calls...).
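For illustration only, here is a minimal Python sketch of what a handler that switches between replicated servers might look like. The real handler is PHP and isn't shown in this thread, so the class name, the method names and the "reads go to replicas, everything else goes to the primary" rule are all assumptions rather than a description of the actual code.

```python
import itertools


class ReplicatedHandler:
    """Hypothetical handler: SELECTs rotate over replicas, the rest hit the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Assumption: with no replicas configured, reads fall back to the primary.
        self._readers = itertools.cycle(replicas or [primary])

    def pick(self, sql):
        """Return the connection that should run this statement."""
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT":
            # Round-robin across the read replicas.
            return next(self._readers)
        # Writes, DDL and anything unrecognised go to the primary.
        return self.primary

    def execute(self, sql, params=()):
        # Assumes each connection object exposes execute(sql, params).
        return self.pick(sql).execute(sql, params)
```

The calling code only ever sees `execute()`, which is the point: the page code can keep thinking in terms of one database while the handler decides which server actually answers. A real handler would also have to cope with replication lag and failed servers, which this sketch ignores.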
It is on top of the DB abstraction layer that the objects sit. They are the ones that hide the data fetches and saves. Both of them have been developed to have a low memory footprint - there is a very important call that uses an unbuffered query for that very reason, as well as lots of PHP references. Sometimes I hear the strain on the language...
We look at MySQL clustering from time to time but we can't afford machines that have 80GB of RAM. If they exist. And we're looking at how to do a dispersed database, too, where the data migrates between application instances according to certain rules (like where the end-user is). MySQL doesn't know how to do that.
[link|http://www.danga.com/words/2005_oscon/oscon-2005.sxi|http://www.danga.com...on/oscon-2005.sxi] is a presentation created by Brad Fitzpatrick of LiveJournal. He describes LJ's journey from a single server to a large number of servers all doing bits-n-pieces with no single point of failure.
Our system is currently like his slide 19. *I'm* the one thinking furthest ahead - further than my boss, who is currently taking his first steps beyond the 'buy a bigger server for the database' mode. And none of my web developers are that far up. They are still thinking in terms of one web server talking to one database. The load balancer and the database handler and a few other things make magic so that they can keep programming like that, but they're resisting the learning. So my approach is a little different to LiveJournal's: I'm beginning to abstract away the scalability tricks. Thus a 'scalability layer'. (The term also helps put in people's minds the idea that it's more than a simple database API. :-)
Wade.
"Don't give up!"
|
Post #273,822
11/24/06 1:55:46 AM
|
A question you might ask yourself
and I'm asking more and more is "do you really need a database"? Could you do as well with a well thought out directory structure and files? More and more I'm finding the answer is "probably".
Of course, I've become a bit more adventurous since working at big river. There are precious few conventional databases to be found there, but lots of clever data stores.
[link|http://www.blackbagops.net|Black Bag Operations Log]
[link|http://www.objectiveclips.com|Artificial Intelligence]
[link|http://www.badpage.info/seaside/html|Scrutinizer]
|
Post #273,825
11/24/06 4:53:47 AM
|
It is a promising thought, I'll admit.
I've suggested something like that to my IT colleagues, but I haven't so much as hinted at it to the web developers yet; they'll challenge it unless it can be given to them working and better than what is in place now. And I really need them using the API before I can do anything like that. At the moment, there is still a lot of SQL done in the website...
I do empathise. I see signs of 'it's a database - that must be the way to do it' :-/
Wade.
"Don't give up!"
|
Post #273,829
11/24/06 11:38:17 AM
|
Just gone through that
I've automated some inbound email->order entry into one of our systems. I'm not supposed to create an order for certain types of products when the inventory is low; I'm supposed to hold the order for the next day's processing and try again.
So of course, hold order == database item. Or so I thought.
Then I realized the inbound email had to be stored in its entirety, which really means an index value of inbound sequence number, and then a blob. I needed to parse it anew each time since the parsing took into account external translation tables, which could change each run, which means there is no win to storing the post-parsed message.
Things are never on hold for very long, and the number of held orders is low, in the hundreds at most.
So, I simply created a hold file for each message and feed them back into my process every day. The file name is the inbound timestamp, which ensures FIFO processing.
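Roughly, the hold-file scheme described above could be sketched like this in Python. This is not the actual code: the function names, the `.eml` suffix, the microsecond file-name format and the `try_order` callback are all assumptions made for the example.

```python
import os


def hold_message(hold_dir, received_at, raw_message):
    """Park one raw inbound message; the file name is its arrival timestamp."""
    # Zero-padded integer microseconds, so a plain name sort is arrival order.
    name = f"{int(received_at * 1_000_000):020d}.eml"
    path = os.path.join(hold_dir, name)
    # Mode "x" refuses to overwrite an existing hold file with the same stamp.
    with open(path, "xb") as fh:
        fh.write(raw_message)
    return path


def replay_held(hold_dir, try_order):
    """Feed held messages back oldest first; delete the ones that go through."""
    for name in sorted(os.listdir(hold_dir)):
        path = os.path.join(hold_dir, name)
        with open(path, "rb") as fh:
            raw = fh.read()
        # try_order (hypothetical) re-parses the raw text on every run and
        # returns True once the order has actually been created.
        if try_order(raw):
            os.remove(path)
```

Because the whole raw message is kept and re-parsed each day, changes to the translation tables are picked up automatically, and anything still short on inventory simply stays in the directory for the next run.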
|
Post #273,995
11/26/06 10:38:22 PM
|
Are we going back to frame-style usage?
What you're talking about sounds like the way I understand mainframe "databases" work. I may be making connections that aren't there, but it sounds like there was a style that worked well on the frame. Then we developed relational databases. They were a convenient abstraction for the programmers. They worked well for smaller problems and smaller datasets, and ran on midranges and workstations.
Then as the available horsepower grew some of the old frame jobs were re-written to use relational DBs on fast workstations. (Yes, Barry, this means you.) Now it sounds like the problems we're trying to solve -- the size of data and the number of transactions -- are exceeding what we can do well with relational DBs. So now people are starting to re-invent the style of creating their own domain-specific database.
Is this completely wrong?
===
Kip Hawley is still an idiot.
===
Purveyor of Doc Hope's [link|http://DocHope.com|fresh-baked dog biscuits and pet treats]. [link|http://DocHope.com|http://DocHope.com]
|
Post #274,010
11/26/06 11:49:21 PM
|
Interesting take on things.
Has some merit, I'll admit. I thought I liked the idea of having one MySQL database tuned for some of my data and another tuned differently for another lot of data. You're suggesting that flatfiles might be better for some other types of data and perhaps some other, highly custom arrangement better for another sub-domain.
But then, you couldn't do the sort of ad hoc stuff on mainframe datasets that we do now on many relational DBs. You had to program something and wait for it to finish. Which might be several minutes, it might be several hours, it might be several days.
I think the role of 'database programmer' is being re-invented.
Wade.
"Don't give up!"
|
Post #274,017
11/26/06 11:55:34 PM
|
Best tool for the job and all that
People treat SQL databases as silver bullets for their data storage needs, rarely taking into account the downsides. Time to take a step back and figure out what you really need to accomplish, and THEN determine the best method of storage and access.
|
Post #274,032
11/27/06 1:28:49 AM
|
Yah
Too many people just think "I need to store a lot of data on disk - I guess I need a database". They don't realize that databases COST you something too.
If it's just about persistence - probably you can do better with plain old files. OTOH, if you really do have a lot of different ways of looking at your data, then a database is just the thing.
I'm actually moving an app from an oodb, which is mostly only good at persistence, to postgres because I have ever expanding reporting requirements.
|
Post #274,120
11/27/06 8:07:37 PM
|
Power of abstractions vs. absolute performance
OODBMSs are supposed to offer better abstractions to make programming easier, but I haven't heard anyone say the performance is acceptable. Eventually the hardware will pick up enough that today's experiments will become practical. But by then we'll be trying to do more, with more data.
Maybe we should keep the greybeards around so that when we hit the wall of what we can do with current hardware, they can remind us how they used to do things.
|