Data changes. So will the plan.
You should automatically collect stats on an ongoing basis, so the optimizer's estimates keep up with the data.
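If your database doesn't already do this for you (PostgreSQL's autovacuum daemon, for instance, runs ANALYZE on its own), a simple scheduled job covers it. Here's a minimal sketch in Python, using SQLite's ANALYZE purely as a stand-in for whatever stats command your engine provides; the path and interval are invented for illustration:

    # Minimal sketch: refresh optimizer statistics on a schedule so the
    # planner's estimates track the data. SQLite's ANALYZE stands in for
    # your engine's equivalent (ANALYZE in PostgreSQL, UPDATE STATISTICS
    # in SQL Server, DBMS_STATS in Oracle).
    import sqlite3
    import time

    DB_PATH = "events.db"        # invented for illustration
    INTERVAL_SECONDS = 6 * 3600  # every six hours; tune to your churn rate

    def refresh_stats():
        conn = sqlite3.connect(DB_PATH)
        try:
            conn.execute("ANALYZE")  # rebuilds the sqlite_stat* tables
            conn.commit()
        finally:
            conn.close()

    while True:
        refresh_stats()
        time.sleep(INTERVAL_SECONDS)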
One-off event
This was our first time handling the largest event for a client who is new this year, so there was no comparable historical data. This scenario - unique or annual events - is going to keep happening.

-- Drew
Isn't that "one off" stuff...
supposed to be the selling point of renting machines/DBs/etc. on "The Cloud"? It looks like it is...:

"Amazon Web Services provides a broad range of services to help you build and deploy big data analytics applications quickly and easily. AWS gives you fast access to flexible and low cost IT resources, so you can rapidly scale virtually any big data application including data warehousing, clickstream analytics, fraud detection, recommendation engines, event-driven ETL, serverless computing, and internet-of-things processing. With AWS you don’t need to make large upfront investments in time and money to build and maintain infrastructure. Instead, you can provision exactly the right type and size of resources you need to power big data analytics applications. You can access as many resources as you need, almost instantly, and only pay for what you use."

Your example seems to be an indication that even "general purpose" problems have to be tuned, and that can be very difficult in one-off situations. Dunno if this means anything in particular in relation to AWS (presumably they have a mountain of internal DB optimization experience, but who knows how much of that they offer to their AWS customers), but I can imagine that the pressure to "optimize" costs for one-off situations could lead to lots and lots of inefficiencies ("just throw more hardware at it - we only do it once a year so it's not worth figuring out how to do it better...") and a lack of growth of internal staff.

ADP and the like have been doing computing stuff for companies for decades and it's "more efficient", but I can't help but think that too much outsourcing of things like this doesn't help companies (and their employees) in the long run.

Just some musings...

Cheers,
Scott.
Huh?
One-off, or going to keep happening? You say both. The key issue: unless you are paying by compute cycles, and assuming there are off-peak times, scheduled stats collection is a no-cost win. And if there is never a slow time, run it anyway, but at reduced priority.
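For the reduced-priority case, one hedged sketch (assuming a Unix host, since os.nice is Unix-only; many engines also have their own throttles, e.g. PostgreSQL's vacuum_cost_delay):

    # Sketch: no quiet window, so run the stats job anyway but let it
    # yield to production work by dropping OS scheduling priority.
    import os
    import sqlite3

    def collect_stats_low_priority(db_path="events.db"):  # path is illustrative
        os.nice(19)  # drop this process to the lowest priority; Unix-only
        conn = sqlite3.connect(db_path)
        try:
            conn.execute("ANALYZE")  # stand-in for your engine's stats command
            conn.commit()
        finally:
            conn.close()

    if __name__ == "__main__":
        collect_stats_low_priority()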
What I mean is we'll keep having one-off events
We've never done this event for this client at this venue before. I find it hard to believe that the things that matter to query performance would change so radically from event to event. Volume should be the major factor, so if we can handle $x TPS we should be good for anything < $x. That doesn't seem to be true.

-- Drew
Absolutely not
TPS is total volume, which has nothing to do with data skew. New or changing field contents can dramatically change which indexes to use, how to use them, or whether to use them at all, and that changes the query plan.
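You can watch this happen with nothing but Python's built-in SQLite: once ANALYZE exposes skew, the same query may flip between an index search and a full scan. A toy demonstration; the table and values are invented:

    # Toy demonstration: the same query can get a different plan once
    # statistics expose the skew. SQLite is used here only because it
    # ships with Python; any cost-based optimizer works on the same idea.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tx (id INTEGER PRIMARY KEY, status TEXT)")
    conn.execute("CREATE INDEX tx_status ON tx (status)")

    # Heavily skewed: almost every row shares one status value.
    conn.executemany("INSERT INTO tx (status) VALUES (?)",
                     [("done",)] * 100_000 + [("open",)] * 10)

    query = "SELECT * FROM tx WHERE status = 'done'"

    print("before ANALYZE:")
    for row in conn.execute("EXPLAIN QUERY PLAN " + query):
        print(" ", row)

    # After this, the planner sees that an average status value matches
    # tens of thousands of rows, so the index may no longer look worthwhile.
    conn.execute("ANALYZE")

    print("after ANALYZE:")
    for row in conn.execute("EXPLAIN QUERY PLAN " + query):
        print(" ", row)  # plan may switch from SEARCH ... USING INDEX to SCAN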
This.
The larger the data load, the bigger the potential change in index distribution.

Regards,
-scott

Welcome to Rivendell, Mr. Anderson.