IWETHEY v. 0.3.0 | TODO

Welcome to IWETHEY!

New How much impact should automatic query plan optimization have?
We've got a query that was crushing the server: CPU pinned until we started killing instances of the query. Once they re-ran statistics on the server, we ended up with a new query plan that kept CPU at <30%.

Does this suggest that we need to review the query so there isn't a "bad" plan? Or is this just one of those things that happens when you have high volume?

New Realistic volume, perhaps?
It sounds like something changed from the original usage expectations, or said expectations were not accurate to begin with. The query plans are just another cog in the machine. I would consider it normal for them to be re-evaluated periodically under real-life circumstances (or whenever either the application, or its "application", knowingly changes).
New You should automatically collect stats on an ongoing basis
Data changes. So will the plan.
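A minimal sketch of the point, using Python's built-in sqlite3 as a stand-in for whatever RDBMS is actually in play (the mechanics differ per vendor, but the principle is the same): ANALYZE records per-index row estimates that the planner consults when choosing a plan, so as the data changes, re-collecting stats is what lets the plan change with it.

```python
import sqlite3

# In-memory SQLite standing in for the real server; table and index
# names here are made up for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
con.execute("CREATE INDEX idx_status ON orders(status)")

# Load heavily skewed data: 9,990 'done' rows, 10 'open' rows.
con.executemany("INSERT INTO orders(status) VALUES (?)",
                [("done",)] * 9990 + [("open",)] * 10)

# Refresh optimizer statistics; SQLite writes them to sqlite_stat1.
con.execute("ANALYZE")

# One row per index: total rows and average rows per distinct key value.
stats = con.execute(
    "SELECT idx, stat FROM sqlite_stat1 WHERE tbl = 'orders'").fetchall()
print(stats)

# The planner reads those numbers when it picks a plan for this query.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE status = 'open'"
).fetchall()
print(plan)
```

Run the load again with a different value distribution and re-ANALYZE, and the numbers the planner sees (and therefore, potentially, the plan) change with it.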
New One-off event
This was the first time handling the largest event for a new (this year) client. There was no comparable historical data.

This scenario - unique or annual events - is going to keep happening.

New Isn't that "one off" stuff...
supposed to be the selling point of renting machines/dBs/etc. on "The Cloud"?

It looks like it is...:

Amazon Web Services provides a broad range of services to help you build and deploy big data analytics applications quickly and easily. AWS gives you fast access to flexible and low cost IT resources, so you can rapidly scale virtually any big data application including data warehousing, clickstream analytics, fraud detection, recommendation engines, event-driven ETL, serverless computing, and internet-of-things processing. With AWS you don’t need to make large upfront investments in time and money to build and maintain infrastructure. Instead, you can provision exactly the right type and size of resources you need to power big data analytics applications. You can access as many resources as you need, almost instantly, and only pay for what you use.

Your example seems to be an indication that even "general purpose" problems have to be tuned, and that can be very difficult in one-off situations. Dunno if this means anything in particular in relation to AWS (presumably they have a mountain of internal dB optimization experience, but who knows how much of that they offer to their AWS customers), but I can imagine that the pressure to "optimize" costs for one-off situations could lead to lots and lots of inefficiencies ("just throw more hardware at it - we only do it once a year so it's not worth figuring out how to do it better..."), and lack of growth of internal staff.

ADP and the like have been doing computing stuff for companies for decades and it's "more efficient", but I can't help but think that too much out-sourcing of things like this doesn't help companies (and their employees) in the long-run.

Just some musings...

New Huh?
One off or going to keep happening. You say both.

The key issue is that unless you are paying by the compute cycle, and assuming there are non-peak times, scheduled stats collection is a no-cost win. And if there is never a slow time, then run it anyway, but at reduced priority.
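As a concrete (hypothetical) example of "run it anyway, at reduced priority": a nightly crontab entry, assuming a MySQL server with credentials in ~/.my.cnf. The schedule and log path are invented; Oracle shops would reach for DBMS_STATS and its maintenance windows instead.

```shell
# Hypothetical crontab entry: refresh optimizer stats nightly at 03:00,
# niced down so it loses any CPU contest with real work.
# (mysqlcheck --analyze runs ANALYZE TABLE across the listed databases.)
0 3 * * * nice -n 19 mysqlcheck --analyze --all-databases >> /var/log/stats-refresh.log 2>&1
```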
New What I mean is we'll keep having one-off events
We've never done this event for this client at this venue before.

I find it hard to believe the things that matter to query performance would change so radically from event to event. Volume should be the major factor, so if we can handle $x TPS we should be good for anything under $x. That doesn't seem to be true.

New Absolutely not
TPS is total volume, which has nothing to do with data skew. New or changing field contents can dramatically change which indexes to use, how to use them, or whether to use them at all, and that changes the query plan.
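One way to see why: a planner is, very roughly, comparing a sequential-scan cost against an index-lookup cost. Here's a toy cost model with invented constants (not any real optimizer's arithmetic) just to show that the same table at the same TPS flips plans when a value's frequency changes:

```python
def cheaper_plan(table_rows, rows_per_page, matching_rows,
                 seq_io=1.0, rand_io=4.0):
    """Pick full-scan vs index lookup under a toy cost model.

    Assumptions (invented for illustration): a full scan pays one
    sequential I/O per page; an index lookup pays one random I/O per
    matching row, and random I/O is 4x the cost of sequential.
    """
    scan_cost = (table_rows / rows_per_page) * seq_io
    index_cost = matching_rows * rand_io
    return "index" if index_cost < scan_cost else "scan"

# Same 1M-row table, 100 rows/page. A rare value wants the index...
print(cheaper_plan(1_000_000, 100, matching_rows=50))       # index
# ...but a common value (data skew) makes the full scan cheaper.
print(cheaper_plan(1_000_000, 100, matching_rows=200_000))  # scan
```

In a real optimizer, matching_rows is estimated from the collected statistics, which is why a big skewed load with stale stats is exactly when you get a "bad" plan.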
New This.
The larger the data load, the bigger the potential change in index distribution.
Welcome to Rivendell, Mr. Anderson.
New I've not worked with a DB that needed that to happen.
Sounds like something peculiar to Oracle. I know in MySQL if a query breaks the DB, then you have bigger problems and the solutions are therefore far more wide-reaching and effective.

     How much impact should automatic query plan optimization have? - (drook) - (9)
         Realistic volume, perhaps? - (scoenye)
         You should automatically collect stats on an ongoing basis - (crazy) - (6)
             One-off event - (drook) - (5)
                 Isn't that "one off" stuff... - (Another Scott)
                 Huh? - (crazy) - (3)
                     What I mean is we'll keep having one-off events - (drook) - (2)
                         Absolutely not - (crazy) - (1)
                             This. - (malraux)
         I've not worked with a DB that needed that to happen. - (static)
