IWETHEY v. 0.3.0 | TODO

Indexes are only better if you're accessing under 7%...
of the table. If you're accessing more than that, a full table scan is faster.

Note that 7% is a rule of thumb that Oracle came up with internally. Depending on specifics of physical layout, it could be higher or lower than that.

Also note that in a complex query, using an index on one table might get in the way of using an index on another. Don't be scared of the full table scan.

And finally note that in a transactional database, indexes bring a lot of update and maintenance overhead. They aren't a free miracle, and the overhead may kill you. But when they work, they work very nicely. :-)
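When you do know which side of the threshold a query falls on, Oracle lets you steer the optimizer either way with hints. A minimal sketch (table and index names here are invented; FULL and INDEX are standard Oracle hint syntax):

```sql
-- Expecting to touch most of the table? Ask for the full scan:
SELECT /*+ FULL(o) */ *
  FROM orders o
 WHERE status = 'SHIPPED';   -- suppose this matches most rows

-- Expecting high selectivity? Ask for the index:
SELECT /*+ INDEX(o orders_status_idx) */ *
  FROM orders o
 WHERE status = 'VOIDED';    -- suppose this matches well under 7% of rows
```

Hints are a last resort, of course; usually you want fresh statistics and to let the optimizer decide.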

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
It's 5% now.
At least, as of last Thursday, and in 10g.

Speaking of index overhead, I learned a neat thing about indexing: reverse indexes are nice for getting rid of hotspots and wasted space in monotonically increasing keys, as long as you don't need to do a range scan on the key.
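For anyone who hasn't seen one, the syntax is just a keyword on the index (hypothetical table, sequence-fed key assumed):

```sql
-- REVERSE stores each key with its bytes reversed, scattering consecutive
-- sequence values across leaf blocks and relieving the right-hand hot block.
-- Equality lookups still work; range scans on the key do not.
CREATE INDEX orders_id_rev ON orders (order_id) REVERSE;
```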
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
Source?
I got the 7% figure from my SQL tuning class. (We were lucky to have a very good instructor, Ron Soltani. He normally teaches the security class.)

I'm willing to believe a lower figure. But I'm even more willing to believe that the right figure is very dependent on what hardware you have. (Over time it will trend lower as sequential read performance continues to improve faster than scattered read performance.)

You're right about reverse indexes. That's why Oracle created them. :-)

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
My SQL tuning class instructor. :-D
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
Is it coincidence that we both took that class recently?
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
Dunno.
We're moving to 10g soon and wanted to give everyone a heads-up on the cost-based optimizer.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
There was a cost-based optimizer in Oracle 8 as well
I'm finding lots of bugs.

I haven't produced a simple case for the latest yet...

That one turned out to be a bug in my code. (There was an unnecessary group by one level up.)

However I've hit enough bugs that when I see odd behaviour and can't easily find a problem in my query, I suspect Oracle before looking harder. And I'm often right. That's a bad sign.

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
Edited by ben_tilly April 11, 2006, 05:07:40 PM EDT
I'd appreciate a list if you can take the time.
We're on 9i right now, but using rules-based optimization. Cost-based only is one of the reasons we've put off 10g for so long. But if there are bugs, we'll want to know that.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
Hmm...
I haven't been keeping a list. But every so often I run across something. Here is stuff off of the top of my head.

I didn't track down all of the parameters on the last real one that comes to mind. However, in the middle of an insanely complex query, if I did not put an explicit alias on a table in several nested subqueries and refer to that alias for a specific field, I got no error but that field wasn't populated with any information. I added the aliases and data appeared. (If the reference were truly ambiguous, then I should have had an error message. It wasn't ambiguous, and I didn't get an error, but somehow Oracle got confused when it shouldn't have. I checked that Oracle doesn't get confused on simple queries, and didn't bother taking the time to figure out what combination of things it took to confuse it.)
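The defensive habit that fixed it, sketched against invented table names, is to alias every table and qualify every column reference in the nested subqueries:

```sql
-- Without the aliases o and p, a column reference in the subquery can
-- silently resolve against the wrong table. Naming them removes all doubt.
SELECT o.id,
       (SELECT MAX(p.paid_at)
          FROM payments p
         WHERE p.order_id = o.id) AS last_paid
  FROM orders o;
```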

We also have a database where for reasons nobody has figured out, if you set timing on, timing always reports that things took twice as long as they really took. What I mean is that if I do something like this:

select to_char(sysdate, 'YYYY-MM-DD HH:MI:SS') from dual;
set timing on;
select (something complex and slow)
;
set timing off;
select to_char(sysdate, 'YYYY-MM-DD HH:MI:SS') from dual;

the time that Oracle says the query took is consistently double the difference between the dates.

There is a non-bug that will irritate some users. There is a trick where a GROUP BY does an implicit sort, so you can speed up some queries by leaving out the ORDER BY. However, the sorting of NULL versus everything else differs when you group by one column versus more than one. Since this is officially undefined behaviour, it isn't Oracle's fault that this changed. However it may still trip up some people.
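The safe habit, sketched against an invented table, is to spell out the ordering even when the GROUP BY appears to give it to you for free:

```sql
-- GROUP BY's output order is an implementation detail and can change
-- between releases. An explicit ORDER BY costs little and is guaranteed.
SELECT dept_id, COUNT(*) AS headcount
  FROM emp
 GROUP BY dept_id
 ORDER BY dept_id;
```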

There were a couple more bugs that I remember hitting which Oracle already had patches for. But those brain cells have been repurposed so I forget what they were.

I've run across a number of situations where Oracle simply can't find a good query plan that I wish it did. This is not a bug though. For example consider a query like this:
SELECT (stuff)
FROM big_table, small_table_1, small_table_2, ...
WHERE big_table.id = small_table_1.id (+)
  AND big_table.id = small_table_2.id (+)
  ...

Oracle will only consider query plans where it takes the big table, merges it with one small table, merges that with another, and so on. That gets slow because it throws around a lot more temporary data than it needs to. The best workaround that I know of is to create a temporary table that has all of the fields from small_table_1..n in it. Do a separate insert for each small table, then join a group by over that temporary table against the big table.
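A sketch of that workaround (column names invented, one value column per small table assumed):

```sql
-- Stage all the small tables into one narrow temporary table...
CREATE GLOBAL TEMPORARY TABLE small_combined (
    id    NUMBER,
    val_1 NUMBER,
    val_2 NUMBER
) ON COMMIT DELETE ROWS;

INSERT INTO small_combined (id, val_1)
    SELECT id, val FROM small_table_1;
INSERT INTO small_combined (id, val_2)
    SELECT id, val FROM small_table_2;

-- ...collapse it with one GROUP BY, then join the big table once.
SELECT b.id, s.val_1, s.val_2
  FROM big_table b,
       (SELECT id, MAX(val_1) AS val_1, MAX(val_2) AS val_2
          FROM small_combined
         GROUP BY id) s
 WHERE b.id = s.id (+);
```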

Again, not a bug. But this workaround has proven useful to me more than once. (And there are other cases where I've found it faster to offload data from the database, process it in code, and then load the result back into the database. In fact I'm in the middle of a project like that.)

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
Thanks.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
OK, here's a real bug, just encountered
Consider the following piece of SQL (against a stupid schema that someone else designed, natch):
SELECT /* /dashboard/scorecard/ */
    TO_CHAR(v.date_dm, 'YYYY-MM-DD') as "Week"
  , v.visitors as "Unique Visitors"
  , v.registered_visitors as "Registered Visitors"
FROM (
    SELECT TRUNC(q.date_dm, 'DY') as date_dm
      , NVL(SUM(DECODE(q.type, 15, q.value, NULL)), 0)
        as visitors
      , NVL(SUM(DECODE(q.type, 16, q.value, NULL)), 0)
        as registered_visitors
    FROM vw_ext_quickstats q
    WHERE q.date_dm >= TRUNC(TO_DATE(:1, 'YYYYMMDD'), 'DY')
      AND q.date_dm <= TRUNC(TO_DATE(:2, 'YYYYMMDD'), 'DY')
      AND q.subtype = 1
    GROUP BY TRUNC(q.date_dm, 'DY')
  ) v

When I substitute '20040404' and '20040504' in for :1 and :2, I get numbers for visitors and registered_visitors. When I execute the exact same SQL passing the parameters in as bind placeholders, the date field is populated but everything else has 0s in it. I know that the parameters are actually being passed in properly because if they were wrong, I wouldn't have all of the right dates in the output. (Plus I'm running the query through a well-tested piece of code; parameter passing darned well should work.)

I don't need help solving it; I already have a DBA working on the problem. Furthermore, if it can't be fixed, then I'm sure I can find a workaround. (Heck, "substitute in actual values" is a decent workaround in this case - in fact it is what I'll probably do.) But I just wanted to give you a reality check demonstrating that I really do encounter bugs from time to time in 10g.

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
I'll have to give that one a whirl.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
It can probably be simplified further...
And I don't know how general it is - just that it hit the database I was working against.

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
I heard 8% the other day for MySQL/InnoDB
But no authoritative reference was given.
Percentage differences that close are really just a SWAG.
Depends on record size, speed of disk (sequential VS seeking), RAID type, how much processing you will do in between each record, etc.

There is no way any of the "experts" would be willing to bet a number against a fully populated database unless they'd tested it.