IWETHEY v. 0.3.0 | TODO
1,095 registered users | 0 active users | 0 LpH | Statistics
Login | Create New User
IWETHEY Banner

Welcome to IWETHEY!

New Left Outer Join question
My mind is currently drawing a blank on this one, so I thought maybe someone else would know the answer. On SQL Server, I need to join two tables (a drug table and a formulary table) where the drug table is the primary selection, and there may be zero or more formulary rows associated with that drug. What I need is to only join on the top row on the match. Something like:
SELECT *\nFROM\n   drug d\n   LEFT OUTER JOIN formulary f ON (f.drugid = d.drugid)

Which works in the case of zero or one, but returns the same drug multiple times when there's more than one associated formulary row. I tried something along the lines of:
SELECT *\nFROM\n   drug d,\n   (\n      SELECT TOP 1 *\n      FROM formulary f\n      WHERE (f.drugid = d.drugid)\n   ) x

But that's not compilable, because d is not in the scope of the inner query. Is there a way to do a left outer join on only zero or one rows?

Thanks.
Expand Edited by ChrisR June 7, 2006, 10:26:29 PM EDT
New One approach
Here goes:

\nSELECT *\nFROM\n  drug d\n  LEFT OUTER JOIN (\n    SELECT f.*\n    FROM\n      formulary f\n      INNER JOIN (\n        SELECT MIN(formularyid) AS formularyid\n        FROM formulary\n        GROUP BY drugid\n      ) top\n      ON f.formularyid = top.formularyid\n    ) f\n    ON (f.drugid = d.drugid)\n


(I got this from adapting a trick I've developed with analytic functions in Oracle. But you can implement this special case without analytic functions.)

Cheers,
Ben

PS I should note that this solution is untested. I don't guarantee that there aren't silly syntax errors or worse.

PPS I swear that when I posted it I had indentation. :-(
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
Expand Edited by ben_tilly June 8, 2006, 12:05:30 AM EDT
Expand Edited by ben_tilly June 8, 2006, 01:13:51 PM EDT
New Some variation on MIN...
...should do the trick. My concern would be in performance, as these are rather large tables and I'm not sure whether the constraints that are outside of the inner views/queries will limit the returned rows - or whether they will build the table the with all the mins and then do a join across to the temporary table. Should be easy enuf to test - since they are large tables, it should be obvious if there's a performance hit.

Anyhow, now that you've got me past my memory block, I should be able to come up with a query. Thanks.
New It probably will not optimize
My experience suggests that complex queries without group bys can be optimized. But as soon as you add a group by, that level of nesting becomes "opaque" to the query analyzer which then does the whole group by only to filter later.

The workaround that I've used for that is to stick inner inner queries in that create a condition that greatly reduces the size of that subselect. As you might imagine, this makes queries rather heinous with lots of duplicated information, so when I do this I typically also switch over to writing code to generate the SQL.

In fact that's usually how I handle complex queries. It is a lot easier to read code that produces SQL, and it is far easier to edit. With the query that I wrote above I might write something like this:
\nmy $formularyid_by_drugid_sql = qq{\n  SELECT MIN(formularyid) AS formularyid\n  FROM formulary\n  GROUP BY drugid\n};\n\nmy $full_formulary_by_drugid_sql = qq{\n  SELECT f.*\n  FROM\n    formulary f\n    INNER JOIN (}\n      . indent($formularyid_by_drugid_sql)\n      . qq{    ) top\n    ON f.formularyid = top.formularyid\n};\n\nmy $drug_and_formulary_isql = qq{\n  SELECT *\n  FROM\n    drug d\n    LEFT OUTER JOIN (}\n      . indent($full_formulary_by_drugid_sql)\n      . qq{    ) f\n    ON d.drugid = f.drugid\n};\n


And now I find the structure of the query somewht easier to interpret.

BTW this wasn't the best example of the technique. It makes more of a difference when you have multiple subqueries and/or interesting WHERE conditions on some subqueries.

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New Yep, took forever to run.
But I did end up figuring out a SS solution using the min function (in non-SQL-standard sort of fashion).

SELECT *\nFROM\n   drug d\n   LEFT OUTER JOIN formulary f ON (f.drugid = d.drugid)\nWHERE\n   ((f.formularyid IS NULL) OR\n    (f.formularyid = (\n       SELECT MIN(forumularyid)\n       FROM formulary x\n       WHERE (x.drugid = d.drugid))))

New Nothing non-standard about that
In fact, that's the solution I'd like to think I'd have come up with if I'd worked on it long enough.
===

Purveyor of Doc Hope's [link|http://DocHope.com|fresh-baked dog biscuits and pet treats].
[link|http://DocHope.com|http://DocHope.com]
New Pre indentation
PPS I swear that when I posted it I had indentation. :-(

There's a quirk in the forum software where editing an old post that has a pre statement will forget the spaces and indentation. I noticed it when I edited the post up above.
New Bug, indentation can be lost during edits (new thread)
Created as new thread #258179 titled [link|/forums/render/content/show?contentid=258179|Bug, indentation can be lost during edits]
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New Does SQL Server support "DISTINCT ON"?
It's an extension I used a lot in PostgreSQL. You'd get:

SELECT DISTINCT ON (d.drugid) *\nFROM\n   drug d\n   LEFT OUTER JOIN formulary f ON (f.drugid = d.drugid)\nORDER BY\n   f.formularyid



[EDIT]

Nope, scratch that. I checked the PostgreSQL docs and it appears the behavior of DISTINCT ON one field when ordering by a different field is undefined and pretty much broken. The better solution would be:

SELECT d.*, f.*, MIN(f.formularyid) AS f_unique\nFROM\n   drug d\n   LEFT OUTER JOIN formulary f ON (f.drugid = d.drugid)\nGROUP BY\n   d.drugid


Without having it installed here, I'm not sure what the MIN() will do with null rows returned by the OUTER JOIN. Assuming that returns the "correct" thing, this seems more readable than Ben's version. I suspect his may perform better on large tables though.
===

Purveyor of Doc Hope's [link|http://DocHope.com|fresh-baked dog biscuits and pet treats].
[link|http://DocHope.com|http://DocHope.com]
Expand Edited by drewk June 8, 2006, 08:55:49 AM EDT
New That is really what I need!
Unfortunately it doesn't appear to be available for SS. But being able to restrict the DISTINCT to a particular column(s) would've been a nice solution. Of course, that leaves the problem of which row to keep and which to throw away.
New The more readable version is invalid (at least in Oracle)
Oracle wants every field in the SELECT that is not a GROUP BY to be an aggregate. So you'd need all of f.* to be in the GROUP BY, at which point your min(formularyid) would accomplish nothing (since you've grouped by formularyid).

That is why I needed two inner levels. One to ind the minimum formularyid for the drugid, and the other to pull out all of the formulary information for that ID.

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New Hmm, both Postgres and MySQL support the same non-standard?
I'm fairly certain that in both of them when you SELECT * from a table, MIN() on one value, GROUP BY another value, you get the entire row that corresponds to the record that contained the MIN. I'm so sure I used to use this behavior, and got the expected results, that I think they both do it.

Or I may have lost my mind since then.
===

Purveyor of Doc Hope's [link|http://DocHope.com|fresh-baked dog biscuits and pet treats].
[link|http://DocHope.com|http://DocHope.com]
New Can you please help me find mine?
Since you only lost your recently, its more likely to be found.
--
[link|mailto:greg@gregfolkert.net|greg],
[link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwethey
Freedom is not FREE.
Yeah, but 10s of Trillions of US Dollars?
SELECT * FROM scog WHERE ethics > 0;

0 rows returned.
New That is a MySQL shortcut
And it doesn't work like you think it does.

First of all it doesn't work in any version of Oracle through Oracle 10. I just tried it.

If you wish to try it in other databases, be my guest. I'd be interested in who elsedoes that.

But from [link|http://lists.evolt.org/archive/Week-of-Mon-20030908/148130.html|comments] that I saw while browsing, missing fields in the group by is something that MySQL allows and Postgres does not. In [link|http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-fields.html|the MySQL documentation] the extension is explained and they admit that it is not standard SQL. They also document the fact that MySQL is free to return any value in the group, so while you might sometimes get the row corresponding to the min, they tell you that you can't rely on that behaviour.

Note that the MySQL documentation for how to be portable is exactly backwards, at least as it comes to dealing with Oracle. Oracle allows expressions in the group by but won't recognize an alias there.

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New Not surprising
I haven't used Postgres in anger for a couple of years. I'm sure there's a lot of MySQL-isms that I incorrectly attribute to Postgres.

As for getting the wrong values back when using this technique in MySQL: either I got very lucky very often, or I used to know a way to make sure you got the right thing back, which I have since forgotten.
===

Purveyor of Doc Hope's [link|http://DocHope.com|fresh-baked dog biscuits and pet treats].
[link|http://DocHope.com|http://DocHope.com]
New Or you didn't double-check enough
Logic errors are the hardest kind of bugs to find.

Depending on the implementation, if you were trying it in an example like the above on a table that had been filled in sequentially, it might frequently pick the minimum row. So it would look like it was doing something, even when it wasn't.

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New Also possible
Though it was long enough ago that I can't even remember a representative example. I'll chalk that up as "something I may have gotten away with". It wouldn't be the only thing.
===

Purveyor of Doc Hope's [link|http://DocHope.com|fresh-baked dog biscuits and pet treats].
[link|http://DocHope.com|http://DocHope.com]
New I've encountered that problem.
I didn't have the patience to figure out a solution in SQL, but could do it another way - one of my favourite solutions is temporary tables, but they bring their own problems.

We had Oracle on the cards a few months ago and so I went [link|http://troels.arvin.dk/db/rdbms/|researching compatibility]. I was surprised that Oracle had so many gotchas. But then there's a certain arrogance from Oracle about The Way To Do Databases, isn't there? :-)

Wade.
"Insert crowbar. Apply force."
New Justifiable arrogance in many cases
Much of what I do with analytic queries you simply Would Not Want to try with MySQL.

Ditto if you are paranoid about your data.

However there is no question that MySQL wins hands down on cost and ease of installation. Its raw performance for simple things ain't shabby either. (On complex queries, Oracle wins again.)

If you want a more detailed comparison, Barry is the one to talk to.

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New Oh I have little doubt about that.
However, I neglected to mention I was limiting my comment to their brand of SQL. :-)

I'm beginning to think we're coming up against some of MySQL's more subtle and complex limits. I wish I could engineer to time to try my app in PostGres... Oracle is not an option for us, financially.

Wade.
"Insert crowbar. Apply force."
New Oracle is picky but...
its brand of SQL is more flexible once you get used to it.

That is, there are not so many shortcuts. But you can do more with it. (Though MySQL is one by one removing its limitations. There is no question that Oracle is not the future of databases.)

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
     Left Outer Join question - (ChrisR) - (20)
         One approach - (ben_tilly) - (6)
             Some variation on MIN... - (ChrisR) - (3)
                 It probably will not optimize - (ben_tilly) - (2)
                     Yep, took forever to run. - (ChrisR) - (1)
                         Nothing non-standard about that - (drewk)
             Pre indentation - (ChrisR) - (1)
                 Bug, indentation can be lost during edits (new thread) - (ben_tilly)
         Does SQL Server support "DISTINCT ON"? - (drewk) - (12)
             That is really what I need! - (ChrisR)
             The more readable version is invalid (at least in Oracle) - (ben_tilly) - (10)
                 Hmm, both Postgres and MySQL support the same non-standard? - (drewk) - (9)
                     Can you please help me find mine? - (folkert)
                     That is a MySQL shortcut - (ben_tilly) - (7)
                         Not surprising - (drewk) - (2)
                             Or you didn't double-check enough - (ben_tilly) - (1)
                                 Also possible - (drewk)
                         I've encountered that problem. - (static) - (3)
                             Justifiable arrogance in many cases - (ben_tilly) - (2)
                                 Oh I have little doubt about that. - (static) - (1)
                                     Oracle is picky but... - (ben_tilly)

Fighty Bits: Yay!
Talkie Bits: SUCK!!
95 ms