Post #324,570
4/13/10 8:38:58 AM
|
Dumb SQL question
I've got a table listing subscribers. It contains subscriber_ID, list_ID, email, and IP. I need to update all instances where list_ID = foo and IP = bar to the IP for the same email where list_ID = baz.
Here's what I tried:
update subscribers s1
set IP = (
    select IP
    from subscribers s2
    where s2.email = s1.email
      and s2.list_ID = 'baz'
)
where s1.IP = 'bar'
  and s1.list_ID = 'foo';
MySQL responds:
ERROR 1093 (HY000): You can't specify target table 's1' for update in FROM clause
MySQL doesn't allow updating a table that is also used in a subquery. I know there's a workaround, but for the life of me I can't remember what it is.
--
Drew
|
Post #324,583
4/13/10 11:23:53 AM
|
My lazy way
Do a temp table build, then update from that.
I hate any type of correlated subquery and will move away from them unless they're required (for a particularly poor transaction engine).
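For the record, the temp-table route looks roughly like this. A minimal sketch using Python's sqlite3 as a stand-in engine; the table layout follows the thread, but the sample rows and addresses are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subscribers (subscriber_ID INTEGER, list_ID TEXT,
                              email TEXT, IP TEXT);
    INSERT INTO subscribers VALUES
        (1, 'foo', 'a@example.com', 'bar'),
        (2, 'baz', 'a@example.com', '10.0.0.1');

    -- Stage the replacement IPs in a temp table, then update from that.
    -- The subquery now reads the temp copy, so MySQL's error 1093
    -- restriction on the target table never comes into play.
    CREATE TEMP TABLE fixes AS
        SELECT email, IP FROM subscribers WHERE list_ID = 'baz';

    UPDATE subscribers
       SET IP = (SELECT IP FROM fixes WHERE fixes.email = subscribers.email)
     WHERE list_ID = 'foo' AND IP = 'bar';
""")
print(conn.execute(
    "SELECT IP FROM subscribers WHERE subscriber_ID = 1").fetchone()[0])
```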
|
Post #324,587
4/13/10 1:26:55 PM
|
You have ...
..."non lazy" ways? ;-) (nudge)
I will choose a path that's clear. I will choose freewill.
|
Post #324,589
4/13/10 2:05:59 PM
|
Depends
If I needed to update 50MM records in Oracle, and I was worried about rollback/undo segments and about speed, I'd do a PL/SQL bulk update in batches of 50,000 records.
Of course, it took me about three days to work out a method with an acceptable balance of ease and speed. I'd then write a Perl script to generate the PL/SQL and run it, wrapped in some decent reporting, which turned the whole thing into a simple one-liner:
ora_update.pl --target_table=emp --source_table=emp --tmp_table=tmp_data --from_field=source_data --to_field=dest_data --join_key=emp_idx
Then I'd never need to think about it again, at least until someone asks how I'd do it.
You get the idea.
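The batch-and-commit idea itself is simple enough to sketch. A hedged illustration in Python, with sqlite3 standing in for Oracle and reusing the emp/source_data/dest_data names from the command line above; the tiny batch size is just for demonstration:

```python
import sqlite3

BATCH = 3  # stand-in for the 50,000-record batches in the real job

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_idx INTEGER PRIMARY KEY,"
             " source_data TEXT, dest_data TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?, NULL)",
                 [(i, "v%d" % i) for i in range(10)])
conn.commit()

# Walk the key range one batch at a time, committing after each
# chunk so the rollback/undo footprint stays small.
keys = [r[0] for r in conn.execute("SELECT emp_idx FROM emp ORDER BY emp_idx")]
for start in range(0, len(keys), BATCH):
    chunk = [(k,) for k in keys[start:start + BATCH]]
    conn.executemany(
        "UPDATE emp SET dest_data = source_data WHERE emp_idx = ?", chunk)
    conn.commit()
```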
|
Post #324,594
4/13/10 3:10:33 PM
|
Maybe not so simple then
I assumed this was an easy thing to do. Guess I'll do the temp table. This is a one-time maintenance job (yes, really) so I won't need something fast or repeatable.
--
Drew
|
Post #324,617
4/13/10 8:17:00 PM
|
Did you try a self-join?
It feels weird the first time you do it, but when it works you start wondering what all the fuss was about.
update subscribers s1
inner join subscribers s2
    on s1.email = s2.email
   and s2.list_ID = 'baz'
set s1.IP = s2.IP
where s1.IP = 'bar'
  and s1.list_ID = 'foo';
Wade.
Q:Is it proper to eat cheeseburgers with your fingers? A:No, the fingers should be eaten separately.
|
Post #324,620
4/13/10 9:23:05 PM
|
Not yet, but about to
This is what happens when you don't write code for a while. You forget the elegant way.
--
Drew
|
Post #332,623
9/15/10 8:07:41 PM
|
And then there is the (ugly) semi-manual way
OK, I know this is half a year old and solved already, "done and dusted". Just thought I'd contribute this other technique that can be useful in all situations like this: Run the Select first, written to give you a bunch of single-line update statements; then take the output from your select and run that. Something like, say:
select 'update subscribers set IP = ''' || IP || ''' where email = ''' || email
       || ''' and list_ID = ''foo'' and IP = ''bar'';'
from subscribers where list_ID = 'baz';
Here's the "(semi-)manual" bit: now copy the lines from your SQL output window into your SQL input window and click "Run as a script" (or whatever your equivalent is), and you're done. Heck, you can make it even more manual if you want to: having trouble getting those apostrophes/single-quotes right around the "foo" bit? Just write "*&%#@*" instead of ''foo'' in the select statement, run it, then paste the whole resulting shebang into a text editor and do a global search-and-replace (replace "*&%#@*" with "'foo'") before pasting the corrected result into your SQL command window and running it.
Not for the Crazed One's 500 M rows, of course, but for anything up to a few thousand...? Heck, yeah -- what you lose in execution time is more than made up for by the coding time you save by simplifying the task this way. It's "Old-school hot-rodding applied to SQL RDBMS": It's an ugly kludge, but it works, and it's fast. :-)
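To make the generate-and-run trick concrete, here is a small end-to-end sketch using Python's sqlite3 (where the || concatenation happens to work too); the sample rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subscribers (subscriber_ID INTEGER,"
             " list_ID TEXT, email TEXT, IP TEXT)")
conn.executemany("INSERT INTO subscribers VALUES (?, ?, ?, ?)", [
    (1, "foo", "a@example.com", "bar"),
    (2, "baz", "a@example.com", "10.0.0.1"),
])

# Step 1: a SELECT whose output is a batch of single-line UPDATEs.
stmts = [row[0] for row in conn.execute("""
    SELECT 'update subscribers set IP = ''' || IP ||
           ''' where email = ''' || email || ''' and list_ID = ''foo'';'
    FROM subscribers WHERE list_ID = 'baz'
""")]

# Step 2: "paste the output back into the input window" and run it.
conn.executescript("\n".join(stmts))
```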
--
Christian R. Conrad
Same old username (as above), same old GMail.
|
Post #332,624
9/15/10 8:14:45 PM
|
This is also good for invasive updates.
You know the sort: lots of rows to change and no downtime available. The advantage is that lots of small UPDATE statements won't prevent the site running normally.
Wade.
Q:Is it proper to eat cheeseburgers with your fingers? A:No, the fingers should be eaten separately.
|
Post #332,630
9/15/10 9:12:21 PM
|
That's my kind of ugly
I've done more maintenance that way than I care to remember. (And obviously more than I did remember at the time I was asking.)
--
Drew
|
Post #332,632
9/15/10 9:17:00 PM
|
It's also an interesting lesson in pragmatism.
Whilst it has its own kind of elegance, a lot of purist programmers will cringe at this solution. They would prefer Leveraging The Power Of SQL directly, which ignores, however, the real-world issue that the database can't be unavailable for an hour whilst your big, clever query runs to completion...
Wade.
Q:Is it proper to eat cheeseburgers with your fingers? A:No, the fingers should be eaten separately.
|
Post #332,635
9/15/10 9:27:26 PM
|
I've done round trips mySQL -> vim -> PHP -> mySQL
Run a query and grab the output.
Search/replace in vim.
Save as a PHP script that issues updates in batches of n.
I've done this to move sites to new servers when the respective mySQL boxes couldn't talk to each other.
--
Drew
|
Post #332,641
9/15/10 11:25:34 PM
|
Another batching method
One thing I've done when I had to update a lot of live records, and wasn't concerned with rolling them back, is to create an update that works in little chunks. Use whatever option your SQL system has to limit the number of rows updated/deleted, such as TOP in T-SQL. Then stick the whole thing in a while loop that runs the SQL, sleeps for a few seconds, and loops as long as there are any rows to process.
Here is one I used to delete all the detail records for one very large contract from a live system, slightly altered to protect the innocent. The loop breaks when it runs out of records, because @@ROWCOUNT holds how many records the last update/insert/delete affected in T-SQL.
WHILE (1=1)
BEGIN
    DELETE TOP (10000) FROM spot WHERE contract_id = 1234;
    IF @@ROWCOUNT = 0 BREAK;
    -- wait 10 seconds between loops to relieve congestion on the server
    WAITFOR DELAY '00:00:10';
END
Not exactly the same problem. I did this because deleting more than a million rows at once was just clogging up the transaction log and slowing everything to a crawl. However, with a bit of string manipulation and EXEC you could use this solution to scale your method up to the really big problems.
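The same chunked loop can be sketched outside T-SQL too. Standard SQLite has no DELETE TOP (n), so this Python illustration deletes a chunk of rowids per pass instead; contract 1234 and the row counts are made up, and the sleep is commented out so the demo runs instantly:

```python
import sqlite3
import time

CHUNK = 100  # stand-in for the 10,000-row chunks in the T-SQL version

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spot (contract_id INTEGER, detail TEXT)")
conn.executemany("INSERT INTO spot VALUES (?, ?)",
                 [(1234, "d%d" % i) for i in range(450)] + [(1, "keep")])
conn.commit()

while True:
    # No DELETE TOP (n) here, so delete one chunk of rowids at a time.
    cur = conn.execute(
        "DELETE FROM spot WHERE rowid IN "
        "(SELECT rowid FROM spot WHERE contract_id = ? LIMIT ?)",
        (1234, CHUNK))
    conn.commit()
    if cur.rowcount == 0:   # plays the role of @@ROWCOUNT
        break
    # time.sleep(10)  # breather between chunks on a live server
```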
Jay
|