Welcome to IWETHEY!

I just did a deduping project kind of like that
It was smaller scale, but my approach was to use our existing contract with MapQuest to have them geocode all of the addresses, then match the records up by latitude/longitude. (We already have custom software to make those requests to MapQuest.)

It took, I think, 19 hours to process just over 100,000 addresses, and they successfully geocoded a bit over 98% of them.

Barry probably has a better approach to do the same thing, but this worked for a one-off (given the existing contract and MapQuest code).
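
A minimal sketch of the matching step, assuming every record has already been geocoded to a latitude/longitude pair. The function name `group_by_coordinates`, the `(id, lat, lon)` record layout, and the rounding precision are illustrative assumptions, not the software described above.

```python
from collections import defaultdict


def group_by_coordinates(records, precision=5):
    """Group record ids whose geocoded coordinates match after rounding.

    records: iterable of (record_id, latitude, longitude) tuples, where
    latitude and longitude are None if geocoding failed (hypothetical layout).
    Returns (duplicate_groups, ungeocoded_ids).
    """
    buckets = defaultdict(list)
    ungeocoded = []
    for record_id, lat, lon in records:
        if lat is None or lon is None:
            # Addresses that could not be geocoded need a separate pass.
            ungeocoded.append(record_id)
            continue
        key = (round(lat, precision), round(lon, precision))
        buckets[key].append(record_id)
    # Only coordinates shared by more than one record are duplicate candidates.
    duplicates = [ids for ids in buckets.values() if len(ids) > 1]
    return duplicates, ungeocoded


sample = [
    (1, 40.712776, -74.005974),
    (2, 40.712776, -74.005974),
    (3, 34.052235, -118.243683),
    (4, None, None),
]
duplicates, ungeocoded = group_by_coordinates(sample)
```

Matching coordinates point to the same address, not necessarily the same recipient, and separate units in one building may geocode to a single point, so the groups are best treated as candidates for review rather than rows to delete outright.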

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
Thanks for the tip!
Did some searching for geocoding on CPAN and ended up [link|http://geocoder.us|here]. Looks useful so far; I'm going to run some tests with it tomorrow. I'm thinking about geocoding our entire database, which seems like a great idea.
--
Steve
[link|http://www.ubuntulinux.org|Ubuntu]
     Postal address list cleansing - (Steve Lowe) - (21)
         I just did a deduping project kind of like that - (ben_tilly) - (1)
             Thanks for the tip! - (Steve Lowe)
         do a dump then sort by address. the dupes are identified - (boxley) - (4)
             YM, delete HALF of them... Or NOBODY at that address is left -NT - (CRConrad)
             2 things - (ben_tilly) - (2)
                 so going by lat and long makes more sense, thx -NT - (boxley) - (1)
                     There are apps that standardize addresses - (drewk)
         Here ya go - (broomberg) - (12)
             I think #2 does what I did - (ben_tilly) - (2)
                 Your geocoding process standardized first - (broomberg) - (1)
                     Exactly - (ben_tilly)
             Re: Here ya go - (Steve Lowe) - (8)
                 ObLRPD: "Vote him off the island!" - (Another Scott) - (3)
                     That'd be harder than giving a bath to a bobcat. -NT - (admin) - (2)
                         the trick is... - (cforde) - (1)
                             Talk your talk, wee man. -NT - (admin)
                 Firstlogic match/consolidate is verra nice - (broomberg) - (3)
                     Thanks, having a look. -NT - (Steve Lowe)
                     Re "How much is the cost to mail each duplicate each month?" - (CRConrad) - (1)
                         Exactly - (Steve Lowe)
         Send me a dump of the list in e-mail. - (folkert)
