I have a list of about 1500 items. They're arrays; they're not in a hash because there are duplicate keys. I need to locate these duplicates and act on them in some way.
My first loop through the file builds a hash keyed on the potentially duplicate value, adding one to the count for each occurrence.
When I'm done, I iterate over the hash and use the keys whose counts are greater than one to build a list of duplicate items.
I then iterate through the original file and if the current item is in the list of duplicates, I execute some logic to decide what to do with it.
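The three passes above can be sketched roughly like this in Python (the sample data, the `":"`-delimited key format, and the `split` logic are all hypothetical stand-ins for whatever the real file contains):

```python
# Hypothetical sample data: key is the part before ":".
items = ["a:1", "b:2", "a:3", "c:4", "b:5"]

# Pass 1: build a hash (dict) keyed on the potentially duplicate
# value, adding one for each occurrence.
counts = {}
for item in items:
    key = item.split(":")[0]
    counts[key] = counts.get(key, 0) + 1

# Pass 2: keys seen more than once are the duplicates.
duplicates = {k for k, n in counts.items() if n > 1}

# Pass 3: walk the original items and act on the duplicated ones.
for item in items:
    key = item.split(":")[0]
    if key in duplicates:
        pass  # decide what to do with the duplicate here
```

With the sample data above, `duplicates` comes out as `{"a", "b"}`.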
As you can imagine, I wanted to avoid (for performance reasons) iterating over 1500 items once for each of, potentially, 1500 items; each iteration involves splitting a string, performing a regex match, and making a decision.
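One detail worth noting for that concern: the cost of the final pass depends on how the duplicate collection is stored. In Python terms (a sketch, with hypothetical data), membership tests against a list scan it linearly, while a set gives average O(1) lookups, so the overall work stays roughly linear in the number of items rather than quadratic:

```python
# Duplicates stored two ways: a list scans linearly on each "in"
# check; a set hashes, so each check is O(1) on average.
dup_list = ["a", "b"]
dup_set = set(dup_list)

# With ~1500 items and ~1500 candidate duplicates, list membership
# would mean ~1500 * 1500 comparisons; a set keeps it ~1500 lookups.
found = [k for k in ["a", "c", "b"] if k in dup_set]
```

Here `found` is `["a", "b"]`; only the duplicated keys survive the filter.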
And the 1500 items may actually grow :-)