
Welcome to IWETHEY!

New if (grep {$_ eq $thing} @stuff) {...}
For some definition of quickest, this is it. One line and you have the answer.

Algorithmically this isn't very efficient, though: you walk the whole array, which is O(n) per test. If you're going to repeat this test many times against the same array, you can build a hash once and test hash membership instead. My rule of thumb, based on a very old benchmark, is that the hash is worthwhile for 7 or more membership tests. That's probably wrong in detail, but the principle is right: building the hash is O(n) (with a worse constant), but after that each membership test is O(1) on average.
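A minimal sketch of the two approaches, for comparison. The array contents and the variable names (@stuff, %is_member, @hits) are made up for illustration:

```perl
use strict;
use warnings;

my @stuff = qw(apple banana cherry);

# One-off test: grep walks the whole array, O(n) per test.
# Assigning to a scalar gives the number of matches.
my $found_once = grep { $_ eq 'banana' } @stuff;

# Repeated tests: build the lookup hash once, which is O(n)...
my %is_member = map { $_ => 1 } @stuff;

# ...then each membership test is a single hash lookup.
my @hits = grep { $is_member{$_} } qw(banana durian cherry);
print "@hits\n";   # prints "banana cherry"
```

Whether the hash pays for itself depends on how many tests you run against the same array; for a single test the grep is hard to beat for brevity.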

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New Thanks
I actually went with the sub given in the Perl FAQ :-)

But thanks anyway.


Peter
[link|http://www.ubuntulinux.org|Ubuntu Linux]
[link|http://www.kuro5hin.org|There is no K5 Cabal]
[link|http://guildenstern.dyndns.org|Home]
Use P2P for legitimate purposes!
New grep is cool...
My favorite is still
$seen = ();
@uniq = grep { !$seen{$_}++ } @stuff;


This is somewhat nifty in Python 2.4, if you don't care about the order of the unique list:
>>> stuff = (1,2,3,4,3,4,5,6,45,6,7,8,9,5,6)
>>> print set(stuff)
set([1, 2, 3, 4, 5, 6, 7, 8, 9, 45])
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New Sure beats the old way
>>> stuff = (1,2,3,4,3,4,5,6,45,6,7,8,9,5,6)
>>> dict.fromkeys(stuff).keys()
[1, 2, 3, 4, 5, 6, 7, 8, 9, 45]

Sets are available in 2.3, by the way, just as a library module (sets) instead of a builtin. But you knew that. But not everyone does. ;)
New my %seen; # Not $seen = ();
I assume that you want to be sure that %seen is an empty hash, not that the scalar $seen is set to be undef.

Using strict catches this common typo.
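A small sketch of how strict reports the typo. The string eval is only there so the compile error can be captured and printed instead of killing the script; the names $compiled and $error are illustrative:

```perl
use strict;
use warnings;

# Compile the typo'd snippet under strict inside a string eval so the
# failure can be inspected rather than aborting the whole script.
my $compiled = eval q{
    use strict;
    $seen = ();                                   # typo: undeclared scalar $seen
    my @uniq = grep { !$seen{$_}++ } (1, 2, 2);   # uses undeclared hash %seen
    1;
};
my $error = $@;

print "strict rejected it:\n$error" unless $compiled;
```

In a normal script you would simply see the "Global symbol ... requires explicit package name" errors at compile time and fix the declaration.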

Cheers,
Ben
New *shrug* I got that from the Perl cookbook.
And it does work, incidentally.
Regards,

-scott anderson
New Yes it does work. Once.
Hashes start off empty. But if you put your code into a subroutine and call it twice, the global %seen will still be populated from the first call the second time around.
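To make the difference concrete, here is a sketch with two hypothetical subs, uniq_leaky (package-level hash, as in the typo'd version) and uniq (lexical hash declared with my inside the sub):

```perl
use strict;
use warnings;

our %seen_global;   # package-level hash: survives between calls

sub uniq_leaky {
    return grep { !$seen_global{$_}++ } @_;
}

sub uniq {
    my %seen;       # fresh, empty hash on every call
    return grep { !$seen{$_}++ } @_;
}

my @first  = uniq_leaky(1, 2, 2, 3);   # (1, 2, 3)
my @second = uniq_leaky(1, 2, 2, 3);   # empty: everything was already seen
my @again  = uniq(1, 2, 2, 3);         # (1, 2, 3), no matter how often called

print "first: @first | second: @second | again: @again\n";
```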

Incidentally I'm willing to bet, sight unseen, that the actual code in the cookbook got this right. :-P

Cheers,
Ben
New Nope, you're right.
I misread % as $; the cookbook has it correct.

One more reason not to use Perl: it sucks for people with bad eyes. ;-)
Regards,

-scott anderson
New Why I Needed Something Other Than "grep"
I have a list of 1500 things.

They're arrays. The reason they're not in a hash is that there are duplicate keys. I need to locate these duplicates and act on them in some way.

My first loop through the file creates a hash keyed on the potentially duplicate value, incrementing that key's count each time the value occurs.

When I'm done, I then iterate over the hash and use the keys with values greater than one to create a list of duplicate items.

I then iterate through the original file and if the current item is in the list of duplicates, I execute some logic to decide what to do with it.

As you can imagine, I wanted to avoid (for performance reasons) scanning the list of duplicates, which could potentially hold all 1500 items, once for each of the 1500 items. Each iteration of the outer loop already involves splitting a string, performing a regex match, and making a decision on it.
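The two-pass approach described above might look roughly like this. The comma-separated record format and the names @records, %count, %is_dup and @to_process are assumptions for illustration; the real data comes from a file and has its own split and regex logic:

```perl
use strict;
use warnings;

# Hypothetical records: "key,payload". The real format is not shown in the post.
my @records = ('a,1', 'b,2', 'a,3', 'c,4', 'b,5');

# Pass 1: count how often each key occurs.
my %count;
for my $record (@records) {
    my ($key) = split /,/, $record;
    $count{$key}++;
}

# Keys seen more than once are the duplicates. Keeping them in a hash means
# the second pass tests membership with a lookup instead of a list scan.
my %is_dup = map { $_ => 1 } grep { $count{$_} > 1 } keys %count;

# Pass 2: act only on records whose key is duplicated.
my @to_process;
for my $record (@records) {
    my ($key) = split /,/, $record;
    push @to_process, $record if $is_dup{$key};
}
print "$_\n" for @to_process;
```

With the duplicates held in a hash, the whole job stays at two linear passes even if the 1500 items grow.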

And the 1500 items may actually grow :-)


Peter
New You may only need one loop...
As you're walking the list the first time to create the hash, when you come to a key for the second time, you know that it is a duplicate. This may be good enough. (But it won't be if you also need to handle the first occurrence of a duplicated item specially.)
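A sketch of the single-loop idea, using made-up "key,payload" records (the record format and the names %seen and @repeats are assumptions):

```perl
use strict;
use warnings;

# Hypothetical records: "key,payload".
my @records = ('a,1', 'b,2', 'a,3', 'c,4', 'b,5');

my (%seen, @repeats);
for my $record (@records) {
    my ($key) = split /,/, $record;
    # Post-increment returns the old count, so this is true from the
    # second sighting of a key onwards.
    push @repeats, $record if $seen{$key}++;
}
print "@repeats\n";   # prints "a,3 b,5"
```

Note that the first record for each duplicated key ('a,1' and 'b,2' here) is never flagged, which is exactly the limitation mentioned above.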

Cheers,
Ben
New And there's the rub.
I don't know ahead of time which of the two duplicates it will be; they require different processing.
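One possible way around this, not something proposed in the thread, is to group the records by key in a single pass and then look at each group as a whole, so both duplicates are in hand when the decision is made. The record format and the names %group and @report are assumptions for illustration:

```perl
use strict;
use warnings;

# Hypothetical records: "key,payload".
my @records = ('a,1', 'b,2', 'a,3', 'c,4', 'b,5');

# Single pass: collect every record under its key (a hash of arrays).
my %group;
for my $record (@records) {
    my ($key) = split /,/, $record;
    push @{ $group{$key} }, $record;
}

# Only groups with more than one member are duplicates. All members are
# available together here, so each one can get its own treatment.
my @report;
for my $key (sort keys %group) {
    my @members = @{ $group{$key} };
    next if @members < 2;
    push @report, "$key: @members";
}
print "$_\n" for @report;
```

This costs more memory than a counting hash, since every record is stored, but for a few thousand items that should rarely matter.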


Peter
New Post the code
Too many words.
Show me the code.
     Perl array question - (pwhysall) - (20)
         if (grep {$_ eq $thing} @stuff) {...} - (ben_tilly) - (11)
             Thanks - (pwhysall)
             grep is cool... - (admin) - (5)
                 Sure beats the old way - (FuManChu)
                 my %seen; # Not $seen = (); - (ben_tilly) - (3)
                     *shrug* I got that from the Perl cookbook. - (admin) - (2)
                         Yes it does work. Once. - (ben_tilly) - (1)
                             Nope, you're right. - (admin)
             Why I Needed Something Other Than "grep" - (pwhysall) - (3)
                 You may only need one loop... - (ben_tilly) - (1)
                     And there's the rub. - (pwhysall)
                 Post the code - (broomberg)
         Re: Perl array question - (admin) - (7)
             woooo.... - (folkert)
             "other"? - There's another? -NT - (broomberg) - (5)
                 Yes. Ruby. -NT - (ben_tilly) - (4)
                     No, that's the *other* other one. -NT - (admin) - (3)
                         You like TCL?? -NT - (ben_tilly) - (2)
                             Not without a ring, Mister. -NT - (admin) - (1)
                                 You prefer to take DOS to bat? -NT - (ben_tilly)
