Welcome to IWETHEY!

Keys...
The keys of a hash are always strings. If you use a reference to a complex data structure as a hash key, the hash won't store the data structure, just a string identifying it. Depending on how you use the hash this may or may not matter to you. Usually not. (It would matter if you're using the keys of the hash to remove duplicates: you'll remove the duplicates, but you'll also stringify things.)
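
To make the stringification concrete, here is a minimal sketch (the variable names are made up for illustration). It puts an array reference in as a key and looks at what comes back out:

```perl
use strict;
use warnings;

my $point = [1, 2];            # an array reference (hypothetical sample data)
my %seen  = ($point => 1);     # the reference is stringified on the way in

my ($key) = keys %seen;

print ref($point) ? "original is a reference\n" : "original is a plain string\n";
print ref($key)   ? "key is a reference\n"      : "key is a plain string\n";
print "key looks like: $key\n";   # something of the form ARRAY(0x...)
```

The key that comes back is only a label for the array; you cannot dereference it to reach the data again.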


You can also use the Tie module to store other things as the key.
Tie is NOT a module (and it sucks)
It is built into the language, and lets an object with the right methods masquerade as a native data structure. To get an object with the right methods you would write a module, but tie itself is not implemented as a module. People get all worked up about this, but it is really a simple concept. The one that you're thinking of would be written like this (untested):
  package Tie::Original::Keys;
  use strict;

  sub TIEHASH {
    my $class = shift;
    # bless into the requested class so that subclasses work
    my $self = bless {
      hash => {},
      original_key => {},
    }, $class;
    # any extra arguments to tie are treated as initial key/value pairs
    while (@_) {
      $self->STORE(splice @_, 0, 2);
    }
    return $self;
  }

  sub FETCH {
    my $self = shift;
    my $key = shift;
    return $self->{hash}{$key};
  }

  sub STORE {
    my $self = shift;
    my $key = shift;
    my $value = shift;
    $self->{hash}{$key} = $value;
    # remember the unstringified key under its stringified form
    $self->{original_key}{$key} = $key;
  }

  sub DELETE {
    my $self = shift;
    my $key = shift;
    delete $self->{original_key}{$key};
    # like the built-in delete, return the value that was removed
    return delete $self->{hash}{$key};
  }

  sub CLEAR {
    my $self = shift;
    $self->{hash} = {};
    $self->{original_key} = {};
  }

  sub EXISTS {
    my $self = shift;
    my $key = shift;
    return exists $self->{hash}{$key};
  }

  sub FIRSTKEY {
    my $self = shift;
    # reset each() iterator
    keys %{$self->{original_key}};
    return $self->NEXTKEY;
  }

  sub NEXTKEY {
    my $self = shift;
    # each() yields (stringified key, original key); hand back the
    # original so that keys %tied is not stringified
    my (undef, $original) = each %{$self->{original_key}};
    return $original;
  }

  sub SCALAR {
    my $self = shift;
    return scalar %{$self->{hash}};
  }

  1;

After you've done all of that you can write things like:
  tie my %is_seen, 'Tie::Original::Keys';
  $is_seen{$_} = 1 for @non_unique;
  my @unique = keys %is_seen;

and the stringification issue that I named is gone.

That's all fine and dandy. But there are a host of problems with it.

  1. Using tie has a huge performance overhead. Here is a much faster solution to the above problem:
      my %original_key;
      $original_key{$_} = $_ for @non_unique;
      my @unique = values %original_key;

    So you see that the technical note about keys is just that, a technical note. Sometimes you need to know it, but if it matters to you, it is easily worked around.
  2. Using tie is horribly confusing to people. That is because the language sets up strong expectations that native data structures really are native data structures, and tie violates those expectations. Without those expectations it would be a very simple idea; with them there is considerable surprise.
  3. Historically tie has been a bit buggy. A fair fraction of the bugs that I know of in Perl involve tie in one way or another. That is because of the implementation: every internal function in the API checks whether a data structure has "magic" associated with it, and if so does something based on that magic. So the implementation is scattered throughout procedural code. And there are bugs. (It should work pretty well now though.)
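
For anyone who wants to try the plain-hash workaround from item 1 end to end, here is a small sketch. The sample data ($alice, $bob) is made up for illustration; the three lines in the middle are the same idea as in item 1.

```perl
use strict;
use warnings;

# Hypothetical sample data: two distinct hash references, one of them repeated.
my $alice = { name => 'alice' };
my $bob   = { name => 'bob' };
my @non_unique = ($alice, $bob, $alice);

# The keys get stringified, but the values keep the original references.
my %original_key;
$original_key{$_} = $_ for @non_unique;
my @unique = values %original_key;

printf "%d unique items, all still references: %s\n",
  scalar(@unique),
  (grep { !ref($_) } @unique) ? 'no' : 'yes';
```

This should print "2 unique items, all still references: yes". The order of @unique is whatever order the hash happens to return, so sort it yourself if order matters.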

Now one personal note. Many people seem to think that tie is somehow "very cool" and is a sign of really interesting stuff being exposed from within Perl. It isn't.

Tie is a band-aid for a self-inflicted wound.

Perl goes through a lot of work to make a specific set of data structures available to you. It is a well-chosen set; it is surprisingly hard to find an algorithm in which the naive implementation in Perl using those data structures (particularly hashes where they make sense) is not the same as the sophisticated algorithm. (Different constants though.) However, sometimes they are not what you need. When they are not, you have to rewrite a lot of code to get custom data structures (aka objects) that do exactly what you want. Or else you can use tie and avoid a lot of the rewriting.

For an example of how to solve the same problem by not creating it in the first place, see Ruby. Everything is an object. The objects for native data types are accessed in the same way that your user objects are. If you want something that is the same as a native data type only slightly different, you can just write your own object for it. If you want something that is the same as a native data type but has extra capabilities, that is also easy: just add a new method. (In Perl you wind up having to write crap like tied(%foo)->some_method_call(); Ugh.)
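
Here is roughly what that looks like on the Perl side, as a minimal sketch. Counting::Hash and store_count are hypothetical names invented for this example, and only the tie methods needed for the demonstration are implemented.

```perl
use strict;
use warnings;

package Counting::Hash;    # hypothetical class, not a real CPAN module

sub TIEHASH {
  my $class = shift;
  return bless { data => {}, stores => 0 }, $class;
}

sub STORE {
  my ($self, $key, $value) = @_;
  $self->{stores}++;                 # the "extra capability": count writes
  $self->{data}{$key} = $value;
}

sub FETCH {
  my ($self, $key) = @_;
  return $self->{data}{$key};
}

# An extra method that has no equivalent in the native hash interface.
sub store_count { return $_[0]{stores} }

package main;

tie my %foo, 'Counting::Hash';
$foo{a} = 1;
$foo{b} = 2;
$foo{a} = 3;

# %foo looks like a hash, so the extra method is only reachable
# by digging out the object behind it with tied().
print tied(%foo)->store_count, "\n";   # three stores were made
```

The hash syntax covers the native operations, but anything beyond them has to go through tied(), which is exactly the awkwardness being complained about.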

Now you can object that a native data type has a lot of behaviour. Implementing all of that is hard. Sure you could, but would you in practice?

Well, it turns out that you only have to do the heavy lifting once. Ruby supports mixins, which make it possible to support a complex API on top of a simple one. So you can exactly parallel what Perl does. You implement several base methods, mix in the right module, and now you support the full API of a native datatype. Only it doesn't look like magic when you do it, because the idea fits into the language as a whole rather than being a hacked-on piece of "magic".

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
