
New Comparing Big Lists in Perl
I have two lists of things.

@old and @new.

Each list item is itself a list. Let's say, for argument's sake, an item looks like this (comma separated for human readability):
M1/0001A, 30/2/001/002, "somestring"
geog      elec          yadda

The difference between @old and @new is that geog hasn't changed (and is unique), but I need to examine the last two elements of elec (in this case, /001/002). I don't care about somestring.

At the moment I'm trying to store these things in a hash keyed on geog. As usual, I think I'm making this somewhat more complex than I need to.

Would a better solution be to just split each line and push it onto a list, then chop up elec on demand?

Also, is there any non-hash-based method for doing this that isn't going to suck?
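One non-hash option is to sort both lists by geog and walk them in step, much like a merge. The following is only a minimal sketch under assumptions: the helper names `tail` and `diff_sorted` are made up for illustration, each item is taken to be an array ref of [geog, elec, string], and every elec is assumed to have at least two /-separated pieces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the last two '/'-separated pieces of an elec address.
sub tail {
    my ($elec) = @_;
    my @parts = split m{/}, $elec;
    return join '/', @parts[ -2, -1 ];
}

# Sort both lists by geog, then walk them in step. Returns
# [geog, old_tail, new_tail] for each geog in both lists whose tails differ.
sub diff_sorted {
    my ( $old, $new ) = @_;
    my @old_sorted = sort { $a->[0] cmp $b->[0] } @$old;
    my @new_sorted = sort { $a->[0] cmp $b->[0] } @$new;
    my ( $i, $j ) = ( 0, 0 );
    my @diffs;
    while ( $i < @old_sorted && $j < @new_sorted ) {
        my $cmp = $old_sorted[$i][0] cmp $new_sorted[$j][0];
        if    ( $cmp < 0 ) { $i++ }    # geog only in the old list
        elsif ( $cmp > 0 ) { $j++ }    # geog only in the new list
        else {
            my $t_old = tail( $old_sorted[$i][1] );
            my $t_new = tail( $new_sorted[$j][1] );
            push @diffs, [ $old_sorted[$i][0], $t_old, $t_new ]
                if $t_old ne $t_new;
            $i++;
            $j++;
        }
    }
    return @diffs;
}

my @old = (
    [ 'M1/0001A', '30/2/001/002', 'somestring' ],
    [ 'M2/0001A', '30/2/001/222', 'somestring' ],
);
my @new = (
    [ 'M2/0001A', '30/2/001/222', 'somestring' ],
    [ 'M1/0001A', '30/2/011/002', 'somestring' ],
);

printf "%s: %s ne %s\n", @$_ for diff_sorted( \@old, \@new );
```

The sort costs O(n log n) per list, so for big lists a hash lookup is still likely to be simpler and faster; the merge walk mainly helps when the data already arrives sorted.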


Peter
[link|http://www.ubuntulinux.org|Ubuntu Linux]
[link|http://www.kuro5hin.org|There is no K5 Cabal]
[link|http://guildenstern.dyndns.org|Home]
Use P2P for legitimate purposes!
New I'd probably do something like this
Note: I would not store the whole record if I were tight on memory.

#!/usr/bin/perl -w

use strict;
use Data::Dumper;


my @list_a = (
    [ 'M1/0001A', '30/2/001/002', 'somestring_a_0' ],
    [ 'M2/0001A', '30/2/001/222', 'somestring_a_1' ],
);

my @list_b = (
    [ 'M1/0001A', '30/2/011/002', 'somestring_b_0' ],
    [ 'M2/0001A', '30/2/001/222', 'somestring_b_1' ],
);


compare_lists(\@list_a, \@list_b);


sub compare_lists{
    my ($list_ref_a, $list_ref_b) = @_;

    my @fields = qw/geog elec yadda/;

    my (%h1);

    foreach my $ref (@{$list_ref_a}){
        my %rec;
        @rec{@fields} = @{$ref};
        ($rec{tail}) = $rec{elec} =~ m{([^/]+/[^/]+)$}; # match the last 2 pieces
        $h1{$rec{geog}} = \%rec;
    }

    print Dumper (\%h1);

    foreach my $ref (@{$list_ref_b}){
        my %rec;
        @rec{@fields} = @{$ref};
        ($rec{tail}) = $rec{elec} =~ m{([^/]+/[^/]+)$}; # match the last 2 pieces

        if (defined($h1{$rec{geog}})){
            if ($h1{$rec{geog}}->{tail} ne $rec{tail}){
                print " Different: $h1{$rec{geog}}->{tail} ne $rec{tail}\n";
            }
        }
    }

}


This produces:

$VAR1 = {
          'M2/0001A' => {
                          'yadda' => 'somestring_a_1',
                          'tail' => '001/222',
                          'elec' => '30/2/001/222',
                          'geog' => 'M2/0001A'
                        },
          'M1/0001A' => {
                          'yadda' => 'somestring_a_0',
                          'tail' => '001/002',
                          'elec' => '30/2/001/002',
                          'geog' => 'M1/0001A'
                        }
        };
 Different: 001/002 ne 011/002
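On the memory note at the top of this post: if only the tail matters, the hash can hold just that one string per geog instead of the whole record. A rough sketch, assuming the same record layout as above; `tails_by_geog` and `changed_geogs` are hypothetical names, not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map geog => last two pieces of elec, discarding the rest of each record.
sub tails_by_geog {
    my ($list_ref) = @_;
    my %tail;
    for my $rec (@$list_ref) {
        my ( $geog, $elec ) = @$rec;
        ( $tail{$geog} ) = $elec =~ m{([^/]+/[^/]+)$};
    }
    return \%tail;
}

# Return the geog keys whose tails differ between the two lists.
sub changed_geogs {
    my ( $old_ref, $new_ref ) = @_;
    my $old_tail = tails_by_geog($old_ref);
    my @changed;
    for my $rec (@$new_ref) {
        my ( $geog, $elec ) = @$rec;
        my ($tail) = $elec =~ m{([^/]+/[^/]+)$};
        # Skip records with no usable tail or no counterpart in the old list.
        next unless defined $tail && defined $old_tail->{$geog};
        push @changed, $geog if $old_tail->{$geog} ne $tail;
    }
    return @changed;
}

my @list_a = (
    [ 'M1/0001A', '30/2/001/002', 'somestring_a_0' ],
    [ 'M2/0001A', '30/2/001/222', 'somestring_a_1' ],
);
my @list_b = (
    [ 'M1/0001A', '30/2/011/002', 'somestring_b_0' ],
    [ 'M2/0001A', '30/2/001/222', 'somestring_b_1' ],
);

print "Different: $_\n" for changed_geogs( \@list_a, \@list_b );
```

Only the first list is held in the hash; the second is compared record by record, so it could just as well be streamed from a file.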
New Thanks
Yes. Better. Much :-)

This produces a Z feature request: A syntax colourising weecode.


Peter
New ROFL
Uh huh... :-D
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New Why rofl?
It'd rock.


Peter
New That's not the issue.
Of course it would rock.

Which syntax? And who writes the parser to figure out what needs highlighting?

The ROI isn't high on that one.
Regards,

-scott anderson
New All of them. You.
There. Not so hard, was it?


Peter
New So, you want EMACS in *Z*?
--
[link|mailto:greg@gregfolkert.net|greg],
[link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwethey

[link|http://it.slashdot.org/comments.pl?sid=134485&cid=11233230|"Microsoft Security" is an even better oxymoron than "Military Intelligence"]
No matter how much Microsoft supporters whine about how Linux and other operating systems have just as many bugs as their operating systems do, the bottom line is that the serious, gut-wrenching problems happen on Windows, not on Linux, not on Mac OS. -- [link|http://www.eweek.com/article2/0,1759,1622086,00.asp|source]
New *I* can't think of a single *valid* objection.


Peter
New No, gvim
But gvim does get confused occasionally.
Page-Up / Page-Down usually fixes it.
New You're being too lazy
Let me outline how to do it in such a way that you'll find most of your work already done for you.

Install gvim. Write a small utility to save the current file to a filename with an extension indicating the current filetype (.html, .pl, .c, etc.). Then run the following shell command:
gvim -f +"syn on" +"run\! syntax/2html.vim" +"wq" +"q" $filename

Now read back $filename.html. Post-process that slightly if you want.

(You may not need to escape ! depending on how you execute this - in bash I need it.)

This will work somewhat better if gvim has access to X. (According to the documentation it does a better job of picking colors, whatever that means.)

Voila!
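As a rough illustration of the "small utility" step, here is a sketch in Perl. It assumes gvim is on the PATH and behaves as described above (writing its result to "$filename.html"); the sub names `highlight_command` and `highlight` and the scratch file name are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build the argument list for the gvim call quoted above. Passing a list to
# system() bypasses the shell, so the "!" needs no escaping here.
sub highlight_command {
    my ($filename) = @_;
    return ( 'gvim', '-f', '+syn on', '+run! syntax/2html.vim',
             '+wq', '+q', $filename );
}

# Save $code under a name ending in $ext, run gvim on it, and return the
# HTML that gvim is expected to leave in "$filename.html".
sub highlight {
    my ( $code, $ext ) = @_;
    my $dir      = tempdir( CLEANUP => 1 );
    my $filename = "$dir/post.$ext";    # hypothetical scratch file name

    open my $out, '>', $filename or die "Cannot write $filename: $!\n";
    print {$out} $code;
    close $out or die "Cannot close $filename: $!\n";

    system( highlight_command($filename) ) == 0
        or die "gvim exited with status $?\n";

    open my $in, '<', "$filename.html"
        or die "Cannot read $filename.html: $!\n";
    my $html = do { local $/; <$in> };
    close $in;
    return $html;
}

# Example call (needs gvim installed, so it is left commented out):
# print highlight( "print qq{hello\\n};\n", 'pl' );
```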

Cheers,
Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New Au contraire
I'm more interested in the system being self-contained. With such a constraint, implementing this suggestion entails quite a bit more work than "shell out to gvim"...
Regards,

-scott anderson
New You can still leverage the effort though...
Many editors have syntax highlighting files.

If you write a parser that parses some editor's syntax highlighting files and then displays based on that, then you at least don't have to write your own syntax files - you just need the parsing and the display code.
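To give a feel for the size of the "parsing and display" part, here is a deliberately tiny sketch. It assumes Vim-style `syn keyword GROUP word ...` lines as the input format and handles nothing else (no regions, no matches, no abbreviated keywords), so it will happily colour keywords inside strings and comments. The sub names and the group names `demoStatement` and `demoConditional` are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read "syn keyword GROUP word ..." lines and return { word => GROUP }.
# Option tokens (anything with "=", plus a few bare flags) are skipped.
sub parse_keywords {
    my ($text) = @_;
    my %group_of;
    for my $line ( split /\n/, $text ) {
        next unless $line =~ /^\s*syn(?:tax)?\s+keyword\s+(\S+)\s+(.*)$/;
        my ( $group, $rest ) = ( $1, $2 );
        for my $word ( split ' ', $rest ) {
            next if $word =~ /=/;
            next if $word =~ /^(?:contained|skipwhite|skipnl|skipempty|transparent)$/;
            $group_of{$word} = $group;
        }
    }
    return \%group_of;
}

# HTML-escape the code and wrap each known keyword in a span whose class
# is the syntax group name.
sub highlight {
    my ( $code, $group_of ) = @_;
    my $out = '';
    for my $token ( split /(\w+)/, $code ) {
        my $safe = $token;
        $safe =~ s/&/&amp;/g;
        $safe =~ s/</&lt;/g;
        $safe =~ s/>/&gt;/g;
        $out .= exists $group_of->{$token}
            ? qq{<span class="$group_of->{$token}">$safe</span>}
            : $safe;
    }
    return $out;
}

my $syntax = "syn keyword demoStatement my sub\n"
           . "syn keyword demoConditional if else contained\n";
my $keywords = parse_keywords($syntax);

print highlight( 'my $x = 1 if $y < 2;', $keywords ), "\n";
```

Real syntax files lean heavily on regions and pattern matches, which is where most of the remaining work would be.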

Cheers,
Ben
New "just"...
Regards,

-scott anderson
New Compared to what you save...
the effort that you'd have to expend is the merest trifle. :-P

Cheers,
Ben
New Compared to doing nothing and laughing at Peter...
It's a lot of work, and not nearly as satisfying...
Regards,

-scott anderson
New Don't forget the pretty print feature as well...
...to fix the indentation the way it should be. :-)
New Here's the whole (working, ugly) program
#!/usr/bin/perl -w
# checksigs.pl
# 1.0.0.0 pw 18-03-05
# Check PCO signals data has been correctly merged into RCC
# Usage: checksig <pco_tree_path> <rcc_data_dir>
use strict;
use File::Find;
use Data::Dumper;

# Check for incorrect invocation and populate parameters
my $argc = scalar @ARGV;
die "Usage: checksig <pco_tree_path> <rcc_data_dir>\n" unless $argc == 2;
my $pco_dir = $ARGV[0];
my $rcc_dir = $ARGV[1];
my @pco_sigs;     # array to contain PCO signals data
my @rcc_sigs;     # array to contain RCC signals data
my $sig_array;    # Reference to point to required destination array
my $debug = 1;    # debug flag: set to 1 to enable debug output
my $sig_count;    # count of processed signals. Info only.

sub debug
{
    if ($debug)
    {
        my $msg = shift;
        print "DEBUG: $msg\n";
    }
}

# utility functions
sub is_lit_device
{

    # this function will determine if a device is one of those
    # transferred to the LIT
    return 1;
}

sub process_file
{
    my $file = $_;
    my $dir  = $File::Find::dir;
    if ( $file =~ /^DEVICE.SIG/i )
    {
        open DEVICE, $file || die "Cannot open $file: $!\n";

        # read the signals into the pco_sig hash
        debug("processing $file in $dir");
        while (<DEVICE>)
        {
            next if (/^!/);    # skip comments
            my @line = split(/,/);    # split line at commas

            #            my $geo_addr = $line[0];      # get the geog addr
            #            my @ele_addr =
            #              split( /\//, $line[1] );    # split elec addr at /
            #            my $tpr = $ele_addr[2];       # get the TPR
            #            my $lnk = $ele_addr[3];       # get the elec addr on the link
            # add device to hash, omitting LIT devices
            push @$sig_array, \@line;
            $sig_count++;
        }
        close DEVICE;
    }
}

sub compare_lists
{
    my ( $list_ref_a, $list_ref_b ) = @_;
    my @fields = qw/geog elec params/;
    my (%h1);
    foreach my $ref ( @{$list_ref_a} )
    {
        my %rec;
        @rec{@fields} = @{$ref};
        ( $rec{tail} ) =
          $rec{elec} =~ m{([^/]+/[^/]+)$};    # match the last 2 pieces
        $h1{ $rec{geog} } = \%rec;
    }

    #    print Dumper (\%h1);
    foreach my $ref ( @{$list_ref_b} )
    {
        my %rec;
        @rec{@fields} = @{$ref};
        ( $rec{tail} ) =
          $rec{elec} =~ m{([^/]+/[^/]+)$};    # match the last 2 pieces
        if ( defined( $h1{ $rec{geog} } ) )
        {
            if ( $h1{ $rec{geog} }->{tail} ne $rec{tail} )
            {
                print
" Different: $rec{geog} : $h1{$rec{geog}}->{tail} ne $rec{tail}\n";
            }
        }
    }
}

# Traverse the $pco_dir tree, process entries with the process_file sub
$sig_array = \@pco_sigs;
find( \&process_file, $pco_dir );
debug "added $sig_count signals to PCO list";
$sig_count = 0;
$sig_array = \@rcc_sigs;
find( \&process_file, $rcc_dir );
debug "added $sig_count signals to RCC list";
my $pco_ref = \@pco_sigs;
my $rcc_ref = \@rcc_sigs;
compare_lists( $pco_ref, $rcc_ref );


Peter
Edited by pwhysall March 21, 2005, 09:19:51 AM EST
New Suggestion
open DEVICE, $file || die "Cannot open $file: $!\n";


Either:
open (DEVICE, $file) || die "Cannot open $file: $!\n";
or
open DEVICE, $file or die "Cannot open $file: $!\n";

but NEVER

open DEVICE, $file || die "Cannot open $file: $!\n";


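The "or" operator has lower precedence than "||". Without the parentheses, "||" binds to $file rather than to the open, so Perl parses the line as open(DEVICE, ($file || die ...)): the die only fires when $file itself is false, and a failed open is never noticed. Very bad habit.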
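A small demonstration of the difference, as a sketch: it uses the three-argument open with a lexical filehandle rather than the bareword DEVICE, and a made-up path that is assumed not to exist. The helper `dies` is hypothetical and only reports whether a block died.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $missing = '/nonexistent/DEVICE.SIG';    # hypothetical path, assumed absent

# Run a code block and report whether it died (1) or not (0).
sub dies {
    my ($code) = @_;
    return eval { $code->(); 1 } ? 0 : 1;
}

# "||" binds tighter than the comma, so this is
# open($fh, '<', ($missing || die ...)). $missing is a true value, so die is
# never evaluated and the failed open goes unnoticed.
my $with_pipes = dies(
    sub { open my $fh, '<', $missing || die "Cannot open $missing: $!\n" } );

# "or" binds looser than the whole open call, so die sees open's result.
my $with_or = dies(
    sub { open my $fh, '<', $missing or die "Cannot open $missing: $!\n" } );

# Parentheses around the arguments make "||" safe as well.
my $with_parens = dies(
    sub { open( my $fh, '<', $missing ) || die "Cannot open $missing: $!\n" } );

print "|| without parens died: $with_pipes\n";
print "or died:                $with_or\n";
print "|| with parens died:    $with_parens\n";
```

Only the first form lets the missing file slip through silently; the other two die as intended.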
New Thanks.
That's just the sort of thing I'm not really very aware of. Well, that and the small issue of "writing good Perl".


Peter
New My pleasure
I figure I'm working off my mail seed fee.
New sig_array global bad.
Just pass \@pco_sigs and \@rcc_sigs to the process_file function.
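Since File::Find calls the callback itself, one way to hand it the array is a closure. A minimal sketch, assuming the same DEVICE.SIG layout as the program above; `make_processor` is a hypothetical name, and the temporary directory and sample file exist only to make the example self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a File::Find callback that pushes each parsed line onto the
# array passed in, so no global destination reference is needed.
sub make_processor {
    my ($sigs) = @_;
    return sub {
        return unless /^DEVICE\.SIG/i;
        open my $fh, '<', $_
            or die "Cannot open $File::Find::name: $!\n";
        while ( my $line = <$fh> ) {
            next if $line =~ /^!/;    # skip comments
            chomp $line;
            push @$sigs, [ split /,/, $line ];
        }
        close $fh;
    };
}

# Sample data so the sketch can run on its own.
my $dir = tempdir( CLEANUP => 1 );
open my $out, '>', "$dir/DEVICE.SIG"
    or die "Cannot write sample file: $!\n";
print {$out} "! comment line\nM1/0001A,30/2/001/002,somestring\n";
close $out or die "Cannot close sample file: $!\n";

my @pco_sigs;
find( make_processor( \@pco_sigs ), $dir );
print scalar(@pco_sigs), " signal(s) read\n";
```

The RCC tree would get its own call, `find( make_processor( \@rcc_sigs ), $rcc_dir )`, and the count falls out of `scalar @rcc_sigs` without a separate counter.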
     Comparing Big Lists in Perl - (pwhysall) - (21)
         I'd probably do something like this - (broomberg) - (20)
             Thanks - (pwhysall) - (14)
                 ROFL - (admin) - (12)
                     Why rofl? - (pwhysall) - (11)
                         That's not the issue. - (admin) - (10)
                             All of them. You. - (pwhysall) - (3)
                                 So, you want EMACS in *Z*? -NT - (folkert) - (2)
                                     *I* can't think of a single *valid* objection. -NT - (pwhysall)
                                     No, gvim - (broomberg)
                             You're being too lazy - (ben_tilly) - (5)
                                 Au contraire - (admin) - (4)
                                     You can still leverage the effort though... - (ben_tilly) - (3)
                                         "just"... -NT - (admin) - (2)
                                             Compared to what you save... - (ben_tilly) - (1)
                                                 Compared to doing nothing and laughing at Peter... - (admin)
                 Don't forget the pretty print feature as well... - (ChrisR)
             Here's the whole (working, ugly) program - (pwhysall) - (4)
                 Suggestion - (broomberg) - (2)
                     Thanks. - (pwhysall) - (1)
                         My pleasure - (broomberg)
                 sig_array global bad. - (broomberg)
