Is Python TIMTOWTDI?
Not quite sure what that comment is saying based on that.
And how weird is that word? Yes, I know it's an acronym, but at this point, we "say" it.
No, it's the opposite.
Well, not entirely: usually there is more than one way to do anything, but with Python there is generally one best way and then all of the hacks.
Regards,
-scott
Welcome to Rivendell, Mr. Anderson.
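Scott's "one best way" point is codified in the Zen of Python (PEP 20: "There should be one-- and preferably only one --obvious way to do it"). A tiny illustration, reversing a list:

```python
# The idiomatic spelling and a slice "hack" that also works.
items = [1, 2, 3]

idiomatic = list(reversed(items))  # the obvious way: reversed()
hack = items[::-1]                 # extended-slice trick, same result

print(idiomatic)  # [3, 2, 1]
print(hack)       # [3, 2, 1]
```

Both work, but `reversed()` says what it means; the slice is the kind of thing Scott files under "all of the hacks".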
Good, that's what I want.
I'll try to get him posting here for ongoing advice.
Is it reasonable to assume that anything I come up with as an example Perl script running on my local Ubuntu desktop will also be implementable in a Python script, as long as we don't depend on any CPAN-based drivers?

For example, we write a program that takes a variable number of command-line arguments. The arguments are file names. The goal of the program is to parse and store the IDs of several files, then run a final file through and test whether the final file matched any of the previous files (at the ID level), and output SOME of the final file (records whose IDs matched), in a different order depending on which of the input files each record matched.

The files can be either tab, pipe, or CSV delimited. We do not really parse them out fully, though, since the ID can be pulled via a regexp from all the files. The code exercises regexps, splits, joins, multi-level hashes, refs, and refs of arrays in anonymous hashes for storing and later lookup, plus the Perl idiom of pushing an element into an array, which required the obscure casting syntax.

The program reads the files and stores ONLY the primary key of each file into a file-specific hash for a quick truth test later. We then read STDIN, parse the ID from each record, and see whether the ID was in any of the previous files. We store each record (the whole thing, which then leads to a Perl/DBM/tie discussion for when we run out of memory and need to start using disk) into an array specific to the matching previous file. We then output all the records read via STDIN that matched ANY of the named files (some didn't), in the order of the previous files they matched: all the records that matched the 1st file are output, then all the records that matched the 2nd file, and so on until done. Essentially a 15-line Perl script the way I write it, maybe 5 for BT.
Knowing no Python at all, but with my Perl background pushing implementation directions (possibly poorly), how long should it take for us to write this together in Python? Wanna give a pseudo-code outline so we do it the "right" way?
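A sketch of how the pipeline described above might look in Python. The ID regexp is a placeholder assumption (the real pattern wasn't posted); it grabs the first field, whatever the delimiter. Per-file Python sets play the role of the Perl truth-test hashes, and plain lists replace the arrays-in-anonymous-hashes:

```python
#!/usr/bin/env python
import re
import sys

# Placeholder ID pattern: the leading field of a tab-, pipe-, or
# comma-delimited record. Swap in the real regexp here.
ID_RE = re.compile(r'^([^\t|,]+)')

def parse_id(line):
    m = ID_RE.match(line)
    return m.group(1) if m else None

def main(filenames, stream):
    # One set of IDs per named file, kept in command-line order.
    ids_by_file = []
    for name in filenames:
        with open(name) as fh:
            ids_by_file.append({parse_id(line) for line in fh})

    # Bucket each STDIN record under the first file whose IDs it matches.
    buckets = [[] for _ in filenames]
    for line in stream:
        rec_id = parse_id(line)
        for i, ids in enumerate(ids_by_file):
            if rec_id in ids:
                buckets[i].append(line)
                break  # drop this break if a record may match several files

    # Emit matches grouped by the file they matched, in file order.
    for bucket in buckets:
        sys.stdout.writelines(bucket)

if __name__ == '__main__' and sys.argv[1:]:  # skip when imported or run bare
    main(sys.argv[1:], sys.stdin)
```

No refs or casting syntax needed on this side: `buckets[i].append(line)` is the whole "push into an array inside a structure" idiom. When memory runs out, the `shelve` module is the rough analogue of the Perl DBM/tie trick.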
Data examples?
The problem with giving you Python pseudocode is that Python is pseudocode that runs... ;-)
Post the Perl REs if you want: Python can use them.

Regards,
-scott
Welcome to Rivendell, Mr. Anderson.
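On that point: Python's `re` module accepts most Perl-style patterns as-is; only the surrounding syntax changes (`re.match`/`re.search` instead of `=~`, and raw strings to keep the backslashes intact). A sketch with an illustrative pattern, not one from the thread:

```python
import re

line = '12345|Anderson|Thomas'

# Perl:   $line =~ /^([^|]+)\|/ and $id = $1;
# Python: the same pattern, spelled as a raw string.
m = re.match(r'^([^|]+)\|', line)
the_id = m.group(1) if m else None
print(the_id)  # 12345
```

Note that `re.match` anchors at the start of the string; use `re.search` for Perl's default scan-anywhere behavior.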
Here's the code
Note: lots of silly assignments to tmp vars. This is so we can dump them using Data::Dumper. Also, the backslashes got lost on the post.
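For what it's worth, the temporary variables shouldn't be needed on the Python side: the rough equivalent of Data::Dumper is the standard-library `pprint` module, which dumps nested structures directly (the data below is made up for illustration):

```python
from pprint import pformat, pprint

# A nested structure comparable to a Perl hash of arrays of anon hashes.
data = {
    'file1': [{'id': 'a1', 'rec': 'a1|x|y'}],
    'file2': [{'id': 'b2', 'rec': 'b2\tq\tr'}],
}

pprint(data)          # pretty-print straight to stdout, no tmp vars
dump = pformat(data)  # or capture the same dump as a string
```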
Re: Here's the code
#!/usr/bin/python

Also, if you don't want to add a number to the command line and don't like the stdin trick I used, just use:

    def read_files():

Regards,
-scott
Welcome to Rivendell, Mr. Anderson.
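The listing above appears to have been mangled in posting, so here is a guess at what a `read_files()` along those lines might look like: take every command-line argument before the last as an ID file and the last as the final file, so neither a count argument nor a stdin trick is needed. This is a reconstruction, not Scott's original:

```python
import sys

# Hypothetical reconstruction -- the original body was lost in posting.
# Treat the last command-line argument as the final file and everything
# before it as the ID files.
def read_files():
    *id_files, final_file = sys.argv[1:]
    return id_files, final_file
```

Called as `prog ids1.txt ids2.txt final.txt`, this returns `(['ids1.txt', 'ids2.txt'], 'final.txt')`.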
Thanks
How do *you* pronounce it? Me: tim-TOW-te-dee
--
Drew
Re: How do *you* pronounce it? Me: tim-TOW-te-dee
TIM-TOW-DEE