
Welcome to IWETHEY!

New Twitter?
http://www.radicalbe...loper-alex-payne/

How has Ruby on Rails been holding up to the increased load?

By various metrics Twitter is the biggest Rails site on the net right now. Running on Rails has forced us to deal with scaling issues - issues that any growing site eventually contends with - far sooner than I think we would on another framework.

The common wisdom in the Rails community at this time is that scaling Rails is a matter of cost: just throw more CPUs at it. The problem is that more instances of Rails (running as part of a Mongrel cluster, in our case) means more requests to your database. At this point in time there’s no facility in Rails to talk to more than one database at a time. The solutions to this are caching the hell out of everything and setting up multiple read-only slave databases, neither of which are quick fixes to implement. So it’s not just cost, it’s time, and time is that much more precious when people can['t] reach your site.

None of these scaling approaches are as fun and easy as developing for Rails. All the convenience methods and syntactical sugar that makes Rails such a pleasure for coders ends up being absolutely punishing, performance-wise. Once you hit a certain threshold of traffic, either you need to strip out all the costly neat stuff that Rails does for you (RJS, ActiveRecord, ActiveSupport, etc.) or move the slow parts of your application out of Rails, or both.

It’s also worth mentioning that there shouldn’t be doubt in anybody’s mind at this point that Ruby itself is slow. It’s great that people are hard at work on faster implementations of the language, but right now, it’s tough. If you’re looking to deploy a big web application and you’re language-agnostic, realize that the same operation in Ruby will take less time in Python. All of us working on Twitter are big Ruby fans, but I think it’s worth being frank that this isn’t one of those relativistic language issues. Ruby is slow.


That's from 2007, so presumably they've found ways around those issues.

http://highscalabili...00-percent-faster has some tidbits on ways they sped it up. That's from 2009.

Supposedly they've moved to Java now - http://www.readwrite...ruby-for-java.php

HTH a little. Good luck!

Cheers,
Scott.
New Ow.
Twitter really gave up on Ruby on Rails and went for Java? I know PHP and several other platforms can easily scale to the same heights as Java, so that's terrible for Ruby on Rails.

Still, they've mentioned all the same things I've learnt about scaling. If RoR lacks multiple database handlers, then that's a REALLY BIG lack. It's probably the first thing I do when adding scalability to a PHP codebase.
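
For anyone wondering what "multiple database handlers" buys you, here is a rough sketch of read/write splitting, in Python rather than PHP. The class name, the handle names and the "anything that isn't a SELECT is a write" rule are all made up for illustration; real handles would be connection objects, not strings.

```python
import itertools


class ConnectionRouter(object):
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._replicas = itertools.cycle(replicas or [primary])

    def for_query(self, sql):
        # Crude classification (an assumption of this sketch):
        # anything that does not start with SELECT is treated as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary


# Hypothetical handles; strings stand in for real connections.
router = ConnectionRouter("primary-db", ["replica-1", "replica-2"])
```

Reads rotate round-robin over the replicas while every write lands on the primary, which is the "read-only slave databases" setup the interview describes.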

Wade.
Static Scribblings http://staticsan.blogspot.com/
New PHP vs Java scaling
The only way PHP and the other scripting languages scale like Java is by going to a shared-nothing sharded approach, vs. a typical clustered Java approach. However, none of them can touch Java when it is used in the same fashion.

Architecture can fix a lot of ills, but eventually at the high end (like Twitter, Facebook, etc) you'll reach a point where moving to a faster language across your stack costs less than just buying more machines.
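
As a rough illustration of the shared-nothing sharded approach, here is a minimal Python sketch that pins each user to one shard by hashing the user id. The shard names and the function name are hypothetical, and plain modulo hashing is only the simplest possible scheme.

```python
import zlib

# Hypothetical shard names; in practice these would be connection settings.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(user_id):
    """Map a user id to a shard with a stable hash."""
    # crc32 gives the same answer in every process and on every machine,
    # unlike the built-in hash(), which can vary between runs.
    key = str(user_id).encode("utf-8")
    return SHARDS[(zlib.crc32(key) & 0xffffffff) % len(SHARDS)]
```

Every request for a given user goes to the same shard, so no shard needs to know about the others. The catch with modulo hashing is that changing the number of shards remaps most users.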
Regards,
-scott
Welcome to Rivendell, Mr. Anderson.
New +5, Informative.
Static Scribblings http://staticsan.blogspot.com/
New I believe they went to Scala on the JVM
Why Use Scala?
Why use Scala when you have Ruby and Ruby on Rails? Well, we still use Rails. It works great for front-end stuff. The productivity is worth the tradeoff for working in a slower-performing dynamic language. When you think about what a web framework is doing under the hood, it’s tons and tons of string concatenation. Ruby on Rails can handle that.

What we had a need for as Twitter grew was for long-running heavy processes, message-queuing, caching layers for doing 20,000 operations a second. Ruby garbage-collection is tough, Ruby doesn’t do really well with long-running processes.


http://blog.redfin.c...r_uses_scala.html

Do some searches on Scala and Twitter. I agree with Scott, the servers must do a shared-nothing approach and they probably won't use the back-end of Rails ORM as much (ActiveRecord).

Have fun.
Edited by S1mon_Jester Aug. 31, 2011, 11:32:51 AM EDT
New different type of application, easy to shard
index on uids transactions based on financial debits so inline as opposed to one to many.
Hmmm, will be an interesting conversation
Any opinions expressed by me are mine alone, posted from my home computer, on my own time as a free American and do not reflect the opinions of any person or company that I have had professional relations with in the past 55 years. meep
New As BT lectured me a while ago
Paraphrased of course:
===================================================================================
Ruby is GREAT.
Really.
Every time I get a task, I immediately want to do it in Ruby.
And then I work through all the stuff Perl has prebuilt in CPAN, and realise I will have to reimplement it in Ruby, and I realize the project will take forever, so I go code it in Perl.
Even though I don't want to.
===================================================================================
That was about 5 years ago. Dunno how much Ruby has grown up and libraries / drivers / interfaces added.

New Saw a presentation on Ruby on Rails yesterday.
It's fairly impressive. I'd seen it before, but Rails has developed rather since and the guy presenting was better, too.

I haven't tried Rails myself, but I had a very brief tinker with Ruby years ago when it was new. It strikes me as a language that has borrowed a lot.

Wade.
Static Scribblings http://staticsan.blogspot.com/
New I've used Rails
It reminded me of Visual BASIC: very very easy to do simple things, but then you end up fighting it to do things the way you know they ought to be done.
Regards,
-scott
Welcome to Rivendell, Mr. Anderson.
New That still holds water...
BIG TIME!

Perl just has everything.
New Dunno about everything, but...
Every time I show the boy something new in capability and usage, his eyes light up and he thinks how that particular aspect will affect his life.

It's pretty big. His job (and 90% of those around him) would be dramatically changed if a real Perl coder was in the mix.

And then I say:

That's a tiny little bit of what's there. And you'd never even know it unless I showed it directly to you.

And then he realized the HUGE jump in productivity was really only the 1st step, a tiny little bit, and he has lots to learn, because he wants it ALL.

He's agreed to give it a good 2 years of effort, to continuously learn, rather than learn enough to be simply useful and then move on.

We need to discuss the concepts of idioms and make sure he understands that the language alone is not what it is about, but the way you mix and match it in proven ways.

Also, we need to learn 2 other languages in the same timeframe. Not to learn to be professionally usable, but he MUST get enough exposure to alternative implementations and solutions in other languages so he will have a hint of the variety out there and maintain a "best tool for job" mentality, rather than get fucked by "baby chick syndrome".

So, what other 2 languages should we learn together?
New I'd say Java and C.
I'd add Python to the list but you only wanted 2. :-)

Cheers,
Scott.
New Re: Dunno about everything, but...
2 is hard... I would say Python (another scripting language, different enough to get the point across: explicit, one way to do it, easy to read, forced style, non-C syntax) and Java (compiled, huge library set, likely to be seen in the wild). If you go with 3: Python, Java, and C/C++ (non-scripting, Verra Low Level, easy to shoot yourself in the foot and reload repeatedly).

Either Java or C will give him a real appreciation of what Perl gives him with respect to productivity. Python will show him a completely different approach to scripting that is just as valid.

Ruby is too much like Perl to be worth it (yet). PHP is a freaking mess and he'll need more to compare it against before getting exposure. No need to go boutique yet (Obj-C, Groovy, Go, Lisp, Smalltalk, etc).

I'd show him Java/C before Python to give him the greatest system shock.
Regards,
-scott
Welcome to Rivendell, Mr. Anderson.
New Well, what they said
but he should really also go for C... not C++. It really is like a portable assembler, and learning how that flies will help him tremendously.
New My assumption
Was that any instruction with C/C++ would start with C and build on from there to the full glory that is C++.
Regards,
-scott
Welcome to Rivendell, Mr. Anderson.
New Speaking of Python... ;-)
Are you still using Python 2.x? I'm still on 2.x as I'm still trying to come up to speed on wxPython and it hasn't been ported to Python 3 yet. Python 3.x supposedly is cleaner, but I haven't looked at it yet.

If one were starting from scratch with Python, I assume it would still be safer to stay with 2.x because of all of the libraries which haven't been ported yet.

Thoughts? Thanks.

Cheers,
Scott.
New I still use 2.x
A good summary: http://wiki.python.o.../Python2orPython3

The main point: write your 2.x code as if you were running in 3, and moving to 3 in the future will be relatively painless.
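
A small sketch of what that looks like in practice. The function names are made up; the idioms are ones that behave the same on 2.6/2.7 and on 3.x.

```python
# On 2.6/2.7, putting this as the first statement of a module makes print a
# function and '/' true division, as in 3.x:
#   from __future__ import print_function, division


def parse_port(text):
    """Parse a port number, returning None for bad input."""
    try:
        return int(text)
    except ValueError as exc:  # the 'as' form works on 2.6+ and 3.x
        # print(...) with a single argument behaves the same on both.
        print("bad port %r: %s" % (text, exc))
        return None


def halves(n):
    """Spell out the division you mean instead of relying on '/'."""
    # // is floor division everywhere; dividing by a float is true division.
    return n // 2, n / 2.0
```

Code written this way generally needs little or no change when it moves to 3.x.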
Regards,
-scott
Welcome to Rivendell, Mr. Anderson.
New Ehhh
K&R C I can handle. Hell, I can teach the course. And low level variable types are an important aspect of programming, even if you don't directly create them in scripting languages. Also, since most OS projects are coded in it (or C++), it'll give him an entry into those if he wants to go there.

But Java? Coded in it for a few months, got paid, never took another Java project again. It annoyed me then, it'll annoy me now. It took me about 3 months of reading (before coding start) before I was the slightest bit useful.

Perl is already C like (ehh, actually Algol like, I already gave him the spiel), along with most of the common languages. As is Java. So I'd want something else, something strange, something that breaks both our heads as we work through the differences. There is NO expectation to become proficient, I just want to give him something to compare against.

I was thinking python or ruby if doing a scripting language, or maybe scheme:
http://en.wikipedia....mming_language%29

Not that I like scheme (f'ing HATE it), but it is used in Gimp for picture manipulation, and he's got an art side it might tickle.

New Consider Haskell
Give him some functional programming paradigm to bite into.
New Actually, I was
I just forgot the name.

I did a tiny bit of Haskell coding a long time ago.
New Other stuff.
You could show him some SNOBOL. It's a string and pattern matching-based language that developed without all of the Unix history. It looks a little like FORTRAN but it is very much not. If he embraces it, he will get a very different appreciation for how regular expressions work that will be valuable.

I found that to learn Java you need an existing project that uses only a basic level of framework in unsurprising ways. My first Java exposure was to a Struts-based project with some considerably byzantine setup. My second Java exposure was to a Spring-based project and things made a lot more sense.

Wade.
Static Scribblings http://staticsan.blogspot.com/
New What are you trying to teach?
If you want to show him something different from Perl, but still useful, then I stand by teaching him Java. It's annoying, it has stupidities, but it also has a lot of things that C doesn't, and it's a useful contrast. If you really can't stand it, try Scala or Groovy: both can use the Java libraries (which is the best part of the language), and they've both fixed some of the cruft.

Anything other than those and you're teaching him stuff just for the sake of finding something odd that he'll never see again.

You can do functional programming in Perl, Python, or even Java (with a bit of bending) for that matter.

Python is different enough from Perl to be a very good counterexample. Show him generators, continuations, coroutines, list comprehensions, meta classes, and how you can use Perl REs. Read this and then decide whether it's worth showing it to him or not: http://www.dabeaz.co...es/Coroutines.pdf
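
A quick taste of three of those features, as a sketch only. The sample lines and all the names are invented; the regular expression uses the same syntax a Perl RE would.

```python
import re

ID_RE = re.compile(r'id=(\d+)')  # same pattern syntax as in Perl


def ids(lines):
    """Generator: lazily yield the numeric id found in each line."""
    for line in lines:
        m = ID_RE.search(line)
        if m:
            yield int(m.group(1))


def running_total():
    """Coroutine: receives numbers via send() and yields the running sum."""
    total = 0
    while True:
        value = yield total
        total += value


sample = ["id=7 alice", "no id here", "id=30 bob"]  # made-up data

# List comprehension over the generator: keep only the even ids.
even_ids = [i for i in ids(sample) if i % 2 == 0]

acc = running_total()
next(acc)  # prime the coroutine so it is waiting at the yield
totals = [acc.send(i) for i in ids(sample)]
```

The generator never builds the full list of ids in memory, and the coroutine keeps its state between send() calls without a class or a global.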

But if you really are just looking for strange, teach him LISP or Smalltalk. But *really* teach them, idiomatically. Otherwise they'll just annoy.

If you want to teach him useful differences, however, then you'll want to go through languages actually being used in the wild, solving the same problems: build systems, automated testing, all of the things that you have to do to be a real programmer and not just a clever programmer. Hence Java. Annoying as it is, you can learn a lot (good and bad) about building and testing things just by going through what you have to do with the language to make it usable (set up Jenkins with automated testing, code coverage, database creation, artifact repositories, build promotion, and deployment, and tell me you didn't learn something). There's a lot more to it than just writing a script and deploying it.
Regards,
-scott
Welcome to Rivendell, Mr. Anderson.
Edited by malraux Sept. 2, 2011, 10:28:01 PM EDT
New Way overkill
#1 - I want him to be able to be VERY productive in Perl using it for both Windows and Unix automation, and for data processing (direct mail type data) and for database access.

It will take at least a concentrated year for me to get enough Perl/SQL/Unix idiom into his head. He works full time, and this is not directly his job (yet), so this is off-hours stuff.

And he has NO programming background. None. Really. We are at ground zero here.

I just don't want him to baby chick imprint on anything. I want some alternative methods of doing things in his mind so he doesn't think anything is the one true way.

I also want him to have a better understanding of low level coding, so I want him to know enough C to be able to do some simple file processing, and a bit of in memory manipulation. We will NOT go as far as B-Trees or linked lists. This is just so he can get the experience of using a debugger and stepping through the machine code that his program produced.

Since he will be so focused on Perl, I'd like him to have exposure to some other high level language such as Python or Ruby to make sure he doesn't become a Perl bigot.

But past that, nope. I have my own life to live, and that does not include writing any Java code.
New Python and C then.
As I said, Python is very different than Perl in many ways. Ruby is like a cross of Smalltalk and Perl, with the most important similarity (and difference from Python) being TIMTOWTDI.
Regards,
-scott
Welcome to Rivendell, Mr. Anderson.
New Is python TIMTOWTDI?
Not quite sure what that comment is saying based on that.

And how weird is that word? Yes, I know it's an acronym, but at this point, we "say" it.
New No, it's the opposite.
Well, not entirely: usually there is more than one way to do anything, but with Python there is generally one best way and then all of the hacks.
Regards,
-scott
Welcome to Rivendell, Mr. Anderson.
New Good, that's what I want.
I'll try to get him posting here for ongoing advice.

Is it reasonable to assume that anything I come up with as an example perl script that is running on my local ubuntu desktop will also be implementable in a python script?

As long as we don't depend on any CPAN based drivers.

For example, we just write a program that takes a variable number of command line arguments. The arguments are file names.

The goal of the program is to parse and store the ids of several files, then run a final file through and test if the final file matched any of the previous files (at the ID level), and output SOME of the final file (records that have IDs that matched), and output it in a different order depending on which of the input files it matched.

The files can be either tab, pipe, or CSV delimited. We do not really parse them out fully, though, since the ID can be pulled via a REGEXP from all the files.

The code exercises REGEXPs, splits, joins, multi-level hashes, REFs, and refs of arrays in anonymous hashes, for storing and later lookup, and the perl idiom of pushing an element into an array, which required the obscure casting syntax.

The program reads the files and stores ONLY the primary key of each file into a file specific hash for a quick truth test later.

We then read STDIN.

We parse the ID from each record and see if the ID was in any of the previous files.

We store each record (the whole thing, which then leads to Perl/DBM/Tie discussion for when we run out of memory and need to start using disk) into an array that is specific to the matching previous file.

We then output all the records that we read via STDIN that matched ANY of the named files (some didn't), and we output them in the order of the previous files that they matched.

That means all the records that matched the 1st file are output, then all the records that matched the 2nd file, etc, until done.

Essentially a 15 line perl script the way I write it, maybe 5 for BT.

Knowing no python at all, but with my Perl background pushing implementation directions (possibly poorly), how long should it take for us to write this together in python?

Wanna give a pseudocode outline so we do it the "right" way?
Edited by crazy Sept. 3, 2011, 04:52:46 PM EDT
Edited by crazy Sept. 3, 2011, 05:10:35 PM EDT
New Data examples?
The problem with giving you Python pseudocode is that Python is pseudocode that runs... ;-)

Post the Perl REs if you want: Python can use them.
Regards,
-scott
Welcome to Rivendell, Mr. Anderson.
New Here's the code
Note: Lots of silly assignments to tmp vars.
This is so we can dump it using Data::Dumper.
The backslashes got lost on the original post; they are put back here.


#!/usr/bin/perl -W
use strict;
use Data::Dumper;
# Usage: year_pri.pl 3 year_1.txt year_2.txt year_3.txt input_data.txt
# ###### Note the stupid 3 to tell it 3 files to follow, I wasn't showing him
# the getopts lib yet.

my $file_count = shift;
my $bug = 1;

my @pri_files;
while ($file_count--){
    my $file = shift;
    push(@pri_files, $file);
}

# One hash of ids per priority file, keyed by file name.
my %priority_rec;
foreach my $priority (@pri_files){
    my ($tmp) = read_file($priority);
    $priority_rec{$priority} = $tmp;
}

my %save_records;

while (<>){
    my $in_rec = $_;

    #"1990118STTSKYJH","BA","INH_ALL-PNCOA3-2011|Work|51207","LAST","FIRST","","","","PATTERSON & KELLY PA","",

    if (/\|(\d+)"/){
        my $id = $1;
        YEAR_LOOP:
        foreach my $priority (@pri_files){
            if (defined($priority_rec{$priority}->{$id})){
                push (@{$save_records{$priority}},$in_rec);
                last YEAR_LOOP;
            }
        }
    } #end if we got a match for the id
    else{
        die "Cannot get id from: [$_]\n";
    }

}
foreach my $priority (@pri_files){
    # "|| []" keeps strict happy when a file matched no records.
    print @{$save_records{$priority} || []};
}

exit (0);

# Returns a reference to a hash whose keys are the first tab-delimited field
# of every line in the file.
sub read_file {
    my $file = shift;

    my %ret;

    open (IN, $file) or die "Can't open $file for read - $!\n";
    while (<IN>){
        my (@rec) = split(/\t/, $_, 2);
        $ret{$rec[0]}++;
    }
    close(IN);

    return(\%ret);
}
New Re: Here's the code
#!/usr/bin/python


# Usage: cat input_data.txt | ./sort-id.py pri1.txt pri2.txt pri3.txt -

import fileinput, re, sys

pri_files, id_cache, save_records = [], {}, {}

matcher = re.compile(r'"(\d+)') # not necessary but faster
# Note: match() anchors at the start of the line, so this grabs the digits right
# after the opening quote. The Perl script looks for digits between a '|' and a
# quote; if that is the ID you want, use re.compile(r'\|(\d+)"') and
# matcher.search(line) instead.

def process_priority_file(line):
    if fileinput.isfirstline(): # save the file name and initialize the record capture list
        pri_files.append(fileinput.filename())
        save_records[fileinput.filename()] = []

    id, separator, rest_of_line = line.partition('\t') # compare to line.split()...

    # You're not printing out multiple copies of the same record if it shows up in more than one
    # priority file, so I'm not sure what the point of saving a hash per priority filename was.
    if id not in id_cache:
        id_cache[id] = fileinput.filename()


def process_stdin(line):
    matches = matcher.match(line) # matches = re.match(r'"(\d+)', line) if not precompiled
    if not matches:
        print('Cannot get id from %s' % line)
        sys.exit(1)

    rec_id = matches.group(1)

    pri = id_cache.get(rec_id)
    if pri:
        save_records[pri].append(line)

def print_prioritized():
    for priority in pri_files:
        for r in save_records[priority]:
            # didn't see you doing a chomp, but here's the Python version since print adds a newline
            print(r.rstrip('\r\n'))

def read_files():
    # http://docs.python.o...ry/fileinput.html is worth reading.
    # I didn't feel like reading the filename count.
    for line in fileinput.input():
        if fileinput.isstdin():
            process_stdin(line)
        else:
            process_priority_file(line)

    print_prioritized()


# This test lets you only run something if the module is run from the command line.
# This code won't run if someone does 'import thisfilename'; idiomatically, people will
# sometimes put test code in this stanza so tests can be easily run from the command line.
# You could also put getopts tests for --help, etc. in here.
if __name__ == "__main__":
    read_files()


Also, if you don't want to add a number to the command line and don't like the stdin trick I used, just use:

def read_files():

    priority_files, input_file = sys.argv[1:-1], sys.argv[-1]
    for line in fileinput.input(priority_files):
        process_priority_file(line)

    for line in fileinput.input(input_file):
        process_stdin(line)

    print_prioritized()
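
On the Perl idioms mentioned earlier (pushing onto an array held in a hash, multi-level hashes): a rough Python equivalent is collections.defaultdict, which gives you Perl-style autovivification without the casting syntax. This is only a sketch; the variable names, file names and records are illustrative, and the Perl lines in the comments are approximate counterparts.

```python
from collections import defaultdict

# Perl: push @{ $save_records{$file} }, $record;
save_records = defaultdict(list)
save_records["year_1.txt"].append("rec-a")
save_records["year_1.txt"].append("rec-b")

# Perl: $seen{$file}{$id}++;   (two-level hash, created on first use)
seen = defaultdict(lambda: defaultdict(int))
seen["year_1.txt"]["51207"] += 1
seen["year_1.txt"]["51207"] += 1
```

One thing to watch: just reading a missing key from a defaultdict creates it, so use "key in d" or d.get(key) when you only want to test for membership.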

Regards,
-scott
Welcome to Rivendell, Mr. Anderson.
New Thanks
New How do *you* pronounce it? Me: tim-TOW-te-dee
--

Drew
New Re: How do *you* pronounce it? Me: tim-TOW-te-dee
TIM-TOW-DEE
New Pretty dern close...
Rose::DB recently came into use here. Really nice and does Data Hygiene before it even makes a DB connection that would have failed.

Dancer... really neat simple Web Framework. Allows for some really powerful stuff.

Moose and Mouse... nearly self explanatory.

Testing frameworks galore and most of them work well.

abstractions... a (usually) well written module to do about anything you want. (Payment Gateways are a really good example there)

In general, if you can think it... there is usually a CPAN module for it and sometimes 4 different ones to suit you the way you like to use things.
     need to understand how to hugely scale ruby on rails - (boxley) - (37)
         Twitter? - (Another Scott) - (33)
             Ow. - (static) - (32)
                 PHP vs Java scaling - (malraux) - (1)
                     +5, Informative. -NT - (static)
                 I believe they went to Scala on the JVM - (S1mon_Jester) - (1)
                     different type of application, easy to shard - (boxley)
                 As BT lectured me a while ago - (crazy) - (27)
                     Saw a presentation on Ruby on Rails yesterday. - (static) - (1)
                         I've used Rails - (malraux)
                     That still holds water... - (folkert) - (24)
                         Dunno about everything, but... - (crazy) - (23)
                             I'd say Java and C. - (Another Scott)
                             Re: Dunno about everything, but... - (malraux)
                             Well, what they said - (jake123) - (3)
                                 My assumption - (malraux) - (2)
                                     Speaking of Python... ;-) - (Another Scott) - (1)
                                         I still use 2.x - (malraux)
                             Ehhh - (crazy) - (15)
                                 Consider Haskell - (jake123) - (1)
                                     Actually, I was - (crazy)
                                 Other stuff. - (static)
                                 What are you trying to teach? - (malraux) - (11)
                                     Way overkill - (crazy) - (10)
                                         Python and C then. - (malraux) - (9)
                                             Is python TIMTOWTDI? - (crazy) - (8)
                                                 No, it's the opposite. - (malraux) - (5)
                                                     Good, that's what I want. - (crazy) - (4)
                                                         Data examples? - (malraux) - (3)
                                                             Here's the code - (crazy) - (2)
                                                                 Re: Here's the code - (malraux) - (1)
                                                                     Thanks -NT - (crazy)
                                                 How do *you* pronounce it? Me: tim-TOW-te-dee -NT - (drook) - (1)
                                                     Re: How do *you* pronounce it? Me: tim-TOW-te-dee - (crazy)
                             Pretty dern close... - (folkert)
         passed the tech screen, thx guys, in person interview Fri -NT - (boxley) - (2)
             Yay! - (crazy) - (1)
                 commute first, then yeah a move after a while -NT - (boxley)

"No question," Trudeau said confidently, "it was definitely out my butt."