Post #183,539
11/8/04 3:27:22 PM
|
Re: Don't agree
Again, I'm NOT saying this is a good thing, just that I'm not surprised. So you don't agree that I'm surprised? :-)
On 1, if you're a Java programmer you may not ever deal with the OS. Especially if you're using an IDE.
On 3, you said "C", which led me to believe you were looking for pointer arithmetic.
On 5, yes, you're interviewing idiots. I was under the impression you were talking about exact sizes of the primitives (32-bits for ints, etc.). Personally I wouldn't put a SSN in a numeric type anyway. :-P
As I said, though, you should be looking for people who have scripting experience as well.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #183,544
11/8/04 3:46:18 PM
|
Oh yeah - we want one scripting language
but it could be shell scripting.
"Personally I wouldn't put a SSN in a numeric type anyway."
Really? If its a database field I think I agree.
But what if I told you that you work for a fraud detection company and your job is to produce a little server that listens on a socket for a SSN and writes back an integer 1 or 0 depending on whether the SSN was valid (held by a live person). Assume the US Gov sends you a text file every month that has all valid SSN's - one per line. Also assume SSN data is potentially sparse.
How would you represent the data? How much space will you need? What is the Big-O efficiencey of worst case search?
This is rhetorical BTW. You don't have to answer :-)
I don't expect them to solve it on the phone. I do expect them to pick a strategy (or two) and explain what its going to cost in terms of memory consumption and time efficiency.
This is typical of the kinds of interview questions we ask around here. Which reminds me - got any favorite interview (programming) puzzles you want to share? We are always looking for new ones.
"The significant problems we face cannot be solved at the same level of thinking we were at when we created them." --Albert Einstein
"This is still a dangerous world. It's a world of madmen and uncertainty and potential mental losses." --George W. Bush
|
Post #183,548
11/8/04 3:50:34 PM
|
Please don't ask the one about the manhole covers. :-/
[link|http://www.sellsbrothers.com/fun/msiview/default.aspx?content=question.htm|Sheesh! It's the first one!]
Cheers, Scott. (Who figures every new IT person in the world has memorized the "right" answers without thinking about the questions.)
|
Post #183,560
11/8/04 4:17:49 PM
|
Otherwise known as...
... the "all your ideas are belong to us" question: If MS told you we were willing to invest $5 million in a start up of your choice, what business would you start? Why? Because you're looking for a job, you'll be too poor to sue us when we rip off the best 5 ideas we get from asking this question.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #183,554
11/8/04 4:04:34 PM
|
Shell scripting isn't the same thing at all.
People can write serious programs with the "higher" scripting languages. But again, I've had good luck with people who put down things like Python and Ruby because that means they actually USE them instead of just putting them on a resume to get past filters. But what if I told you that you work for a fraud detection company and your job is to produce a little server that listens on a socket for a SSN and writes back an integer 1 or 0 depending on whether the SSN was valid (held by a live person). Assume the US Gov sends you a text file every month that has all valid SSN's - one per line. Also assume SSN data is potentially sparse. See, you're providing more information now. :-) Depends on how fast it *needs* to be, what the machine constraints are, and how much time you want to spend on a solution. If you're bringing stuff in over TCP/IP, then the data is already in string format, for one thing. If efficiency is everything, and you can't afford the RAM, then sure, put it in a numeral. Or if you just want a solution and RAM is not a problem, read the whole thing into a HashSet and get O(1) lookups straight from the string you read in. Actually, the best way might be: allocate 125M of RAM, and on startup flip the corresponding bit to 1 for each SSN you read in. When you get a request, convert the SSN to long or whatever, divide by 8, look up the byte, and bitmask off the remainder to see if the SSN is valid. That way you don't have to keep any of the actual SSNs in memory at all, and your question is more or less moot. ;-) I've personally never been in a situation like that with SSNs, though. :-) Programming puzzles... I don't like them, in general, but it depends on what you're hiring for. When I hire for web developers, I give them a project to do that should take no more than 8 hours of time. They get a CD with a full environment on it, README, install instructions, compiler, Tomcat, JUnit libs, etc., and they're expected to produce a working web application given a spec. They can do it at home, here at the office on a spare machine, whereever. No frameworks allowed, no external libraries, but they can use any reference guide they wish. As a result I get people who can actually program and discuss the decisions they made intelligently, instead of people who are great at "puzzles" but couldn't actually program their way out of a wet paper bag on an actual project. Let me know if you want a copy of the CD I have prepared.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #183,563
11/8/04 4:31:34 PM
11/8/04 4:35:33 PM
|
I said it was rhetorical
I also knew that wouldn't stop you. :-)
Personally, I look for diversity of experience - more languages/environments is better.
"Depends on how fast it *needs* to be"
Here everything is optimized for speed. RAM is cheap. Volumes are really high though. This is the first place I've ever worked where maximum performance software was really important.
"Or if you just want a solution and RAM is not a problem, read the whole thing into a HashSet and get O(1) lookups straight from the string you read in. "
Around here *money* is not a problem for hardware as long as you're buying off the rack (no exotics). With that answer I'd ask for a quick estimate of RAM required in the worst case of all SSN's being populated. What is the overhead of a HashSet (I have no idea without doing research)?
"Actually, the best way might be: allocate 125M of RAM, and on startup flip the corresponding bit to 1 for each SSN you read in. When you get a request, convert the SSN to long or whatever, divide by 8, look up the byte, and bitmask off the remainder to see if the SSN is valid. That way you don't have to keep any of the actual SSNs in memory at all, and your question is more or less moot. ;-)"
Yep, bit vector is the best answer (smallest and fastest).
"I've personally never been in a situation like that with SSNs, though. :-)"
That's a real program at a real company where I once interviewed in San Diego. The question generally produces the nice thought process you wrote down. It lets me see the thinking pattern.
Most frequent answer from soon to be rejected candidates is to suggest a binary tree, then fail to be able to calculate the memory requirements of the fully populated structure. They get tangled up in trying to figure out memory partitioning algos to swap to disk too.
The CD sounds intriguing but I don't know how applicable it is. We use very little Java and the stuff I need coded isn't even a web app.
"The significant problems we face cannot be solved at the same level of thinking we were at when we created them." --Albert Einstein
"This is still a dangerous world. It's a world of madmen and uncertainty and potential mental losses." --George W. Bush
|
Post #183,565
11/8/04 4:47:36 PM
|
Don't you mean, "rederickal"?
Sorry, Beep. ;-)
Diversity is good, but the scripting languages seem to bring out the more curious, problem-solving types. Most people who can't be arsed to learn a scripting language for the hell of it also can't be arsed to do a lot of other things.
This place is all about speed, too. We get thousands of pricing ticks a second, and those explode into huge calculations. And updates have to be as near to real time as possible.
The CD doesn't sound too useful, other than as an example. We've also given applicants a shell login and had them code in a C++ environment on a server here. You could probably do the same thing with Oracle if that's what you're looking for. Just make sure you trust your sysadmins to get the permissions right. :-P
The key is to make them do an actual project in an environment where they can be comfortable, in order to get an idea of how they would work on the job as opposed to in an interview.
I put a copy of the problem up for general perusal at [link|http://www.mr-anderson.net/stuff/PROJECT.html|http://www.mr-anders...tuff/PROJECT.html] if anyone is interested.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #274,157
11/28/06 12:03:24 AM
|
Admin, I'd like a copy of the CD or the ISO.
-- [link|mailto:greg@gregfolkert.net|greg], [link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwetheyFreedom is not FREE. Yeah, but 10s of Trillions of US Dollars? SELECT * FROM scog WHERE ethics > 0;
0 rows returned.
|
Post #274,162
11/28/06 12:24:31 AM
|
I'll throw a copy up on my site.
I'll need to scrub it first, though.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #274,167
11/28/06 2:02:23 AM
|
I wonder if that CD would be good as a VMWare image
Don't mean do it, just thinking that CD image seems to be a good fit for Virtual Machine image technology.
--Tony
|
Post #274,168
11/28/06 2:08:16 AM
|
When ever. Its a *NOT* short term thing.
-- [link|mailto:greg@gregfolkert.net|greg], [link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwetheyFreedom is not FREE. Yeah, but 10s of Trillions of US Dollars? SELECT * FROM scog WHERE ethics > 0;
0 rows returned.
|
Post #274,204
11/28/06 12:04:29 PM
|
Re: I'll throw a copy up on my site.
[link|http://www.mr-anderson.net/stuff/j2eedist.tar.bz2|http://www.mr-anders.../j2eedist.tar.bz2]
I truncated the binary files and the Tomcat directories to save space. It was all very old stuff, and easy to obtain at any rate.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #274,240
11/28/06 3:58:54 PM
|
Interesting.
Its a project low enough for me to do, but has all the challenge of a real world thingamabober.
It may take me longer than 8 hours... but then I don't have really any programming experience in Java.
Thanks.
-- [link|mailto:greg@gregfolkert.net|greg], [link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwetheyFreedom is not FREE. Yeah, but 10s of Trillions of US Dollars? SELECT * FROM scog WHERE ethics > 0;
0 rows returned.
|
Post #274,248
11/28/06 4:51:52 PM
|
Enjoy.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #183,579
11/8/04 5:29:51 PM
|
Potentially sparse? I don't think so...
Perhaps in that big gap from 1_000_000_000 up. But there are about 300 million people in the USA, most of whom have a social security number (and others have been handed out), so social security numbers are mostly filled.
An optimal solution for the "sparse social security number" problem is going to use far too much memory when you have a mostly filled data space. Heck, 300 million pointers will take 1.2 GB on a 32-bit machine. Scott's solution is an extremely good trade-off that will be hard to beat on any significant criteria, let alone all of them combined.
Incidentally one thing to watch for in Scott's solutions, I believe that social security numbers can start with 0.
Cheers, Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
|
Post #183,585
11/8/04 5:42:36 PM
|
I took the ones that start with 0 into account.
125M is approximately 1,000,000,000 / 8.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #183,587
11/8/04 5:49:44 PM
|
I was referring to the step where you...
convert the string to long. Hand-rolled functions will be OK, but some library functions may take a leading 0 to indicate that the number is in octal, which generally won't do what you want it to.
Cheers, Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
|
Post #183,597
11/8/04 6:21:56 PM
|
Ah, gotcha. Good point.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #183,603
11/8/04 6:57:18 PM
|
For various values of "sparse"
its only 1/3 populated and the gaps are uneven. Furthermore, SSN numbers are not issued sequentially. The first three digits have geographic significance.
Furthermore, people die in random order.
"The significant problems we face cannot be solved at the same level of thinking we were at when we created them." --Albert Einstein
"This is still a dangerous world. It's a world of madmen and uncertainty and potential mental losses." --George W. Bush
|
Post #183,612
11/8/04 7:45:36 PM
|
Unless you have a lot of knowledge...
that isn't boiling down to anything that gives you a big win. Not when 300 million pointers takes that much space.
The knowledge that it is "gap-y" could provide a space win, but only at some performance cost. You'd have to store the data in carefully compressed form and then decompress before using. For instance you could have a table that maps every 64K block of social security numbers to a block of data, which has Scott's format stored using a standard kind of compression. To lookup a social security number you have to decompress the appropriate block then look up your social security number.
Well filled-in stretches and gaps should both compress well. However now every lookup involves decompressing a block of data rather than just a bitmap. This can only be recommended as a win if the data is seriously uneven and 128 MB of RAM simply can't be done.
Given the cost of RAM, I'd go with Scott's solution.
Cheers, Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
|
Post #183,692
11/9/04 2:08:24 AM
|
Re: social security numbers can start with 0
Yep, mine does! But, there are other non-zero digits as well. :)
Alex
In politics, what begins in fear usually ends in folly. -- Samuel Taylor Coleridge, poet (1772-1834)
|
Post #274,305
11/29/06 12:34:54 AM
|
Re: Oh yeah - we want one scripting language
This is typical of the kinds of interview questions we ask around here. Which reminds me - got any favorite interview (programming) puzzles you want to share? We are always looking for new ones.
\r\n\r\n Though not really used in interviews (our requirements are slightly oddball, so our actual methodology probably wouldn't be useful to anyone else), our lead dev has been peppering us with ideas for a series of quizzes he's writing -- things like "what is this regex meant to match" (often followed up with "what's the bug in it"). \r\n\r\n Personally, I'm a fan of laying out an actual (or close to actual) scenario and playing the "how would you fix this" card; specific solutions aren't necessary, just evidence of a working knowledge of debugging under fire (like I said, our requirements are a bit oddball; it's not terribly unusual to get a request at noon for something to be developed from scratch and live before 5PM, nor is it unusual for us to get into situations where we have no way to really test something before deployment -- see [link|http://www.jacobian.org/writing/2006/nov/08/breaking-news/|our lead dev's write-up of our election coverage] for an example).
--\r\nYou cooin' with my bird?
|
Post #274,320
11/29/06 7:50:27 AM
|
IRLRPD. (new thread)
Created as new thread #274319 titled [link|/forums/render/content/show?contentid=274319|IRLRPD.]
|