Post #183,534
11/8/04 3:12:31 PM
11/8/04 3:15:09 PM
|
Don't agree
On 1 - the question is usually phrased as something like: There 10,000 html templates in a unix directory structure with that mention fresh fish. We no longer sell fish and we have to remove those references by tomorrow. How can you come up with a list of files that need to be fixed?
On 5 - even if you're a java programmer you have to be able to select the right data type to hold the number I'm going to give you. I actually had a guy suggest you could represent a US SSN using a short.
On 3 - suppose I asked you to write a function to reverse the words in a string - if its not Java I'll say it has to be done in place. If it is Java you can have a StringBuffer that's the same size. Most cannot answer this question.
"The significant problems we face cannot be solved at the same level of thinking we were at when we created them." --Albert Einstein
"This is still a dangerous world. It's a world of madmen and uncertainty and potential mental losses." --George W. Bush
|
Post #183,536
11/8/04 3:18:47 PM
|
Hire Jake
He's obviously just the thing Big River Dead Tree Editions needs to put it over the top.
-drl
|
Post #183,539
11/8/04 3:27:22 PM
|
Re: Don't agree
Again, I'm NOT saying this is a good thing, just that I'm not surprised. So you don't agree that I'm surprised? :-)
On 1, if you're a Java programmer you may not ever deal with the OS. Especially if you're using an IDE.
On 3, you said "C", which led me to believe you were looking for pointer arithmetic.
On 5, yes, you're interviewing idiots. I was under the impression you were talking about exact sizes of the primitives (32-bits for ints, etc.). Personally I wouldn't put a SSN in a numeric type anyway. :-P
As I said, though, you should be looking for people who have scripting experience as well.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #183,544
11/8/04 3:46:18 PM
|
Oh yeah - we want one scripting language
but it could be shell scripting.
"Personally I wouldn't put a SSN in a numeric type anyway."
Really? If its a database field I think I agree.
But what if I told you that you work for a fraud detection company and your job is to produce a little server that listens on a socket for a SSN and writes back an integer 1 or 0 depending on whether the SSN was valid (held by a live person). Assume the US Gov sends you a text file every month that has all valid SSN's - one per line. Also assume SSN data is potentially sparse.
How would you represent the data? How much space will you need? What is the Big-O efficiencey of worst case search?
This is rhetorical BTW. You don't have to answer :-)
I don't expect them to solve it on the phone. I do expect them to pick a strategy (or two) and explain what its going to cost in terms of memory consumption and time efficiency.
This is typical of the kinds of interview questions we ask around here. Which reminds me - got any favorite interview (programming) puzzles you want to share? We are always looking for new ones.
"The significant problems we face cannot be solved at the same level of thinking we were at when we created them." --Albert Einstein
"This is still a dangerous world. It's a world of madmen and uncertainty and potential mental losses." --George W. Bush
|
Post #183,548
11/8/04 3:50:34 PM
|
Please don't ask the one about the manhole covers. :-/
[link|http://www.sellsbrothers.com/fun/msiview/default.aspx?content=question.htm|Sheesh! It's the first one!]
Cheers, Scott. (Who figures every new IT person in the world has memorized the "right" answers without thinking about the questions.)
|
Post #183,560
11/8/04 4:17:49 PM
|
Otherwise known as...
... the "all your ideas are belong to us" question: If MS told you we were willing to invest $5 million in a start up of your choice, what business would you start? Why? Because you're looking for a job, you'll be too poor to sue us when we rip off the best 5 ideas we get from asking this question.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #183,554
11/8/04 4:04:34 PM
|
Shell scripting isn't the same thing at all.
People can write serious programs with the "higher" scripting languages. But again, I've had good luck with people who put down things like Python and Ruby because that means they actually USE them instead of just putting them on a resume to get past filters. But what if I told you that you work for a fraud detection company and your job is to produce a little server that listens on a socket for a SSN and writes back an integer 1 or 0 depending on whether the SSN was valid (held by a live person). Assume the US Gov sends you a text file every month that has all valid SSN's - one per line. Also assume SSN data is potentially sparse. See, you're providing more information now. :-) Depends on how fast it *needs* to be, what the machine constraints are, and how much time you want to spend on a solution. If you're bringing stuff in over TCP/IP, then the data is already in string format, for one thing. If efficiency is everything, and you can't afford the RAM, then sure, put it in a numeral. Or if you just want a solution and RAM is not a problem, read the whole thing into a HashSet and get O(1) lookups straight from the string you read in. Actually, the best way might be: allocate 125M of RAM, and on startup flip the corresponding bit to 1 for each SSN you read in. When you get a request, convert the SSN to long or whatever, divide by 8, look up the byte, and bitmask off the remainder to see if the SSN is valid. That way you don't have to keep any of the actual SSNs in memory at all, and your question is more or less moot. ;-) I've personally never been in a situation like that with SSNs, though. :-) Programming puzzles... I don't like them, in general, but it depends on what you're hiring for. When I hire for web developers, I give them a project to do that should take no more than 8 hours of time. They get a CD with a full environment on it, README, install instructions, compiler, Tomcat, JUnit libs, etc., and they're expected to produce a working web application given a spec. They can do it at home, here at the office on a spare machine, whereever. No frameworks allowed, no external libraries, but they can use any reference guide they wish. As a result I get people who can actually program and discuss the decisions they made intelligently, instead of people who are great at "puzzles" but couldn't actually program their way out of a wet paper bag on an actual project. Let me know if you want a copy of the CD I have prepared.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #183,563
11/8/04 4:31:34 PM
11/8/04 4:35:33 PM
|
I said it was rhetorical
I also knew that wouldn't stop you. :-)
Personally, I look for diversity of experience - more languages/environments is better.
"Depends on how fast it *needs* to be"
Here everything is optimized for speed. RAM is cheap. Volumes are really high though. This is the first place I've ever worked where maximum performance software was really important.
"Or if you just want a solution and RAM is not a problem, read the whole thing into a HashSet and get O(1) lookups straight from the string you read in. "
Around here *money* is not a problem for hardware as long as you're buying off the rack (no exotics). With that answer I'd ask for a quick estimate of RAM required in the worst case of all SSN's being populated. What is the overhead of a HashSet (I have no idea without doing research)?
"Actually, the best way might be: allocate 125M of RAM, and on startup flip the corresponding bit to 1 for each SSN you read in. When you get a request, convert the SSN to long or whatever, divide by 8, look up the byte, and bitmask off the remainder to see if the SSN is valid. That way you don't have to keep any of the actual SSNs in memory at all, and your question is more or less moot. ;-)"
Yep, bit vector is the best answer (smallest and fastest).
"I've personally never been in a situation like that with SSNs, though. :-)"
That's a real program at a real company where I once interviewed in San Diego. The question generally produces the nice thought process you wrote down. It lets me see the thinking pattern.
Most frequent answer from soon to be rejected candidates is to suggest a binary tree, then fail to be able to calculate the memory requirements of the fully populated structure. They get tangled up in trying to figure out memory partitioning algos to swap to disk too.
The CD sounds intriguing but I don't know how applicable it is. We use very little Java and the stuff I need coded isn't even a web app.
"The significant problems we face cannot be solved at the same level of thinking we were at when we created them." --Albert Einstein
"This is still a dangerous world. It's a world of madmen and uncertainty and potential mental losses." --George W. Bush
|
Post #183,565
11/8/04 4:47:36 PM
|
Don't you mean, "rederickal"?
Sorry, Beep. ;-)
Diversity is good, but the scripting languages seem to bring out the more curious, problem-solving types. Most people who can't be arsed to learn a scripting language for the hell of it also can't be arsed to do a lot of other things.
This place is all about speed, too. We get thousands of pricing ticks a second, and those explode into huge calculations. And updates have to be as near to real time as possible.
The CD doesn't sound too useful, other than as an example. We've also given applicants a shell login and had them code in a C++ environment on a server here. You could probably do the same thing with Oracle if that's what you're looking for. Just make sure you trust your sysadmins to get the permissions right. :-P
The key is to make them do an actual project in an environment where they can be comfortable, in order to get an idea of how they would work on the job as opposed to in an interview.
I put a copy of the problem up for general perusal at [link|http://www.mr-anderson.net/stuff/PROJECT.html|http://www.mr-anders...tuff/PROJECT.html] if anyone is interested.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #274,157
11/28/06 12:03:24 AM
|
Admin, I'd like a copy of the CD or the ISO.
-- [link|mailto:greg@gregfolkert.net|greg], [link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwetheyFreedom is not FREE. Yeah, but 10s of Trillions of US Dollars? SELECT * FROM scog WHERE ethics > 0;
0 rows returned.
|
Post #274,162
11/28/06 12:24:31 AM
|
I'll throw a copy up on my site.
I'll need to scrub it first, though.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #274,167
11/28/06 2:02:23 AM
|
I wonder if that CD would be good as a VMWare image
Don't mean do it, just thinking that CD image seems to be a good fit for Virtual Machine image technology.
--Tony
|
Post #274,168
11/28/06 2:08:16 AM
|
When ever. Its a *NOT* short term thing.
-- [link|mailto:greg@gregfolkert.net|greg], [link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwetheyFreedom is not FREE. Yeah, but 10s of Trillions of US Dollars? SELECT * FROM scog WHERE ethics > 0;
0 rows returned.
|
Post #274,204
11/28/06 12:04:29 PM
|
Re: I'll throw a copy up on my site.
[link|http://www.mr-anderson.net/stuff/j2eedist.tar.bz2|http://www.mr-anders.../j2eedist.tar.bz2]
I truncated the binary files and the Tomcat directories to save space. It was all very old stuff, and easy to obtain at any rate.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #274,240
11/28/06 3:58:54 PM
|
Interesting.
Its a project low enough for me to do, but has all the challenge of a real world thingamabober.
It may take me longer than 8 hours... but then I don't have really any programming experience in Java.
Thanks.
-- [link|mailto:greg@gregfolkert.net|greg], [link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwetheyFreedom is not FREE. Yeah, but 10s of Trillions of US Dollars? SELECT * FROM scog WHERE ethics > 0;
0 rows returned.
|
Post #274,248
11/28/06 4:51:52 PM
|
Enjoy.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #183,579
11/8/04 5:29:51 PM
|
Potentially sparse? I don't think so...
Perhaps in that big gap from 1_000_000_000 up. But there are about 300 million people in the USA, most of whom have a social security number (and others have been handed out), so social security numbers are mostly filled.
An optimal solution for the "sparse social security number" problem is going to use far too much memory when you have a mostly filled data space. Heck, 300 million pointers will take 1.2 GB on a 32-bit machine. Scott's solution is an extremely good trade-off that will be hard to beat on any significant criteria, let alone all of them combined.
Incidentally one thing to watch for in Scott's solutions, I believe that social security numbers can start with 0.
Cheers, Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
|
Post #183,585
11/8/04 5:42:36 PM
|
I took the ones that start with 0 into account.
125M is approximately 1,000,000,000 / 8.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #183,587
11/8/04 5:49:44 PM
|
I was referring to the step where you...
convert the string to long. Hand-rolled functions will be OK, but some library functions may take a leading 0 to indicate that the number is in octal, which generally won't do what you want it to.
Cheers, Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
|
Post #183,597
11/8/04 6:21:56 PM
|
Ah, gotcha. Good point.
Regards,
-scott anderson
"Welcome to Rivendell, Mr. Anderson..."
|
Post #183,603
11/8/04 6:57:18 PM
|
For various values of "sparse"
its only 1/3 populated and the gaps are uneven. Furthermore, SSN numbers are not issued sequentially. The first three digits have geographic significance.
Furthermore, people die in random order.
"The significant problems we face cannot be solved at the same level of thinking we were at when we created them." --Albert Einstein
"This is still a dangerous world. It's a world of madmen and uncertainty and potential mental losses." --George W. Bush
|
Post #183,612
11/8/04 7:45:36 PM
|
Unless you have a lot of knowledge...
that isn't boiling down to anything that gives you a big win. Not when 300 million pointers takes that much space.
The knowledge that it is "gap-y" could provide a space win, but only at some performance cost. You'd have to store the data in carefully compressed form and then decompress before using. For instance you could have a table that maps every 64K block of social security numbers to a block of data, which has Scott's format stored using a standard kind of compression. To lookup a social security number you have to decompress the appropriate block then look up your social security number.
Well filled-in stretches and gaps should both compress well. However now every lookup involves decompressing a block of data rather than just a bitmap. This can only be recommended as a win if the data is seriously uneven and 128 MB of RAM simply can't be done.
Given the cost of RAM, I'd go with Scott's solution.
Cheers, Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
|
Post #183,692
11/9/04 2:08:24 AM
|
Re: social security numbers can start with 0
Yep, mine does! But, there are other non-zero digits as well. :)
Alex
In politics, what begins in fear usually ends in folly. -- Samuel Taylor Coleridge, poet (1772-1834)
|
Post #274,305
11/29/06 12:34:54 AM
|
Re: Oh yeah - we want one scripting language
This is typical of the kinds of interview questions we ask around here. Which reminds me - got any favorite interview (programming) puzzles you want to share? We are always looking for new ones.
\r\n\r\n Though not really used in interviews (our requirements are slightly oddball, so our actual methodology probably wouldn't be useful to anyone else), our lead dev has been peppering us with ideas for a series of quizzes he's writing -- things like "what is this regex meant to match" (often followed up with "what's the bug in it"). \r\n\r\n Personally, I'm a fan of laying out an actual (or close to actual) scenario and playing the "how would you fix this" card; specific solutions aren't necessary, just evidence of a working knowledge of debugging under fire (like I said, our requirements are a bit oddball; it's not terribly unusual to get a request at noon for something to be developed from scratch and live before 5PM, nor is it unusual for us to get into situations where we have no way to really test something before deployment -- see [link|http://www.jacobian.org/writing/2006/nov/08/breaking-news/|our lead dev's write-up of our election coverage] for an example).
--\r\nYou cooin' with my bird?
|
Post #274,320
11/29/06 7:50:27 AM
|
IRLRPD. (new thread)
Created as new thread #274319 titled [link|/forums/render/content/show?contentid=274319|IRLRPD.]
|
Post #183,593
11/8/04 6:05:13 PM
|
NEED to be fixed or MAY need to be fixed?
grep -rli 'fresh.*fish' root_directory_to_search > fishy_list
That should catch the files that you want. I'm assuming GNU grep of course.
It is not perfect though - if someone broke fresh and fish onto separate lines, it will miss that. Ditto if someone had "Fish, Fresh". Also there are ways to code links which will be missed. (In one file you have a function that takes a code name and returns a link for that item, complete with name, in another file you ask for code 12344, which is fresh fish. You'll have to look for 12344 as well...) And it will catch all of the files with references to things like "Fresh water fishing getaways".
Yeah, yeah, you were looking for the find solution that I know about but never use so my fingers don't know it. After glancing at the man pages, find -type f -print | xargs grep -li fresh.*fish Note that using xargs that way lessens the overhead of starting lots of copies of grep.
And yes, I would know enough to pick an int for that problem (assuming that you are on a 32-bit or better platform).
Cheers, Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
|
Post #183,601
11/8/04 6:54:34 PM
|
MAY
Its a work scoping exercise. They'll get fixed by hand so probable matches is good.
The way to fail this one is to start writing an elaborate Java or C program to try to solve it. This is typical of the J-Heads. They'll write a whole Java program with custom string scanning and stuff.
Wrong tool for the job. Perl, Ruby, grep, whatever man.
"The significant problems we face cannot be solved at the same level of thinking we were at when we created them." --Albert Einstein
"This is still a dangerous world. It's a world of madmen and uncertainty and potential mental losses." --George W. Bush
|
Post #183,598
11/8/04 6:36:54 PM
|
you dont sell fish anymore, who cares if its fresh?
1.find . -print|xargs grep fish >foo.ou$$ 2. will TOP do? I can read oop but cant think in it 3. cat string | awk '{print $5, $4, $3, $2, $1}' 4. I got yer data structures right here, what do ya want to do with it? 5. $STDIN >foo |tail -f foo|grep SSBASE >foobar$$ ll foobar$$ | awk '{print $(filesize, depends on system}>$STDOUT using the number 0 and 7 instead of 1 and 0. Well you get the idea :-) thanks for letting me play regards, daemon
that way too many Iraqis conceived of free society as little more than a mosh pit with grenades. ANDISHEH NOURAEE
|
Post #183,623
11/8/04 8:34:57 PM
|
Reversing words is trickier than chars
I don't think many people ever wrote this, except as an exersice.
I would locate first and last word, picked the shorter of them, swapped the chars up to the shorter's size in place, and then I'd have to shift the whole buffer (mmemove) by 1 char to free the space for the next char of the long one to be copied. Of course, a better way is to have a fixed-size buffer (say, 8 chars) so that at least some words can be copied with a single shift.
This produces slow and ugly code. In reality, I'd be sorely tempted to allocate another buffer and finish the whole thing with a single memcpy. Or am I missing an obvious and good algorithm here?
--
This guy's ahead of his time! He's using quantum programming methods: in universes where invalid data is passed to this function, it does not return. Thus you are ensured that you will only have valid data after calling it. Optimally you'd destroy the universe on failure, but computers haven't quite advanced to that level yet.
-- [link|http://thedailywtf.com/archive/2004/10/26/2920.aspx|The] Daily WTF
|
Post #183,632
11/8/04 9:00:47 PM
|
read the post above yours :-)
read STDIN and assign variables to each non whitespace block, print them in reverse order. regards, daemon
that way too many Iraqis conceived of free society as little more than a mosh pit with grenades. ANDISHEH NOURAEE
|
Post #183,644
11/8/04 9:26:37 PM
|
You're missing an obvious and good solution
Simply reverse the entire array by putting a pointer at each end and swapping chars until they pass each other.
Now scan it for word breaks and reverse each word in place.
Two passes, in place, no messy math.
"The significant problems we face cannot be solved at the same level of thinking we were at when we created them." --Albert Einstein
"This is still a dangerous world. It's a world of madmen and uncertainty and potential mental losses." --George W. Bush
|
Post #183,645
11/8/04 9:31:04 PM
|
ohhh
I guess I fail...
--
This guy's ahead of his time! He's using quantum programming methods: in universes where invalid data is passed to this function, it does not return. Thus you are ensured that you will only have valid data after calling it. Optimally you'd destroy the universe on failure, but computers haven't quite advanced to that level yet.
-- [link|http://thedailywtf.com/archive/2004/10/26/2920.aspx|The] Daily WTF
|
Post #183,650
11/8/04 10:31:24 PM
|
An obvious but NOT necessarily good solution
Consider the sentence, This sentence, right here, doesn't work well with Todd's algorithm.
With Todd's algorithm we get the following, .algorithm s'Todd with well work t'doesn ,here right ,sentence This
The output that I would expect (Arkadiy's algorithm would give it to you) is something more like, algorithm s, Todd with, well'work t doesn here right'sentence This.
The best possible output would be, algorithm Todd's, with well, work doesn't here right sentence This.
I hope that questioning the spec would give brownie points.
Cheers, Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
|
Post #183,667
11/8/04 11:36:56 PM
|
It gives the results I expect
but your application of it I find odd.
This sentence, right here, doesn't work well with Todd's algorithm.
.algorithm s'Todd with well work t'doesn ,here right ,sentence This
I don't think an appostrophe counts as a word boundary - either in a contraction or when using a possessive. So I don't understand how reversing Todd's gets you s'Todd or doesn't gets you t'doesn.
As for the placement of punctuation - its kind of a fuzzy area I suppose and nobody is wrong as the desired behavior is left unspecified. The test string I had in mind was simpler - along the lines of "I was a bad boy". You can argue either result is correct. My gut is to treat sentence separators (commas, periods, etc) as individual words. You seem to have a different idea about leaving them between words 3 and 4 even if the original words 3 and 4 are now 9 and 8.
I hope that questioning the spec would give brownie points.
Pretty much any evidence of coherent thought gives brownie points. Failing completely but providing a view onto decent thought processes can still carry the day around here. We're just trying to find critical thinkers with decent grounding in CS that don't view all that "theoretical stuff" as useless.
This is apparently pretty hard to find.
"The significant problems we face cannot be solved at the same level of thinking we were at when we created them." --Albert Einstein
"This is still a dangerous world. It's a world of madmen and uncertainty and potential mental losses." --George W. Bush
|
Post #183,753
11/9/04 11:24:29 AM
|
I was assuming...
the most naive definition of "word" - a string of word characters. In my experience people tend to start with that and then go from there.
You're right that the spec did not specify what to do with the non-word characters. But when I hear, "Do this", my first question is, "What other side-effects are OK?" My default assumption tends to be "none" unless I find out otherwise.
Cheers, Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
|