Fun-looking stuff in Boston

Post #24,454 by ben_tilly 1/15/02 12:45:41 PM Reply	Fun-looking stuff in Boston A friend of mind sent me a link to [link\|http://www.itasoftware.com/careers/programmers.php\|ITA Software] because their ad has some fun programming problems. (I asked them, they don't want solutions discussed publically until after they change the problems, so let's not discuss how to solve their problems. Besides which, they aren't all that hard if you have a decent math background.) And in a change from the usual, they want C++ and Common Lisp. Cheers, Ben
Post #26,044 by Arkadiy 1/28/02 10:22:30 AM Reply	Re: Fun-looking stuff in Boston I think I got #1. My result is 12. Anyone else?
Post #26,054 by ben_tilly 1/28/02 11:05:40 AM Reply	Nope, the correct answer has length 15 Unless they changed their data file that is. Just to irritate people, my solutions to problems 1 and 2 run in under 30 seconds, even on my old P200 laptop. (They are written in Perl of course.) OK, the first passes didn't run that fast. But you really can reduce the amount of unneeded work astonishingly. Cheers, Ben
Post #26,088 by Arkadiy 1/28/02 1:50:16 PM 1/28/02 2:12:45 PM Reply	Oh shit! Oh, mine is about that fast too, or faster (N log N). Written in C++. Except looks like it's wrong... OK, I'll look at it again. Edited by Arkadiy Jan. 28, 2002, 01:56:03 PM EST Edited by Arkadiy Jan. 28, 2002, 02:12:45 PM EST Expand All History
Post #26,102 by ben_tilly 1/28/02 3:01:12 PM Reply	Heh Actually mine is O(n) (assuming that hash lookups are O(1) and words have a fixed average length), but with a bad constant. (Well for one thing it is written in this slow scripting language...) Running it on a 600 MHz (dual processor but single-threaded) machine I get: tilly@prod1:~$ time perl add_gram_fast4.pl WORD.LST to + n = ton + i = into + n = niton + s = nitons + e = tension + m = mentions + a = nominates + i = antinomies + r = nitrosamine + t = terminations + d = antimodernist + e = determinations + i = intermediations + n = INDETERMINATIONS (length 15) real 0m6.296s user 0m6.120s sys 0m0.110s My initial tries, um, didn't quite run that quickly. :-) Cheers, Ben
Post #26,126 by Arkadiy 1/28/02 4:43:13 PM Reply	OK, I know what you did. :) I did not use hash, used STL maps instead, hence log N. And the constant is probably just as bad as mine: between 20 and 30. :) Now if I could only find the effing bug... Will look first thing I get home.
Post #26,192 by Arkadiy 1/28/02 9:02:51 PM Reply	Fixed. $ time adda word.lst 2>/dev/null indeterminations by adding n (16 letters) intermediations by adding i (15 letters) determinations by adding e (14 letters) antimodernist by adding d (13 letters) terminations by adding t (12 letters) nitrosamine by adding r (11 letters) antinomies by adding i (10 letters) nominates by adding a (9 letters) mentions by adding m (8 letters) tension by adding e (7 letters) nitons by adding s (6 letters) niton by adding n (5 letters) into by adding i (4 letters) ton does not need to be derived real 0m10.000s user 0m0.000s sys 0m0.000s Even with optimization - still too slow (that's PIII - 800). I gues hashes are really that much better for thins problem.
Post #26,204 by ben_tilly 1/28/02 10:04:01 PM Reply	I wouldn't count on that First of all you still have an off-by-one bug. There should be a "to" at the beginning of that list, which you didn't output. But more than that, I do have some non-obvious optimizations in there. For instance as I build the chains downwards, I delete nodes to avoid repeating logic with other chains. (Did that from the first.) I do an obvious length test to allow me to break out with the first chain that I know I can't improve with. Add in some moving of steps that I know are slow in Perl (eg avoid entering functions), and my different versions have a huge performance differences. I would furthermore be shocked and horrified if the simple act of removing all of the scalar allocation and deallocation, and memory management overhead wouldn't result in good performance gains. I was joking about Perl, but the truth is that I used my math and knowledge of the language to push it to its limits, but those limits should be easily exceeded. Cheers, Ben
Post #26,206 by Arkadiy 1/28/02 10:14:10 PM Reply	Re: I wouldn't count on that >>>>>>>>>> First of all you still have an off-by-one bug. There should be a "to" at the beginning of that list, which you didn't output. <<<<<<<<<<< Nope. They said that derivations start at 3-letter words. As to deriving downward... Interesting. Going to try this one. (but the slowest part for me is actually loading data...)
Post #26,208 by Arkadiy 1/28/02 10:35:58 PM Reply	Big improvement! 30% on the down algorithm. My hat is off to you, sir. Amazingly enough, it's a different word. $ time adda_down word.lst 2>/dev/null dictatorialness by adding c (15 letters) dissertational by adding t (14 letters) radiationless by adding s (13 letters) rationalised by adding i (12 letters) desalinator by adding r (11 letters) dealations by adding s (10 letters) dealation by adding o (9 letters) dentalia by adding i (8 letters) lanated by adding n (7 letters) alated by adding d (6 letters) alate by adding t (5 letters) alae by adding e (4 letters) ala does not need to be derived real 0m7.080s user 0m0.000s sys 0m0.000s Still not as fast as Perl. Damn.
Post #26,219 by Arkadiy 1/28/02 11:38:20 PM Reply	Used STLport instead of gcc's STL. Another second saved $ time adda_down word.lst 2>/dev/null dictatorialness by adding c (15 letters) dissertational by adding t (14 letters) radiationless by adding s (13 letters) rationalised by adding i (12 letters) desalinator by adding r (11 letters) dealations by adding s (10 letters) dealation by adding o (9 letters) dentalia by adding i (8 letters) lanated by adding n (7 letters) alated by adding d (6 letters) alate by adding t (5 letters) alae by adding e (4 letters) ala does not need to be derived real 0m6.040s user 0m0.000s sys 0m0.000s I am going to try hash maps at this point.
Post #26,221 by Arkadiy 1/28/02 11:40:56 PM Reply	STL + hash_map Almost 2 seconds more. Now C++ is faster than perl, and I can go to sleep :) $ time adda_down word.lst 2>/dev/null countershadings by adding s (15 letters) countershading by adding h (14 letters) undercoatings by adding s (13 letters) undercoating by adding o (12 letters) underacting by adding d (11 letters) uncreating by adding u (10 letters) recanting by adding t (9 letters) recaning by adding n (8 letters) anergic by adding r (7 letters) incage by adding e (6 letters) acing by adding g (5 letters) cain by adding i (4 letters) can does not need to be derived real 0m4.280s user 0m0.000s sys 0m0.000s
Post #26,250 by ben_tilly 1/29/02 7:19:29 AM Reply	See? I said that Perl was slow and bloated! I am sure you could get that performance down quite a bit more as well. As for whether an add-a-gram has to start at 3, that is an interesting spec ambiguity. I can see why you read the spec that way, but for me the pattern and the "and so on" suggested that an add-a-gram could start at any length. BTW any luck on problem 2? Cheers, Ben
Post #26,272 by Arkadiy 1/29/02 9:23:12 AM 1/29/02 9:36:02 AM Reply	No luck on #2 I gave up and found a solution on the Net. After looking at it for a while, I still can't say I grok it. I don't quite see how the author can be sure that he exausted every possible combination (although I see why he thinks so). Also, the solution does not allow for unary "-". Would you let me look at yours? As to more optimization - I could not do it w/o sacrificing that readability and compactness. I could use more C instead of STL, but that would be reinventing the wheel. I think it's good enough as is. Edited by Arkadiy Jan. 29, 2002, 09:36:02 AM EST Expand All History
Post #26,293 by ben_tilly 1/29/02 10:15:56 AM Reply	You can skip the - very easily There is a complete and utter symmetry of positive and negative at the numbers you can reach in all of the calculations. Noticing and using this means that if you only consider calculations with non-negative pairs of numbers that will result in non-negative numbers, you get a factor of 4 reduction in combinations of pairs of numbers, and have shorter sets of expressions to include. This ties directly into the fact that there is no need to consider every possible expression. You merely need to be sure that you visit every possible value. (Or actually every non-negative value.) For the Perl solution you also have to assume that floating point round-off is not going to rear its ugly head. This is true, but proving it is non-trivial. The following solution is not my fastest in Perl, but it is not too far off and has the benefit that with a -v argument you get told the expressions leading to smaller numbers. Note that Perl is an extremely poor solution for this problem because it spends a lot of time doing number-string conversions. (As is seen by the fact that building up the actual string expressions is relatively little extra work.) (BTW Scott, I loathe my workaround for the extra returns...) #! /usr/bin/perl use strict; use Getopt::Std; getopts('v'); use vars qw($opt_v @find_in); # Set up my array of hashes of how to find various numbers with # n 9's. @find_in =( {}, # None of size 0! { 9 => '9' }, # 2 with 1 9 map {}, 2..9 ); foreach my $i (1..9) { print "Searching depth $i\\n" if $opt_v; my $find_in_a = $find_in[$i]; foreach my $j (1..$i) { next if 9 < $i + $j; print " Searching combinations of $i, $j\\n" if $opt_v; my $find_in_b = $find_in[$j]; my $find_in_sum = $find_in[ $i+$j ]; foreach my $val_a (keys %$find_in_a) { my $expr_a = $find_in_a->{$val_a}; foreach my $val_b (keys %$find_in_b) { my $expr_b = $find_in_b->{$val_b}; $find_in_sum->{$val_a + $val_b} \|\|= "($expr_a + $expr_b)"; $find_in_sum->{$val_a * $val_b} \|\|= "($expr_a * $expr_b)"; \tif ($val_a < $val_b) { $find_in_sum->{$val_b - $val_a} \|\|= "($expr_b - $expr_a)"; \t} \telse { $find_in_sum->{$val_a - $val_b} \|\|= "($expr_a - $expr_b)"; \t} if (0 != $val_b) { $find_in_sum->{$val_a / $val_b} \|\|= "($expr_a / $expr_b)"; } if (0 != $val_a) { $find_in_sum->{$val_b / $val_a} \|\|= "($expr_b / $expr_a)"; } } } } } my $ans = 1; while (exists $find_in[9]{$ans}) { print "$ans\\t$find_in[9]{$ans}\\n" if $opt_v; $ans++; } print "ANSWER: $ans\\n"; Cheers, Ben
Post #26,313 by Arkadiy 1/29/02 11:10:55 AM Reply	That's the part I can't see Yes, I get that inverting signs gives you symmetric values. But how about mixing signs? If you mix positive and negative nines, you may reach some numbers unreacheable any other way. E.g., suppose we limit ourselves to +, - and () only. Then 0 will be unreacheable w/o using unary minus. Also, for a value that you cannot produce, you need to be sure that you exhausted all possible approches, no? I guess I just suck at number theory...
Post #26,315 by Arkadiy 1/29/02 11:15:09 AM Reply	May be I can see, after all :) The way you handle binary '-' must be the key. This way, any use of unary minus can be expressed through binary one.
Post #26,331 by ben_tilly 1/29/02 12:27:10 PM Reply	Right Here are two ways to see it. The first is to note that if you apply unary - to an expression it can always be distributed into the expression. Either into the first term (case of * and /) or across both terms (+ and binary -). Therefore in any expression with unary -'s comes out to the same answer (after mechanical operations) as one where the only unary -'s are applied to individual terms. In other words you are building up expressions out of 9 and -9. But that just establishes the symmetry that I used to avoid having to consider negative numbers, so I ignore the -9. The second is even slicker. You agree with the symmetry note? Then I only need to consider operations with non-negative numbers that come out to non-negative answers. Every expression involving 2 non-negative numbers comes out to something that is either on my list, or that is negative, or that gives the same value as something on my list. In short, the unary - is useful because it establishes the symmetry. And once established, it turns out you never need to look at the unary -. (Actually it turns out to be a red herring. While the symmetry is relatively easily spotted, all you really need to know is that the set of negative values you can reach with n 9's is a subset of the set of negatives of positive values you can reach with n 9's.) Cheers, Ben
Post #26,072 by wharris2 1/28/02 12:59:28 PM Reply	Certainly an interesting approach From the two or three times I was there, Boston has good public transportation system and is probably as pricey as anything else on the east coast. Have a friend who was an investment broker there for a while. "Beware of bugs in the above code; I have only proved it correct, not tried it." -- Donald Knuth

Welcome to IWETHEY!