2009 1C

2015-12-30 00:00:00 +0000

All Your Base

Insight:

  1. If digits are fixed, then the base should be as small as possible, to the point of larget digit + 1

  2. If base is fixed, then the first digit should be 1, and second 0, and third 2..

  3. Special case: only 1 digit

Pseudo code

base = the number of unique digits
create a map
first char -> 1
for i = [1, n)
  if digit does not exists in map, 
    map[char] = map.count or 0 if map.count == 1
  digit  = map[char]

translate digits into the base

Center of Mass

Consider only the 2 dimesnion case

only 1 firefly: distance toward a single line

2 fireflies:

both x and y are fully determined by time of inspection t. i.e., xC= fX(t), yC = fY(t), and both fX(t) and fY(t) are linear functions.

Therefore, yC can be resolved into a linear function of xC in a closed form. i.e., we can reduce the 2 fireflies case into our simplistic case

Same argument applies to 3 dimension case

Bribe the Prisoners

Translate the requirement: find the order of release the prisoners such that # of coins paid

Swap unknown and known: suppose we release first prisoner. It becomes 2 segments, each of which will represent the original problem

i.e.,

max(start, end) if(start == end) return 0;

releaseHead = 1 + max(start+1, end)
releaseTail = 1+ max(start, end -1)

  For i = start + 1 to end -1
    releaseMiddle = max(start, i) + max(i+2, end) + 2  

it would be the min of all choices

2009 1B

2015-12-29 00:00:00 +0000

Decision Tree

Simple parsing and traversal problem

  1. concate all strings into a single line
  2. parse the string into a binary tree
  3. pass each query in to traverse

parse(int start, int end)

  skip space
  double prob = tokenizeNextDouble

  skip space
  if (start == end)
    return new node(prob, "", null, null)
  else
    string nodeName = tokenizeNextStr
    skip space
    int leftEnd = findNextRP(start)
    leftNode = parse(start, leftEnd)
    skip space
    int rightEnd = findNextRP(start)
    rightNode = parse(start, rightENd)
    return new node(prob, nodeName, leftNode, rightNode)

calc(double p, set features) p *= prob if (p->leftNode == null) return p

if(features.count(nodeName) == 1)
    return leftNode->calc(p, features)
else
    return rightNode->calc(p, features)

The Next Number

starting from the right end, find the longest sequence such that digits[i] >= digits[i+1]

if the sequence is the whole string, we need to add a 0,i.e, the new sequence would be
digit[0] + 0 + digit[1]....digit[n-1] reversed

Otherwise, we include the next digit to the left, and the next number would be 
  smallest digit greater than that digit, and then the remaining digits in an ascending order

Square Math

Number theory related. Too hard for me.

2009 1A: B & C

2015-12-23 00:00:00 +0000

I failed to solve them, but still record here where I got stuck

B. Crossing the road

I am not familar with techinques of modifying exisiting graph algorithms. Too little experience with it. That is why I was not sure what node in the graph should represent

C. Collecting cards

geometric distribution: number Y = X - 1 failure before the first success. Each failure is independent and has a failure rate p

coupon collector problem (CCP)

hyper geometric distribution:k success in n draws, without replacement from a finite populate of size N that contains exactly K success, i.e., N, K are upper bounds for n, k, respectively

Sharing the insight with CCP:

P(m, n) = after we have n different cards, what is the probability that we can get m new cards from the next package (in CCP m = 1)?

since now we have a handle at the state of having n cards, we will ask the expectation with n in it, i.e.

Ex(n) = # of expected packs to buy to collect n cards

and we can define Ex(n+ m) based on Ex(n) and P(m,n)

One problem I ran into is how to calculate P(m,n) with high enough precision. Turns out I should calculate binomial coefficient in a recursive way

2009 1A: Multi-base happiness

2015-12-22 00:00:00 +0000

Insights:

  1. A number is either happy or not

  2. The reduction to 1 forms a trail. All numbers on that trail will be happy

  3. The reduction progess cares only set of numbers, i.e., given a set of digits, any permutations of that is happy (of course 0 can not be first digit)

  4. Notice there is < 2^9 possible answer sets, we can just do pre-computation. Moreover, since it is a set, we dont have to generate all 2^9 answers! we can just generate subsets of them, and combine the result with unions and intersections

Why I got stuck

  1. I had problem understanding the upperbound of the number, turns out brute force search alone is enough to cover it. What I should have done is to brute force the small case, and guess/experiment from that

  2. I thought a number is more likely to be unhappy, since the final step gives any digit between 1 adnd 9, i.e., 10 % chance to be happy(roughly). So we should generate set of integers based on set of digits forming a happy number. Turns out I ignored the square part in the computation,i.e., this assertion is false

  3. Another key insight is that, similar to a lot of self-reinforcing operations, e.g., on a group. If the chain keeps going, either we end up with identity or itself. So instead of search for happies, we can look for unhappiniess

Pseudo code

For num from 1 to 20 * 10^ 6 keep a bit map on what bases are happy

For each base from 2 to 10 decide if num is happy under that base, with the caching previous result update the bitmap

based on the bit map, update the final answer map

output the answer map

The code that generates output is just a read-lookup loop

2008 1C: Increasing Speed Limit

2015-12-20 00:00:00 +0000

Subproblem for small case:

Given index i, how many j < i, s.t., n[i] < n[j]

This subproblem stops working for the large case, because in the worse case, a sorted array, the summation step will be n^2, i.e., we can not sum one by one now directly now. This means we need an index-like structure that can represent a summary of ranges, to avoid doing ops across the range one by one.


Subproblem:

Given index i, how many subsequence ends before i s.t. the ending element of subsequence < n[i]?

How do we decompose this problem?

A more natural association will be segment tree, i.e., static range + dynamic attached node info.

Subproblem:

  1. What should be static range part represent
  2. What should be stored in segment tree node?

Option 1:

range represent range of indices. Then inside tree node, to answering the question, we need to store numeric value information regardless, to avoid going down all to the leaf.

Option 2:

range represent range of values inside array, similar to 1, we need to store range of indices inside node. But upon closer inspection, since we are wrapping up from left to right, all frequency will fall in the range j < i. i.e., we can do without it! ——

Pseudo code:

sort the array and construct BST and segment tree, wiht node representing values in the array

each node has attached node info set to 0

for i = 1 to n get sum of frequncey with range < n[i] save result to answers, which represent # of increasing subsequence ending at it update tree to include the sum calculated. (if previous step gives 0, set it to 1 instead)

Segment Tree

2015-12-18 00:00:00 +0000

Basic idea:

  1. A binary search tree,i.e., each node has 0 or 2 child nodes, which is static

  2. By static we mean the shape of the tree and the value of each node,i.e., we do not have to worry about re-balancing the tree. We

  3. This means normally we construct a balanced BST during init, and do logn querying/updating attached node info as part of op

  4. Due to the nature of the tree, the parent node info can represent a summary of the tree. This enable effiicent range queries, and cascading update will cost logn, i.e., # of level in a balanced BST.

  5. Implementation wise, since we can perfectly balance it during it, we can use a dense array implementation,i.e., root node at 1, left child index *2, right child : index * 2 + 1

Desgin of property-based testing lib from functional programming in Scala

2015-12-02 00:00:00 +0000

Basic idea:

  1. Framework generates values and run propery checks against those values to ensure property is satisfied

  2. Programmer comes up with algebraic properties of the API to be tested

  3. Framework should be able to minimize failed test and generate exhaustive values for the properties space

Design process:

  1. Obviously, we need a Gen[A] to represent generatoer

  2. To create a property, we use forAll(Gen[A]) (A => Boolean) : Prop. A => Boolean represent the algebraic rule specified by the programmer

  3. Prop should be composable, i.e., it needs to support && and || both of type Prop => Prop, i.e., accepts another clause and returns a combined clause

  4. Prop needs to be able to check, i.e., need a check: Either[SuccessInfo, FailureInfo]

  5. For Gen, often we need context-sensitive generation, e.g., first generate the size of the list, and then generate list based on size, so Gen is a monad and needs to support flatMap

  6. For Gen to know how to generate, we need to pass in a State[RNG, A], where State itself is a RNG => (A, RNG)

  7. For Prop to how how to run test, we need to pass the logic over, i.e., how many cases to pass before we show the result? TestCases => Result

  8. Since Prop is responsible for running the check, it needs to know how to generate as well. Instead of TestCases => Result, we need a (TestCases, RNG) => Result, so Gen can accept that rng and run

Code Jam 2008 1C: Ugly number

2015-11-28 00:00:00 +0000

By Chinese remainder theorem, because 2,3,5,7 are coprime, we know any answer can be reduced to a number in mod 235*7 group,i.e., one dimenstion of the state is the element in the mod 210 group.

Another dimension is the position in the string.

At each step in the string, we have 3 choices, concatenate, +, or -, note that we do not know the position into which we insert signs => we need to try them all!

Gotchas

  1. Note that we can not 2^64 is only 19 digits, so we can not convert the whole string directly into numbers. Fortunately num * 10 + digit preverses its mod under Group 210, because 210 is divided by 10. You can prove this by induction, adding digit by digit

  2. Alternatively, the state can be [start][end][mod] instead of [end][mod] to avoid trying all in loops

Starting state is ways[firstDigit] [0] = 1, ways[other memember in mod 210 groupd] [0] = 0;

Moreover, if number % 210 is divisible by 2,3,5, or 7, we know itself must be divisible by 2,3,5,or 7.

if we want to find all ugly numbers, since all can be reduced to one of 210 choices, we can brute force search and filter out non-ungly ones

Code Jam 2008 1B: Mousetrap

2015-11-27 00:00:00 +0000

Started with the last element remaining, we can reconstruct the whole sequence of deletion by DP => N^2

Solve the problem differently, instead of last to first, we can go first, and we notice that we can easily determin the initial position of each card, since by construction, you know exactly the interval between prev and next position, but brute force construciton doesnt give better run time.

What is blocking us is that we have to go through index one by one JUST to collect count, i.e., we need to introduce additional index-like data structure which contains count-related info.

We can use the classic sqrt break down, i.e., sqrt indices, each represents state of sqrt of rows. Both are sqrt to make sure worst case works are equiavlent in both cases. This solution can barely fit the run time

A related idea is to use interval-tree like data structure to answer the query, i.e., instead of 1-level indexing, use multi-level indexing:

Given current index i, what is the next index after skipping j unoccupied indices?

Segment tree’s idea is similar. Note the true segment tree is used to answer range queries, and is not supposed to change after creation.


This problem shares a lot of similarity with Josephus problem. The sqrt trick would work as well

Code Jam 2008 1C: Text Messaging Outrage

2015-11-18 00:00:00 +0000

Intuition: if f(a) >= f(b), then k(a) <= k(b) so that total # is minimized

Proof: Suppose there exists f(a) >= f(b), and k(a) > k(b), swapping the key of a and b will reduce the total #. Thus, we can repeat the process and keep reducing the total #,i.e., an arrangement satifies the condition is better than all arrangments not satisfying the condition, this means the condition is optimal

Note that sorting is increasing order by default. In this case we need decreasing order, or we can start checking from tail to head instead of head to tail

Code Jam 2008 1B: Number Sets

2015-11-17 00:00:00 +0000

Unknown: number of sets

Known: consecutive sequence of ints, each initially in its own set

Conditions: for each set, from any # to another #, we can form a path where the two ends share a prime factor at least P


Insight:

1.for each #, either it has a prime divisor <= sqrt(B) or itself is a prime
2.If two share a prime factor p, then either p <= sqrt(B) or p > sqrt(B)
3.If two share a prime factor p, then the difference of 2 >= p

Subproblem: is it possible for p > sqrt(B)

Answer: 10^6 >= difference of 2 # >= p, but sqrt(B) = 10^6, contradiction!


Therefore, we need to know only primes up to 10^6, and then do something similar to sieve to build numbers bottom up instead of top down, because factoring is hard and slow.

Pseudocode

sieve to 10^6

for each prime p from P to min(10^6, B)
  		factor = A/p + A%p ? 2 : 1

  	
while(p * factor < B) 
	union(p * factor, p * factor -1)
	factor++


for each n from A to B 
	add find(n) to a set

output the size of set

My original implementation can not pass the large dataset, because I used map to store the large number in my union find. Changing map to an array + adding logic to translate large index into a small one solved the problem

Code Jam 2008 1A: Milkshakes

2015-11-10 00:00:00 +0000

Unknown: arrangement

Known: clients preferences

Conditions:

1.Each client has at most 1 malted
2.Arrangement has min # of malted
3.All customers must be satisfies, i.e., at least 1 in arrangement must match customer preference

Insights:

1.Only 1 choice case: arrangement is forced
2.If many choices have same set of item preferences: all 0s will do
3.In terms of order of processing, we must solve all single cases, and propagate the changes and solve all single cases it generates
4.What remains is list of preferences with all free arrangements, each has at least 2 entries => all 0 would work, because we know at least 1 is 0!

However, I run into serious problems implmenting pseudo code above. Namely, updating the list of user requirements and extract the remaining one

Official solution:

1.start with all 0s
2.If all clients satified, done
3.If not satisifed, should we satisfiy its 0 requirement or 1 requirment?
4.0 requiremnet is not satisfied because all its positions are 1 already, this means we have to set the remaining pos 1 from 0 to meet the 1 requirement
5.1 requirement is not satfisfied, because that pos is 0. Continue with argument from the previous step, we set that bit to 1. 
6. repeat 2 - 5

This approach works because we turn only from 0 to 1, and not the other direction. Therefore, 2-5 will run at most # of batches times, each time iterate all customer once!

Can use a bitset to make comparison easier, also, keep malted and unmalted in seperate collections.

Reflections:

The basis of the algorithm is unit propagation. But because there is at most 1 true value. We dont have to keep track of # of active clauses => there is only 1 choice, unlike the general unit propagation case, where you need the count to help decide if it is the time to flip.

In either general or specialized case, we just need to know should we flip 0 to 1, and never from 1 to 0 because it is flipped by previous propagation

This unit propagation pattern is

1.start with identity answer
2.Similar to the spirit of greedy solution, move to the optimal solution one step at a time, with sure-fire step that improves the result
3. This sure-fire step becomes the inductive step
4. if there is no more step we can apply, we know we've reached the optimal solution

This pattern can be seen in a lot of optimization problems, e.g., Bellman-Ford, greedy algorithms, gradient descent..

Code Jam 2008

2015-11-08 00:00:00 +0000

1A: Minimum Scalar Product

I discovered the insight from the 2 element, and to an extend, 3 element case. But how to prove it? Direct induction does not work well.

Suppose our current permutation is not as desired, we start changing this into our target, from the smallest to larget, one element at a time. 

We will fix the element by connecting to the correct one, and connect the now 2 empty ones. The shape of these changed connections are same as in the 2-element case. 

Thus, each step only reduces the overall sum,i.e., for any perm, by transforming into our target perm, the sum will only decrease => Our arrangement is optimal

Notice the idea of steps of transformation vs final states. This kind of duality can be seen in the log-state duality and gradient descent

We can also see the idea of step-wise, gradient-descent like optimaization approach in the next problem!

1B: Crop Triangles

Unknown: number of 3-point sets

Known: list of points

Conditions:

1.sum of x-coordinates of the 3 points is a multiply of 3
2.same as y
3. list of points is generated from a hash-like function with parameters

Subproblem: consider only the x coordinate case

1.From unknown: it is equivalent to x1 + x2 + x3 mod 3 = 0
2.We realize mod operation represent a group operation with size 3,i.e., any 3-element check is reduced to a pattern of 3 elements from the group
3. matching from numbers to groups is too hard, so we will try matching from group to numbers. i.e., find patterns in the group that satisfies conditions, and see what elements can be reduced to patterns

Now consider y coordinate case: same arguments apply.


Subproblem: given (x,y) in the group of size 9, how many 3 element sets I can find to meet criteria?

Answer: since size is only 9, we can just brute force


Subproblem: after we identified good patterns, how do we know how many point sets can be reduced to such patterns?

Answer: 3 cases

1.if 3 points are reduced to the same element in the group
2.if 2 points are reduced to the same element in the group
3.if 3 points are reduced to 3 different elements in the group

Adds the answer for each 3-elemnet pattern together and we have the answer


Problems I ran into:

1.memset a unsigned long long array gives problem(?)
2.One instance should use long long to avoid overflow
3.We are looking for sets, but the counting we did is for combinations, i.e., same sets are repeated many times in different orders. I missed one case
4.0 requiremnet is not satisfied because all its positions are 1 already, this means we have to set the remaining pos 1 from 0 to meet the 1 requirement
5.1 requirement is not satfisfied, because that pos is 0. Continue with argument from the previous step, we set that bit to 1. 
6. repeat 2 - 5