Codeforce Notes

2016-08-06 00:00:00 +0000

534C: Polycarpus Dice

  1. max value can be reached  = min(d(i), A - (n - 1))

  2. min value can be reached = max(1, A - (sum - d(i)))

469C: 24 Game

  1. n = 4, just multiple them together

  2. n = 5, possible the form a solution

  3. for all n >= 6, we can reduce to case n < 6, just keep multiply with i - (i -1), where i >= 6  

236C: LCM Challenge

  1. obviously, there is an optimal solution where the any of the gcd(i,j) should be 1, so that the answer is a * b * c. Otherwise, we can reduce a solution into a format with this pattern

  2. if n <= 2, just pick n

  3. we know gcd(n,n-1, n-2) = 1, and n * n-1 * n-2 is maximal, if n is odd

  4. But if n is even, then we should pick n-1 * n-2 * n-3 

318C: Perfect Pair

  1. if any number is negative, then we can not make from an non-perfect to perfect

  2. if any is posivite, just brute force it => if x <= y, turn into (x+y, y), we know the lower end increases by at least double 

578A: A Problem about Polyline

  1. since polyline can not be higher than x, then we know b <= x

  2. if a < b, that is impossible to get the line

  3. if the point is on the rise,  a - b will be on the axis, i.e, (a-b) = 2k * x

  4. if the point is on the fall, a + b will on the axis a+b = (2k + 1) * x

  5. try both and take minimum

Codeforce Notes

2016-08-05 00:00:00 +0000

456C: boredom

Consider each number appears only once

If we have 3 consecutive numbers, we should take 1st and 3rd

If we have 4 consecutive nubmers, we shoud take 2nd and 4th

i.e., if there no repeats, start the biggest, and we take every other numbers

If there are repeats,

best(n) = max(score(n) + best(n-2), score(n-1))

so sort the number, then

best(n) = max of

  1. score(n) + best(n-1), if a[n] > a[n-1] + 1

  2. score(n) + best(n-2), if a[n] = a[n-1] + 1 

  3. best(n-1), if a[n] = a[n-1] + 1
      

269A: Magical Boxes

Sort (k, c) pairs

start from 1 to n-1, calculate, how many box of this size can fit all previous sizes. Set balance to max(boxes to fit all, a[i])

if balance =1, we are done. Otherwise, the final answer is kmax + log(4, balance) + 1

204A: Little Elephant and Interval

Standard transformation: calculate all such pair < R

for numbers with m digits , all possible number is 9* 10^(m-2)

For numbers with same number digits as R,

if first digit <= last digit, (fd - 1) * 10 ^(m-2) + middleDigits + 1

if first digit >= last digit, same answer, (fd -1) * 10 ^(m-2) + middleDigits

437C: The Child and Toy

What will an optimal solution look like?

Consider the vertex with biggest value, since all edges must be brokean, we want they to be borken from the higher end, i.e., the highest value vertex should be removed first, to satisfy the conditions.

Repeating this process, we can form a sequence of actions, where for each edge, we guarantee that the edge is broken from the bigger end, i.e., the smallest possible cost is incurred

Codeforce Notes

2016-08-04 00:00:00 +0000

599C: Day at the Beach

Suppose we find our first segment s with length l, what property will that have?

  1. it will contain the l smallest numbers. Conversely, if we find the first l smallest numbers, we can declare a segment

  2. any remaining numbers >= any numbers in the s,i.e., the min of the remaining segment >= max of s. Conversely, having this property
means s must have the smallest numbers,i.e., can form a segment

This leads to 2 ideas

Idea 1

We keep track of a balance of numbers as we scan through the array from the left, every time we hit with balance 0, we announce that we detect a segment

Idea 2

We scan from left to right, every time we detect i s.t. max(1…i) <= min(i+1…n), by property 2 we can declare a segment

123A: Prime Permutation

Consider the target string:

All s[i] where i 2 will share the same letter
if n >= 6, then i 3 will share the same letter as previous, because s[6] = s[3] = s[2]    
Similarly, if n >= 10, then all i 2, i 3 , and i 5 will share the same char

By induction, we know for all prime <= n/2, i | prime will share the same letter. Moreover, if a composite number <= n/2, it must have a prime fact <= n/2, i.e., all number in [2, n/2] must share the same number

Conversely, if prime > n/2, it can appear only once as factor, i.e, at i = prime itself. if composite > n/2, then it must be a prime factor <= 2/n, which means it will share char with previous steps

So the algorithm is


  1. scan string to keep track of counts, and which letter is the most popular

  2. all number in [2, 2/n] must be same

  3. for numbers in 2, use a process similar to sieve to generate composite numbers > n/2 that has to share the same char

  4. scan through the answer again to fill in whatever left

297B: Fish Weight(!!!)

Idea 1

Consider the delta between A and B, it is possible for A to win if and only if

  1. delta of the biggest fish is in the favor of A

  2. first continous positive delta in the favor of A > sum of continuous positve delta previously in the favor of B.

But coding this is not the easiest thing

  1. sort <weight, count> pair in decreasing order for A and B

  2. if top weight (A) > tw(B) || top weight(A) = tw(B) && count of tw(A) > count oftw(B), done

  3. if top count and weight same, progress both indeies 

  4. progress B indice, until conditions 2) is statisfied, update the top most count for B

  5. now progress A index until 2) is violated, update the + balance for A

  6. compare if + balance is > - balance

Notice here we need to iterate through map <int, int> which is often sigh of flawed algorithm

Idea 2

just sort all fishes caught, on the first difference, if A is winning we are done. If B is winning, we need to keep scanning until we find a position where A is winning

Claim: A is winning in at least one postition <=> it is possible to find a solution

  1. (<=)  if A is not winning at any position, obviously impossible 

  2. (=>) we can make the delta arbitarily large to cover all potential balances

Codeforce Notes

2016-08-03 00:00:00 +0000

343B: Alternating Current

After looking at smaller cases, we can see that we can reduce consecutive ++ or – pairs, until the whole string is reduced to empty string.

Therefore, we can scan from left and right, and eliminate all ++ or – as we scan through. However, we can not afford to modify the underlying strng. Therefore, we have to maintain a prev[i], i.e., a linked list

as we scan through , if s[i] = s[prev[i]], we set prev[i+1] = prev[prev[i]], i.e., we remove i and prev[i] from the linked list

Init prev[i] = i-1 in [0, n], linked list will be empty if prev[n] = -1, i.e., the init value for prev[0].

198B: jumping on walls

Consider the subproblem by dropping the condition: what if without the rising water, how to solve it?

We can reduce to a graph connectivity problem, which can be solved with DFS

  bool visit(height, side)

      mark (height, side) as visited

      canEscape = height > n - k

      canEspace = canEscap || visit (height + 1, side) if it is not visted yet.

      canEspace = canEscap || visit (height - 1, side) if it is not visted yet.

      canEspace = canEscap || visit (height + k, !side) if it is not visted yet.

      return canEscape

But with the rising water, how to prove the existance of a solution?,

 1. we know the time arriving at a spot must be before water reaches that level. 

 2. Therefore, we look for the earliest time to arrive at a spot. A solution exists iff we can find an earliest time to escape

 3. This transforms the problem into a shortest path in a unit length edge graph problem -> a BFS would do

 4. note that since if we go up, we go at least 1 unit up, we do not even need to worry about water in that case
  
 5. However, if we go down, we need to ensure water level is not high enough yet,i.e., earliest time at the spot = water at that time <
height of the spot

Codeforce Notes

2016-08-02 00:00:00 +0000

350C: Bombs

Standard greedy problem

Idea 1: group bombs by rows, sort each row by cols.

Idea 2: If A is on the path to B, then we know mahatann distance of (A) < MD(B), i.e., if we process bombs in the order of MD, we know that all potential blocking bombs are cleared already

334C: secrets

For the optimal solution, we need to use as many small denomincations as possible in the place of large ones,i.e., the optimal solution have only the smallest denomination, because for any other solution, we can replace larger coins with smaller one and improve the result.

Consider any potential solution, the smallest denomination can not divide n, because otherwise, we can take out that denomination, and the sum is either still greater than n, or < n, which implies the sum = m.

Therefore, in the optimal solution, the smallest coin should not divide n

525C: Ilya and Sticks

Suppose we have 4 pairs a < b < c < d, c * d + a*b > a * d + b * c, i.e., we need to be greedy, and form the first pair we can

Idea1

Saw as you go. Start scanning len from max to min,

  1. push all evens to the result list
  
  2. if there is 1 remaining left, and count[len-1] > 0, push an additonal pair

  3. In the final run, iterate through pairs and output

Idea 2:

Do all sawing first, and then look for the matching pairs. Scan len from max to min

  1. if count[len] is even, no need to saw

  2. if count[len] is odd and count[len-1] > 0, saw one. Note that since we saw at most one stick at each length and only when there already exists a stick with len - 1, we know that we are not re-sawing the same stick twice

  3. if count[len] is odd and count[len-1] == 0, count[len]--

  4. Now our scan handles only even entries, and we need to keep track of only the outstanding pair

272C: Dima and Staircase

Notice that the first block always get covered for each drop. So we need to keep track of height of the first block, when there is a drop, the answer will be max(right anchor, first block), and we need to update first block height afterwards

Note that without the covering first w(i) requirement, this needs to be solved with a segment tree

Codeforce Notes

2016-08-01 00:00:00 +0000

66D: Petya and His Friends

Obviously, a(i) can not be a prime, otherwise, it will violate 2nd constraint

One construction would be a(i) = product of first n primes / ith primes

Another construct is to look at case of N = 3 (N = 2 obviously gives no answer), can you expanding to N with a known solution at N = 3

353C: Find Maximum

Note that m <= 2^n - 1

Notice that all a[i]s >= 0, so we should be greedy and include as many #s as possible. So we should interate through the prefix of m

at each i, if m[i] = 1, update suffix sum, update result with prefix sum + suffix sum , update prefix sum

343A: Rational Resistance

Note that this problem does not allow joining 2 elements. Therefore, we can just backtrack.

  1. we are done if a = b

  2. if a/b > 1, then we reverse n steps so that a/b - n <= 1

  3.  if a/b < 1, then we reverse n steps so that b - n * a <= a

  4. Note that at recursive call we backtrack n steps, in the spirit of GCD algo

316B2: EKG


  1. if prev[a] = b (b != 0), populate next[b] = a;

  2. for each prev[a] = 0, go through the linked list to find its length, add it to a vector size if it does not have the target

  3. possible [v][i] = possible[v] [i - 1] || possible [v - size[i]][i - 1]

  4. iterate through all possible[v][size.lenght], print v + pos if it is true

150B: Quantity of Strings

if k = 1, any string would do

if k = n, any string of length l/2 would do

Claim: if k is even, then the string can have only 1 letter

Proof: suppose k = 2* j. Consider the substring starting at 1

  1. a[j -1] = a[j] because they are the inner most pair

  2. a[j -2] = a[j], 2nd inner most pair

  3. a[j-3] = a[j -1] = a[j], 3rd inner most pair
  
  ....

Claim: if k is odd, then the string can have only 2 letters, all odd chars are the same, and all even chars are the same

Proof: suppose k = 2 * j + 1, Consider the substring staring at 1

  1. a [j + 1] = a[j +3] =  = a[j - 1] , inner most pair


  2.a [j] = a[j+4] =  a[j - 2]

  3. repeat steps for substring staring at 2..... n -k

Codeforce Notes

2016-07-31 00:00:00 +0000

384C: Milking cows (!!!)

This is type of the problem where we decompose by classification, and hopefully the target function is a simple composition of each class, without each class overlapping of each other’s cases question is: how do we classify?

We can just focus on the cases of 2 cows, 4 cases

1. >< : we will lose 1 regarldess of which side we pick

2. >> : we should try pick the left most one first

3. <>: we will not lose any regardless of which side we pick

4. <<: we should try picking the right most one first

since case 1 and 3 gives constant cost, if we can just need to minimize case 2 and 4 AT THE SAME TIME, we can reach the minimal cost

Fortunatly, we can decompose the target string into »»> and ««< : even the two interleaves, how they interleave doesn’t affect the final target functions

So we just need to calculate the total cost of case 1 => do a linear scan with state update and we are good

118D: Caesars Legions

Consider the compliement of the problem: how many ways to have no more than k1 consecutive Fs or K2 consecutive Cs?

  ways(nF, nC, F) = ways(nF-1, nC, C) + ways(nF-2, nC, C) + ..... ways(nF-k1, nC, C)

  ways(nF, nC, C) = ways(nF, nC - 1, F) + ways(nF, nC-2,  F) +....ways(nF, nC - k2, F)

  ways(1, 0, F) = 1 and ways(0, 1, C) = 1

answer is 2^n - ways(nF, nC, F) - ways(nF, nC, C)

Codeforce Notes

2016-07-29 00:00:00 +0000

207-A: Knight Tournament

Easy problem with c++ set and it O(logn) upper_bound. But it can be solved with the parent[x] array, plus the path compression trick in union-find data structure

265-A: no to palindromes

Suppose we know the first index to flip the first letter is not same and prev and prev - 1, then we can keep generating, with this rule from a, knowing that it will not introduce new palidromes. Otherwise, the inner part would have a palindromes already

One thing to notice is that since we are looking for the next one, we should try from the tail, so that we can break immedately upon finding one

336-A: Chain Reaction

The first idea is to DP by index i, i.e., ans[i] = 1 + ans[j] where j is larget num s.t. a[i] - a[j] < b[i], but this requires a bsearch

Can we avoid the bsearch?

Consider the value is within 1 mil range, at distance d

  ans[d] = ans[d-1] if d is has no lamp

  and[d] = ans[d - b[l[d]]i - 1] + 1 if d has lamp

Codeforce Notes

2016-07-28 00:00:00 +0000

334 A: Alternative Thinking

Flipping a substring can be decomposed into 2 prefix flips. Easy to prove each flip increase the score by at most 1, and if there is an optimal solution (i.e., can increase), there is one that flips at the first 00 or 11. Therefore, we just look for total number of 00s and 11s, cap it at 2

112 C: Another Problem on Strings

Similar to the problem above, substring can be decomposed into delta of 2 prefixs. This means substring with k 1s = prefix with n 1s and a smaller preifx with n-k 1s

Therefore, num of substrings with k 1s = sum of ( # of prefixs with n 1s * # of prefixes with n-k 1s) where n is from k to n

211 C: Fixing Typos

Easy problem, but can you solve it with one pass only.

Consider the inductive steps: if my current string is valid, and I try appending new one, what should I do to ensure it keeps valid?

9C: Hexadecimal Numbers

Need to know how many numbers < n that has only digit 0 and 1?

Idea 1: recursive computation

  num(n) =
    if (n = 1) return 2

  if top digit is 1
    num(n - 10s) + num(10s - 1)

  else if top digit is not 1
    num(n - remainig + 10*)

Idea 2: Generation-Filtering Generate all numbers with digit 0 and 1 that can possibly be less than < n, i.e., 9 digits, then filter out those >= n. This works, because we generate < 2^10 = 1024 numbers

Codeforce Notes

2016-07-21 00:00:00 +0000

372A: Counting Kangaroos is Fun (!!!)

since max answer <= n/2, we can partition the data into 2 halves, each with size n/2.

Claim: there always exists an optimal solution where all big ones are in the right half, while small ones are in the left half

  Proof: suppose we have a pair (a, b) in the same half. 

  If the other half has empty space, we can move a or b to that space, while preserving the max value

  If the other half has no empty space, and  no pair exist in the other half, then overall count > n/2, contradiction.

  Thus, there must exist a pair (c, d) in the other half, with a < b < c < d, i.e., we can just swap b, c to form (a,c), (b,d) and the optimal
value still preserves

So we can use a two pointer approach, start from one side, and search for the matching other side

567C: Geometric Progressions

So we need to find a(i), a(j), a(k) which forms a geometric sequence. Similar to the problem where you find a longest increasing subarray with one odd point, we can focus search for the middle point, the benefit is that we can break down the solution into 2 parts, we can easily bsearch on each part

475B: Strongly Connected City

Given the constraint, a DFS on each node would do in O(n^2 * m^2), but can we do better?

Consider base case: 2 rows, 2 cols, the for edeges must form a cycle (cw or ccw)

If add a a new row or col, it would do as well

Claim: as long as the outer sell forms a cw or ccw cycle, we can form a SCC

  Proof: by induction
  When we all a a new row, any new point can reach any old point, and any old point can reach any new point, and any new point can reach any
  new point

Codeforce 202 A: Mafia

2016-07-20 00:00:00 +0000

Notice that we can always construct an optimal solution with simple greedy approach, i.e., if a(i) < a(j), we know there exists an optimal solution where numberOfLeader(i) >= numberOfLeader(j). Reason: we can force a(i) to be leader until a(j) is reduced to a(i), and then each of them takes turns to be leader.

Also notice we are looking for the minimum number, i.e., the answer is monotonic, i.e., if m satisfies conditions, so does m + 1

Suppose we know final answer m, given the greedy insight, we can find the solution as follows,

  1. a(1) is leader for m - a(1) games

  2. a(2) is the leader for (m - a(1)) - (a(2) - a(1)) games = m -a(2)

  3. a(3) is the leader for (m - (a(1) + a(2))) - (a(3) - (a(1) + a(2))) games = m - a(3)...and so on

Since m also is number of games a(i) being leaders, we know m <= m - a(1) + m - a(2) + m - a(3)….. Solve it and we get lower bound for m

Codeforce 335 A: Sorting Railway Cars

2016-07-15 00:00:00 +0000

This is the type of the problem that looks at its inverse - i.e., instead of finding out what moves, find what did not move and its extreme values.

Focus on the unknown, what properties can you infer from the unknown?

  1. The unmoved parts, after moves, will be collapsed together, i.e., they must be consecutive numbers

  2. Therefore, # of unmoved <= # of items where consecutive numbers have increasing indices, i.e., max(# of unmoved) <= max(# of consecutive numbers with increasing indices) 

  3. Can the upper bound always be reached? i.e., does that longest sequence give the answer we need?

  4. Then we need to prove max(# of unmoved) >= max(# of consecutive numbers with increasing indices) 

  5. To prove 4), we can argue that we can always construct a solution around that subsequence, easily prove by induction

  6. Therefore, by 2) and 4), we just need to find longest consecutive value sequences with increasing indicies => use the inverse index

Codeforce round #FF: DZY loves sequences

2016-06-27 00:00:00 +0000

Option 1:

len(i, lastChange, hasChange): longest strictly increasing subsegment ending at i

when we move to i + 1

len(i+1, false, false) = a[i+1] > a[i] ? len(i, false, false) : 0 + 1 len(i+1, true, true) = len(i, false, false) + 1 len(i+1, true, false) = invalid case len(i+1, false, true) = max of ( a[i+1] > a[i] ? len (i, false, true) : 0 + 1 a[i+1] > a[i-1] + 1? len(i, true, true) : 1 + 1 )

and then check all (i, bool, bool) combos for max answer base case= len(0, false,false) = 1, len(0, true, true) = 1

Option 2:

Calculate all lenEnd(i), Calculate all lenStart(i)

for all i, compute lenEnd(i) + lenStart(i), and get max

Techinique: Find all potential solutions and get the max value. This means looking at the unknown and try to infer from its properties,i.e., one out of order means we need to search based on which entry is out of order

Codeforce round 357 & 358

2016-06-19 00:00:00 +0000

357C - Heap operations

The simple greedy approach would work. I TLEed because I did not include ios_base::sync_with_stdio(false);

Consider the first invalid case we run into. Options

  1. removeMin() on empty heap,

  2. getMin() > current min on heap

  3. getMin() < current min on heap

We can fix the heap on the spot, and repeat this step, by induction, our result will be consistent.

  1. we missed an insert, the value does not matter
  
  2. we missed removal

  3. we missed insert

In contest I spent more time proving the greedy approach is optimal than writing code!

358C - Alyona and the Tree

These types of problems introduce a speical definition, so we need to focus on that definition and infer conditions out of it.

From the perspective of a node, it must be removed if path to any of its ancestor > value of this node,i.e., we need to keep track of the max path from ANY ancestor to the node itself

MaxPath(node) =  E(parent, node) +  max(0, MaxPath(parent))

Also, since we remove from leaves, we need to go from root, and remove the subtree rooted at the first bad node

To make implementation easier, instead of calculating how many are removed, we calculate how many nodes are saved, the result is N - saved. (Inclusion-exclustion principle)

Codeforce round 357 & 358

2016-06-18 00:00:00 +0000

357C - Heap operations

The simple greedy approach would work. I TLEed because I did not include ios_base::sync_with_stdio(false);

Consider the first invalid case we run into. Options

  1. removeMin() on empty heap,

  2. getMin() > current min on heap

  3. getMin() < current min on heap

We can fix the heap on the spot, and repeat this step, by induction, our result will be consistent.

  1. we missed an insert, the value does not matter
  
  2. we missed removal

  3. we missed insert

In contest I spent more time proving the greedy approach is optimal than writing code!

358C - Alyona and the Tree

These types of problems introduce a speical definition, so we need to focus on that definition and infer conditions out of it.

From the perspective of a node, it must be removed if path to any of its ancestor > value of this node,i.e., we need to keep track of the max path from ANY ancestor to the node itself

MaxPath(node) =  E(parent, node) +  max(0, MaxPath(parent))

Also, since we remove from leaves, we need to go from root, and remove the subtree rooted at the first bad node

To make implementation easier, instead of calculating how many are removed, we calculate how many nodes are saved, the result is N - saved. (Inclusion-exclustion principle)

359C: Robbers’ watch

Typical complete search problem with generate-filter pattern. Because we have 2 constraints (uniqueness and value upperbound), we can generate to-be-filtered data set in 2 different ways. The basic idea is to generate according to one constraint, and then filter generated with another.

  1. generate all permutations of 0-7 with next_permutation, and check it first dn + dm digits to make sure the number formed < n and m. In the end, divide by
(7-dn-dm)! to filter out duplications. You can also use a set to maintain it, but it is more cumbersome to code

  2. generate all possible (n, m) pairs, and check this pair has no duplicate. I find this is easiler to code.

SRM 691 Div 2: Sunnygraphs

2016-05-31 00:00:00 +0000

Notice that each node has exaclty 1 out edge. The graph would be a tree, with children pointing to the parent, and root pointing to some random child

Also notice that this n node, n edge is composable structure, so potentially there are multiple components

Now assume only 1 component in the graph.

The graph can be broken down into 2 parts

  1. a directed cycle
  
  2. directed trees with roots on that cycle

Proof: Induction on the number of vertices, when we add a new edge, which node should it point to?

  1. if it points to a vertex on the cycle, it has 3 choices, no incoming edge, incoming edge from cycle, incoming edge from tree

  2. If it points to a vertex on the tree, these 3 choices either introduce more branches, or break the old cycle and introduce a new one

Now if the component is just a cycle, then any subset would do => 2^n

If the component is cycle + branch

  1. if branches do not change, any subset from cycle will do =>  2^ size(cycle)
  
  2. if branches change, then any non-empty subset from cycle will do => (2^size(branches) - 1) * (2^size(cycle) -1)

  3. sum it up, we get 2^size(component) + 1 - 2^size(branches)

Now lets consider the multiple component cases

  1. for each component, the argument above still holds, because of the n vertice, n edges structure is composable, except that we have to
exclude empty set case, because each component needs to contribute something to the n node

  2. Therefore, the total # of ways is the product of each indiviual componenets ways - 1 (-1 to exclude empty set case)

  3. therefore, we should DFS the graph to discover the component size and cycle/branch size, and then product result of each component together

Implementation notes

  1. to calculate size of component, need to convert it to undirected graph

  2. to find the length of the cycle, need to use the directed version, and keep a DFSish clock (See Floyd cycle finding).

  3. 1L << shift to calculate power for long

Time, Clocks and Ordering of Events in a Distributed System - Study questions

2016-05-27 00:00:00 +0000

1.Definition of partial ordering

2.if C(a) > C(b), can I say a → b? Why or why not?

3.How the partial order maintained during processes and send/receives?

4.How is total order defined based on partial order?

5.Why the paper says total order is not defined only by the system of events?

6.Describe the distributed sync problem

7.Why a central process alone approach will work?

8.Describe the distributed sync algorithm and its assumptions?

9.Why do we need physical time to detect failure?

10.What is the “anomalous behavior”? What are the 2 solutions proposed by the paper?

11.Why the strong clock condition physical lock only sets forwards?

Codeforces Notes

2016-05-13 00:00:00 +0000

673D: Bear and the two paths

Construction a graph. No solution exists if n = 4 or k <= n !!!

675C: Money Transfers

We know answer <= n, more specifically, answer = n - # of bands whose sum is 0, i.e., transfer happens only within the band.

How to find the max # of bands?!!!

675D: Tree Construction

Consider the final result, should the newly inserted be a left child or a right child?

  Claim: if it is a left child, then its parent is the tightest upper bound

  Proof: if there exists v < p1 < p, consider the common ancestor of p and p1. 

  If it is p1, then v should go left

  If it is p, then v should go p1

  If it is some other node, which means value between p1 and p, then v should go to left at that ancestor.


Similarly, if it is a right child, then its parent is the tightest lower bound

But we have 2 choices, left or right, which one to take?


  Claim: we have exactly one choice

  Proof: consider the LCA a of l and r, if l has right child and r has no left child, we know a is not l or r, i.e. a will be a tighter
bound for v then l and r. 
  
  This contradicts our previous lemmas. So we just need to check which one has no left/right child

ZooKeeper: wait-free coordination for internet scale systems

2016-05-08 00:00:00 +0000

High level

  1. What is the failure model assumed?
  2. Critique the CAP of ZK system
  3. What is the liveness and durability guarantee?
  4. Why wait-free is not enough for coordination?

Implementation details

  1. Why use full path instead of handle to access znodes?
  2. Why zk can process read locally at each replica? How about stale local copy problem?
  3. Consier the case of new leader changing configs, we need to ensure that no partial config is read ever, even if new leader failed in the middle. Why the lock approach in Chubby can not handle this? How to construct a solution with zk to solve this?
  4. ZK uses WAL and fuzzy snapshot to ensure exactly-once processing? Why the fuzzy snapshot works, even if it may not cache actual state at ANY givne time?
  5. Why the transaction to be broadcasted need to be idempotent? Why in ARIES the operation inside the WAL does not need to be idempotent?
  6. How does client use heatbeat to maintain connection with server?

Implementaion patterns

  1. Configuration
  2. Group membership
  3. Leader election
  4. Leader information

Role of Zookeeper in Kafka

  1. Map from partition to replica list
    • As a comparison, in GFS, it is the chunker server that keeps track of truck info. The master known which chuck is on which server through heartbeats
    • As common in most master/replication problems, this metadata has a version/epoch to handle the stale master/fail-recover problem
  2. For each partition, which replica is the leader and which replicas are in sync
  3. leader epoch is to handle stale leader
  4. version effctively handles membership view version
  5. What the role of controller? Similar to the master server in GFS
  6. Broker information. This is the membership managment storage, along with view number, in active-passive replication
  7. Controller id and epoch
  8. Compare this with the master lease technique
  9. Controller will force alive memebers to elect a new primary/leader, i.e., like the master in GFS

SRM 690 and 686

2016-05-05 00:00:00 +0000

SRM 690: GerrymanderEasy

I actually failed system test, because my check is k < N instead of k <= N ! I should not move onto the 3rd question so quickly.

My approach uses a sliding window, which requires my to search all k <= N.

One easier way to code is just to iterate on all i and j >= i, and sum up, and we try upddating max only when (j-i +1) >= K

SRM 686: SegmentsAndPoints

I could not prove my alternative greedy approach, which failed system test, i.e., I got punished by not following proper problem solving approach.

An alternative strategy is to reduce this into a bipartite matching problem, which turns out to be a massive overkill.

Since we are just looking for the existence of an answer, we can just force an order of discovery, if there is ever one, it will be covered by forced order. This forced order insight is what I missed during contest!!!

One natural guess is go from left to right. Now it becomes obvious that to match the leftmost point, we should take the segment ends leftmost and yet covers the point.

Proof #1

claim: If there exists a solution, then our greedy solution can discover at least one of them, i.e., in a feasible solution, if point p can be matched to s, then our algorithm never returns p never matches

proof: if q is matched to s instead of p, then we know q must be to the left of p, due to our forced order of discovery,i.e., s covers both p and q.

Now consider the segment t that would have covered q, by our algorithm, t must end to the right of s, and start to the left of q => i.e., it covers p as well, i.e., our algorithm at least can reutrn p-t matching, if no other is available!

Proof #2

By contradiction, suppose our algorithm returns n matches, and we have a matching p1, p2, …..pn, pn+1, ordered by p, with minimum sum(p) and the minium sum(end of segments)

case 1: consider the first p(i) such that p(i) < our(j), this can happen because the s(i) is being used by some other matching of our algorithm already. Otherwise we hae s(i), p(i) not used by anything and our algo doesn’t return it => clearly contradiciton, if we can prove that for p1…p(i-1), our solution IS same as the optimal solution we suppose. Easily provable by induction!

case 2: p1….pn are same, we just happen to have a p(n+1), proof is similar to case 1