Codeforces Notes

2016-11-29 00:00:00 +0000

271C: Secret

for group ki, just assign 2(ki)-1, 2 * k to that group, and then the next k element to the group 1 to k, and remaining elements to group 1

444A: DZY Loves Physics(!!!)

Claim: that highest density subgraph can have only 2 nodes and 1 edge

Proof: suppose the target subgraph has more than 2 nodes, then for each (u, v) pair, in the graph , the density is sum(U) + sum(V) / sum(e(u, v)). However, we know that there exists u + v / e(u, v) >= that density. Otherwise, the equation is no longer possbile

140C: New Year Snowmen(!!!)

Solution 1: bsearch on the final answer with construction algorithm

Claim: we can form k snowmen, if and only if then there exists an arrangement takens from sorted number array, (1, k+1, 2k+1), (2, k+2, 2k+2), on condition that no number appears more than k times

Proof:

Construction => result : obvioius

result => construction

	1. at least 3 * k snowballs. 

        2. no number appears more than k times

	3. Therefore, if we sort all snowballs, this arrangement can still work, because of 2) 

Solution 2: greedy

Claim: all quantities are < k and total >= 3 * k <=> we can form k snowmen with min number of snowballs

Proof:

Condition => result

if there are >=3 numbers appearing more than k times, obviously

if there are one or 2 numbers appearning more than k times, take these 2, and another random one, and do induction

if there are no number appearing more than k times, we just take any 3 numbers, and do induction

Result => condition: obvious

Codeforces Notes

2016-11-25 00:00:00 +0000

700A: As Fast As Possible(!!!)

I failed to solve this problem during contest. I thought the bus had to take students to the end and then head back

Insights

To be able to minimize the time, all students should arrive at the same time. Otherwise, the bus could have headed back earlier and helped the slowest one
Since input is 10k max, we can do a bsearch on number. Because we have the insight 1), the related search would be the final time every one arrives. Note that our epsilon is 10^-6
Because each group arrive on the same time, and v1, v2 are constatnt, that means each group spend on the bus for the same amount of time. This furthur suggests the time spent to meet the bus heading back is same across all groups
Therefore, we can calculate by multiplication, how much time bus has been running, and as long as bus run can conclude within t, the answer is possible

Remember to setprecision to display required number of precisions

645A: Amity Assessment

The DFS-state approach is too complex to code for this problem. For this problem, we notice that no matter how we move, the relative clockwise order of tiles remain same.

Therefore, it is enough just to form a 3 element string from inputs (minus the x), and see if we can rotate the from to form to

592C: The Big Race

If it will be a tie if, L is within the range of (i* lcm(w, b), i* lcm(w, b) + min(w, b))

So total # = (t / lcm(w, b) - 1) * min(w,b) + min(w,b) - 1 + tailBand

Note that need to handle the case where min(w, b) > T

53C: Little Frog

n = 3, 1 -> 3 -> 2

n = 4, 1 -> 4 -> 2 -> 3

n = 5, 1 -> 5 -> 2 -> 4 -> 3

The pattern is obvious now

717C: Potions Homework

Sort the student by laziness

Claim: the optimal is achieved when lowest lazyness is paired with highest difficlutiles

Proof: Exchange proof. Suppose we have l1 LT l2 and d1 LT d2, we can improve the solution by pair l1 with d2 and l2 with d1

l1 * d1 + l2 * d2 - l1 * d2 - l2 * d1 = (l2 - l1) * (d2 - d1) > 0

12C: Fruits

Sort the fruits by numbers

Max total = max # of fruit is most expensive

Min total = min # of fruite is most expensive

Exchange proof, note that the swap works for both ways

616C: The Labyrinth

Flood fill on each passable cells, for each passible cell assign the component number, and then for each wall, get the set of its neighbor component numbers and add up

635A: Orchestra

Calculate n(r, c) = number of vs in the rectangle with bottom right corner at (r, c), note that n(i, 0) = 0, and n(0, i) = 0

n(r, c) = n(r-1, c) + n(r, c-1) - n(r-1, c-1)

Then brute force on all (prevTop, prevLeft, br, rc) combos, find all combons such that n(br, rc) - n (prevTop, rc) - n(br, prevLeft) + n(prevTop, prevLeft), note that prevTop < br and prevLeft < rc

In Search of an Understandable Consensus Algorithm - Study Questions

2016-11-19 00:00:00 +0000

Why does raft use randomized timout values?
“GFS and HFDS use a seprate replicated stat machine to manage leader election and store configuration”, what does the author mean? Give concrete examples.
What is the max # of servers that can fail while raft is still avaiable? what is the min # of servers that can fail while raft becomes unavailable?
Descript the seqence of actions when a CANDIDATE handle an incoming AppendEntries call?
How can a candidate discover current leader or a new term? Why it becomes a follower upon such discovery?
During election, it is possible that no one is elected, how can that happen? How does raft deal with that?

consider the cases of leader/follower crashes, where can these scenarios happen?

     a. follower may have missing entries

     b. follower may have extra uncommited entries

     c. follower may have both missing entries and extra entries

When and how will raft fix the inconsitences in Figure 7)?
Why we need to ensure that any elected leader for an given has ALL of commited entries from the previous term? Why this property is satisfied in raft?

Describe the scenario in Figure 8

    a.replicated on majority of servers but not commited entry

    b.such entry in a) got overwritten

What if broadCastTime « electionTimeout is violated? what if electionTimeout !« mean time between failures (MTBF)?
Describe the disjoint majorities problem, how they can be triggered during membership changes, when each server just switch directly
What happened when cluster leader itself is not included in the new membership?
Give a scenario where removed cluster members can trick the new leaders into a follower, if without the remedy done by the raft.
What kind of metadata should snapshot include?

Codeforces notes

2016-10-11 00:00:00 +0000

631C: Report

Start from the tail, and get increasing list of range

for the entries outside the maxrange, push them to the end of answer

for each entry in the range, push the delta with the next entry from either head or tail depending on sort position

For the last entry, turns out we can just set delta to 0

622C: Not Equal on a segment

Since it just asks for existance/any, we can just look for an extreme solution, i.e., given a index r, find the rightmost l < r s.t. a[l] = a[r], this can be done in a linear scan

144C: Anagram Search

Do a sliding window and update the number of deficits and number of ?s, it is a hit if and only if # of deficits = number of ?s

Code forces notes

2016-10-10 00:00:00 +0000

161D: Distance in Tree

cacluate n(v, d) = number of nodes with has distance d from node v

Then the answer will be sum of, for each v

n(v,d) 

n(cv1, m - 1) * n(cv2, d - m - 1) 

Note the second part = n(v, d- m) - n(cv1, d- m -1), i.e., we need to go down a different path. Therefore we exclude the cv1 itself

369C: Valera and Elections

1. n nodes and n-1 edges suggest a tree

2. therefore, we can start from root 1, and dfs, we elect its immediate child if its subtree does not elect anything, and the child-parent
edge is broken

429A: Xor-tree

just be greedy from root, and recursive traverse through the tree.

510C: Fox And Names

The corner is the impossible case

if the dag has a cycle, i.e., the v is marked as visited, but its post number has not been inited yet

if the second string is the prefix of the first string

The Design and Implementation of a Log-Structured File System

2016-10-06 00:00:00 +0000

Why the paper says the disk operation will be dominated by writes?
What are pros and cons of a large file buffer?
For a small file focused workload, what is the performance characteristics?
Compare the inode discoery mechanisms of BFS and LFS
given we have dir1/file1, and dir2/file2 next to each other on our log, what will the log look like?

6.describes segment cleaning up process

7.How does LFS decide if a block is live?

8.Describe the segment usage table

How does LFS recover when the checkpointing op itself failed?
what are the benefits of having inode block AFTER data blocks?
How does LFS recover from the case where inode and directory info are not consistent?
Why LFS does not support fuzzy checkpoints, unlike ARIES and its WAL?
Why LFS wants a bimodal distribution of segment utilization?
explain the cost-benefit policy and why it helps achieve the bimodal distribution of utilization?

Data on the Outside versus Data on the Inside

2016-09-25 00:00:00 +0000

Classic paper on the data format/semantics related to data infrastructure

Services should act independently, i.e., they can still maintain availability even if the depending service failed. Also, there is no 2PC between services, all transactional semantics should remain within the SAME service

However, this means that the message can either representing past or future, but just not right now, time is unreliable, and it is up to the up to the application logic to reconcile this!

Comment

1. Microservices means PA systems

2. Message may be stale, this means consumer of that message has the final say of the result given its context, even the result is obsolete,
or the entity no longer exists!

As for the message between services, we need to make sure they are stable, i.e., gives same meaning when interpreted at different time/locations.

Comment

This suggests that

1. To simulate time, the ID and/or version number/timestamp should at least, not decreasing, this also means the IDs should not be reused,
because we can not move back our clock!

2. From the domain model perspective, only with ID is no longer enough for entity, we need version/timestamp as well as part of messages

3. being stable/versioned along with ID, means that the message could become indentifiable => this enables possibility for exactly-once
processing!

Reference data has a central authority service that publishes for other servies to use -> ID + mutliple version, itself can handle multiple concurrent updaters

For data ingestion, we should shred the incoming message into the relational information, and keep the extra unshredded somewhere else. Often, this original message is kept regardless for auditing and non-repudiation

Note that schema in general is against the idea of encapsulation => that belongs to the realm of object, which itself is not-queriable at all

Fenwick Tree

2016-09-18 00:00:00 +0000

Problem to solve

Fast prefix sum calucaltion when there are updates to the underlying trees

Unlike segment tree (ST), FT is “flat”,i.e., O(n) space complexity
Each index represents the segement sum of [index - 2^r + 1, index], notice that index starts at 1, and r is the bit index from right to left, starting at 0, i.e., each index covers a seg of length 2^r
as a result: tree[1] starts at 1 - 1 + 1 = 1, tree[2] starts at 2-2 + 1 = 1, tree[3] starts at 3 - 1 + 1….
Therefore, when we are reading, we start taking out the right most 1 bit from the input repeatedly until it becomes 0
When we are updating an entry, we keep adding the right most 1 bit from the input until it becomes size of the array
To calculate the 2^(right most bit pos), i.e., the shortest bit suffix that starts with a 1 bit, we can use num & -num!
To initialize the tree, we just mark all 0, and then use the write operations to set each index to the input value

459D: Pashmak and Parmida problem

iterate through array once, compute frequency for each entry, and populate the FT for frequencency

iterate the array again, from right to left

find the current accumulated frequncy for current entry 

update FT, with FT(frequency)-- and FT(frequency-1) ++

use FT to answer prefixsum(frequency), add the result to the final answer

update current frequency++

Codeforce Notes

2016-09-06 00:00:00 +0000

645B: Mischievous Mess Makers

Worse case: order from n to 1

Claim: in a optimal solution, regardless of k, n should be the first, and 1 should be last

Proof: by induciton on k, when k = 1, obviously

When k = n, we apply the 1-n swap first, and then apply the induction hypotheis on (n-1, k-1) case

total bad introduced = (n- 1 + n - k) * k. Of cource we need to handle the case of k > n/2

166C: Median

sort the array, if the number does not exist, then we need to insert it first and then resort

We need to calculate the band range of the target number. If left is to the right of median position, we can reverse-calculate the length of the target array, so that left becomes median = Left index * 2 - 1

Calculate similar if right is to the left of median

Otherwise, we are done already

476D: Dreamoon and Sets

Suppose k = 1, we just use (1, 2, 3, 5) (4, 7, 9 , 11)(8, 13, 15, 17),

Note that if k > 1, we can just map k=1 solution by multiplying k to each entry in both directions, i.e., we just need to solve k = 1 case!

so each group is the form of 1 even + 3 odds, we also notice that 3 consecutive odds are coprime for sure, because their difference is 2 or 4, which means their gcd can only be 1 or 2, but 2 suggests that they are odd, contradiciton

(!!!) The final form is {6a +2, 6a+1, 6a+3, 6a+5},each to prove that 6a+ 2 is coprime with everyone else

364A: Matrix

interate all (i, j) combinations to calculate count(x)= number of ways to form x, x >= 0 and x <= 4000 * 9

Then check all divisors of a up to sqrt(a), add count(div1) * count(div2) to the answer

the final answer needs to * 2, except when a has a perfect square

689C: Mike and Chocolate Thieves(!!!)

a * k^3 <= n, assume k = 2, 3, 4…, and we know m = n/8 + n/27 + …. + 1, Note that m > n/8, therefore, n < m * 8

I had difficulties proving n’s upperbound. Also notice we are looking for minimal viable answer here, so need to do extra check at bsearch’s midpoint

451C: Predict Outcome of the Game

Given we have the value of 2nd, we have 4 combos for the +, - of d1 and d2, therefore, we can calculate the sum of v1, v2, and v3. We just need to ensure that each v is in [0, k]

5C: Longest Regular Bracket Sequence

Maintain a stack , with each entry knows which index it is from. when we pop the entry, we update the length or the length count

However, know that the length will be from the SECOND top, because that the last time we do not have a regular string. Otherwise, the composition case ()() will fail!

416C: Booking System

Consider the request with max p(i), and can fit into a table, we should accept it. Otherwise, we can improve the solution

Therefore, sort requests by p(i), and then get lower bound for each pi, if lower bound does not exist, reject

710C: Magic Odd Square

At row i <= n/2, E * i , and then n - 2 * i O, and then E * i again

At row i = n/2 + 1, only Os

At row i > n/2 + 1, E *(n -i), then 2 * i - n, O and then E * (n - i again)

We can prove by induction this arrangement is correct

567D: One-Dimensional Battle Ships

For each insertion, we update the max possible # of ships, by finding its lower and upper bound, and calculate the delta. When we reduce the total # LT k, we know it is bad

Note that we probably want to storage segments instead of points, because both lower_bound and upper_bound do not return number LT current one, unless you do iterator– yourself

552D: Vanya and Triangles

Seems the a brute force n^3 solution works, i.e., are 3 points on the line at all?

Since two points are enough to define a segment, we can just calculate the equivalent class of its segments, and deduct combination counts. The trick is how to calculate the hash of the segment??!!

632A: Grandma Laura and Apples

Just reverse the seqence of operations, and first number = 0

if current is half

  ans += number * p

  number *= 2

else
  
  ans += number * p  + p / 2

  number += number + 1

204B: Little Elephant and Cards

Calculate number of each color. If such answer exists, we know max we have 4 possible canadidates, and then brute force each color => calculate # of face ups and only face downs, and get the minimal answer

237C: Primes on Interval

Sieve to generate all primes <= 10^6,

start from the first prime >= a, check if the a+k entries is still within the range. If it is , update the difference as answer

442B: Andrey and Problem

If pick only one, obviously, pick the one with highest prob.

Given we have a set, we can write down the exact formula. It becomes obvious that we can add a new memeber to set and increase overall probability as long as sum( p(i) / 1 - p(i)) > 1

So we can sort p(i)s and include them until that S <= 1

Codeforce Notes

2016-09-03 00:00:00 +0000

448C: Painting Fence

Notice that, if we use a H, either it touches top of fence, or it has followup Hs to touch the top of fences

This means that, if the lowest fences are not painted by H, all are done by V

Consider the last fence, is it painted by H or V

if by H, then it is same as with n-1 entries, with base height 0

if by V, then it is same as with n-1 enights, with base height h(n),

ans(i, baseHeight) =

if(baseHeight >= h(i))

  ans(n-1, h(i))  

else min of

  ans(n-1, baseHeight) + 1 //vertical

else
  ans(n-1, h(i)) + h(i) - baseHeight //horizontal

234C: Weather

Scan through array, calculate prefix sum of number of positive, negative and 0, entries

scan through array again from 1…n-1, assume the index as the first positive number, and then calculate the potential cost

598D: Igor In the Museum

Flood fill problem: DFS for each component, keep track of the cell belongs to which component, and how many paints this component can visit

Upon seeing query, just retrieve pre-computed result

321A: Ciel and Robot (!!!)

Idea 1

consider the 1d case, and only 1 dimensional: then either it reaches that in the first run, or never if the overall delta is in the opposite direciton

If we can reach within 1 iteration => manhattan distance <=

therefore, we try to move it so that its manhatann distance can be within |s|, after that we need to try 2|s| times traversing so that we can make sure all within bound spots are checked

Idea 2

To reach the path, the final form is (k * dx[s] + dx[p], k * dy[s] + dy[p]), we can iterate through all possible Ps, and then reverse calculate if k exisits

592C: The Big Race

if w = b, always tie

if w != b, tie <=> t is [lcm(w, b), lcm(w, b) + min(w, b)], note that lcm(w, b) start with 0

total = (t - 1)/lcm(w,b) * min(w,b) + min((t -1) % lcm(w,b), min(w, b))

353C: Find Maximum

just calculate f((prefix - 1) « i + 1s), iterate through all is, or max(v) = f(2^n -1) + max(v - 2^n)

295B: Greg and Graph

Speical case: assume the removal is in the order of n…1

this is basically the reverse of floyd washall

Note that

  1. a brute force O(n^3) space will exceed memory limit. Therefore, we need to reuse the previous state. Note that this does not affect the
correctness, because len[from][inter][inter-1] = len[from][inter][inter] 

  2. Even if we care only the partial graph, we still need to build all nodes as we expand the inters, so that when the new inter node is
introduced, we can immediately connect with existing nodes. Also, because we only calc path passing inter nodes, we know we are not
calculating other SPs by accident

340D: Bubble Sort Graph

for any (a[i], a[j]) tuple in the original graph where a[i] > a[j] and i < j, we will add an edge to the graph, otherwise, the sorting routine will not stop

Look at the unknown, we know that conversely, the 2 Vs inside the MIS does not have edge <=> they must map to 2 a[i], a[j]s where i < j and a[i] < a[j]

Therefore, the answer is the longest increasing subsequence of the input permutation

182D: Common Divisors

Iterate through all distinct prime factors of gcd(len(s1), len(s2)), check if the prefix of length of that prime factor is a building block for both s1 and s2 Take into account it may have same prime factors many times

Codeforce Notes

2016-08-31 00:00:00 +0000

159D: Palindrome pairs

Idea 1

For each i, we can calculate longest palindrome formed with i as center, while we are expanding, we can add that to RE[i+len], and LE[i - len], i.e., number of palidromes that with i+len as right end, and i-len as left end, and then we iterate all RE[i] * LE[j] with i < j

Idea 2

Suppose np[i], number of palidromes within 1…i, and then we calculate p[i][j], i.e., is the substring i..j a palindrome. so np[j] = np[j-1] + p[k][j] k over all 0..j

Final answer is np[i] * (np[n] - np[i]) over all i

219C: Color Stripe(!!!!)

k = 2, just need to brute force one of the two possible answers

k > 2, just go though each letter and change every same neighbors, just need to make sure it is different from both sides

225C: Barcode

Consider the base case n = 1

ans[i][true] = cost to make a solution, where column i is white

is min of

ans[i -1] [true] + whiteCost(i), ans[i-1][false] + whiteCost(i)

ans[i] [falst] = cost to make a solution, where i is the last black band of the stripe

ans[i-x][false] + blackCost(i-x + 1, i) ….ans[i-y][true] + blackCost[i- y][false]

final answer is min of ans[n][true], ans[n][false]

Codeforce Notes

2016-08-28 00:00:00 +0000

229A: Shifts

precompute LC(i, j) = min number of left shifts to make t(i, j) = 1, and similarly RC(i,j)

To compute RC(i, j)

  last1 = index of right most 1

  LP(j, 0, m)
    
    if (a[i][j] = 1)

        cost = 0;
        last1 = j

    else
        cost = last1 > j ? m - last1 + j + 1 :  lasst1 - j - 1

To compute LC(i, j), is similar

after that BF on the final column that is all one = m * n

484B: Maximum Value

Notice that the value range is small, this and plus it is mod based means we can operate/iterate on value directly

we need to brute force on a(j) over all entreis, for each a(j), we look for prev(2a(j)), prev(3a(j)),… and update final result as we go

prev(a(j)) can be pre-computed. the total iteration cost is sum(2M/a(j)) <= O(MlogM). Note that this cost analysis is exactly same as Sieves.

Onte that we need to sort entries, and ignore duplicate entires, otherwise, the sieve analysis may not work: we may be repeating low values too many times

Codeforce Notes

2016-08-26 00:00:00 +0000

425A: Sereja and Swaps

consider the case k = 1, of course, we have to swap in the biggest number

Therefore, for all (i, j) combos, sort entries within [i, j] and outside [i,j] separately, swap the first k entries within the band that is smaller than the outer band

244C: Checkposts

Find SCC of the directed graph, for each SCC, put one post on the one with lowest value. and then put one for each v outside SCCs

518D: Ilya and Escalator

O(n^2) dp, special case at cnt = 0 or cnt = n

313C: Ilya and Matrix

each entry appears at least once, therefore, we should force higher numbers to appear more e.g., biggest number can appear n + 1 times, the follwoing 3 numbers appear n times, the following 3 * 4 numbers appear n - 1 times…

337C: Quiz

ideally, if we do not trigger the k rule, the final score will be m. We need to limit the times the k rule is trigger as much as possible

of course, we want to trigger all double rules asap, i.e., at the beginning, otherwise, later correct answers will contribute to base

so the final solution would look like a continous band + bands of size k + a remainder band

L + (k - 1) * i + LR = m

L + k * i + LR = n

L + k * (n - m) + LR = n

we can bsearch on the length of the contnous band length, find the smallest that works

for the contnious band, the

So we can do fast exponatation here

f(n) = 2^(n-1) * f(1), where f(1) = 2 * k + 2 * k

Mistakes I made during implementation

During fast expontation, the resursive call should appear only once
During mod operations, need to defend against negative logs
This means for those easy-to-generate cases, I should run the max case myself just to ensure the sanity
In bsearch, need to delete if mid == low, if we want to be greedy to push it down further, because low <= mid < high

558C: Amr and Chemistry

Applying 1 and then 2 will have no effect. Therefore, all 2 ops must appear before 1)

Binary search the final answer may not work, because the result function is NOT monotonic!

In the final solution, the form can not have all 2s or 1s as the last step

if the final value is odd, then either all applies 2 or they are already like that the final value must be of form 1xxx10..0, the 1xxx1 part must be a prefix for all numbers. Therefore, we find longest common previx for all numbers, that will be the prefix of the equal value as well, and then we can bruteforce on number of appending 0s

557C:Arthur and Table

Suppose final max len is L, all l > L are removed, all l < L we keep all but |L| most expsive ones

We start from smallest length and maintain a heap, On reaching a new final length, we can

get top |L| expensive ones from the heap, these are the ones we save
	
calculate total = TotalE - bandE - saveE

add this band to the heap

We iterate through it and will get the final answer

137C: History

Note that we just need to decide if a event can be within any other event We need to sort by a[i] first, to eliminate one condition, and then when we scan through, we need to keep updating the rightmost endpoint

303A: Lucky Permutation Triple

So we can fix the first perm as 0…n-1

n = 1, just 0

n = 2, no answer

n = 3, (0, 1, 2) + (2, 0, 1) or (0, 1, 2)

ease to prove by induction than when n is odd, we just force b(i) = a(i)

(!!)How do you prove that a solution does not exist when n is even?

535C: Tavas and Karafs

Suppose we have (l,r), to be eaten, we need to have sum of height(l,r) <= mt, also height(r) <= t

Claim: this condition is enough Proof: induction on t t = 1 obvious

t = n, we apply the m-bites on the hightest entries if band within >= m, the inductive hypotheis is kept, and we can use t = n -1 case. if band within <= m, we know we can reduce all highest by 1, and since height(r) <= t, we can just eat them all

so we can bsearch for the highest r that satisfies that

297A: Parity Game

op 1 means that in the final result, # of 1s is increased by at most 1, if odd. or at most stay same if even. Therefore, if number of 1s in b > # of 1s in a + 1, impossible

consider a = 1, what b strings can we generate? => any string with 1 or 2 1s, and any a string with at least one 1 can be reduced to 1

consider a = 11, any b strings with <= 2 1s can be generated, because a can be reduced to 1 as well

procedure

if number of 1s is odd, apply op 0 to make it even

fit the 0 prefix of b by applying op 0

apply op 2 until it removes the first 1 in a

apply op 1 so that the 0...01 prefix in b becauses a suffix in a

now we can repeat ops 1-3 to make all 0...01 segs match, and finally use op 2 to remove redundant A prefixes

Bigtable: A Distributed Storage System for Structured Data

2016-08-24 00:00:00 +0000

How does master discover the tablet server? How does client discover the tablet server? Compare and contrast the solution with similar problems in GFS
How does bigtable detect master health? What will happen if IT THINKS the master has failed? Compare this failure detection and recovery with GFS’s design on its master and primary chuck server
How does master detects tablet server’s health? What happens when the master thinks the tablet server has failed? Compare this interaction with GFS’s between its master and chunk server
How can master learn the the newly split tablet? How does GFS master learn the newly created chuck?
The commit log is in GFS. What did they do in case GFS is experiencing latency? How would recovery work in this case?
SSTables are in GFS already, why do we still need that commit log?
Sequence of actions on read? compare this with GFS’s sequence of reading
Sequence of actions on write? compare this with GFS’s sequence of writing
What kinds of info/metrics they monitor, and how did they do that? Compare their collections with GFS’s.

Codeforce Notes

2016-08-22 00:00:00 +0000

436A: Feed with Candy

Consider the final solution: we should never eat 2 in a row. Therefore, we just try 2 cases, the first we eat is type 1 or type 2, and pick the best answer.

407A: Triangle

we can fix one point at (0,0), and then brute force to search for the second point coordinate with the a conditon, and then use the b condition to locate the 3rd point, or use the fact that (x, y) and (-y, x) form a right angle.

The part I failed is that I forget to check if the 3rd edge is parallel to any axis!.

460C: Present

Idea 1

Bsearch the answer, for each answer, we linear scan the array from left to right, also, we need to maintaind a d matrix to record the stop watering event

Idea 2

Greedy works on O(nm), but need additonal DS to speed it up

407B: Long Path

Claim: when we arrive at i + 1, all j < i has been visited even times already

Proof: By induction,

  1.  when we reach the i for the first time, by inductive hypothesis, prevous i-1 has beenn visted even times 

  2. if i points to itself, it will loop until it reaches even time, then move onto i+1

  3. if move to previous entry, then following sequence of action is same as p visited for the first time. By the difference of the
induction on case p and case i, we know the number of ops introduced by the subseqenct actions at each cell is even. Now we have visited i
even time, and move onto i+1

Therefore, when we later on move back to previous entries, the # of actions is same as we visit it for the first time, because even times case = 0 times case

visit(i,j) : number of steps to visit j, from i for the first time, with all cells visted even times already, with the exception of i

visit(i, j) = visit(i, j -1) + visit(p(j-1), j -1) + 2 //1 for moving from j-1 to p(j-1), one for moving from j - 1 to j

base case visit(i, i) = 0

246D: Colorful Graph

For each edge, consider C(u) and C(v), we just update Q(c(u)) and Q(c(v)), and in the end select the set with max cardinality

similarly, it is as if we create a graph based on colors, and add edge between colors

459C: Pashmak and Buses

This one uses the trick of focusing on individual in collective actions

Consider the unknown: if the two students have the same assignment each day, what would that suggest?

Consider the sequece of assignment for each student, there are k^d possible choices, this means the two students will be friends if and only if they share exactly a sequence of assignments. Therefore, if n > k^d, we know it is bad. Otherwise, we can assign a unique sequence id to each student

My wrong approach focused solely on collective actions, which gives a maximum of k * d

C. Bear and Prime Numbers

My (Failed) Idea

generate all primes <= 10^7

factor all n numbers, for each factor, increase the match[factor] by 1

sweept through the array generated by 2, to compute the prefix sum

the range becomes difference of prefix sums

The basic idea is sound, however, the main problem is that factoring in step 2) is too slow. Instead, we can use the sieve to reverse the factoring process, and calculate answers for all numbers in a single runr, i.e., instead of increasing match[factor] by 1, we can increase it by f(p)

429B: Working out

Consider the cell they meet at (mi, mj), one has to enter horizontally, and the other has to enter vertically, otherwise, they will have more than 1 same cells, we just need to iterate all possible meeting points, each with 2 possiblites

also, note that the path is effective cut in 2, and each half must exist, i.e., meeting point can not touch border!

so 2 * 10^6 to calculate the cost in 4 directions + 10^6 brute force on meeting points

311A: The Closest Pair

the total number of ops <= n (n-1) / 2, therefore, if k >= that max # of ops, no answer

Notice that we only require delta(x) < d, then we can force all points to have to same x coordinate, and then put them on the same y-axis, the distance between them does not even matter

Codeforce Notes

2016-08-20 00:00:00 +0000

371C: Hamburgers

Just bsearch the answers, and calcuate number of pieces one has to buy

471C: MUH and House of Cards

If we can form a higher level of tower, we can form a lower level of tower, i.e, we just need to look for the max level of towers

Given a tower of level h, we can know the min # of cards we need to build it. Therefore, we can just bsearch for the answer: find the highest h s.t., f(h) <= n

f(h) = (1 + h) * h /2 * 3 - h

185A: Plant

f(up, n) = 3 * f(up, n -1) + f(down, n -1)

f(down, n) = 3 * f(down, n-1) + f(up, n-1)

Eq 1 - Eq 2 and we discover a geometric series, and then we can calculate the power in O(logN)

623A: Graph and String

s(i) = b <=> deg(i) = n - 1

Note that if we have a solution, we can swap all as and cs and the result is still a valid result

s(i) = a <=> deg(i) = num of cs + (num of as - 1). Therefore, all other notes not b will be a

The remaining will be c, and then we scan through all edges again calculate degrees for each v.

4D: Mysterious Present

sort by (w,h), then O(n^2) DP, finally scan through answers that can fit the envelope

580D: Kefa and Dishes

At first I tried brute force generate all set, and potentially find all optimal matching in the set, but I can not find an easy algorithm to do so.

However, considering that a lot of graph-ish problem are very dp-like, we can locate the following relationship

score(set ending at i) = min(set ending at j, and i not in set) + score(i followed by j)

2^n * n states, each state takes n to update

417C: Football

|E| = k * n, therefore , k <= (n-1)/2

when k = 1, total

, with each edge has an outgoing edge , i.e., a cycle

when n = 1 or 2, impossible.

when k is within the upper bound, we can just add edge the lexo sorted next k nodes

Codeforce Notes

2016-08-19 00:00:00 +0000

402C: Searching for Graph

Base case: when n = 5. K5 has 10 edges, i.e., p = 0

when n = 6, K6 has 15 edges, i.e., p <= 3

idea : connect v6 to 2 + p existing nodes in n = 5

when taking a k element subgraph, total # edges = edges in K5 +

edges to v6

<= 2 (k-1) + 2 + p => we are good

when n = 7, K7 has 21 edges, i.e., p <= 7. So we can have A(6, 3), and then connect 2 + p - 3 edges to nodes in A6, if p > 3, if p < 3, we just add 2 edges to 1 and 2, and construct A(6, p)

when n = 8, K8 has 28 edges, i.e., p <= 12, So we can have A(7, 7) , and then connect 2 + p -7 edges to nodes in A7

i.e. we find a way to construct answers

Note the official solution has different insights

339C: Xenia and Weights

good(round, weightUsed, balance) =

LPE(lw, 1, 10)
  if(w < b && s[w] == '1' && s[lw] == '1')  
    good(r, w,b) = good(r, w, b) || good(r-1,lw, w - balance)

10^5 states, each cell iterates 10 times to update => 10^6 run time

287C: Guess Your Way Out

by induction, if the leaf exists in left child, we do not have to leave that child. On the other hand, if the leaf does not exist, we
will have to visit all before

base case h = 1

ans(L, h, n) = ans(R,h - 1, n - 2^(h-1)) if exit in the left, or 2^h - 1 + ans(L, n)

ans(R, h, n) = ans(L,h - 1, n/2) if exit in the right, or 2^h - 1 + ans(R, n - 2^(h-1))

150A: Win or Freeze

Calcualte non-trival prime divisors of 2. Note that npd != 1, because mutliplication requires 2 numbers

if npd = 0, A wins
  
if npd = 2, B wins

if npd > 2, A takes all but 2 prime divisors, and and hand over result to B

Therefore, B wins iff npd = 2, we just need to write a quick check to see if npd = 2. So we start from 2, and find first divisor of d
<= sqrt(q), and check if q/d is prime or not

Alternatively, we can iterate though [2, sqrt(q)], to calculate # of divisors. Note that if after the iteration, q is not 1 yet, we
need to add that 1 prime divisor to the total count!

This is a typical problem that can be broken into 2 parts, with each part solved differently and answer to the first part provides context to the second part

Codeforce Notes

2016-08-17 00:00:00 +0000

448D: Multiplication Table

final answer <= 10 * 10^10, we can just bsearch the answer, at each guess, check form 1 to n, we can check how many > answer, and how many = answer

if nl + ne = k ne > 0 we are done. Otherwise, the guess is too small
if nl + ne > k if nl <= k-1 we are done otherwise, guess is too small if nl + ne < k our guess is too big

493C: Vasya and Basketball

Sort a and b, we then increase d from 0 to max(a[i], b[i]), with each step a value of a or b , and update score as we go

Note that the two pointer approach does NOT work here! The reason is that the sliding window generally works when everything is triggered by moving your end pointer, and front pointer merely follows. However, in this case, because we need to handle the bands of equal values on both sides, when the front pointer moves, it has to update the state, thus violates the pattern.

590A: Median Smoothing

For any a[i] = a[i+1], this segment will be stable

therefore, the seq can be partitioned by those continous bands

consider the case: 1010101..01 each round takes out a 0

consider the case: 1010….0, each round moves the leftmost 0 to the right, until all 1s and 0s are together

so we iterate through the seq, and detect those alternating bands

However, during implmentation, note that to detect alternating band, we should use a[i -1] != a[i] && a[i] != a[i+1], instead of just one side. Otherwise, the code can not handle the case of crossing the boundary of continous bands

621C: Wet Shark and Flowers

Two commmon tricks: reduce bands to prefixs, and caluclate complements.

Note that if we have to calculate Pr(A) + Pr(B) - Pr(A) * Pr(B). Often it is a sign we might as well calculate its complement!

Total E = sum of Ex for each shark
  
Pr(i generates a prime) = (# of primes <= R) - (# of primes <= L)

Pr(product of i and i+ 1 is prime) = 1 -  (1 - Pr(i generates a prime)) * (1 - Pr(i+1 generates a prime))

Photon: Fault-tolerant and scalable joining of continuous data streams

2016-08-16 00:00:00 +0000

The requirement is near-exactly-once processing. That means they need to ensure at-most and at least processing

How is at most once implemented?
How is at least once implemented?
The joiner update the state and then flush the result, but these 2 operations are not in a single atomic step. How do they mitigate the problem where state is updated, but joiner is unable to flush the result? Even after these mitigatations, what other scenarios could break the at least once semantics
How does event source know that which click is associated with with query?
Each click has a unique id, how/when/where is it generated?
The IdRegistry is shared across DCs with up to 100ms latency. in production each PaxosDB leader has ~ 10 requests per second. How do they use so few requests to handle millions of clicks/write requests per min?
IdRegistry is, unsurprising, sharded, but they have the requirement to dynamically increase/decrease # of shards. How do they ensure that the same click, before and after sharding changes, will be routed to the same shard?
How do they mitigate outage at a single DC?
Consider at two different DCs processing the same click, how do they resolve the read-write conflict, specifically, the two DCs tries to add the same click id to the registry at the same time?
How do they validate their join logic?
Due to business requirement and storage restrictions, they keep only the most N days of data. Due to the single master architeure of paxos, it makes sense to run GC only at the master replica. But what if master changes and clock time drifted? How do they defend against it?

Codeforce Notes

2016-08-14 00:00:00 +0000

367A: Sereja and Algorithm

For string s if an algo can terminate, we know exactly |x| , |y| , and |z|

if len % 3 = 0, |x| = |y| = |z|

if len % 3 != 0, |x|, |y|, |z| can not differ more than 1

Claim: if a string satisfies the condition above, we know algo on it terminates

Proof: induction

base case n = 1,2,3, obvious

when len % 3 = 0, we find the longest prefix where |x| = |y| = |z| - 1, apply the hypthesis, and then apply the hypothesis again on the suffix , with the last 2 letters of the reformed prefix

when len % 3 = 1. we look for a prefix wher |x| = |y| = |z|, apply the hypothesis on the suffix, and then the prefix plus the first letter of the reformed suffix

when len % 3 = 2. similar to case 3)

479D: Long Jumps

Need 0 marks, if there exists a(i) + x, a(j) + y already

need 1 mark, if

  1. 1 already exists from existing mark

  2. m s.t. abs(a(i) -  m) = x and abs(a(j) - m) = y. We can generate by one condition and then filter by the other, i.e., can do a linear
search on points, both direction with delta x, and then filter out those outside [0, l], and m +/- y is not part of existing points

Otherwise, need 2 marks.

103B: Cthulhu

Assume the graph is connected

This is similar to the sunny graph problem,i.e., when |v| = |e|, <=> each node has only 1 parent in directed graph, and note that the cycle has min length at least 3, so that is satisfied as well

617D: Polyline

If there is only 1 line, then they share all x or y, if and only if

If there are 2, then at least 2 share x or y, however, if we have shared x or y, we need do make sure z has the other axis which does not fall into the range of (xx, yx)

To make coding easy, we can do a bruce force search on the first line.

628B: New Skateboard

Idea 1:

num [i] [mod] = number of substrings ending at i, with mod as the substring # % 4

num[i + 1] [(mod * 10 + digit) % 4] += nums[i] [mod] num [i +1 ] [digit % 4] ++

Idea 2:

if number % 4 == 0, <=> the last 2 digits form a number % 4 == 0, therefore, we look for ll 2-letter substrings whose num % 4 == 0 , and add length of prefix to final ans

Previous Page: 20 of 23 Next