Codeforces notes
631C: Report
1. Start from the tail, and get increasing list of range
2. for the entries outside the maxrange, push them to the end of answer
3. for each entry in the range, push the delta with the next entry from either head or tail depending on sort position
4. For the last entry, turns out we can just set delta to 0
622C: Not Equal on a segment
Since it just asks for existance/any, we can just look for an extreme solution, i.e., given a index r, find the rightmost l < r s.t. a[l] = a[r], this can be done in a linear scan
144C: Anagram Search
Do a sliding window and update the number of deficits and number of ?s, it is a hit if and only if # of deficits = number of ?s
Code forces notes
161D: Distance in Tree
cacluate n(v, d) = number of nodes with has distance d from node v
Then the answer will be sum of, for each v
1. n(v,d)
2. n(cv1, m - 1) * n(cv2, d - m - 1)
3. Note the second part = n(v, d- m) - n(cv1, d- m -1), i.e., we need to go down a different path. Therefore we exclude the cv1 itself
369C: Valera and Elections
1. n nodes and n-1 edges suggest a tree
2. therefore, we can start from root 1, and dfs, we elect its immediate child if its subtree does not elect anything, and the child-parent
edge is broken
429A: Xor-tree
just be greedy from root, and recursive traverse through the tree.
510C: Fox And Names
The corner is the impossible case
1. if the dag has a cycle, i.e., the v is marked as visited, but its post number has not been inited yet
2. if the second string is the prefix of the first string
The Design and Implementation of a Log-Structured File System
-
Why the paper says the disk operation will be dominated by writes?
-
What are pros and cons of a large file buffer?
-
For a small file focused workload, what is the performance characteristics?
-
Compare the inode discoery mechanisms of BFS and LFS
-
given we have dir1/file1, and dir2/file2 next to each other on our log, what will the log look like?
6.describes segment cleaning up process
7.How does LFS decide if a block is live?
8.Describe the segment usage table
-
How does LFS recover when the checkpointing op itself failed?
-
what are the benefits of having inode block AFTER data blocks?
-
How does LFS recover from the case where inode and directory info are not consistent?
-
Why LFS does not support fuzzy checkpoints, unlike ARIES and its WAL?
-
Why LFS wants a bimodal distribution of segment utilization?
-
explain the cost-benefit policy and why it helps achieve the bimodal distribution of utilization?
Data on the Outside versus Data on the Inside
Classic paper on the data format/semantics related to data infrastructure
Services should act independently, i.e., they can still maintain availability even if the depending service failed. Also, there is no 2PC between services, all transactional semantics should remain within the SAME service
However, this means that the message can either representing past or future, but just not right now, time is unreliable, and it is up to the up to the application logic to reconcile this!
Comment
1. Microservices means PA systems
2. Message may be stale, this means consumer of that message has the final say of the result given its context, even the result is obsolete,
or the entity no longer exists!
As for the message between services, we need to make sure they are stable, i.e., gives same meaning when interpreted at different time/locations.
Comment
This suggests that
1. To simulate time, the ID and/or version number/timestamp should at least, not decreasing, this also means the IDs should not be reused,
because we can not move back our clock!
2. From the domain model perspective, only with ID is no longer enough for entity, we need version/timestamp as well as part of messages
3. being stable/versioned along with ID, means that the message could become indentifiable => this enables possibility for exactly-once
processing!
Reference data has a central authority service that publishes for other servies to use -> ID + mutliple version, itself can handle multiple concurrent updaters
For data ingestion, we should shred the incoming message into the relational information, and keep the extra unshredded somewhere else. Often, this original message is kept regardless for auditing and non-repudiation
Note that schema in general is against the idea of encapsulation => that belongs to the realm of object, which itself is not-queriable at all
Fenwick Tree
Problem to solve
Fast prefix sum calucaltion when there are updates to the underlying trees
-
Unlike segment tree (ST), FT is “flat”,i.e., O(n) space complexity
-
Each index represents the segement sum of [index - 2^r + 1, index], notice that index starts at 1, and r is the bit index from right to left, starting at 0, i.e., each index covers a seg of length 2^r
-
as a result: tree[1] starts at 1 - 1 + 1 = 1, tree[2] starts at 2-2 + 1 = 1, tree[3] starts at 3 - 1 + 1….
-
Therefore, when we are reading, we start taking out the right most 1 bit from the input repeatedly until it becomes 0
-
When we are updating an entry, we keep adding the right most 1 bit from the input until it becomes size of the array
-
To calculate the 2^(right most bit pos), i.e., the shortest bit suffix that starts with a 1 bit, we can use num & -num!
-
To initialize the tree, we just mark all 0, and then use the write operations to set each index to the input value
459D: Pashmak and Parmida problem
iterate through array once, compute frequency for each entry, and populate the FT for frequencency
iterate the array again, from right to left
1. find the current accumulated frequncy for current entry
2. update FT, with FT(frequency)-- and FT(frequency-1) ++
3. use FT to answer prefixsum(frequency), add the result to the final answer
4. update current frequency++
Codeforce Notes
645B: Mischievous Mess Makers
Worse case: order from n to 1
Claim: in a optimal solution, regardless of k, n should be the first, and 1 should be last
Proof: by induciton on k, when k = 1, obviously
When k = n, we apply the 1-n swap first, and then apply the induction hypotheis on (n-1, k-1) case
total bad introduced = (n- 1 + n - k) * k. Of cource we need to handle the case of k > n/2
166C: Median
sort the array, if the number does not exist, then we need to insert it first and then resort
We need to calculate the band range of the target number. If left is to the right of median position, we can reverse-calculate the length of the target array, so that left becomes median = Left index * 2 - 1
Calculate similar if right is to the left of median
Otherwise, we are done already
476D: Dreamoon and Sets
Suppose k = 1, we just use (1, 2, 3, 5) (4, 7, 9 , 11)(8, 13, 15, 17),
Note that if k > 1, we can just map k=1 solution by multiplying k to each entry in both directions, i.e., we just need to solve k = 1 case!
so each group is the form of 1 even + 3 odds, we also notice that 3 consecutive odds are coprime for sure, because their difference is 2 or 4, which means their gcd can only be 1 or 2, but 2 suggests that they are odd, contradiciton
(!!!) The final form is {6a +2, 6a+1, 6a+3, 6a+5},each to prove that 6a+ 2 is coprime with everyone else
364A: Matrix
interate all (i, j) combinations to calculate count(x)= number of ways to form x, x >= 0 and x <= 4000 * 9
Then check all divisors of a up to sqrt(a), add count(div1) * count(div2) to the answer
the final answer needs to * 2, except when a has a perfect square
689C: Mike and Chocolate Thieves(!!!)
a * k^3 <= n, assume k = 2, 3, 4…, and we know m = n/8 + n/27 + …. + 1, Note that m > n/8, therefore, n < m * 8
I had difficulties proving n’s upperbound. Also notice we are looking for minimal viable answer here, so need to do extra check at bsearch’s midpoint
451C: Predict Outcome of the Game
Given we have the value of 2nd, we have 4 combos for the +, - of d1 and d2, therefore, we can calculate the sum of v1, v2, and v3. We just need to ensure that each v is in [0, k]
5C: Longest Regular Bracket Sequence
Maintain a stack , with each entry knows which index it is from. when we pop the entry, we update the length or the length count
However, know that the length will be from the SECOND top, because that the last time we do not have a regular string. Otherwise, the composition case ()() will fail!
416C: Booking System
Consider the request with max p(i), and can fit into a table, we should accept it. Otherwise, we can improve the solution
Therefore, sort requests by p(i), and then get lower bound for each pi, if lower bound does not exist, reject
710C: Magic Odd Square
At row i <= n/2, E * i , and then n - 2 * i O, and then E * i again
At row i = n/2 + 1, only Os
At row i > n/2 + 1, E *(n -i), then 2 * i - n, O and then E * (n - i again)
We can prove by induction this arrangement is correct
567D: One-Dimensional Battle Ships
For each insertion, we update the max possible # of ships, by finding its lower and upper bound, and calculate the delta. When we reduce the total # LT k, we know it is bad
Note that we probably want to storage segments instead of points, because both lower_bound and upper_bound do not return number LT current one, unless you do iterator– yourself
552D: Vanya and Triangles
Seems the a brute force n^3 solution works, i.e., are 3 points on the line at all?
Since two points are enough to define a segment, we can just calculate the equivalent class of its segments, and deduct combination counts. The trick is how to calculate the hash of the segment??!!
632A: Grandma Laura and Apples
Just reverse the seqence of operations, and first number = 0
if current is half
ans += number * p
number *= 2
else
ans += number * p + p / 2
number += number + 1
204B: Little Elephant and Cards
Calculate number of each color. If such answer exists, we know max we have 4 possible canadidates, and then brute force each color => calculate # of face ups and only face downs, and get the minimal answer
237C: Primes on Interval
Sieve to generate all primes <= 10^6,
start from the first prime >= a, check if the a+k entries is still within the range. If it is , update the difference as answer
442B: Andrey and Problem
If pick only one, obviously, pick the one with highest prob.
Given we have a set, we can write down the exact formula. It becomes obvious that we can add a new memeber to set and increase overall probability as long as sum( p(i) / 1 - p(i)) > 1
So we can sort p(i)s and include them until that S <= 1
Codeforce Notes
448C: Painting Fence
Notice that, if we use a H, either it touches top of fence, or it has followup Hs to touch the top of fences
This means that, if the lowest fences are not painted by H, all are done by V
Consider the last fence, is it painted by H or V
if by H, then it is same as with n-1 entries, with base height 0
if by V, then it is same as with n-1 enights, with base height h(n),
ans(i, baseHeight) =
if(baseHeight >= h(i))
ans(n-1, h(i))
else min of
ans(n-1, baseHeight) + 1 //vertical
else
ans(n-1, h(i)) + h(i) - baseHeight //horizontal
234C: Weather
Scan through array, calculate prefix sum of number of positive, negative and 0, entries
scan through array again from 1…n-1, assume the index as the first positive number, and then calculate the potential cost
598D: Igor In the Museum
Flood fill problem: DFS for each component, keep track of the cell belongs to which component, and how many paints this component can visit
Upon seeing query, just retrieve pre-computed result
321A: Ciel and Robot (!!!)
Idea 1
consider the 1d case, and only 1 dimensional: then either it reaches that in the first run, or never if the overall delta is in the opposite direciton
If we can reach within 1 iteration => manhattan distance <= | s |
therefore, we try to move it so that its manhatann distance can be within |s|, after that we need to try 2|s| times traversing so that we can make sure all within bound spots are checked
Idea 2
To reach the path, the final form is (k * dx[s] + dx[p], k * dy[s] + dy[p]), we can iterate through all possible Ps, and then reverse calculate if k exisits
592C: The Big Race
if w = b, always tie
if w != b, tie <=> t is [lcm(w, b), lcm(w, b) + min(w, b)], note that lcm(w, b) start with 0
total = (t - 1)/lcm(w,b) * min(w,b) + min((t -1) % lcm(w,b), min(w, b))
353C: Find Maximum
just calculate f((prefix - 1) « i + 1s), iterate through all is, or max(v) = f(2^n -1) + max(v - 2^n)
295B: Greg and Graph
Speical case: assume the removal is in the order of n…1
this is basically the reverse of floyd washall
Note that
1. a brute force O(n^3) space will exceed memory limit. Therefore, we need to reuse the previous state. Note that this does not affect the
correctness, because len[from][inter][inter-1] = len[from][inter][inter]
2. Even if we care only the partial graph, we still need to build all nodes as we expand the inters, so that when the new inter node is
introduced, we can immediately connect with existing nodes. Also, because we only calc path passing inter nodes, we know we are not
calculating other SPs by accident
340D: Bubble Sort Graph
for any (a[i], a[j]) tuple in the original graph where a[i] > a[j] and i < j, we will add an edge to the graph, otherwise, the sorting routine will not stop
Look at the unknown, we know that conversely, the 2 Vs inside the MIS does not have edge <=> they must map to 2 a[i], a[j]s where i < j and a[i] < a[j]
Therefore, the answer is the longest increasing subsequence of the input permutation
182D: Common Divisors
Iterate through all distinct prime factors of gcd(len(s1), len(s2)), check if the prefix of length of that prime factor is a building block for both s1 and s2 Take into account it may have same prime factors many times
Codeforce Notes
159D: Palindrome pairs
Idea 1
For each i, we can calculate longest palindrome formed with i as center, while we are expanding, we can add that to RE[i+len], and LE[i - len], i.e., number of palidromes that with i+len as right end, and i-len as left end, and then we iterate all RE[i] * LE[j] with i < j
Idea 2
Suppose np[i], number of palidromes within 1…i, and then we calculate p[i][j], i.e., is the substring i..j a palindrome. so np[j] = np[j-1] + p[k][j] k over all 0..j
Final answer is np[i] * (np[n] - np[i]) over all i
219C: Color Stripe(!!!!)
k = 2, just need to brute force one of the two possible answers
k > 2, just go though each letter and change every same neighbors, just need to make sure it is different from both sides
225C: Barcode
Consider the base case n = 1
ans[i][true] = cost to make a solution, where column i is white
is min of
ans[i -1] [true] + whiteCost(i), ans[i-1][false] + whiteCost(i)
ans[i] [falst] = cost to make a solution, where i is the last black band of the stripe
ans[i-x][false] + blackCost(i-x + 1, i) ….ans[i-y][true] + blackCost[i- y][false]
final answer is min of ans[n][true], ans[n][false]
Codeforce Notes
229A: Shifts
precompute LC(i, j) = min number of left shifts to make t(i, j) = 1, and similarly RC(i,j)
To compute RC(i, j)
last1 = index of right most 1
LP(j, 0, m)
if (a[i][j] = 1)
cost = 0;
last1 = j
else
cost = last1 > j ? m - last1 + j + 1 : lasst1 - j - 1
To compute LC(i, j), is similar
after that BF on the final column that is all one = m * n
484B: Maximum Value
Notice that the value range is small, this and plus it is mod based means we can operate/iterate on value directly
we need to brute force on a(j) over all entreis, for each a(j), we look for prev(2a(j)), prev(3a(j)),… and update final result as we go
prev(a(j)) can be pre-computed. the total iteration cost is sum(2M/a(j)) <= O(MlogM). Note that this cost analysis is exactly same as Sieves.
Onte that we need to sort entries, and ignore duplicate entires, otherwise, the sieve analysis may not work: we may be repeating low values too many times
Codeforce Notes
425A: Sereja and Swaps
consider the case k = 1, of course, we have to swap in the biggest number
Therefore, for all (i, j) combos, sort entries within [i, j] and outside [i,j] separately, swap the first k entries within the band that is smaller than the outer band
244C: Checkposts
Find SCC of the directed graph, for each SCC, put one post on the one with lowest value. and then put one for each v outside SCCs
518D: Ilya and Escalator
O(n^2) dp, special case at cnt = 0 or cnt = n
313C: Ilya and Matrix
each entry appears at least once, therefore, we should force higher numbers to appear more e.g., biggest number can appear n + 1 times, the follwoing 3 numbers appear n times, the following 3 * 4 numbers appear n - 1 times…
337C: Quiz
ideally, if we do not trigger the k rule, the final score will be m. We need to limit the times the k rule is trigger as much as possible
of course, we want to trigger all double rules asap, i.e., at the beginning, otherwise, later correct answers will contribute to base
so the final solution would look like a continous band + bands of size k + a remainder band
L + (k - 1) * i + LR = m
L + k * i + LR = n
L + k * (n - m) + LR = n
we can bsearch on the length of the contnous band length, find the smallest that works
for the contnious band, the
So we can do fast exponatation here
f(n) = 2^(n-1) * f(1), where f(1) = 2 * k + 2 * k
Mistakes I made during implementation
- During fast expontation, the resursive call should appear only once
- During mod operations, need to defend against negative logs
- This means for those easy-to-generate cases, I should run the max case myself just to ensure the sanity
- In bsearch, need to delete if mid == low, if we want to be greedy to push it down further, because low <= mid < high
558C: Amr and Chemistry
Applying 1 and then 2 will have no effect. Therefore, all 2 ops must appear before 1)
Binary search the final answer may not work, because the result function is NOT monotonic!
In the final solution, the form can not have all 2s or 1s as the last step
if the final value is odd, then either all applies 2 or they are already like that the final value must be of form 1xxx10..0, the 1xxx1 part must be a prefix for all numbers. Therefore, we find longest common previx for all numbers, that will be the prefix of the equal value as well, and then we can bruteforce on number of appending 0s
557C:Arthur and Table
Suppose final max len is L, all l > L are removed, all l < L we keep all but |L| most expsive ones
We start from smallest length and maintain a heap, On reaching a new final length, we can
1. get top |L| expensive ones from the heap, these are the ones we save
2. calculate total = TotalE - bandE - saveE
3. add this band to the heap
We iterate through it and will get the final answer
137C: History
Note that we just need to decide if a event can be within any other event We need to sort by a[i] first, to eliminate one condition, and then when we scan through, we need to keep updating the rightmost endpoint
303A: Lucky Permutation Triple
So we can fix the first perm as 0…n-1
n = 1, just 0
n = 2, no answer
n = 3, (0, 1, 2) + (2, 0, 1) or (0, 1, 2)
ease to prove by induction than when n is odd, we just force b(i) = a(i)
(!!)How do you prove that a solution does not exist when n is even?
535C: Tavas and Karafs
Suppose we have (l,r), to be eaten, we need to have sum of height(l,r) <= mt, also height(r) <= t
Claim: this condition is enough Proof: induction on t t = 1 obvious
t = n, we apply the m-bites on the hightest entries if band within >= m, the inductive hypotheis is kept, and we can use t = n -1 case. if band within <= m, we know we can reduce all highest by 1, and since height(r) <= t, we can just eat them all
so we can bsearch for the highest r that satisfies that
297A: Parity Game
op 1 means that in the final result, # of 1s is increased by at most 1, if odd. or at most stay same if even. Therefore, if number of 1s in b > # of 1s in a + 1, impossible
consider a = 1, what b strings can we generate? => any string with 1 or 2 1s, and any a string with at least one 1 can be reduced to 1
consider a = 11, any b strings with <= 2 1s can be generated, because a can be reduced to 1 as well
procedure
0. if number of 1s is odd, apply op 0 to make it even
1. fit the 0 prefix of b by applying op 0
2. apply op 2 until it removes the first 1 in a
3. apply op 1 so that the 0...01 prefix in b becauses a suffix in a
4. now we can repeat ops 1-3 to make all 0...01 segs match, and finally use op 2 to remove redundant A prefixes
Bigtable: A Distributed Storage System for Structured Data
-
How does master discover the tablet server? How does client discover the tablet server? Compare and contrast the solution with similar problems in GFS
-
How does bigtable detect master health? What will happen if IT THINKS the master has failed? Compare this failure detection and recovery with GFS’s design on its master and primary chuck server
-
How does master detects tablet server’s health? What happens when the master thinks the tablet server has failed? Compare this interaction with GFS’s between its master and chunk server
-
How can master learn the the newly split tablet? How does GFS master learn the newly created chuck?
-
The commit log is in GFS. What did they do in case GFS is experiencing latency? How would recovery work in this case?
-
SSTables are in GFS already, why do we still need that commit log?
-
Sequence of actions on read? compare this with GFS’s sequence of reading
-
Sequence of actions on write? compare this with GFS’s sequence of writing
-
What kinds of info/metrics they monitor, and how did they do that? Compare their collections with GFS’s.
Codeforce Notes
436A: Feed with Candy
Consider the final solution: we should never eat 2 in a row. Therefore, we just try 2 cases, the first we eat is type 1 or type 2, and pick the best answer.
407A: Triangle
we can fix one point at (0,0), and then brute force to search for the second point coordinate with the a conditon, and then use the b condition to locate the 3rd point, or use the fact that (x, y) and (-y, x) form a right angle.
The part I failed is that I forget to check if the 3rd edge is parallel to any axis!.
460C: Present
Idea 1
Bsearch the answer, for each answer, we linear scan the array from left to right, also, we need to maintaind a d matrix to record the stop watering event
Idea 2
Greedy works on O(nm), but need additonal DS to speed it up
407B: Long Path
Claim: when we arrive at i + 1, all j < i has been visited even times already
Proof: By induction,
1. when we reach the i for the first time, by inductive hypothesis, prevous i-1 has beenn visted even times
2. if i points to itself, it will loop until it reaches even time, then move onto i+1
3. if move to previous entry, then following sequence of action is same as p visited for the first time. By the difference of the
induction on case p and case i, we know the number of ops introduced by the subseqenct actions at each cell is even. Now we have visited i
even time, and move onto i+1
Therefore, when we later on move back to previous entries, the # of actions is same as we visit it for the first time, because even times case = 0 times case
visit(i,j) : number of steps to visit j, from i for the first time, with all cells visted even times already, with the exception of i
visit(i, j) = visit(i, j -1) + visit(p(j-1), j -1) + 2 //1 for moving from j-1 to p(j-1), one for moving from j - 1 to j
base case visit(i, i) = 0
246D: Colorful Graph
For each edge, consider C(u) and C(v), we just update Q(c(u)) and Q(c(v)), and in the end select the set with max cardinality
similarly, it is as if we create a graph based on colors, and add edge between colors
459C: Pashmak and Buses
This one uses the trick of focusing on individual in collective actions
Consider the unknown: if the two students have the same assignment each day, what would that suggest?
Consider the sequece of assignment for each student, there are k^d possible choices, this means the two students will be friends if and only if they share exactly a sequence of assignments. Therefore, if n > k^d, we know it is bad. Otherwise, we can assign a unique sequence id to each student
My wrong approach focused solely on collective actions, which gives a maximum of k * d
C. Bear and Prime Numbers
My (Failed) Idea
1. generate all primes <= 10^7
2. factor all n numbers, for each factor, increase the match[factor] by 1
3. sweept through the array generated by 2, to compute the prefix sum
4. the range becomes difference of prefix sums
The basic idea is sound, however, the main problem is that factoring in step 2) is too slow. Instead, we can use the sieve to reverse the factoring process, and calculate answers for all numbers in a single runr, i.e., instead of increasing match[factor] by 1, we can increase it by f(p)
429B: Working out
Consider the cell they meet at (mi, mj), one has to enter horizontally, and the other has to enter vertically, otherwise, they will have more than 1 same cells, we just need to iterate all possible meeting points, each with 2 possiblites
also, note that the path is effective cut in 2, and each half must exist, i.e., meeting point can not touch border!
so 2 * 10^6 to calculate the cost in 4 directions + 10^6 brute force on meeting points
311A: The Closest Pair
the total number of ops <= n (n-1) / 2, therefore, if k >= that max # of ops, no answer
Notice that we only require delta(x) < d, then we can force all points to have to same x coordinate, and then put them on the same y-axis, the distance between them does not even matter
Codeforce Notes
371C: Hamburgers
Just bsearch the answers, and calcuate number of pieces one has to buy
471C: MUH and House of Cards
If we can form a higher level of tower, we can form a lower level of tower, i.e, we just need to look for the max level of towers
Given a tower of level h, we can know the min # of cards we need to build it. Therefore, we can just bsearch for the answer: find the highest h s.t., f(h) <= n
f(h) = (1 + h) * h /2 * 3 - h
185A: Plant
f(up, n) = 3 * f(up, n -1) + f(down, n -1)
f(down, n) = 3 * f(down, n-1) + f(up, n-1)
Eq 1 - Eq 2 and we discover a geometric series, and then we can calculate the power in O(logN)
623A: Graph and String
s(i) = b <=> deg(i) = n - 1
Note that if we have a solution, we can swap all as and cs and the result is still a valid result
s(i) = a <=> deg(i) = num of cs + (num of as - 1). Therefore, all other notes not b will be a
The remaining will be c, and then we scan through all edges again calculate degrees for each v.
4D: Mysterious Present
sort by (w,h), then O(n^2) DP, finally scan through answers that can fit the envelope
580D: Kefa and Dishes
At first I tried brute force generate all set, and potentially find all optimal matching in the set, but I can not find an easy algorithm to do so.
However, considering that a lot of graph-ish problem are very dp-like, we can locate the following relationship
score(set ending at i) = min(set ending at j, and i not in set) + score(i followed by j)
2^n * n states, each state takes n to update
417C: Football
|E| = k * n, therefore , k <= (n-1)/2
when k = 1, total | E | = | n | , with each edge has an outgoing edge , i.e., a cycle |
when n = 1 or 2, impossible.
when k is within the upper bound, we can just add edge the lexo sorted next k nodes
Codeforce Notes
402C: Searching for Graph
Base case: when n = 5. K5 has 10 edges, i.e., p = 0
when n = 6, K6 has 15 edges, i.e., p <= 3
idea : connect v6 to 2 + p existing nodes in n = 5
when taking a k element subgraph, total # edges = edges in K5 + | edges to v6 | <= 2 (k-1) + 2 + p => we are good |
when n = 7, K7 has 21 edges, i.e., p <= 7. So we can have A(6, 3), and then connect 2 + p - 3 edges to nodes in A6, if p > 3, if p < 3, we just add 2 edges to 1 and 2, and construct A(6, p)
when n = 8, K8 has 28 edges, i.e., p <= 12, So we can have A(7, 7) , and then connect 2 + p -7 edges to nodes in A7
i.e. we find a way to construct answers
Note the official solution has different insights
339C: Xenia and Weights
good(round, weightUsed, balance) =
LPE(lw, 1, 10)
if(w < b && s[w] == '1' && s[lw] == '1')
good(r, w,b) = good(r, w, b) || good(r-1,lw, w - balance)
10^5 states, each cell iterates 10 times to update => 10^6 run time
287C: Guess Your Way Out
1. by induction, if the leaf exists in left child, we do not have to leave that child. On the other hand, if the leaf does not exist, we
will have to visit all before
2. base case h = 1
3. ans(L, h, n) = ans(R,h - 1, n - 2^(h-1)) if exit in the left, or 2^h - 1 + ans(L, n)
4. ans(R, h, n) = ans(L,h - 1, n/2) if exit in the right, or 2^h - 1 + ans(R, n - 2^(h-1))
150A: Win or Freeze
Calcualte non-trival prime divisors of 2. Note that npd != 1, because mutliplication requires 2 numbers
1. if npd = 0, A wins
2. if npd = 2, B wins
3. if npd > 2, A takes all but 2 prime divisors, and and hand over result to B
4. Therefore, B wins iff npd = 2, we just need to write a quick check to see if npd = 2. So we start from 2, and find first divisor of d
<= sqrt(q), and check if q/d is prime or not
5. Alternatively, we can iterate though [2, sqrt(q)], to calculate # of divisors. Note that if after the iteration, q is not 1 yet, we
need to add that 1 prime divisor to the total count!
This is a typical problem that can be broken into 2 parts, with each part solved differently and answer to the first part provides context to the second part
Codeforce Notes
448D: Multiplication Table
final answer <= 10 * 10^10, we can just bsearch the answer, at each guess, check form 1 to n, we can check how many > answer, and how many = answer
if nl + ne = k
ne > 0 we are done. Otherwise, the guess is too small
if nl + ne > k
if nl <= k-1 we are done
otherwise, guess is too small
if nl + ne < k
our guess is too big
493C: Vasya and Basketball
Sort a and b, we then increase d from 0 to max(a[i], b[i]), with each step a value of a or b , and update score as we go
Note that the two pointer approach does NOT work here! The reason is that the sliding window generally works when everything is triggered by moving your end pointer, and front pointer merely follows. However, in this case, because we need to handle the bands of equal values on both sides, when the front pointer moves, it has to update the state, thus violates the pattern.
590A: Median Smoothing
For any a[i] = a[i+1], this segment will be stable
therefore, the seq can be partitioned by those continous bands
consider the case: 1010101..01 each round takes out a 0
consider the case: 1010….0, each round moves the leftmost 0 to the right, until all 1s and 0s are together
so we iterate through the seq, and detect those alternating bands
However, during implmentation, note that to detect alternating band, we should use a[i -1] != a[i] && a[i] != a[i+1], instead of just one side. Otherwise, the code can not handle the case of crossing the boundary of continous bands
621C: Wet Shark and Flowers
Two commmon tricks: reduce bands to prefixs, and caluclate complements.
Note that if we have to calculate Pr(A) + Pr(B) - Pr(A) * Pr(B). Often it is a sign we might as well calculate its complement!
1. Total E = sum of Ex for each shark
2. Pr(i generates a prime) = (# of primes <= R) - (# of primes <= L)
3. Pr(product of i and i+ 1 is prime) = 1 - (1 - Pr(i generates a prime)) * (1 - Pr(i+1 generates a prime))
Photon: Fault-tolerant and scalable joining of continuous data streams
The requirement is near-exactly-once processing. That means they need to ensure at-most and at least processing
-
How is at most once implemented?
-
How is at least once implemented?
-
The joiner update the state and then flush the result, but these 2 operations are not in a single atomic step. How do they mitigate the problem where state is updated, but joiner is unable to flush the result? Even after these mitigatations, what other scenarios could break the at least once semantics
-
How does event source know that which click is associated with with query?
-
Each click has a unique id, how/when/where is it generated?
-
The IdRegistry is shared across DCs with up to 100ms latency. in production each PaxosDB leader has ~ 10 requests per second. How do they use so few requests to handle millions of clicks/write requests per min?
-
IdRegistry is, unsurprising, sharded, but they have the requirement to dynamically increase/decrease # of shards. How do they ensure that the same click, before and after sharding changes, will be routed to the same shard?
-
How do they mitigate outage at a single DC?
-
Consider at two different DCs processing the same click, how do they resolve the read-write conflict, specifically, the two DCs tries to add the same click id to the registry at the same time?
-
How do they validate their join logic?
-
Due to business requirement and storage restrictions, they keep only the most N days of data. Due to the single master architeure of paxos, it makes sense to run GC only at the master replica. But what if master changes and clock time drifted? How do they defend against it?
Codeforce Notes
367A: Sereja and Algorithm
For string s if an algo can terminate, we know exactly |x| , |y| , and |z|
1. if len % 3 = 0, |x| = |y| = |z|
2. if len % 3 != 0, |x|, |y|, |z| can not differ more than 1
Claim: if a string satisfies the condition above, we know algo on it terminates
Proof: induction
1. base case n = 1,2,3, obvious
2. when len % 3 = 0, we find the longest prefix where |x| = |y| = |z| - 1, apply the hypthesis, and then apply the hypothesis again on the suffix , with the last 2 letters of the reformed prefix
3. when len % 3 = 1. we look for a prefix wher |x| = |y| = |z|, apply the hypothesis on the suffix, and then the prefix plus the first letter of the reformed suffix
4. when len % 3 = 2. similar to case 3)
479D: Long Jumps
Need 0 marks, if there exists a(i) + x, a(j) + y already
need 1 mark, if
1. 1 already exists from existing mark
2. m s.t. abs(a(i) - m) = x and abs(a(j) - m) = y. We can generate by one condition and then filter by the other, i.e., can do a linear
search on points, both direction with delta x, and then filter out those outside [0, l], and m +/- y is not part of existing points
Otherwise, need 2 marks.
103B: Cthulhu
Assume the graph is connected
This is similar to the sunny graph problem,i.e., when |v| = |e|, <=> each node has only 1 parent in directed graph, and note that the cycle has min length at least 3, so that is satisfied as well
617D: Polyline
If there is only 1 line, then they share all x or y, if and only if
If there are 2, then at least 2 share x or y, however, if we have shared x or y, we need do make sure z has the other axis which does not fall into the range of (xx, yx)
To make coding easy, we can do a bruce force search on the first line.
628B: New Skateboard
Idea 1:
num [i] [mod] = number of substrings ending at i, with mod as the substring # % 4
num[i + 1] [(mod * 10 + digit) % 4] += nums[i] [mod] num [i +1 ] [digit % 4] ++
Idea 2:
if number % 4 == 0, <=> the last 2 digits form a number % 4 == 0, therefore, we look for ll 2-letter substrings whose num % 4 == 0 , and add length of prefix to final ans
Codeforce Notes
540C: Ice Cave
If there is a solution, we know that
1. there a path from (r0,c0) to (r1, c1) without touching existing Xs
2. (r1, c1) must have at least 2 . neighbors if it is . itself, one for the incoming edge, one for stepping out and back
3. note that we do not need more than 2, because we can always reorganize our paths to use only 1 . neighbor, if there exists any path
Special case is r0 = r1 and c0 = c1, where you just need one . neighbor
463C: Gargari and Bishops
There are only 2 classes of diagonals:
-
captureable from (1,1), by only going diagonals
-
captureable from (1,2), by only going diagonals
proof: induction on the size of n, base case n = 2 and 3, each time n -= 2
Therefore, we go through each point, calculate which class it is from sum of the value it captures, and update that class maximum
A point is in class 1 in if abs(y -x) is even, otherwise when abs(y-x) is odd, it is class 2
pre-calculate all 2 * n diagonal sums for quick retrival. For all left leaning diagonals, cells with same i-j are on the same diagonal. For the right leaning, cells with i+j are on the same diagonal
552C: Vanya and Scales
Consider the number in w-nary representation, from the smallest digit form.
if it is 0, do nothing
if it is 1, we put our only weight on the RHS
if it is w - 1, we put our only weight on the LHS,
if it is other digit, we can do nothing
Note that we go small > large instead of the other direction is because we are doing additions and advancing a digit will affect downstream ops
329A: Purification
Consider with out E
Claim: we need at least n spells
Proof: induction on n, n =1 obvious n = i, if it can be done with < i spells, we can move those points to within the i-1 range, thus contradicts the inductive hypothesis
If we have n spells, how should we cast?
if we can find a X or T of Es, then we know there is no solution, otherwise, one must exist, because we can fill into that x or T hole and remove those Ts
if there is a row of E, consider the final solution, we know we should cast spell on each column once, and conversely such will satisfy a solution, since and X or T do not exist, we know that each column has a cell we can fill, we just fill it. After n rounds, all n columns are filled.
if there is a col of E, then we go after each rows with steps similar
Codeforce Notes
543A: Writing Code
Note that we are actually looking for a O(n^3) solution
way[i][j][k] : 1…i to write j lines of code with total bug k
ways[i][j][k] = sum of (ways[i-1] [j - j1] [k - j1 * a(i)) => This gives O(n^4)
Alternativly, it is equal to
ways[i-1][j][k] + ways[i-1][j - 1] [k - a(i)] //i doesnt write a single + i writes at least one line
Notice this recursion is really similar to how you calculate binomial coefficents
Note that although O(n^3) run time is ok, but space is not ok. Therefore, we need to reuse the overwrite dp state trick, i.e., we use a 2-d array, and on calculating each ways, ways[i-1][j][k] is actually the cell we are about to update, and ways[i][j-1][k-a] is the (j-1, k-a) cell we updated in this i round, i.e., i should be the outmost loop
703C: Chris and Road
Idea 1:
The two moving against each other is equivalent of tourist moving in diagonal against static object. Therefore, he has to wait if that diagonal cuts into polygon. If it does, then tourist has to wait until the line just meets a single point of the polytime. The amount of time to wait is equal to v/(horizontal shift to make the diagonal tangent).
Note we can use bsearch to brute force the shift
Idea 2:
We hold at the origin, until polygon passes a certain point, and then we go full speed toeard to end
If we may crash into the bus, then the final time >= (w - y)/u + x/v, for any point in the polygon.
Claim : the optimal way is the MAXIMAL value.
Proof: we will get better overall time if and only if we start earlier given our schedule. but it will hit the polygon, because we will have a case where final time < (w -y)/u + x/v, directly violating our conditions
What I did wrong during the contest
1. upon realizing that the man has to wait and it is hard to describe it, I should transform the problem into something more approachable,e.g. , the tangent approach or
the hold-release approach
2. the lastT < passTime check did not consider the case of points on the same y-axis, and therefore, need to introduce a max filter
Codeforce Notes
354A: Vasya and Robot
consider the final solution, 1…m will be picked by left, m + 1…n by right.
Therefore, we need to brute force on the m. To minimize cost, we need to alternative L and R as much as we can.
if m <= n/2, total cost = staticLeft(1…m) + staticRight(m+1….n) + (n - m - m -1) * rc else total cost = staticLeft(1…m) + staticRight(m+1….n) + (m - (n -m) - 1) * rc
22D: Segments
Consider the left most seg, just need to be greedy to nail at its right end, and then take out all segs starting to the right of that end
282C: XOR and OR
Idea 1
If they have different lengths, then of course we can not transform
Difference between XOR and OR: only difference is at 11, one is 0, the other one is 1
len = 1, can do only if they are same already
when len = 2
1. 00 <=> 00
2. 01 <=> 11 <=> 10
Claim: we can transform => we can start fixing from the tail (if side is obvious). Notice that each step is reversible, so we can force direction from a to b
proof: n = 1 or 2, obvious
for case n, if 0-0 or 1-1 case, then we can follow inductive hypothesis, with nop at the end
if 0-1 case, the only way to transform, if possible, is to use rule 2,i.e, change the tail to 10, for all possible ways.Therefore, we can fix the order of application
Idea 2
Claim: any non-zero string can be converted to each other
Proof: any string can be converted to 10…0. From right to left, just keep applying 01 => 10, 11 => 10
493D: Vasya and Chess
n = 2, w wins immediatley
n = 3, all will go down until reach the bottom, and then white got captured by black
n= 4, white goes down, not sure how that work…
but if white goes right, then from the black perspective it is same as n = 3 case, where both have to reach the bottom first. i.e., black will lose for sure!
n = 5, if white goes left, black should go right, so that we have similar case to n = 3, i.e. black will win
if white goes down, black will go down as well, since now it appears that, the width matters most instead of heights, and this reduces to the first case of n = 5
i.e., when n is even, we can always devise a strategy for white to win, when n is odd, we can always design 1 strategy for black to win