835D: Palindromic characteristics

2017-08-01 00:00:00 +0000

Insights

If a string is a k-palindrome, then it is also a (k-1)-palindrome!!!
```
Proof: simple induction on k
```
A string is a k-palindrome <=> it is a palindrome, and its left half is a (k-1)-palindrome
```
Proof: induction on k, base case k = 2
```

So we need to calculate maxK(l, r) = the max degree k of the substring [l, r]. This means we need 2 DPs: one for whether the substring is a palindrome, another for the max degree of its left half (which must be >= k - 1). In the end, we do a reverse prefix sum from the largest k down to 1.
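A minimal sketch of the two DPs and the reverse prefix sum, assuming 0-based indices; the names `is_pal`, `deg` and `count` are mine, not from the problem statement.

```python
def palindromic_characteristics(s):
    n = len(s)
    is_pal = [[False] * n for _ in range(n)]  # DP 1: is s[l..r] a palindrome?
    deg = [[0] * n for _ in range(n)]         # DP 2: max degree of s[l..r]
    count = [0] * (n + 2)                     # count[k] = substrings of degree exactly k
    for length in range(1, n + 1):
        for l in range(n - length + 1):
            r = l + length - 1
            if s[l] == s[r] and (length <= 2 or is_pal[l + 1][r - 1]):
                is_pal[l][r] = True
                half = length // 2
                # degree = 1 + degree of the left half (0 if the half is not a palindrome)
                deg[l][r] = 1 + (deg[l][l + half - 1] if half > 0 else 0)
                count[deg[l][r]] += 1
    # reverse prefix sum: a k-palindrome is also a (k-1)-palindrome
    for k in range(n - 1, 0, -1):
        count[k] += count[k + 1]
    return count[1:n + 1]
```

This is O(n^2) time and memory, which should be fine as long as the string is a few thousand characters.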

TiDB vs CockroachDB

2017-07-28 00:00:00 +0000

Last updated: Apr 8, 2019

Similar designs

Both follow the architecture of a stateless SQL layer on top of a replicated, strongly consistent KV store.
Both use RocksDB to serve the KV store. Consistency is guaranteed by the standard write ahead log (WAL) + replicated state machine (RSM) model. WAL is replicated by raft.
- This architecture is inspired by Spanner/F1.
- Each db has its own optimizations on top of vanilla Raft. I will not dive deep into the details here, because the actual optimizations keep evolving.
To implement transactions, both use the standard 2PC + MVCC idea.
- Both, by default, follow the serializable concurrency model, implemented by lease reads.
Both use online schema change, based on the paper "Online, Asynchronous Schema Change in F1", i.e., multi-stage schema change.

The similarities end at this high-level design. There are huge differences in the implementation details.

Performance

This is from the test I did in 2017. Note that I was completely new to TiDB; if I ran it again, the numbers would be much better.
- Use snapshot reads instead of consistent reads
For standard InnoDB MySQL, our planning expectation is 5k qps at 10 ms avg latency on an r4.4xlarge.

System design: distribute whitelist

2017-07-24 00:00:00 +0000

Consider we have a mail client that updates its blocked website list from our service. Design such a service so that we can distribute the blacklist. The list on the server is updated once per second, and the client polls our service once every 30 minutes.

Design: Obviously, we need to optimize for read performance

So the list has been updated 1800 times between two pulls by the same client

The client should send a request fetchUpdate(clientVersion), which should return the deltas since the client version

On the reader side, we should keep a list of deltas keyed by version, and return the deltas newer than the client version. In a k-v store that supports range scan, this should be highly efficient => because they are stored next to each other in the persistent layer

To further improve performance for clients that missed many updates, we can keep a compacted view every certain number of versions: if the version in the request < last compacted version, just return the whole thing. We should also do a log compaction from time to time to purge deltas that are too old.
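A minimal in-memory sketch of the delta log with compaction. The class name `BlocklistLog`, the `keep_last` parameter and the response shape are assumptions for illustration; a real service would keep the deltas in a range-scannable k-v store.

```python
class BlocklistLog:
    """In-memory stand-in for the versioned delta log (hypothetical names)."""

    def __init__(self):
        self.version = 0
        self.deltas = []            # (version, op, site), sorted by version
        self.sites = set()          # current full view
        self.compacted_version = 0  # deltas at or below this version are purged

    def apply(self, op, site):
        self.version += 1
        self.deltas.append((self.version, op, site))
        if op == "add":
            self.sites.add(site)
        else:
            self.sites.discard(site)

    def compact(self, keep_last):
        # purge deltas that are too old, remember where the log now starts
        cutoff = max(self.version - keep_last, 0)
        self.deltas = [d for d in self.deltas if d[0] > cutoff]
        self.compacted_version = max(self.compacted_version, cutoff)

    def fetch_update(self, client_version):
        if client_version < self.compacted_version:
            # the client missed purged deltas: return the whole thing
            return {"kind": "full", "version": self.version,
                    "sites": sorted(self.sites)}
        newer = [d for d in self.deltas if d[0] > client_version]
        return {"kind": "delta", "version": self.version, "deltas": newer}
```

The linear filter in `fetch_update` stands in for the range scan; with a sorted store it would be a seek to `client_version + 1`.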

In terms of global updates, we will designate 1 DC as master and the others as slaves, and use a cross region ZK to locate which one is alive and forward all write traffic to it. When the old master recovers, it will just talk to the new master and sync from the single source of truth to recover

Codeforces notes

2017-07-18 00:00:00 +0000

828C: String Reconstruction(!!!)

A brute force way will timeout. What can we do to remove duplicate operations?

My idea

So we need to keep track of contiguous intervals, and skip them if they are already filled.

We can keep each segment’s previous empty and next empty index, with path compression to help run time. However, this gives TLE?!

Official solution

Sort the strings by starting index, and keep track of the index before which everything has been filled. At each step, either advance the index and fill in the unseen part of the string, or ignore the current string because it is within the processed prefix
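A sketch of that sweep, assuming the occurrences are given as 0-based `(start, text)` pairs (my own input format) and that unconstrained positions default to 'a'.

```python
def reconstruct(occurrences):
    total = max(start + len(text) for start, text in occurrences)
    s = ["a"] * total   # unconstrained positions take the smallest letter
    filled = 0          # everything before this index is already decided
    for start, text in sorted(occurrences, key=lambda o: o[0]):
        end = start + len(text)
        if end <= filled:
            continue    # entirely inside the processed prefix
        begin = max(start, filled)
        s[begin:end] = text[begin - start:]  # fill only the unseen suffix
        filled = end
    return "".join(s)
```

Each position is written at most once, so the cost is the sort plus the length of the answer.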

831C: Jury Marks

Calculate the prefix sums of the scores, and sort both the prefix sums and B. We know the min value of B must match one of the entries [0, k - n] of the sorted prefix sums, so we can just brute force each potential match: check if it is feasible, backtrack to the initial score, and add it to the result set

Note that once we fix ONE mapping between b(i) and a(j), the whole sequence becomes deterministic, i.e., the answer <= k - n + 1
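A sketch of the "fix one mapping, then verify" idea. It is a small variation on the notes above: instead of sorting, it tries every prefix sum as the match for the first remembered value and verifies with a set.

```python
def count_initial_scores(marks, remembered):
    prefix = set()
    total = 0
    for m in marks:
        total += m
        prefix.add(total)
    # fixing remembered[0] = initial + p determines the initial score
    candidates = {remembered[0] - p for p in prefix}
    return sum(
        1
        for initial in candidates
        if all(b - initial in prefix for b in remembered)
    )
```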

Weekend Contests

2017-06-26 00:00:00 +0000

arc076b: Built?

A naive MST approach is too slow, namely, there are too many potential edges. Now that we are interested in an optimal solution, can we use some greedy insight to eliminate edges that are not actually required? This is where I got stuck!!!

What I should have done is to experiment with simpler cases, and see if there is any edge that I know will never appear in the final solution.

Consider 3 points a, b, c with xa < xb < xc: we know the edge from xa to xc is never needed in the final solution, because we can just keep xa-xb and xb-xc in the final MST anyway. Therefore, we just need to add the edges between immediate neighbors (after sorting by x, and again after sorting by y), which reduces the number of edges to O(n)
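A sketch of the reduced-edge MST, assuming the cost between two points is min(|dx|, |dy|) and using Kruskal with a small union-find.

```python
def min_road_cost(points):
    n = len(points)
    edges = []
    for axis in (0, 1):
        # only neighbours in sorted order along each axis can matter
        order = sorted(range(n), key=lambda i: points[i][axis])
        for a, b in zip(order, order[1:]):
            edges.append((points[b][axis] - points[a][axis], a, b))
    edges.sort()

    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    total = 0
    for cost, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            total += cost
    return total
```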

821D: Okabe and City

Easy to see that we can model it as a shortest path problem. But the problem is how to reduce the number of edges in the naive approach. Similar to the atCoder problem, I am stuck proving/reducing number of edges!!!

Insights

In the final solution, we know we will light each row and column at most once. Otherwise, there is cycle in our path
This means we will visit each node in 3 ways

a. by adjacent cells

b. by lighting the row above or below or same

c. by lighting the col above or below or same

Therefore, during the BFS traversal, we need to update all 3 cases. Because of insight 1, in cases b and c each cell gets updated at most once per case (if we use a pq to dequeue the next node to visit, we know that the first time we light a row/column is when we absolutely need it, and by insight 1 this is the first and only time we light it in the solution). Overall cost is linear in k
To detect if we can reach the destination, we need to check if the node itself is reachable, or anything from the last and second last row/col is reachable!!!

agc016c: +/- Rectangle

2017-06-22 00:00:00 +0000

Key insight I missed!!!

If the sum of any h by w rectangle is negative, how come the sum as a whole is positive?

Consider the 1 row case: it becomes clear that such a thing can happen only if W % w != 0. Otherwise, the sum of the whole row can be broken down into the sum of W/w non-intersecting subrows => but the sum of all these negative numbers can not be positive.

By extension, in the H by W case, as long as we can’t divide the grid cleanly into h by w rectangles, we can fulfill such a need. We can prove it by constructing one solution for all such cases.

Construction

We will assume the sum of each h by w rectangle is -1. Then the total number of complete rectangles is ncs = (W/w) * (H/h), with integer division. The number of padding cells is npc = H * W - ncs * h * w.

So we can fill each cell with value v = ncs/npc + 1 when i % h != h - 1 || j % w != w - 1. Otherwise, we fill the cell with value -v * (h * w - 1) - 1
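The construction above as a sketch. Returning `None` for the impossible case is my own convention, and the problem's bound on the cell values is not checked here.

```python
def build_matrix(H, W, h, w):
    ncs = (H // h) * (W // w)     # complete h-by-w rectangles
    npc = H * W - ncs * h * w     # padding cells
    if npc == 0:
        return None               # the grid tiles cleanly: impossible
    v = ncs // npc + 1
    special = -v * (h * w - 1) - 1
    # every h-by-w window contains exactly one "special" cell
    return [
        [special if i % h == h - 1 and j % w == w - 1 else v for j in range(W)]
        for i in range(H)
    ]
```

The total comes out to v * npc - ncs, which is positive by the choice of v.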

817D: Imbalanced Array

2017-06-15 00:00:00 +0000

Why I didn’t solve it

I got the idea, but I had a problem handling the tie breaker case!!!, i.e., what if within the same segment there are multiple minimal values, which one to pick?

To handle the tie breaker, we calculate, for each value, the number of segments in which this value is the LEFTMOST smallest value, i.e., we introduce a secondary condition that helps us identify a unique one.

Similarly, for the right side, we just use a slightly different procedure to calculate the first number to the right that is greater than the current one
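A sketch of the tie-breaking with two monotonic stack passes: each element is counted for the segments where it is the leftmost minimum (non-strict to the left, strict to the right). Handling the max by negating the array is my shortcut, not necessarily how the notes did it.

```python
def sum_of_minimums(a):
    n = len(a)
    left = [0] * n    # choices of left end where a[i] is the leftmost minimum
    right = [0] * n   # choices of right end
    stack = []
    for i in range(n):
        # stop at the previous element <= a[i]; ties to the left win
        while stack and a[stack[-1]] > a[i]:
            stack.pop()
        left[i] = i - (stack[-1] if stack else -1)
        stack.append(i)
    stack = []
    for i in range(n - 1, -1, -1):
        # stop at the next element strictly < a[i]
        while stack and a[stack[-1]] >= a[i]:
            stack.pop()
        right[i] = (stack[-1] if stack else n) - i
        stack.append(i)
    return sum(a[i] * left[i] * right[i] for i in range(n))


def imbalance(a):
    # sum of maximums = -(sum of minimums of the negated array)
    return -sum_of_minimums([-x for x in a]) - sum_of_minimums(a)
```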

Codeforces Notes

2017-06-07 00:00:00 +0000

703D: Mishka and Interesting sum (!!!)

Insights

answer to query = XOR of all elements in the segment XOR (XOR of its distinct elements)

To answer XOR of distinct elements => we need to solve something like "find # of distinct values in a segment", with the help of a Fenwick tree

Need to process queries sorted by ending index, so that each entry in the tree is added and removed at most once

Final runtime is O(nlogn)
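A sketch of the offline approach, assuming 1-based inclusive queries `(l, r)`. The Fenwick tree stores XOR instead of sums and keeps each value only at its latest occurrence.

```python
def xor_of_even_count_values(a, queries):
    n = len(a)
    prefix = [0] * (n + 1)
    for i, x in enumerate(a):
        prefix[i + 1] = prefix[i] ^ x

    tree = [0] * (n + 1)

    def update(pos, val):
        while pos <= n:
            tree[pos] ^= val
            pos += pos & -pos

    def query(pos):
        res = 0
        while pos > 0:
            res ^= tree[pos]
            pos -= pos & -pos
        return res

    answers = [0] * len(queries)
    last = {}   # value -> position of its latest occurrence
    done = 0    # positions 1..done are already in the tree
    for q in sorted(range(len(queries)), key=lambda q: queries[q][1]):
        l, r = queries[q]
        while done < r:
            done += 1
            x = a[done - 1]
            if x in last:
                update(last[x], x)   # XOR again removes the old occurrence
            update(done, x)
            last[x] = done
        distinct = query(r) ^ query(l - 1)
        answers[q] = prefix[r] ^ prefix[l - 1] ^ distinct
    return answers
```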

144D: Missile Silos

A modification of Dijkstra’s algorithm: try adding to the answer while we are expanding

679B: Bear and Tower of Cubes(!!!)

Consider the upper bound m: the higher the upper bound, the better the result

Assume the first block we pick is the max a s.t. a^3 <= m

Then the answer is 1 + num(m - a^3)

If the first block we pick is (a-1), then the answer is 1 + num(a^3 - 1 - (a-1)^3)

By the same logic, we can see the only possible choices that yield the best solution are the block with side a or a-1, because if we pick a-2, the answer will be 1 + num((a-1)^3 - 1 - (a-2)^3), which can not be better due to the greedy property of the upper bound m

During implementation, when we calculate the a-1 case, we need to take the cube in the same step too, because otherwise the two branches will remain on the same magnitude => linear solution!!! After taking the (a-1) cube in the same step, the second branch shrinks quickly and will not pose a problem to us
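A sketch of the two-branch recursion. It returns a `(blocks, volume)` pair so that `max` prefers more blocks first and then the larger volume. The memoization and the float-based cube root with integer correction are my additions.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def best_tower(m):
    if m == 0:
        return (0, 0)
    # largest a with a^3 <= m: float estimate, then integer correction
    a = round(m ** (1.0 / 3))
    while a ** 3 > m:
        a -= 1
    while (a + 1) ** 3 <= m:
        a += 1
    # branch 1: take the cube with side a
    blocks, volume = best_tower(m - a ** 3)
    best = (blocks + 1, volume + a ** 3)
    if a >= 2:
        # branch 2: cap the total at a^3 - 1 and take the (a-1) cube right away
        blocks, volume = best_tower(a ** 3 - 1 - (a - 1) ** 3)
        best = max(best, (blocks + 1, volume + (a - 1) ** 3))
    return best
```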

Educational Codeforces Round 22

2017-06-06 00:00:00 +0000

813B: The Golden Age

Need to take care of long long overflow. I use a log based solution to limit the max value for a and b. The official solution uses

while x <= L/ a
	x *= a;

and calculate the powers inside iteration
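A sketch of the division-based guard. Python integers do not overflow, so the guard only matters when porting to fixed-width types, but the loop shape is the same. The function names are mine.

```python
def powers_up_to(base, limit):
    powers = [1]
    # compare against limit // base so the product never exceeds limit
    while powers[-1] <= limit // base:
        powers.append(powers[-1] * base)
    return powers


def golden_age(x, y, l, r):
    unlucky = sorted({
        px + py
        for px in powers_up_to(x, r)
        for py in powers_up_to(y, r)
        if l <= px + py <= r
    })
    best = 0
    prev = l - 1
    for year in unlucky + [r + 1]:   # sentinel closes the last gap
        best = max(best, year - prev - 1)
        prev = year
    return best
```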

813D: Two Melodies(!!!)

Since we care only about the last entry of each melody, we will keep track of dp[x][y] => x and y are the last indices of the two lists

Obviously, we can enforce x > y due to symmetry.

The key is that to avoid intersections, we will update dp[x][y] only with dp[i][y], with i != y and i < x, because based on our dp definition, we know that dp[i][y] gives no intersection, and x has not been used by the sequence ending at either i or y.

Conversely, if we update dp with dp[x][i], we can not guarantee that y has not been used by the sequence ending at x yet.

The above proves that our scheme is sound. It is also complete because we cover all possible cases with a given y.

Another case to watch out for is the empty melody case, i.e., we need to decide when/where to start a new melody. To fix it, we change the index from 0-based to 1-based, and mark dp[0][y] as the case with a single melody

Then we iterate y from 0 to n, and scan i from y + 1 to n. While scanning, we need to know
- the max of dp[k][y] s.t. a[i] % 7 = a[k] % 7
- the max of dp[k][y] s.t. a[i] - 1 = a[k]

so 4 cases to consider when we update dp[i][y]

Codeforces Notes

2017-06-04 00:00:00 +0000

121B: Lucky Transformation

Note that when we hit a 447 or 477 starting at an odd position, we run into an infinite loop. Therefore, we just need to detect such an infinite case.

61D: Eternal Victory

Notice it is a connected graph with n-1 edges, so it is a tree. So we have 2 cases:

cover all nodes and then return to the root
cover all nodes without returning to the root

so we can just DP on these 2 cases
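For this particular tree the two cases collapse into a closed form: returning to the root costs twice the total edge weight, and not returning saves the distance to the farthest node. A sketch under that observation, assuming the root is node 1 and edges are `(u, v, w)` triples:

```python
def shortest_tour(n, edges):
    adj = [[] for _ in range(n + 1)]
    total = 0
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
        total += w
    # iterative DFS from the root to get every node's distance
    dist = [-1] * (n + 1)
    dist[1] = 0
    stack = [1]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if dist[v] == -1:
                dist[v] = dist[u] + w
                stack.append(v)
    # case 1 (return to root) = 2 * total; case 2 skips the longest way back
    return 2 * total - max(dist[1:])
```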

487B: Strip

When we start from the left and keep expanding, the first time we reach the case where max - min > s, we have to break the current strip into two. To keep max flexibility for later choices while satisfying the min length requirement, we just keep the last violating one

arc075E: Meaningful Mean

2017-06-03 00:00:00 +0000

Why I did not solve it

I did realize that the problem can be transformed into: given number a[i], how many numbers a[j] < a[i] are there with j < i
I also realized that we can use some sort of segment-tree DS to solve the problem above. But I didn’t have enough experience with such DS

Official answer

Compress the values into the range 0 to N

Use a Fenwick tree: it can quickly calculate a prefix sum (in this case, the total number of elements below the compressed value) with updates supported. Add each such count to the final answer
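A sketch of the compress-then-count approach, assuming the task is to count segments whose mean is at least K. That makes the comparison on the shifted prefix sums `<=` rather than `<`.

```python
def count_segments_with_mean_at_least(a, k):
    # mean(a[j..i]) >= k  <=>  prefix[i] - prefix[j] >= 0 after subtracting k
    prefix = [0]
    for x in a:
        prefix.append(prefix[-1] + x - k)
    # coordinate compression to ranks 1..m
    rank = {v: i + 1 for i, v in enumerate(sorted(set(prefix)))}
    size = len(rank)
    tree = [0] * (size + 1)

    def add(pos):
        while pos <= size:
            tree[pos] += 1
            pos += pos & -pos

    def count_up_to(pos):
        res = 0
        while pos > 0:
            res += tree[pos]
            pos -= pos & -pos
        return res

    total = 0
    for p in prefix:
        total += count_up_to(rank[p])  # earlier prefix sums <= current one
        add(rank[p])
    return total
```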

Codeforces Notes

2017-06-02 00:00:00 +0000

319B: Psychos in a Line (!!!)

This problem is not easy even though it appears so.

Idea 1

Observe that we definitely want to know something about the index j such that

1. j < i
2. a[j] > a[i]
3. j is the biggest among all indices that satisfy 1 and 2

if a[i] < a[i - 1], then time[i] = 0;

if a[i] > a[i - 1], then

if time[i - 1] = -1, i.e., always alive, then time[i] = -1;

if time[i - 1] >= 0, and there exists j that satisfies the conditions we mentioned above, time[i] = max(time[(j...i)]) + 1. Note that this upper bound is achievable and thus is the optimal answer

Otherwise, time[i] = -1, it is always alive;

To calculate j, we can use a stack to keep track of numbers, and popping until we reach the one that is greater than the current one, and then add the i onto the stack

Note that number of steps = time to kill + 1

Also note that, the answer for case 3 2 1 is 1 instead of 0!
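A sketch of the stack version of Idea 1. Here `step` is the 1-based step at which a psycho dies (the "time to kill + 1" from the notes), and 0 means it is never killed.

```python
def steps_until_stable(a):
    stack = []   # (value, step at which it dies; 0 = always alive)
    answer = 0
    for x in a:
        died = 0
        # everything smaller between j and i must be gone before j can kill i
        while stack and stack[-1][0] < x:
            died = max(died, stack.pop()[1])
        # a bigger element remains to the left => killed one step later
        step = died + 1 if stack else 0
        answer = max(answer, step)
        stack.append((x, step))
    return answer
```

Each index is pushed and popped at most once, so the whole scan is O(n).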

Idea 2

Consider a brute force O(n^2) solution

For each turn, we maintain the state of the array, and we add new items one by one to all turns

If end of the list > current to add, we mark current dead at turn i + 1, and add it to all at turn 0 to i 

The final answer is the max of all kill times

Now we need to optimize it to O(n)

For space, obviously, we need to keep track of only the right most entry

For time, instead of iterating all entries, we just put it on the stack, and remember them top to bottom 

each interval got put on stack at most once.

359D: Pair of Numbers(!!!)

My idea

Assume they are all distinct. Suppose we have one such collection: it must be of V shape in terms of values

So we start with each local minimum point, and try expanding. Because of that V shape, we know we cover each point at most once. So we try calculating the longest right-expanding sequence, then reverse and calculate the longest left-expanding sequence, and then sum them up

Note the corner case where all values are same, we will have multiple entries

Official solution

Binary search on the value of r-l, and use a data structure to answer GCD(l, r) and min(l, r), e.g., a segment tree-ish DS

358D: Dima and Hares(!!!)

Brute force DP seems to be O(n^3). How can we improve that?

Notice that we have a lot of repeated calculation here: placing 1 and then 3 is the same as placing 3 and then 1, i.e., we need to find a different angle.

Consider the leftmost entry. We have two choices

1. Fill 0 while 1 is empty => equivalent to totalCost(1...n, leftFilled) + cost(0, single)
2. Fill 0 while 1 is filled already => equivalent to totalCost(1..n, leftNotFilled) + cost(0, double)

For easier coding, we can reason the same thing but start with the rightmost
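A sketch of the resulting O(n) DP, scanning left to right. I assume `a[i]`, `b[i]`, `c[i]` are the joys when 0, 1 or 2 neighbours are already fed; the state names are mine.

```python
def max_joy(a, b, c):
    n = len(a)
    # before: best for hares 0..i when hare i is fed BEFORE hare i+1
    # after:  best for hares 0..i when hare i is fed AFTER hare i+1
    before, after = a[0], b[0]
    for i in range(1, n):
        new_before = max(before + b[i],   # left neighbour already fed
                         after + a[i])    # no neighbour fed yet
        new_after = max(before + c[i],    # both neighbours already fed
                        after + b[i])     # only the right neighbour fed
        before, after = new_before, new_after
    # the last hare has no right neighbour, so it counts as "fed before" it
    return before
```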

Codeforces Notes

2017-06-01 00:00:00 +0000

333B: Chips(!!!)

Assume there are no obstacles, what is the maximal number we can add?

Because of rule 2 and 3, we can not add chips to both ends
Because of rule 2, given i, we can add to either row i or col i, but not both
However, if we add to (i, 1), we can also add to (n, i), (n - i + 1, n), (1, n - i + 1)
Therefore, the maximal possible is around 2 * n. Note that if n is odd, the middle row is the special case.
Also, we need to try all from 2 to n - 1, but need to consider the case where previously added chips block later additions!!!

115B: Lawnmower(!!!)

I had problems implementing the solution quickly and cleanly. The reason is that my greedy algorithm was not concise enough.

Suppose we’re on a row, facing right. This strategy says that we need to move to the right as long as there is a weed to the right of us either on this row or on the row directly below us.

Note that this insight can handle the empty row cleanly. Also, we need to keep track of the last row with weeds to calculate how many move down operations we need

34D: Road Map

The old map represents a parent-child relationship easily. Therefore, we can construct the graph and do a simple tree traversal to update the new map

367B: Sereja ans Anagrams

Notice that p is fixed. We can just group entries by index mod p, and then do a sliding window scan to match all possible starting indices

776D: The Door Problem

My WA approach(!!!)

Suppose there exists a solution, what property it has?

If the door is closed: then t(d1) + t(d2) must be odd, i.e., they have different oddity
If the door is open: then t(d1) + t(d2) must be even, i.e., they have the same oddity

Note that each switch has only 2 choices, even or odd; any number of toggles can be reduced to this choice.

Conversely, if we can find an arrangement that satisfies such oddity requirements, then we know in the end all doors are open => i.e., we are good, and this condition is strong enough.

So what we can do

When two switches must have the same oddity, we union them
When they must differ, we union each one with the other’s opponents
In the end, we scan through all contradictions again, to make sure they are not unioned, i.e., no conflict

Official Approach

Since we care only about the oddity, we can reduce the toggles to 2 cases, either 0 or 1. Model switches as nodes and doors as edges, try coloring nodes starting from switch 1, and see if we run into any conflicts

Note that implementation wise, it is OK to have duplicate edges in our DS. We also need to consider the case of multiple components - I missed that case!!!
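A sketch of the coloring approach, assuming 0-based ids and that every door is controlled by exactly two switches. The outer loop over all switches is what covers multiple components.

```python
from collections import deque


def can_open_all(door_open, switch_doors):
    controllers = [[] for _ in door_open]
    for s, doors in enumerate(switch_doors):
        for d in doors:
            controllers[d].append(s)

    adj = [[] for _ in switch_doors]
    for d, (s1, s2) in enumerate(controllers):
        differ = 0 if door_open[d] else 1   # closed door: parities must differ
        adj[s1].append((s2, differ))
        adj[s2].append((s1, differ))

    color = [-1] * len(switch_doors)
    for start in range(len(switch_doors)):  # one BFS per component
        if color[start] != -1:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, differ in adj[u]:
                want = color[u] ^ differ
                if color[v] == -1:
                    color[v] = want
                    queue.append(v)
                elif color[v] != want:
                    return False
    return True
```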

Codeforces Notes

2017-05-31 00:00:00 +0000

799D: Field expansion

I tried a greedy approach, but couldn’t prove it. This is often the sign a greedy approach won’t work.

Insight

Obviously, we only care about the highest factors
The key insight I missed is that given the input is only 10^6, number of factors <= 34, i.e., we can do a brute force dp approach!!!

maxV[i][h] = max w value when we are at factor i, with h value h

maxV[i][h] = MAX(maxV[i-1][h] * f[i], maxV[i-1][h/f[i]])

when h or w is greater than the bound, we will use a dummy value for that!

140D: New Year Contest

Obviously, we pick the quickest problems, as long as total sum <= 720

Claim: solving the problems in increasing order of length gives the best solution

Proof: consider the last choice that differs from the sorted order. If its finish time is before midnight, we can just make the change, and it won’t affect the overall result. But if the finish time of that block is after midnight, let us move the longest unassigned one after the different choice

penaltyAfter <= oldPenalty - min(diffFinishTime, longestLen) + 0, + 0 because the end time is the same => i.e., such a swap never hurts, so the sorted order is optimal
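A sketch of the greedy, assuming times are in minutes from 18:00, the first 10 minutes are lost to reading, midnight is minute 360 and the contest ends at minute 720.

```python
def new_year_contest(times):
    elapsed = 10            # minutes since 18:00 spent before writing starts
    solved = 0
    penalty = 0
    for t in sorted(times):  # shortest problems first
        if elapsed + t > 720:
            break
        elapsed += t
        solved += 1
        # finished before midnight: submit at midnight with zero penalty
        penalty += max(0, elapsed - 360)
    return solved, penalty
```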

229B: Planets (!!!)

The idea is similar to Dijkstra; the only difference is that upon expanding, we check if we need to wait

I tried re-inserting the entry but got TLE, so we just wait until we can expand to the next edge.

The special case is when we are at n already

Codeforces Notes

2017-05-30 00:00:00 +0000

811C: Vladik and Memorable Trip

Note the English translation of the problem is a bit misleading: the problem actually means that for each city’s travellers, either all are on the same segment, or none is selected at all, i.e., it is impossible to have the case where some people going to city x are picked while others are not

This means that we will have a cascading effect, i.e., we need to merge all overlapping [l, r] segments until there is no overlapping one, if we ever want to pick one city in the overlapping segments

After that we can just do a brute force DP: for each r that is an end, look at all starts l, calculate the XOR, and update the DP result.

808D: Array Division

Since all numbers are positive, we can calculate the prefix sums, and then for each number, we look for (half of total) + that number, and see if there exists one such prefix sum.

807D: Dynamic Problem Scoring(!!!)

What I did wrong

I was trying to brute force search on the final band of each problem, but then realized that one new account will affect the bands of ALL problems!!!

Insights

From a new account, implementation wise, we don’t need to submit wrong answers at all, since only the total number of accounts matters
For each problem, we need to either increase its good rate or lower its good rate
Consider the case when we only need to increase the good rate. To minimize the number of accounts created, each account should submit right answers to as many problems as possible, so as to improve the pass rate of multiple problems, e.g., we do better with one account submitting right answers to two problems than with two accounts each submitting to one
When we need to lower the good rate, the account just does nothing at that problem, which is optimal
Therefore, we have an optimal scheme that satisfies 3 and 4 at the same time, the key insight being that because of 2, the two cases don’t interfere with each other
Note that we can not binary search on the answer, because we can not pile on new accounts when there are problems we have not solved => that will only lower the rate of the problems we want to increase
Worst case, we just need to increase n by 32 times, to dilute all problems to 500 or make them 3000, so it is 32 * 5 * 120

794C: Naming Company

2017-05-29 00:00:00 +0000

The final string will have n/2 chars from each person (Oleg gets the extra one when n is odd). Obviously, Oleg will take his smallest n/2 chars, and Igor his biggest n/2 chars from their own sets, since otherwise we can improve the answer from each one’s perspective.

When O is playing

if current O’s smallest < I’s largest, O will need to fill the left most unoccupied with its own smallest, otherwise I will fill it with largest => not optimal

if current O’s smallest >= I’s largest, O will need to fill the right most unoccupied with O’s largest

When I is playing

if current O’s smallest < I’s largest, I’s largest goes left most

if current O’s smallest >= I’s largest, I’s smallest goes to right most

What I did wrong

The tie breaker case: the player should not occupy the leftmost position with their own char. Instead, they should try to force the opponent to take that position, while preserving their best candidate for the next round!!!

809B: Glad to see you!

2017-05-22 00:00:00 +0000

Why I didn’t solve it

I realized that I can use bsearch and detect the direction of points by asking (i, i+1). However, I got stuck at handling the answer to my query => i.e., what does the answer mean?

Insights

(mid, mid+1) returns true => there must exist one point to the left of mid, including mid
returns false => there must exist one point to the right of mid+1, including mid + 1
So we can just do a bsearch, and we know there must exist a point in our range, and we can definitely find one => when r - l = 1
Now we repeat a similar bsearch in [1, mid) and (mid + 1, n]; we know that there is at least one point in these intervals for sure
Note that to root out false positives, we need to specifically check the second point’s validity by issuing another query!!!
Need to handle the case where the first point found is 1 or n with special care!!!

arc073c: Ball Coloring

2017-05-20 00:00:00 +0000

Insights

Obviously, the global max and min value will appear in the final formula.
When global min is red, we need to paint the smaller value in each bag red to minimize red max; this also anchors blue max to the global max and maximizes the blue min => since it is the bigger one of each bag already => i.e., it must be optimal
From now on, we try checking all locally optimal solutions, classified by the rank of red min, and the globally optimal solution is the best of the locally optimal solutions.
When the globally second smallest value is in the same bag as the global min value, we must paint global min blue, which is exactly the same case as 2)
When the globally second smallest value is not in the same bag as the global min value, the optimal solution is exactly the same as case 2, except that for the global min pair, we paint global min blue, and its counterpart red
Therefore, we can check all locally optimal points one by one, by doing such flipping, and increase the possible rmin values. Notice that this implies both the max value and the min value will be blue in these classes

This turned out to be a very hard problem for me to solve. The second try gives WA too!!!

793C: Mice problem

2017-05-14 00:00:00 +0000

What I did wrong

Since the given precision is 1e-6, the epsilon should be 1e-12, because the base of the fraction arithmetic is (1e-6)^2
To calculate the range of valid time, I calculated the segment as a whole, instead of separating it into two dimensions. This introduced additional floating point calculation and thus errors!!!

Idea

All mice in range <=> each mouse is in range

Mouse in range <=> x is in the x-axis range and y is in the y-axis range. So we just need to search for a time range that satisfies such a condition.

Therefore, we just calculate the time range during which the mouse stays in the x-axis range and in the y-axis range, respectively. If there exists a common intersection, we know it is the answer; otherwise, it is impossible

When v < 0, we can just mirror the whole thing to the opposite sign => the result is still the same yet saves our coding

Reflections

It seems that I keep having trouble with problems where we are searching for things that satisfy two conditions at the same time. The main difficulty almost always seems to be that the two conditions interfere with each other.

Therefore, the key step is to process the two conditions in two steps, with each step handling one, without worrying about the interference of the other. Ideas include

search for the first condition, and when we are inside the first condition, look for the second condition

Separate the construction, such that each part handles one condition, and the final answer is a simple composition of the parts, with proof that both parts can be reached at the same time

similar to 1, filter by the first condition, and then filter by the second condition; the result should be an intersection of the two parts

As for this problem, we need to consciously translate the original condition into two conditions and then handle the two in two steps. This way is easier to code, and in general often is the only way to solve the problem!

799C: Fountains

2017-05-12 00:00:00 +0000

When one is C and one is D, simply greedily choose the highest beauty within the limit for each type

When both fountains are of the same type, call them F1, F2. If we search over all F1, we need to answer the query: max beauty among all fountains with price <= T - P(F1). Therefore, we can build such an array while we scan through; moreover, since maxB(v1) >= maxB(v2) whenever v1 > v2, we can bsearch on the price.

Pseudo code

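sort the (price, beauty) tuples PB by price
maxB(0) = 0    # maxB(i) = max beauty among the i cheapest fountains
ans = 0

for i in 1 to n
  # largest j in [0, i - 1] with PB(j).price <= totalValue - PB(i).price
  j = bsearch(0, i - 1)
  if j >= 1
    ans = max(ans, PB(i).beauty + maxB(j))
  maxB(i) = max(maxB(i - 1), PB(i).beauty)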
