On writing weekly report
- The presence of rigid, email-based status reports comes down to control, a lack of imagination, and a lack of trust in the organization. However, strategic assessment of the week is still needed, i.e., what is working and what is not
- Often a good candidate for 1:1
Common parts:
- Date and list of stakeholders
- Previous’s week’s plan
- This week’s progress
- Next week’s plan
Previous’s week’s plan
- Quote objectives, as reminder to the management. Keep it CONCISE
This week’s progress
- May order by status, keep list of completed milestones, may keep them at the front, espeically if it is ahead of schedule.
- What issues and challenges did we run into? What need to be addressed?
- Any part of context we need to update? Include both before value and after value
- What results, not effort, did we get?
- If possible, include completion percetage and scheduled completion date. Note that status (is it on track?) is more important than progress
- Organize by projects, NOT employee’s tasks!
- Mention projects even if it has NO progress
- May use metric dashboards and other visualization tool when there are too many metrics.
- Maintain a daily log on activities
- Do not wait for better result. Send report in-schdule, and we can update in person later.
- Focus on accompolishments than tasks! Task is for ICs!
Next week’s plan
- Objectives for the next week. Should include all items from the “In Progress” section
- May include important events, e.g., meetings
- Milestones should be SHARP, we can even trade being measurable for sharpness
-
May include relationship with long term goals
- Managers shall not just read, but make comments in the employee’s report. Giving employees feedback is crucially important.
- Weekly report is an effective way to get credits. Also, improves your visibility
- Send it 24 hours before the report
References
On measuring performance
Getting performance data
- A better qustion to ask during performance testing is, while maintaining my SLO, what is the max throughput I can achieve?
- Develop test harness early, so that you can enough time to refine experienments and metrics to collect - this is a time consuming process!
- One script to automate the running of the script, and another one to automate the generation of the report from the collected data
- Add additional tests to ensure config change is indeed deployed
- Defend against fast but incorrect implmentation. Include correctness check as non-timed part of the experiment,e.g., check data structure invariant after the timed portion
- Micro benchmark is for certain aspects of the system. They are never good representation of the overall system health. Use maro-benchmark that represents the real-life workload to show the overall picture
- What you measured is often response time instead of service time. Due to queueing and bufffering, customers may experience latency while you do not see it on your monitor.
- Introduce a base line across the benchmarks and compare with the baseline instead of comparing with each other.
- Baseline is often the optimal or state of the art solution
- Do the same run twice in a row to verify unintended caching effect, and do it twice AFTER some other runs so to see the cold + hot cache case
- Use neighbors of the regular/power strides and random points, so that you are very clear if you ran into corner cases or not
- Run untimed warm-ups
Interpreting performance data
- Many published performance nubmers measure what is optimized or the underlying system optimizes what is the measures
- Seperate calibration workload from evaluation workload(?). Otherwise, no way to show the predictive power of the model, which is the most important part of the model
- Need to show clearly the trade off areas and regression performance numbers,i.e., which numbers are better, which stays same, and which becomes worse?
- If subset of test is used, explain why some can be omitted. Note the omitted case may display a different trend at all
- Throughput and load analysis should come together
- Percetile and average metrics are very misleading, although they are good ways to tell stories. Rely on histrogram instead if possible.
- Double spikes distribution often the sign of multple behaviorial groups
- The “shoulder” of the gamma distribution often a sign of a small but separate behaviorial groups
- Due to number of requests to serve a single user, even good 99.9% percentile means pretty likely your user will hit a bad latency. Max latency carries a lot of weight in showing problems.
- Performance degradation easily got muffled in avg metrics. Basically averages are a good way to fool client to make them feel safe.
- Raw average must accompany with standard deviation. May also need Student’s t-test for significance
- Don’t use mean over a set of different benchmarks, even worse if they are normalized. Use geometric mean instead
- Use geometric means for multiplication problems, and harmonic mean for speeds, e.g., when speed s1 for t1 and speed s2 for t2.
- Give complete results instead of ratios
- Compare the exact data points instead of graphs over the same interval
On percentile
The most important thing about a metric is its distribution, and P99 does not say anything about the remaining 1% and previous 98%. Therefore, we can construct many counter-intutitive cases, and normally can not apply transitivity and composition directly
Suppose P(99,X) = 100ms, X consists of two sequential steps A and B, P(99,A) = 60ms
- Can P(99,B) > 100 ms? No
- If slowest 1% of A is matched with slowest 1% of B, then we can conclude P(99,B) = 40 ms
- Can P(99,B) in (40, 100] ms? Yes
- suppose P(98,A) = 1ms, P(100,A) = 60ms, P(100,B) = P(98,B) = 99ms, P(1, B) = 1ms
- Then we can constuct the distribution as P1(X) = 61ms, P(99,X) = 100ms, P(100,X) = 159ms
- Can P(99,B) in (0, 40) ms?
- Yes, suppose P(99,B) = 1ms, P(100,B) = 40ms, P(99,A) = P(1, A) 60ms, P(100,A) = 99ms
- Then we can construct P(99,X) = P(100,X) = 100ms, P(98, X) = 61ms
X consists of two sequential steps A and B, P(99,A) = 60ms, P(99,B) = 40ms
- Can P(99,X) > 100ms?
- Can P(99,X) be less than 95ms?
- Yes to both cases, just use the two constructions we have in the last sections
X sends requests to M, which will batch requests into a single DB request
- Can P(99,X) > P(99,M)? Yes, due to batched waits
- Can P(99,M) > P(99,X)? No, the only thing we can confirm is P99 of composition >= P99 of parts
In Prometheus
- Prometheus only records count and sum of buckets.
- If max bucket is too small, qunatile will record only the max bucket. The end effect is a straight line
- If bucket range is too big, too many data will fall into the same bucket, with the assumption that they are evenly distributed within the same bucket
References
Problems from the weekend
exawizards2019_c: Snuke the Wizard
Note that after the each operation, the relative position of golems with the same letter remains the same. This means the golems moved out will form two continous bands ends at the left and right end respectively (may overlap). So we just bsearch on each’s end points
exawizards2019_d: Modulo Operations
The only effective sequence is strictly decreasing. Therefore, we sort the set in decreasing order and do a knapsack-ish DP. Note the value is only 1e5, so it can fit into memory.
Consider the i’s step, f(x, i) = sum of results over all permutations, the permutation can be in two cases
- s[i] is the first of the permutation, i.e., this part contributes f(x %s[i], i + 1)
- s[i] is NOT the first of the permutation, i.e., it can be in any of the N - i - 1 positions, each of them is in effect an i + 1, case, i.e, this part contributes (N-i -1) * f(x, i+ 1)
1105D: Kilani and the Game
Just BFS on each player in turn, don’t expand if the neighbor is already colored. Also keep track to see if everyone can progress
On PACELC
Note that C,A, and P mean very specific things in this context.
- C: linearizability among all servers’ requests, as if exeuting on a single node. If B starts after A, then B can’t see the state before A
- This definition is different from the C in ACID
- Snapshot and MVCC are by design non-linearizable
- A: every request to ANY NON-FAILED node must return a response if it is successful or failure
- Node failure is outside the scope of CAP!
- Returning failure still gives A
- P: network is allowed to lose any messages from one node to another.
- Note this does not mean packet loss - in the proof they use a TCP-ish protocol already. However, from a node’s perspective you can not distinguish between a failed node and partitioned node
- Retrying communication indefinitely is in essence choosing C over A.
- Trivial CP system: ignore all requests, return “no result” => Not being available at all is enough to be CP
- Trivial AP system: return v0 to all requests
- It is possible for a system to be none of CP, AP, or CA! Most systems don’t need atomic consistency or perfect availability during P anyway.
PACELC
Network partitions(P) are too likely to happen, so CA is useless in practice, but else(E), even when P is not happening, the system should ideally have the perfect CA, i.e., the trade-off between C and latency(L).
If a system chooses to provide C over A in the presence of P, it can:
- refuse to respond to some requests, e.g., don’t write in two phase commit
- shutdown entirely
- Only r/w if the data’s master node is inside the partitioned component
Case studies
- Paxos
- AP? no when the link to the leader is cut. Note AP protocols in general need conflict resolution
- CA? because it is async model, so liveness can not be guaranteed
- 2PC
- CP? not consistent
- AP? not available, because we can not do any multi-node action
- Not even CA in this sense
- What if master failed? the view is not consistent!
- When master failed and prepared, all slaves are locked
- A common CA solution is the full strict quorum control
- Master-slave replication
- Not A - when client is partitioned from the leader, but not follower
- Cassandra
- Not A - when client is partitioned from the leader, but not follower during quorum read
Codeforces Notes
1060D: Social Circles
It is one of those greedy problems where you use the extreme value/boundary conditions as the starting points of optimization!!!
Suppose you have the longest LHS, then you need to right the longest RHS to minimize wait, we union the merge, and keep going until we merged all.
1032D: Barcelonian Distance
A, B can either leave, the avenue by x or y axis, so we have 4 choices on top of manhantann distance.
490D: Chocolate
First op: reduce a 2 factor Second operation: increase a 2 factor and then increase 3
So we just reduce the 3 factor to be the same, and then remove the 2 facters. Note that we can not apply the second ops to only 1 side!
993B: Open Communication
The answer has only 9 * 8 * 7 = 504 choices, for each pair, we see if both exist in the set, if there is only 1 possible pair, we know it is good, > 1 uncertain, = 0 , invalid
Notes on the Mythical Man-Month
Chapter 1. The Tar Pit
- Effort to create standalone program effort * 3 = programming system, i.e., integratedly tested with other components
- Effort to create debugg progrom * 3 = programming product, which includes comprehensive tests and doc
- What shipped to user is programming system product
Chapter 2. The Mythical Man-Month
- Don’t mix effort with progress during estimation.
- Man-month is an dangerous metric as it gives the false impression that manpower and time are interchangable
- Rule of thumb
- 1/3 planning
- 1/6 coding
- 1/4 component test and early system test
- 1/4 complete system test
- Hard to esitmate without hard data. Often need to have a backbone to defend again hunchs
Chapter 3. The Surgical Team
- Given the 10x rule, if a 200-man project has 25 managers who are the most competent and experienced programmers, fire the 175 troops and put the managers back to programming.
- Surgeon in the team settles the difference. The system should be the product of at most TWO minds to maintain conceptual integrity
- Small sharp team is best for throughput but not enough output for big project
- Solution: clean separation of architecture and implementation
Chapter 4. Aristocracy, Democracy, and System Design
- Design of an implementation, given the architecture, should have much design creativity. Again, need clean separation of architecture and implementation
- Architecture, implementation, and realization can be done in parallel
Chapter 5. The Second-System Effect
- When an architect sees an estimate he disagrees with, he can either cut the design or suggest a cheaper implementation, but since this part is builder’s responsibility, the architect should:
- just suggests and does NOT dictates!
- accept another design that meets the goal as well
- suggest QUIETLY and PRIVATELY
- FORGO credit for suggested improvement
- The second system is the most dangerous due to one’s tendenacy to over-engineer and inability/not enough data points to see which part is common and which is particular. The solution is to add people who built > 2 systems.
Chapter 6. Passing the Word
- Manual should be written by at most two people to keep conceptutal consistency
- Explicitly define part of architecture that is not prescribed as what is
- Often need both formal and prose definitions, although only one should be set as standard. “Never go to sea with two chronometers; take one or three.”
- An implementation can serve as formal definition, e.g., simulator, although this approach involves great tradeoffs
Chapter 7. Why Did the Tower of Babel Fail?
- The communication style is more like network rather than tree
- Producer in the team is mostly for inter-team communication; technical director inside team and mostly technical
- Producer as the boss is better for a larger subtree of a big project. In this case, producer and technical director must see alike on technical philosophy. Main techinical issues must be talked out privately.
- For smaller team, producer should be the technical director’s right hand man, e.g., surgical team
- They can be the same person only when it is small 4-6 man team
Chapter 8. Calling the Shot
- Do NOT estimate by estimating the coding effort and then multiply by 6. The error will be greatly magnified
- Similarly, don’t extrapolate the small system/sprint effort to bigger system
- Only 50% of an engineer’s time on paper is on programming and debugging in bigger projects (> hundreads of man hours), the rest is on interruptive tasks,vacation, meetings into consideration. Conversely, team often missed their schedule by one-half
Chapter 9. Ten Pounds in a Five-Pound Sack
- Fostering a total-system, user-oriented attitube is among the top priorites of the programming manager. Otherwise, you may have many local optimization yet does not have positive impact on the user (common in larger projects)
Chapter 10. The Documentary Hypothesis
- Project manager actuall need to care about only a small set of docs. These docs need to actively maintained:w
- Only 20% is data-based, rest is communication, and this core set of docs should be enough to fit most needs
Chapter 11. Plan to Throw One Away
- Plan to retire the current design gracefully even when you are designing for the current version - you do not want to frustrate your user
- May need to keep 2-3 engineers as the fire brigade
- Senior people must be ready to manage groups or building things themselves
- Repairing/Maintanence is an entropy increasing process (although can be slow if done carefully), while building new one is entropy decreasing
- Maintanence is at least 40% of development costs
- Fixing a detect has 20% - 50% to introduce a new one, because even minimized fix may have far-reaching effects
Chapter 12. Sharp Tools
- One toolmaker per team is fair, but assemble is common team just for tooling often is not productive
- Assign shared resources in 4-6 hour blocks, shorter than that is debatable
- Build a performance simulator, outside in, top down. Start it very early. Listen when it speaks.
Chapter 13. The Whole and the Parts
- Common scaffolding: dummy component, miniature file (for io formating example), tools/auxillary programs
- Release quanta should be either large and wise spaced, or very small and frequently
Chapter 14. Hatching a Catastrophe
- Milestones must be concrete, specific, and measurable, clearly defined.
- BAD examples are “coding”, “debugging”, “planning”
- Good exmaples are “Spec signed”, “all test cases passed”
- More importatnt to be sharp-edged and unambiguous than easily verifible by the boss
- Estimate accuracy improves only AFTER project starts
- Under-estimate does not improve much until about 3 weeks before scheduled confirmation
- Team should expect sharp milestons from the manager. Fuzzy milestone is a DISSERVICE to the team
- Missing schedule/milestone is a moral killer! If this one is missed, try hitting the next one!
- The boss must distinguish between action and status information EXPLICITLY. Do NOT act on problems when it is reviewing status time
- Scheduled date from project manager, and estimated date from line manager. Project manager need to keep track of both
- Preparing PERT is the job of the boss and the managers reporting to him. Maybe delegated to a separate “Plan and Control” team
- The PERT chart with its frequent sharp milestones is the basis for true status, whether coorperatively
- This team just gathers information and has no authority, so that line-manager can focus on making decisions
Chapter 16: No Silver Bullet—Essence and Accident in Software Engineering
- Essential complexity often grows non-linearly with size
- The hardest part of software is arriving at the complete and consistent spec. Much of time is on debugging that SPEC
- Grow incrementally instead of building the software. IMPOSSIBLE to come up with complete spec, even with if customer working directly with the engineers
- How to grow great designers?
- The best are often not the most experienced
- Conitnous mentoring with career profile carefully kept
- Apprenticeship, solo desigh, tech leadership, formal education, give these as different episodes
- Let growing designers to interact with each other
Chapter 17. “No Silver Bullet” Refined
- Motivational factor increases productivity, while environmental and accidental factors does not improve productiviey, but only drags down when it is negative
- The complexity often stems from the organizational malfunctions. Modelling it instead of solving the problem is highly debatable
Chapter 19. The Mythical Man-Month after 20 Years
- Even the team as small as four can have separate architect and team manager. For large teams, have a recursion of architects.
- Architects should guess on user attributes and frequency distributions, since a survey of target user is often too expensive, but write down the explicit guess process. Better to be explicit than to be wrong
- Success factor: quality of people » organization > management » tools or technical approaches
- One can move missions successfully. But in every case of attempts to move projects, the new team in fact started over, in spite of having good documentation, some well-advanced designs, and some of the people from the sending team.
Codeforces Notes
832D: Misha, Grisha and Underground
Of the 3 nodes, which two share the most parents? We can calculate lca with binary lifting with O(logN) or with sparse tables!!!
1000D: Yet Another Problem On a Subsequence
Just DP from the head. Need to pre-compute binomial coefficent though.
965D: Single-use Stones
Bsearch on the answer, and finally, instead of answering it one by one one, we compress the paths altogether. This means, at each step, the stones we hit must be a continous band
999D: Equalize the Remainders
Calculate the current remainder on each. Start from the highest remainder, and try filling the gaps. If there is remainding ones, increase to 1 first
660D: Number of Parallelograms
sort by (slope, length)
Codeforces Notes
961D: Pair Of Lines
Take first two points, and see if they can form a candidate line, and then the remaining dots must be on a different line
If does not work, then the third point must be on the same line as either first or second point.
To check if a, b, c are on the same line, calculate 2d CROSS PRODUCT of (b- a) * (c - a), will be 0 if (b - a) and (c - a) are colinear!!!
1012B: Chemical table
Covert a cell into an edge in bipartite graph!!!
For every path of length 3 we can create a fusion, and we can replace the fusion by reducing the total # of edges by 2.
If a component is connected, we must be able to find a path of length 3, then we can just find the longest path and keep performing the fusion, until max length of path is 2, i.e., simple components
Between simple components, we just need one additional edge to create a new component
702D: Road to Post Office
Regardless of the input value, just get the min of 4 cases: all walk, all drive, drive until the last seg and then walk, drive first leg and then walk
853B: Jury Meeting
Two pointer on k segments, on each day, we need to update, for each jury, the cheapest arrival rate (lower), the cheapest departure date (higher), is it enough the satify all requirement, what is the current cost
1118D2: Coffee and Coursework (Hard Version)
Bsaerch on the number of days, and then just take the highest (number of days) of coffee, minus the ones, not contributing
1017D: The Wu
Note that we have only 2^12 possible choices, so we can just pre-compute the answer for 2^24 = 16 mil iterations
500D: New Year Santa Network
For each edge, calculate how likely it is appear in a path between (a, b). For every delta, reduce the result by 3 * delta * probablility
405D: Toy Sum
The greedy intuition is easy, but I was stuck at proving it!!!
Claim: We can always find a solution if and only if size of X is no longer than half of S
Proof:
-
If n > S/2, the lower bound of X is higher than the upper bound
-
If n LTE S/2, Suppose X has A symmetric pairs, then the remaining possible symmetric pair numbers are s/2 - A - (n - A) = s/2 - n + 2A. If we add the constraint in, we can see that we just have enough “idle” pairs!!!
615D: Multipliers
This problem is harder that it appears!!!
What is the number of divisors d(x)? Also, f(x) = x^d(x)/2 f(a *b ) = f(b)^d(a) * f(a)^d(b)
On Salary Negotiation
Suppose you are a hiring manager, what negotiation tactics your counterpart may use?
- Make sure you negotiate. Ask for some time to think about it before making a decision
- Only AFTER they really like you, AFTER the interview. Most likely via email rather than real time
- Negotiation is an important knowledge discovery skill at work. Good way/time to demonstrate it.
- Giving out current salary is debatable, giving a range is fine
- Have basic idea on expected range. Do not get surprised or become unsure
- Do not get emotional, e.g.,
- using another person’s salary as reference explicitly
- Avoid “I think”, “I feel”, “I need”
- Hardline on your expectation.
- You can pick up (easily) $5,000 in salary negotiations. The salary part is actually not THAT huge in the total compensation of the employee
- Even if number itself can not change, you can ask for other form of concession, e.g., project assignments, travel opportunities, professional development opportunities
Common arguments
- Q: “I really need a number to move the process forward.”/”This form needs a number.”
-
A: Focus on discovering fits and mutual interests. Giving out number, as discussed above, will compromise your position, although in the end you may have no choice
- Q: “We want to figure out whether you’re an appropriate candidate for the position”
-
A: Then dive deep with it
- Q: “I have to go to $EXTERNAL_AUTHORITY to get approval of that.”
- A: Refocus the discussion on things which are within the hiring manager’s personal authority.
On re-negotiation
- Make a List of Accomplishments. Do not mentions things other than that, e.g., context, requirement, times at the company, or your PERSONAL needs
- Compare Industry Salaries
- Prepare Your Case First
- Schedule a meeting so that both sides are available, prepared and ready to have a fruitful discussion.
- Anything is better than nothing
- Don’t give ultmatum
- give options, instead of just money
- Timing is important. Don’t wait until auto raise or busy time. Also gives time for them to reach a decision
- Keep it non-personal, and NOT emotional
- View it as good negotiation skill practice
References
Codeforces Notes
1114D: Flood Fill
By the rule, the final color must be either the leftmost color or the rightmost color, regardless of the starting point we picked.
cost(L,R, left) = 0 = cost(L,R, right) if [L, R] is within a seg already
cost(L, R, left) = min( cost(prevSegL, R, left), cost(prevSegL, R, right)) + 1
cost(L, R, right) = min( cost(L, prevSegR, left), cost(L, prevSegR, right)) + 1
459C: Pashmak and Buses
If we record the bus number for each students, this means that we can mark the arrangement as a k-based numbers!!!
This means that for each student, the condition translates to if there are two duplicates!!!
460C: Present
Bsearch on the min height, then scan from left to right to greedly assign bands
Problems for the day
agc031 - B: Reversi
Solved this during the contest. By default, DP[i] = DP[i - 1], i.e., do nothing case.
To calculate DP[i], maintain the sum of DP[j], s.t., c[j + 1] = c[i] and c[j] != c[i]
agc031 - C: Differ by 1 Bit
Rather standard pattern of guess what does not work/what proprties the solution has -> prove that it is actually sound AND complete by construction, which is often induction
Each number differs only by 1 bit. Therefore, the oddity of # of 1 bits in A and B must differ (in total 2^n - 1 steps)
Claim: we can always find such solution as long as A and B has different oddity in terms of # of 1 bits
Proof: by induction. N = 1: trivial. Suppose it is true for N = k;
when N = k + 1, We can construct a solution by taking a different bit between A and B out, find a C, and construct A_takeout ….C , C….B_takeout bits.
Because the position we take out is has 0 and 1 in A and B, respectively, we add the that missing bit back to the left and right half to restore the sequence.
The implementation can be done recursively. Note that we need to pass taken out bits and corresponding values during recursive calls
983B: XOR-pyramid
I had problem spotting the recursive relationships!!! The problem is trivial once you figure this part out. One thing i should do is to draw out the interation steps in the form of trees/pyramids
650B: Image Preview
Standard bsearch + two pointer
808D: Array Division
Find the lowset number s.t., prefix sum > half. We know that at least 1 number must be moved from the array
Option 1: just take it out, and move to suffix
Option 2: take one out, and replace it with the suffix
708B: Recover the String
Given the # of 00s and 11s, we know exactly number of 0s and 1s
The final answer form is 0000111101111000
Notes on Toyota Kata
Understand desired direction
- Set up a vision on how work should be done in an ideal state. Process focused and NOT outcome focused, i.e., not a business vision
- Example for software engineering:
- Zero defects
- Every check-in to production
- Highest value first, on demand
- Motivated people
- Higher level’s target condition becomes next level’s vision
Grasp current position
- Collect process metrics AND outcome metrics
Set the next challenge/target condition
- Focus on process instead of outcome
- Should take one step closer to the vision AND slight above your current knowledge threshould (“Frugality”)
- based on the current condition - should be in absolute numbers
- Set expiry date 1-3 months
- Use hyphothese - experiment - compare prediction and actual outcome to move forward to the target condition.
Repeatedly move to new target condition until you reached the vision
Context, not control
- It is OK not to please the boss, it is not OK not to serve the business. Information hiding is unacceptable
- Ideally employees do not guess what the manager wants and executes on that decision
- Employees need to learn filter only relevent topics due to the rich contextual information
- Values are shown by who gets rewarded or let go.
What should be included in the context
This is where most cross-team discussion should happen.
- Metrics
- including definition of success
- Assumptions
- Objectives
- Result oriented. Focus on customer impact.
- Timeframe/urgency/importance
- How precise the solution needs to be? No error/can correct errors/experimental
- Clearly-defined roles
- including key stakeholders, which is manager’s job to figure out
What should be avoided
- Top-down decision making
- Management approval, or managers make technical decisions for the team
- Committes
- Cross team discussion on tactics. Each team should be able to execute tactics without management approval
Comparison of flushing mechanisms: InnoDB vs Kafka
InnoDB
- Table and index data are cached in InnoDB buffer pool. It is recommend to keep innodb_buffer_pool_size 50 to 80 percent of system memory, by default it is only 128 MB.
- If buffer pool does not have the data, the data will be loaded from the page cache, which is in the system’s address space (and from storage to page cache first if have not done so)
mmap
data files directly into its user process address space- Uses fsync (
innodb_flush_method
) to flush the log to the disk immediately at commit (innodb_flush_log_at_trx_commit
) or every 1 second (innodb_flush_log_at_timeout
). Double buffering + sequential write means that this log flushing is fast.
Kafka
- By default, Kafka
fsync
asynchronously, i.e., the broker returns ACK after data is copied into page cache - Also
mmap
its data file into user’s process address space. Note that it is mapped to off-heap memory in JVM to prevent the GC of actual log data object. (How to check JVM’s native memory usage though?) - During the send, Kafka copies directly from the page cache to the network card, and socket buffer copies a file descriptor but no data. Note all these data copying happens within the kernel space, i.e., no user -> kernel space copying happened.
- Note that if JVM heap size is too huge, we may not leave enough to the OS cache, which is the actual busy area! On produciton 6G is probably enough, but leave 20G+ to OS cache on a 32G box
Note: fsync just flush buffer specific to an open file
Questions for interviewers
This is a personal list based on web sources, order by my priority. As an interviewer, be prepared to answer these questions or you can sell your company as part of the answers.
Easy questions
- How do people share learnings? How often?
- What is the hardest part of the job?
- What’s your favorite part about working at the company?
- What motivated you to join the organisation?
Tough questions
- How to define the success of this role 6 months into the job?
- How is performance measuring done?
- Who are the stakeholders of the metrics?
- How often are reviews conducted?
- What is the need behind hiring for this role?
- What happened to the last person in this role?
- From background/resume, what parts would make you doubt i would be a great fit for the company?
- I have read that you have just launched x. What will that mean for the growth of this role?
Migrating from MySql to Google Cloud Spanner
For existing MySql project, the migration cost to Spanner is often too high, including lock-in with GCP and code changes
Main cost changes:
- No incremental id generation
- HAVE to specify index to use for EACH query
- No official JPA support. Have to change code to use JDBC directly
- Change child table schema so that the PK of the parent table is the prefix of child table’s PK
- Limit the index creation to no more than 3 per DAY per db
- Avoid more than 30 DDL statements that require validation or index backfill in a given 7-day period
Migration process (copied from the official docs):
- Convert your schema and data model.
- Translate any SQL queries.
- Migrate your application to use Cloud Spanner in addition to MySQL.
- Bulk export your data from MySQL and import your data into Cloud Spanner using Cloud Dataflow.
- Maintain consistency between both databases during your migration.
- Migrate your application away from MySQL.
Reactor pattern in solving c10k problems
One example is how kafka handles it.
- Acceptor thread accepts connections, and pass them to processor threads
- Each processor thread select() on multiple connecitons, i.e., the sync event demultiplexer in the textbook reactor pattern
- On each event, processor sends the arriving requests to the request queue.
- Pooled worker threads will fetech from the requeust queue and process it. Note implicitly, the queue and its maintainance logic acts as the dispatcher.
- Woker thread puts the response to each processor’s response queue.
- Processor thread gets result from its own response queue and answer the client
Notes on Fescar
Fescar is derived from GTS (global transaction service), which has been on AliCloud since 2016. Right now it is at v0.2, and, according to their plan, still 2 major releases from GA
What problems do existing solutions have?
XA is the most common non-intrusive solution. However,
- Need to be MySQL 5.7+
- Long lock of txn resources
- XA implmentations require heavyweight application server,.e.g, WebLogic/WebSphere
Common intrusive solutions has high dev costs:
- reliable message delivery
- TCC
- Saga
What happens when you want to issue a new global txn?
Modules
- Transaction Coordinator (TC) maintains the state of gloal txns.
- Transaction Manager (TM) sends open/commit/rollback txn requests to TC
- Resource manager (RM) maintains the branch txns
- TM sends the “open new global txn” request to TC. TC will generate a globally unique id, XID.
- XID will be passed along the microservice trace
- RM will register the branch txn with the same XID on TC
- TM will start the global commit or rollback
- TC will commit/rollback all branch txns under the same XID
The flow quite similar to XA’s, the key differences are:
- XA’s RM is on the DB layer. This RM is on the application layer
- Fescar commits directly during phase 1: it is not strictly 2PC! See below for rollback discussion
How is global rollback implemented?
RM writes the rollback log including the data before and after, and will commit the data update and rollback log in the same local txn.
When RM receives the rollback request from TC, it finds the local rollback log by XID and branch ID included in the rollback request, and execute the reversed SQL.
What is the isolation level?
TC will maintain a global exculsive write lock, by default it is read UNCOMMITED.
What if my resource does not support transaction natively?
Use the manual transaction(MT) mode. You need to implement your own prepre-commit-rollback APIs for 2PC.
Codeforces Notes
220B: Little Elephant and Array
Idea 1: offline band sum
First, we need to process all queries offline, and order them by the end index so that we scan from left to right by the end index.
We need to maintain the band start where a number x has exactly x times from the start to the current end. To update the band quickly, use the band-to-delta technique,i.e.,maintain a +1 at the start of the end, and -1 at the end of the band, overall contribution is the sum over the band.
This means, for each entry we see:
- if cnt[x] LT X, do nothing
- if cnt[x] GTE X, delta[previous X times occurance ]++ => start the current valid gap
- if cnt[x] GTE X + 1, set delta[previous x-1 times occurance]-= 2 => potential end of the current gap, and cancel the end of the previous gap
- if cnt[x] GTE X + 1, set delta[previous x + 1 times occurance]– => cancel the start of the previous gap
Idea 2: online brute force
Claim: if we just consider numbers cnt(x) >= x, such potential candidates count LTE 2 * sqrt(n)!!!
Proof: If x > sqrt(n), then we know we can not have more than sqrt(n) distinct x. If x LT sqrt(n), then we have at most sqrt(n) distinct choices of x
So we can just brute force calucalte the number of appearances and answer the queries online
219D: Choosing Capital for Treeland
Set orientiation as 1 as the root
For other nodes, we calculate the following things:
- in-cost, min cost to convert when every node is reachable is from the parent, this can be done by one single DFS
- fix parent cost: # of in-direction edges from 1 to v
- reverse parent cost: # of reverse direction dedges from v to 1
For each candidate, the cost = cost to fix orientation 1 - fix parent cost + reverse parent cost
439D: Devu and his Brother
- precalc how many effort to update the min/max/values to the next value,
- bsearch on effort.
- Given the effort, we bsearch on the target number, and see if the total effort LTE effort.
472D: Design Tutorial: Inverse the Problem
Think Kruskal’s MST theorem. For each node, we know its closest neighbor must be a single edge. So we can devise a similar reverse-MST algorithm, and then DFS on each node to verify the construction
InnoDB deadlock cases
Base case
CREATE TABLE t (
id int(10) primary key
)engine=innodb
insert into t values(1);
insert into t values(3);
insert into t values(10);
Gap lock
--txn A
set session autocommit=0;
delete from t where id = 5;
--txn B
insert into t values(0); -- proceed
insert into t values(2); -- proceed
insert into t values(7); -- in conflict with gap lock (3, 10], blocked
In show engine innodb status
, it will be lock_mode x locks gap
S + X
--txn A
set session autocommit=0;
insert into t values(7);
--txn B
insert into t values(7);
Sequence of actions:
- A acquires S on id = 7
- B acquires S on id = 7
- A tries to acquire X on id = 7, but blocked by B
- B tries to acquire X, but blocked by A
shared gap lock
--txn A
delete from t where id = 6
insert into t values(5)
--txn B
delete from t where id = 7;
insert into t values(8)
Sequence of actions:
- A acquires shared gap lock on (3, 10] on delete
- B acquires S gap lock on (3, 10] on delete
- A wants X gap lock on (3, 10], blocked
- B wants X gap lock on (3, 10], blocked
Common solutions to DL
- acquirng/relase in the same order
- timed wait, and then release if not able to acquire (defaults to 50s in mysql)