On PACELC
Note that C,A, and P mean very specific things in this context.
- C: linearizability among all servers’ requests, as if exeuting on a single node. If B starts after A, then B can’t see the state before A
- This definition is different from the C in ACID
- Snapshot and MVCC are by design non-linearizable
- A: every request to ANY NON-FAILED node must return a response if it is successful or failure
- Node failure is outside the scope of CAP!
- Returning failure still gives A
- P: network is allowed to lose any messages from one node to another.
- Note this does not mean packet loss - in the proof they use a TCP-ish protocol already. However, from a node’s perspective you can not distinguish between a failed node and partitioned node
- Retrying communication indefinitely is in essence choosing C over A.
- Trivial CP system: ignore all requests, return “no result” => Not being available at all is enough to be CP
- Trivial AP system: return v0 to all requests
- It is possible for a system to be none of CP, AP, or CA! Most systems don’t need atomic consistency or perfect availability during P anyway.
PACELC
Network partitions(P) are too likely to happen, so CA is useless in practice, but else(E), even when P is not happening, the system should ideally have the perfect CA, i.e., the trade-off between C and latency(L).
If a system chooses to provide C over A in the presence of P, it can:
- refuse to respond to some requests, e.g., don’t write in two phase commit
- shutdown entirely
- Only r/w if the data’s master node is inside the partitioned component
Case studies
- Paxos
- AP? no when the link to the leader is cut. Note AP protocols in general need conflict resolution
- CA? because it is async model, so liveness can not be guaranteed
- 2PC
- CP? not consistent
- AP? not available, because we can not do any multi-node action
- Not even CA in this sense
- What if master failed? the view is not consistent!
- When master failed and prepared, all slaves are locked
- A common CA solution is the full strict quorum control
- Master-slave replication
- Not A - when client is partitioned from the leader, but not follower
- Cassandra
- Not A - when client is partitioned from the leader, but not follower during quorum read
Codeforces Notes
1060D: Social Circles
It is one of those greedy problems where you use the extreme value/boundary conditions as the starting points of optimization!!!
Suppose you have the longest LHS, then you need to right the longest RHS to minimize wait, we union the merge, and keep going until we merged all.
1032D: Barcelonian Distance
A, B can either leave, the avenue by x or y axis, so we have 4 choices on top of manhantann distance.
490D: Chocolate
First op: reduce a 2 factor Second operation: increase a 2 factor and then increase 3
So we just reduce the 3 factor to be the same, and then remove the 2 facters. Note that we can not apply the second ops to only 1 side!
993B: Open Communication
The answer has only 9 * 8 * 7 = 504 choices, for each pair, we see if both exist in the set, if there is only 1 possible pair, we know it is good, > 1 uncertain, = 0 , invalid
Notes on the Mythical Man-Month
Chapter 1. The Tar Pit
- Effort to create standalone program effort * 3 = programming system, i.e., integratedly tested with other components
- Effort to create debugg progrom * 3 = programming product, which includes comprehensive tests and doc
- What shipped to user is programming system product
Chapter 2. The Mythical Man-Month
- Don’t mix effort with progress during estimation.
- Man-month is an dangerous metric as it gives the false impression that manpower and time are interchangable
- Rule of thumb
- 1/3 planning
- 1/6 coding
- 1/4 component test and early system test
- 1/4 complete system test
- Hard to esitmate without hard data. Often need to have a backbone to defend again hunchs
Chapter 3. The Surgical Team
- Given the 10x rule, if a 200-man project has 25 managers who are the most competent and experienced programmers, fire the 175 troops and put the managers back to programming.
- Surgeon in the team settles the difference. The system should be the product of at most TWO minds to maintain conceptual integrity
- Small sharp team is best for throughput but not enough output for big project
- Solution: clean separation of architecture and implementation
Chapter 4. Aristocracy, Democracy, and System Design
- Design of an implementation, given the architecture, should have much design creativity. Again, need clean separation of architecture and implementation
- Architecture, implementation, and realization can be done in parallel
Chapter 5. The Second-System Effect
- When an architect sees an estimate he disagrees with, he can either cut the design or suggest a cheaper implementation, but since this part is builder’s responsibility, the architect should:
- just suggests and does NOT dictates!
- accept another design that meets the goal as well
- suggest QUIETLY and PRIVATELY
- FORGO credit for suggested improvement
- The second system is the most dangerous due to one’s tendenacy to over-engineer and inability/not enough data points to see which part is common and which is particular. The solution is to add people who built > 2 systems.
Chapter 6. Passing the Word
- Manual should be written by at most two people to keep conceptutal consistency
- Explicitly define part of architecture that is not prescribed as what is
- Often need both formal and prose definitions, although only one should be set as standard. “Never go to sea with two chronometers; take one or three.”
- An implementation can serve as formal definition, e.g., simulator, although this approach involves great tradeoffs
Chapter 7. Why Did the Tower of Babel Fail?
- The communication style is more like network rather than tree
- Producer in the team is mostly for inter-team communication; technical director inside team and mostly technical
- Producer as the boss is better for a larger subtree of a big project. In this case, producer and technical director must see alike on technical philosophy. Main techinical issues must be talked out privately.
- For smaller team, producer should be the technical director’s right hand man, e.g., surgical team
- They can be the same person only when it is small 4-6 man team
Chapter 8. Calling the Shot
- Do NOT estimate by estimating the coding effort and then multiply by 6. The error will be greatly magnified
- Similarly, don’t extrapolate the small system/sprint effort to bigger system
- Only 50% of an engineer’s time on paper is on programming and debugging in bigger projects (> hundreads of man hours), the rest is on interruptive tasks,vacation, meetings into consideration. Conversely, team often missed their schedule by one-half
Chapter 9. Ten Pounds in a Five-Pound Sack
- Fostering a total-system, user-oriented attitube is among the top priorites of the programming manager. Otherwise, you may have many local optimization yet does not have positive impact on the user (common in larger projects)
Chapter 10. The Documentary Hypothesis
- Project manager actuall need to care about only a small set of docs. These docs need to actively maintained:w
- Only 20% is data-based, rest is communication, and this core set of docs should be enough to fit most needs
Chapter 11. Plan to Throw One Away
- Plan to retire the current design gracefully even when you are designing for the current version - you do not want to frustrate your user
- May need to keep 2-3 engineers as the fire brigade
- Senior people must be ready to manage groups or building things themselves
- Repairing/Maintanence is an entropy increasing process (although can be slow if done carefully), while building new one is entropy decreasing
- Maintanence is at least 40% of development costs
- Fixing a detect has 20% - 50% to introduce a new one, because even minimized fix may have far-reaching effects
Chapter 12. Sharp Tools
- One toolmaker per team is fair, but assemble is common team just for tooling often is not productive
- Assign shared resources in 4-6 hour blocks, shorter than that is debatable
- Build a performance simulator, outside in, top down. Start it very early. Listen when it speaks.
Chapter 13. The Whole and the Parts
- Common scaffolding: dummy component, miniature file (for io formating example), tools/auxillary programs
- Release quanta should be either large and wise spaced, or very small and frequently
Chapter 14. Hatching a Catastrophe
- Milestones must be concrete, specific, and measurable, clearly defined.
- BAD examples are “coding”, “debugging”, “planning”
- Good exmaples are “Spec signed”, “all test cases passed”
- More importatnt to be sharp-edged and unambiguous than easily verifible by the boss
- Estimate accuracy improves only AFTER project starts
- Under-estimate does not improve much until about 3 weeks before scheduled confirmation
- Team should expect sharp milestons from the manager. Fuzzy milestone is a DISSERVICE to the team
- Missing schedule/milestone is a moral killer! If this one is missed, try hitting the next one!
- The boss must distinguish between action and status information EXPLICITLY. Do NOT act on problems when it is reviewing status time
- Scheduled date from project manager, and estimated date from line manager. Project manager need to keep track of both
- Preparing PERT is the job of the boss and the managers reporting to him. Maybe delegated to a separate “Plan and Control” team
- The PERT chart with its frequent sharp milestones is the basis for true status, whether coorperatively
- This team just gathers information and has no authority, so that line-manager can focus on making decisions
Chapter 16: No Silver Bullet—Essence and Accident in Software Engineering
- Essential complexity often grows non-linearly with size
- The hardest part of software is arriving at the complete and consistent spec. Much of time is on debugging that SPEC
- Grow incrementally instead of building the software. IMPOSSIBLE to come up with complete spec, even with if customer working directly with the engineers
- How to grow great designers?
- The best are often not the most experienced
- Conitnous mentoring with career profile carefully kept
- Apprenticeship, solo desigh, tech leadership, formal education, give these as different episodes
- Let growing designers to interact with each other
Chapter 17. “No Silver Bullet” Refined
- Motivational factor increases productivity, while environmental and accidental factors does not improve productiviey, but only drags down when it is negative
- The complexity often stems from the organizational malfunctions. Modelling it instead of solving the problem is highly debatable
Chapter 19. The Mythical Man-Month after 20 Years
- Even the team as small as four can have separate architect and team manager. For large teams, have a recursion of architects.
- Architects should guess on user attributes and frequency distributions, since a survey of target user is often too expensive, but write down the explicit guess process. Better to be explicit than to be wrong
- Success factor: quality of people » organization > management » tools or technical approaches
- One can move missions successfully. But in every case of attempts to move projects, the new team in fact started over, in spite of having good documentation, some well-advanced designs, and some of the people from the sending team.
Codeforces Notes
832D: Misha, Grisha and Underground
Of the 3 nodes, which two share the most parents? We can calculate lca with binary lifting with O(logN) or with sparse tables!!!
1000D: Yet Another Problem On a Subsequence
Just DP from the head. Need to pre-compute binomial coefficent though.
965D: Single-use Stones
Bsearch on the answer, and finally, instead of answering it one by one one, we compress the paths altogether. This means, at each step, the stones we hit must be a continous band
999D: Equalize the Remainders
Calculate the current remainder on each. Start from the highest remainder, and try filling the gaps. If there is remainding ones, increase to 1 first
660D: Number of Parallelograms
sort by (slope, length)
Codeforces Notes
961D: Pair Of Lines
Take first two points, and see if they can form a candidate line, and then the remaining dots must be on a different line
If does not work, then the third point must be on the same line as either first or second point.
To check if a, b, c are on the same line, calculate 2d CROSS PRODUCT of (b- a) * (c - a), will be 0 if (b - a) and (c - a) are colinear!!!
1012B: Chemical table
Covert a cell into an edge in bipartite graph!!!
For every path of length 3 we can create a fusion, and we can replace the fusion by reducing the total # of edges by 2.
If a component is connected, we must be able to find a path of length 3, then we can just find the longest path and keep performing the fusion, until max length of path is 2, i.e., simple components
Between simple components, we just need one additional edge to create a new component
702D: Road to Post Office
Regardless of the input value, just get the min of 4 cases: all walk, all drive, drive until the last seg and then walk, drive first leg and then walk
853B: Jury Meeting
Two pointer on k segments, on each day, we need to update, for each jury, the cheapest arrival rate (lower), the cheapest departure date (higher), is it enough the satify all requirement, what is the current cost
1118D2: Coffee and Coursework (Hard Version)
Bsaerch on the number of days, and then just take the highest (number of days) of coffee, minus the ones, not contributing
1017D: The Wu
Note that we have only 2^12 possible choices, so we can just pre-compute the answer for 2^24 = 16 mil iterations
500D: New Year Santa Network
For each edge, calculate how likely it is appear in a path between (a, b). For every delta, reduce the result by 3 * delta * probablility
405D: Toy Sum
The greedy intuition is easy, but I was stuck at proving it!!!
Claim: We can always find a solution if and only if size of X is no longer than half of S
Proof:
-
If n > S/2, the lower bound of X is higher than the upper bound
-
If n LTE S/2, Suppose X has A symmetric pairs, then the remaining possible symmetric pair numbers are s/2 - A - (n - A) = s/2 - n + 2A. If we add the constraint in, we can see that we just have enough “idle” pairs!!!
615D: Multipliers
This problem is harder that it appears!!!
What is the number of divisors d(x)? Also, f(x) = x^d(x)/2 f(a *b ) = f(b)^d(a) * f(a)^d(b)
On Salary Negotiation
Suppose you are a hiring manager, what negotiation tactics your counterpart may use?
- Make sure you negotiate. Ask for some time to think about it before making a decision
- Only AFTER they really like you, AFTER the interview. Most likely via email rather than real time
- Negotiation is an important knowledge discovery skill at work. Good way/time to demonstrate it.
- Giving out current salary is debatable, giving a range is fine
- Have basic idea on expected range. Do not get surprised or become unsure
- Do not get emotional, e.g.,
- using another person’s salary as reference explicitly
- Avoid “I think”, “I feel”, “I need”
- Hardline on your expectation.
- You can pick up (easily) $5,000 in salary negotiations. The salary part is actually not THAT huge in the total compensation of the employee
- Even if number itself can not change, you can ask for other form of concession, e.g., project assignments, travel opportunities, professional development opportunities
Common arguments
- Q: “I really need a number to move the process forward.”/”This form needs a number.”
-
A: Focus on discovering fits and mutual interests. Giving out number, as discussed above, will compromise your position, although in the end you may have no choice
- Q: “We want to figure out whether you’re an appropriate candidate for the position”
-
A: Then dive deep with it
- Q: “I have to go to $EXTERNAL_AUTHORITY to get approval of that.”
- A: Refocus the discussion on things which are within the hiring manager’s personal authority.
On re-negotiation
- Make a List of Accomplishments. Do not mentions things other than that, e.g., context, requirement, times at the company, or your PERSONAL needs
- Compare Industry Salaries
- Prepare Your Case First
- Schedule a meeting so that both sides are available, prepared and ready to have a fruitful discussion.
- Anything is better than nothing
- Don’t give ultmatum
- give options, instead of just money
- Timing is important. Don’t wait until auto raise or busy time. Also gives time for them to reach a decision
- Keep it non-personal, and NOT emotional
- View it as good negotiation skill practice
References
Codeforces Notes
1114D: Flood Fill
By the rule, the final color must be either the leftmost color or the rightmost color, regardless of the starting point we picked.
cost(L,R, left) = 0 = cost(L,R, right) if [L, R] is within a seg already
cost(L, R, left) = min( cost(prevSegL, R, left), cost(prevSegL, R, right)) + 1
cost(L, R, right) = min( cost(L, prevSegR, left), cost(L, prevSegR, right)) + 1
459C: Pashmak and Buses
If we record the bus number for each students, this means that we can mark the arrangement as a k-based numbers!!!
This means that for each student, the condition translates to if there are two duplicates!!!
460C: Present
Bsearch on the min height, then scan from left to right to greedly assign bands
Problems for the day
agc031 - B: Reversi
Solved this during the contest. By default, DP[i] = DP[i - 1], i.e., do nothing case.
To calculate DP[i], maintain the sum of DP[j], s.t., c[j + 1] = c[i] and c[j] != c[i]
agc031 - C: Differ by 1 Bit
Rather standard pattern of guess what does not work/what proprties the solution has -> prove that it is actually sound AND complete by construction, which is often induction
Each number differs only by 1 bit. Therefore, the oddity of # of 1 bits in A and B must differ (in total 2^n - 1 steps)
Claim: we can always find such solution as long as A and B has different oddity in terms of # of 1 bits
Proof: by induction. N = 1: trivial. Suppose it is true for N = k;
when N = k + 1, We can construct a solution by taking a different bit between A and B out, find a C, and construct A_takeout ….C , C….B_takeout bits.
Because the position we take out is has 0 and 1 in A and B, respectively, we add the that missing bit back to the left and right half to restore the sequence.
The implementation can be done recursively. Note that we need to pass taken out bits and corresponding values during recursive calls
983B: XOR-pyramid
I had problem spotting the recursive relationships!!! The problem is trivial once you figure this part out. One thing i should do is to draw out the interation steps in the form of trees/pyramids
650B: Image Preview
Standard bsearch + two pointer
808D: Array Division
Find the lowset number s.t., prefix sum > half. We know that at least 1 number must be moved from the array
Option 1: just take it out, and move to suffix
Option 2: take one out, and replace it with the suffix
708B: Recover the String
Given the # of 00s and 11s, we know exactly number of 0s and 1s
The final answer form is 0000111101111000
Notes on Toyota Kata
Understand desired direction
- Set up a vision on how work should be done in an ideal state. Process focused and NOT outcome focused, i.e., not a business vision
- Example for software engineering:
- Zero defects
- Every check-in to production
- Highest value first, on demand
- Motivated people
- Higher level’s target condition becomes next level’s vision
Grasp current position
- Collect process metrics AND outcome metrics
Set the next challenge/target condition
- Focus on process instead of outcome
- Should take one step closer to the vision AND slight above your current knowledge threshould (“Frugality”)
- based on the current condition - should be in absolute numbers
- Set expiry date 1-3 months
- Use hyphothese - experiment - compare prediction and actual outcome to move forward to the target condition.
Repeatedly move to new target condition until you reached the vision
Context, not control
- It is OK not to please the boss, it is not OK not to serve the business. Information hiding is unacceptable
- Ideally employees do not guess what the manager wants and executes on that decision
- Employees need to learn filter only relevent topics due to the rich contextual information
- Values are shown by who gets rewarded or let go.
What should be included in the context
This is where most cross-team discussion should happen.
- Metrics
- including definition of success
- Assumptions
- Objectives
- Result oriented. Focus on customer impact.
- Timeframe/urgency/importance
- How precise the solution needs to be? No error/can correct errors/experimental
- Clearly-defined roles
- including key stakeholders, which is manager’s job to figure out
What should be avoided
- Top-down decision making
- Management approval, or managers make technical decisions for the team
- Committes
- Cross team discussion on tactics. Each team should be able to execute tactics without management approval
Comparison of flushing mechanisms: InnoDB vs Kafka
InnoDB
- Table and index data are cached in InnoDB buffer pool. It is recommend to keep innodb_buffer_pool_size 50 to 80 percent of system memory, by default it is only 128 MB.
- If buffer pool does not have the data, the data will be loaded from the page cache, which is in the system’s address space (and from storage to page cache first if have not done so)
mmap
data files directly into its user process address space- Uses fsync (
innodb_flush_method
) to flush the log to the disk immediately at commit (innodb_flush_log_at_trx_commit
) or every 1 second (innodb_flush_log_at_timeout
). Double buffering + sequential write means that this log flushing is fast.
Kafka
- By default, Kafka
fsync
asynchronously, i.e., the broker returns ACK after data is copied into page cache - Also
mmap
its data file into user’s process address space. Note that it is mapped to off-heap memory in JVM to prevent the GC of actual log data object. (How to check JVM’s native memory usage though?) - During the send, Kafka copies directly from the page cache to the network card, and socket buffer copies a file descriptor but no data. Note all these data copying happens within the kernel space, i.e., no user -> kernel space copying happened.
- Note that if JVM heap size is too huge, we may not leave enough to the OS cache, which is the actual busy area! On produciton 6G is probably enough, but leave 20G+ to OS cache on a 32G box
Note: fsync just flush buffer specific to an open file
Questions for interviewers
This is a personal list based on web sources, order by my priority. As an interviewer, be prepared to answer these questions or you can sell your company as part of the answers.
Easy questions
- How do people share learnings? How often?
- What is the hardest part of the job?
- What’s your favorite part about working at the company?
- What motivated you to join the organisation?
Tough questions
- How to define the success of this role 6 months into the job?
- How is performance measuring done?
- Who are the stakeholders of the metrics?
- How often are reviews conducted?
- What is the need behind hiring for this role?
- What happened to the last person in this role?
- From background/resume, what parts would make you doubt i would be a great fit for the company?
- I have read that you have just launched x. What will that mean for the growth of this role?
Migrating from MySql to Google Cloud Spanner
For existing MySql project, the migration cost to Spanner is often too high, including lock-in with GCP and code changes
Main cost changes:
- No incremental id generation
- HAVE to specify index to use for EACH query
- No official JPA support. Have to change code to use JDBC directly
- Change child table schema so that the PK of the parent table is the prefix of child table’s PK
- Limit the index creation to no more than 3 per DAY per db
- Avoid more than 30 DDL statements that require validation or index backfill in a given 7-day period
Migration process (copied from the official docs):
- Convert your schema and data model.
- Translate any SQL queries.
- Migrate your application to use Cloud Spanner in addition to MySQL.
- Bulk export your data from MySQL and import your data into Cloud Spanner using Cloud Dataflow.
- Maintain consistency between both databases during your migration.
- Migrate your application away from MySQL.
Reactor pattern in solving c10k problems
One example is how kafka handles it.
- Acceptor thread accepts connections, and pass them to processor threads
- Each processor thread select() on multiple connecitons, i.e., the sync event demultiplexer in the textbook reactor pattern
- On each event, processor sends the arriving requests to the request queue.
- Pooled worker threads will fetech from the requeust queue and process it. Note implicitly, the queue and its maintainance logic acts as the dispatcher.
- Woker thread puts the response to each processor’s response queue.
- Processor thread gets result from its own response queue and answer the client
Notes on Fescar
Fescar is derived from GTS (global transaction service), which has been on AliCloud since 2016. Right now it is at v0.2, and, according to their plan, still 2 major releases from GA
What problems do existing solutions have?
XA is the most common non-intrusive solution. However,
- Need to be MySQL 5.7+
- Long lock of txn resources
- XA implmentations require heavyweight application server,.e.g, WebLogic/WebSphere
Common intrusive solutions has high dev costs:
- reliable message delivery
- TCC
- Saga
What happens when you want to issue a new global txn?
Modules
- Transaction Coordinator (TC) maintains the state of gloal txns.
- Transaction Manager (TM) sends open/commit/rollback txn requests to TC
- Resource manager (RM) maintains the branch txns
- TM sends the “open new global txn” request to TC. TC will generate a globally unique id, XID.
- XID will be passed along the microservice trace
- RM will register the branch txn with the same XID on TC
- TM will start the global commit or rollback
- TC will commit/rollback all branch txns under the same XID
The flow quite similar to XA’s, the key differences are:
- XA’s RM is on the DB layer. This RM is on the application layer
- Fescar commits directly during phase 1: it is not strictly 2PC! See below for rollback discussion
How is global rollback implemented?
RM writes the rollback log including the data before and after, and will commit the data update and rollback log in the same local txn.
When RM receives the rollback request from TC, it finds the local rollback log by XID and branch ID included in the rollback request, and execute the reversed SQL.
What is the isolation level?
TC will maintain a global exculsive write lock, by default it is read UNCOMMITED.
What if my resource does not support transaction natively?
Use the manual transaction(MT) mode. You need to implement your own prepre-commit-rollback APIs for 2PC.
Codeforces Notes
220B: Little Elephant and Array
Idea 1: offline band sum
First, we need to process all queries offline, and order them by the end index so that we scan from left to right by the end index.
We need to maintain the band start where a number x has exactly x times from the start to the current end. To update the band quickly, use the band-to-delta technique,i.e.,maintain a +1 at the start of the end, and -1 at the end of the band, overall contribution is the sum over the band.
This means, for each entry we see:
- if cnt[x] LT X, do nothing
- if cnt[x] GTE X, delta[previous X times occurance ]++ => start the current valid gap
- if cnt[x] GTE X + 1, set delta[previous x-1 times occurance]-= 2 => potential end of the current gap, and cancel the end of the previous gap
- if cnt[x] GTE X + 1, set delta[previous x + 1 times occurance]– => cancel the start of the previous gap
Idea 2: online brute force
Claim: if we just consider numbers cnt(x) >= x, such potential candidates count LTE 2 * sqrt(n)!!!
Proof: If x > sqrt(n), then we know we can not have more than sqrt(n) distinct x. If x LT sqrt(n), then we have at most sqrt(n) distinct choices of x
So we can just brute force calucalte the number of appearances and answer the queries online
219D: Choosing Capital for Treeland
Set orientiation as 1 as the root
For other nodes, we calculate the following things:
- in-cost, min cost to convert when every node is reachable is from the parent, this can be done by one single DFS
- fix parent cost: # of in-direction edges from 1 to v
- reverse parent cost: # of reverse direction dedges from v to 1
For each candidate, the cost = cost to fix orientation 1 - fix parent cost + reverse parent cost
439D: Devu and his Brother
- precalc how many effort to update the min/max/values to the next value,
- bsearch on effort.
- Given the effort, we bsearch on the target number, and see if the total effort LTE effort.
472D: Design Tutorial: Inverse the Problem
Think Kruskal’s MST theorem. For each node, we know its closest neighbor must be a single edge. So we can devise a similar reverse-MST algorithm, and then DFS on each node to verify the construction
InnoDB deadlock cases
Base case
CREATE TABLE t (
id int(10) primary key
)engine=innodb
insert into t values(1);
insert into t values(3);
insert into t values(10);
Gap lock
--txn A
set session autocommit=0;
delete from t where id = 5;
--txn B
insert into t values(0); -- proceed
insert into t values(2); -- proceed
insert into t values(7); -- in conflict with gap lock (3, 10], blocked
In show engine innodb status
, it will be lock_mode x locks gap
S + X
--txn A
set session autocommit=0;
insert into t values(7);
--txn B
insert into t values(7);
Sequence of actions:
- A acquires S on id = 7
- B acquires S on id = 7
- A tries to acquire X on id = 7, but blocked by B
- B tries to acquire X, but blocked by A
shared gap lock
--txn A
delete from t where id = 6
insert into t values(5)
--txn B
delete from t where id = 7;
insert into t values(8)
Sequence of actions:
- A acquires shared gap lock on (3, 10] on delete
- B acquires S gap lock on (3, 10] on delete
- A wants X gap lock on (3, 10], blocked
- B wants X gap lock on (3, 10], blocked
Common solutions to DL
- acquirng/relase in the same order
- timed wait, and then release if not able to acquire (defaults to 50s in mysql)
Codeforces Notes
113B: Petr#
- find all S(begin) and S(ends)
- for each i..j, calculate its rolling hash, add it to the corresponding map hash -> (start, end), if i is in S(begin), and j is in S(end)
- for each entry in the map, add the size of the entry - # of duplicates to the final answer
633C: Spy Syndrome 2
- reverse the whole sentence
- caluclate the rolling hash of each word in the dictionary
- DP on the suffix size. The overall run time will be O(n), because for each check, we reduce the overall problem size by 1,because we search for the FIRST matching one
965C: Greedy Arkady
- bsearch on the times A gets, note that D is no more than 1000, we can brute force on all possible Ds, and find the max possible factor A gets
- k * D * x >= n, K * (D - 1) * x <= n - x, also, x <= M
- The target function is D * x
448D: Multiplication Table
- bsearch on answer, at least iteration we can afford an O(n) iteration, i.e., just calculate through all rows to see if number < the target number matches, also keep track of equal coutns
Never Split the Difference
- Try NOT to speed up process to get a “Yes”, very easy to leave the impression that you are ignoring your counterpart.
- Avoid direct/assertive voice - easy to put people into defensive mode. Use a positive/playful voice by default
- Try to trigger “No” from counter part, so that the other part feels safe. You can force a “No” by mislabelling
- If very hard for the counterpart to say “No”, just walk away, it is too early
- When you talk about numbers, use precise, odd ones to so that it APPEARS to be thought after
- SLOW DOWN
- NEVER use “I understand”
- Liar is more likely to use longer sentences, third-party pronoun, and more complex sentences
- Use the other side’s own name - they want to be called!
- The more I, me, my one uses, the less impact one has in decision making. Other questions to identify deal-breakers:
- How does this affect the rest of them?
- How on board are people with this call?
- What do your team see as the main challenge in this area?
- Labelling, which works well against analyst such as myself
- It seems/looks/sounds like… DO NOT use I, otheriwse it looks like you are self-centered!
- pause 3-4 secs
- Rinse and repeat
- I didn’t say that is the actual case, I just said it seems that.
- Anger shows your passion, but reduces your counterpart’s congintive activity due to the emotional mode. You can de-escalate by proposing a time-out. Try to calm emotion before listing theories and facts
- Something doesn’t make sense means chance to discover unknowns! Best discovered during face-to-face communications, near breaks or meeting ends. Hard to discover during emails
- The word “fair” is often used so the other side goes defensive and gives concessions. Common cases:
- We just want what is fair -> OK I am sorry, let us get back to where the fair part begins
- We have given you a fair offer -> Mirror: Fair? + Label: It seems that you have evidence to support it.
- Powerful: Stop me at any time if you feel I am being unfair
- List the worst thing the other side could say and say them first, this will encourage the counterpart to say what the opposite is true
- When you sell yourself to the manager, sell yourself as ways to prove their own intelligence and that they can broadcast to the rest of the company, i.e., you are the proof of HIS importance, let him have a stake in your success
- Power Sales Negotiators know that splitting the difference does not mean splitting it down the middle. Just split the difference twice and the split becomes 75 percent/25 percent. Furthermore, you may be able to get the other party to split the difference three or more times.
- The first thing to remember is that you should never offer to split the difference yourself, but always encourage the other person to offer to split the difference.
How to negotiate with an accommodator
- Time spend communicating is a sign of building relationship, which they value
- Ask what/how questions. This lets the counterpart feel you need their intelligence to overcome the problem and they have the control on defining success. Such feeling appeals much to people with egos. It also makes people feel it is THEIR idea.
- Avoid binary questions
- Avoid when/where questions which can be answered without too much thinking
- Avoid why questions, because it is easy to trigger people into fight or flight mode
- Concretely, AFTER they agree with something for the first time. Label and summarize their answer to get a “that’s right”, and follow up with a what/how questions
- “What do you see as being the most difficult thing”
- “What is the biggest challenge you face?”
- “What is the success measure”
- “how will we know we are on track”
- What about this is important to you?
- How can I help to make this better for us?
- How would you like me to proceed?
- What is it that brought us into this situation?
- How can we solve this problem?
- What’s the objective? / What are we trying to accomplish here?
- How am I supposed to do that?
- “how do we address things if we are off-track”
- Communicating is good sign, and slience suggests angry
- Easy to be distracted and not good time manager
- Hesitation comes more often in tone and body language, and NOT in words
- Accommodators are more likely to oversell/overpromise.
How to negotiate with an assertive
- Analytical is the worst type match with assertives. Analyst type does not enjoy “What”/”How” questions.
- To assertives, getting things done is more important than making it perfect.
- They need to think you understand them before listening to your pov
- They will keep talking on every moment of slience. This leaves the room for mirroring technique
- Use a slow, calm voice to create authority and trustworthiness. Note this voice is normally used to make a point and should NOT be the default voice
- “I’m sorry…”
- Repeat the last 3 words or the last critical 1-3 words of what the assertive just said
- Keep slient and let the other side speak. This also helps reveal their strategy
- Rinse and repeat if needed
- Try NOT to offer trade-offs/reciprocity, because assertives will ask for more. Conversely, if they give up something, they expect something equal or more in return.
- Get “That’s right” from them, because that is a sign they FEEL they own the conclusion.
- Can follow up with a “According to $(name)…”
- On the other hand, “you are right” means almost nothing, and should not definitely taken as sign of progress
- I will try. You are right = I plan to fail. In this case, drill again with a “how” when you get a “that’s right”
- When the negotiation is going off-track/losing focus, just invite them to re-engage
On ThreadPoolExecutor
- NOT recommend to create via Executors static factory, instead, use
ThreadPoolExecutor
. Recommend to use newCacheThreadExecutor, defaults toThreadPoolExecutor
- Note that in general try NOT to create new threads explicitly. Let TP provide one.
- To handle exception in the worker
- try-catch inside the runnable
submit()
to execute,Future.get()
to handle exTPE.afterExecute()
- pass in
Thread.UncaughtExceptionHandler
- Constructor params
- corePoolSize
- long keepAliveTime
- When # of threads > cores, idle threads will wait this long before got terminated
- Defaults to 60s
- maximumPoolSzie, i.e., we can use it to set constant size TP.
- workQueue
- Capacity defaults to Integer.MAX_VALUE!
- Insertion is done by BlockingQueue.offer(), which will not throw exception or block.
- threadFactory
- RejectedExecutionHandler handler - when out of thread range or queue capacity
- The state of TPE,i.e., highest 3 bits of the atomic
ctl
value- RUNNING
- SHUTDOWN: shutdown(), no new task will be accepted, but will finish executing the jobs in the task queue
- STOP: shutdownNow(), no new task, not executing tasks in the q
- TIDYING: all tasks completed, will execute terminated()
- TERMINIATED: after termiated() is run
- submit() returns Future vs execute() returns nothing
Building pieces of the TPE
class Worker extends AbstractQueuedSynchronizer implements Runnable
{
Runnable firstTask;
final Thread thread;
volatile long completedTasks;
}
thread
is created in the worker’s ctor, by the thread factory passed in- Worker is a nested class of the TPE. In fact, TPE has a reference to
Worker
too
class FutureTask{
Callable callable;
private volatile int state;
private Object outcome; // non-volatile, protected by state reads/writes
private volatile Thread runner;
}
Future<?> submit(Runnable task) {
RunnableFuture<Void> tftask = new Futuretask<T>(runnable, null); //uses Executors.callable(runanble, null)
execute(tfask);
return ftask;
}
runner
is CASed as theThread.currentThread()
from null. This also acts as the mutex checkcallable
is turned fromrunnable
by the Executor. Callable returns result and may throw exceptions
TPE.execute()
void execute(Runnable command) {
int c = ctl.get();
if(wokerCountOf(c) < corePoolsize) {
if(addWorker(command, true))
return;
c = ctil.get();
}
if(isRunning (c) && workQueue.offer(command)){ //added to the blocking queue successfully
int recheck = ctl.get();
if(!isRunning (recheck) && remove(command))
reject(command);
else if (wokerCountOf(recheck) == 0)
addWorker(null, false);
} else if (!addWorker(command, false)){
reject(command);
}
}
TPE.execute()
is the interface to outside, i.e., directly called bysubmit()
- If TPE is running, and we can add the command to the task q. Note that we have to double check the state again because the offer operation is not an atomic step of
isRunning
and try removing it it. Similarly, the twoelse if
logics are quite similar because we essentialy performed the check twice. - If
isRunning
try adding the worker. Note thataddWorker
side checks the TPE state, so when the TPE is not running, the new task will not be run
TPE.runWorker(Worker w):
while (task != null || (task = getTask()) != null) {
w.lock();
if ((runStateAtLeast(ctl.get(), STOP) ||
(Thread.interrupted() &&
runStateAtLeast(ctl.get(), STOP))) &&
!wt.isInterrupted())
wt.interrupt();
try {
beforeExecute(wt, task);
Throwable thrown = null;
try {
task.run();
} catch (Throwable x) {
thrown = x; throw new Error(x);
} finally {
afterExecute(task, thrown);
}
} finally {
task = null;
w.completedTasks++;
w.unlock();
}
}
- This is called from worker and, by extension, worker’s thread since worker is an inner class
- Standard
lock()
and thenunlock()
insidefinally
getTask()
checks if it can poll a task withkeepAliveTime
, if it couldn’t work, it will return a null. This is how the non-core threads are terminated
Shutting down TPE
void shutdown() {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try{
checkShutdownAccess();
advanceRunState(SHUTDOWN);//CAS in a loop to set state. No actual interruption
interruptIdleWorkers();
onShutdown();
}finally{
mainLock.unlock();
}
tryTerminate();
}
List<Runnable> shutdownNow() {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try{
checkShutdownAccess();
advanceRunState(STOP);
interruptWorkers(); //uses Thread.interrupt(), which just sets the mark, needs Thread.interrupted() inside that thread
tasks = drainQueue();
}finally{
mainLock.unlock();
}
tryTerminate();
return tasks;
}
threadPool.shutdown(); //have to wait for the termination explicitly
while(!threadPool.awaitTermination(60, TimeUnit.SECONDS)){
//busy waiting here, most likely you need to add the max retry count
}
FutureTask.run()
if (state != NEW ||
!UNSAFE.compareAndSwapObject(this, runnerOffset,
null, Thread.currentThread()))
return;
try {
Callable<V> c = callable;
if (c != null && state == NEW) {
V result;
boolean ran;
try {
result = c.call();
ran = true;
} catch (Throwable ex) {
result = null;
ran = false;
setException(ex);
}
if (ran)
set(result);
}
} finally {
// runner must be non-null until state is settled to
// prevent concurrent calls to run()
runner = null;
// state must be re-read after nulling runner to prevent
// leaked interrupts
int s = state;
if (s >= INTERRUPTING)
handlePossibleCancellationInterrupt(s);
}
- CAS at the start uses current thread id as the optimistic lock. Hence, runner is set to null in the
finally