Generic, covariance, and contravariance

2018-11-05 00:00:00 +0000

  • a <: b means a is a subtype of b
  • The overall goal of covariance and contravariance is to reduce the case where type check passed but runtime gives typing error.
  • The main use case is when the type is in a higher-order function or generic collections.
    • Adding a type to a generic collection is somewhat similar to the function call in typing rules, i.e., the generic type is the parameter type
    • Reading from a generic colleciton is somewhat similar to the function return in typing rules, i.e., The generic type beais the returned type

Covariance

  • Return types are covariants for function subtyping, that is, if tsub <: tsuper then t -> tsub <: t -> tsuper
  • Use to mark the upper bound of generic (at super).
    • Required when we return a generic collection, so that we have a guarantee that we can get super at least from collection
    • No guarantee on what we can write, because we don’t know the concretely underlying type, which will be casted to on read.
  • Class and types are different! Class defines an object’s behavior, a type describes what fields an object has and what messages it can respond to. Subtyping is a question of substitutsubbility and what we want to flag as a type error

Contravariance

  • Argument types are contravariant for function subtyping, that is, if tsub <: tsuper, then tsuper -> t <: tsub -> t
    • Suppose you wrap your tsuper -> t type in a collection or a higher-order function. If you pass a tsub -> t type, then during execution, the underlying type always expects passing in tsub. However, because the signature is tsuper-> t, we can still pass the tsuper to the concrete tsub-> t function without violating type check => contradiction!
  • Similar reasoning on why in Java you can’t pass List to a method accepting List, because the method will add an actual Object to the underlying List
    • Note that in the source they are all stored in an object array, and cast to the type on read
    • We can’t pass List to List functions either, due to the covariance rule
  • Use to mark the lower bound of generic (at subtype). This means we are 100% safe to add only the subtype and its subtypes to the generic collections

On Handling Silence

2018-11-02 00:00:00 +0000

Source

  • The subtle power of uncomfortable silences
  • How to Handle Silence, the Worst Kind of Feedback

  • Comforatble silence margin: 3-4 seconds. Also, use open-ended question instead of letting people make a choice. Note that we often feel the slience longer than it actually is.
  • Uncomfortable silence actually helps the buyer guide the conversation
  • If you are losing the room, be slient for 3-5 secs, and let people zoom out. Once they are back, touch base again.
  • Pause 3-5 before and after the important points to improve the impact of critical points - try not to ramble! Similarly, before starting the presentation, stay silent for 3-5 seconds and look around the audience
  • During listening, if you can’t resist on thinking what you want to say instead of actual listening, focus on staying slient.
  • If the slience approaching 1 min, offer to come back to the point later and move on.

On select(), poll(), and epoll()

2018-11-02 00:00:00 +0000

  • select() requires the kernel to iterate through all FDs passed to the select() call, check their state and register CBs. When an event on any FD happend, kernel has to iterate through these FDs again to deregister CB. Note that every time you get a new connection with accept() system call, you will get a new FD
  • Linux uses epoll_ctl() to manage the registration. Call epoll_wait to wait for updates about the list of files you’re interested in.
  • Level-triggered epoll() inherits from the “thundering herd” semantics form select() - get list of every FDs you are interested that is operatable. This means everyone will wake up from epoll_wait() and try accept() - all but one will get EAGAIN (Resource temporarily unavailable)!
  • In edge-triggered, only 1 thread is triggered, but the threads may wake up unneccearily only to get EAGAIN, because on awake, it does not/can not know how many connections kernel received originally - use level-triggered + EPOLLEXCLUSIVE to wake up 1 thread at a time
  • In the case of writing to FD with level triggered, your thread will get constant notification that FD is ready, even though you may not be able to write yet because your resource to FD is not available yet.
  • In the case of writing to FD with edge triggered, you get triggered by FD is ready ONCE and write at much as you want until your write(2), and then you wait for the next signal
  • Reading from FD with level-triggered is useful when you don’t want to consume all data at once and want epoll to keep triggring while data is available. However, for performance, probably want to use edge-triggered with all data buffered and will be eventually handled
  • Poll() is very similar to select() and returns more status,i.e., it also requires the full list of FDs to watch on each invocation
  • Poll() can perform better than select() if you have a sparse set of FDs you are interested, because select() accepts a RANGE of FDs and bitsets for FDs you want for reads/writes/exceptions

  • It is also called IO multiplexing because one thread can watch over the change of multiple FDs, i.e., the processing thread itself is reused.
  • Note that call to select() and poll() are still blocking

Design Discussion: Event Sourcing

2018-11-01 00:00:00 +0000

Scenario

  • In a domain event, Service A persists one domain change to MySQL database and publishs the event DTO to Kafka, which acts as the event bus. Service B listens to Kafka for the event DTO.
  • Reliability requirement: one domain change persisted should cause at least one DTO published to Kafka eventually,i.e., no lost update to Kafka or MySQL
  • Latency requirment: The 99th percentile time between Service A starting the persistence and Service B receiving the message should be < 1s.
  • Throughput requirement: 500 msg/s
  • Following the original BASE paper, service A can have a log table so that DTO can be persisted in the same transaction as the domain change.
  • CDC/binlog is not feasible because we have mutli-row, multi-table txns

Option: Compensating transaction + polling

  • Following Ebay’s BASE paper, service A writes to log table and domain table in the same transaction
  • After the transaction commits, A will asyncly send the DTO to kafka, and upon sendng successful, A will update the entry to log table status to published, or delete the record
  • At the same time, we have log cleaner daemon process, which will clean up the old logs and re-publish logs not marked as published
  • Note that the async call means that in a single domain flow, two DTOs may arrive at the kafka out of order, because they are from two different processes of the same service A. It really depends on the actual logic to figure it out

Domain event format

  • PK
  • OP
  • State before the change
  • State after the change
  • src metadata
    • src log/db/tbl
    • src timestamp (event time)
  • capture time (processing time)

Inside DB, need message id , and message status (new, published). Debatable if we want different messages go into the same log table. If we need it, then a separate message type column is required

Case study: spring-domain-events

  • DB Schema:
    • msg id
    • msg pub date
    • listener id
    • seriaized string
    • event type
    • completion date
  • Problems with this repo in a production setting
    • The recovery didn’t consider the case of concurrent recovery
    • The recovery can be triggered only at afterSingletonsInstantiated, i.e., lacks a separate, continuous recovery process
    • The recovery query itself is just a full table scan
    • Persisting the event happens outside the same db transaction, i.e., with TransactionalApplicationEvent. Therefore, it is still possible that domain change is done, but event is lost

Analysis on the Eventuate Performance

2018-10-29 00:00:00 +0000

Overview

A very nicely designed and abstracted library that solves the message passing between bounded contexts. However, the performance of this library is not enough if you service emits > 500 messages/sec to Kafka.

Design Analysis

  1. In the case of MySQL backend, the library uses this mysql CDC tool. However, our past experience with mysql CDC is not overly positive. That is, it is stable enough for data analytics pipelines, but debatable for latency sensitve (<5s) cross-service communication. This suggests us to use polling instead of CDC to detect message to send.
  2. The library uses a published column in its events table and message table. Right now it has no logic the evict/purge published records. Moreover, the code uses a SELECT * FROM %s WHERE %s = 0 ORDER BY %s ASC LIMIT :limit statement to find message to publish,i.e., it is a naive full table scan
  3. The library stores messages for different topics in the same events or message table. The lock contention on the table’s PK column will be a major problem
  4. ‘events or message` table uses a long VARCHAR as the id column type. In reality, a 128-bit field,e.g., one generated by snowflake is enough.
  5. The producer uses the following code the send the message
    this.producer.send(aggregateTopic, this.publishingStrategy.partitionKeyFor(event), json).get(10L, TimeUnit.SECONDS);
    

    This means it sends each message synchronously. Then if we estimate the latency

    • 1 call to broker > 0.5 ms
    • broker does majority write > 0.5ms network + 0.1 ms SSD write
    • 0.5m network + 0.1 ms SSD write for DB to update the published column.
    • Add the above steps up, The hard limit time alone is already 2 ms per message,i.e., this pseudo code can support 500 msg/sec max

How to write design doc

2018-10-19 00:00:00 +0000

Summary section

  • Roles of stakeholders. Declare it explicitly for each person. Important roles includes:
    • Perform: carry out
    • Accountable: one single owner
    • Control: review the results, can veto. binding advice
    • Suggest: non-binding advice
    • Informed: should know the result of activity. Do NOT have a say in how work is done
  • Overview
    • Read by all engineers to decide if they need to keep reading
    • 3 paragraphs max

Context section

Different teams should mostly interact in this section instead of the decision section. So that teams can be highly aligned on goals

  • Motivation
    • Non-dev should be able to read it
    • Show how the project is tied to the long-term goals
  • Assumptions
  • Goals
    • in the dimension of user-driven impact
    • measuable key results
  • Non-goals

Decision

Mostly for intra-team discussions.

  • Milestones
    • For PM and manager’s manager to quick read
    • Set up milestones and check points. Ideally, milestones should enable partial launch of the project
  • Current/Proposed solution
    • Probably need to include user stories (from different povs) to show how the solution behaves dynamically
    • De-risk the project ASAP, by focusing on the risk part and prototyping and scaffolding
    • Probably need to prototype before you are able to propose
    • Details go here. Use user stories to show how the system behaves
    • Include precise pseudo code and data structure
    • Someone didn’t write the doc but understands the problem should be able to implement the solution
  • Alternative and Rejected Solutions
  • Monitoring and Alerting
    • Probably include correctness invariant and testing strategy. Testing strategy needs to be detailed too
  • Cross-Team Impact
  • Discussion
    • If the discussion has more than 5 comments, in-person discussion is probably better
  • Detailed Scoping and Timeline
    • By engineers, tech lead, and managers, that is why it is near the end of the doc,i.e., end user is the consumer of the design docs without the scoping part
    • Update the scope throughput the project. Most likely you need to update your estimate after each milestone
    • Each task step should take 2-3 days max
    • 1.5x of your estimated dev time to include buffer for interrupt tasks
    • Timebox open-ended questions and commit to a solution within the box, but be aware of trade-off between time box and long term cost
    • Dropping the project or stop it right after certain milestone is an option too.

A example template

  • Context and motivation
  • Assumptions and prerequisites
  • Goals
    • in the dimension of user-driven impact
    • measuable key results
  • Non-goals
  • Role assignment for each member, use one of the following role:
    • Perform: carry out
    • Accountable: one single owner
    • Control: review the results, can veto. binding advice
    • Suggest: non-binding advice
    • Informed: should know the result of activity. Do NOT have a say in how work is done
  • Current/Proposed solution
  • Alternative/Rejected Solutions
  • Cross-Team Impact
  • Monitoring and Alerting
  • Milestones/Timelines

References

Change attitudes by changing behavior

2018-10-11 00:00:00 +0000

source

  • If we engage behaviours we didn’t expect to do, our thoughts and feelings are likely to change
  • Too much reward/penalty may lead to over justitifcation, i.e., a person may think it is more because of external reasons instead of internal reasons.
  • Praise focuses on the internal of activity, i.e., makes us feel good about ourselves due to our accomplishments, is more likely to increase our performance and the liking of the activity
  • When we review our past activity, and decide that we are not going to do anything different, this is often a sign of reducing congitive disssonance, i.e., makes us feel less bad about ourselves - but also introduces bias!
  • Given too many choice, people may worry more about oppurtunity cost, and thus creates cognitive dissonance - a free gift is better than choosing between the two!

Persuation techniques

2018-10-10 00:00:00 +0000

  • Make sure the other side FEELs that they won
  • To trigger other people’s bragging and make them feel good: time + scenario + slightly irritating/negative experience + positively peak as the end.
  • Provide three options instead of yes/no questions, so that people is more likely to give valid info, espeically when under stress

Decoy effect

  • When target and competitor has trade-offs not that obvious to make a decision, insert a decoy which is closer to the target, but clearly inferior to the . Research shows that people will be more likely to pick the target - as much as 40% difference
    • However, the decoy can not to too undesireable
  • More applicable to intuitative thinkers than analytical person
  • Somewhat similar to “door in the face” and “selling top of the line”, where the proposal is meant to be rejected. The difference is that those two in general aims for the more expensive proposals, whereas decoy could work in both directions

pre-giving

  • Give a small favor, ideally something physical, before persuation
  • Don’t wait too long between favor and persuation

foot-in-the-door

  • The smaller request in the similar domain followed by the larger (and larger) requests
  • The inital request should not have too much external incentives, to avoid the overjustification symptom
  • Note that this technique is most useful when the person’s self-image aligns with your persuation,i.e., from yes to bigger yes, instead of no to yes
  • Another scenario is waiting for a person’s acknowledgement before

Low-ball

  • Paint a pleasant mental picture as if the customer can get it with reduced price, even if we intend to sell it at higher price
  • The key is that pleasant mental picture should buy initial agreement, and the second one should not be too outrageous
  • Similar to bait-and-switch, but low-ball is more on money and single interaction, bait and switch has separate bait and actual sales activity

Understanding Kafka exactly once semantics

2018-10-10 00:00:00 +0000

The whole process involves many moving parts. I will approach it by viewing the process from each components POV.

Producer

  • Producer will register its transaction id/PID with the transactional coordinator (TC), and get a new epoch from TC, with sequence number = 0 inside producer’s memory

  • Producer start producing to partitions.

    • To broker, each message will include PID, epoch, and sequence number. After sending a batch to a parition, producer will increase its own seq # by 1.

    • To TC, producer will register the (topic, partition, PID, epoch). Note if we use Stream API, the consumer offset will be written too.

  • When producer commits, it will ask TC to write a PREPARE_COMMIT(PID, epoch) message internally.

Consumer

  • Assume the we set consumer isolation level at read committed

  • consumer will filter out messages from aborted transactions, and will block on the first message from ongoing transactions

  • Consumer will get a list of aborted PIDs whose offset range intersects with the fetch batch size from the consumer coordinator. This means filtering is done on consumer’s client side

Broker

  • The broker process also hosts the coordinator. Thus, the HA of 2PC coordinator is guaranteed via Kafka’s partition leader election.

Trasaction Coordinator

  • Transaction log just stores the status of transactions instead of actual message it self

    • Ongoing (PID, epoch) - analogous to the PREPARE-stage of 2PC

    • (PID, epoch, topic, partition) - information to commit for PRODUER

    • (PID, epoch, topic, partition, offset) - information to commit for CONSUMER

    • Prepare-commit(PID, epoch) - marks the PREPARE-stage of 2PC done. This means we will always

    • COMPLETED(PID, epoch) - end of log marker for 2PC

Consumer Coordinator

  • manages the internal consumer offsets topics

Numbers for performance and capacity estimation

2018-10-09 00:00:00 +0000

  1. mutex lock/unlock is around the same range as RAM access - 100 ns = 0.001 ms. L2 access at 7ns
  2. Reading 1 MB sequentially from RAM - 0.25 ms.
  3. Ping between AWS AZs. Remember to turn on ICMP protocol, which is IP protocol.
    • Same AZ, 0.3 ms
    • ap-northeast-1a to ap-northeast-1c, 2.3 ms
    • ap-northeast-1a to ap-northeast-1d, 1.6 ms
    • ap-northeast-1c to ap-northeast-1d, 0.8 ms
  4. Reading 1 MB from network 10 ms. Note this number seems to include the cost of copying from kernel state to user state
  5. Reading 1 MB sequentially from disk 30 ms, excluding seek time. Sequential read 1GB/s for SSD, 4GB/s for memory. Random read 4KB from SDD is about 0.15ms
    • Disk seek is 10ms on traditional, 0.1 ms on SSD.
    • For traditional disk, writing has similar performance as read, but need to add full rotation + blocksize/transfer rate
    • SSD is log-structured and has lots of errors, but masked on the on-device logic. Hard to survive multiple power outages but average failure time > 6 years
    • for SSD, seq write improves throughput (250 MB/s vs 25 MB/s) and latency is roughly same. Seq read and random read has roughly same throughput and latency.
  6. Roundtrip between CA and Netherland is 150ms. SF to NYC is about 40ms
  7. Compressing 1K bytes with Zippy is in the same range as sending 1k bytes over 1 Gbps network - 0.01ms
  8. Empty Object in Java is 8 bytes (Some sources claim 12?). Actuall size is padded to the closest 8 bytes > estimate

Numbers for capacity planing

There are just rule of thumb. The strongest data point is PoC on the projected workload

  • Nginx
    • at least 20k concurrent connections per machine
  • Tomcat
    • Between 200 to 2k QPS per instance. Use Little’s law to estimate
      • Time out for microservices, try not to make it > 1s, 90th percentile aim for 200ms
      • Each day has 86.4k seconds, although during estimate we probably want to use 40k instead of 80k to take sleep time into account.
      • Without support of other data, we can assume 80% of traffic happens within 20% of time,i.e., peak is 4 times of average traffic
    • Default thread pool size = 200. Max # of threads in OS - 3-5k. 500 threads in a single process is OK.
    • For OLTP workload, The most likely bouding factor is blocking calls, followed by CPU. During daily peaks, the CPU peak should be between 10 and 30 percent
  • Redis
    • 20k to 50k QPS, Mostly bounded by network instead CPU
    • CPU usage should not be > 15%
  • MySQL
    • Default connneciton pool size = 151. Note RDS connection pool size is based on the instance type. Normally keep 1.5k peak/sec per instance is no problem, but something to watch out for when it reachs 3k rps
    • InnoDB default lock wait_time is 50s, i.e., at least 50s before one deadlocked process exits
    • In OLTP workload, each txn should take no more than 10 ms on average on DB layer
    • c5.4xlarge (16 core,68 G ram) should be enough to hand 500 TPS OLTP workload. Less than that means problem with with code
  • Kafka
    • 95th percentile at 5 ms, with replication factor = 3, within the same DC
    • Most likely network and memory bound than CPU
    • 2M record per second on 3 node cheap machines in their standard benchmark

AWS cost saving tips

  • Recycle unattached EBS volumes. They are not reclaimed when you retire EC2 instances
  • Rarely you need more than 15k iops if it is not storage layer. 30k is more than enough for most storage layer
  • On-demand instance should not be more than 1/3 of the fleet. The rest will be a mixture of RI and spot instance

On Fermi Problems

2018-10-08 00:00:00 +0000

  • Often you have a factor or two you have NO domain knowledge about at all, then just give your estimate to your best effort. It works because it becomes a random walk on log scale when you multiply each steps. Therefore, the mean is getting closer to the expectation

  • The more steps you have, the higher the standard deviation becomes. However, you often have better estimate on the smaller steps. This is essentially a trade off. But more steps also suggest your mental model is better

  • If the guess is too far off from the actual value, the reason of error is more valuable than the result itself

  • Only keep the part that is the deciding factor in the estimation

How many people in the world are talking on their cell phones in any given minute?

Each month i use 3 hours of cell phone, so 1/ (8 * 30) chance i will be using cellphone * 8 billion people

How many iPhone screen repairmen are there in the United States?

400 mil people, 60% with iphone, will repair the phone once in 1.5 years. So 160 mil repairs each year. Each repair takes 2 hours so each person can repair 4 * 365 = 1500 per year, i.e., about 100k iphone repair person!

How many people in US die each year?

400 mil people, birth rate at 1.5 per woman, so in 80 years, about 300 mil people are born, each year 3.5 mil are born ~ roughly equal the # of deaths

The number of grains of sand on all the beaches of the world

Guidelines on monitors and alerting

2018-09-13 00:00:00 +0000

  • Feel free to add alerts, but only page when the symptom impacts user or the user-facing impact is very likely to ensue.
  • For alerts requires intervetion but not right away, email and slack is enough.
  • We highly recommend you select “Do Not Require” for sparse metrics, otherwise some evaluations will be skipped - but this may trigger more false alarms if the window is too short
  • One idea is to create a composite monitor, one for the errorneous data, and one for the amount of data, and the alert fires only when both are satisfied
  • Datadog generally recommends against using interpolation for count/rate metrics in monitors, i.e., interpolation works only with gauge metrics
  • Short evaluation window means even shorter time buckets. In monitors with division over short evaluation windows. If metrics for the numerator and denominator aren’t both available at query time, you could get unwanted evaluation values.

References

Datadog monitoring and alert for k8s

2018-09-12 00:00:00 +0000

  1. kubernetes_state.deployment.replicas_available should not be too far off from kubernetes_state.deployment.replicas_desired

  2. Keep a timeseries on # of running pods by node or replica set, and correlate it with resource metrics

  3. For cpu, memory usage, prefer standard metrics from docker, rather than from k8s. Similar princple applies to time-series data - but later on, you can break the data down by pod, and then filter by k8s labels (K8s labels are already applied to Docker metrics)

  4. Focus on monitoring sum of requests on hosts, instead of simple CPU and memory usage

  5. Group/filter your docker metrics by k8s labels instead of hosts

On distributed lock

2018-08-27 00:00:00 +0000

Requirements

  1. only 1 thread holds the lock
  2. no dead lock
  3. can be blocked, and can be waken up from blocked status
  4. Note that it is almost impossible to achieve 1 and 2 at the same time (See the analysis below). Normally it becomes a tradeoff.

Redis

  1. set expiry time on lock. Other contenders will get the expiry time so to know when to retry
    • the current holder will have a watchdog process to renew the lease.
    • setnx and expire can be set in a same command as set
  2. record lock holder to prevent unlocking by mistake.
  3. compare and unlock is done in an atomtic action
  4. Timeshift,e.g., NTP may trick early release
  5. need record the number of acquires and incr/dec accordingly. When dec it to 0, you know it is safe to delete.
  6. Note that on production common just to use Redisson, although it does NOT handle correctness in the case of master fail-over

Problem:

  1. TTL is hard to set it right
  2. Master - slave replication may lose data

Zookeeper

  1. Zookeeper has official lock recipe. Also checkout Apache curator * create a smallest ordered number in a lock dir
    • listen for the smallest value in that lock dir if it exsits already for its deletion event * On that listened val’s deletion, ask again if it is holding the lock
    • On releasing the lock, delete ordered number created in the first setp
  2. heartbeat to prevent dead lock
  3. caller thread can be blocked, and will be waken up via Watch
  4. Problem: lost heartbeat from old caller will trick ZK server to release the lock too early - the caller process itself is still up
  • To scale up,e.g., flash sale, common to use segmented locks on segmented contents, but this also means that the client code HAS to try different segments, if the current segment does not work

On Oral Communications

2018-07-29 00:00:00 +0000

  1. SLOW DOWN! Pause is better than filter pharses
  2. In terms of body languages: a. Lean forward b. Avoid huge movement, e.g., wave hands, shake heads c. Point your feet to the talker. d. Don’t cross your arms.It often sends a signal of defenseness
  3. Have some warm up talk/questions befoe moving onto the true questions
  4. Shut up so that people can move onto the topic they want to talk about
    • stay silent to see if the other side can pursue you
  5. Notice the difference between venter and explainer modei - this applies to both sides!
    • Provide three options instead of yes/no questions, so that people is more likely to give valid info, espeically when under stress
  6. When giving feedbacks, use “I” instead of “you”
  7. Avoid: a. Argument on the well meaning of the behavior b. Jargon if possible c. Comparison with other people d. When you are in the position of authority, must pay very close attention to what you say - it is very easy to get overly-analyzed e. Give feedback immediately, privately, but direct and honest, use concrete and recent example. Only criticize behavoior that can be improved
    • Instead of use “talk”, “speak” seems to work, e.g., “I wanna come down and I wanna speak to you…”
    • Use “something” instead of “anything”, e.g., “Is there something else you want to address in the visit today?”
    • Find a way to reply with equal or more amount of information to keep conversation going

On Handling Defects

2018-07-29 00:00:00 +0000

  1. Overall goal is to deliver value to the customer, not to build a perfect product, i.e., we have to be aware of the oppurtiny cost of doing so!

  2. Product owner (PO) needs to neigotiate quality cutoff line with the dev team

  3. Fix the defect from current sprint’s user stories. Otherwise, the backlog is not considered done! However, if the defect is not from the current spring, much more debatable (see blow)

  4. Fixing defect is more of a “now or never” mentality. If the defect is critial enough, most likely it will manifest itself again. Otherwise, the issue will rarely manifest, and we can probably close the issue with a “won’t fix”. However, we also have to keep in mind that the later we fix the defect, the more expensive it becomes

  5. For quick defect fixes, we may squeeze it into the current sprint. For bigger defects, we can put it in the backlog, and delete it if it lasts over two sprints - need to keep backlog short!

On Meeting Agendas

2018-07-16 00:00:00 +0000

Source

  1. Ask the meeting particpants to suggest itmes along with the reason

  2. Each topic should cover as many members as possible. Otherwise, people will disengage

  3. List topic in the form of specific problem to solve - question to answer

  4. For each topic, specify the purpose: share info, asking for input, or making a decision?

  5. Propose and agree on a process to address each topic. Also, identify the leader for each topic

  6. First item, however, is review and modify agenda

  7. For frequent meetings, ask for feedback to improve meeting process, e.g., preparation time, process effectiveness, off-topic time, topic time estimate..etc

Common concurrency problems

2018-07-04 00:00:00 +0000

Producer - consumer problem

Can have multiple producers - consumers


mutex = 0
cur_cnt = 0 
remain_cap = N //q with max size N
q = []

def producer(item):
	down(remain_cap) //Note the order of downs can't be swapped
	down(mutex)
	q.append(item)	
	up(mutex)
	up(cur_cnt)


def consumer():
	down(cur_cnt)
	down(mutex)
        item = q.pop()
	up(mutex)
	up(remain_cap)
	return item

Readers - writers problem


reader_count = 0
read_lock = 1
write_lock = 1

def read():
	down(read_lock)
	reader_count++
	if reader_count :
		down(write_lock)
	up(read_lock)
	read_op()
	down(read_lock)
	reader_count--
	if reader_count == 0 :
		up(write_lock)
	up(read_lock)
	
def write(v):
	down(write_lock)
	write_op(v)
	up(write_lock)

Dining philosophers problem

Cigarette smokers problem

Sleeping barber problem

On Emotion Control

2018-07-02 00:00:00 +0000

  1. When you enter “fight or flight” mode, you are losing the ability of thinking rationally. Also, your counterpart will sense your emotion and start feeling similar way
  2. Common sign is that your muscle gets tense too. You should start deep breathing - inhale through nose for 5 seconds, hole for a moment, and then exhale. Note this technique is common in fast-sleeping methods too.
  3. Standing up and walking around helps activiating thinking part of the brain
  4. Come up with a neutral phase to remind yourself to stay calm
  5. Trying separating thoughts and emotion. This distances yourself from feelings and helps understand WHY you feel this way
  6. Take a break, and give up a neutral reason to stand up and pause conversation
  7. Let the other side vent. Be empathetic and not defensive
  8. Fake your emotions sparsingly because it is hugely draining, i.e., only when you know your current emotion is really imappropriate for the current scenario
  9. To get out of the negative emotion, change environment and switch focus if possible. Disconnet from outside world - cut off interruptions and noises
  10. If you have a hard time getting out of negative emotion, try delaying your reaction as much as you can

Refereces

atCoder notes

2018-07-01 00:00:00 +0000

arc100_a: Linear Approximation

By observation, we can see that the answer is always the median of A[i] - i array. We can prove by moving any cut toward the median, and it will improve the solution. Note for the even N case either cut would do.

arc100_b: Equal Cut

Multi-dimension problem. We use the standard technique of searching on one, and analysis on others.

So we brute force on the central cut.

Claim: if we fix the central cut, we can directly infer the value of the other two cuts

Proof:

  1. For each half, the cut closet to sum(seg)/2 max min and min max AT THE SAME TIME, because the overall sum is same.

  2. If the global optimals are the on the same side, then we found it already.

  3. If the global optimals are on the different side, then it must be min of both sides and max of both sides. Note that we can not move up min or max any further, so no other choices are possible, i.e, we min max and max min AT THE SAME TIME

I was able to get greedy insight during the contest, but missing the A[i] > 0 requires means I was not able to come up with theliner time method!!!