Numbers for performance and capacity estimation
- mutex lock/unlock is around the same range as RAM access - 100 ns = 0.001 ms. L2 access at 7ns
- Reading 1 MB sequentially from RAM - 0.25 ms.
- Ping between AWS AZs. Remember to turn on ICMP protocol, which is IP protocol.
- Same AZ, 0.3 ms
- ap-northeast-1a to ap-northeast-1c, 2.3 ms
- ap-northeast-1a to ap-northeast-1d, 1.6 ms
- ap-northeast-1c to ap-northeast-1d, 0.8 ms
- Reading 1 MB from network 10 ms. Note this number seems to include the cost of copying from kernel state to user state
- Reading 1 MB sequentially from disk 30 ms, excluding seek time. Sequential read 1GB/s for SSD, 4GB/s for memory. Random read 4KB from SDD is about 0.15ms
- Disk seek is 10ms on traditional, 0.1 ms on SSD.
- For traditional disk, writing has similar performance as read, but need to add full rotation + blocksize/transfer rate
- SSD is log-structured and has lots of errors, but masked on the on-device logic. Hard to survive multiple power outages but average failure time > 6 years
- for SSD, seq write improves throughput (250 MB/s vs 25 MB/s) and latency is roughly same. Seq read and random read has roughly same throughput and latency.
- Roundtrip between CA and Netherland is 150ms. SF to NYC is about 40ms
- Compressing 1K bytes with Zippy is in the same range as sending 1k bytes over 1 Gbps network - 0.01ms
- Empty Object in Java is 8 bytes (Some sources claim 12?). Actuall size is padded to the closest 8 bytes > estimate
Numbers for capacity planing
There are just rule of thumb. The strongest data point is PoC on the projected workload
- Nginx
- at least 20k concurrent connections per machine
- Tomcat
- Between 200 to 2k QPS per instance. Use Littleās law to estimate
- Time out for microservices, try not to make it > 1s, 90th percentile aim for 200ms
- Each day has 86.4k seconds, although during estimate we probably want to use 40k instead of 80k to take sleep time into account.
- Without support of other data, we can assume 80% of traffic happens within 20% of time,i.e., peak is 4 times of average traffic
- Default thread pool size = 200. Max # of threads in OS - 3-5k. 500 threads in a single process is OK.
- For OLTP workload, The most likely bouding factor is blocking calls, followed by CPU. During daily peaks, the CPU peak should be between 10 and 30 percent
- Between 200 to 2k QPS per instance. Use Littleās law to estimate
- Redis
- 20k to 50k QPS, Mostly bounded by network instead CPU
- CPU usage should not be > 15%
- MySQL
- Default connneciton pool size = 151. Note RDS connection pool size is based on the instance type. Normally keep 1.5k peak/sec per instance is no problem, but something to watch out for when it reachs 3k rps
- InnoDB default lock wait_time is 50s, i.e., at least 50s before one deadlocked process exits
- In OLTP workload, each txn should take no more than 10 ms on average on DB layer
- c5.4xlarge (16 core,68 G ram) should be enough to hand 500 TPS OLTP workload. Less than that means problem with with code
- Kafka
- 95th percentile at 5 ms, with replication factor = 3, within the same DC
- Most likely network and memory bound than CPU
- 2M record per second on 3 node cheap machines in their standard benchmark
AWS cost saving tips
- Recycle unattached EBS volumes. They are not reclaimed when you retire EC2 instances
- Rarely you need more than 15k iops if it is not storage layer. 30k is more than enough for most storage layer
- On-demand instance should not be more than 1/3 of the fleet. The rest will be a mixture of RI and spot instance