Insights

1. Consider if a node u can be reached from two police nodes, v and w, with distance d(v) and d(w), assume d(v) <= d(w), then we know the path w....u can be broken up. More specifically, the last edge can be deleted, since we know it is redundant for sure,i.e., in an optimal solution, we know for sure that such d(v) <= d(w) case won't happen, i.e., each node is cover by exactly one police node. 

2.At the same time, it becomes obvious that when such condition suffices, this solution is optimal => otherwise the requires each node is covered is no longer true!

3. Therefore, whenever we can discover a node covered by 2 police nodes, we can delete at least 1 edge, until we reach the optimal solution

But how to proceed from here?!!!

Instead of trying delete edges, we can add edges from police nodes!

Therefore, we start from each police node, and expand them until we cover the whole tree. Note that because we want to choose d(v) instead of d(w), we have to expand the police nodes in a breadth first way, so that every time we reach a new node, we know it is the shortest distance to any police node