We are often interested in infering the topology of a network. Specifically, we want to know which nodes form part of a subnetwork. We can think of a network as a collection of nodes (computers) that connect to some monitors (servers) through some routers.
When a node pings a monitor (e.g., google), we can measure the distance (measured in hop counts) between such node and such monitor. We can construct a distance matrix where each row represents a monitor, and each column represents a node. The ith column gives the distances between the ith node and all the monitors. But since not every node pings every monitor, we have incomplete data.
In a simplified model, if node i forms part of subnet k, then the ith column is given by the sum of (a) the distance between node i and router k, and (b) the distance between router kand each of the monitors:
Hence, these measurements lie in a union of subspaces. If we cluster the columns according to their subspaces, we can determine the subnets.