The 20.8 design problem for PageRank outlines two approaches. The partitioned graph approach says “each machine loads its vertices and their outgoing edges”. I wonder this should be incoming edges. If one machine does not have knowledge about all incoming links for a particular vertex then how matrix multiplication for computing rank of the vertex can be done in one machine?
Now that I think more, it can work both ways. If we store outgoing edges then each machine will do multiplication of its page rank and weights and send out the value to all machines responsible for outgoing vertices. Then those machine will do addition to complete the matrix multiplication for its vertices.