The PageRank component
To build the network, we primarily use data that is freely available in the GitHub Archive. When you sign up to Quine, we complement this with extra data points to provide the most accurate approximation of the ecosystem.
We have three types of edges in the network:
Repo → Repo, which represents dependencies (a repository lists another repo as a dependency, e.g. in package.json or requirements.txt).
Repo → Developer, which represents commit events in the network (a developer commits to a repo).
Developer → Repo, which represents stargazer events in the network (a developer stars a repo).
By drawing these edges between stargazers, repos and contributors, we complete a path where reputation flows from the stargazers and dependencies to the relevant contributors.
If we consider all such edges within GitHub, we will construct a large directed network where reputation travels from one developer to another.
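As a rough illustration, the three edge types can be assembled into a single directed graph. The sketch below is not the DevRank implementation: the event tuples, the `repo:`/`dev:` node names, and the helpers `build_graph` and `reachable` are all hypothetical, and it assumes that a dependency edge points from the depending repo to the repo it depends on.

```python
from collections import defaultdict

# Hypothetical toy events; real input would be derived from GitHub Archive data.
dependencies = [("repo:app", "repo:lib")]  # (depending repo, repo it depends on)
commits = [("dev:alice", "repo:lib")]      # (developer, repo committed to)
stars = [("dev:bob", "repo:app")]          # (developer, starred repo)


def build_graph(dependencies, commits, stars):
    """Return an adjacency map {source node: set of target nodes}."""
    graph = defaultdict(set)
    for dependent, dependency in dependencies:
        graph[dependent].add(dependency)   # Repo -> Repo (assumed direction)
    for developer, repo in commits:
        graph[repo].add(developer)         # Repo -> Developer
    for developer, repo in stars:
        graph[developer].add(repo)         # Developer -> Repo
    return graph


def reachable(graph, start):
    """Return every node that can be reached from `start` along directed edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for target in graph.get(node, ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


graph = build_graph(dependencies, commits, stars)
# The star from bob reaches alice via: dev:bob -> repo:app -> repo:lib -> dev:alice
path_exists = "dev:alice" in reachable(graph, "dev:bob")
```

In this toy graph, the star that bob gives to `repo:app` reaches alice through the dependency on `repo:lib`, which is the kind of developer-to-developer path described above.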
DevRank uses the PageRank algorithm on this resulting network to compute the stationary state probabilities of a random walker in the network. These raw probabilities indicate the importance of a developer within the network, in the same way that PageRank calculates the importance of web pages. Crucially, a link from a high-profile source in the network is worth more than a link from a lower-profile source.
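To make the random-walker idea concrete, here is a minimal power-iteration sketch of PageRank on a toy graph. It is an illustration under stated assumptions, not DevRank's actual code: the damping factor of 0.85, the fixed iteration count, the uniform handling of nodes without outgoing edges, and all node names are assumptions that do not come from this page.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Approximate the stationary probabilities of a random walker.

    `graph` maps each source node to a list of target nodes.
    `damping` and `iterations` are assumed values, not DevRank settings.
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Probability held by nodes with no outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in graph.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy network: alice and bob each contribute to exactly one repo,
# but alice's repo has two stargazers while bob's has only one.
toy_graph = {
    "dev:fan1": ["repo:popular"],
    "dev:fan2": ["repo:popular"],
    "dev:fan3": ["repo:obscure"],
    "repo:popular": ["dev:alice"],
    "repo:obscure": ["dev:bob"],
}
ranks = pagerank(toy_graph)
```

Here alice and bob each receive a single incoming edge, yet alice ends up with the higher score because her edge comes from the better-endorsed repo.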
In sum, the PageRank component models reputation as a number proportional to the endorsements that can be attributed to your commits.
A white paper about DevRank and Stargazer Reputation is in the works and will be published soon.