The OCEAN mailing list data set: Network analysis spanning mailing lists and code repositories
Conference Year
January 2022
Abstract
Communication surrounding open-source projects largely occurs outside the repository. Historically, large communities have used collections of mailing lists to discuss their projects. Because software development and communication happen on different channels, open-source projects are harder to study. Here, we combine and standardize mailing lists of the Python community, as well as the Golang, Angular and Node.js communities. We focus on the CPython repository and merge the technical layer (GitHub accounts collaborating on files) with the social layer (emails), identifying 33% of GitHub contributors in the mailing-list data. We then explore correlations between social messaging and the structure of the collaboration network.
Primary Faculty Mentor Name
Laurent Hébert-Dufresne
Faculty/Staff Collaborators
Jean-Gabriel Young, James Bagrow
Status
Graduate
Student College
College of Engineering and Mathematical Sciences
Program/Major
Computer Science
Primary Research Category
Arts & Humanities