How Developers’ Collaborations Identified from Different Sources Tell us About Code Changes
04/09/2023
How Developers’ Collaboration Identified from Different Sources
Tell us About Code Changes
Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol
Outline
Context and Motivations
- Software Development
Case Study
- Seven Open Source Projects
Results
- Evaluation of Developers’ Collaboration Identified from Different Sources
- Application of Networks Obtained from Different Sources
Different Sources of Information…
“…In everybody’s experience, different communication channels play different, sometimes complementary sometimes alternative, roles: news can be gathered (and shared) from the radio, by reading a newspaper, watching a TV broadcast or surfing blogs.”
Academic Paper Preparation
“Study Design”
“Results”
“Abstract and Introduction”
“Conclusion”
Once the Paper is Ready…
We face an important decision: determine the right order of the authors…
Focusing on a single source:
I would put Bavota as first author…
I would put Panichella as first author…
Merging all the sources:
I would put Panichella as first author…
I say Panichella…
I would put Panichella as first author…
I would put Bavota as first author…
I say Bavota…
Software Development Environment
Example: Hibernate OSS Project
Previous Work…
- Bird et al. - MSR 2006
- Canfora et al. - FSE 2012
- Guzzi et al. - MSR 2013
- Elliot et al. - ACM GROUP 2003
IRC CHAT LOG
VERSIONING SYSTEM
ISSUE TRACKER
MAILING LIST
How Do Developers’ Collaboration Networks Identified from Different Sources Differ?
Case Study
Goal: investigate how different communication channels provide different views of developers’ interactions, and how using such information in recommender systems could produce different results.
Research questions:
• RQ1: To what extent do developers discuss through the different communication channels?
• RQ2: How do the inferred links between developers overlap when using different sources of information?
• RQ3: How do social network metrics change when using different sources, and how would this affect the use of such information to build recommenders?
Context - Objects
Project         Period                 KLOC
Apache HTTPD    June 2011-June 2013    2,021–2,240
Apache CXF      June 2011-June 2013    593–771
Hibernate       June 2011-June 2013    984–1,096
Infinispan      June 2011-June 2013    146–286
Apache Lucene   June 2011-June 2013    198–437
Samba           June 2010-June 2012    1,278–1,426
Weld            June 2011-June 2013    108–139
Data Extraction
[Figure labels: Class 1, Class 2, Class 3, Class 4]
Identifying people that use more than one source
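The step of identifying people who use more than one source can be sketched as follows. This is only a minimal illustration, not the procedure used in the study: the normalization heuristic (email local part, lowercased, punctuation removed), the function names, and the sample identities are all assumptions.

```python
import re
from collections import defaultdict


def normalize(identity: str) -> str:
    """Reduce a name or email address to a crude comparison key (assumed heuristic)."""
    identity = identity.strip().lower()
    if "@" in identity:
        identity = identity.split("@", 1)[0]  # keep only the local part of an email
    # Drop everything except letters and digits so "John.Doe" matches "john doe".
    return re.sub(r"[^a-z0-9]", "", identity)


def people_in_multiple_sources(sources):
    """Map each normalized identity to the sources it appears in,
    keeping only identities seen in more than one source."""
    seen = defaultdict(set)
    for source, identities in sources.items():
        for identity in identities:
            seen[normalize(identity)].add(source)
    return {key: srcs for key, srcs in seen.items() if len(srcs) > 1}


# Hypothetical sample data, not taken from the studied projects.
sample = {
    "mail": ["john.doe@example.org", "Alice Smith"],
    "issue": ["John Doe", "bob"],
    "chat": ["johndoe", "alice_smith"],
}
shared = people_in_multiple_sources(sample)
```

A heuristic this simple will both miss aliases and merge distinct people with similar names, so in practice the matches would need manual checking.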
RQ1: To what extent do developers discuss through the different communication channels?
[Figure panels: Apache Httpd, Apache Lucene, Samba, Hibernate, Apache CXF, Infinispan, Weld]
While in the past developers used emails as the main communication channel, nowadays they massively use chats or issue trackers.
Developers mainly use two out of three communication channels, whereas the third one is only used sporadically.
Developers Overlap between Different Sources
ISSUE and CHAT: <35%
ISSUE and MAIL: 56%
MAIL and CHAT: <50%
MAIL and ISSUE: 86%
[Figure panels: Apache Httpd, Apache Lucene, Samba, Hibernate, Apache CXF, Infinispan, Weld]
RQ2: How do the inferred links between developers overlap when using different sources of information?
ISSUE and CHAT: <26%
ISSUE and MAIL: 38%
MAIL and CHAT: <20%
MAIL and ISSUE: 30%
[Figure panels: Apache Httpd, Apache Lucene, Samba, Hibernate, Apache CXF, Infinispan, Weld]
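One way to quantify how the links inferred from two sources overlap is sketched below. The transcript does not state the exact measure, so the directional ratio used here (shared links divided by the links of the first source) is an assumption; it would be consistent with the slides reporting different values for "ISSUE and MAIL" and "MAIL and ISSUE". The function name and the developer pairs are hypothetical.

```python
def link_overlap(links_a, links_b):
    """Percentage of links in network A that also appear in network B.

    Links are treated as undirected, so each pair is stored as a frozenset.
    """
    a = {frozenset(pair) for pair in links_a}
    b = {frozenset(pair) for pair in links_b}
    if not a:
        return 0.0
    return 100.0 * len(a & b) / len(a)


# Hypothetical developer pairs, not data from the study.
issue_links = [("ann", "bob"), ("bob", "carl"), ("ann", "dave"), ("carl", "dave")]
mail_links = [("bob", "ann"), ("carl", "dave")]

issue_vs_mail = link_overlap(issue_links, mail_links)  # 2 of the 4 issue links
mail_vs_issue = link_overlap(mail_links, issue_links)  # 2 of the 2 mail links
```

Because the denominator is the first network only, the measure is not symmetric: a small network can be fully contained in a large one while the reverse overlap stays low.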
During an IRC Chat Meeting
“but we also need to create the attributes and values in the entity binding..”
1) Brainstorming
“is there a better way? dunno like I said this is brainstorming and I have not given lots of thought to these cases”
2) Planning (e.g. Testing activities)
“however planning a pure standalone test suite would make things easier...”
3) Open an Issue
“okay I think it is a bug and I’m going to create a jira first”
Similarity Measure of Topics Extracted from Different Communication Channels
Project         issues vs. mails   issues vs. chat   mails vs. chat
Apache Httpd    0.17               0.09              0.06
Apache CXF      0.86               0.11              0.01
Hibernate       0.11               0.02              0.03
Infinispan      0.07               0.03              0.03
Apache Lucene   0.08               0.03              0.02
Samba           0.06               0.02              0.02
Weld            0.11               0.04              0.03
In every project, the issues vs. mails similarity is the highest of the three.
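The transcript does not say how these similarity values are computed. As a rough illustration only, the sketch below scores two channels by the Jaccard overlap of their most frequent terms; the choice of Jaccard, the tokenization, and the sample term sets are all assumptions rather than the method of the study.

```python
import re
from collections import Counter


def top_terms(messages, k=5):
    """Return the k most frequent lowercase words across the messages."""
    words = Counter()
    for message in messages:
        words.update(re.findall(r"[a-z]+", message.lower()))
    return {word for word, _ in words.most_common(k)}


def jaccard(a, b):
    """Jaccard similarity: shared elements over all distinct elements."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical term sets for two channels.
issue_terms = {"bug", "fix", "test"}
chat_terms = {"bug", "test", "plan", "idea"}
similarity = jaccard(issue_terms, chat_terms)  # 2 shared terms out of 5 distinct
```

A value near 0 means the two channels rarely share vocabulary, which is how low entries in a table like the one above would be read under this assumed measure.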
RQ3: How do social network metrics change when using different sources, and how would this affect the use of such information to build recommenders?
Social Network Metrics:
- Identifying high-degree developers;
- Identifying mentors (Canfora et al. - FSE 2012).
Social Network Metrics vs. Code Changes:
- Correlation between social roles and change activities (replicating the study by Bird et al. - MSR 2006).
Mentors Overlap between Different Sources
Considering ALL SOURCES: <41%
MAIL and ISSUE: 47%
[Figure panels: Apache Httpd, Apache Lucene, Samba, Hibernate, Apache CXF, Infinispan, Weld]
High Degree Contributors Overlap between Different Sources
Considering ALL SOURCES: <36%
MAIL and ISSUE: 46%
[Figure panels: Apache Httpd, Apache Lucene, Samba, Hibernate, Apache CXF, Infinispan, Weld]
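The idea of comparing high-degree contributors across sources can be sketched as below. This is a minimal example under stated assumptions: degree is taken as the number of distinct neighbours in an undirected network, "high degree" means the top k developers, and the networks are hypothetical. The transcript does not specify the threshold the study uses.

```python
from collections import defaultdict


def degrees(links):
    """Count the distinct neighbours of each developer in an undirected network."""
    neighbours = defaultdict(set)
    for a, b in links:
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {dev: len(others) for dev, others in neighbours.items()}


def high_degree(links, k):
    """Return the k developers with the highest degree (ties broken by name)."""
    deg = degrees(links)
    ranked = sorted(deg, key=lambda dev: (-deg[dev], dev))
    return set(ranked[:k])


# Hypothetical networks, not data from the study.
mail = [("ann", "bob"), ("ann", "carl"), ("ann", "dave"), ("bob", "carl")]
issue = [("bob", "ann"), ("bob", "dave"), ("bob", "eve"), ("carl", "eve")]

top_mail = high_degree(mail, 2)
top_issue = high_degree(issue, 2)
shared = top_mail & top_issue  # developers ranked high in both sources
```

In this toy data only one developer is in the top two of both networks, which mirrors the point of the slide: the set of high-degree contributors depends on the source used to build the network.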
Ohloh Kudos Score
Kudos score: the level of appreciation or respect earned by a developer working on a project. It is based on the judgement of other project members.
http://www.ohloh.net/p/apache/contributors
Issue, Chat and Email to Identify Leaders
[Figure: Precision in Recommending Leaders for Apache Lucene and Samba; axis from 0% to 90%, bar labels between 0% and 80%]
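Precision in recommending leaders can be computed as sketched below, assuming a reference set of leaders is available (for example, one derived from a score such as Ohloh Kudos; how the study builds its reference set is not stated in the transcript). The function name and the developer names are hypothetical.

```python
def precision(recommended, actual_leaders):
    """Fraction of recommended developers that are in the reference set of leaders."""
    recommended = set(recommended)
    if not recommended:
        return 0.0
    return len(recommended & set(actual_leaders)) / len(recommended)


# Hypothetical recommendation and reference set.
recommended = ["ann", "bob", "carl", "dave", "eve"]
reference = {"ann", "carl", "eve", "frank"}
score = precision(recommended, reference)  # 3 of the 5 recommendations are leaders
```

Running the same computation on the recommendations derived from each source (issues, chat, email) gives one precision value per source, which is the kind of comparison the chart reports.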
Replication of the Work by Bird et al.
Bird et al. - MSR 2006
“Developers who actually commit changes, play much more significant roles in the email community than non-developers”
Social Network Metrics vs. Source Code Changes
[Figure: SNA Metrics vs. Code Metrics, Apache HTTPD]
Do the results vary when we consider, for example, issue trackers?
[Figure panels: Apache HTTPD; Hibernate]
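A correlation between a social network metric and change activity can be computed as a rank correlation, as in the sketch below. It uses Spearman's rho implemented from scratch; the transcript does not name the coefficient used in the study, so this choice is an assumption, and the per-developer values are hypothetical.

```python
from math import sqrt


def ranks(values):
    """Assign 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            result[idx] = average
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx and sy else 0.0


# Hypothetical per-developer values: network degree and number of commits.
degree = [12, 7, 3, 9, 1]
commits = [150, 80, 10, 40, 5]
rho = spearman(degree, commits)
```

Repeating the computation with the degree taken from a different source (for example, the issue tracker instead of the mailing list) shows whether the relationship with code changes holds across sources.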
Conclusion