eConference is a text-based conferencing tool that supports distributed teams in need of synchronous communication and structured discussion services. Besides offering communication services, it integrates an agenda and minutes editor, plus other control and coordination features, such as hand raising and threaded discussion. The current version of the tool is based on Eclipse RCP and uses the eXtensible Messaging and Presence Protocol (XMPP) as its only communication infrastructure. This paper presents work in progress on porting eConference to the Eclipse Communication Framework (ECF), which will enable us to abstract from the underlying communication protocol, better decouple components, and build additional team-support services.
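The idea of abstracting from the underlying communication protocol can be illustrated with a minimal sketch. It is a language-neutral illustration written in Python; ECF itself is a Java framework, and every name here (`ChatTransport`, `InMemoryTransport`, `Conference`, the `/raise-hand` marker) is hypothetical and not taken from ECF or eConference.

```python
from abc import ABC, abstractmethod


class ChatTransport(ABC):
    """Hypothetical protocol-neutral interface the tool would code against."""

    @abstractmethod
    def send(self, room: str, sender: str, text: str) -> None:
        """Deliver a message to a room."""

    @abstractmethod
    def history(self, room: str) -> list:
        """Return the messages delivered to a room, in order."""


class InMemoryTransport(ChatTransport):
    """Stand-in backend; an XMPP-based backend would implement the same interface."""

    def __init__(self):
        self._rooms = {}

    def send(self, room, sender, text):
        self._rooms.setdefault(room, []).append((sender, text))

    def history(self, room):
        return list(self._rooms.get(room, []))


class Conference:
    """Conference features depend only on ChatTransport, not on a protocol."""

    def __init__(self, transport: ChatTransport, room: str):
        self._transport = transport
        self._room = room

    def say(self, participant: str, text: str) -> None:
        self._transport.send(self._room, participant, text)

    def raise_hand(self, participant: str) -> None:
        # Hand raising is modeled here as a control message (an assumption).
        self._transport.send(self._room, participant, "/raise-hand")

    def transcript(self) -> list:
        return self._transport.history(self._room)


conference = Conference(InMemoryTransport(), "sprint-review")
conference.raise_hand("alice")
conference.say("alice", "Can we revisit item 2 of the agenda?")
```

Under this design, swapping the communication protocol means supplying another `ChatTransport` implementation, while `Conference` stays unchanged.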
Code comprehension has recently been investigated from physiological and cognitive perspectives using medical imaging devices. Floyd et al. (hereafter, the original study) used fMRI to classify the type of comprehension tasks performed by developers and to relate the results to their expertise. We replicate the original study using lightweight biometric sensors. Our study participants (28 undergraduates in computer science) performed comprehension tasks on source code and natural language prose. We developed machine learning models to automatically identify the kind of task developers are working on, leveraging their brain-, heart-, and skin-related signals. The largest improvement over the original study's performance is achieved using only the heart signal obtained through a single device (BAC 87% vs. 79.1%). Unlike the original study, we did not observe a correlation between the participants' expertise and the classifier performance (τ = 0.16, p = 0.31). Our findings show that lightweight biometric sensors can be used to accurately recognize comprehension tasks, opening interesting scenarios for research and practice.
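Assuming BAC denotes balanced accuracy (the mean of per-class recall), the metric reported above can be sketched as follows. The labels and predictions are made-up illustrative values, not data from the study, and `balanced_accuracy` is a hypothetical helper name.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, predicted in zip(y_true, y_pred):
        total[truth] += 1
        if truth == predicted:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical task labels: four code tasks and two prose tasks.
y_true = ["code", "code", "code", "code", "prose", "prose"]
y_pred = ["code", "code", "code", "prose", "prose", "prose"]

# Recall is 3/4 for "code" and 2/2 for "prose", so BAC = (0.75 + 1.0) / 2.
bac = balanced_accuracy(y_true, y_pred)
```

Unlike plain accuracy, this measure weights each task type equally even when one type has more samples than the other.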
Recent studies on Stack Overflow show that question-related factors, such as conveyed sentiment and presentation quality, can significantly influence the probability of obtaining a useful answer. At the same time, mentorship provided by human experts has proven effective in helping novice users write effective questions. In line with previous empirical findings, we developed QAvMentor, a tool capable of providing online, real-time, automated mentorship during question-writing in Stack Overflow.
Creating a successful and sustainable Open Source Software (OSS) project often depends on the strength and the health of the community behind it. Current literature explains the contributors' lifecycle, starting with the motivations that drive people to contribute and the barriers to joining OSS projects, and covering developers' evolution until they become core members. However, the stages when developers leave the projects are still underexplored and are not well defined in existing developers' lifecycle models. In this position paper, we enrich the knowledge about the leaving stage by identifying sleeping and dead states, representing temporary and permanent breaks that developers take from contributing. We conducted a preliminary set of semi-structured interviews with active developers. We analyzed the answers by focusing on defining and understanding the reasons for the transitions to/from sleeping and dead states. This paper raises new questions that may guide further discussions and research, which may ultimately benefit OSS communities.
Sentiment analysis is increasingly used to study software developers' emotions by mining crowd-generated content within software repositories and information sources. With a few notable exceptions, empirical software engineering studies have exploited off-the-shelf sentiment analysis tools. However, such tools have been trained on non-technical domains and general-purpose social media, thus resulting in misclassifications of technical jargon and problem reports. In particular, Jongeling et al. show how the choice of the sentiment analysis tool may impact the conclusion validity of empirical studies because these tools not only disagree with human annotation of developers' communication channels, but also disagree among themselves. Our goal is to move beyond the limitations of off-the-shelf sentiment analysis tools when applied in the software engineering domain. Accordingly, we present Senti4SD, a sentiment polarity classifier for software developers' communication channels. Senti4SD exploits a suite of lexicon-based, keyword-based, and semantic features for appropriately dealing with the domain-dependent use of a lexicon. We built a Distributional Semantic Model (DSM) to derive the semantic features exploited by Senti4SD. Specifically, we ran word2vec on a collection of over 20 million documents from Stack Overflow, thus obtaining word vectors that are representative of developers' communication style. The classifier is trained and validated using a gold standard of 4,423 Stack Overflow posts, including questions, answers, and comments, which were manually annotated for sentiment polarity. We release the full lab package, which includes both the gold standard and the emotion annotation guidelines, to ease the execution of replications as well as new studies on emotion awareness in software engineering.
To inform future research on word embedding for text categorization and information retrieval in software engineering, the replication kit also includes the DSM. Our empirical evaluation assesses the contribution of the lexicon-based, keyword-based, and semantic features under different feature settings. With respect to SentiStrength, a mainstream off-the-shelf tool that we use as a baseline, Senti4SD reduces the misclassifications of neutral and positive posts as emotionally negative. Furthermore, we provide empirical evidence of better performance even in the presence of a minimal set of training documents.
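One plausible way to turn word vectors into semantic features for a polarity classifier is to compare a post's vector with prototype vectors for each polarity class. The sketch below illustrates that idea only; the abstract does not specify how Senti4SD computes its semantic features, and the toy three-dimensional vectors, the word lists, and all function names are assumptions made for illustration.

```python
import math

# Toy 3-dimensional word vectors with hypothetical values; a real DSM
# trained with word2vec would provide vectors with hundreds of dimensions.
WORD_VECTORS = {
    "great": [0.9, 0.1, 0.0],
    "thanks": [0.8, 0.2, 0.1],
    "crash": [0.1, 0.9, 0.0],
    "broken": [0.0, 0.8, 0.2],
    "compile": [0.1, 0.1, 0.9],
}
DIMENSIONS = 3


def doc_vector(tokens):
    """Sum the vectors of known tokens; out-of-vocabulary tokens are skipped."""
    total = [0.0] * DIMENSIONS
    for token in tokens:
        vector = WORD_VECTORS.get(token)
        if vector is not None:
            total = [a + b for a, b in zip(total, vector)]
    return total


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Prototype vectors built from small hypothetical polarity word lists.
positive_prototype = doc_vector(["great", "thanks"])
negative_prototype = doc_vector(["crash", "broken"])

post_tokens = "thanks great fix".split()  # "fix" is out of vocabulary here
post_vector = doc_vector(post_tokens)

# Each similarity would be one numeric feature fed to the classifier.
features = {
    "sim_positive": cosine(post_vector, positive_prototype),
    "sim_negative": cosine(post_vector, negative_prototype),
}
```

In this toy setting the post lies closer to the positive prototype than to the negative one; in practice such similarities would sit alongside the lexicon-based and keyword-based features rather than replace them.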
Description: Datasets for sentiment analysis and emotion mining, distributed with the Emotion Mining Toolkit (EMTk) Docker container (see https://collab-uniba.github.io/EMTk for more). Stack Overflow: a couple of gold standards of 4,000+ posts, manually annotated for mining both emotions and polarity. Jira: a gold standard of ~4,000 issues, manually annotated for emotions. Citation: Please see the references below for the papers to cite. Do not cite this Zenodo upload directly.
Large enterprise organizations have software development teams distributed over multiple geographical sites. Because of distance, enterprises face challenges similar to those that open source software (OSS) projects have experienced in the past. OSS projects overcame the problem of distance through both development practices and Collaborative Software Development (CSD) platforms, wholly made up of asynchronous tools. Generic groupware platforms, however, offer both same-time and different-time options for communication and collaboration. We intend to understand whether distant developers can benefit from synchronous functions, in addition to asynchronous ones, for cross-site cooperation. As a first step, this paper compares CSD platforms and generic groupware with respect to the functions they support. As a result, we propose extending CSD platforms with synchronous functions, such as those available in widespread groupware platforms.
Building trust among remote developers is challenging because trust typically grows through close face-to-face interaction. In this paper, we present the preparatory design of an empirical study aimed at assessing whether affective trust, established through social communication between developers, is a predictor of successful collaboration in distributed projects. Specifically, we intend to measure affective trust through sentiment analysis of pull-request comments.