But what I do know is that the most relevant questions will:
1) not have answers yet, or have answers that are inconsistent or mere "folklore";
2) impact many people if an answer is found;
3) be stated as hypotheses that can be checked.
Tim Menzies mentioned to me recently that the fallacies in Glass’s book “Facts and Fallacies of Software Engineering” could be a good starting point for some of the questions involving developers.
By: Sebastian Elbaum
C. Wohlin, "Empirical Software Engineering: Teaching Methods and Conducting Studies," in Empirical Software Engineering – Dagstuhl Seminar Proceedings (LNCS 4336), V. Basili, D. Rombach, K. Schneider, B. Kitchenham, D. Pfahl, and R. Selby, Eds., Springer Verlag, 2007, pp. 135-142.
Shari Pfleeger had an interesting article related to this topic in IEEE Computer, October 1999, entitled "Albert Einstein and Empirical Software Engineering" (doi: 10.1109/2.796106).
First, a relevant study pursues a research question that matters to practitioners, or that might affect groups of researchers, or both. Very often I see well-conducted studies, with careful designs and implementations, where the question being asked is either uninteresting or inconsequential, making the study just a good exercise in empirical software engineering. With so many good questions left to answer, this is really a shame.
Second, a relevant study can teach us something. A study that provides a new data point can contribute to the body of knowledge, a study that confirms a hypothesis in a new context can extend what we know, and a study that challenges an existing assumption can change the way we think or work. Clearly, there is a spectrum of relevance depending on the level of contribution of each piece of work. Not all studies will challenge the status quo, but those that do so soundly are often very relevant.
Third, for a study to be relevant it must be disseminated in the right venues, to the right audience. Two natural tendencies make this harder than it sounds. First, we often send our work to venues that we are familiar with, even when the best targets for our studies might be domain-specific venues where we can reach more relevant folks. Second, we are used to a presentation style that may not be appropriate for all venues, yet reusing that style is often "cheaper" (in the short term at least). These myopic tendencies can render the best studies "invisible" to the relevant audience.
I am pushing so that every paper we publish that contains any empirical component makes all the raw data available on the web, in a format that is easily readable and processable by others. I find that, even when papers include what I would deem complete sets of tables and figures, as a researcher I often have questions or doubts that require digging a bit deeper into an aspect of the work, and that often requires access to the raw data. Besides, such access is completely necessary for reproducing some studies.
Over the years of performing empirical studies on testing and analysis techniques, I have developed and converged towards a standard reporting structure. I believe the consistency of a standard report makes papers easier to follow, since readers know what to expect and where to expect it, and if it is not there then that will raise some questions. It is kind of neat to see that this type of format has pretty much propagated throughout the testing and analysis community in the last 10 years (even in the late 90s it was acceptable for a T&A paper with an empirical component not to state its limitations or threats).
Ideally, we would like the approaches we create to be evaluated by others who have no stake in the outcome. In practice, however, the time and cost of conducting studies, and the speed at which technical progress happens, make self-evaluation the standard practice.
One mitigating practice in our community is to compare the proposed technique or approach against others, and perhaps use standard methods and artifacts to avoid selecting the "right context for my tool". But again, that fails to address a lot of things happening under the hood as the study is conducted and reported.
Note that systematic reviews address this issue only tangentially, as they work with existing reported data that may already be biased.
Dr. Basili argues that understanding a discipline involves observation, model building, experimentation, encapsulation of knowledge, and evolving knowledge over time. Software engineering is just like any other discipline, requiring an empirical approach to be understood. That is why we have empirical software engineering. What is unique about it is that it applies to software products and processes.
Empirical studies in software engineering are probably not that different from studies in areas such as psychology, sociology, or education, which is why many of our methods are drawn from these areas.