But what I do know is that the most relevant questions will:
1) not have answers yet, or have answers that are inconsistent or mere "folklore";
2) impact many people if an answer is found;
3) be stated as hypotheses that can be checked.
Tim Menzies mentioned to me recently that the fallacies in Glass’s book “Facts and Fallacies of Software Engineering” could be a good starting point for some of the questions involving developers.
By: Sebastian Elbaum
C. Wohlin, "Empirical Software Engineering: Teaching Methods and Conducting Studies," in Empirical Software Engineering – Dagstuhl Seminar Proceedings (LNCS 4336), V. Basili, D. Rombach, K. Schneider, B. Kitchenham, D. Pfahl, and R. Selby, Eds., Springer Verlag, 2007, pp. 135-142.
Shari Pfleeger had an interesting article related to this topic in IEEE Computer, October 1999, entitled "Albert Einstein and Empirical Software Engineering" (doi: 10.1109/2.796106).
First, a relevant study pursues a research question that matters to practitioners, or that might affect groups of researchers, or both. Very often I see well-conducted studies, with careful designs and implementations, where the question being asked is either uninteresting or inconsequential, making the study just a good exercise in empirical software engineering. With so many good questions left to answer, this is really a shame.
Second, a relevant study can teach us something. A study that provides a new data point can contribute to the body of knowledge, a study that confirms a hypothesis in a new context can extend what we know, and a study that challenges an existing assumption can change the way we think or work. Clearly, there is a spectrum of relevance depending on the level of contribution of each piece of work. Not all studies will challenge the status quo, but those that do so soundly are often very relevant.
Third, for a study to be relevant it must be disseminated in the right venues, to the right audience. Two natural tendencies make this harder than it sounds. First, we often send our work to venues that we are familiar with, even when the best targets for our studies might be domain-specific venues where we can reach more relevant folks. Second, we are used to a presentation style that may not be appropriate for all venues, yet reusing that style is often "cheaper" (in the short term at least). These myopic tendencies can render the best studies "invisible" to the relevant audience.
I am pushing so that every paper we publish that contains any empirical component makes all the raw data available on the web, in a format that is easily readable and processable by others. I find that, even when papers include what I would deem complete sets of tables and figures, as a researcher I often have questions or doubts that require digging a bit deeper into an aspect of the work, and that often requires access to the raw data. Besides, such access is completely necessary for reproducing some studies.
Over the years of performing empirical studies on testing and analysis techniques, I have developed and converged towards a standard reporting structure. I believe the consistency of a standard report makes papers easier to follow, since readers know what to expect and where to expect it, and if it is not there then that will raise some questions. It is kind of neat to see that this type of format has pretty much propagated throughout the testing and analysis community in the last 10 years (even in the late 90s it was acceptable for a T&A paper with an empirical component not to state its limitations or threats).
Ideally, we would like the approaches we create to be evaluated by others who have no stake in the outcome. In practice, however, the time and cost of conducting studies, and the speed at which technical progress happens, make self-evaluation the standard practice.
One mitigating practice in our community is to compare the proposed technique or approach against others, and perhaps use standard methods and artifacts to avoid selecting the "right context for my tool". But again, that fails to address a lot of things happening under the hood as the study is conducted and reported.
Note that systematic reviews address this issue only tangentially, as they work with existing reported data that may already be biased.
Dr. Basili argues that understanding a discipline involves observation, model building, experimentation, encapsulation of knowledge, and evolving knowledge over time. Software engineering is just like any other discipline, requiring an empirical approach to be understood. That is why we have empirical software engineering. What is unique about it is that it applies to software products and processes.
Empirical studies in software engineering are probably not that different from studies in areas such as psychology, sociology, or education, which is why many of our methods are drawn from these areas.