9

I have noticed in my short academic life that many published papers in our area lack statistical rigor. This is not just my own impression; I have heard professors say the same.

For example, in CS disciplines I see papers being published claiming that methodology X has been observed to be effective, and that this is proved by ANOVA or ANCOVA, yet I see no sign that anyone has verified that the necessary assumptions of those tests were satisfied. It feels as though, as soon as some complex-sounding method or name appears, the researcher is presumed credible: "he must know what he is doing, so it is fine if he does not describe the assumptions" of the given distribution or procedure in a way that would let the community evaluate it.
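
To make the concern concrete, here is a minimal sketch (in Python with scipy, on made-up data; none of it comes from any particular paper) of the kind of assumption checking I rarely see reported: testing normality and homogeneity of variance before trusting a one-way ANOVA.

```python
# Sketch: check ANOVA assumptions before running the ANOVA itself.
# The three groups are invented placeholder data, not real measurements.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(10.0, 2.0, size=30)   # e.g., runtimes under method A
group_b = rng.normal(11.0, 2.0, size=30)   # e.g., runtimes under method B
group_c = rng.normal(12.0, 2.0, size=30)   # e.g., runtimes under method C

# Assumption 1: each group is approximately normal (Shapiro-Wilk test).
for name, g in [("A", group_a), ("B", group_b), ("C", group_c)]:
    _, p = stats.shapiro(g)
    print(f"group {name}: Shapiro-Wilk p = {p:.3f}")

# Assumption 2: the groups have equal variances (Levene's test).
_, p_levene = stats.levene(group_a, group_b, group_c)
print(f"Levene p = {p_levene:.3f}")

# Only if both checks look reasonable does the one-way ANOVA mean much.
f_stat, p_anova = stats.f_oneway(group_a, group_b, group_c)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.4f}")
```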

Sometimes, excuses are offered for justifying a hypothesis with a very small sample size.

My question, then, posed as a student of CS disciplines who aspires to learn more about statistics: how do computer scientists approach statistics?

This question might seem to ask what I have already answered myself, but what I described above is only my opinion. I might be wrong, or I might be focusing on one group of practitioners while other groups of CS researchers follow better practices with respect to statistical rigor.

So, specifically, what I want is an answer of the form "Our area is, or is not, engaged with statistics, for the following reasons" (example papers, books, or other discussion articles are fine as evidence). @Patrick87's answer is closest to this.

Oeufcoque Penteano

4 Answers

12

As a graduate student in computer science, who has exposure to research in fields other than computer science, and whose research group works in an area of computer science where statistics can be fruitfully applied, I can offer my experience; your mileage may vary.

In general, even the most well-meaning scientific research can fail to rigorously apply statistical analysis to results, and in my experience this does not always preclude papers containing such poorly analyzed results from being accepted for publication. My group works mainly in distributed computing and high-performance computer architecture. Often, research involves experimental designs whose performance cannot easily be understood analytically in the required detail. As such, empirical results are often used as evidence for claims.

Clearly, experiments should be designed, and results analyzed, in such a way as to provide some confidence that the results are statistically significant. Most of the time, this is not done, even at some of the most important venues. When statistical analysis is applied, it is almost never rigorous in any meaningful sense; the most one typically sees (and one is glad to see it!) is that an experiment was repeated $n$ times for some arbitrarily chosen $n$, typically $1 < n < 5$. The selection of error bars (if any are indicated) seems to be mainly a matter of personal preference or taste.
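
For contrast, here is a rough sketch (Python with scipy; the runtimes are invented) of what even a minimally principled error bar would involve: a t-based 95% confidence interval on the mean of the $n$ repetitions, which also makes plain how wide the interval becomes when $n$ is tiny.

```python
# Sketch: a t-based 95% confidence interval from n repeated measurements.
# The runtimes below are placeholder values, not from any real experiment.
import numpy as np
from scipy import stats

runtimes = np.array([12.3, 11.8, 12.9, 12.1, 12.5])  # n = 5 repetitions
n = len(runtimes)
mean = runtimes.mean()
sem = runtimes.std(ddof=1) / np.sqrt(n)    # standard error of the mean
t_crit = stats.t.ppf(0.975, df=n - 1)      # two-sided 95% critical value

print(f"mean = {mean:.2f}, 95% CI = "
      f"[{mean - t_crit * sem:.2f}, {mean + t_crit * sem:.2f}]")
# With n this small, the t critical value (~2.78 for df = 4) is much larger
# than the normal 1.96, so the interval is wide -- one reason arbitrary
# small n and ad-hoc error bars are a weak substitute for real analysis.
```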

In summary: no, it's not just you, and it's not just software engineering. In general, based on my experience, several areas of computing research err on the side of not doing enough. Indeed, it might even be detrimental to the viability of a submitted paper to dwell on statistical considerations. That's not to say I find the situation satisfactory; far from it. But these are my impressions. For instance, take a look at the discussion of results in section 5 of this paper, which was presented at Supercomputing 2011, one of the highest-profile conferences in high-performance computing, and see whether you arrive at the same conclusions I do about the rigor of the statistical analysis of experimental results.

More generally, this shortcoming may be symptomatic of a tendency in some areas of computing to publish more papers rather than fewer, to target conferences rather than journals, and to emphasize incremental progress rather than significant and fundamental improvements in understanding. You might consult this article, which provides valuable insight along these lines.

Patrick87
2

Software engineering research involves many factors. Two of them are the human factor and the measurement of quality.

Let's say I want to do a productivity analysis. Data collection is hard compared to algorithm analysis, because the data is about human productivity. An objective measure of quality is also not easy to achieve.

Ten lines of code per day for an avionics system versus 150 lines of code per day for a smartphone app: which has higher productivity, and which has better quality? And what if both teams claim they are using the same methodology? Comparing them is comparing apples and oranges.

Sometimes it is hard to measure code efficiency accurately. For example, suppose I put in a bunch of useless variables and many lines of code for them, say for debugging purposes. This boosts my productivity during development. At the end, I take them all out and say I improved my code to achieve efficiency.

Later on, a researcher comes in and performs an efficiency analysis. He might treat the above as noise and concentrate only on the final results. Other researchers pay attention to the noise. Then you see articles with different conclusions.

Statistics is supposed to be a tool that assists researchers in finding the causes of issues. Many researchers instead use it to draw conclusions. That is what you have observed.


Some remarks above might lead the OP to think I am against the use of statistics in software engineering. If so, I'd like to make myself clear.

I am not against statistics. Statistical analysis can tell you that X might be true, but that should not be the end of the research. The next task should be to find out whether X is actually true, and why. That is what I believe science is about: finding the truth.

Whether or not software engineering belongs to computer science is another issue.

Nobody
1

Statistics is hard, and often counterintuitive. Besides, the urge to "do one more experiment" to see if there is an effect (and stop as soon as it shows up) is strong, especially if the experiments are costly (in time and work, not just money). Remember also that publishing a paper on how a carefully set up, long, and costly experiment showed no statistically significant relationship tends to be impossible.
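
That "stop as soon as it shows up" urge is not a minor sin. A small simulation sketch (Python with numpy/scipy; the sample sizes and thresholds are arbitrary choices for illustration) shows that peeking at the p-value after each new data point and stopping at p < 0.05 inflates the false-positive rate far above the nominal 5%, even when there is no effect at all.

```python
# Sketch: optional stopping inflates false positives under a true null.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
trials, max_n, alpha = 2000, 50, 0.05
false_positives = 0

for _ in range(trials):
    data = []
    for _ in range(max_n):
        data.append(rng.normal(0.0, 1.0))  # true mean is 0: null is true
        if len(data) >= 5:                 # start peeking at n = 5
            _, p = stats.ttest_1samp(data, popmean=0.0)
            if p < alpha:                  # "do one more" until it shows up
                false_positives += 1
                break

# Despite alpha = 5%, this prints a rate several times higher.
print(f"false-positive rate with optional stopping: "
      f"{false_positives / trials:.2%} (nominal alpha = 5%)")
```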

Especially in software engineering, there are many uncontrollable variables. To account for them you would need many replications of the experiment, but you typically get resources for one, or at best two.

vonbrand
-3

My question, then, posed as a student of CS disciplines who aspires to learn more about statistics: how do computer scientists approach statistics?

There are several questions above, some not the same as the title question, and in some ways the question rests on a faulty premise or misconception about a lack of connection between statistics and CS. The general question is about the interface between computer science and statistics.

There is a vast, at times intense, overlap in some areas, and it is an increasing trend with the strongly emerging field of big data. At some schools (even elite "Ivy League" schools), the CS degree is tightly coupled with the mathematics and statistics departments, and some offer a joint major. There is a very strong interconnection in the CS/statistics field of machine learning. The relatively new field of bioinformatics also has very strong CS and statistical grounding.

There is an entire field, computational statistics, focused on this interface!

Computational statistics, or statistical computing, is the interface between statistics and computer science. It is the area of computational science (or scientific computing) specific to the mathematical science of statistics. This area is also developing rapidly, leading to calls that a broader concept of computing should be taught as part of general statistical education.[1]

Yes, agreed: as pointed out in the question, there are many CS papers that don't use statistics, including situations (such as evaluating empirical experiments) where it might be highly applicable and relevant. But exactly the same can be said of many other scientific fields, e.g., mathematics, and even more applied fields like physics.

There are many ways to use and apply statistics, some less rigorous than others, and not all contexts call for the full machinery of advanced statistics. For example, just running multiple experiments and plotting error bars for the standard deviation (or even mere averages!) is a basic use of statistics. More rigorous uses include hypothesis testing, but it is a general observation in the field that many scientific papers do not do rigorous hypothesis testing even where it might be applicable; the sketch below illustrates the difference.
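
To illustrate the step up from "plot the averages" to an actual hypothesis test, here is a minimal sketch (Python with scipy; the timings are made up): checking whether an observed gap between two implementations is distinguishable from noise.

```python
# Sketch: averages alone vs. a two-sample hypothesis test.
# The timings are invented placeholder data.
import numpy as np
from scipy import stats

times_a = np.array([103.1, 98.7, 101.4, 99.9, 102.2, 100.5])  # seconds
times_b = np.array([ 97.8, 96.2,  99.1, 95.5,  98.4,  96.9])

# Averages alone: B looks faster...
print(f"mean A = {times_a.mean():.1f}s, mean B = {times_b.mean():.1f}s")

# ...but a Welch two-sample t-test asks whether the difference could be
# noise (Welch's version makes no equal-variance assumption).
t_stat, p = stats.ttest_ind(times_a, times_b, equal_var=False)
print(f"Welch t = {t_stat:.2f}, p = {p:.4f}")
```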

Also, this question is tagged [software-engineering]. That was my major, and a statistics class had to be passed to complete the degree at my school and to get an engineering-accredited major (e.g., ABET); this is likely the case at many other universities. If one wants more applied and rigorous CS-type principles, such as applications of statistics, one can go the "software-engineering" route in education.

vzn