Highest Voted 'statistics' Questions - Computer Science Stack Exchange

30

votes

12 answers

Why is overfitting bad?

I've studied this a lot, and people say that overfitting in machine learning is bad, yet our neurons do become very strong and find the best actions/senses that we go by or avoid, and they can be decremented/incremented from bad/good by bad or…

machine-learning statistics

asked Jan 06 '16 at 22:39

Friendly Person 44

421
4
5

20

votes

1 answer

Applying Expectation Maximization to coin toss examples

I've been self-studying Expectation Maximization lately, and picked up some simple examples in the process: From here: There are three coins $c_0$, $c_1$ and $c_2$ with $p_0$, $p_1$ and $p_2$ the respective probability for landing on Head…

probability-theory statistics

asked Mar 20 '13 at 05:13

IcySnow

345
3
6

15

votes

4 answers

What is the relation between correlation and causation in machine learning?

It is a well-known fact that "Correlation doesn't equal causation", but machine learning seems to be almost entirely based on correlation. I'm working on a system to estimate the performance of students on questions based on their past performances.…

machine-learning statistics

asked Feb 18 '14 at 04:51

Casebash

333
2
7

13

votes

1 answer

Smoothing in Naive Bayes model

A Naive Bayes predictor makes its predictions using this formula: $$P(Y=y|X=x) = \alpha P(Y=y)\prod_i P(X_i=x_i|Y=y)$$ where $\alpha$ is a normalizing factor. This requires estimating the parameters $P(X_i=x_i|Y=y)$ from the data. If we do this with…

machine-learning probability-theory statistics

asked Aug 02 '12 at 15:47

Chris Taylor

231
1
2
4
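
As an illustration of the formula quoted in the excerpt above, here is a minimal sketch of a categorical Naive Bayes predictor with additive smoothing. The excerpt is truncated, so the choice of additive (Laplace) smoothing is an assumption and not necessarily what the question goes on to discuss; the function names `train_naive_bayes` and `predict_proba` and the parameter `alpha` are hypothetical.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Estimate P(Y=y) and smoothed P(X_i=x_i | Y=y) from categorical data.

    alpha is an assumed additive-smoothing pseudo-count
    (alpha=1 is Laplace smoothing, alpha=0 is the raw frequency estimate).
    """
    n_features = len(rows[0])
    class_counts = Counter(labels)
    # feature_counts[(i, y)][v] = number of rows with label y and X_i = v
    feature_counts = defaultdict(Counter)
    # values[i] = distinct values observed for feature i
    values = [set() for _ in range(n_features)]
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            feature_counts[(i, y)][v] += 1
            values[i].add(v)

    def prior(y):
        return class_counts[y] / len(labels)

    def likelihood(i, v, y):
        # Adding alpha to every count keeps unseen (value, class) pairs
        # from forcing the whole product to zero.
        numerator = feature_counts[(i, y)][v] + alpha
        denominator = class_counts[y] + alpha * len(values[i])
        return numerator / denominator

    return class_counts, prior, likelihood


def predict_proba(x, class_counts, prior, likelihood):
    """Return P(Y=y | X=x) for every class y, normalized to sum to 1."""
    scores = {}
    for y in class_counts:
        score = prior(y)
        for i, v in enumerate(x):
            score *= likelihood(i, v, y)
        scores[y] = score
    total = sum(scores.values())  # 1 / total plays the role of the factor alpha in the formula
    return {y: s / total for y, s in scores.items()}


rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild")]
labels = ["no", "yes", "yes"]
class_counts, prior, likelihood = train_naive_bayes(rows, labels)
posterior = predict_proba(("rainy", "hot"), class_counts, prior, likelihood)
```

With `alpha=0` the pair ("rainy", "no") never occurs in this toy data, so its estimated likelihood would be exactly zero; the pseudo-count avoids that.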

9

votes

1 answer

What is the optimal algorithm for playing the hangman word game?

Suppose we are playing the game hangman. My opponent and I both have access to the dictionary during the game. My opponent picks a word from the dictionary with knowledge of the algorithm which I will use to guess his secret word. Once my opponent…

game-theory statistics monte-carlo

asked Jul 24 '20 at 09:06

zfj3ub94rf576hc4eegm

193
1
7

9

votes

1 answer

Conditional Probabilities as Tensors?

Is it proper to view conditional probabilities, such as the forms P(a|c), P(a|c,d), P(a, b|c, d), and so forth, as being tensors? If so, does anyone know of a decent introductory text (online tutorial, workshop paper, book, etc) which develops…

machine-learning probability-theory statistics

asked May 17 '13 at 23:18

Novak

211
1
8

9

votes

4 answers

How are statistics being applied in computer science to evaluate accuracy in research claims?

I have noticed in my short academic life that many published papers in our area sometimes do not have much rigor regarding statistics. This is not just an assumption; I have heard professors say the same. For example, in CS disciplines I see papers…

software-engineering empirical-research statistics

asked Apr 07 '12 at 01:53

Oeufcoque Penteano

405
1
4
11

8

votes

2 answers

Maintain statistics over a sliding window (robust & efficient)

I am looking for an algorithm to maintain several statistics over a sliding window. The setup is as follows: There is a data stream consisting of (real value, timestamp) tuples. The values for the last x seconds are stored. In each iteration at least…

algorithms statistics

asked Apr 04 '15 at 13:47

Johannes

81
1
2
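
The setup described in the excerpt above (timestamped values, only the last x seconds kept) can be sketched as follows. This is only one possible approach under stated assumptions: timestamps arrive in non-decreasing order, and the statistics of interest are the mean and the maximum, since the excerpt is truncated before it names them. The class name `SlidingWindowStats` and its methods are hypothetical.

```python
from collections import deque


class SlidingWindowStats:
    """Mean and maximum over the values seen in the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.items = deque()           # (timestamp, value) in arrival order
        self.max_candidates = deque()  # (timestamp, value), values strictly decreasing
        self.total = 0.0               # running sum of the values in the window

    def add(self, timestamp, value):
        self.items.append((timestamp, value))
        self.total += value
        # Older entries that are not larger than the new value can never
        # be the maximum again, so drop them.
        while self.max_candidates and self.max_candidates[-1][1] <= value:
            self.max_candidates.pop()
        self.max_candidates.append((timestamp, value))
        self._evict(timestamp)

    def _evict(self, now):
        cutoff = now - self.window
        while self.items and self.items[0][0] <= cutoff:
            _, old_value = self.items.popleft()
            self.total -= old_value
        while self.max_candidates and self.max_candidates[0][0] <= cutoff:
            self.max_candidates.popleft()

    def mean(self):
        return self.total / len(self.items) if self.items else None

    def maximum(self):
        return self.max_candidates[0][1] if self.max_candidates else None


stats = SlidingWindowStats(window_seconds=10)
for t, v in [(0, 5.0), (4, 1.0), (9, 3.0), (12, 2.0)]:
    stats.add(t, v)
```

Each value is appended and removed at most once per deque, so the amortized cost per update is constant. The running sum can accumulate floating-point error on long streams, so a robust variant might periodically recompute it from the stored items.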

8

votes

1 answer

What does it mean for a random number generator's sequence to be only 1-dimensionally equidistributed?

Whilst reading up on Xorshift I came across the following (emphases added): The following xorshift+ generator, instead, has 128 bits of state, a maximal period of 2^128 − 1 and passes BigCrush: [snip code] This generator is one of the fastest…

randomness statistics pseudo-random-generators

asked Apr 23 '14 at 02:03

Claudiu

261
1
8

8

votes

1 answer

What Is The Complexity of Implementing a Particle Filter?

In a video discussing the merits of particle filters for localization, it was implied that there is some ambiguity about the complexity cost of particle filter implementations. Is this correct? Could someone explain this?

computational-geometry knowledge-representation reasoning statistics

asked Mar 08 '12 at 02:38

DorkRawk

303
1
7

8

votes

2 answers

VC dimension of linear separator in 3D

I am confused about the Vapnik-Chervonenkis dimension of a linear separator in 3 dimensions. In three dimensions, a linear separator would be a plane, and the classification model would be "everything on one side of a plane." It's apparently…

statistics learning-theory vc-dimension classification

asked Apr 25 '13 at 03:25

Jason Smith

81
1
2

7

votes

2 answers

How best to statistically verify random numbers?

Let's say I have 1000 bytes that are supposedly random. I want to verify to a certain degree of confidence that they are indeed random and evenly distributed across all byte values. Aside from calculating the standard deviation and mean value, what are my…

randomness statistics entropy

asked Dec 08 '16 at 04:49

Mr. Negi

73
4
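
One common check for the "evenly distributed across all byte values" part of the question above is Pearson's chi-square statistic against a uniform distribution. The sketch below is an illustration, not an answer taken from the thread; the function name `chi_square_byte_statistic` is hypothetical. It only measures how evenly the 256 byte values occur and says nothing about ordering or independence.

```python
from collections import Counter


def chi_square_byte_statistic(data: bytes) -> float:
    """Pearson chi-square statistic of byte frequencies against uniform."""
    expected = len(data) / 256  # expected count per byte value
    counts = Counter(data)      # iterating over bytes yields ints 0..255
    return sum(
        (counts.get(value, 0) - expected) ** 2 / expected
        for value in range(256)
    )


perfectly_even = bytes(range(256)) * 4  # every value appears exactly 4 times
all_zero = bytes(1024)                  # only the value 0 appears
even_stat = chi_square_byte_statistic(perfectly_even)
skewed_stat = chi_square_byte_statistic(all_zero)
```

Under the uniform hypothesis the statistic approximately follows a chi-square distribution with 255 degrees of freedom (mean 255, standard deviation about 22.6), so values far above that range suggest a skewed source. With only 1000 bytes the expected count per value is about 3.9, below the usual rule of thumb of 5, so the approximation is rough at that sample size.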

6

votes

2 answers

Applying graph "adjustment" algorithms to Elo rating system

I'm trying to address what I perceive to be a potential shortcoming in the Elo Rating System (predominantly used by the international Chess community to rate + rank players). I have a two-player game in mind (not Chess) that is played all over the…

graphs statistics

asked Nov 28 '17 at 14:44

smeeb

153
4

6

votes

1 answer

Showing that Bayes classifier is optimal

Consider a domain $X$, label set $Y=\{0,1\}$ and the zero-one loss. Given any probability distribution $D$ over $X\times \{0,1\}$, we've defined the Bayes classifier $f_D$ to be: $$ f_{D}(x)= \begin{cases} 1 & \text{if…

machine-learning classification statistics

asked Mar 31 '17 at 10:25

Alex Goft

235
2
7

5

votes

1 answer

Reconstructing a data table from cross-tabulation frequencies

Say there is a data table $D$ that we cannot see, with $M$ columns. We are given exact cross-tabulation frequencies for all ${M \choose 2}$ pairs of columns, that is how often each combination of two values occurs. From the cross-tabulations, we can…

algorithms complexity-theory combinatorics statistics

asked Sep 28 '12 at 20:16

Sarkom

213
1
5
