Questions tagged [statistical-inference]

The area of statistics concerned with using information from samples of a population to draw conclusions about the entire population.

Statistical inference makes propositions about a population using data sampled from the population. To test a hypothesis about a population, a typical workflow is to select a statistical model of the process that generates the data and then deduce propositions from the model.

Statistical propositions include:

  • a point estimate, which is a particular value that best approximates some parameter of interest,

  • an interval estimate, for example a confidence interval (or set estimate), which is an interval constructed from a data set drawn from a population so that, under repeated sampling of such data sets, such intervals would contain the true parameter value with probability equal to the stated confidence level,

  • a credible interval, which is a set of values containing, for example, 95% of posterior belief,

  • rejection of a hypothesis, or

  • clustering or classification of data points into groups.
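As a rough illustration of the first two kinds of proposition, the sketch below computes a point estimate and an approximate confidence interval for a population mean. The function name `mean_confidence_interval`, the simulated data, and the use of a normal (z) approximation are all assumptions made for this example; a t-based interval would be more appropriate for small samples.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (point_estimate, lower, upper) for the population mean.

    Assumption: the normal approximation (z-interval) is adequate,
    which is only reasonable for fairly large samples.
    """
    n = len(sample)
    point = mean(sample)                        # point estimate of the mean
    std_error = stdev(sample) / math.sqrt(n)    # estimated standard error
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return point, point - z * std_error, point + z * std_error


# Hypothetical data: 200 simulated draws from a normal population.
random.seed(0)
sample = [random.gauss(10.0, 2.0) for _ in range(200)]

point, lower, upper = mean_confidence_interval(sample)
print(f"point estimate: {point:.3f}")
print(f"approximate 95% interval: ({lower:.3f}, {upper:.3f})")
```

The interval is a statement about the procedure: across repeated samples, intervals built this way would cover the true mean at roughly the stated rate.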

4004 questions
53
votes
4 answers

Why does the sum of residuals equal 0 when we fit a sample regression by OLS?

That's my question. I have been looking around online, and people post a formula but they don't explain it. Could anyone please give me a hand with that? Cheers
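A small numerical check can make the claim concrete. The sketch below assumes a simple linear regression with an intercept; the helper `fit_line` and the data are made up for illustration. When an intercept is included, the first-order condition for the intercept forces the residuals to sum to zero, so the printed sum should be zero up to floating-point rounding.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # the fitted line passes through (x_bar, y_bar)
    return intercept, slope


# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Zero up to rounding error, because the model includes an intercept.
print(sum(residuals))
```

Without an intercept term the residuals generally do not sum to zero.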
32
votes
5 answers

What situation calls for dividing the standard deviation by $\sqrt n$?

While doing my homework and checking my answers with the book's answers I noticed that sometimes the standard deviation is divided by $\sqrt n$ where $n$ is the sample size. I'm a little confused. For my current problem I am trying to find the…
TheHopefulActuary
32
votes
4 answers

Maximum Likelihood Estimator of parameters of multinomial distribution

Suppose that 50 measuring scales made by a machine are selected at random from the production of the machine and their lengths and widths are measured. It was found that 45 had both measurements within the tolerance limits, 2 had satisfactory length…
23
votes
4 answers

Strange distribution of movie ratings

I like math but I also like movies. I have been collecting movies all my life. My collection is rather huge: almost 25,000 movies. Being also a developer, I was able to create my own online catalogue and pull various statistics from the database.…
21
votes
1 answer

Best way to play 20 questions

Background: You and I are going to play a game. To start off, I play a measurable function $f_1$ and you respond with a real number $y_1$ (possibly infinite). We repeat this some fixed number $N$ of times, to obtain a collection…
21
votes
2 answers

Why does the number of possible probability distributions have the cardinality of the continuum?

Wikipedia's article on parametric statistical models (https://en.wikipedia.org/wiki/Parametric_model) mentions that you could parameterize all probability distributions with a one-dimensional real parameter, since the set of all probability measures…
18
votes
2 answers

Vague Gamma prior?

I'm looking at an MCMC algorithm where the author takes a Gamma(shape = 0.001, rate = 0.001) prior distribution, which they refer to as a vague prior. For all my searching, I am struggling to see how this is vague. The density seems to spread…
Nicholas
17
votes
2 answers

Probability vs Confidence

My notes on confidence give this question: An investigator is interested in the amount of time internet users spend watching TV a week. He assumes $\sigma = 3.5$ hours and samples $n=50$ users and takes the sample mean to estimate the population…
15
votes
3 answers

What is the most general formalism for machine learning?

Most of the literature I can find in the field of machine learning is extremely practical, listing many techniques you can use like neural networks, SVMs, random forests, and so on. There are lots of suggestions on implementations and what…
15
votes
2 answers

Distribution of Sum of Discrete Uniform Random Variables

I just had a quick question that I hope someone can answer. Does anyone know what the distribution of the sum of discrete uniform random variables is? Is it a normal distribution? Thanks!
13
votes
2 answers

Probability - Interview Question - Hidden Assumptions and Phrasing Issues

I’ve encountered the following seemingly simple probability interview question in my workplace: Two reviewers were tasked with finding errors in a book. The first had found 40 errors and the other had found 60. 20 of the found errors were found in…
12
votes
1 answer

Minimal sufficient statistics for uniform distribution on $(-\theta, \theta)$

Let $X_1,\dots,X_n$ be a sample from the uniform distribution on $(-\theta,\theta)$ with parameter $\theta>0$. It is easy to show that $T(X) = (X_{(1)},X_{(n)})$ is a sufficient statistic for $\theta$ where $X_{(1)}$ and $X_{(n)}$ stand for the minimum…
12
votes
1 answer

Distribution of $\sum\limits_{i=1}^{N}X_{i}$ conditionally on $\sum\limits_{i=1}^{N}X_{i}^{2}$ for i.i.d. standard normal $X_i$s

Assume that the random variables $X_{i}$ are i.i.d $\mathcal{N}\left(0,1\right)$, then: $$S_N=\sum_{i=1}^{N}X_{i}\sim\mathcal{N}\left(0,N\right)\qquad\qquad T_N=\sum_{i=1}^{N}X_{i}^{2}\sim\chi^{2}\left(N\right)$$ What can be said about the…
11
votes
2 answers

What is the motivation for using cross-entropy to compare two probability vectors?

Define a "probability vector" to be a vector $p = (p_1,\ldots, p_K) \in \mathbb R^K$ whose components are nonnegative and which satisfies $\sum_{k=1}^K p_k = 1$. We can think of a probability vector as specifying a probability mass function (PMF)…
11
votes
4 answers

Is there a mathematical basis for the idea that this interpretation of confidence intervals is incorrect, or is it just frequentist philosophy?

Suppose the mean time it takes all workers in a particular city to get to work is estimated as $21$. A $95\%$ confidence interval is calculated to be $(18.3, 23.7).$ According to this website, the following statement is incorrect: There is a $95\%$…