Questions tagged [anova]
12 questions
3
votes
1 answer
When should mutual information be used for feature selection over other feature selection methods like correlation, ANOVA, etc.?
I have a data set with categorical and continuous/ordinal explanatory variables and a continuous target variable. I tried to filter features using one-way ANOVA for categorical variables and using Spearman's correlation coefficient for…
Ankita Talwar
2
votes
1 answer
Score of ANOVA in selected features
I selected features using ANOVA (because I have numerical data as input and categorical data as target):
anova = SelectKBest(score_func=f_classif, k='all')
anova.fit(X_train, y_train.values.argmax(1)) # y_train.values.argmax(1) because I already…
Mimi
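For context on the question above: the score that `f_classif` reports for each feature is a one-way ANOVA F statistic, computed with the class labels as the groups. A minimal pure-Python sketch of that statistic for a single feature follows. The helper name `one_way_f` and the toy data are assumptions made for illustration; they do not come from the question.

```python
from collections import defaultdict

def one_way_f(feature, labels):
    """One-way ANOVA F statistic for one numeric feature grouped by class label."""
    groups = defaultdict(list)
    for value, label in zip(feature, labels):
        groups[label].append(value)

    n, k = len(feature), len(groups)
    grand_mean = sum(feature) / n

    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of values around their own group mean
    for values in groups.values():
        mean = sum(values) / len(values)
        ss_between += len(values) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in values)

    # Mean squares use k - 1 and n - k degrees of freedom respectively.
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up data: one feature, three classes with three samples each.
feature = [1, 2, 3, 2, 3, 4, 5, 6, 7]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
score = one_way_f(feature, labels)
print(score)
```

A larger F means the class means of that feature are further apart relative to the spread inside each class, which is why the value can serve as a ranking score.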
2
votes
1 answer
What conclusion can I draw when a variable is influenced by another but there isn't any correlation?
I am doing an exploratory analysis.
If the target is a continuous variable and the attributes are all categorical (discrete values), to find out whether each attribute has any influence on the target I am running the ANOVA test…
Tlaloc-ES
2
votes
0 answers
ANOVA for mean difference between groups: non-normal distribution, large sample size
I have $10$ groups with sample size $n>700$, resampled to $710$ for ANOVA. Visually, these distributions are not normal, with slight bimodality in the sets.
I ran an ANOVA, and got a $P\approx 0.089$. It coincides with what I expected from the histograms,…
2
votes
1 answer
ANOVA procedure - Regression
I am new to regression. Can someone explain to me how the regression sum of squares shows the explained variation? Essentially, why is it (y hat - y bar)? I hope I'm explaining my question accurately. I tried drawing a graph with the regression…
Michael
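As a small illustration of the quantity asked about above: for a least-squares line with an intercept, the total sum of squares splits into the regression part, built from (y hat - y bar), and the residual part, built from (y - y hat). The sketch below checks this numerically on made-up data. The variable names and the data are assumptions for illustration only.

```python
# Made-up data for a simple linear regression y = intercept + slope * x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Ordinary least-squares estimates for one predictor.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
y_hat = [intercept + slope * xi for xi in x]

# Total variation of y around its mean.
sst = sum((yi - y_bar) ** 2 for yi in y)
# Variation of the fitted values around the mean: the "explained" part.
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
# Variation left over around the fitted line: the "unexplained" part.
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

print(sst, ssr, sse)
```

The fitted value is what the model predicts, so its distance from the mean measures how much of each point's deviation the line accounts for; the ratio `ssr / sst` is the familiar R squared.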
1
vote
0 answers
Levene test for equal variance
I would like to run a one-way ANOVA test on my data. I saw that one of several assumptions for one-way ANOVA is that there needs to be homogeneity of variances. I have run the test for different datasets. I find that sometimes my p-values are larger than…
Reut
1
vote
1 answer
Question on ANOVA and Correlation/Association
I've been working on examining statistical relationships between variables:
Pearson's, Spearman's for continuous variables
Kendall's Tau, Cramer's V for ordinal/nominal variables.
I know there are many more ways. Recently I read about ANOVA and…
rocksNwaves
1
vote
2 answers
Can Chi-square and ANOVA (f_classif) be used to select the best features?
I have a binary classification problem (target 0 or 1), and I have both continuous and categorical variables as features. I understand that with Chi-square I can evaluate only categorical features. What about ANOVA (f_classif)? It's the…
SimoneA
0
votes
1 answer
Pass variable-length arguments to mstats.kruskalwallis
I am trying to run a Kruskal-Wallis test on multiple columns of my data; for that I wrote a function:
var=['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']
def kruskawallis_test(column):
…
Ayush Ranjan
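One common way to hand a variable number of samples to a function that takes `*args` is to build a list and unpack it with `*` at the call site. The sketch below shows the pattern with a pure-Python stand-in, `kruskal_h`, which is a hypothetical helper that computes the Kruskal-Wallis H statistic without tie correction; it is not the SciPy function. The column names and data are made up. The same unpacking should apply to a call such as `mstats.kruskalwallis(*samples)`.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction) for any number of groups."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Made-up columns; the list of names can have any length.
data = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}
columns = ["a", "b", "c"]

# Build the list of samples, then unpack it into separate positional arguments.
samples = [data[name] for name in columns]
h_value = kruskal_h(*samples)
print(h_value)
```

Writing `kruskal_h(samples)` without the `*` would pass a single argument (the list of lists), which is the usual source of errors in this situation.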
0
votes
1 answer
What does it mean to have 1 degree of freedom in an ANOVA test?
So I used Python to run a multi-factorial ANOVA analysis on a data set. I first used ols.fit() and then the anova_lm function. I noticed that the variables I am analyzing each have 1 degree of freedom. Does that mean only 1 value out of my data is…
kalone_mevin
0
votes
0 answers
Statistical significance on aggregate data to show that the groups are different?
I am working with performance data for three groups for each region. The denominator for the groups is the number of people who are identified as low performers. For region A, the low performer percentage is 40% for group 1, 30% for group 2, and 30% for group 3.…
user728148
0
votes
2 answers
What model should I use to predict monthly sales by products?
I am trying to predict monthly sales by product based on a plethora of variables. There are 4 predictors. One is categorical (month) and the other three are numerical. One of the variables is just part sales.
The data I am trying to predict is…
Lauren