Highest Voted Questions - Data Science Stack Exchange

9

votes

3 answers

How to detect cardboard boxes using Neural Network

I'm trying to train a neural network to detect cardboard boxes along with multiple classes of persons (people). Although it easily detects persons and classifies them correctly, it's incredibly hard to detect cardboard boxes. The boxes look…

neural-network deep-learning cnn image-classification

asked May 28 '19 at 11:47

Martin Brisiak

151
1
7

9

votes

3 answers

Score matrix string similarity

I have a load of documents, which have a load of key value pairs in them. The key might not be unique so there might be multiple keys of the same type with different values. I want to compare the similarity of the keys between 2 documents. More…

algorithms similarity

asked Jun 22 '14 at 21:45

David

95
5

9

votes

2 answers

Why is class weight outperforming oversampling?

I am applying both class_weight and oversampling (SMOTE) techniques on a multiclass classification problem and getting better results when using the class_weight technique. Could someone please explain what could be the cause of this difference?

multiclass-classification class-imbalance smote

asked May 26 '19 at 01:09

Sarah

621
2
5
17

9

votes

2 answers

What does pandas describe() percentiles values tell about our data?

Let's say this is my dataframe: x=[0.09, 0.95, 0.93, 0.93, 0.34, 0.29, 0.14, 0.23, 0.91, 0.31, 0.62, 0.29, 0.71, 0.26, 0.79, 0.3, 0.1, 0.73, 0.63, 0.61] x=pd.DataFrame(x) When we call x.describe() on this dataframe, we get the following result: >>>…

python pandas

asked May 25 '19 at 16:48

Eka

301
1
3
11

9

votes

2 answers

How to Use SHAP Kernel Explainer with Pipeline models?

I have a pandas DataFrame X. I would like to find the prediction explanation of a particular model. My model is given below: pipeline = Pipeline(steps= [ ('imputer', imputer_function()), ('classifier', RandomForestClassifier() …

machine-learning machine-learning-model data-science-model ipython

asked May 23 '19 at 14:57

Nayana Madhu

436
1
3
8

9

votes

1 answer

How to arrange the dataset/images for CNN+LSTM

I am working on an image classification problem (for example, Class A and Class B) using transfer learning with ResNet50 as the base model (in Keras). There is a time factor involved in this classification. For example, I need sufficient evidence to make…

keras cnn lstm image-classification transfer-learning

asked May 13 '19 at 02:44

deepguy

1,471
8
21
39

9

votes

2 answers

Does feature selection matter to Decision Tree algorithms?

Background: I'm currently working on my thesis project, which is to build tree-based ensemble methods for classification on a large data set. Before I started modeling, I spent a large amount of time on feature selection using…

machine-learning feature-selection decision-trees

asked May 08 '19 at 13:17

Ping

91
1
1
4

9

votes

1 answer

How can I do simple machine learning without hard-coding behavior?

I've always been interested in machine learning, but I can't figure out one thing about starting out with a simple "Hello World" example - how can I avoid hard-coding behavior? For example, if I wanted to "teach" a bot how to avoid randomly placed…

machine-learning

asked May 13 '14 at 23:58

Doorknob

215
2
8

9

votes

1 answer

Generate predictions that are orthogonal (uncorrelated) to a given variable

I have an X matrix, a y variable, and another variable ORTHO_VAR. I need to predict the y variable using X, however, the predictions from that model need to be orthogonal to ORTHO_VAR while being as correlated with y as possible. I would prefer…

correlation

asked Apr 13 '19 at 03:32

Chris

224
2
9

9

votes

5 answers

Tutorials on topic models and LDA

I would like to know if you people have some good tutorials (fast and straightforward) about topic models and LDA, teaching intuitively how to set some parameters, what they mean and if possible, with some real examples.

topic-model lda

asked Jan 08 '15 at 15:47

pedrobisp

191
1
1
3

9

votes

1 answer

Similarity measure based on multiple classes from a hierarchical taxonomy?

Could anyone recommend a good similarity measure for objects which have multiple classes, where each class is part of a hierarchy? For example, let's say the classes look like: 1 Produce 1.1 Eggs 1.1.1 Duck eggs 1.1.2 Chicken eggs 1.2…

similarity

asked Jan 08 '15 at 10:09

Dave Challis

395
2
10

9

votes

2 answers

In which cases shouldn't we drop the first level of categorical variables?

As a beginner in machine learning, I'm looking into the one-hot encoding concept. Unlike in statistics, where you always want to drop the first level to have k-1 dummies (as discussed here on SE), it seems that some models need to keep it and have k…

machine-learning algorithms encoding dummy-variables

asked Mar 19 '19 at 19:55

Dan Chaltiel

341
2
10

9

votes

3 answers

Fuzzy name and nickname match

I have a dataset with the following structure: full_name,nickname,match Christian Douglas,Chris,1, Jhon Stevens,Charlie,0, David Jr Simpson,Junior,1 Anastasia Williams,Stacie,1 Lara Williams,Ana,0 John Williams,Willy,1 where each predictor row…

deep-learning nlp

asked Mar 19 '19 at 13:36

David Masip

6,136
2
28
62

9

votes

3 answers

predict gives the same output value for every image (Keras)

I am trying to classify images and assign them label 1 or 0 (skin cancer or not). I am aware of the three main issues regarding having the same output for every input. I did not split the set and I'm just trying to apply the CNN on the train set,…

python neural-network keras image-classification

asked Mar 06 '19 at 11:16

Florian Laborde

115
1
1
7

9

votes

1 answer

Bert Fine Tuning with additional features

I want to use BERT for an NLP task. But I also have additional features that I would like to include. From what I have seen, with fine tuning, one only changes the labels and retrains the classification layer. Is there a way to use pre-trained…

nlp bert

asked Mar 05 '19 at 02:57

Jeff

193
1
3
