Highest Voted Questions - Data Science Stack Exchange

9

votes

1 answer

Difference between tf-idf and tf with Random Forests

I am working on a text classification problem using a Random Forest as the classifier, and a bag-of-words approach. I am using the basic implementation of Random Forests (the one present in scikit), which creates a binary condition on a single variable…

classification text-mining random-forest

asked Sep 16 '14 at 08:14

papafe

595
1
5
9
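As a rough illustration of the tf versus tf-idf question above: a tree split on a single variable depends only on how the documents are ordered along that variable. The sketch below assumes unnormalized tf-idf (each term column multiplied by one positive idf constant) and uses made-up counts. With row normalization, which scikit-learn's `TfidfVectorizer` applies by default, the ordering can change, so this is not a general claim.

```python
import math

# Hypothetical term counts: rows are documents, columns are terms.
tf = [
    [3, 0, 1],
    [1, 2, 0],
    [0, 1, 4],
    [2, 2, 2],
]
n_docs = len(tf)
n_terms = len(tf[0])

# Smoothed idf: one positive constant per term (column).
df = [sum(1 for row in tf if row[j] > 0) for j in range(n_terms)]
idf = [math.log((1 + n_docs) / (1 + df[j])) + 1 for j in range(n_terms)]

# Unnormalized tf-idf: every column is scaled by its own constant.
tfidf = [[row[j] * idf[j] for j in range(n_terms)] for row in tf]


def column_order(matrix, j):
    """Document indices sorted by the value of feature j (ties broken by index)."""
    return sorted(range(len(matrix)), key=lambda i: (matrix[i][j], i))


# If the ordering is identical for every feature, any threshold split on tf
# has an equivalent threshold split on tf-idf.
same_order = all(
    column_order(tf, j) == column_order(tfidf, j) for j in range(n_terms)
)
print(same_order)
```

Under these assumptions the two representations offer a single-variable splitter the same candidate partitions; only the threshold values differ.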

9

votes

2 answers

MLOps for beginners

I have one year of experience in ML and have been using Jupyter notebooks to build static models, do some analysis, and present my results to the bosses, as it was all proof of concept (POC). Now, we would like to scale the solution to become automatic and be able to…

machine-learning deep-learning neural-network predictive-modeling mlops

asked Jul 03 '22 at 14:05

The Great

2,725
3
23
49

9

votes

5 answers

How does deep learning help in detecting multiple objects in a single image?

Let's say there are two cars in an image. How can it detect these cars, given that it can detect a single car in an image?

deep-learning convolutional-neural-network object-recognition

asked Apr 07 '16 at 19:15

Amanuel Negash

471
4
8

9

votes

1 answer

Sum vs mean of word-embeddings for sentence similarity

So, say I have the following sentences: ["The dog says woof", "a king leads the country", "an apple is red"]. I can embed each word using an N-dimensional vector, and represent each sentence as either the sum or mean of all the words in the sentence…

nlp word-embeddings word2vec

asked May 06 '22 at 13:23

CutePoison

520
3
10
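For the sum versus mean question above, a minimal sketch may help. It assumes random 4-dimensional vectors as stand-ins for trained word embeddings (the vectors, `DIM`, and the helper names are all hypothetical) and compares two of the quoted sentences under both pooling choices.

```python
import math
import random

random.seed(0)
DIM = 4
words = "the dog says woof a king leads country".split()
# Hypothetical random vectors standing in for trained word embeddings.
vectors = {w: [random.gauss(0, 1) for _ in range(DIM)] for w in words}


def sentence_vector(sentence, mean):
    """Pool word vectors by sum, or by mean when mean=True."""
    tokens = sentence.lower().split()
    total = [sum(vectors[t][d] for t in tokens) for d in range(DIM)]
    return [x / len(tokens) for x in total] if mean else total


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


s1, s2 = "The dog says woof", "a king leads the country"
cos_sum = cosine(sentence_vector(s1, False), sentence_vector(s2, False))
cos_mean = cosine(sentence_vector(s1, True), sentence_vector(s2, True))
dist_sum = euclidean(sentence_vector(s1, False), sentence_vector(s2, False))
dist_mean = euclidean(sentence_vector(s1, True), sentence_vector(s2, True))
print(cos_sum, cos_mean, dist_sum, dist_mean)
```

The mean is the sum divided by a positive sentence length, and cosine similarity ignores positive rescaling, so the two cosine values agree up to floating-point error. Euclidean distance is scale-sensitive, so it generally differs between the two pooling choices.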

9

votes

2 answers

How to build a textual search engine?

I have an HTML string and want to find out if a word I supply is relevant in that string. Relevancy could be measured based on frequency in the text. An example to illustrate my problem: this is an awesome bike store bikes can be purchased…

machine-learning data-mining

asked Sep 12 '14 at 11:48

Hendrik

191
2

9

votes

3 answers

Encoding before vs after train test split?

I am new to ML and working on a dataset with a lot of high-cardinality categorical variables. I observed that in a lot of encoding tutorials, like here, the encoding is applied after the train/test split. Why is it done this way? Why…

machine-learning deep-learning neural-network classification machine-learning-model

asked Feb 01 '22 at 07:50

The Great

2,725
3
23
49
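To make the "encode after the split" pattern from the question above concrete, here is a small sketch. It assumes a toy one-hot encoder written by hand (`fit_encoder` and `one_hot` are hypothetical helpers, not a library API) and made-up category values. The mapping is learned from the training split only and then applied unchanged to the test split.

```python
# Hypothetical category column, already split into train and test.
train_colors = ["red", "green", "red", "blue"]
test_colors = ["green", "purple"]  # "purple" never appears in training


def fit_encoder(values):
    """Learn a category -> column index mapping from the given values only."""
    return {cat: i for i, cat in enumerate(sorted(set(values)))}


def one_hot(values, mapping):
    """Encode values; categories missing from the mapping become all zeros."""
    rows = []
    for value in values:
        row = [0] * len(mapping)
        if value in mapping:
            row[mapping[value]] = 1
        rows.append(row)
    return rows


# Fit on the training split only, then reuse the same mapping for test data.
mapping = fit_encoder(train_colors)
train_encoded = one_hot(train_colors, mapping)
test_encoded = one_hot(test_colors, mapping)
print(mapping)
print(test_encoded)
```

Fitting on the training split alone mirrors deployment, where unseen categories can arrive at prediction time, and it keeps test-set information out of the fitted transform. That matters most for encoders that use statistics of the data, such as target or frequency encoding.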

9

votes

2 answers

Training Deep Nets on an Ordinary Laptop

Would it be possible for an amateur who is interested in getting some "hands-on" experience in designing and training deep neural networks to use an ordinary laptop for that purpose (no GPU), or is it hopeless to get good results in reasonable…

machine-learning deep-learning

asked Feb 20 '16 at 07:24

Lior

223
1
2
6

9

votes

1 answer

Understanding Reinforcement Learning with Neural Net (Q-learning)

I am trying to understand reinforcement learning and Markov decision processes (MDP) in the case where a neural net is being used as the function approximator. I'm having difficulty with the relationship between the MDP where the environment is…

machine-learning neural-network q-learning

asked Feb 18 '16 at 10:11

CatsLoveJazz

247
1
10

9

votes

1 answer

Should I take random elements for mini-batch gradient descent?

When implementing mini-batch gradient descent for neural networks, is it important to take random elements in each mini-batch? Or is it enough to shuffle the elements at the beginning of the training once? (I'm also interested in sources which…

machine-learning neural-network

asked Feb 11 '16 at 16:35

Martin Thoma

19,540
36
98
170
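The two options in the mini-batch question above (shuffle once before training, or draw a new order every epoch) can be sketched as an index generator. This is an illustrative sketch only. The function names and parameters are hypothetical, and it says nothing about which option trains better.

```python
import random


def minibatches(n_samples, batch_size, epochs, reshuffle_each_epoch, seed=0):
    """Yield (epoch, batch_indices) pairs covering every sample once per epoch."""
    rng = random.Random(seed)
    order = list(range(n_samples))
    rng.shuffle(order)  # both options shuffle once before training starts
    for epoch in range(epochs):
        if reshuffle_each_epoch and epoch > 0:
            rng.shuffle(order)  # draw a new order for this epoch
        for start in range(0, n_samples, batch_size):
            yield epoch, order[start:start + batch_size]


def batches_by_epoch(**kwargs):
    """Group the generated batches by epoch for easy comparison."""
    grouped = {}
    for epoch, batch in minibatches(**kwargs):
        grouped.setdefault(epoch, []).append(batch)
    return grouped


once = batches_by_epoch(
    n_samples=10, batch_size=4, epochs=2, reshuffle_each_epoch=False
)
fresh = batches_by_epoch(
    n_samples=10, batch_size=4, epochs=2, reshuffle_each_epoch=True
)
print(once[0] == once[1])
print(fresh[0], fresh[1])
```

With a single shuffle, every epoch sees the same samples grouped into the same batches. With per-epoch reshuffling the batch composition changes, while each epoch still visits every sample exactly once.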

8

votes

1 answer

What is a "residual mapping"?

A recent paper by He et al. (Deep Residual Learning for Image Recognition, Microsoft Research, 2015) claims that they use up to 4096 layers (not neurons!). I am trying to understand the paper, but I stumble over the word "residual". Could somebody…

machine-learning neural-network

asked Jan 24 '16 at 16:49

Martin Thoma

19,540
36
98
170

8

votes

2 answers

Image clustering by similarity measurement (CW-SSIM)

I'm trying to use scikit-learn and pyssim for clustering a set of images (fewer than 100). The end goal is to place the images into several buckets (clusters) according to the calculated similarity measure, CW-SSIM. The task seems to be trivial,…

machine-learning r python scikit-learn k-means

asked Jan 10 '16 at 19:44

Oleg Puzanov

111
1
4

8

votes

1 answer

Why do autoencoders use binary_crossentropy loss and not mean squared error?

The Keras autoencoder examples (https://blog.keras.io/building-autoencoders-in-keras.html) use binary_crossentropy (BCE) as the loss function. Why do they use binary_crossentropy (BCE) and not mse? According to the Keras example, the input to the…

deep-learning keras autoencoder mse

asked Jun 29 '21 at 10:25

user3668129

769
4
15

8

votes

1 answer

How to choose the right threshold for binary classification?

I am currently working on the Titanic dataset from Kaggle. The data set is imbalanced, with almost 61.5% negative and 38.5% positive class. I divided my training dataset into an 85% train and 15% validation set. I chose a support vector classifier as…

machine-learning accuracy model-evaluations binary-classification

asked Jun 16 '21 at 09:07

Joe

105
1
1
6

8

votes

4 answers

How to give names to topics created using LDA?

I have categorized 800,000 documents into 500 categories using Mahout topic modelling. Instead of representing each topic using its top 5/10 words, I want to infer a generic name for the group using any existing algorithm. For the…

machine-learning data-mining nlp text-mining topic-model

asked Jan 07 '16 at 04:28

adihere

81
1
1
2

8

votes

2 answers

How to teach a neural network a policy for a board game using reinforcement learning?

I need to use reinforcement learning to teach a neural net a policy for a board game. I chose Q-learning as the specific algorithm. I'd like a neural net to have the following structure: layer - rows * cols + 1 neurons - input - values of…

machine-learning neural-network reinforcement-learning q-learning

asked Jan 05 '16 at 13:28

Luke

189
1
11
