Highest Voted Questions - Data Science Stack Exchange

9

votes

2 answers

Ways to reconstruct shuffled pixels of a video file?

Suppose that you have a video file whose pixel order has been shuffled once. That is, a random order has been defined once and applied to all frames. Is there a known approach for retrieving the initial order of pixels? I have some ideas…

statistics convolutional-neural-network convolution image-recognition tsne

asked Sep 30 '17 at 15:57

Denis Dollfus

93
3
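
The setup described in this question can be sketched as follows. This is only an illustration of the problem statement, not a solution to it: it assumes frames are flat lists of pixel values, and the helper names (`make_permutation`, `shuffle_frame`, `invert_permutation`) and the seed are hypothetical. It shows that a single permutation applied to every frame is trivially undone once the permutation is known; the question is about recovering it when it is not.

```python
import random

def make_permutation(n_pixels, seed=0):
    """Draw one random pixel order (hypothetical helper, fixed seed for repeatability)."""
    perm = list(range(n_pixels))
    random.Random(seed).shuffle(perm)
    return perm

def shuffle_frame(frame, perm):
    """Output position i takes the pixel that was at position perm[i]."""
    return [frame[p] for p in perm]

def invert_permutation(perm):
    """Build the permutation that undoes `perm`."""
    inverse = [0] * len(perm)
    for new_pos, old_pos in enumerate(perm):
        inverse[old_pos] = new_pos
    return inverse

# Two toy "frames" of six pixels each (made-up values).
frames = [[10, 20, 30, 40, 50, 60], [11, 21, 31, 41, 51, 61]]

perm = make_permutation(6, seed=42)          # defined once
shuffled = [shuffle_frame(f, perm) for f in frames]   # applied to all frames

inverse = invert_permutation(perm)
restored = [shuffle_frame(f, inverse) for f in shuffled]
```

Because the same `perm` is reused for every frame, statistics gathered across frames (for example, how strongly two shuffled positions co-vary over time) carry information about which pixels were originally neighbours.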

9

votes

2 answers

Why does decreasing the SGD learning rate cause a massive increase in accuracy?

In papers such as this I often see training curves with this kind of shape: In this case SGD was used with a factor of 0.9 and learning rate decreasing by a factor of 10 every 30 epochs. Why is there such a large decrease in error when the…

optimization

asked Sep 21 '17 at 16:02

geometrikal

533
1
5
14
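
The schedule mentioned in this question (learning rate divided by 10 every 30 epochs) is usually called step decay. A minimal sketch, assuming a hypothetical initial learning rate of 0.1 and a hypothetical function name `step_decay_lr`; neither comes from the question:

```python
def step_decay_lr(initial_lr, epoch, drop_factor=0.1, epochs_per_drop=30):
    """Learning rate after `epoch` epochs when it is multiplied by
    `drop_factor` once every `epochs_per_drop` epochs."""
    return initial_lr * drop_factor ** (epoch // epochs_per_drop)

# Learning rate at a few epochs, starting from an assumed value of 0.1.
schedule = [step_decay_lr(0.1, e) for e in (0, 29, 30, 60)]
```

The rate stays constant within each 30-epoch block and drops abruptly at the block boundary, which is where the sharp steps in such training curves appear.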

9

votes

2 answers

"Deep Noether's Theorem": Building in Symmetry Constraints

If I have a learning problem that should have an inherent symmetry, is there a way to subject my learning problem to a symmetry constraint to enhance learning? For example, if I am doing image recognition, I might want 2D rotational symmetry.…

machine-learning

asked Aug 04 '17 at 16:44

user32280

9

votes

2 answers

Tensorflow regression model giving same prediction every time

import tensorflow as tf x = tf.placeholder(tf.float32, [None,4]) # input vector w1 = tf.Variable(tf.random_normal([4,2])) # weights between first and second layers b1 = tf.Variable(tf.zeros([2])) # biases added to hidden…

neural-network deep-learning regression tensorflow

asked Aug 04 '17 at 11:22

Tarun

93
1
1
5

9

votes

3 answers

Human activity recognition using smartphone data set problem

I'm new to this community and hopefully my question will fit in well here. As part of my undergraduate data analytics course I have chosen to do the project on human activity recognition using smartphone data sets. As far as I'm concerned this topic…

bigdata machine-learning databases clustering data-mining

asked May 27 '14 at 10:41

Jakubee

401
1
5
8

9

votes

1 answer

What are the most suitable machine learning algorithms according to type of data?

I am a beginner in data science. I found that some machine learning algorithms perform better when given a particular kind of data (i.e., numerical, categorical, text, graphical). I searched for this topic on the web, but had no luck. I would like to know…

machine-learning algorithms data

asked Jun 23 '17 at 02:09

kaushalyap

211
1
2
5

9

votes

1 answer

How do I approach a classification problem where one of the classes is defined by 'not any of the others'?

Suppose that I am interested in three classes $c_1$, $c_2$, $c_3$. But my dataset actually contains several more real classes $(c_j)_{j=4}^n$. The obvious answer is to define a new class $\hat c_4$ that refers to all classes $c_j$, $j>3$ but I…

machine-learning classification

asked Jun 17 '17 at 18:05

h3h325

253
1
6
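
The "obvious answer" mentioned in this question, collapsing every class outside the three of interest into one catch-all class, can be sketched as below. The label strings, the `"other"` name, and the function `collapse_to_other` are illustrative assumptions, not part of the question.

```python
from collections import Counter

# The three classes of interest (placeholder names for c_1, c_2, c_3).
TARGET_CLASSES = {"c1", "c2", "c3"}

def collapse_to_other(labels, keep=TARGET_CLASSES, other="other"):
    """Map every label not in `keep` to a single catch-all label."""
    return [y if y in keep else other for y in labels]

# Made-up labels containing extra real classes c4, c5, c7.
labels = ["c1", "c5", "c2", "c4", "c3", "c7", "c1"]
collapsed = collapse_to_other(labels)
counts = Counter(collapsed)
```

Checking `counts` after collapsing is worthwhile, since the catch-all class can end up much larger and more heterogeneous than the classes of interest.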

9

votes

2 answers

What is the rationale for discretization of continuous features and when should it be done?

Continuous feature discretization usually leads to loss of information due to the binning process. However, most of the top solutions for Kaggle Titanic are based on discretization (age, fare). When should continuous features be discretized? Is there…

machine-learning statistics feature-selection algorithms feature-extraction

asked Jun 17 '17 at 04:23

drichlet

91
1
4
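
To make the information loss from binning concrete, here is a minimal sketch of discretizing an age feature. The bin edges and the sample ages are made up for illustration and are not taken from any Kaggle solution.

```python
from bisect import bisect_right

# Hypothetical bin edges: <12, 12-17, 18-39, 40-59, 60+
AGE_EDGES = [12, 18, 40, 60]

def discretize(value, edges):
    """Return the index of the bin that `value` falls into.
    A value equal to an edge goes into the bin to the right of it."""
    return bisect_right(edges, value)

ages = [4, 12, 17.5, 35, 59, 80]
bins = [discretize(a, AGE_EDGES) for a in ages]
```

Ages 12 and 17.5 land in the same bin, so a model that sees only `bins` can no longer tell them apart; that is the loss the question refers to.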

9

votes

2 answers

Why is my training and validation loss not changing?

I used MSE loss function, SGD optimization: xtrain = data.reshape(21168, 21, 21, 21,1) inp = Input(shape=(21, 21, 21,1)) x = Conv3D(filters=512, kernel_size=(3, 3, 3), activation='relu',padding='same')(inp) x = MaxPool3D(pool_size=(3, 3,…

machine-learning deep-learning autoencoder

asked Jun 09 '17 at 05:16

sp_713

115
1
2
4

9

votes

1 answer

Extracting individual emails from an email thread

Most of the open source datasets are well formatted, i.e., each email message is cleanly separated, as in the Enron email dataset. But out in the real world it is highly difficult to separate a top email message from a thread of emails. For example, consider…

classification scikit-learn apache-spark preprocessing sentiment-analysis

asked Jun 01 '17 at 13:02

Greedy Coder

153
1
6

9

votes

4 answers

Improving accuracy of Text Classification

I am working on a text classification problem. The objective is to classify news articles into their corresponding categories, but in this case the categories are not broad ones like politics, sports, economics, etc., but are very closely related and…

machine-learning nlp feature-selection svm sentiment-analysis

asked May 28 '17 at 12:56

ac-lap

159
1
1
6

9

votes

1 answer

Can training label confidence be used to improve prediction accuracy?

I have training data that is labelled with binary values. I have also collected the confidence of each of these labels, i.e., a confidence of 0.8 would mean that 80% of the human labellers agree on that label. Is it possible to use this confidence data to…

machine-learning classification regression scikit-learn svm

asked May 24 '17 at 16:13

Ben J. Hawkins

91
1
3
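
Two common ways of feeding such label confidence into training are to turn it into a soft target or to use it as a per-sample weight in the loss. The sketch below shows both in plain Python; the function names and numbers are assumptions for illustration only. Many scikit-learn estimators, including `SVC`, accept a `sample_weight` argument in `fit`, which corresponds to the second approach.

```python
import math

def soft_target(label, confidence):
    """Probability that the true label is 1, given a binary label and
    the fraction of labellers who agreed with it."""
    return confidence if label == 1 else 1.0 - confidence

def weighted_log_loss(labels, confidences, predicted_probs):
    """Binary cross-entropy in which each example counts in proportion
    to its label confidence."""
    total, weight_sum = 0.0, 0.0
    for y, c, p in zip(labels, confidences, predicted_probs):
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += c * loss
        weight_sum += c
    return total / weight_sum

# A label of 1 agreed on by 80% of labellers (made-up example).
target = soft_target(1, 0.8)
loss = weighted_log_loss([1], [0.8], [0.5])
```

With weighting, a mistake on a label that most annotators agreed on costs more than the same mistake on a contested label.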

9

votes

1 answer

Using SVM as a binary classifier, is the label for a data point chosen by consensus?

I'm learning Support Vector Machines, and I'm unable to understand how a class label is chosen for a data point in a binary classifier. Is it chosen by consensus with respect to the classification in each dimension of the separating hyperplane?

svm classification binary

asked May 21 '14 at 15:12

gc5

879
2
9
17
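
For a linear binary SVM, the label is not chosen by a per-dimension vote: all dimensions are combined into a single decision value $w \cdot x + b$, and the label is its sign. A minimal sketch, assuming a linear kernel and made-up values for `w` and `b`:

```python
def decision_value(w, b, x):
    """Signed score of point x relative to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict_label(w, b, x):
    """Class +1 on one side of the hyperplane, -1 on the other."""
    return 1 if decision_value(w, b, x) >= 0 else -1

# Hypothetical hyperplane parameters and two test points.
w, b = [2.0, -1.0], -0.5
points = [[1.0, 0.0], [0.0, 1.0]]
predictions = [predict_label(w, b, x) for x in points]
```

Each feature contributes to the score through its weight, but only the sign of the total decides the class.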

9

votes

3 answers

Google Prediction API: What training/prediction methods does the Google Prediction API employ?

The details of the Google Prediction API are on this page, but I am not able to find any details about the prediction algorithms running behind the API. So far I have gathered that they let you provide your preprocessing steps in PMML format.

tools

asked May 21 '14 at 11:22

Tahir Akhtar

315
2
9

9

votes

2 answers

Predicting probability from scikit-learn SVC decision_function with decision_function_shape='ovo'

I have a multiclass SVM classifier with labels 'A', 'B', 'C', 'D'. This is the code I'm running: >>>print clf.predict([predict_this]) ['A'] >>>print clf.decision_function([predict_this]) [[ 185.23220833 43.62763596 180.83305074 -93.58628288 …

machine-learning python scikit-learn svm

asked Apr 15 '17 at 10:18

Samkit Jain

213
1
2
9
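
With a one-vs-one scheme and four classes there are six pairwise classifiers, and the predicted class is the one that wins the most pairwise contests. The sketch below illustrates that voting step only. It assumes the pairs are ordered (A,B), (A,C), (A,D), (B,C), (B,D), (C,D) and that a positive value favours the first class of each pair; both assumptions should be checked against the scikit-learn documentation. The decision values are made up and are not the ones from the question.

```python
from itertools import combinations

def ovo_vote(classes, pairwise_values):
    """Count pairwise wins and return (winning class, votes per class)."""
    pairs = list(combinations(range(len(classes)), 2))
    if len(pairs) != len(pairwise_values):
        raise ValueError("expected one decision value per class pair")
    votes = [0] * len(classes)
    for (i, j), value in zip(pairs, pairwise_values):
        # Assumed convention: positive value means class i beats class j.
        votes[i if value > 0 else j] += 1
    best = max(range(len(classes)), key=lambda k: votes[k])
    return classes[best], votes

classes = ["A", "B", "C", "D"]
values = [1.2, 0.4, 2.0, -0.3, 0.8, 1.1]  # hypothetical decision values
winner, votes = ovo_vote(classes, values)
```

Vote counts are not probabilities. If calibrated probabilities are needed, `SVC` can be constructed with `probability=True`, which makes `predict_proba` available.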
