Most Popular
1500 questions
9 votes
2 answers
Ways to reconstruct shuffled pixels of a video file?
Suppose that you have a video file whose pixel order has been shuffled once. That is, a random order has been defined once and applied to all frames.
Is there a known approach for retrieving the initial order of the pixels?
I have some ideas…
Denis Dollfus
9 votes
2 answers
Why does decreasing the SGD learning rate cause a massive increase in accuracy?
In papers such as this I often see training curves with this kind of shape:
In this case SGD was used with a factor of 0.9 and learning rate decreasing by a factor of 10 every 30 epochs.
Why is there such a large decrease in error when the…
geometrikal
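The schedule mentioned in the excerpt above (learning rate divided by 10 every 30 epochs) is commonly called step decay. A minimal sketch follows; the initial rate of 0.1 and the function name `step_decay_lr` are assumptions for illustration and are not taken from the question.

```python
def step_decay_lr(epoch, initial_lr=0.1, drop_factor=0.1, epochs_per_drop=30):
    """Return the learning rate for a 0-indexed epoch under step decay.

    initial_lr is an assumed starting value; drop_factor=0.1 and
    epochs_per_drop=30 mirror "a factor of 10 every 30 epochs".
    """
    # Integer division counts how many drops have happened so far.
    return initial_lr * drop_factor ** (epoch // epochs_per_drop)


# The rate stays constant within each 30-epoch block, then drops tenfold.
for epoch in (0, 29, 30, 59, 60):
    print(epoch, step_decay_lr(epoch))
```

The sudden drops in the rate at epochs 30 and 60 line up with the steps visible in training curves of the kind the question describes.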
9 votes
2 answers
"Deep Noether's Theorem": Building in Symmetry Constraints
If I have a learning problem that should have an inherent symmetry, is there a way to subject my learning problem to a symmetry constraint to enhance learning?
For example, if I am doing image recognition, I might want 2D rotational symmetry.…
user32280
9 votes
2 answers
Tensorflow regression model giving same prediction every time
import tensorflow as tf
x = tf.placeholder(tf.float32, [None,4]) # input vector
w1 = tf.Variable(tf.random_normal([4,2])) # weights between first and second layers
b1 = tf.Variable(tf.zeros([2])) # biases added to hidden…
Tarun
9 votes
3 answers
Human activity recognition using smartphone data set problem
I'm new to this community and hopefully my question will fit in well here.
As part of my undergraduate data analytics course I have chosen to do a project on human activity recognition using smartphone data sets. As far as I'm concerned this topic…
Jakubee
9 votes
1 answer
What are the most suitable machine learning algorithms according to type of data?
I am a beginner in data science. I found that some machine learning algorithms perform better when given a particular kind of data (i.e. numerical, categorical, text, graphical).
I searched for this topic on the web, but had no luck.
I would like to know…
kaushalyap
9 votes
1 answer
How do I approach a classification problem where one of the classes is defined by 'not any of the others'
Suppose that I am interested in three classes $c_1$, $c_2$, $c_3$. But my dataset actually contains several more real classes $(c_j)_{j=4}^n$.
The obvious answer is to define a new class $\hat c_4$ that refers to all classes $c_j$, $j>3$ but I…
h3h325
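The "obvious answer" named in the excerpt above, merging every class outside the three of interest into one catch-all class, amounts to a relabelling step. A minimal sketch, assuming string labels; the helper name `collapse_to_other` and the label `"other"` are hypothetical.

```python
def collapse_to_other(labels, classes_of_interest, other_label="other"):
    """Map every label outside classes_of_interest to a single catch-all label."""
    keep = set(classes_of_interest)
    return [y if y in keep else other_label for y in labels]


# c1..c3 are the classes of interest; c4, c5 and c7 stand in for the remaining classes.
labels = ["c1", "c5", "c2", "c7", "c3", "c4"]
print(collapse_to_other(labels, ["c1", "c2", "c3"]))
# prints ['c1', 'other', 'c2', 'other', 'c3', 'other']
```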
9 votes
2 answers
What is the rationale for discretization of continuous features and when should it be done?
Continuous feature discretization usually leads to loss of information due to the binning process. However, most of the top solutions for Kaggle Titanic are based on discretization (age, fare).
When should continuous features be discretized? Is there…
drichlet
9 votes
2 answers
Why are my training and validation losses not changing?
I used the MSE loss function and SGD optimization:
xtrain = data.reshape(21168, 21, 21, 21,1)
inp = Input(shape=(21, 21, 21,1))
x = Conv3D(filters=512, kernel_size=(3, 3, 3), activation='relu',padding='same')(inp)
x = MaxPool3D(pool_size=(3, 3,…
sp_713
9 votes
1 answer
Extracting individual emails from an email thread
Most open source datasets are well formatted, i.e. each email message is cleanly separated, as in the Enron email dataset. But out in the real world it is very difficult to separate the top email message from a thread of emails.
For example consider…
Greedy Coder
9 votes
4 answers
Improving accuracy of Text Classification
I am working on a text classification problem. The objective is to classify news articles into their corresponding categories, but in this case the categories are not broad ones like politics, sports, or economics; they are very closely related and…
ac-lap
9 votes
1 answer
Can training label confidence be used to improve prediction accuracy?
I have training data that is labelled with binary values. I have also collected a confidence for each of these labels, i.e. a confidence of 0.8 would mean that 80% of the human labellers agree on that label.
Is it possible to use this confidence data to…
Ben J. Hawkins
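The confidence described in the excerpt above is the fraction of labellers who agree with the chosen label. A minimal sketch of that arithmetic, assuming the raw votes per example are available as a list; the function name `label_with_confidence` is hypothetical.

```python
from collections import Counter


def label_with_confidence(votes):
    """Return (majority_label, fraction_of_votes_agreeing_with_it).

    Assumes a non-empty list of votes. On a tie, Counter.most_common
    returns the label that was encountered first.
    """
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


# Four of five labellers chose 1, so the label is 1 with confidence 0.8.
print(label_with_confidence([1, 1, 0, 1, 1]))
# prints (1, 0.8)
```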
9 votes
1 answer
Using SVM as a binary classifier, is the label for a data point chosen by consensus?
I'm learning Support Vector Machines, and I'm unable to understand how a class label is chosen for a data point in a binary classifier. Is it chosen by consensus with respect to the classification in each dimension of the separating hyperplane?
gc5
9 votes
3 answers
Google Prediction API: What training/prediction methods does the Google Prediction API employ?
The details of the Google Prediction API are on this page, but I am not able to find any details about the prediction algorithms running behind the API.
So far I have gathered that they let you provide your preprocessing steps in PMML format.
Tahir Akhtar
9 votes
2 answers
Predicting probability from scikit-learn SVC decision_function with decision_function_shape='ovo'
I have a multiclass SVM classifier with labels 'A', 'B', 'C', 'D'.
This is the code I'm running:
>>>print clf.predict([predict_this])
['A']
>>>print clf.decision_function([predict_this])
[[ 185.23220833 43.62763596 180.83305074 -93.58628288 …
Samkit Jain