Most Popular
1500 questions
9 votes, 1 answer
Feature selection for Support Vector Machines
My question is three-fold.
In the context of "kernelized" support vector machines:
Is variable/feature selection desirable, especially since we regularize via the parameter C to prevent overfitting and the main motive behind introducing kernels to an SVM…
Nitin Srivastava
9 votes, 3 answers
Is (nearly) all data separable?
Suppose I have some data set with two classes. I could draw a decision boundary around each data point belonging to one of these classes, and hence, separate the data, like so:
Where the red lines are the decision boundaries around the data points…
Data
9 votes, 5 answers
Decision tree with final decision being a linear regression
Question:
I want to implement a decision tree in which each leaf is a linear regression. Does such a model exist (preferably in sklearn)?
Example case 1:
Mockup data is generated using the formula:
y = int(x) + x * 1.5
Which looks like:
I want to…
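A minimal sketch of how the mockup data from the formula above could be generated, using only the standard library. The function name `mock_target` and the sample grid are assumptions made for illustration; they do not come from the question.

```python
def mock_target(x: float) -> float:
    """Formula from the question: y = int(x) + x * 1.5."""
    # int(x) truncates toward zero, so y jumps by 1 at every integer,
    # while x * 1.5 contributes a constant slope within each unit interval.
    return int(x) + x * 1.5

# Hypothetical sample grid: 0.0, 0.25, ..., 3.0
xs = [i / 4 for i in range(13)]
ys = [mock_target(x) for x in xs]

for x, y in zip(xs, ys):
    print(f"x={x:.2f}  y={y:.3f}")
```

For non-negative x, each unit interval [k, k+1) follows the line y = k + 1.5x, which is why a tree that splits on x and fits a linear model in each leaf matches this data.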
Nathan
9 votes, 1 answer
How can you build a model that extracts data out from receipts?
I'm trying to build a model that is capable of identifying information on receipts and invoices.
I have used the Google Cloud Vision API for text extraction from the receipt, but the problem is that it just returns all the text from a receipt. I am looking to…
user_12
9 votes, 4 answers
Validation loss much higher than training loss
I am training a CNN on some text data. The sentences are padded and embedded and fed to a CNN. The model architecture is:
model = Sequential()
model.add(Embedding(max_features, embedding_dims, input_length=maxlen))
model.add(Conv1D(128, 5,…
NoLand'sMan
9 votes, 4 answers
How to deal with spelling errors in NLP
I have some data where the main column is the description of a product. The main task is to extract the name of the product from this column, where it is sometimes misspelled or altered. I have more than a thousand possible…
Roland
9 votes, 1 answer
R - Interpreting neural networks plot
I know there are similar questions on stats.SE, but I didn't find one that answers my request; please ping me in the comments before marking the question as a duplicate.
I run a neural network based on neuralnet to forecast the SP500 index time series and…
Quantopik
9 votes, 2 answers
Are there any graph embedding algorithms like this already?
I wrote an algorithm for generating node embeddings based on the graph's topology. Most of the explanation is done in the readme file and the examples.
The question is:
Am I reinventing the wheel?
Does this approach have any practical advantages…
monomonedula
9 votes, 3 answers
How to combine GridSearchCV with Early Stopping?
I'm a beginner in machine learning and want to train a CNN (for image recognition) with optimized hyperparameters such as dropout rate, learning rate and number of epochs.
I am trying to find the optimal hyperparameters via GridSearchCV from scikit-learn.
I…
Code Now
9 votes, 2 answers
How prevalent is `C/C++` in machine learning development?
I am currently a data scientist mostly doing NLP, and I do most of my work in Python. Since I didn't get a CS degree in undergrad, I've been limited to very high-level languages: Java, Python, and R. I somehow even took Data Structures and Algorithms…
gust
9 votes, 3 answers
How to run a PySpark application in the Windows 8 command prompt
I have a Python script written with Spark Context and I want to run it. I tried to integrate IPython with Spark, but I could not do that. So, I tried to set the Spark path [ Installation folder/bin ] as an environment variable and called…
SRS
9 votes, 2 answers
What is the difference between gradient descent and gradient boosting? Are they interdependent in any way?
What is the difference between gradient descent and gradient boosting? Are they interdependent in any way?
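As a rough illustration of the distinction the question asks about, the sketch below puts the two ideas side by side in plain Python. It is not taken from any answer: the function names, the toy loss, the learning rates, and the deliberately trivial "weak learner" (a constant equal to the mean residual) are all assumptions chosen to keep the example small.

```python
# Gradient descent: repeatedly move a *parameter* w along the negative
# gradient of a loss function.
def gradient_descent(grad, w0, lr=0.1, steps=100):
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# Toy loss (w - 3)^2, whose gradient is 2 * (w - 3); the minimum is at w = 3.
w_star = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)

# Gradient boosting with squared loss: repeatedly update the model's
# *predictions* by adding a weak learner fitted to the negative gradient
# of the loss with respect to those predictions (here, the residuals).
def boost_constant(y, lr=0.5, stages=50):
    preds = [0.0] * len(y)
    for _ in range(stages):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        # Hypothetical weak learner: a single constant, the mean residual.
        step = sum(residuals) / len(residuals)
        preds = [pi + lr * step for pi in preds]
    return preds

preds = boost_constant([1.0, 2.0, 6.0])
print(w_star, preds)
```

In this sketch both loops follow a negative gradient; the difference is what gets updated: a parameter vector in gradient descent, and an additive ensemble of learners in gradient boosting. Real boosting libraries use regression trees as the weak learner rather than a constant.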
star
9 votes, 2 answers
Relationship between VC dimension and degrees of freedom
I'm studying machine learning and I feel there is a strong relationship between the concept of VC dimension and the more classical (statistical) concept of degrees of freedom.
Can anyone explain such a connection?
stochazesthai
9 votes, 1 answer
How to check whether all values in a particular column have the same data type?
I have a column 'ABC' which has 5000 rows. Currently, the dtype of the column is object. Most values are strings, but some are not; I want to find all those rows and modify them. The column is as follows:
1 abc
2 def
3 ghi
4 23
5…
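One possible way to locate such rows with pandas is sketched below. It rebuilds only the four sample rows shown above and assumes that the value 23 is stored as a Python int rather than the string "23"; the variable names are illustrative.

```python
import pandas as pd

# Sample rows from the question; 23 is assumed to be an int, not "23".
df = pd.DataFrame({"ABC": ["abc", "def", "ghi", 23]}, index=[1, 2, 3, 4])

# Boolean mask: True where the value is a Python str.
is_str = df["ABC"].map(lambda v: isinstance(v, str))

# Rows whose value is not a string.
non_string_rows = df[~is_str]
print(non_string_rows)

# Count of values per Python type, useful as a quick overview.
type_counts = df["ABC"].map(lambda v: type(v).__name__).value_counts()
print(type_counts)
```

The mask `~is_str` can then be used with `df.loc[~is_str, "ABC"] = ...` to modify only the offending rows.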
Kiran
9 votes, 2 answers
Any differences in regularisation in MLP between batch and individual updates?
I have just learned about regularisation as an approach to controlling over-fitting, and I would like to incorporate the idea into a simple implementation of backpropagation and a multilayer perceptron (MLP) that I put together.
Currently to avoid…
Neil Slater