Most Popular
1500 questions
9
votes
1 answer
When does a decision tree perform better than a neural network?
I was experimenting with different modelling methods, including KNN, decision trees, neural networks and SVM, trying to fit my data to see which works best. To my surprise, the decision tree works best, with a training accuracy of 1.0 and…
Suhail Gupta
- 611
- 8
- 15
9
votes
2 answers
Display Images (url) Inside Pandas Dataframe
I would like to display images (mostly JPG and PNG) directly from their URLs inside a pandas DataFrame. Imagine I already have the following DataFrame:
id image_url
1 …
TwinPenguins
- 4,429
- 3
- 22
- 54
9
votes
2 answers
Is there any consensus on choosing an appropriate ML approach?
I am studying data science at the moment and we are taught a dizzying variety of basic regression/classification techniques (linear, logistic, trees, splines, ANN, SVM, MARS, and so on), along with a variety of extra tools (bootstrapping,…
Brendan Hill
- 155
- 8
9
votes
3 answers
R random forest on Amazon ec2 Error: cannot allocate vector of size 5.4 Gb
I am training random forest models in R using randomForest() with 1000 trees and data frames with about 20 predictors and 600K rows. On my laptop everything works fine, but when I run the same thing on Amazon EC2, I get the error:
Error:…
SOUser
9
votes
2 answers
Dealing with feature vectors of variable length
How does one deal with a feature vector that can vary in size?
Let's say per object, I calculate 4 features. In order to solve a certain regression problem, I may have 1, 2, or more of these objects (no more than 10). Thus, the feature vector is…
Otto Nahmee
- 91
- 1
- 4
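One common way to handle the variable-length case described in this question is to pad every sample to the maximum size and carry a mask marking which slots are real. The sketch below assumes the numbers given in the excerpt (4 features per object, at most 10 objects); the function name `pad_features` and the zero pad value are illustrative assumptions, not something the question specifies.

```python
FEATURES_PER_OBJECT = 4   # from the question
MAX_OBJECTS = 10          # from the question ("no more than 10")

def pad_features(objects, pad_value=0.0):
    """Flatten per-object features and pad to a fixed length.

    Returns (vector, mask). The vector always has
    MAX_OBJECTS * FEATURES_PER_OBJECT entries; the mask has one flag per
    object slot (1 = real object, 0 = padding).
    """
    if not 1 <= len(objects) <= MAX_OBJECTS:
        raise ValueError("expected between 1 and 10 objects")
    vector = []
    for obj in objects:
        if len(obj) != FEATURES_PER_OBJECT:
            raise ValueError("each object needs exactly 4 features")
        vector.extend(float(x) for x in obj)
    n_missing = MAX_OBJECTS - len(objects)
    # Fill the unused object slots with the pad value.
    vector.extend([pad_value] * (n_missing * FEATURES_PER_OBJECT))
    mask = [1] * len(objects) + [0] * n_missing
    return vector, mask

# Hypothetical sample with two objects.
vector, mask = pad_features([[1, 2, 3, 4], [5, 6, 7, 8]])
```

Whether padding is appropriate depends on the model: the mask lets a downstream model distinguish a genuine zero from an empty slot, but padding also imposes an ordering on the objects that may not be meaningful.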
9
votes
3 answers
Interactive Graphing while logging data
I'm looking to graph and interactively explore live/continuously measured data. There are quite a few options out there, with plot.ly being the most user-friendly. Plot.ly has a fantastic and easy to use UI (easily scalable, pannable, easily…
Clayton Pipkin
- 93
- 3
9
votes
3 answers
Text classification with thousands of output classes in Keras
Task:
I have a dataset of job titles and descriptions. The task is to predict the tags for a job from its title and description.
There are several tags for each job posting. Therefore, the number of labels for the model will be measured in tens of…
lemon
- 205
- 2
- 6
9
votes
3 answers
How to use Cross Entropy loss in pytorch for binary prediction?
The PyTorch docs say of the cross-entropy loss:
input has to be a Tensor of size (minibatch, C)
Does this mean that for binary (0,1) prediction, the input must be converted into an (N,2) tensor where the second dimension is equal to (1-p)?
So…
AAC
- 509
- 2
- 6
- 13
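The relationship this question asks about can be checked numerically without PyTorch. The sketch below assumes the usual definition of the cross-entropy loss, in which the C inputs per sample are raw scores (logits) that pass through a softmax, rather than probabilities. Under that assumption, a two-score input `[0, z]` gives the same loss as a single-output binary cross-entropy on `sigmoid(z)`. All function names here are illustrative and are not PyTorch APIs.

```python
import math

def softmax_cross_entropy(logits, target):
    """Cross-entropy for one sample: C raw scores and a class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(p, target):
    """Binary cross-entropy for one sample: p = P(class 1), target in {0, 1}."""
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

z = 0.7  # arbitrary example score for class 1
for target in (0, 1):
    two_scores = softmax_cross_entropy([0.0, z], target)
    one_output = binary_cross_entropy(sigmoid(z), target)
    print(target, two_scores, one_output)
```

So a binary problem can be posed either as two scores per sample with a class-index target, or as one score per sample with a 0/1 target; the two formulations agree as long as the scores are treated as logits.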
9
votes
2 answers
Difference between using RMSE and nDCG to evaluate Recommender Systems
What kind of error measures do RMSE and nDCG give while evaluating a recommender system, and how do I know when to use one over the other? If you could give an example of when to use each, that would be great as well!
covfefe
- 293
- 4
- 7
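To make the contrast in this question concrete, here is a minimal sketch that computes both measures on hypothetical ratings. It assumes the linear-gain form of DCG (relevance divided by log2(rank + 1)); other variants exist. The data and function names are made up for illustration.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and true ratings."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

def dcg(relevances):
    """Discounted cumulative gain of relevances listed in ranked order."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(predicted, actual):
    """DCG of the ranking induced by the predictions, divided by the ideal DCG."""
    order = sorted(range(len(actual)), key=lambda i: predicted[i], reverse=True)
    ranked = [actual[i] for i in order]
    ideal_dcg = dcg(sorted(actual, reverse=True))
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0

actual = [5, 3, 1]          # hypothetical true ratings for three items
pred_a = [3.0, 2.0, 1.0]    # values are off, but the order is right
pred_b = [4.0, 4.5, 1.0]    # values are closer, but the top two are swapped

for name, pred in (("a", pred_a), ("b", pred_b)):
    print(name, rmse(pred, actual), ndcg(pred, actual))
```

In this example the second set of predictions has the lower RMSE but the worse nDCG, which illustrates the distinction: RMSE scores how close the predicted values are, while nDCG scores only the quality of the resulting ranking.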
9
votes
2 answers
How does an FC layer work in a typical CNN?
I am new to CNNs and NNs. I am reading this blog: CNN, and I am confused about one part: the operation that will be performed on an input vector/matrix. Will we be using a typical ANN equation, "O = W.T * input"? And then a…
user57521
- 91
- 1
- 1
- 2
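As a rough illustration of the operation this question asks about, the sketch below flattens a stack of feature maps into one vector and applies a weighted sum plus bias per output unit. This is the standard dense-layer computation as generally understood, not something taken from the blog the question refers to; the shapes, values, and function names are hypothetical, and the activation function is omitted.

```python
def flatten(feature_maps):
    """Turn feature maps (channels x rows x cols) into one flat vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]

def fully_connected(x, weights, biases):
    """Compute o[j] = sum_i weights[j][i] * x[i] + biases[j] for each output j."""
    if any(len(w_row) != len(x) for w_row in weights):
        raise ValueError("each weight row must match the input length")
    return [sum(w * xi for w, xi in zip(w_row, x)) + b
            for w_row, b in zip(weights, biases)]

# Two hypothetical 2x2 feature maps from the last convolution/pooling stage.
feature_maps = [[[1, 2], [3, 4]],
                [[5, 6], [7, 8]]]
x = flatten(feature_maps)                # 8 input values

# Two output units, so the weight matrix is 2 x 8.
weights = [[1] * 8,
           [0, 0, 0, 0, 0, 0, 0, 1]]
biases = [0.5, -1.0]
outputs = fully_connected(x, weights, biases)
```

The point of the sketch is that the FC layer does not see the 2D layout at all: once the feature maps are flattened, each output is an ordinary dot product over every input value.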
9
votes
1 answer
Using class weights in Keras with multiple binary outputs which are not simply one-hot-encoded
My labels are binary vectors of length 5, e.g., [0, 0, 1, 1, 1].
My label set is very biased, 1-to-50, where the case [0, 0, 0, 0, 0] is very common while all other combinations are not. I'd like to weight the uncommon versions using the…
André Christoffer Andersen
- 336
- 3
- 9
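One possible way to weight rare label combinations, sketched here without Keras, is to treat each distinct binary vector as its own group and give every sample a weight inversely proportional to its group's frequency. The "balanced" formula used below (n_samples / (n_groups * group_count)) is a common heuristic and an assumption of this sketch, as are the function name and the toy data. The resulting list could then be supplied as per-sample weights, for example through the `sample_weight` argument of Keras's `fit`.

```python
from collections import Counter

def combination_weights(labels):
    """Per-sample weights, inversely proportional to label-combination frequency."""
    counts = Counter(tuple(row) for row in labels)
    n_samples = len(labels)
    n_combos = len(counts)
    # Balanced heuristic: rare combinations get proportionally larger weights.
    return [n_samples / (n_combos * counts[tuple(row)]) for row in labels]

# Hypothetical toy label set: the all-zero vector dominates.
labels = [[0, 0, 0, 0, 0]] * 3 + [[0, 0, 1, 1, 1]]
weights = combination_weights(labels)
```

With this scheme the weights sum to the number of samples, so the overall scale of the loss stays roughly the same while the rare combinations contribute more per sample.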
9
votes
2 answers
How to learn 3D orientations reliably?
I am working on neural network models for 3D skeletal character animation, where I learn joint positions and orientations. The problem comes with the orientations. There are several ways I can choose to represent a 3D rotation, but all of them have…
javidcf
- 306
- 1
- 8
9
votes
5 answers
Measuring the uncertainty of predictions
Given a multiclass classification model with n features, how can I measure the uncertainty of the model for a particular classification?
Let's say that for some class the model accuracy is amazing, but for another it's not. I would like to find…
Latent
- 334
- 3
- 16
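One simple per-prediction measure, assuming the model outputs a probability for each class, is the entropy of that probability vector: it is 0 when all mass sits on one class and largest when the classes are equally likely. The sketch below is only one option among several, and the function names are illustrative.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def normalized_entropy(probs):
    """Entropy scaled to [0, 1] by dividing by the maximum, log(number of classes)."""
    return predictive_entropy(probs) / math.log(len(probs))

confident = [0.97, 0.01, 0.01, 0.01]   # hypothetical model outputs
unsure = [0.25, 0.25, 0.25, 0.25]

for probs in (confident, unsure):
    print(probs, normalized_entropy(probs))
```

Note that this only reflects how spread out the model's own probabilities are. If the model is poorly calibrated for some class, as the question suggests may be the case, a low entropy does not by itself mean the prediction is reliable.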
9
votes
4 answers
How to download a Jupyter Notebook from GitHub?
This is a fairly basic question.
I am working on a data science project inside of a Pandas tutorial. I can access my Jupyter notebooks through my Anaconda installation. The only problem is that the tutorial notebooks (exercise files) are on…
Ethan
- 1,657
- 9
- 25
- 39
9
votes
2 answers
Data Science as a Social Scientist?
As I am very interested in programming and statistics, Data Science seems like a great career path to me: I like both fields and would like to combine them. Unfortunately, I have studied political science with a non-statistical-sounding Master's. I…
Christian Sauer
- 657
- 4
- 7