Most Popular

1500 questions
9
votes
1 answer

When does a decision tree perform better than a neural network?

I was experimenting with different modelling methods including KNN, Decision Trees, Neural Networks and SVM and trying to fit my data to see which works the best. To my surprise, the decision tree works the best with a training accuracy of 1.0 and…
Suhail Gupta
9
votes
2 answers

Display Images (url) Inside Pandas Dataframe

I would like to display images (mostly jpg and png formats) directly from their url link inside a pandas dataframe. Imagine I already have the following dataframe: id image_url 1 …
TwinPenguins
9
votes
2 answers

Is there any consensus on choosing an appropriate ML approach?

I am studying data science at the moment and we are taught a dizzying variety of basic regression/classification techniques (linear, logistic, trees, splines, ANN, SVM, MARS, and so on), along with a variety of extra tools (bootstrapping,…
9
votes
3 answers

R random forest on Amazon ec2 Error: cannot allocate vector of size 5.4 Gb

I am training random forest models in R using randomForest() with 1000 trees and data frames with about 20 predictors and 600K rows. On my laptop everything works fine, but when I move to Amazon EC2 to run the same thing, I get the error: Error:…
SOUser
9
votes
2 answers

Dealing with feature vectors of variable length

How does one deal with a feature vector that can vary in size? Let's say per object, I calculate 4 features. In order to solve a certain regression problem, I may have 1, 2, or more of these objects (no more than 10). Thus, the feature vector is…
Otto Nahmee
9
votes
3 answers

Interactive Graphing while logging data

I'm looking to graph and interactively explore live/continuously measured data. There are quite a few options out there, with plot.ly being the most user-friendly. Plot.ly has a fantastic and easy to use UI (easily scalable, pannable, easily…
9
votes
3 answers

Text classification with thousands of output classes in Keras

Task: I have a dataset with job titles and descriptions. The task is to predict tags for a job from its title and description. There are several tags for each job posting. Therefore, the number of labels for the model will be measured in tens of…
lemon
9
votes
3 answers

How to use Cross Entropy loss in pytorch for binary prediction?

In the PyTorch docs, it says for cross entropy loss: input has to be a Tensor of size (minibatch, C). Does this mean that for binary (0,1) prediction, the input must be converted into an (N,2) tensor where the second dimension is equal to (1-p)? So…
AAC
9
votes
2 answers

Difference between using RMSE and nDCG to evaluate Recommender Systems

What kind of error measures do RMSE and nDCG give while evaluating a recommender system, and how do I know when to use one over the other? If you could give an example of when to use each, that would be great as well!
9
votes
2 answers

How does a FC layer work in a typical CNN

I am new to CNNs and NNs. I am reading this blog: CNN and I am confused about this part: What confuses me is the operation that will be performed on an input vector/matrix. Will we be using a typical ANN equation: "O = W.T * input"? And then a…
user57521
9
votes
1 answer

Using class weights in Keras with multiple binary outputs which are not simply one-hot-encoded

My labels are binary vectors of length 5, e.g., [0, 0, 1, 1, 1]. My label set is very biased, 1-to-50, where the case [0, 0, 0, 0, 0] is very common while all other combinations are not. I'd like to weight the uncommon versions using the…
9
votes
2 answers

How to learn 3D orientations reliably?

I am working on neural network models for 3D skeletal character animation, where I learn joint positions and orientations. The problem comes with the orientations. There are several ways I can choose to represent a 3D rotation, but all of them have…
javidcf
9
votes
5 answers

Measuring the uncertainty of predictions

Given a multiclass classification model, with n features, how can I measure the uncertainty of the model for that particular classification? Let's say that for some class the model accuracy is amazing, but for another it's not. I would like to find…
Latent
9
votes
4 answers

How to download a Jupyter Notebook from GitHub?

This is a fairly basic question. I am working on a data science project inside of a Pandas tutorial. I can access my Jupyter notebooks through my Anaconda installation. The only problem is that the tutorial notebooks (exercise files) are on…
Ethan
9
votes
2 answers

Data Science as a Social Scientist?

As I am very interested in programming and statistics, Data Science seems like a great career path to me - I like both fields and would like to combine them. Unfortunately, I have studied political science with a non-statistical-sounding Master's. I…
Christian Sauer