Most Popular

1500 questions
10
votes
5 answers

Visualizing items frequently purchased together

I have a dataset with the following structure, stored in a CSV file: Banana Water Rice Rice Water Bread Banana Juice Each row indicates a collection of items that were purchased together. For example, the first row denotes that the items…
João_testeSW
  • 179
  • 2
  • 3
  • 13
10
votes
1 answer

What is the most efficient data indexing technique?

As we all know, there are several data indexing techniques used by well-known indexing apps, like Lucene (for Java) or Lucene.NET (for .NET), MurmurHash, B+Tree etc. For a NoSQL / Object Oriented Database (which I try to write/play a little around…
10
votes
4 answers

Why is it hard to grant efficiency while using libraries?

Any small database processing can be easily tackled by Python/Perl/... scripts that use libraries and/or even utilities from the language itself. However, when it comes to performance, people tend to reach out for C/C++/low-level languages. The…
Rubens
  • 4,117
  • 5
  • 25
  • 42
10
votes
3 answers

Switching Keras backend TensorFlow to GPU

I use the Keras-TensorFlow combo installed with the CPU option (it was said to be more robust), but now I'd like to try the GPU version. Is there a convenient way to switch, or should I fully reinstall TensorFlow? Is the GPU version reliable?
Hendrik
  • 8,767
  • 17
  • 43
  • 55
10
votes
1 answer

Naive Bayes should generate a prediction given missing features (scikit-learn)

Seeing that Naive Bayes uses probability to make a prediction, and treats features as being conditionally independent of each other, then it makes sense that the model can still make a prediction given that there are some features missing in the…
gbhrea
  • 307
  • 4
  • 10
10
votes
2 answers

Parameters in GridSearchCV in scikit-learn

I am trying to build a model in scikit-learn. I used RandomForestClassifier as my method for classification. In order to improve the score and efficiency of my model, I thought about using GridSearchCV. Here is the code: import pandas as pd import…
enterML
  • 3,091
  • 9
  • 28
  • 38
10
votes
4 answers

What is affine transformation in regard to neural networks?

I have been reading a paper recently on Highway Neural Networks and found the following: $y=H(x,W_H)$ $H$ is usually an affine transform followed by a non-linear activation function, but in general it may take other forms. After Googling about…
tastyminerals
  • 2,177
  • 3
  • 18
  • 20
10
votes
3 answers

Nested cross-validation and selecting the best regression model - is this the right SKLearn process?

If I understand correctly, nested CV can help me evaluate which model and hyperparameter tuning process is best. The inner loop (GridSearchCV) finds the best hyperparameters, and the outer loop (cross_val_score) evaluates the hyperparameter tuning…
10
votes
4 answers

Does training a neural network on a combined dataset outperform sequential training on individual datasets?

I have a neural network with a fixed architecture (let's call it Architecture A). I also have two datasets, Dataset 1 and Dataset 2, both of which are independently and identically distributed (i.i.d.). I’m exploring how training strategies affect…
10
votes
1 answer

What are generative and discriminative models? How are they used in Natural Language Processing?

This question asks about generative vs. discriminative algorithms, but can someone give an example of the difference between these forms when applied to Natural Language Processing? How are generative and discriminative models used in NLP?
alvas
  • 2,510
  • 7
  • 28
  • 40
10
votes
1 answer

Backpropagation: In second-order methods, would the ReLU derivative be 0, and what is its effect on training?

ReLU is an activation function defined as $h = \max(0, a)$ where $a = Wx + b$. Normally, we train neural networks with first-order methods such as SGD, Adam, RMSprop, Adadelta, or Adagrad. Backpropagation in first-order methods requires first-order…
10
votes
2 answers

Stochastic gradient descent based on vector operations?

Let's assume that I want to train a stochastic gradient descent regression algorithm using a dataset that has N samples. Since the size of the dataset is fixed, I will reuse the data T times. At each iteration or "epoch", I use each training sample…
Pablo Suau
  • 1,809
  • 1
  • 14
  • 20
10
votes
1 answer

Difference: Replicator Neural Network vs. Autoencoder

I'm currently studying papers about outlier detection using RNNs (Replicator Neural Networks) and wonder how exactly they differ from Autoencoders. RNNs seem to be treated by many as the holy grail of outlier/anomaly detection, however…
Nex
  • 285
  • 2
  • 6
10
votes
5 answers

Dimension-Hopping in Machine Learning

What is the dimension-hopping problem in machine learning (occurring in convolutional neural networks and image recognition)? I have googled it, but all I get is information on the physics of material shape deformation. It would be more helpful…
Saurabh Jain
  • 213
  • 2
  • 7
10
votes
1 answer

How to summarize a long text using GPT-3

What is the best way to summarize a long text that exceeds the 4096-token limit (a podcast transcript, for example)? As I understand it, I need to split the text into chunks, summarize each chunk, and then concatenate the results and summarize those. Is there…
Poma
  • 203
  • 2
  • 6