Questions tagged [feature-extraction]

Variables used for prediction or explanation in regression and regression-like models (such as clustering or discrimination). Use this tag for questions about constructing such variables or selecting the best among them.

409 questions
72
votes
11 answers

What is dimensionality reduction? What is the difference between feature selection and extraction?

From Wikipedia: dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration, and can be divided into feature selection and feature extraction. What is the difference between feature…
alvas
48
votes
6 answers

Encoding features like month and hour as categorical or numeric?

Is it better to encode features like month and hour as factor or numeric in a machine learning model? On the one hand, I feel numeric encoding might be reasonable, because time is a forward progressing process (the fifth month is followed by the…
36
votes
4 answers

What is a good way to transform Cyclic Ordinal attributes?

I have an 'hour' field as an attribute, but it takes cyclic values. How can I transform the feature to preserve the information that hours '23' and '0' are close, not far apart? One way I could think of is the transformation: min(h, 23-h) Input: [0 1…
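One common approach to the cyclic-hour problem described above is to map each value onto the unit circle with a sine/cosine pair. The sketch below is only an illustration, not taken from the question or its answers; the helper names `encode_cyclic` and `circle_distance` are hypothetical. Note that the `min(h, 23-h)` idea from the excerpt maps hours 0 and 23 to the same value, so it cannot tell them apart, whereas the pair below keeps every hour distinct.

```python
import math

def encode_cyclic(value, period):
    """Map a cyclic value (e.g. hour of day) to a (sin, cos) pair on the unit circle."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def circle_distance(a, b, period=24):
    """Euclidean distance between two encoded values (hypothetical helper for illustration)."""
    sa, ca = encode_cyclic(a, period)
    sb, cb = encode_cyclic(b, period)
    return math.hypot(sa - sb, ca - cb)

# Hours 23 and 0 end up as close as hours 0 and 1,
# while hours 0 and 12 are as far apart as possible.
near = circle_distance(23, 0)
far = circle_distance(0, 12)
```

Both columns are needed: using only the sine (or only the cosine) would again make pairs of different hours collide.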
34
votes
6 answers

Are there any tools for feature engineering?

Specifically what I am looking for are tools with some functionality, which is specific to feature engineering. I would like to be able to easily smooth, visualize, fill gaps, etc. Something similar to MS Excel, but that has R as the underlying…
John
30
votes
3 answers

Why do we convert skewed data into a normal distribution?

I was going through a solution of the Housing prices competition on Kaggle (Human Analog's Kernel on House Prices: Advanced Regression Techniques) and came across this part: # Transform the skewed numeric features by taking log(feature + 1). # This…
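To make the `log(feature + 1)` step mentioned above concrete, here is a minimal sketch using only the standard library. The numbers in `prices` are made up for illustration and the `skewness` helper is hypothetical; neither comes from the kernel the question refers to. It only shows that the transform pulls in a long right tail, not that the result is normally distributed.

```python
import math

def skewness(xs):
    """Population skewness: third central moment divided by variance**1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Made-up, right-skewed values (illustrative only).
prices = [1, 2, 2, 3, 3, 4, 5, 8, 20, 100]

# log1p(x) computes log(1 + x) and is defined at x = 0.
logged = [math.log1p(x) for x in prices]

skew_before = skewness(prices)
skew_after = skewness(logged)
```

On this toy sample the skewness drops after the transform but stays positive, which is typical: the log compresses large values without guaranteeing symmetry.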
25
votes
2 answers

Feature Transformation on Input data

I was reading about the solution to this OTTO Kaggle challenge and the first place solution seems to use several transforms for the input data X, for example Log(X+1), sqrt(X + 3/8), etc. Is there a general guideline on when to apply which kind…
22
votes
2 answers

How to choose the features for a neural network?

I know that there is no clear answer to this question, but let's suppose that I have a huge neural network, with a lot of data, and I want to add a new input feature. The "best" way would be to test the network with the new feature and see the…
22
votes
3 answers

How to perform feature engineering on unknown features?

I am participating in a Kaggle competition. The dataset has around 100 features and all are unknown (in terms of what they actually represent). Basically they are just numbers. People are performing a lot of feature engineering on these features. I…
21
votes
3 answers

Feature extraction of images in Python

In my class I have to create an application using two classifiers to decide whether an object in an image is an example of phylum Porifera (sea sponge) or some other object. However, I am completely lost when it comes to feature extraction techniques…
Jeremy Barnes
21
votes
5 answers

Feature selection vs Feature extraction. Which to use when?

Feature extraction and feature selection essentially reduce the dimensionality of the data, but feature extraction also makes the data more separable, if I am right. Which technique would be preferred over the other and when? I was thinking,…
20
votes
1 answer

What is the difference between one-hot encoding and leave-one-out encoding?

I am reading a presentation and it recommends not using leave one out encoding, but it is okay with one hot encoding. I thought they both were the same. Can anyone describe what the differences between them are?
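As a rough illustration of why the two are not the same, the sketch below contrasts them on a toy example. It assumes the usual meaning of leave-one-out encoding as a target-based encoding (each row gets the mean target of the other rows in its category); whether the presentation uses the term this way is not confirmed by the excerpt. The function names and the global-mean fallback for single-row categories are assumptions made for illustration.

```python
from collections import defaultdict

def one_hot(categories):
    """One indicator column per distinct level; the target is not used."""
    levels = sorted(set(categories))
    return [[1 if c == level else 0 for level in levels] for c in categories]

def leave_one_out(categories, targets):
    """Replace each category with the mean target of the OTHER rows in that category."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for c, y in zip(categories, targets):
        sums[c] += y
        counts[c] += 1
    global_mean = sum(targets) / len(targets)
    encoded = []
    for c, y in zip(categories, targets):
        if counts[c] > 1:
            encoded.append((sums[c] - y) / (counts[c] - 1))
        else:
            # Assumption: fall back to the global mean when no other row shares the category.
            encoded.append(global_mean)
    return encoded

cats = ["a", "a", "b"]
ys = [1, 0, 1]
oh = one_hot(cats)             # three rows, two indicator columns
loo = leave_one_out(cats, ys)  # one numeric column derived from the target
```

One-hot encoding depends only on the category values, while the leave-one-out version depends on the target, which is why it needs more care to avoid leaking label information.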
18
votes
2 answers

List of feature engineering techniques

Is there any resource with a list of feature engineering techniques? A mapping of type of data, model and feature engineering technique would be a gold mine.
13
votes
3 answers

Unsupervised feature learning for NER

I have implemented an NER system using the CRF algorithm with my handcrafted features, which gave quite good results. The thing is that I used lots of different features including POS tags and lemmas. Now I want to make the same NER for different…
MaticDiba
13
votes
3 answers

How to use GAN for unsupervised feature extraction from images?

I understand how a GAN works, with two networks (generative and discriminative) competing with each other. I have built a DCGAN (GAN with convolutional discriminator and de-convolutional generator) which now successfully generates handwritten…
exAres
13
votes
2 answers

What features from parse trees are generally used in the classification process in NLP?

I am exploring different types of parse tree structures. The two widely known parse tree structures are a) constituency-based parse trees and b) dependency-based parse trees. I am able to generate both types of parse tree structures…