7

I have a NumPy array of strings ('n' and 'y') and want to convert it into an integer array of 0s and 1s. How can I do that?

import numpy as np
from sklearn.impute import SimpleImputer
imp = SimpleImputer(missing_values=np.nan, strategy='most_frequent')
X = imp.fit_transform(X)

X

array([['n', 'y', 'n', ..., 'y', 'n', 'y'],
       ['n', 'y', 'n', ..., 'y', 'n', 'y'],
       ['n', 'y', 'y', ..., 'y', 'n', 'n'],
       ...,
       ['n', 'y', 'n', ..., 'y', 'n', 'y'],
       ['n', 'n', 'n', ..., 'y', 'n', 'y'],
       ['n', 'y', 'n', ..., 'y', 'n', 'n']], dtype=object)
JL1829

4 Answers

14

This is quite simple:

(X=='y').astype(int)

This should do the trick. X=='y' produces a Boolean array that is True wherever the element is 'y' and False everywhere else, and astype(int) then converts it to integers, mapping True to 1 and False to 0.
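
As a quick check, here is a minimal sketch on a small made-up array (the sample values are an assumption, not the asker's actual data). Note that with this approach any entry other than 'y' ends up as 0, so it may be worth validating the input first.

```python
import numpy as np

# Hypothetical sample data mimicking the object-dtype array from the question
X = np.array([['n', 'y', 'n'],
              ['y', 'y', 'n']], dtype=object)

# Optional sanity check: every entry is either 'y' or 'n'
assert ((X == 'y') | (X == 'n')).all()

# The comparison yields a Boolean array; astype(int) maps True -> 1 and False -> 0
X_int = (X == 'y').astype(int)
print(X_int)
```

The result is a new integer array; the original X is left unchanged.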

Yohanes Alfredo
7

You could use the following code:

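X[X=='y'] = 1
X[X=='n'] = 0
X = X.astype(int)  # in-place assignment keeps the original dtype, so cast explicitly

This replaces every 'y' with 1 and every 'n' with 0 in place. X=='y' returns a Boolean array that is True where the element is 'y' and False everywhere else, and that array is used as a mask for the assignment; the same goes for X=='n'. The assignments alone leave the array with its original dtype (object in the question), which is why the final astype(int) is needed to get an actual integer array.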

Giannis Krilis
2

You can use np.where to replace 'n' and 'y'. See the documentation here:

https://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html
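
For instance, a minimal sketch of the np.where approach could look like the following. The small sample array is made up for illustration and stands in for the asker's X.

```python
import numpy as np

# Hypothetical sample data; the array in the question is larger
X = np.array([['n', 'y', 'n'],
              ['y', 'y', 'n']], dtype=object)

# np.where(condition, a, b) takes a where condition is True and b elsewhere
X_int = np.where(X == 'y', 1, 0)
print(X_int)
```

As with a plain comparison, everything that is not 'y' is mapped to 0 here.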

Also, please try to give more detail in your questions so that the right solution can be reached.

Aman Raparia
1

You can use np.vectorize as this answer explains:

In [1]: import numpy as np

In [2]: a = np.array([['n', 'y', 'n', 'n', 'y', 'n', 'y'],
   ...:        ['n', 'y', 'n', 'n', 'y', 'n', 'y'],
   ...:        ['n', 'y', 'y', 'y', 'y', 'n', 'n']])
   ...:        

In [3]: my_map = {'n': 0, 'y': 1}

In [4]: np.vectorize(my_map.get)(a)
Out[4]: 
array([[0, 1, 0, 0, 1, 0, 1],
       [0, 1, 0, 0, 1, 0, 1],
       [0, 1, 1, 1, 1, 0, 0]])
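
One caveat with my_map.get is that it returns None for any value missing from the dictionary. A possible variation, sketched here under the assumption that unexpected values should raise an error rather than pass through silently, looks the key up directly and fixes the output type with otypes. The sample array and the name to_int are made up for illustration.

```python
import numpy as np

# Hypothetical sample data
a = np.array([['n', 'y', 'n'],
              ['y', 'y', 'n']])

my_map = {'n': 0, 'y': 1}

# my_map[v] raises KeyError for unmapped values instead of returning None;
# otypes=[int] makes the output dtype explicit rather than inferred
to_int = np.vectorize(lambda v: my_map[v], otypes=[int])

result = to_int(a)
print(result)
```

np.vectorize is essentially a Python-level loop, so for a simple two-value mapping a direct comparison is usually faster.
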
user1717828