I have the following array:

array([['A', 0.05],
       ['B', 0.09],
       ['C', 0.13]])

I want to add a new column that labels the items A, B and C based on the value in the second column. If the value is above 0.10, the item gets the label '2'. If it is below 0.10, it gets the label '1'. So my desired output is:

 array([['A', 0.05,'1'],
        ['B', 0.09,'1'],
        ['C', 0.13,'2']])

How can I do this?

Steven Pauly

1 Answer

You can use numpy.where combined with numpy.column_stack:

import numpy as np

arr = np.array([['A', 0.05],
                ['B', 0.09],
                ['C', 0.13]])

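# The mixed-type array is stored as strings, so convert the second column to float first.
# (np.float was removed in NumPy 1.24; the built-in float works in every version.)
col = np.where(arr[:, 1].astype(float) > 0.10, '2', '1')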
arr = np.column_stack((arr, col))
print(arr)

Output

[['A' '0.05' '1']
 ['B' '0.09' '1']
 ['C' '0.13' '2']]

UPDATE

If you have more than two labels, you could do something like this:

import numpy as np

arr = np.array([['A', 0.05],
                ['B', 0.09],
                ['C', 0.13]])

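def calc(x):
    # Map a value to the label of the interval it falls in
    if x < 0.08:
        return '1'
    elif 0.08 <= x < 0.10:
        return '2'
    elif 0.10 <= x:  # <= so that exactly 0.10 gets '3' instead of None
        return '3'


# np.float was removed in NumPy 1.24; the built-in float works in every version
col = np.array([calc(e) for e in arr[:, 1].astype(float)])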
arr = np.column_stack((arr, col))
print(arr)

Output

[['A' '0.05' '1']
 ['B' '0.09' '2']
 ['C' '0.13' '3']]
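
As a vectorized alternative to the list comprehension, the same interval-to-label mapping can probably be done with numpy.digitize. This is only a sketch and not part of the solution above: the names edges and labels are made up for illustration, and it assumes that a value exactly on a boundary belongs to the upper interval (so 0.10 gets '3').

```python
import numpy as np

arr = np.array([['A', 0.05],
                ['B', 0.09],
                ['C', 0.13]])

# The array holds strings, so convert the second column to float first
values = arr[:, 1].astype(float)

# Hypothetical bin edges: x < 0.08 -> '1', 0.08 <= x < 0.10 -> '2', x >= 0.10 -> '3'
edges = [0.08, 0.10]
labels = np.array(['1', '2', '3'])

# np.digitize returns, for each value, the index of the interval it falls in
col = labels[np.digitize(values, edges)]
arr = np.column_stack((arr, col))
print(arr)
```

With more intervals, only edges and labels need to grow (labels must have one more entry than edges); no Python-level loop is needed.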
Dani Mesejo
  • Thanks! And what do I do when I want to make 3 classes: if the value for A, B, or C is below 0.08, it gets the label '1'; if it is higher than 0.10, it gets the label '3'; and everything in between gets the label '2'? – Steven Pauly Dec 19 '18 at 15:59
  • I would suggest creating a function that maps the intervals to the labels and applying it to every element in the array. – Dani Mesejo Dec 19 '18 at 16:00
  • I have done that now: my function is called 'calc(x)', where x is the value (0.05, 0.09, etc.). But I'm struggling to add this into the code you provided. Any suggestions on this? – Steven Pauly Dec 19 '18 at 16:22
  • @StevenPauly Updated the answer to include a solution for more than two labels. – Dani Mesejo Dec 19 '18 at 16:28
  • thanks!! all makes sense now – Steven Pauly Dec 19 '18 at 16:36
  • "all makes sense now" - except when x is exactly 0.1, where calc returns None; maybe that last elif should be 0.10 <= x: – Mr Ed Dec 20 '18 at 05:44