I have a dataset that looks something like this:
| col_1 | col_2 | col_3 |
|---|---|---|
| 0 | nan | nan |
| nan | 1 | nan |
| nan | nan | nan |
| nan | nan | 1 |
And I need to collapse those columns into something like this:
| foo |
|---|
| 0 |
| 1 |
| nan |
| 1 |
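For reference, here is a minimal reconstruction of the input frame (column names `col_1`/`col_2`/`col_3` taken from the table above; the real data is larger):

```python
import numpy as np
import pandas as pd

# Small stand-in for the real dataset, matching the table above:
# exactly one non-NaN value per row, except one all-NaN row.
df = pd.DataFrame({
    "col_1": [0, np.nan, np.nan, np.nan],
    "col_2": [np.nan, 1, np.nan, np.nan],
    "col_3": [np.nan, np.nan, np.nan, 1],
})
columns = ["col_1", "col_2", "col_3"]
```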
My first attempt was doing this:

```python
df[columns].values[~df[columns].isna()]
```
But since there are rows where all the values are NaN, those rows are dropped entirely, so the result is shorter than the original frame and no longer lines up with its index.
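To make the failure concrete, here is a self-contained sketch of that first attempt on a small stand-in frame (column names assumed from the tables above). Boolean masking keeps only the non-NaN entries, so the all-NaN row vanishes and only 3 values come back for 4 rows:

```python
import numpy as np
import pandas as pd

# Stand-in for the real dataset: one non-NaN per row, plus one all-NaN row.
df = pd.DataFrame({
    "col_1": [0, np.nan, np.nan, np.nan],
    "col_2": [np.nan, 1, np.nan, np.nan],
    "col_3": [np.nan, np.nan, np.nan, 1],
})
columns = ["col_1", "col_2", "col_3"]

# Boolean indexing flattens the 4x3 array and keeps only the
# positions where the mask is True, so row 2 (all NaN) disappears.
flat = df[columns].values[~df[columns].isna()]
```

Here `len(flat)` is 3 while `len(df)` is 4, which is exactly the row loss described above.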
My second attempt was doing this:

```python
import numpy as np

def get_cols_or_nan(row):
    mask = ~row.isna()
    if np.any(mask):
        # .iloc selects by position; row[mask][0] would try a
        # label lookup first, since the index holds column names
        return row[mask].iloc[0]
    return float("nan")

df[columns].apply(get_cols_or_nan, axis=1)
```
But this is significantly slower, to the point that it's impractical for my data, and I don't understand why.
Is there a more efficient way to collapse these columns? I am guaranteed that there is at most one non-NaN value in each row.