
I'm using pandas 0.17.1. Dealing with the SettingWithCopyWarning is widely discussed on SO, but I don't believe what looks like the most popular thread addresses my use case, which is assigning a scalar to a column.

My code:

df.loc[:, "some_col_name"] = 0

Assume that the column called "some_col_name" already exists; this is not adding the column (if such a statement even could).

It's generating a SettingWithCopyWarning, and for the life of me I can't figure out why.

It works when I set df.is_copy = False first, but I'd rather avoid the extra statement every time I do this if possible.

What am I doing wrong here?

Thanks!

Follow-up response to johnchase's answer: df was created by a groupby statement (see below), so I'm not sure where I'd add the .copy(). The remedy I mentioned works, but having to apply it suggests to me that pandas doesn't mark the dataframes yielded by groupby iteration as copies. (They are copies, though, right?)

for some_ix, df in bigger_df.groupby(cols_I_care_about):
    df.loc[:, "some_col_name"] = 0
HaPsantran

1 Answer

My guess is that df is a dataframe that is created from a previously existing dataframe. See the following:

df_old = pd.DataFrame(data=np.arange(15).reshape(5, 3), columns=['a', 'b', 'c'])
df = df_old[['a', 'c']]
df.loc[:, 'c'] = 0

This results in a SettingWithCopyWarning. The next code chunk does not result in a SettingWithCopyWarning:

df_old = pd.DataFrame(data=np.arange(15).reshape(5, 3), columns=['a', 'b', 'c'])
df = df_old[['a', 'c']].copy()
df.loc[:, 'c'] = 0

pandas is warning you that df was derived from another dataframe, so it is ambiguous whether the assignment touches the original or only a copy. You can use the .copy() method to be sure that you are creating a new dataframe and not modifying the original, or set df.is_copy = False as you did. I would disagree that your solution is a cop out, but I would prefer using .copy() when creating the new dataframe, as it is less prone to errors down the road.
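For the groupby loop in the question, the same idea might look like the sketch below. The contents of bigger_df, the "key" grouping column, and the results list are made up for illustration; only the explicit .copy() on each group is the point.

```python
import pandas as pd

# Hypothetical stand-in for the question's bigger_df
bigger_df = pd.DataFrame({
    "key": ["x", "x", "y"],
    "some_col_name": [1, 2, 3],
})

results = []
for some_ix, group in bigger_df.groupby("key"):
    # Take an explicit copy so pandas knows this frame is independent
    group = group.copy()
    group.loc[:, "some_col_name"] = 0
    results.append(group)
```

Each group is now an independent dataframe, so the assignment is unambiguous and bigger_df is left untouched.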

johnchase
  • Thanks, @johnchase. The df is created by a groupby iterator / generator. (I can never remember which it is or if the second is a subset of the first.) I'll revise the code to show this. I would think pandas would make each resulting df know it's a copy already. – HaPsantran Jan 26 '16 at 16:11