Is there an encoder which can automatically detect the intrinsic order of an ordinal variable and assign values accordingly?

Question

Given data with an ordinal variable, say "house quality" with values ex (excellent), gd (good), fa (fair) and bd (bad), we obviously cannot just throw the data into sklearn's LabelEncoder, because the resulting labels can be in the wrong order: LabelEncoder sorts the classes alphabetically, giving {bd: 0, ex: 1, fa: 2, gd: 3}. Instead, we need to specify the order manually, right? However, if we do not have domain knowledge, how can we specify the order? A manual approach is also prone to error. So I am curious whether there is any encoder that can automatically detect the correct order of an ordinal variable.
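To make the problem concrete, here is a minimal sketch in plain Python that contrasts alphabetical codes (the order LabelEncoder uses, since it sorts its classes) with a hand-written ranking. The sample list and the names `alphabetical`, `manual_order` and `manual` are made up for illustration.

```python
# Illustrative sample of the "house quality" labels from the question.
qualities = ["gd", "ex", "bd", "fa", "gd"]

# Codes assigned by sorting the distinct labels alphabetically,
# which mirrors how LabelEncoder numbers its classes.
alphabetical = {label: code for code, label in enumerate(sorted(set(qualities)))}

# Hand-written ranking from worst to best: the manual step
# that requires domain knowledge.
manual_order = ["bd", "fa", "gd", "ex"]
manual = {label: code for code, label in enumerate(manual_order)}

# Encode the sample with the manual ranking.
encoded = [manual[q] for q in qualities]

print(alphabetical)
print(encoded)
```

The alphabetical mapping places ex between bd and fa, which is exactly the kind of scrambled order the question is worried about.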

score 1 · Answer 1 · answered Jul 16 '20 at 07:09

Yes, target encoding should do it. By computing the mean of the target for each category, you order the categories by their relationship to the target, and I can't think of a better way to order them. Of course, this only works in a supervised learning setting. Without a target, I can't think of a way to order categories automatically.
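As a rough sketch of the idea, the snippet below computes the target mean per category in plain Python and ranks the categories by it. The toy rows (house quality paired with a sale price in thousands) and all variable names are hypothetical, invented only to illustrate the technique.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (house quality, sale price in thousands).
rows = [
    ("ex", 400), ("gd", 300), ("fa", 200), ("bd", 100),
    ("ex", 420), ("gd", 280), ("fa", 220), ("bd", 120),
]

# Collect the target values observed for each category.
targets_by_category = defaultdict(list)
for quality, price in rows:
    targets_by_category[quality].append(price)

# Target encoding: replace each category by the mean of its target values.
target_mean = {q: mean(prices) for q, prices in targets_by_category.items()}

# Sorting the categories by their target mean yields an inferred order,
# which can then be turned into integer ranks.
inferred_order = sorted(target_mean, key=target_mean.get)
rank = {q: i for i, q in enumerate(inferred_order)}

print(target_mean)
print(inferred_order)
```

On real data the means should be computed on the training split only (or out of fold), otherwise the encoding leaks target information into the features. The inferred order also reflects only the relationship to this particular target, which may differ from the intended semantic order.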

See this question and this blogpost to dig deeper in target encoding.
