I have thought of a regression technique that I want to try on several datasets. I would like these datasets to have the following properties:

  • Be a tabular dataset (no images).
  • Have at least 20k rows, and ideally around 100k.
  • Have some categorical variables with many levels (at least one variable with 100 or more levels).
  • Ideally, the target should have long tails.
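
As a rough sketch, a candidate dataset could be screened against these criteria along the following lines. The function name `screen_dataset`, the layout of the data as a dict of column lists, the treatment of string-valued columns as categorical, and the use of excess kurtosis as a proxy for "long tails" are all assumptions made for illustration, not part of the question itself.

```python
from statistics import mean


def excess_kurtosis(values):
    """Population excess kurtosis: 0 for a normal distribution,
    larger for heavier tails. Assumes the values are not all identical."""
    n = len(values)
    mu = mean(values)
    m2 = sum((v - mu) ** 2 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0


def screen_dataset(columns, target, min_rows=20_000, min_levels=100):
    """Check a table against the criteria listed above.

    columns: dict mapping column name -> list of values (equal lengths).
    target:  name of the numeric target column.
    """
    n_rows = len(columns[target])
    # Assumption of this sketch: a column is categorical if all values are strings.
    level_counts = {
        name: len(set(vals))
        for name, vals in columns.items()
        if name != target and all(isinstance(v, str) for v in vals)
    }
    return {
        "n_rows": n_rows,
        "enough_rows": n_rows >= min_rows,
        # Categorical columns with at least `min_levels` distinct levels.
        "high_cardinality": {
            name: count for name, count in level_counts.items() if count >= min_levels
        },
        # Heuristic only: a large positive value suggests a long-tailed target.
        "target_excess_kurtosis": excess_kurtosis(columns[target]),
    }
```

The thresholds default to the numbers in the list (20k rows, 100 levels); what counts as a "long enough" tail is left to the reader, since the question does not fix a cutoff.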

Does anyone know of any public datasets with these properties? I have found that the Stack Overflow Developer Survey works for me, but I'd like to have a few more datasets with this structure.

David Masip
