
I would like to build a variety of classification and regression decision trees. My use case focuses on extracting and communicating decision rules. My organisation previously used Weka for decision tree learning. What can Weka do that Python or scikit-learn can't?

I currently use pandas, numpy, scipy, scikit-learn, and other libraries for the majority of my workflow.

2 Answers


Weka's decision trees are from the Quinlan family, whereas sklearn uses CART.

The most notable difference is that Quinlan trees aren't restricted to binary splits: a split on a categorical column produces one subtree per level, whereas CART always splits a node in two.
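To make the binary-split point concrete on the sklearn side, here is a minimal sketch. The toy `colour` column and its labels are made up for illustration, and the one-hot encoding via `pd.get_dummies` is just one common way to feed a categorical column to sklearn. With three levels that each map to their own class, the fitted tree needs two binary splits, where a multiway split would need only one node with three branches.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Toy data (hypothetical): one categorical column with three levels,
# each level belonging to its own class.
df = pd.DataFrame({
    "colour": ["red", "green", "blue", "red", "green", "blue"],
    "label": [0, 1, 2, 0, 1, 2],
})

# sklearn trees need numeric input, so one-hot encode the column first.
X = pd.get_dummies(df[["colour"]], dtype=float)
y = df["label"]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# In the fitted tree structure, leaves have children_left == -1.
is_internal = clf.tree_.children_left != -1
n_internal = int(is_internal.sum())
n_leaves = int((~is_internal).sum())

print(f"internal (binary) split nodes: {n_internal}, leaves: {n_leaves}")
```

Each internal node here tests a single one-hot column against a threshold, so the three colours are separated one at a time.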

Another difference is how missing values are handled, but individual implementations vary, so the two families are not straightforward to compare on this point.

https://stackoverflow.com/q/9979461/10495893
TDIDT Decision Trees algorithm

Otherwise, I expect the main real difference is whether you find it easier to work in Python or Java. If you want to extract decision rules, you may be looking to post-process a decision tree; I know of skope-rules for doing this in Python, but not whether such a thing is easy in Weka.
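As a simpler starting point than skope-rules (and not a substitute for it), scikit-learn's built-in `export_text` prints a fitted tree as nested if/else conditions, which may already be enough for communicating rules. The sketch below uses the bundled iris data purely as a stand-in for your own dataset, and `max_depth=2` is an arbitrary choice to keep the output short.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Stand-in data: replace with your own feature matrix and target.
iris = load_iris()

# A shallow tree keeps the printed rules readable.
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# Render the tree as indented text, one condition per line.
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
```

Each root-to-leaf path in the printed output corresponds to one decision rule; the same function also works for `DecisionTreeRegressor`, where leaves show predicted values instead of classes.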

Ben Reiniger

I found surprising behavior while doing frequent pattern mining in Weka and in Python (Colab). Weka can handle the entire Supermarket dataset (approx. 4600 x 211), whereas in Python the process crashes because it runs out of RAM. I am not an expert in either, so maybe I am doing something wrong. Still, I cut the data file by more than half and got the same result. My conclusion so far is that Weka is more focused on the fewer things it does, while Python perhaps focuses less on implementations of frequent pattern mining algorithms such as Apriori.
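For reference, here is a minimal pure-Python sketch of level-wise Apriori, so it is clear what the algorithm has to do. It is not the library I used in Colab and I am not claiming it avoids the crash I saw; the function name `apriori`, the `min_support` parameter, and the toy baskets are all made up for illustration. It stores transactions as sets and only counts candidates whose subsets are already frequent.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(list(current), 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
    return frequent


# Toy baskets (hypothetical data).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = apriori(baskets, min_support=0.5)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

The number of candidates grows quickly as `min_support` is lowered, so the support threshold matters a lot for memory use on a dataset with a couple of hundred item columns.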