I am working on an app to help people learn English as a second language. I have validated that sentences help in learning a language by providing extra context. I did that by conducting a small study in a classroom of 60 students.
I have mined over a hundred thousand sentences from Wikipedia for various English words (including Barron's 800 words and the 1000 most common English words).
The entire dataset is available at https://buildmyvocab.in
To maintain the quality of the content, I filtered out sentences longer than 160 characters, since they might be difficult to understand.
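For reference, the length filter amounts to something like the sketch below. The function name `keep_short` and the constant `MAX_CHARS` are illustrative only, and I am assuming length is measured after trimming surrounding whitespace.

```python
MAX_CHARS = 160  # cutoff mentioned above


def keep_short(sentences, max_chars=MAX_CHARS):
    """Return only the sentences that are at most max_chars characters long.

    Assumption: leading/trailing whitespace does not count toward the length.
    """
    return [s for s in sentences if len(s.strip()) <= max_chars]
```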
As a next step, I want to automate sorting this content by ease of understanding. I am a non-native English speaker myself, and I want to know which features I can use to separate easy sentences from difficult ones.
Also, do you think this is possible?
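To make the question concrete, here is a rough sketch of the kind of surface features I have in mind: word count, average word length, and an approximate syllable count, combined into a Flesch-Kincaid-style grade score applied to a single sentence. All function names are hypothetical, the syllable counter is a crude vowel-group heuristic rather than a dictionary lookup, and I have not validated that this score matches how learners actually perceive difficulty.

```python
import re

WORD_RE = re.compile(r"[A-Za-z']+")
VOWEL_GROUP_RE = re.compile(r"[aeiouy]+")


def count_syllables(word):
    # Crude heuristic (assumption): one syllable per group of consecutive
    # vowels, with a minimum of one syllable per word.
    return max(1, len(VOWEL_GROUP_RE.findall(word.lower())))


def difficulty_features(sentence):
    """Compute simple surface features for one sentence."""
    words = WORD_RE.findall(sentence)
    n = len(words) or 1  # avoid division by zero on empty input
    syllables = [count_syllables(w) for w in words]
    return {
        "num_words": len(words),
        "avg_word_len": sum(len(w) for w in words) / n,
        "avg_syllables": sum(syllables) / n,
        # Share of words with three or more (estimated) syllables.
        "long_word_ratio": sum(1 for s in syllables if s >= 3) / n,
    }


def difficulty_score(sentence):
    # Flesch-Kincaid grade level, treating the sentence as the whole text:
    # 0.39 * (words per sentence) + 11.8 * (syllables per word) - 15.59
    f = difficulty_features(sentence)
    return 0.39 * f["num_words"] + 11.8 * f["avg_syllables"] - 15.59


def sort_by_ease(sentences):
    """Return the sentences ordered from easiest to hardest by the score."""
    return sorted(sentences, key=difficulty_score)
```

Calling `sort_by_ease(list_of_sentences)` would put short sentences made of short words first. Features such as word frequency (for example, whether each word is in the 1000 most common words) or grammatical complexity are not covered by this sketch.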