Since so much of language consists of recurring groups of words, I’ve been thinking that studying frequently used phrases or word groupings would be very helpful. AntConc’s n-gram and cluster tools look like they could generate a useful list to study.
AntConc is an easy-to-use freeware concordance program for Windows (98/Me/2000/NT/XP), Macintosh OS X, and Linux. It was originally developed for use by students in the classroom, but also serves as a comprehensive text analysis tool kit for researchers. AntConc is written completely in the Perl 5.8 programming language using ActiveState’s excellent Komodo development environment and continues to be developed through feedback from users around the world.
AntConc contains the following tools:
- Concordance
- Concordance Plot
- File View
- Clusters
- N-Grams (part of Clusters)
- Collocates
- Word List
- Keyword List
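To make the idea of an n-gram list concrete, here is a minimal Python sketch of what such a tool does in principle: slide a window of n words across a text and count how often each run of words appears. This is not AntConc's code or its output format; the `ngram_counts` function, the crude tokenizer, and the sample sentences are all made up for illustration.

```python
import re
from collections import Counter

def ngram_counts(text, n=2):
    """Count every run of n consecutive words in text (case-insensitive)."""
    # Crude tokenizer (an assumption for this sketch): lowercase the text
    # and keep only runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    # Slide a window of n words across the token list.
    grams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return Counter(grams)

# Hypothetical sample text, standing in for a real corpus file.
sample = (
    "I would like to go. I would like to stay. "
    "Would you like to go with me?"
)

# Print the most frequent two-word phrases, most common first.
for phrase, count in ngram_counts(sample, n=2).most_common(3):
    print(count, phrase)
```

Changing `n` to 3 or 4 gives longer phrases, which is probably where the most useful study material is. Note that this sketch ignores punctuation, so some n-grams run across sentence boundaries; a real tool may handle that differently.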