Find large files in a given directory

Jul 1, 2016

The Linux command du displays disk usage statistics. By default it reports the size of every subdirectory (recursively) under the current directory; individual files are listed only when the -a option is given. For example:

> du -h
4.0K    ./node_modules/.bin
256K    ./node_modules/nlp_compromise/client_side/basic_demo
456K    ./node_modules/nlp_compromise/client_side/cute_demo/libs
476K    ./node_modules/nlp_compromise/client_side/cute_demo
400K    ./node_modules/nlp_compromise/client_side/long_demo/libs
512K    ./node_modules/nlp_compromise/client_side/long_demo
 68K    ./node_modules/nlp_compromise/client_side/unit_test
1.6M    ./node_modules/nlp_compromise/client_side
 68K    ./node_modules/nlp_compromise/src/data/lexicon
 96K    ./node_modules/nlp_compromise/src/data
...

The -h option displays the sizes in human-readable format. Now use grep to filter the output and keep only the large entries. Of course, the definition of large is subjective. I'll treat entries of 1 MB or more as large.

> du -h | grep '^\s*[0-9.]*[MGTP]'
1.6M    ./node_modules/nlp_compromise/client_side
...
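
The du output above lists directories, so to see the individual large files themselves, find can be combined with du. The following is only a minimal sketch: large_files is a hypothetical helper name, not something from du or find, and it assumes that find accepts size suffixes such as 1M and that sort supports -h (true for the GNU and recent BSD/macOS tools, but not required by POSIX). Note that find compares the file's length, while du prints the space used on disk, so the two numbers can differ slightly.

```shell
#!/bin/sh
# Hypothetical helper (not part of du or find): print regular files larger
# than a threshold under a directory, biggest first.
# Assumes find understands suffixes like "1M" and sort understands -h.
large_files() {
    dir="${1:-.}"     # directory to search; defaults to the current one
    min="${2:-1M}"    # threshold handed to find's -size test
    # find selects the files, du -h prints a human-readable size for each,
    # and sort -rh orders the lines from largest to smallest.
    find "$dir" -type f -size "+$min" -exec du -h {} + | sort -rh
}

# Example call: large_files ./node_modules 1M
```

Called without arguments, it searches the current directory for files over 1 MB; the second argument changes the threshold, for example large_files . 100M.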