Searching Through Files

Grep#

The obvious way to search through files is with grep.

$ grep "some string" *
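The command above only looks at files in the current directory. As a minimal sketch of a few commonly useful options, the following builds a throwaway directory and searches it. The directory and file names are made up for illustration, and `--include` is a GNU/BSD grep extension rather than POSIX.

```shell
# Build a small scratch tree to search (hypothetical files, for illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/src"
printf 'def foo():\n    return 1\n' > "$demo/src/a.py"
printf 'some string here\nSome String there\n' > "$demo/notes.txt"

# -r recurses into subdirectories, -n prints line numbers.
grep_recursive=$(grep -rn "some string" "$demo")

# -i ignores case, -c prints only the number of matching lines.
grep_count=$(grep -ic "some string" "$demo/notes.txt")

# -l lists matching file names only; --include limits the search to *.py files.
grep_files=$(grep -rl --include='*.py' "def foo" "$demo")

printf '%s\n' "$grep_recursive" "$grep_count" "$grep_files"
```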

Ack#

A better way is with ack, which searches recursively by default, skips version-control directories, and can filter by file type.

This example searches all Python files for the string "def foo".

$ ack --python "def foo"

Ripgrep#

Ripgrep (rg) is usually even faster than ack. It searches recursively and respects .gitignore rules by default.

$ rg some_string
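A few ripgrep options map directly onto the other tools on this page. The sketch below is hedged: it only runs the rg commands if an `rg` binary is on the PATH, and the scratch files are made up for illustration.

```shell
# Scratch files to search (hypothetical, for illustration only).
demo=$(mktemp -d)
printf 'def foo():\n    pass\n' > "$demo/a.py"
printf 'some string\n' > "$demo/notes.txt"
printf 'some string\n' | gzip > "$demo/old.log.gz"

if command -v rg >/dev/null 2>&1; then
    # -t py restricts the search to Python files, much like `ack --python`.
    rg -t py "def foo" "$demo"

    # -F treats the pattern as a literal string, -i ignores case.
    rg -F -i "SOME STRING" "$demo"

    # -z (--search-zip) also searches inside compressed files such as .gz.
    rg -z "some string" "$demo"
else
    echo "rg is not installed; skipping the ripgrep examples"
fi
```

With `-z`, ripgrep can stand in for zgrep when it is available.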

Zgrep#

If you want to grep through gzip-compressed files, it is faster to use zgrep than to extract the files first.

$ zgrep "some string" *gz
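zgrep behaves roughly like decompressing to a pipe and grepping the stream, so nothing is written back to disk. A minimal sketch with a made-up log file:

```shell
# Create and compress a small sample file (hypothetical name).
demo=$(mktemp -d)
printf 'first line\nsome string\nlast line\n' > "$demo/app.log"
gzip "$demo/app.log"        # leaves only app.log.gz behind

# Search the compressed file directly; -n adds line numbers, as with grep.
zgrep_out=$(zgrep -n "some string" "$demo/app.log.gz")

# Roughly the same thing by hand: decompress to stdout and pipe into grep.
pipe_out=$(gzip -dc "$demo/app.log.gz" | grep -n "some string")

printf '%s\n%s\n' "$zgrep_out" "$pipe_out"
```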

Optimizing Zgrep - Pigz#

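You can optimize zgrep by installing pigz, a drop-in parallel replacement for gzip. Decompression itself cannot be spread across cores, but pigz uses separate threads for reading, writing, and checksum calculation, which can still help.

Open the zgrep script (on most Linux systems it is a plain shell script), replace the calls to gzip with pigz, and you should see a noticeable improvement.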
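An edited system script is typically overwritten when the gzip package is updated. As an alternative sketch (not the method described above), a small shell function can pipe pigz output into grep directly. The name `pzgrep` is hypothetical, the function falls back to gzip when pigz is not installed, and it assumes a grep that supports `--label` (GNU grep does).

```shell
# pzgrep PATTERN FILE...  (hypothetical helper, not part of gzip or pigz)
pzgrep() {
    pattern=$1
    shift
    # Prefer pigz when available; otherwise fall back to plain gzip.
    if command -v pigz >/dev/null 2>&1; then
        unzipper=pigz
    else
        unzipper=gzip
    fi
    status=1
    for file in "$@"; do
        # -H with --label reports the real file name instead of "(standard input)".
        if "$unzipper" -dc -- "$file" | grep -H --label="$file" -e "$pattern"; then
            status=0
        fi
    done
    # Like grep: 0 if anything matched, 1 otherwise.
    return "$status"
}
```

Once the function is defined, `pzgrep "some string" *gz` is used the same way as the zgrep command above.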

Optimizing Zgrep - xargs#

If you have a lot of gz files (and a few cores), you can launch multiple instances of zgrep at once. This gives a large speedup, provided the gz files are roughly the same size.

In this example, xargs spawns at most four copies of zgrep at a time and passes at most one file to each zgrep for searching. On my machine, which has four hyper-threaded cores, this is optimal: each zgrep gets a real core and a hyper-threaded one (because it is using pigz).

$ find . -name "*gz" | xargs -P 4 -n 1 zgrep "some string"
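This pipeline breaks on file names that contain spaces, because xargs splits its input on whitespace by default. It also prints matches without file names, since each zgrep sees only one file. A minimal sketch of a more defensive variant follows, assuming find and xargs support `-print0` and `-0` (GNU and BSD versions do) and that zgrep accepts `-H`. The sample files are made up.

```shell
# Three small compressed files, one with a space in its name (hypothetical).
demo=$(mktemp -d)
printf 'some string\n' | gzip > "$demo/one.log.gz"
printf 'nothing here\n' | gzip > "$demo/two.log.gz"
printf 'some string again\n' | gzip > "$demo/with space.log.gz"

# -print0 / -0 pass file names NUL-separated, so spaces are safe.
# -P 4 runs up to four zgrep processes at once, -n 1 gives each a single file.
# -H makes zgrep print the file name even though it only sees one file.
# sort gives a stable order, since parallel output arrives in any order.
parallel_out=$(find "$demo" -name '*.gz' -print0 |
    xargs -0 -P 4 -n 1 zgrep -H "some string" | sort)

printf '%s\n' "$parallel_out"
```

The value passed to `-P` is worth tuning per machine, as the author's note about four hyper-threaded cores suggests.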

Find#

fd#

fd is faster than find. See https://github.com/sharkdp/fd
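For comparison, here is a hedged sketch of the same file search written with find and with fd. The fd command only runs if an `fd` binary is on the PATH (on Debian and Ubuntu the package installs it as `fdfind`), and the sample files are made up.

```shell
# Scratch tree with one .gz file and one unrelated file (hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/logs"
: > "$demo/logs/app.log.gz"
: > "$demo/readme.txt"

# Classic find: regular files whose name ends in .gz.
find_out=$(find "$demo" -type f -name '*.gz')
printf '%s\n' "$find_out"

if command -v fd >/dev/null 2>&1; then
    # -t f keeps regular files, -e gz filters by extension;
    # "." is a pattern that matches any name, "$demo" is the search root.
    fd -t f -e gz . "$demo"
fi
```

fd can also run a command per result in parallel with `-x`, for example `fd -e gz -x zgrep "some string"`, which covers much the same ground as the find and xargs pipeline above.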

