Use exec() to run OS commands or other programs on files returned in query results, using their fullpath.

On macOS, unlike Windows, exec() does not generally run in a shell because file management commands such as 'mv', 'cp', and 'del' are available without it. See "Tips" at the bottom of this page for details.

Run 'command' with option '-opt1' on multiple files

SELECT exec('command',  '-opt1',   fullpath)
FROM files
WHERE type = 'f'
and ...

  • The first argument to exec() should be a string specifying the command/program to run. Each argument or option for the command should be a separate argument to exec(), generally one of these will be the fullpath of the file you want to operate on.

  • Progams you invoke should be on the PATH, or in the directory where you launched Crab, or you must provide the full path for the program. Use the macOS 'open' command to run GUI applications.

  • Use the WHERE clause to specify which objects to run 'command' on.
    No WHERE clause means run it on every object scanned

  • To run only on direct children of 'SourceDirectory', i.e. not recursive

  • WHERE parentpath = '/somepath/SourceDirectory/'

    To run on all children of 'SourceDirectory,' i.e. recursive

    WHERE parentpath like '/somepath/SourceDirectory/%'

  • You should test the logic of queries that change the filesystem with an 'echo' command before use

  • Don't use exec() with commands that give run-time prompts, exec() doesn't display them.

  • exec() runs sychronously, meaning exec() waits for each command to complete before starting the next.

    What it means for a command to complete varies from one program to another, e.g. if you open an .mp3 file in iTunes, the .mp3 file will start playing immediately and the command will complete immediately. So if you play 100 mp3's in quick succession you'll only hear the last one. Similarly, if you open files in Finder, the command is complete when the Finder window has opened, not when you have finished interacting with the window. If you use exec() to open 100 directories in Finder, you'll get 100 open windows.

Run 'command' on every file in directory 'MyProject'

SELECT exec('command', fullpath)
FROM files
WHERE type = 'f'
and parentpath = '/somepath/MyProject/';

This is the generic template for running a command on every file in the 'MyProject' directory.

The example is non recursive because we're only picking up files where the parentpath is
equal to 'MyProject'.

E.g. Recursively delete all .pyc files from 'MyProject' directory

Here the 'rm'
command is used with its -f option. This option forces deletion of readonly files without prompting for confirmation.

WARNING: If you ran this query without a WHERE clause it would delete every file scanned

SELECT exec('rm', '-f', fullpath)
FROM files
WHERE parentpath like '/somepath/MyProject/%'
and extension = '.pyc';

This example is recursive because we're using a wildcard pattern to filter parentpath, which is satisfied by objects in the 'MyProject' directory tree at any depth.

E.g. Open in Finder the directory that has the most Excel spreadsheets

SELECT exec('open', parentpath), count(*)
FROM files
WHERE extension like '.xls%'
GROUP BY parentpath ORDER BY count(*) DESC LIMIT 1;

This query counts files which have an extension starting '.xls' (so it matches '.xls' and '.xlsx' files), and groups the count by parent directory. Then it sorts the parent directories by this count and returns just one row - the parent directory with the most Excel files. Finally the query runs the 'open' command on this directory, which opens it in Finder.

E.g. Play a random Jack Johnson song

SELECT exec('open', fullpath) FROM files
WHERE fullpath like '%jack%' and fullpath like '%jo%son%' and extension = '.mp3'
ORDER BY random() LIMIT 1;

This query returns .mp3 files which have 'jack' and 'johnson' or 'jonson' somewhere in the filename or the directory name. The results are sorted randomly and the first row is returned. The call to exec() invokes the macOS 'open' command to open the .mp3 file in its default application, likely iTunes.

Run 'command' 1000 times

SELECT exec('command')
FROM iterator
WHERE i <= 1000;

This is the generic template for running a command multiple times.

In this example, 'command' will run in a Bash shell because only a single argument has been given to

iterator is a virtual table with one field i which goes from 1 to 1000000000

Unzip files

macOS includes the 'unzip' utility which supports unpacking .zip files from the command line.

Unzip files to directory 'target Dir'

SELECT exec('unzip',  '-n',  fullpath,  '-d',  '/somepath/target Dir/')
FROM files
WHERE extension = '.zip'
and ...

  • Use 'unzip' with either -o or -n, so as to suppress run-time prompts: exec() doesn't display them.

    • Use the '-o' option to always overwrite when there is a name collision

    • Use the '-n' option to never overwrite when there is a name collision

  • Option '-d' specifies the output directory, which must already exist.

  • Specify which files to unzip using the WHERE clause.
    No WHERE clause means unzip every object scanned

E.g. Recursively unzip all files from 'Source Dir' to 'Target Dir'

Here the
-n option tells unzip to never overwrite files, and the -d option specifies a target directory. 'Target Dir' must exist before the query is run.

SELECT exec('unzip', '-n', fullpath, '-d', '/Users/johnsmith/Target Dir/')
FROM files
WHERE parentpath like '/Users/johnsmith/Source
and extension = '.zip';

Tip: Test exec() queries by running them with the echo command

Prior to running a query that will modify the filesystem you should prefix your
exec() function call with an echo command to check exactly which commands will be run on which files. E.g.

SELECT exec('echo', 'unzip', '-n', fullpath,'-d','/Users/johnsmith/Target Dir/')
FROM files
WHERE parentpath like '/Users/johnsmith/Source
and extension = '.zip';

Only when you're happy that the right files are being processed should you remove the 'echo':

SELECT exec('unzip', '-n', fullpath,'-d','/Users/johnsmith/Target Dir/')
FROM files
WHERE parentpath like '/Users/johnsmith/Source
and extension = '.zip';

Tip: Suppress exec() echo to screen

By default
exec() writes every command that is run to your Terminal window, together with its output. If you are running hundreds of thousands or millions of commands this will be slow.

You can resolve this by throwing away the output, use the following command before running the query:

%output /dev/null

Any error messages will still go to the Terminal window, as will subsequent CRAB> prompts

To get output back do this:

%output stdout

If you want to improve query speed and keep a log of the output, use a filename in place of /dev/null, e.g.

%output '/somepath/MyOutputLog.txt'

Tip: Check if files are still present with pathexists()

Crab doesn't track files that are moved or deleted after a scan. To check that a file or directory is still present at query run time use the
pathexists() function.

E.g. to unzip a bunch of files when some have been moved or deleted since the last scan:

SELECT exec('unzip', '-n', fullpath,'-d','/Users/johnsmith/Target Dir/')
FROM files
WHERE parentpath like '/Users/johnsmith/Source
and extension = '.zip'
and pathexists(fullpath);

Tip: Use shell features with exec()

If you need to use pipes or I/O redirection on macOS, use exec() with only one argument. In this case you will have to concatenate together the command, its options, its arguments and everything else into one string. You will also need to embed single quotes around names and paths to avoid problems with any that contain spaces, e.g.

SELECT exec("ls '" || fullpath || "' >> DirMyFiles.txt")
FROM files;

To make this a little easier to read, Crab has a function called q() which puts single quotes around its argument.

SELECT exec('ls ' || q(fullpath) || ' >> DirMyFiles.txt')
FROM files;

© 2017 Etia UK