|Which files are using most disk space?|
|Find the ten biggest files|
|Find duplicate files bigger than 100MB|
|Find which filetypes are using most space|
|Which directories are using most disk space?|
|Find biggest directories by total file size|
|Find likely-duplicate directories|
|Delete files manually|
|Delete directory trees manually|
|Delete using exec()|
|Delete multiple files using exec()|
|E.g. Recursively delete all files beneath the 'MyProject' directory >100 MB|
|Irreversibly delete multiple directories using exec()|
|E.g. Recursively delete all empty directories beneath the 'MyProject' directory|
First scan your whole drive. This might take up to half an hour, depending on the drive's size and type. The following command will do that, saving the scan data to a file called wholedisk.crdb.
$ crab -db wholedisk.crdb /
Next time you start Crab, you can query this same data without scanning again; use the same -db option, and no scan path, e.g.
$ crab -db wholedisk.crdb
Which files are using most disk space?
SELECT fullpath, bytes/1E9 as GB FROM files
ORDER by bytes DESC LIMIT 10;
SELECT extension, sum(bytes)/1e9 as GB, max(bytes)/1e9, fullpath FROM files
GROUP BY extension
ORDER BY GB DESC;
Which directories are using most disk space?
This query finds candidate duplicate directories by looking for directories that contain the same total file size, and same number of files.
SELECT p1.pp, p2.pp, p1.size/1e9 FROM
(SELECT parentpath as pp, sum(bytes) ||':'|| count(*) as sig, sum(bytes) as size FROM files
GROUP BY parentpath ) AS p1
JOIN
(SELECT parentpath as pp, sum(bytes) ||':'|| count(*) as sig FROM files
GROUP BY parentpath ) AS p2
ON p1.sig = p2.sig and p1.pp < p2.pp
ORDER BY p1.size DESC;
You can run shell commands to delete or move objects, create directories, etc. without leaving Crab. Just put an exclamation mark at the start of the line and Crab will send the rest of the line to the shell.
You can use the 'rm' command together with the fullpath of a file you want to delete, but the path and filename must be put in quotes. Without the quotes you'll have problems with paths that contain spaces.
To save typing, you can copy a fullpath from your query results by highlighting it with the mouse and pressing Cmd+C, then paste it with Cmd+V.
Be careful: rm deletes files immediately, without putting them in the Trash. There is no Undo.
CRAB> !rm "/Users/johnsmith/Assorted Files/File_which_I_am_100_percent_sure_I_dont_need"
Remember that files you delete won't be removed from query results until you scan again.
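The effect of the quotes can be tried safely outside Crab. The following is a minimal sketch, assuming a POSIX sh or bash shell; the scratch directory and the file name old.log are made up for illustration, and nothing outside the scratch directory is touched.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
scratch=$(mktemp -d)
mkdir "$scratch/Assorted Files"
touch "$scratch/Assorted Files/old.log"
target="$scratch/Assorted Files/old.log"

# Unquoted: the shell splits the path at the space, so rm receives two
# arguments that name nothing, reports errors, and the file survives.
if rm $target 2>/dev/null; then unquoted="deleted"; else unquoted="failed"; fi
echo "unquoted rm: $unquoted"

# Quoted: rm receives the whole path as one argument and deletes the file.
if rm "$target"; then quoted="deleted"; else quoted="failed"; fi
echo "quoted rm: $quoted"

rm -r "$scratch"   # clean up the scratch directory
```

In this harmless case the unquoted command simply fails. If the two halves of the split path happened to name real files, rm would delete those instead.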
You absolutely must put quotes around the fullpath of the directory you want to delete, or a typing mistake could delete everything on your filesystem.
This example deletes the 'MyProject' directory, and every file and directory inside it.
CRAB> !rm -r "/Users/johnsmith/MyProject/"
The exclamation mark tells Crab to send the whole line to the shell, and the -r option tells the 'rm' command to delete recursively.
With only the -r option, rm asks for confirmation before deleting each file that doesn't have appropriate permissions. This can be tedious when deleting a bunch of files that each need your approval.
There is a more dangerous form of the command that deletes files irrespective of their permissions, without asking you for confirmation. This uses the -f option in addition to the -r option.
CRAB> !rm -rf "/Users/johnsmith/MyProject/"
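To see why the quotes matter so much for recursive deletes, it helps to count the paths the shell would hand to rm when a stray space slips into the command. This sketch never calls rm; count_paths is a hypothetical helper that only reports how many arguments it received.

```shell
#!/bin/sh
# Count the path arguments a command line would hand to rm,
# without running rm at all.
count_paths() { echo "$#"; }

# A stray space after "johnsmith/" with no quotes: two separate paths,
# and the first one is the entire home directory.
unquoted=$(count_paths /Users/johnsmith/ MyProject/)

# The same typo inside quotes: one path that doesn't exist,
# so rm would fail harmlessly.
quoted=$(count_paths "/Users/johnsmith/ MyProject/")

echo "unquoted: $unquoted paths, quoted: $quoted path"
```

With the quotes in place, a typing mistake most likely produces a path that does not exist, so the command fails instead of deleting the wrong tree.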
Crab's exec() function runs OS commands on files you specify. If you know what you're doing and don't want to copy files before deleting them, use the 'rm' command with its -f option. This causes 'rm' to delete files without confirmation, whatever their permissions. Test that the query logic is correct before using it for deletion.
WARNING: If a delete query has no WHERE clause it will delete every file that was scanned.
SELECT exec('rm', '-f', fullpath) FROM files
WHERE fullpath LIKE '/Users/johnsmith/MyProject/%' and type = 'f' and bytes>100e6;
Use the 'rmdir' command to delete empty directories.
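rmdir is a comparatively safe choice here because it refuses to remove a directory that still contains anything. The sketch below shows that behaviour in a scratch directory, assuming a POSIX shell; the directory names empty and full and the file keep.txt are made up for illustration.

```shell
#!/bin/sh
# Build a throwaway tree with one empty and one non-empty directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/MyProject/empty" "$scratch/MyProject/full"
touch "$scratch/MyProject/full/keep.txt"

# rmdir removes a directory only if it is empty.
if rmdir "$scratch/MyProject/empty"; then empty_result="removed"; else empty_result="kept"; fi

# A directory that still holds a file is refused and left intact.
if rmdir "$scratch/MyProject/full" 2>/dev/null; then full_result="removed"; else full_result="kept"; fi

echo "empty: $empty_result, full: $full_result"

rm -r "$scratch"   # clean up the scratch directory
```

So if a query accidentally selects directories that are not empty, rmdir reports an error for each of them and leaves their contents alone.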
Tip: Suppress exec() echo to screen
By default exec() outputs every command executed to the screen. If you are running an exec() function hundreds of thousands or millions of times this will be slow, and the screen will be a mess.
To discard the output use the following command before running the query:
Any error messages will still go to the Terminal window, as will subsequent CRAB> prompts.
To switch output back to the screen do this: