Details of every item scanned, file or directory
|fileid (primary key)||files table row number, a unique id for each item|
|name||item name e.g. 'Hei.ttf'|
|bytes||item size in bytes e.g. 7502752|
|depth||how far scan recursed to find the item, starts at 0|
|accessed||datetime item was accessed|
|modified||datetime item was modified|
|basename||item name without path or extension, e.g. 'Hei'|
|extension||item extension including the dot, e.g. '.ttf'|
|type||item type, 'f' for file or 'd' for directory|
|mode||further type info and permissions, e.g. 'drwxr-xr-x'|
|parentpath||absolute path of directory containing the item, e.g. '/Library/Fonts/'|
|fullpath (unique)||parentpath of the item concatenated with its name, e.g. '/Library/Fonts/Hei.ttf'|
All directory paths end with a slash, e.g. /somedirectory/
File paths do not e.g. /somefile.txt
Just to remind you, to convert from bytes: KB = bytes/1e3, MB = bytes/1e6, GB = bytes/1e9, TB = bytes/1e12
A virtual table giving searchable access to current contents of files Crab has scanned, one row for each line of data.
|linenumber||line number within file, restarts count from 1 at the first line of each file|
|data||data content of the files, one entry for each line in the file|
fileslines also includes columns from the files table: fileid, name, bytes, depth, accessed, modified, basename, extension, type, mode, parentpath, and fullpath. This lets you restrict files to be searched, without needing to join tables.
A virtual table with one column, i, that takes the values 1,2,3,..,1000000000.
|i||1, 2, 3, ...|
On the first row i has the value 1, on the second row i has the value 2, and so on: 1, 2, 3, ...
By joining to the iterator table, and using i in the select clause, you turn each value of i into a separate row in your result set. Use it to turn columns in your data, into rows in your result set. E.g.
SELECT substr('The sun was shining',i,1)
WHERE i<=length('The sun was shining');
CRAB> select substr('The sun was shining',i,1) from iterator where i<=length('The sun was shining');
substr('The sun was shining',i,1)
If you do this for every line of text in a file, you can group and count the results to make a character frequency count
SELECT substr(data,i,1), count(*)
FROM fileslines JOIN iterator ON i <= length(data)
WHERE name = '25021.txt' AND length(data)>0
GROUP by substr(data,i,1) ORDER BY count(*) DESC LIMIT 10;
CRAB> select substr(data,i,1), count(*) from fileslines join iterator on i <= length(data) where name = '25021.txt' and length(data)>0 group by substr(data,i,1) order by count(*) desc limit 10;