| Commit message | Author | Age | Files | Lines |
Storing this in SQLite is slow and grows the cache to a huge size. Reading
the file lists instead may be a bit slower, but it saves a lot of space and
overhead and removes quite some complexity.
This changes the information in the metadata dict to include the file
type in a separate field and limits the mode information to standard
permissions only.
Upon reading file lists from the repository, the old format is
automatically converted. This changes the local cache file to ensure all
information is re-read (the previous commit also already required this).
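A minimal sketch of how st_mode can be split this way with Python's stat module; the field names below are illustrative, not necessarily the tool's actual keys:

```python
import stat

def split_mode(st_mode):
    """Split a raw st_mode into a file-type character and the
    permission bits, mirroring a type/mode metadata layout."""
    filetype = stat.filemode(st_mode)[0]   # e.g. '-', 'd', 'l'
    perms = stat.S_IMODE(st_mode)          # permission bits only
    return {"type": filetype, "mode": perms}

# A regular file with mode 0o644:
meta = split_mode(0o100644)
# meta == {"type": "-", "mode": 0o644}
```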
Use fetchall() on a cursor because SQLite cannot handle partial reads
from a cursor while the database is being modified through another
cursor. This does use more memory during the backup.
Set the synchronous and journal_mode SQLite pragmas for better
performance at the cost of safety. Since this is a cache, the data can
be reconstructed from the repository if needed.
This also uses the connection as a context manager instead of calling
commit manually, rearranges some of the transactions for better
performance, and includes a few consistency improvements.
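The combination of pragmas, context-manager transactions, and fetchall() can be sketched like this with Python's sqlite3 module; an in-memory database and the table name stand in for the actual cache file and schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")   # stand-in for the on-disk cache file

# Trade durability for speed; since this is a cache, it can be
# rebuilt from the repository after a crash.
con.execute("PRAGMA synchronous = OFF")
con.execute("PRAGMA journal_mode = MEMORY")

con.execute("CREATE TABLE files (path TEXT, mtime INTEGER)")

# Using the connection as a context manager commits on success and
# rolls back on an exception, replacing manual commit() calls.
with con:
    con.execute("INSERT INTO files VALUES (?, ?)", ("/etc/hosts", 0))

# fetchall() materialises the whole result set, so another cursor can
# modify the database while the rows are processed.
rows = con.execute("SELECT path FROM files").fetchall()
```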
The names now better reflect the purpose and contents of the table.
This instructs the crawler to skip certain patterns during the backup.
It supports * to match any part of a file name, ** to also match /, a
trailing / to match only directories, and a leading / to match against
the full path.
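A possible translation of these rules into a regular expression. This is a sketch under assumed conventions (unanchored patterns match at any depth, directory paths carry a trailing /), not the tool's actual matcher:

```python
import re

def pattern_to_regex(pattern):
    """Hypothetical translation of the exclude-pattern rules above
    into a compiled regex."""
    dir_only = pattern.endswith("/")      # trailing /: directories only
    anchored = pattern.startswith("/")    # leading /: match the full path
    pattern = pattern.strip("/")
    parts, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")            # ** may also cross /
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")         # * stays within one name
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    prefix = "^/" if anchored else "^(?:.*/)?"
    # Assumed convention: directory paths are passed with a trailing /.
    suffix = "/$" if dir_only else "$"
    return re.compile(prefix + "".join(parts) + suffix)
```

For example, `pattern_to_regex("*.pyc")` matches `/src/mod.pyc` at any depth, while `pattern_to_regex("/tmp")` matches only the top-level `/tmp`.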
This reads the snapshot file list, filters it, and formats the output
like ls.
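The ls-style rendering can be sketched with stat.filemode(); the exact column layout below is an illustrative guess, not the command's verified output format:

```python
import stat
import time

def format_entry(mode, size, mtime, name):
    """Render one snapshot entry roughly like a line of `ls -l`."""
    return "%s %8d %s %s" % (
        stat.filemode(mode),                              # e.g. -rw-r--r--
        size,
        time.strftime("%Y-%m-%d %H:%M", time.gmtime(mtime)),
        name,
    )

line = format_entry(0o100644, 1234, 0, "hosts")
# '-rw-r--r--     1234 1970-01-01 00:00 hosts'
```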
Create indexes in the tables to improve queries (some after crawling,
which is a minor improvement), and use explicit transactions to improve
performance (a small improvement). Also, move temporary table creation
into the functions where the tables are used (instead of creating them
globally).
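The bulk-load pattern can be sketched like this; the table and index names are illustrative:

```python
import sqlite3

# Autocommit mode (isolation_level=None) lets us control transactions
# explicitly with BEGIN/COMMIT.
con = sqlite3.connect(":memory:", isolation_level=None)
con.execute("CREATE TABLE files (path TEXT, size INTEGER)")

# One explicit transaction around the bulk load instead of an implicit
# commit per statement.
con.execute("BEGIN")
con.executemany("INSERT INTO files VALUES (?, ?)",
                [("/a", 1), ("/b", 2), ("/c", 3)])
con.execute("COMMIT")

# Building the index after the rows exist is cheaper than maintaining
# it during every insert.
con.execute("CREATE INDEX idx_files_path ON files (path)")

count = con.execute("SELECT COUNT(*) FROM files").fetchone()[0]
```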
The crawler now changes into the directories that are crawled and uses
stat() on relative paths instead of using absolute paths for all
operations. This brings about a 10% reduction in crawling time.
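A minimal sketch of the relative-path approach, reduced to a non-recursive listing for brevity; the real crawler also handles recursion and errors:

```python
import os

def list_dir_relative(directory):
    """Change into the directory and stat() plain names: the kernel
    then resolves one path component instead of the whole absolute
    path on every call."""
    prev = os.getcwd()
    sizes = {}
    try:
        os.chdir(directory)
        for name in os.listdir("."):
            sizes[name] = os.lstat(name).st_size   # relative-path stat
    finally:
        os.chdir(prev)    # always restore the working directory
    return sizes
```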
This currently ignores files with filenames that have an unknown
encoding. This is far from ideal though.
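One way such names can be detected in Python, assuming POSIX surrogateescape decoding of file names; this is a sketch, not necessarily the tool's actual check:

```python
def undecodable(name):
    """Return True if `name` contains lone surrogates, i.e. bytes that
    could not be decoded with the filesystem encoding. On POSIX,
    Python decodes file names with the surrogateescape error handler,
    and strict re-encoding exposes the escaped bytes."""
    try:
        name.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False
```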
This replaces a call to os.walk() with one to os.listdir() to avoid
calling stat() twice on each file and directory encountered.
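A sketch of the single-stat() scan this enables; function and variable names are illustrative:

```python
import os
import stat

def scan(path):
    """One lstat() per entry: the result both classifies the entry
    (directory vs. file) and supplies its metadata, instead of the
    walk stat()ing once to classify and the caller stat()ing again."""
    dirs, files = [], {}
    for name in os.listdir(path):
        full = os.path.join(path, name)
        st = os.lstat(full)
        if stat.S_ISDIR(st.st_mode):
            dirs.append(full)     # to be crawled later
        else:
            files[full] = st.st_size
    return dirs, files
```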