Arthur de Jong

Open Source / Free Software developer

Commit message | Author | Age | Files | Lines
* Improve performance of metadata reading (HEAD, master) | Arthur de Jong | 2015-07-03 | 1 | -42/+22
  This inlines the clean_meta() function and reads the whole JSON file into memory to greatly reduce the number of function calls performed when reading the files list. This is especially noticeable when reading the backup files lists. This does mean that more memory is used when reading large files.
* Print some statistics at end of fsck run | Arthur de Jong | 2015-07-03 | 1 | -0/+7
* Log failed command line | Arthur de Jong | 2015-07-03 | 1 | -2/+4
* Fix logic error in handling --no-extractlists | Arthur de Jong | 2015-07-03 | 1 | -1/+1
* Do not cache full backup contents | Arthur de Jong | 2015-06-30 | 4 | -123/+36
  Storing this in SQLite is slow and grows the cache to a huge size. The approach of reading these files lists may be a bit slower but saves a lot of space and overhead and removes quite a bit of complexity.
* Name unknown file or directory | Arthur de Jong | 2015-06-30 | 1 | -6/+6
  When fsck finds an unknown file, it could also be a directory.
* Refactor out check_archive() function | Arthur de Jong | 2015-06-26 | 1 | -38/+42
* Explicitly specify filename encoding | Arthur de Jong | 2015-06-25 | 1 | -1/+2
* Change metadata information | Arthur de Jong | 2015-06-25 | 6 | -76/+126
  This changes the information in the metadata dict to include the file type in a separate field and limits the mode information to standard permissions only. Upon reading files lists from the repository, the old format is automatically converted. This changes the local cache file to ensure all information is re-read (the previous commit also already required this).
* Ensure file lists are written in archive order | Arthur de Jong | 2015-06-21 | 2 | -2/+4
  This is apparently needed because at least GNU tar expects the extractlists to be in the same order as files in the archive itself.
* Split out backend functions from repo | Arthur de Jong | 2015-06-20 | 5 | -85/+127
  This creates a FileBackend class (from parts of the FileRepository class, which is now renamed to Repository) that implements a simple API that can also be provided by other backends.
* Implement --no-extractlists | Arthur de Jong | 2015-06-01 | 2 | -6/+10
  This option ensures that a full restore does not require extractlists. Note that this is less efficient than with extractlists because existing archives need to be discarded earlier.
* Fix handling of --block-size | Arthur de Jong | 2015-05-30 | 1 | -0/+2
  This handles --block-size values that don't end in a suffix (fixes 4a2c63c).
* Add a --clean option to fsck | Arthur de Jong | 2015-05-17 | 2 | -13/+22
  The option removes files from the repository that are corrupt, unknown or have become redundant.
* Add an fsck command | Arthur de Jong | 2015-05-17 | 4 | -0/+307
  This command checks the repository to see if any files are missing or corrupt and whether backups can still be restored. This currently only reads metadata files and only checks for the presence of archive files.
* Use build_extractlist() in select_archives() | Arthur de Jong | 2015-05-15 | 1 | -26/+9
* Implement a restore command | Arthur de Jong | 2015-05-15 | 3 | -0/+165
  This implements a command to restore files from a specific backup.
* Make filters context managers | Arthur de Jong | 2015-05-15 | 4 | -70/+71
  This ensures that open files and streams are properly closed when an exception occurs.
* Use GnuPGKeyEncryption for reading passphrase | Arthur de Jong | 2015-05-14 | 1 | -1/+1
  The keyencryption property was removed. This fixes d676905.
* Catch certain problems reading JSON files | Arthur de Jong | 2015-05-07 | 4 | -9/+9
  This also replaces catching IOError with catching EnvironmentError, which covers a wider range of errors.
* Implement a --keep option for rm | Arthur de Jong | 2015-05-07 | 2 | -3/+83
  This allows for specifying a policy for backups to keep. All other backups will be removed. This also implements a --no-act option to only print which backups would be removed.
* Add extractlists to infofile | Arthur de Jong | 2015-05-07 | 1 | -1/+2
  This adds the list of archives that have an extractlist to the infofile so the backup can be checked to see if all needed files are present.
* Check archive existence when resyncing | Arthur de Jong | 2015-05-07 | 2 | -7/+4
  This checks that the repository contains an archive file when resyncing the metadata cache from the repository.
* Check that we can read the just written passphrase | Arthur de Jong | 2015-05-07 | 1 | -0/+4
  This tries to read the newly written passphrase file before installing it as a new file to avoid installing a passphrase file that cannot be decrypted.
* Change minimum archive usage | Arthur de Jong | 2015-04-10 | 2 | -9/+11
  This ignores archives that will only be used for 40% or less by default and makes the percentage count towards effective use and use relative to block size. Also renames --min-files to --archive-min-files.
* Refactor out repository and cache | Arthur de Jong | 2015-04-10 | 3 | -467/+513
  This moves functionality related to the metadata cache and repository files to separate modules.
* Ignore exceptions removing files | Arthur de Jong | 2015-04-08 | 1 | -2/+10
  This ignores any exceptions when trying to remove files from the repository.
* Improve command-line handling | Arthur de Jong | 2015-04-02 | 2 | -7/+26
  This moves a validation to cmdline, supports natural values for --block-size and does simple validation of the BACKUP argument.
* Ignore errors in crawling directories | Arthur de Jong | 2015-04-02 | 1 | -5/+11
* Add option for minimum number of files | Michel Wilson | 2015-03-27 | 2 | -1/+13
  When backing up directories containing large files, archives with only one or two files are created. The --min-files option forces the archives to always contain at least a set number of files.
* Do not store contents after backup | Arthur de Jong | 2015-03-26 | 1 | -6/+0
  This tries to avoid filling the SQLite cache and only fills the backup contents cache when it is needed.
* Fix key encryption and uuid property | Arthur de Jong | 2015-03-26 | 1 | -3/+4
* Save a UUID in the repository | Arthur de Jong | 2015-03-16 | 1 | -8/+32
  This UUID is used to distinguish repository cache files from each other, which allows running multiple backups without manually specifying different cache directories. This also changes the default cache directory to ~/.cache/sloth.
* Auto-detect key encryption | Arthur de Jong | 2015-03-16 | 1 | -22/+12
* Implement a command to remove backups | Arthur de Jong | 2015-03-15 | 2 | -3/+101
  This allows removing backups from the repository along with archives that have become redundant.
* Implement find command | Arthur de Jong | 2015-03-08 | 2 | -0/+50
  WIP: Implement a find command that finds files in archives
* Store backup contents in the cache | Arthur de Jong | 2015-03-08 | 2 | -31/+97
  This stores the file list for the backup in the cache and modifies the ls command to use the cache.
* Store backup metadata in the cache | Arthur de Jong | 2015-03-08 | 1 | -37/+111
  This adds the information from the backup (stored in info.json) to the cache so it can be queried more easily.
* Use fetchall() where needed | Arthur de Jong | 2015-03-08 | 2 | -112/+97
  Use fetchall() on a cursor because SQLite cannot handle partial reads from a cursor if the database is being modified in another cursor. This clearly uses more memory during the backup.
  Set the synchronous and journal_mode SQLite pragmas to have better performance but be less safe. Since it is a cache, the data can be reconstructed from the repository if needed.
  This also uses the connection as a context manager instead of manually calling commit, changes some of the transactions around to have better performance and includes a few consistency improvements.
* Use better names for tables | Arthur de Jong | 2015-03-02 | 2 | -39/+39
  The names now better reflect the purpose and contents of the table.
* Rename snapshot to backup | Arthur de Jong | 2015-03-01 | 2 | -46/+48
  Don't use the term snapshot any more and use backup instead.
* Refactor out path-handling functions | Arthur de Jong | 2015-03-01 | 3 | -58/+84
* Refactor out ls module | Arthur de Jong | 2015-03-01 | 2 | -44/+75
* Implement --exclude-from option | Arthur de Jong | 2015-03-01 | 2 | -1/+14
  This reads exclude patterns from a file.
* Implement --exclude option | Arthur de Jong | 2015-03-01 | 3 | -11/+68
  This instructs the crawler to skip certain patterns from the backup. It supports * for matching any part of a file name, ** to also match /, ending the pattern with / to only match directories and starting the pattern with / to match the full path.
* Support both python 2 and 3 | Arthur de Jong | 2015-03-01 | 3 | -45/+45
* Improve database performance | Arthur de Jong | 2015-03-01 | 1 | -63/+111
  This avoids problems with long extractlists and long lists of archives where more than SQLite's maximum number of SQL variables were passed. This also uses database transactions in order to improve performance and moves some database actions around in the code so that, for example, temporary table creation is closer to where it is actually used.
* Implement a command to list backup contents | Arthur de Jong | 2015-03-01 | 4 | -1/+93
  This reads the snapshot file list and filters and formats the output to look like ls.
* Implement a command to list backups | Arthur de Jong | 2015-02-21 | 2 | -0/+88
  This writes an info.json file that contains information on each backup that was made and shows that information with a backups command.
* Improve database performance | Arthur de Jong | 2015-02-21 | 2 | -33/+40
  Create indexes (some after crawling, which is a minor improvement) in the tables to improve queries, and use explicit transactions to improve performance (small improvement). Also, move temporary table creation to the functions where they are used (instead of global).
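
The pattern rules listed in the "Implement --exclude option" entry can be sketched in Python roughly as follows. This is only an illustration of those four rules, not the project's actual code: the names pattern_to_regex() and is_excluded() are hypothetical, paths are assumed to be relative to the backup root without a leading slash, and a leading / is interpreted here as anchoring the pattern at that root.

```python
import re


def pattern_to_regex(pattern):
    """Translate an exclude pattern into (compiled regex, dirs_only).

    Hypothetical helper; the rules follow the commit message only.
    """
    anchored = pattern.startswith('/')   # leading /: match the full path
    dirs_only = pattern.endswith('/')    # trailing /: only match directories
    body = pattern.strip('/')
    parts = []
    i = 0
    while i < len(body):
        if body.startswith('**', i):
            parts.append('.*')           # ** may also match /
            i += 2
        elif body[i] == '*':
            parts.append('[^/]*')        # * stays within one file name
            i += 1
        else:
            parts.append(re.escape(body[i]))
            i += 1
    # Unanchored patterns may start at the root or after any /.
    prefix = '^' if anchored else '(?:^|/)'
    return re.compile(prefix + ''.join(parts) + '$'), dirs_only


def is_excluded(path, is_dir, patterns):
    """Return True if the relative path matches any exclude pattern."""
    for pattern in patterns:
        regex, dirs_only = pattern_to_regex(pattern)
        if dirs_only and not is_dir:
            continue
        if regex.search(path):
            return True
    return False
```

With these assumptions, is_excluded('src/foo.pyc', False, ['*.pyc']) is true, while the pattern 'cache/' only excludes a path named cache when it is a directory.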
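
The SQLite techniques mentioned in the "Use fetchall() where needed" and "Improve database performance" entries can be sketched with Python's sqlite3 module. This is a minimal sketch under stated assumptions, not the project's code: the files table, the function names, the pragma values and the chunk size of 500 are all made up for illustration (the log only says the synchronous and journal_mode pragmas were set, not to what).

```python
import sqlite3

# Assumed chunk size, chosen to stay below the limit of 999 bound
# variables that older SQLite builds apply to a single statement.
CHUNK_SIZE = 500


def open_cache(path=':memory:'):
    """Open a cache database (hypothetical schema)."""
    db = sqlite3.connect(path)
    # Trade safety for speed; tolerable because a cache can be rebuilt.
    # The specific values are assumptions.
    db.execute('PRAGMA synchronous = OFF')
    db.execute('PRAGMA journal_mode = MEMORY')
    db.execute('CREATE TABLE IF NOT EXISTS files '
               '(path TEXT PRIMARY KEY, archive TEXT)')
    return db


def rename_archive(db, archive, new_name):
    """Update rows one by one after reading the full result set."""
    # fetchall() finishes the read before any row is modified.
    rows = db.execute(
        'SELECT path FROM files WHERE archive = ?', (archive,)).fetchall()
    with db:  # commits on success, rolls back on an exception
        for (path,) in rows:
            db.execute('UPDATE files SET archive = ? WHERE path = ?',
                       (new_name, path))
    return len(rows)


def files_in_archives(db, archives):
    """Query a long list of archives in chunks of bound variables."""
    archives = list(archives)
    paths = []
    for i in range(0, len(archives), CHUNK_SIZE):
        chunk = archives[i:i + CHUNK_SIZE]
        placeholders = ', '.join('?' * len(chunk))
        rows = db.execute(
            'SELECT path FROM files WHERE archive IN (%s)' % placeholders,
            chunk).fetchall()
        paths.extend(path for (path,) in rows)
    return paths
```

Chunking is only one way to stay under the variable limit; loading the values into a temporary table and joining against it is another.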