From 63a59357a898927713a6032bea684330e4b75ba2 Mon Sep 17 00:00:00 2001
From: Olivier 'reivilibre
Date: Sat, 26 Jun 2021 16:41:47 +0100
Subject: [PATCH] Update READMEs

---
 datman/README.md |  12 ++++++
 yama/README.md   | 105 ++++-------------------------------------------
 2 files changed, 19 insertions(+), 98 deletions(-)
 create mode 100644 datman/README.md

diff --git a/datman/README.md b/datman/README.md
new file mode 100644
index 0000000..4a27a60
--- /dev/null
+++ b/datman/README.md
@@ -0,0 +1,12 @@
+# datman: DATa MANager
+
+Datman is a tool to make it easier to use Yama for backups.
+
+Features:
+
+* Chunk-based deduplication
+* (optional) Compression using Zstd and a specifiable dictionary
+* (optional) Encryption
+* Ability to back up to remote machines over SSH
+
+See the documentation for more information.
diff --git a/yama/README.md b/yama/README.md
index 611e836..c8499d5 100644
--- a/yama/README.md
+++ b/yama/README.md
@@ -1,109 +1,18 @@
 # 山 (yama): deduplicated heap repository

-note: this readme is not yet updated to reality…
+Yama is a system for storing files and directory trees in 'piles'. The data stored is deduplicated (by using content-defined chunking) and can be compressed and encrypted, too.

-```
-yama
- [-w|--with [user@host:]path] [--with-encrypted true|false]
+NOT YET ~~Yama also permits storing to piles on remote computers, using SSH.~~

-```
+Yama is intended for use as a storage mechanism for backups. Datman is a tool to make it easier to use Yama for backups.

-## Backup Profiles
+The documentation is currently the best source of information about Yama, see the `docs` directory.

+Yama can be used as a library for your own programs; further information about this is yet to be provided but the API documentation (Rustdocs) may be useful.

-## Remotes
+## Other, unpolished, notes

-In `yama.toml`, you can configure remotes:
-
-```toml
-[remote.bob]
-encrypted = true
-host = "bobmachine.xyz"
-user = "bob"
-path = "/home/bob/yama"
-```
-
-## Subcommands
-
-
-### `check`: Check repository for consistency
-
-Verifies the full repository satisfies the following consistency constraints:
-
-- all chunks have the correct hash
-- all pointers have a valid structure, recursively
-
-Usage: `yama check [--gc]`
-
-The amount of space occupied and occupied by unused chunks is reported.
-
-If `--gc` is specified, unused chunks will be removed.
-
-### `lsp`: List tree pointers
-
-Usage: `yama lsp`
-
-### `rmp`: Remove tree pointers
-
-Usage: `yama rmp pointer/path [--force]`
-
-If `--force` is not specified and the pointer is depended upon by another, then deletion is aborted with an error.
-
-### `store`: Store tree into repository
-
-Usage: `yama store [--dry-run] [ssh://user@host]/path/to/dir pointer/path [--exclusions path/to/exclusions.txt] [--differential pointer/parent]`
-
-The pointer must not exist and it will be created. If `--differential` is specified with an existing parent pointer, then the diretory listing is specified as a differential list to the parent.
-The intention of this is to reduce the size of the directory list.
-
-#### Exclusion lists
-
-Exclusion lists have pretty much the same format as `.gitignore`, one glob per line of files to not include, relative to the tree root.
-
-### `extract`: Extract file(s) from repository
-
-Usage: `yama extract [--dry-run] pointer/path[:path] [ssh://user@host]/path/to/local/dir[/]`
-
-If no path specified, extract root /. Trailing slash means that the file will be extracted as a child of the specified directory.
-
-### `remote`: Run operations on a remote repository
-
-Usage: `yama remote ssh://user@host/path/to/repo `
-
-#### remote `store`: Store local tree into remote repository
-
-Usage is identical to `yama store` except store path must be local.
-
-#### remote `extract`: Extract remote repository into local tree
-
-Usage is identical to `yama extract` except target path must be local.
-
-### `slave`: Remote-controlled yama
-
-Communicates over stdin/stdout to perform specified operations. Used when a yama command involves SSH.
-
-## Repository Storage Details
-
-Pointers are stored in `pointers.lmdb` and chunks are stored in `chunks.lmdb`.
-It is expected that exclusion files will be kept in the same directory with the repository, if they are to be used
-on a recurring basis.
-
-Chunks are compressed with `zstd`. It must first be trained and a training dictionary placed in `repo root/zstd.dict`.
-**This dictionary file must not be lost or altered after chunks have been made using it. Doing so will void the integrity of the entire repository.**
-
-Chunks are hashed with BLAKE256, and chunks will have their xxHash calculated before being deduplicated away. (Collision being detected will result in abortion of the backup. It is expected to never happen but nevertheless we may not be sure.)
-
-## Remote Protocol Details
-
-* Compression is performed on the host where the data resides.
-* Only required chunks are compressed and diffused across the SSH connection.
-* There needs to be some mechanism to offer, decline and accept chunks, without buffers overflowing and bringing hosts down.
-
-
-## Processor Details
-
-
-## Other notes
+### Training a Zstd Dictionary

 `zstd --train FILEs -o zstd.dict`