NOTE: this is mostly a wishlist and doesn't reflect the actual state of implementation!
Work-in-progress documentation: https://profi248.github.io/pluto (contains build & run instructions).
pluto lets you back up any amount of your data for free to other people's computers, in exchange for an equal amount of your disk space. You can save your data to random users, your friends, or choose to make multiple copies and split them however you want. In any case, your data will of course be encrypted with a key only you have. Don't procrastinate on your backups: it's as simple as installing the app and picking the data you want safeguarded! The only thing you need is some free disk space.
There's of course a lot of potential for abuse in this system, so we try to mitigate as much of it as we can. To prevent a malicious actor from pushing data to other users while lying to the network about saving theirs, all clients are required to answer storage challenges. A storage challenge is generated by a client that has backed up some data, and sent to the client that is supposed to store it. The challenge consists of a random byte range of a random file. The challenged client then responds with a hash of the requested data. Because the range can't be predicted in advance, answering correctly over time shows that the client is actually storing the data.
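The challenge flow above can be sketched roughly as follows. This is only an illustration, not pluto's actual implementation: the function names, the dictionary layout, and the choice of SHA-256 are all assumptions, and the challenger is assumed to still have the original bytes available to compute the expected answer.

```python
import hashlib
import hmac
import secrets

def make_challenge(file_sizes):
    """Pick a random file and a random non-empty byte range inside it.

    file_sizes: hypothetical mapping of file name -> size in bytes.
    """
    candidates = [name for name, size in file_sizes.items() if size > 0]
    name = secrets.choice(candidates)
    size = file_sizes[name]
    start = secrets.randbelow(size)
    # Exclusive end, so that start < end <= size.
    end = start + 1 + secrets.randbelow(size - start)
    return {"file": name, "start": start, "end": end}

def answer_challenge(challenge, stored):
    """Run by the storing client: hash the requested byte range."""
    data = stored[challenge["file"]][challenge["start"]:challenge["end"]]
    return hashlib.sha256(data).hexdigest()  # SHA-256 is an assumption

def verify_answer(challenge, response, original):
    """Run by the challenger: compare against a hash of its own copy."""
    expected = answer_challenge(challenge, original)
    return hmac.compare_digest(expected, response)
```

A client that threw the data away can't produce the right hash for a range it didn't know in advance, which is the whole point of picking the file and range at random.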
There is also an inherent risk of a client refusing to give back backed up data. The malicious actor wouldn't gain too much value from that, but it will be mitigated, at least partially, by a reputation score. Clients that misbehave too often will be excluded from the network. To ensure that legitimate computers are suitable for participating, we will check their internet speeds and verify if they are online often enough.
The problem of a malicious actor intentionally pushing malicious data (malware, illegal content, etc.) to other users is a tricky one to fully solve, but obfuscating incoming data with our own key before physically saving it to the drive should save us from most trouble.
All data in backups is encrypted and authenticated using AES-256-GCM. Ed25519 is used for authentication to the backup coordinator. All the keys are generated from a CSPRNG (ChaCha20-based), seeded with entropy that's initially generated on the first installation. That initial entropy is shown to the user in the form of a 24-word passphrase (generated with a BIP39-like algorithm), which serves as a simple way to later recover the keys and restore files.
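To give an idea of how 24 words can carry the initial entropy, here is a sketch of the standard BIP39 bit layout for 256 bits of entropy: an 8-bit checksum (the first byte of the SHA-256 of the entropy) is appended, and the resulting 264 bits are split into 24 indices of 11 bits each, one per word in a 2048-word list. pluto's algorithm is only described as BIP39-like, so its exact layout may differ; the function names are hypothetical and the word list itself is left out.

```python
import hashlib

def entropy_to_indices(entropy: bytes) -> list[int]:
    """Map 32 bytes of entropy to 24 word indices (11 bits each)."""
    if len(entropy) != 32:
        raise ValueError("expected 32 bytes of entropy")
    checksum = hashlib.sha256(entropy).digest()[0]  # first 8 bits of SHA-256
    bits = (int.from_bytes(entropy, "big") << 8) | checksum  # 264 bits
    return [(bits >> (11 * (23 - i))) & 0x7FF for i in range(24)]

def indices_to_entropy(indices: list[int]) -> bytes:
    """Recover the entropy from 24 word indices, checking the checksum."""
    if len(indices) != 24 or any(not 0 <= i < 2048 for i in indices):
        raise ValueError("expected 24 indices in the range 0..2047")
    bits = 0
    for index in indices:
        bits = (bits << 11) | index
    entropy = (bits >> 8).to_bytes(32, "big")
    if hashlib.sha256(entropy).digest()[0] != (bits & 0xFF):
        raise ValueError("checksum mismatch")
    return entropy
```

Each index would then be looked up in the word list to produce the passphrase. Since the mapping is reversible, typing the words back in yields the same entropy, and from there the same keys.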
Nodes (clients) connect to the backup coordinator, a central server that matches nodes that want to back up their data. It also stores relationships between nodes (so everyone knows who has their backups), relays challenges, etc. The data itself, in the form of encrypted chunks, is transferred directly between nodes where possible; otherwise a relay could be used. The MQTT protocol is used between the coordinator and nodes.
Files that are intended for backup are split into chunks using a content-defined chunking algorithm, then encrypted and sent off to storage. That way, only the changed parts of files have to be transferred. An index of chunks is stored and encrypted along with the other data.
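As a rough illustration of why content-defined chunking helps, here is a minimal sketch using a Gear-style rolling hash. It is not necessarily the algorithm pluto uses: the hash, the table construction, and the `avg_bits`, `min_size` and `max_size` parameters are all assumptions picked for the example.

```python
import hashlib

# Deterministic table of 256 pseudo-random 64-bit values, one per byte value.
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:8], "big")
        for i in range(256)]
MASK64 = (1 << 64) - 1

def chunk(data, avg_bits=10, min_size=256, max_size=8192):
    """Split data into chunks whose boundaries depend on the content."""
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Rolling hash: older bytes are shifted out after 64 steps.
        h = ((h << 1) + GEAR[byte]) & MASK64
        length = i - start + 1
        # Cut when the top avg_bits bits of the hash are all zero,
        # or when the chunk has reached the maximum size.
        at_boundary = length >= min_size and (h >> (64 - avg_bits)) == 0
        if at_boundary or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks

def chunk_ids(data):
    """Identify each chunk by the SHA-256 of its contents."""
    return [hashlib.sha256(c).hexdigest() for c in chunk(data)]
```

Because boundaries are chosen by the bytes themselves rather than by fixed offsets, inserting a few bytes near the start of a file typically changes only the chunks around the edit. Comparing `chunk_ids` before and after then tells the client which chunks still need to be uploaded.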