In our work on indexing, we are investigating how to make search both faster and more secure. We use index partitioning schemes based on file system security metadata.
We are investigating a system that integrates the seemingly incompatible features of encryption and deduplication. Combining the two can allow for efficient storage of data under arbitrary classification. However, difficult issues arise in combining these features, such as safe data destruction and privacy preservation in the face of network analysis.
Our approach to security in Ceph allows secure access by hundreds of thousands of clients to a single file spread across tens of thousands of object-based storage devices without taxing the metadata servers or any other part of the system. The prototype implementation we developed imposes only a 6–7% overhead on a metadata-heavy workload involving file opens spread across hundreds of clients. Building on this approach, we are investigating scalable encryption and limiting the effects of compromised computation nodes.
We have designed and implemented Horus, a system that offers fine-grained encryption-based security for large-scale storage. Horus encrypts large datasets using keyed hash trees (KHT) to generate different keys for each region of the dataset, providing fine-grained security. A node with one part of the file cannot read any other part of the file, providing strong security while still allowing for highly parallel computation on stored data. KHT also reduces key management and distribution overhead. The design of Horus provides end-to-end data encryption and can reduce the need to trust system operators or cloud service providers. Performance evaluation shows that our prototype’s key distribution is highly scalable and robust. A version of the library is available for download.
This work developed techniques for using a single key to encrypt an entire file, while still allowing secure distribution of parts of the file to different compute nodes.
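The keyed hash tree idea described for Horus above can be illustrated with a minimal sketch. It is not the Horus library's API: the function names, the use of HMAC-SHA256 as the keyed hash, the label encoding, and the fixed branching factor are all assumptions made for illustration. Each node's key is derived from its parent's key, so a holder of one subtree's key can derive keys for the regions under it but not for any other region.

```python
import hashlib
import hmac


def child_key(parent_key: bytes, level: int, index: int) -> bytes:
    """Derive the key of node (level, index) from its parent's key.

    Assumption: HMAC-SHA256 over a fixed-width encoding of (level, index).
    """
    label = level.to_bytes(4, "big") + index.to_bytes(8, "big")
    return hmac.new(parent_key, label, hashlib.sha256).digest()


def derive_key(start_key: bytes, start_level: int, start_index: int,
               target_level: int, target_index: int,
               branching: int = 4) -> bytes:
    """Walk down the tree from (start_level, start_index) to a descendant.

    Raises ValueError if the target lies outside the subtree that
    start_key covers, which is what confines a node to its own region.
    """
    depth = target_level - start_level
    if depth < 0 or target_index // (branching ** depth) != start_index:
        raise ValueError("target is outside the subtree covered by this key")
    key = start_key
    for level in range(start_level + 1, target_level + 1):
        # Index of the target's ancestor at this level.
        index = target_index // (branching ** (target_level - level))
        key = child_key(key, level, index)
    return key
```

In this sketch, a client given only the key for node (1, 2) can call `derive_key(region_key, 1, 2, 3, 35)` to obtain the key for leaf 35. The same call for leaf 50 raises `ValueError`, because that leaf belongs to a different subtree whose key cannot be computed without the root key.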
Our secure storage research has several thrusts:
One thrust includes the POTSHARDS project, which was completed around 2010, as well as current research on security for archival storage systems that uses combinations of random blocks to provide a strong source of entropy, helping to guard against long-term "cracking". These systems store data in a way that prevents an attacker from even knowing that the data exists.
The Lethe project is exploring techniques for securely deleting data from any file system by forgetting as little information as possible. Our current approach allows a user to "forget" data by securely disposing of a single 128-bit key. By explicitly carrying forward all retained data under a new key, this approach allows a storage system to securely delete information quickly, meeting the security requirements of regulations such as GDPR and CCPA and giving users privacy guarantees for deleted data.
File storage system verification
We investigate the use of strong authentication, encryption, and other mechanisms to safeguard data stored in network-attached storage systems and long-term archival storage systems. Adding security to large storage systems presents a severe challenge to scalability, which we are addressing using aggregate capabilities. We are also exploring protocols to verify remote storage and formal verification of secure network-attached storage.