Configurations: A model for distributed data storage

B Mungamuru, H Garcia-Molina, C Olston - Proceedings of the twenty-sixth annual ACM symposium on Principles of …, 2007 - dl.acm.org
There are many ways to safeguard data from loss and unauthorized access, but two fundamental operations cover many of the options: Copy and what we call Split. Making copies of data (i.e., replicating across n sites) safeguards against the loss of data, whereas splitting data safeguards against unauthorized access. With a Split, data is "decomposed" into n shares (e.g., ciphertext and n − 1 keys) and distributed across n sites, in such a way that all n shares are needed to reconstruct the original data, and access by an adversary to any proper subset of the shares is not a security breach.
The Split operator can be implemented in many ways. For instance, in a 3-way Split (i.e., n = 3) of a file, two shares can be randomly generated sequences of bits and the third share can be the bits of our file XOR-ed with the random sequences. From our point of view, a Split might even be a partition of the attributes of a relational database into subsets (see [1]), as long as each subset of attributes is not sensitive on its own, and only the combination of all attributes allows us to reconstruct the original database.
Copy and Split operators can be composed in interesting ways to specify distributed strategies for safeguarding data. To illustrate, consider Figure 1, which shows what we call a configuration. The terminal vertices at the bottom of the tree represent data that is materialized and stored at a physical storage site, while the non-terminals represent data that is not materialized. The non-terminals are annotated with either an S for a Split operation, or a C for a Copy.
In this configuration, a file containing sensitive data (represented by the root g) has been split into a materialized share a and a non-materialized share f. For our illustration, let us say that a is a publicly accessible encrypted version of file g and that f is the encryption key. We make two copies of key f: one copy is stored at site b, whereas the other copy e is split again into shares c and d.
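The XOR-based Split and the configuration of Figure 1 can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the names split_xor, combine_xor, Node and recoverable are hypothetical, and the rule used by recoverable (a Split vertex needs all of its children, a Copy vertex needs any one of them) is our reading of the semantics described above.

```python
import secrets
from dataclasses import dataclass, field
from functools import reduce


def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(p ^ q for p, q in zip(x, y))


def split_xor(data: bytes, n: int) -> list[bytes]:
    """n-way Split: n - 1 random pads, plus the data XOR-ed with every pad."""
    if n < 2:
        raise ValueError("a Split needs at least two shares")
    pads = [secrets.token_bytes(len(data)) for _ in range(n - 1)]
    masked = reduce(xor_bytes, pads, data)
    return pads + [masked]


def combine_xor(shares: list[bytes]) -> bytes:
    """XOR all n shares together; the pads cancel and the data remains."""
    return reduce(xor_bytes, shares)


@dataclass
class Node:
    """A vertex of a configuration tree (hypothetical representation)."""
    name: str
    op: str = ""  # "S" for Split, "C" for Copy, "" for a materialized terminal
    children: list["Node"] = field(default_factory=list)


def recoverable(node: Node, sites: set[str]) -> bool:
    """Can the data at `node` be rebuilt from the terminals named in `sites`?

    Assumed rule: a Split needs every child, a Copy needs at least one.
    """
    if not node.children:
        return node.name in sites
    results = [recoverable(child, sites) for child in node.children]
    return all(results) if node.op == "S" else any(results)


# The configuration of Figure 1: g = S(a, f), f = C(b, e), e = S(c, d).
e = Node("e", "S", [Node("c"), Node("d")])
f = Node("f", "C", [Node("b"), e])
g = Node("g", "S", [Node("a"), f])

# Round trip of a 3-way Split on a small example file.
shares = split_xor(b"sensitive file", 3)
assert combine_xor(shares) == b"sensitive file"
```

Under the assumed rule, recoverable(g, {"a", "b"}) and recoverable(g, {"a", "c", "d"}) both hold, while recoverable(g, {"a", "c"}) does not: the ciphertext a is useless without either the stored key copy b or both shares c and d of the other copy.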