Composability: Key Value/Ad Retrieval Service and Pluggable Storage/Query Engines #50
Comments
Hi @thegreatfatzby, allowing the TEE to query untrusted systems on a per-request level would ship information from within the TEE to the outside. So any dependency of a TEE also has to be hosted within TEEs. It seems to me only (3) satisfies that. Or maybe I'm missing something?
I think (1) would satisfy it as well, since in that scenario I'm hypothesizing that the storage and retrieval would still stay on the KV-AR server.
I see. If it stays within the TEE, it's essentially part of the server, since it has access to the same level of information as the core server. So the plug-in would be more in the sense of allowing external contributions to the codebase, rather than allowing additional functionality without the same level of scrutiny that the core server code receives. External contributions on the advanced features make sense, though.
@peiwenhu for (2) above, the managed service option, I wanted to dig on this more...are the issues here that the operator would potentially have observability into:
And then they could in theory write out? For writing, I would think we could allow only reads in the drivers made available in the TEE. Observability in switches seems like a general concern, not specific to this situation, so understood on that. In theory, could a database vendor choose to make an attested version of their software that could run in confidential computing environments, maybe with less functionality than normal as needed?
Yes, 1 and 2 are the main concerns. (Writing versus reading doesn't really matter at first, since both effectively ship information out of the TEE as long as there's some observability. When there's no observability, yes, writing would become a problem.) In theory, if a DB vendor can make a version that runs in confidential computing environments and is recognized and attested by the same or an equivalent standard as the TEE server itself, my personal opinion is that the TEE server could use it. (More discussion is needed to form an official position from Privacy Sandbox.)
So something I've been noodling on for a while is that we (and I include ASAPI in "we") are subtly coupling behavior and implementation/technology in some places. This is completely understandable given the phase of development we're in, the level of general understanding across the industry, the newness (to this industry at least) of integrating DP into inputs/outputs to functions (in the most general sense), etc. Arguably it's a good/necessary thing to get off the ground...but I would imagine we'd all like to decouple complicated/hard things like storage/query engines from complicated things like the "privacy preservation" layer if we could.
So I'd like to get thoughts on a particular issue I see coming soon, which is the ability to store and retrieve data in more nuanced ways while still enforcing privacy.
In our experience at App-Xandr-Soft, data storage and retrieval is not an easy problem to solve with a single solution. Here are some KV/AR issues we've run into for MSAN, Monetize, and Invest:
I suspect in the long run it will be very difficult to support ad-tech functionality if we don't support this type of scaling and retrieval.
At a gross level of simplification, in simple app development this seems like something we'd solve with an adapter pattern, so I'm interested in discussing pluggable storage/query backend engines for the KV/AR service. Some strawmen I'll put out for open fire:
PrivateKVARBackendInterface
and then query that from their UDFs for KVs or creatives for further bidding.

(1) has the clear downside of needing to implement both storage and query adapters and store data on the same servers, which will be operationally challenging, although it would seem the safest step from a privacy perspective.
(2) would have some of the same issues as discussed with general composability here, and I can currently only wave my hands at a general idea of a managed service database in which the cloud-provided interface ensures certain guarantees around observability of queries and data, although I'd guess it's not strictly impossible.
(3), same issue as (2) but harder I'd think.
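To make the adapter idea in (1) a bit more concrete, here is a rough Python sketch of what a pluggable backend could look like. Only the name `PrivateKVARBackendInterface` comes from the strawman above; the method names (`get`, `query`), the `InMemoryBackend` adapter, and the `lookup_for_udf` helper are all hypothetical and purely for illustration. The interface is deliberately read-only, and nothing here addresses attestation or observability.

```python
from typing import Dict, Iterable, List, Optional, Protocol


class PrivateKVARBackendInterface(Protocol):
    """Hypothetical read-only contract a storage/query engine would implement.

    The method names and signatures are assumptions, not an existing API.
    """

    def get(self, key: str) -> Optional[bytes]:
        """Return the value stored under key, or None if absent."""
        ...

    def query(self, prefix: str) -> List[str]:
        """Return the keys that start with prefix, in sorted order."""
        ...


class InMemoryBackend:
    """Toy adapter that keeps all data on the same server, as in strawman (1)."""

    def __init__(self, data: Dict[str, bytes]) -> None:
        # Copy the input so later changes by the caller are not visible here.
        self._data = dict(data)

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def query(self, prefix: str) -> List[str]:
        return sorted(k for k in self._data if k.startswith(prefix))


def lookup_for_udf(
    backend: PrivateKVARBackendInterface, keys: Iterable[str]
) -> Dict[str, bytes]:
    """Hypothetical helper a UDF host could call; it depends only on the interface."""
    found: Dict[str, bytes] = {}
    for key in keys:
        value = backend.get(key)
        if value is not None:
            found[key] = value
    return found


backend = InMemoryBackend({"ad:1": b"a", "ad:2": b"b", "kw:x": b"c"})
result = lookup_for_udf(backend, ["ad:1", "missing"])
```

The point of the sketch is only that the UDF-facing code depends on the interface, so a different engine could be swapped in behind it as long as it still runs inside the TEE.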
So, obvious issues to discuss, but I think very worth discussing. ¡Vamos!