- Ball, Madeleine P;
- Thakuria, Joseph V;
- Zaranek, Alexander Wait;
- Clegg, Tom;
- Rosenbaum, Abraham M;
- Wu, Xiaodi;
- Angrist, Misha;
- Bhak, Jong;
- Bobe, Jason;
- Callow, Matthew J;
- Cano, Carlos;
- Chou, Michael F;
- Chung, Wendy K;
- Douglas, Shawn M;
- Estep, Preston W;
- Gore, Athurva;
- Hulick, Peter;
- Labarga, Alberto;
- Lee, Je-Hyuk;
- Lunshof, Jeantine E;
- Kim, Byung Chul;
- Kim, Jong-Il;
- Li, Zhe;
- Murray, Michael F;
- Nilsen, Geoffrey B;
- Peters, Brock A;
- Raman, Anugraha M;
- Rienhoff, Hugh Y;
- Robasky, Kimberly;
- Wheeler, Matthew T;
- Vandewege, Ward;
- Vorhaus, Daniel B;
- Yang, Joyce L;
- Yang, Luhan;
- Aach, John;
- Ashley, Euan A;
- Drmanac, Radoje;
- Kim, Seong-Jin;
- Li, Jin Billy;
- Peshkin, Leonid;
- Seidman, Christine E;
- Seo, Jeong-Sun;
- Zhang, Kun;
- Rehm, Heidi L;
- Church, George M
Rapid advances in DNA sequencing promise to enable new diagnostics and individualized therapies. Achieving personalized medicine, however, will require extensive research on highly reidentifiable, integrated datasets of genomic and health information. To assist with this, participants in the Personal Genome Project choose to forgo privacy via our institutional review board-approved "open consent" process. The contribution of public data and samples facilitates both scientific discovery and standardization of methods. We present our findings after enrollment of more than 1,800 participants, including whole-genome sequencing of 10 pilot participant genomes (the PGP-10). We introduce the Genome-Environment-Trait Evidence (GET-Evidence) system. This tool automatically processes genomes and prioritizes both published and novel variants for interpretation. In the process of reviewing the presumed healthy PGP-10 genomes, we find numerous literature references implying serious disease. Although it is sometimes impossible to rule out a late-onset effect, stringent evidence requirements can address the high rate of incidental findings. To that end we develop a peer production system for recording and organizing variant evaluations according to standard evidence guidelines, creating a public forum for reaching consensus on interpretation of clinically relevant variants. Genome analysis becomes a two-step process: using a prioritized list to record variant evaluations, then automatically sorting reviewed variants using these annotations. Genome data, health and trait information, participant samples, and variant interpretations are all shared in the public domain, and we invite others to review our results using our participant samples and contribute to our interpretations. We offer our public resource and methods to further personalized medical research.
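The two-step analysis described in the abstract can be illustrated with a minimal sketch. This is not the GET-Evidence implementation: the `Variant` class, its fields (`in_literature`, `predicted_disruptive`, `evidence_score`), and both functions are hypothetical names chosen for illustration, and the ranking rules are assumptions rather than the system's actual criteria.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    # All fields are hypothetical stand-ins, not GET-Evidence's data model.
    name: str
    in_literature: bool                    # variant has published references
    predicted_disruptive: bool             # computational prediction of impact
    evidence_score: Optional[int] = None   # reviewer annotation; None = unreviewed


def prioritize_for_review(variants: List[Variant]) -> List[Variant]:
    """Step 1: list unreviewed variants, most promising candidates first."""
    unreviewed = [v for v in variants if v.evidence_score is None]
    # Assumed ordering: published variants first, then predicted-disruptive ones.
    return sorted(
        unreviewed,
        key=lambda v: (v.in_literature, v.predicted_disruptive),
        reverse=True,
    )


def sort_reviewed(variants: List[Variant], min_score: int = 0) -> List[Variant]:
    """Step 2: sort reviewed variants by their recorded annotation."""
    reviewed = [
        v for v in variants
        if v.evidence_score is not None and v.evidence_score >= min_score
    ]
    return sorted(reviewed, key=lambda v: v.evidence_score, reverse=True)


if __name__ == "__main__":
    # Placeholder variant names; no real variants or scores are implied.
    genome = [
        Variant("variant-A", in_literature=False, predicted_disruptive=False),
        Variant("variant-B", in_literature=True, predicted_disruptive=True),
        Variant("variant-C", in_literature=True, predicted_disruptive=False),
        Variant("variant-D", True, True, evidence_score=4),
        Variant("variant-E", False, True, evidence_score=1),
    ]
    print([v.name for v in prioritize_for_review(genome)])
    print([v.name for v in sort_reviewed(genome, min_score=2)])
```

Raising `min_score` in this sketch plays the role of the stringent evidence requirement mentioned above: variants with weak recorded evidence drop out of the final report, which is one way a high rate of incidental findings could be kept in check.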