Inspiration
This project was inspired by existing genome viewers, such as those from the NIH and UCSC. However, given the growing potential for harm when bad actors get hold of genomic data, I wanted to create a more secure way to store this information.
What it does
DNA-SeQ is a secure platform for hosting genome information. The website is designed to be user-friendly while keeping a high level of security. Users must first apply for an account before they can view any data, which protects against potential bad actors. An administrator then reviews the application and, upon confirming the new user's good intentions, grants them access to the database. Approved users can then log in and view the data; login is handled by fireauth, Firebase's authentication SDK. I also added route protection, so that users without access cannot reach the data or the pages from which admins exercise their privileges.
Administrators also have broad control over both users and genomic data. They can set access specifiers on the raw data (useful, for example, if an individual has a rare phenotypic mutation that would allow for their identification), and they can review users' access and remotely deactivate an account if it has been breached.
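The access rules described above can be illustrated with a short sketch. This is not the app's actual code (the real checks live in the Next.js frontend and Firebase); the role names, the `required_role` field, and the `User` and `Variant` classes are all hypothetical, chosen only to show how per-record access specifiers and account deactivation might combine.

```python
from dataclasses import dataclass

# Hypothetical role ordering: a higher rank can see everything a lower rank can.
ROLE_RANK = {"pending": 0, "approved": 1, "admin": 2}


@dataclass
class User:
    email: str
    role: str            # one of the keys in ROLE_RANK (assumed names)
    active: bool = True  # an admin can set this to False to deactivate the account


@dataclass
class Variant:
    variant_id: str
    required_role: str = "approved"  # hypothetical per-record access specifier


def can_view(user: User, variant: Variant) -> bool:
    """Return True only for active users whose role meets the record's requirement."""
    if not user.active:
        return False
    return ROLE_RANK[user.role] >= ROLE_RANK[variant.required_role]


def visible_variants(user: User, variants: list[Variant]) -> list[Variant]:
    """Filter a list of records down to those the user is allowed to see."""
    return [v for v in variants if can_view(user, v)]


variants = [
    Variant("rs1"),                        # ordinary record
    Variant("rs2", required_role="admin"), # restricted, e.g. an identifying mutation
]
applicant = User("new@example.org", role="pending")
researcher = User("lab@example.org", role="approved")
breached = User("old@example.org", role="admin", active=False)

researcher_ids = [v.variant_id for v in visible_variants(researcher, variants)]
```

In this sketch, an applicant who has not yet been approved sees nothing, an approved user sees only unrestricted records, and a deactivated account sees nothing regardless of its role.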
How we built it
For this project, I took a sequence of mutations from samples in the International Genome Sample Resource (IGSR) and encoded them into a Cloud Firestore database. From Firebase, I accessed the data on a Next.js and Tailwind CSS frontend. Lastly, I hosted the finished app on Vercel; it is available at dnaseq.vercel.app.
Challenges we ran into
Due to the immense (over 10 GB!) size of the VCF file we were given, I was unable to add all of the datapoints to the database. Instead, I added roughly the first few hundred to illustrate what a finished version of the app might look like.
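One way to take only the first few hundred records without loading the whole file is to stream it line by line and stop early. The sketch below uses only the standard library; the function names and the tiny sample data are made up for illustration, and it is not necessarily how the subset for the app was produced.

```python
import io
from itertools import islice

# Made-up VCF excerpt so the example runs on its own; a real file is far larger.
SAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t10177\trs1\tA\tAC\t100\tPASS\tAC=1\n"
    "1\t10352\trs2\tT\tTA\t100\tPASS\tAC=2\n"
    "1\t10616\trs3\tC\tG\t100\tPASS\tAC=3\n"
    "1\t11008\trs4\tC\tG\t100\tPASS\tAC=4\n"
)


def iter_vcf_records(lines):
    """Yield one dict per variant row, keyed by the names on the '#CHROM' line."""
    columns = None
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and '##' metadata lines
        if line.startswith("#"):
            columns = line.lstrip("#").split("\t")  # the column header line
            continue
        if columns is None:
            raise ValueError("data row found before the #CHROM header line")
        yield dict(zip(columns, line.split("\t")))


def first_records(lines, limit):
    """Return at most `limit` records; reading stops once the limit is reached."""
    return list(islice(iter_vcf_records(lines), limit))


# With a real file, pass an open file handle instead of the StringIO object.
subset = first_records(io.StringIO(SAMPLE_VCF), limit=2)
```

Because the generator is lazy, only as many lines as needed are read, so the 10 GB file never has to fit in memory. Each resulting dict could then be written to the database as one document.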
Accomplishments that we're proud of
I really like how the table turned out! It manages to convey almost all of the VCF data while still being very readable and secure. The individual access modification on datapoints is also very nice.
What we learned
I learned how to parse VCF files. More specifically, I learned how to read large files in segments using Python's pandas library.
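As a minimal sketch of reading in segments, pandas' `read_csv` accepts a `chunksize` argument and then returns an iterator of DataFrames instead of one large frame. The column list, the sample data, and the `read_vcf_in_chunks` helper are assumptions for illustration; the project's actual parsing code may differ.

```python
import io

import pandas as pd

# Made-up VCF excerpt so the example is self-contained.
SAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t10177\trs1\tA\tAC\t100\tPASS\tAC=1\n"
    "1\t10352\trs2\tT\tTA\t100\tPASS\tAC=2\n"
    "1\t10616\trs3\tC\tG\t100\tPASS\tAC=3\n"
    "1\t11008\trs4\tC\tG\t100\tPASS\tAC=4\n"
    "1\t11012\trs5\tC\tG\t100\tPASS\tAC=5\n"
)

# The eight fixed VCF columns; per-sample genotype columns are left out here.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def read_vcf_in_chunks(source, chunksize):
    """Return an iterator of DataFrames with at most `chunksize` rows each."""
    return pd.read_csv(
        source,
        sep="\t",
        comment="#",            # skips '##' metadata and the '#CHROM' header line;
                                # note it would also cut any data field containing '#'
        header=None,
        names=VCF_COLUMNS,
        dtype={"CHROM": str},   # chromosome names such as 'X' are not numeric
        chunksize=chunksize,
    )


# With a real file, pass its path instead of the StringIO object.
chunks = list(read_vcf_in_chunks(io.StringIO(SAMPLE_VCF), chunksize=2))
chunk_sizes = [len(chunk) for chunk in chunks]
```

In practice each chunk would be processed and discarded inside a `for` loop rather than collected into a list, so that only one segment is held in memory at a time.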
What's next for DNA-SeQ
In the future, I hope to add additional table features, such as pagination, indexing, and searching, as well as further improve security by adding access logs so that admins can see who logged in when.