Examination Project

Project Title: Movie Library Database Design and Analysis with MongoDB

Objective:

The objective of this project is to design a MongoDB database for a movie library, implement the database,
and perform various queries and aggregations to analyze the data. This project will help you apply your
knowledge of MongoDB to a real-world scenario involving movies, actors, and users.

Datasets Provided:

● actors.json: Contains data about actors.
● movies.json: Contains data about movies.
● users.json: Contains data about users and their favorite movies.

Tasks:

1. Creating MongoDB Collections:
○ Import the provided JSON data into the corresponding collections.
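
One possible way to approach the import step from Python is sketched below. It assumes each file holds either a single JSON array or one document per line; the database name movie_library and the helper names are illustrative, not part of the assignment.

```python
import json
from pathlib import Path


def load_documents(path):
    """Read a file that holds one JSON array, or one JSON document per line."""
    text = Path(path).read_text(encoding="utf-8").strip()
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def import_collection(db, name, path):
    """Insert every document from `path` into db[name]; return how many were read."""
    docs = load_documents(path)
    if docs:  # insert_many rejects an empty list
        db[name].insert_many(docs)
    return len(docs)


# Usage (assumes pymongo is installed and a server runs locally):
# from pymongo import MongoClient
# db = MongoClient("mongodb://localhost:27017")["movie_library"]  # hypothetical name
# for name in ("actors", "movies", "users"):
#     print(name, import_collection(db, name, f"{name}.json"))
```

If the files turn out to be in MongoDB Extended JSON (for example ids written as {"$oid": ...}), plain json.loads will not convert those values; bson.json_util.loads or the mongoimport tool is likely a better fit in that case.
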
2. Data Ingestion:
○ Write scripts to insert additional sample data into your MongoDB collections.
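
A minimal sketch of an ingestion script for extra sample data follows. The field names (title, year, genres, rating, actors) are assumptions about the movies schema and should be adjusted to match the provided files; the sample values are placeholders.

```python
# Assumed (hypothetical) movie fields; align these with movies.json.
REQUIRED_MOVIE_FIELDS = ("title", "year", "genres", "rating", "actors")


def make_movie(title, year, genres, rating, actors):
    """Build one movie document using the assumed field names."""
    return {
        "title": title,
        "year": int(year),
        "genres": list(genres),
        "rating": float(rating),
        "actors": list(actors),
    }


def insert_samples(collection, docs, required=REQUIRED_MOVIE_FIELDS):
    """Check that each document has the required fields, then insert them all."""
    for doc in docs:
        missing = [field for field in required if field not in doc]
        if missing:
            raise ValueError(f"document {doc!r} is missing fields: {missing}")
    result = collection.insert_many(docs)
    return len(result.inserted_ids)


# Usage, with `db` being a pymongo database handle:
# samples = [
#     make_movie("Sample Movie A", 2001, ["Drama"], 7.5, ["Sample Actor One"]),
#     make_movie("Sample Movie B", 2001, ["Comedy"], 6.0, ["Sample Actor Two"]),
# ]
# insert_samples(db["movies"], samples)
```
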
3. Indexing Strategy:
○ Analyze your collections and queries to determine where indexes should be applied.
○ Create indexes to optimize query performance.
○ Document the rationale behind choosing specific fields for indexing.
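
As an illustration of the indexing step, the sketch below keeps the rationale next to each index so it can be copied into the report. The field names are assumptions about the schema, and the chosen indexes are only a starting point to be checked against your actual queries.

```python
# Hypothetical index plan: (collection, key specification, rationale).
INDEX_PLAN = [
    ("movies", [("year", 1)], "range filters and sorts on release year"),
    ("movies", [("genres", 1), ("rating", -1)], "best-rated movies within a genre"),
    ("movies", [("actors", 1)], "movies by actor (multikey index on an array)"),
    ("users", [("favorite_movies", 1)], "users who list a given favorite movie"),
]


def apply_index_plan(db, plan=INDEX_PLAN):
    """Create every index in the plan and return the index names reported back."""
    names = []
    for collection_name, keys, _rationale in plan:
        names.append(db[collection_name].create_index(keys))
    return names


# Usage, with `db` being a pymongo database handle:
# for name in apply_index_plan(db):
#     print("created", name)
```

Each extra index speeds up matching reads but adds work to every write, so indexes that no query uses are worth dropping.
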
4. Aggregation and Analysis:
○ Implement aggregation queries to perform the following analyses:
■ Total number of movies released per year.
■ Average rating of movies for each genre.
■ Top 5 actors by the number of movies they have acted in.
■ List of movies in which a specific actor has performed.
■ List of users who have a specific movie as their favorite.
■ List of movies favored by the most users.
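
The six analyses above could be expressed as aggregation pipelines along the following lines. This is a sketch under stated assumptions: movies carry year, rating, a genres array, and an actors array of names, while users carry name and a favorite_movies array of titles. If the datasets reference actors or movies by id instead, a $lookup stage would be needed.

```python
def movies_per_year():
    """Count movies per release year, oldest year first."""
    return [
        {"$group": {"_id": "$year", "count": {"$sum": 1}}},
        {"$sort": {"_id": 1}},
    ]


def avg_rating_per_genre():
    """Average rating per genre; $unwind yields one row per (movie, genre)."""
    return [
        {"$unwind": "$genres"},
        {"$group": {"_id": "$genres", "avg_rating": {"$avg": "$rating"}}},
        {"$sort": {"avg_rating": -1}},
    ]


def top_actors(limit=5):
    """Actors ranked by number of movies; ties broken by name."""
    return [
        {"$unwind": "$actors"},
        {"$group": {"_id": "$actors", "movie_count": {"$sum": 1}}},
        {"$sort": {"movie_count": -1, "_id": 1}},
        {"$limit": limit},
    ]


def movies_with_actor(actor_name):
    """Movies whose actors array contains the given name."""
    return [
        {"$match": {"actors": actor_name}},
        {"$project": {"_id": 0, "title": 1, "year": 1}},
    ]


def users_with_favorite(movie_title):
    """Users whose favorite_movies array contains the given title."""
    return [
        {"$match": {"favorite_movies": movie_title}},
        {"$project": {"_id": 0, "name": 1}},
    ]


def most_favorited(limit=5):
    """Movies ranked by how many users list them as a favorite."""
    return [
        {"$unwind": "$favorite_movies"},
        {"$group": {"_id": "$favorite_movies", "fans": {"$sum": 1}}},
        {"$sort": {"fans": -1, "_id": 1}},
        {"$limit": limit},
    ]


# Usage, with `db` being a pymongo database handle:
# rows = list(db["movies"].aggregate(movies_per_year()))
# fans = list(db["users"].aggregate(most_favorited()))
```
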
5. Plotting Data:
○ Use a data visualization tool (e.g., Python with Matplotlib, Seaborn, or any BI tool) to
create visualizations for the following:
■ Number of movies released per year.
■ Average rating of movies per genre.
■ Top 5 actors by the number of movies.
■ Top 5 movies favored by users.
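
For the plots, one option is a single bar-chart helper reused for all four charts. The sketch below assumes Matplotlib is installed and that each aggregation row has the shape {"_id": label, <value_key>: number}; the function names and output file names are illustrative.

```python
def to_series(rows, value_key):
    """Split aggregation rows into parallel lists of labels and values."""
    labels = [str(row["_id"]) for row in rows]
    values = [row[value_key] for row in rows]
    return labels, values


def bar_chart(rows, value_key, title, xlabel, ylabel, out_path):
    """Draw one bar chart from aggregation rows and save it to out_path."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    labels, values = to_series(rows, value_key)
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.bar(labels, values)
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.tick_params(axis="x", rotation=45)
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
    return out_path


# Usage, with `rows` being the result of a per-year count aggregation:
# bar_chart(rows, "count", "Movies released per year",
#           "Year", "Movies", "movies_per_year.png")
```
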
6. Balancing and Maintenance:
○ Monitor the performance of your MongoDB instance and adjust your indexing and
sharding strategy if necessary.
○ Document any adjustments made and the impact on system performance.
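
A simple way to check whether an index is actually being used is to inspect the output of explain(). The helper below is a sketch that walks the winning plan of an unsharded find query; the exact shape of explain output varies by server version and topology, so treat the key names as assumptions to verify on your instance.

```python
def plan_stages(plan):
    """Collect stage names from a winning plan, depth first."""
    stages = [plan.get("stage")]
    if "inputStage" in plan:
        stages += plan_stages(plan["inputStage"])
    for child in plan.get("inputStages", []):
        stages += plan_stages(child)
    return stages


def uses_index(explain_output):
    """Return True if the winning plan contains an index scan (IXSCAN)."""
    plan = explain_output["queryPlanner"]["winningPlan"]
    # Some server versions nest the readable plan under "queryPlan".
    plan = plan.get("queryPlan", plan)
    return "IXSCAN" in plan_stages(plan)


# Usage, with `db` being a pymongo database handle and `year` an assumed field:
# report = db["movies"].find({"year": 1999}).explain()
# print("index used:", uses_index(report))
```

Running the same check before and after creating an index gives a concrete before/after observation to include in the documentation of adjustments.
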
7. Reporting:
○ Create a report summarizing your project.
○ Include the following sections:
■ Introduction: Brief overview of the project and objectives.
■ Data Ingestion: Explanation of how you imported the data.
■ Indexing Strategy: Details of your indexing choices and their benefits.
■ Aggregation Queries: List and explanation of your aggregation queries.
■ Plotting Data: Description and examples of your visualizations.
■ Balancing and Maintenance: Summary of monitoring and adjustments made.
■ Conclusion: Insights gained from the project and potential improvements.

Additional Questions:

○ Discuss the concept of denormalization. Explain how you could denormalize the database
in this project to optimize query performance. Provide examples of what the
denormalized data might look like and the trade-offs involved (e.g., redundancy vs.
performance).
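
To make the denormalization question concrete, the sketch below turns a movie that references actors by id into one that embeds a small actor summary. All field names (actor_ids, actors, actor_id, name) are hypothetical and only illustrate the shape of the data.

```python
def embed_actor_summaries(movie, actors_by_id):
    """Return a copy of `movie` with actor ids replaced by embedded summaries."""
    embedded = dict(movie)  # shallow copy; the input document is left unchanged
    embedded["actors"] = [
        {"actor_id": actor_id, "name": actors_by_id[actor_id]["name"]}
        for actor_id in movie["actor_ids"]
    ]
    del embedded["actor_ids"]
    return embedded


# Normalized shape (placeholder data): the movie only stores references.
actors_by_id = {
    "a1": {"name": "Sample Actor One"},
    "a2": {"name": "Sample Actor Two"},
}
movie = {"title": "Sample Movie A", "year": 2001, "actor_ids": ["a1", "a2"]}

# Denormalized shape: actor names are copied into the movie document.
denormalized = embed_actor_summaries(movie, actors_by_id)
```

With the embedded form, listing a movie together with its cast takes a single read and no join. The cost is redundancy: an actor's name is now stored in every movie they appear in, so a rename has to update all of those documents.
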

Submission Guidelines:

● Submit your project as a single PDF file before 14 June. Send it by email: [email protected]

Good luck, and apply your knowledge creatively and effectively!
