Ekta Final
Internship Report
Submitted by
Ekta Kothari (202103103510343)
Bachelor of Technology
in
Computer Science and Engineering (CC)
at
Uka Tarsadia University
This is to certify that the project report entitled “Project Title” has been carried
out by Ms. Ekta Kothari, enrollment number 202103103510343, in partial fulfillment
of the degree of Bachelor of Technology in Computer Science and Engineering (CC)
at Asha M. Tarsadia Institute of Computer Science and Technology, to be awarded by
Uka Tarsadia University.
Date:
Place: AMTICS, Bardoli
Examiner’s Signature
ACKNOWLEDGEMENT
I have made an effort in this seminar work. However, it would not have been possible
without many individuals’ kind support and help. I would like to extend my sincere
thanks to all of them.
I am highly indebted to Santosh Saha for his guidance and constant supervision and
for providing necessary information regarding the seminar work.
I would like to express my gratitude to my parents and other family members for their
kind cooperation and encouragement, which helped me complete this project. My
thanks and appreciation also go to the people who have willingly helped me out with
their abilities.
Ekta Kothari
ABSTRACT
This project combines the strengths of Movie Mentor and IntellifyDev to deliver
intelligent, user-centric solutions powered by data-driven insights and advanced
machine learning techniques. Movie Mentor simplifies movie discovery by leveraging
the TMDB dataset and API to provide dynamic, visually enriched content such as movie
details, posters, and trailers. Using the scikit-learn library’s CountVectorizer, it
identifies patterns in metadata, offering personalized recommendations based on
storylines, genres, and styles. IntellifyDev, on the other hand, focuses on building
intelligent applications that integrate predictive modeling and advanced analytics.
With scalable machine learning models powered by tools like scikit-learn, it delivers
intuitive solutions for complex user needs. Together, these platforms showcase the
seamless integration of robust backend technologies and user-focused design to provide
innovative, data-backed recommendations and intelligent development capabilities for a
wide range of applications.
TABLE OF CONTENTS
CERTIFICATE
ACKNOWLEDGEMENT
ABSTRACT
LIST OF FIGURES
LIST OF ABBREVIATIONS
1 Introduction
1.1 Overview
1.2.2 Intellifydev
1.3 Scope
1.3.2 Intellifydev
2.1.2 Intellifydev
2.2 System modules
2.2.2 Intellifydev
2.3.2 Intellifydev
2.4.2 Intellifydev
3 System Design
3.1 Use case diagram
4.1.1 Platforms
4.2.1 Intellifydev
5 Conclusion and Future Work
5.1 Conclusion
LIST OF TABLES
LIST OF FIGURES
2.1 Timeline
LIST OF ABBREVIATIONS
AI: Artificial Intelligence
ANN: Artificial Neural Networks
BLEU: Bilingual Evaluation Understudy
CNN: Convolutional Neural Networks
GPU: Graphics Processing Unit
LSTM: Long Short-Term Memory
MNIST: Modified National Institute of Standards and Technology
NIST: National Institute of Standards and Technology
NN: Neural Network
NMT: Neural Machine Translation
RNN: Recurrent Neural Network
Chapter 1
Introduction
1.1 Overview
In today’s era of OTT platforms, audiences are often overwhelmed with choices, leaving
them asking essential questions: What to watch? Where to watch? How to watch?
While most platforms focus on offering options based on user data and viewing habits,
they often miss addressing a common desire—to find movies that capture the same
vibe, essence, or spirit as a favorite film. For instance, imagine wanting to watch a
movie similar to Avatar. Its groundbreaking visuals, immersive storytelling, and unique
themes set it apart, making it hard to find something that truly feels like it. Currently,
no platform specializes in helping users explore films that share such specific qualities
with their favorites. This inspired the creation of Movie Mentor, a platform designed
to bridge this gap and redefine the movie-selection process.
Movie Mentor simplifies the challenge of discovering movies by combining data-driven
technology with user-centric design. Using the TMDB dataset from Kaggle, we access
To power its core functionality, Movie Mentor employs machine learning techniques
using the scikit-learn library. Specifically, the CountVectorizer method is applied to
analyze patterns in the dataset, creating a robust model that identifies movies similar
to a given favorite. This ensures that users receive accurate and meaningful suggestions
based on storylines, genres, or stylistic elements they enjoy.
With Movie Mentor, planning your movie night becomes an engaging experience.
Whether you’re seeking films with similar themes, a comparable style, or an emotional
resonance to your favorite, Movie Mentor is here to ensure every movie night aligns
perfectly with your mood and preferences.
The second project I worked on, for Intellifydev, involved developing a dynamic and
informative website that showcased the company’s diverse IT services. The goal was
to create an engaging platform that effectively
communicated Intellifydev’s expertise in areas like web development, mobile app de-
velopment, SEO, and WordPress development. My responsibilities included designing
a clean, user-friendly interface where visitors could easily explore the company’s offer-
ings, learn about previous projects, and view client testimonials. The website featured
detailed service descriptions, interactive case studies, and a responsive contact form
for inquiries. Additionally, I implemented a content management system (CMS) to
allow the Intellifydev team to seamlessly update the content, ensuring the site remains
current. To enhance the user experience, I optimized the website’s performance,
ensuring fast load times and smooth navigation across different devices, including
mobile and desktop. Through this project, I contributed to building a platform that
not only represented Intellifydev’s brand but also highlighted its technical capabilities,
providing visitors with an informative and engaging experience.
The world of online streaming is brimming with countless choices, leaving users
overwhelmed when trying to select movies that match their preferences. While
existing OTT platforms provide suggestions based on user profiles and viewing history,
they often fail to cater to the specific need for finding movies that resemble the essence,
themes, or style of a particular favorite film. This gap in personalized discovery raises
questions like: How do you find movies with the same vibe as your favorites?
Creating a solution to address this gap involves a set of complex challenges. First,
working with datasets presents significant hurdles. Raw data is often vast and
unstructured, requiring substantial effort to clean, process, and transform it into
meaningful information. The transition from raw data to actionable insights is not
straightforward and demands careful analysis and preparation. Another significant
challenge is integrating this processed data with a functional and intuitive user
interface. Developing a seamless connection between the frontend and backend
systems is particularly tricky when using machine learning models saved in formats
like Pickle. Ensuring compatibility between these components while maintaining
responsiveness and usability adds to the complexity.
Moreover, extracting useful insights from the TMDB dataset requires machine learning
techniques. Building models that accurately identify similarities between movies calls
for text-vectorization methods such as CountVectorizer, and fine-tuning these models
to align with user expectations adds another layer of difficulty.
These challenges collectively highlight the intricate nature of designing a system that
bridges the gap in movie discovery while meeting technical and user-focused require-
ments.
1.2.2 Intellifydev
The primary challenge in developing the Intellifydev website was to create an efficient,
user-friendly platform that effectively communicates the company’s expertise in a range
of IT services.
• One of the challenges was to ensure that the website was responsive and visually
appealing across multiple devices. Tailwind CSS was chosen to ensure quick and
scalable styling that aligns with the modern design needs of an IT company. It
also had to include a contact form that functions seamlessly with the backend to
send user inquiries or messages directly to the company’s email, making it simple
for clients to get in touch.
• The site required a PHP backend to handle form submissions and dynamically
update content as necessary. Securing the form against common threats such as
spam submissions and SQL injection was a significant challenge. Moreover, the
backend needed to handle data securely while also providing a smooth user
experience.
• Another problem was to ensure that the site could scale efficiently as the company
grew, adding new services or features in the future without requiring a complete
redesign or overhaul. Additionally, optimizing the site for speed and performance
was a priority to guarantee fast load times, reducing bounce rates and improving
overall user experience.
• Lastly, the website needed to function as a professional tool for attracting po-
tential clients, offering clear navigation, showcasing the company’s services, and
enabling easy access to contact information. This required a careful balance
between aesthetic design, functional backend features, and ease of use for clients.
1.3 Scope
1. User-Centric Features:
• Allow users to search for movies based on their preferences or a reference movie.
• Provide detailed information about movies, including genre, cast, synopsis, and
visuals.
• Enhance user experience through an intuitive and responsive interface that sim-
plifies navigation.
2. Data Integration:
• Utilize the TMDB dataset from Kaggle and the TMDB API to fetch comprehen-
sive movie details, including metadata and imagery.
• Process and transform raw data into a structured format suitable for machine
learning analysis.
• Continuously refine the model for accuracy and relevance based on user feedback
and performance.
4. Technical Implementation:
• Develop a robust backend system to handle data processing and model integra-
tion.
• Explore the potential for integrating other media types, such as TV shows or
documentaries.
1.3.2 Intellifydev
• Service Display and Information Architecture: The website serves as an online
portfolio and information hub for Intellifydev, detailing the company’s expertise
• SEO: The website is optimized for search engines to improve its visibility and
ranking. This includes SEO best practices such as keyword-rich content,
metadata, and image alt text to enhance discoverability and drive organic traffic.
• Security and Performance: The scope also includes security measures to protect
form submissions and other user data, making the website resilient to common
web threats such as spam and SQL injection. Performance optimization is also a
critical component, ensuring fast load times and smooth navigation.
• Scalability: The site’s structure allows for future scalability, enabling the addition
of new pages, services, or features without disrupting the overall user experience
or functionality.
Chapter 2
The development of this project followed a structured and iterative approach, focusing
on creating a seamless system for movie discovery. Below is an overview of the steps
taken during the project’s development:
The first step involved identifying a suitable dataset containing comprehensive movie
details. The TMDB dataset from Kaggle was chosen for its richness in attributes
such as genres, cast, synopsis, and visuals. However, the raw data required significant
preprocessing. Using pandas, the CSV files were read and reshaped to extract only the
relevant columns, such as movie titles, overviews, and genres. Further, the raw data
was cleaned, structured, and transformed into a usable format.
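The column selection described above can be sketched as follows. This is a minimal illustration that uses only Python's standard csv module as a stand-in for pandas; the sample rows, the KEEP column list, and the load_movies helper are hypothetical and are not taken from the project's code. With pandas, the same selection is typically written with pandas.read_csv and its usecols parameter.

```python
import csv
import io

# Hypothetical sample standing in for the TMDB CSV; real files have many more columns.
SAMPLE_CSV = """title,overview,genres,budget
Avatar,A marine is sent to the moon Pandora.,Action Adventure,237000000
Up,An old man flies his house to South America.,Animation Comedy,175000000
Broken,,Drama,0
"""

KEEP = ("title", "overview", "genres")  # assumed set of relevant columns


def load_movies(csv_text):
    """Keep only the relevant columns and drop rows with missing text."""
    movies = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = {col: (row.get(col) or "").strip() for col in KEEP}
        if not all(record.values()):
            continue  # skip incomplete rows instead of guessing values
        # Merge the text fields into one lowercase string for later vectorization.
        record["tags"] = f"{record['overview']} {record['genres']}".lower()
        movies.append(record)
    return movies


movies = load_movies(SAMPLE_CSV)
```

Dropping incomplete rows is one possible cleaning policy; the report does not state which policy the project actually used.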
After shaping the data, the next step was to convert textual information into a machine-
readable format. This was achieved using the CountVectorizer method from the scikit-
learn library. The vectorization process transformed the movie overviews and genres
into numerical vectors, enabling the system to compute similarities effectively. This
step was crucial for building a model capable of identifying the top five movies that
most closely match a given movie.
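The vectorization and ranking step can be sketched in a few lines. The report names CountVectorizer but does not state which similarity measure was used, so cosine similarity is an assumption here. The sketch also hand-rolls the bag-of-words counts with the standard library instead of calling scikit-learn, and the CATALOGUE entries and function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical mini-catalogue: title -> combined overview/genre text.
CATALOGUE = {
    "Space War": "alien planet war space soldier",
    "Star Quest": "space alien planet explorer",
    "Love Story": "romance paris love couple",
    "City Love": "romance city love drama",
}


def count_vector(text):
    """Bag-of-words term counts, the same idea CountVectorizer implements."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(title, catalogue, top_n=5):
    """Return up to top_n other titles, most similar first."""
    vectors = {name: count_vector(text) for name, text in catalogue.items()}
    query = vectors[title]
    scores = [(cosine(query, vec), name) for name, vec in vectors.items() if name != title]
    # Sort by descending score; break ties alphabetically for stable output.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scores[:top_n]]


ranking = most_similar("Space War", CATALOGUE)
```

Here "Star Quest" ranks first for "Space War" because the two descriptions share three of their terms, while the romance titles share none.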
CHAPTER 2. SYSTEM PLANNING ARCHITECTURE PLANNING 9
3. Model Serialization
Once the vectorized data and similarity computation were complete, the entire setup,
including the processed movie database and the vectorization model, was serialized
into a PKL (Pickle) file. This format ensured the reusability of the model and data for
frontend integration without requiring real-time recomputation.
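The serialization step can be illustrated with Python's standard pickle module. This is a sketch under assumptions: the structure of the artifacts dictionary, the helper names, and the file name movie_mentor.pkl are hypothetical, since the report does not show the actual objects that were saved.

```python
import os
import pickle
import tempfile

# Hypothetical artifacts: the processed movie table and a precomputed similarity lookup.
artifacts = {
    "movies": [{"title": "Space War"}, {"title": "Star Quest"}],
    "similar": {"Space War": ["Star Quest"], "Star Quest": ["Space War"]},
}


def save_artifacts(obj, path):
    """Serialize the artifacts once, after preprocessing and similarity computation."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_artifacts(path):
    """Load the artifacts in the frontend. Only unpickle files you created yourself."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    pkl_path = os.path.join(tmp, "movie_mentor.pkl")  # hypothetical file name
    save_artifacts(artifacts, pkl_path)
    restored = load_artifacts(pkl_path)
```

Because unpickling can execute arbitrary code, a Pickle file should only be loaded when it comes from a trusted source, such as the project's own preprocessing run.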
4. Frontend Integration
The frontend was developed using Streamlit, chosen for its simplicity and effectiveness
in creating interactive web applications. The PKL file was loaded into the frontend,
where it facilitated efficient data retrieval and user interaction. The interface was
designed to be intuitive, allowing users to input a movie name and instantly receive a
list of similar movies.
5. Testing
Throughout the development, rigorous testing was conducted to ensure data accuracy,
model reliability, and frontend responsiveness. Multiple iterations were performed
to refine the preprocessing pipeline, improve model performance, and enhance user
experience.
This structured approach not only ensured the successful completion of the project but
also laid the foundation for scalability and future enhancements, such as incorporating
additional datasets or expanding features for TV shows and documentaries. The result
is a cohesive system that efficiently bridges the gap in movie discovery for users.
2.1.2 Intellifydev
The development approach for the Intellifydev website followed a structured, systematic
methodology to ensure efficient delivery and high-quality outcomes. The process was
divided into several phases:
Stakeholder Collaboration:
1. Frontend Development: The website was built using HTML, CSS, and JavaScript
to implement the design and ensure a smooth user experience. Tailwind CSS was
used to style the website with a responsive, mobile-first approach, allowing the
site to adjust to various screen sizes and devices. JavaScript was incorporated
for interactive elements, such as form validation and dynamic content loading,
enhancing user engagement.
2. Backend Development: The backend of the website was developed using PHP,
handling functionality such as form submissions from the Contact Us page. The
backend securely processed messages and sent them to the company’s email.
Additionally, PHP was used to provide content management capabilities,
allowing the website to be updated without direct code changes.
3. Performance Testing: Load times and site efficiency were measured with tools
like Google PageSpeed Insights, and the site was optimized based on the results.
5. User Acceptance Testing (UAT): The website was reviewed by the client to
confirm it met all expectations before launch.
Purpose: To gather, clean, and preprocess the dataset for further use.
Components:
• Extract key features such as movie titles, genres, overviews, and cast information.
• Transform the raw data into a structured format suitable for machine learning.
Purpose: To convert textual data into a machine-readable format and compute simi-
larities.
Components:
• Store the top 5 most similar movies for each entry in a retrievable format.
Purpose: To save processed data and vectorized models for reuse in the frontend.
Components:
• Serialize the processed movie database and vectorization model into a PKL
(Pickle) file.
• Ensure efficient and secure storage of serialized data for easy retrieval.
• Develop the frontend using Streamlit to create a lightweight and interactive web
application.
• Display the top 5 most similar movies, along with details such as genres,
overviews, and posters, retrieved from the backend.
Purpose: To manage data retrieval and model interaction between the backend and
frontend.
Components:
• Handle user queries and fetch the corresponding results using precomputed sim-
ilarity metrics.
Purpose: To deploy the system and ensure it can handle future enhancements.
Components:
• Deploy the Streamlit app to a web server or cloud platform for public access.
These modules collectively ensure that the system operates efficiently, delivering accu-
rate and meaningful results while providing a smooth and interactive user experience.
2.2.2 Intellifydev
• Home Page Module: Displays an overview of the company and quick access
to services, portfolio, and contact information.
• About Us Module: Provides details about the company’s mission and values
to connect with users and establish trust.
The system must allow users to input a movie name through a simple search interface
on the frontend.
• The system should accept movie names in text format and provide feedback if
the movie is not found in the database.
• Upon receiving a movie name, the system must calculate and return the top 5
most similar movies based on a pre-trained machine learning model.
• The system must compute similarity efficiently and return results within a rea-
sonable time frame.
For each of the top 5 similar movies, the system must display detailed information,
including:
• Movie title
• Genre(s)
• Overview or synopsis
The movie details should be retrieved from the backend (via the serialized data) and
presented in a visually appealing format in the frontend.
• The system must be able to retrieve and process data from the serialized Pickle
file, which stores the preprocessed movie data and the trained machine learning
model.
• The backend should handle the retrieval of movie data based on user queries and
ensure that results are accurate.
• The system must support real-time responses for user queries, ensuring minimal
delay in returning the top similar movies.
• Any query made by the user should trigger a quick and responsive backend call,
and the results should be displayed promptly on the frontend.
• The system must support serializing the preprocessed dataset and machine learn-
ing models into Pickle files.
• The data and model must be securely stored and retrievable for future use, re-
ducing the need for real-time computation.
• The system must provide a clean and interactive frontend, developed using
Streamlit, to display the movie search functionality and results.
• The frontend must support easy navigation, including a search bar, result display
area, and buttons for initiating searches.
8. Error Handling
• The system must handle errors gracefully, providing informative messages in case
of failures, such as when no similar movies are found or if an invalid movie name
is entered.
• The frontend should display error messages in a user-friendly manner and allow
users to try new queries.
9. System Performance
• The system should be optimized for speed and performance, ensuring that the
movie similarity calculations and data retrieval do not cause delays.
These functional requirements ensure that Movie Mentor operates efficiently, providing
users with relevant and timely movie suggestions based on their preferences, while
offering a smooth and responsive user experience.
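The search and error-handling requirements above can be sketched as a single lookup function that returns either results or a user-facing message. The SIMILAR table, the recommend function, and the message wording are hypothetical, and case-insensitive matching is an assumption, not a documented behavior of Movie Mentor.

```python
# Hypothetical precomputed lookup: title -> list of similar titles.
SIMILAR = {
    "Avatar": ["Star Quest", "Space War"],
    "Up": [],
}


def recommend(query, similar=SIMILAR):
    """Return (results, message); message is None when results are available."""
    name = query.strip()
    if not name:
        return [], "Please enter a movie name."
    # Case-insensitive matching is an assumption made for this sketch.
    by_lower = {title.lower(): title for title in similar}
    title = by_lower.get(name.lower())
    if title is None:
        return [], f"'{name}' was not found in the database. Try another title."
    if not similar[title]:
        return [], f"No similar movies were found for '{title}'."
    return similar[title], None
```

Returning a message instead of raising an exception keeps the frontend code simple: it can display the message when the result list is empty and leave the search box ready for a new query.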
2.3.2 Intellifydev
• Homepage: Displays services and navigation to other sections.
• Contact Us Page: Includes a functional form with validation and backend pro-
cessing via PHP.
• Responsive Design: Adapts to all screen sizes and ensures cross-browser compat-
ibility.
1. Performance
• The system must provide quick responses to user queries, with minimal latency
in retrieving and displaying results.
• The similarity calculation and movie data retrieval should not take more than a
few seconds to ensure a smooth user experience.
2. Scalability
3. Availability
• The system must be available 24/7 so that the movie search and recommendation
functionality is always accessible.
4. Security
• The system must ensure the security of user data and prevent unauthorized
access.
• The system should comply with relevant data privacy and security regulations.
5. Usability
• The frontend user interface should be simple, intuitive, and easy to navigate,
with clear instructions for users.
• The system should provide helpful feedback for invalid inputs, such as when no
similar movies are found.
• The system should be designed for a wide range of users, from casual moviegoers
to tech-savvy individuals, with no steep learning curve.
6. Maintainability
• The system must be easy to maintain and update, with a modular code structure
that allows for quick fixes, updates, and enhancements.
• The system should follow best practices for code organization and version control
to ensure long-term maintainability.
7. Compatibility
• The frontend application should be compatible with all major browsers (Chrome,
Firefox, Safari, etc.) and work seamlessly on both desktop and mobile devices.
8. Responsiveness
• The user interface should be responsive and adaptable to different screen sizes,
providing a consistent experience across desktop, tablet, and mobile devices.
• The layout should adjust dynamically based on the user’s device to ensure the
movie search and results display are fully accessible on any platform.
9. Reliability
• The system should be reliable, with minimal bugs or crashes. It should provide
accurate results based on the data and return appropriate feedback when errors
occur.
• Error handling mechanisms should be in place to ensure that the system recovers
gracefully from unexpected failures.
These nonfunctional requirements ensure that Movie Mentor not only meets functional
expectations but also delivers a high-quality user experience, is easy to maintain and
scale, and operates securely and efficiently.
2.4.2 Intellifydev
• Performance: The website must load within 3 seconds on standard internet con-
nections. Optimize resources, such as images and scripts, for fast load times.
• Scalability: The design and backend should support future expansion, including
adding new pages or features without significant rework.
• Security: Protect user data submitted through the Contact Us form using input
sanitization and secure transmission protocols. Prevent vulnerabilities such as
SQL injection and XSS attacks.
• Maintainability: Use clean, modular code for easier updates and troubleshooting.
Document the website’s architecture for smoother handover and future mainte-
nance.
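The security requirement above (input validation, protection against SQL injection and XSS) can be illustrated with a short sketch. The Intellifydev site itself uses PHP and MySQL; this example uses Python with an in-memory SQLite database purely to show the same ideas, and the table layout, function names, and sample values are hypothetical.

```python
import html
import re
import sqlite3

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately simple check


def validate(name, email, message):
    """Return a list of validation errors; an empty list means the input is acceptable."""
    errors = []
    if not name.strip():
        errors.append("name is required")
    if not EMAIL_RE.match(email.strip()):
        errors.append("email is invalid")
    if not message.strip():
        errors.append("message is required")
    return errors


def store_inquiry(conn, name, email, message):
    """Insert with placeholders so user text is never spliced into the SQL string."""
    conn.execute(
        "INSERT INTO inquiries (name, email, message) VALUES (?, ?, ?)",
        (name, email, message),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inquiries (name TEXT, email TEXT, message TEXT)")

hostile = "x'); DROP TABLE inquiries; --"  # typical injection attempt
if not validate("Test User", "user@example.com", hostile):
    store_inquiry(conn, "Test User", "user@example.com", hostile)

stored = conn.execute("SELECT message FROM inquiries").fetchone()[0]
safe_html = html.escape("<script>alert(1)</script>")  # escape before rendering
```

With placeholders, the hostile text is stored as ordinary data and the table survives. In PHP, the equivalent approach is a prepared statement with bound parameters plus output escaping when submitted text is displayed.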
Chapter 3
System Design
4.1.1 Platforms
Programming Languages:
• HTML, CSS, JavaScript: Commonly used in both projects for building interactive
and visually appealing user interfaces.
• PHP: Employed in the Intellifydev website for server-side scripting and managing
backend operations.
• Flask: Used in Movie Mentor to integrate backend logic with the web frontend
seamlessly.
• Tailwind CSS: Integrated into the Intellifydev website for a responsive, modern
design.
• JavaScript Libraries: Used across both projects to enhance interactivity and user
experiences.
CHAPTER 4. IMPLEMENTATION AND TESTING 34
• TMDB API: Facilitated dynamic content fetching such as movie posters, details,
and images in Movie Mentor.
• MySQL: Used in the Intellifydev website for managing and storing dynamic con-
tent like client inquiries and service data.
• CMS Integration: Implemented in the Intellifydev website for easy content up-
dates by the company’s team.
• Both projects are optimized for seamless usage across various devices, including
desktops, laptops, and mobile platforms with modern web browsers.
Core Functionalities:
4.2.1 Intellifydev
User Registration
Movie Search
Movie Details
• Open a movie and check for title, genre, synopsis, poster, and images.
Recommendations
Watchlist
User Feedback
Performance
Logout
• Verify user logs out successfully and is redirected to the login screen.
5.1 Conclusion
In conclusion, the Movie Mentor platform offers a novel solution to the problem of
discovering movies that share a similar essence, theme, or vibe to user favorites. By
leveraging data from large movie databases like TMDB and employing advanced ma-
chine learning techniques, the platform allows users to make more personalized movie
selections, enhancing their streaming experience. While the project overcame signifi-
cant challenges, such as handling vast and unstructured datasets, integrating machine
learning models with an intuitive user interface, and fine-tuning the similarity algo-
rithms, it has successfully provided a step forward in personalized movie discovery.
The system’s ability to offer recommendations based on shared characteristics, rather
than merely on genre or viewing history, sets it apart from traditional OTT platforms.