Report Movie Recommendation
Recommendation systems are predictive systems that suggest items to users, users to items, and sometimes users to other users. Tech giants such as YouTube, Amazon Prime, and Netflix use similar methods to recommend video content matched to a viewer's interests. Because the internet holds enormous amounts of data, finding relevant content is difficult and time-consuming, so recommendation plays an important role in minimizing that effort. These systems are increasingly popular in areas such as books, videos, music, movies, and social network sites, where recommendation is used to filter information. A recommender is a tool that uses a user's information to improve suggestions and surface the most preferred choices. User/customer satisfaction is key when building such a tool. It is beneficial for both customers and companies: the more satisfied customers are, the more likely they are to keep using the system, which ultimately generates revenue for the company. Although many algorithms exist, collaborative filtering is the most popular among companies because it makes greater use of user interactions. Collaborative filtering can often predict better than content-based filtering because it analyses a user's history, compares it with that of other users, and then suggests results, whereas content-based filtering takes the user's information as input to find similar movies and recommends them in descending order of similarity.
Our project aims to implement a recommendation system that responds to the user with movie recommendations. The ultimate purpose of the Movie Recommendation System is to improve the user's experience by recommending movies. The system uses a collaborative filtering approach: recommendations are filtered based on the preferences of similar users. Collaborative Filtering (CF) predicts a user's preference for an item from the known ratings that users have given to items.
Recommender systems are information filtering systems that help deal with the problem of information overload by filtering and segregating large amounts of dynamically generated information according to a user's preferences, interests, or observed behaviour towards particular items. A recommender system can predict whether a particular user would prefer an item or not, based on the user's profile and historical information. Recommendation systems have also been shown to improve decision-making processes and quality. In large e-commerce settings, recommender systems enhance marketing revenue because they are an effective means of selling more products. In scientific libraries, recommender systems allow users to move beyond generic catalogue searches. Therefore, the need for efficient and accurate recommendation techniques within a system that provides relevant and dependable recommendations for users cannot be neglected.
System Analysis is the process of gathering and interpreting facts, diagnosing the problems
and using the information to recommend improvements. System study is a general term that
refers to an orderly, structured process for identifying and solving problems. The first phase
of software development is system study. The importance of system study phase is the
establishment of the requirements for the system to be acquired, developed and installed.
Analysing the project to understand the complexity forms the vital part of the system study.
Problematic areas are identified and information is collected. Fact finding or gathering is
essential to any analysis of requirements. It is also essential that the analyst becomes familiar with the objectives, activities and functions of the organization in which the system is to be implemented. In system study, a detailed study of the operations performed by a system
and their relationships within and outside the system is done. A key question considered here
is, “What must be done to solve the problem?” One aspect of system study is defining the
boundaries of the application and determining whether or not the candidate application
should be considered.
The proposed recommendation system uses the collaborative filtering technique, which is more accurate and more efficient to use. Collaborative filtering provides several advantages over content-based filtering:
1. No need to understand item content: the content of an item does not necessarily tell the whole story (movie type/genre, and so on).
2. No item cold-start problem: even when no information on an item is available, we can still predict its rating without waiting for a user to purchase it.
3. Captures the change in user interests over time: focusing solely on content gives no flexibility regarding the user's perspective and preferences.
4. Captures inherent subtle characteristics: this is especially true for latent factor models.
A feasibility analysis evaluates the candidate systems and determines the best system that meets the performance requirements. The purpose of a feasibility study is to investigate the present system, evaluate the possible application of computer-based methods, select a tentative system, evaluate the cost and effectiveness of the proposed system, evaluate the impact of the proposed system on the existing system, and ascertain the need for a new system. The feasibility study is carried out to see if the system is technically, economically and operationally feasible.
All projects are feasible when given unlimited resources and infinite time. It is therefore both necessary and prudent to evaluate the feasibility of the project at the earliest possible time. An estimate is made of whether the identified user needs can be satisfied using current hardware and software technologies. The study decides whether the proposed system will be cost effective from the business point of view and whether it can be developed within the existing budgetary constraints.
The objective of a feasibility study is to test the technical, social and economic feasibility of
developing a computer system. This is done by investigating the existing system and
generating ideas about a new system. The computer system must be evaluated from a technical viewpoint first and, if technically feasible, its impact on the organization and the staff must be assessed. If a compatible social and technical system can be devised, it must then be tested for economic feasibility.
As part of this, the performance and cost effectiveness of each candidate system are determined and evaluated.
Operational feasibility is concerned with human, organizational and political aspects. The issues considered are the job changes that will be brought about, the organizational structures that will be disturbed and the new skills that will be required. Methods of processing and presentation are designed according to the needs of clients, so all user requirements can be met. The proposed system will not cause problems under any circumstances and will work according to the specifications mentioned. Hence the proposed system is operationally feasible. People are inherently resistant to change, and computers have been known to facilitate change. System operation is the longest phase in the development life cycle of a system, so operational feasibility should be given much importance. This system has a user-friendly interface and is therefore easy to handle.
Technical feasibility is the most important of all types of feasibility analysis. It deals with hardware as well as software requirements: an outline design of the system requirements in terms of input/output, files and procedures is drawn up, and the types of hardware, software and methods required for running the system are analysed. All technical issues related to the proposed system are dealt with during this stage of the preliminary investigation, which produced the following results: considering the problems of the existing system, it is sufficient to implement the new system, and the proposed system can be implemented to solve the issues in the existing system. The assessment of technical feasibility must be based on the outline of the system requirements in terms of inputs, outputs, files, programs and procedures. This can be quantified in terms of volumes of data, trends, frequency of updating, etc.
Economic analysis is the most frequently used method for evaluating the effectiveness of
software, more commonly known as the cost/benefit analysis. The procedure is to determine
the benefits and savings that are expected from a candidate system and compare them with its costs. If the benefits outweigh the costs, the decision is made to design and implement the system; otherwise further alternatives have to be considered. Here it is seen that no new hardware or software is needed for the development of the system.
Behavioural feasibility determines how much effort will go into educating, selling to and training the users of the candidate system. People are inherently resistant to change, and computers have been known to facilitate change. Since the system is user friendly, user training is a very easy matter.
Legal feasibility is the determination of any infringement, violation, or liability that could result from the development of the system. The legal environment encompasses a broad range of concerns, including contract and liability. The proposed project is also legally feasible.
Artificial Narrow Intelligence (ANI) systems are designed to solve a single problem and can execute a single task really well. By definition, they have narrow capabilities, such as recommending a product to an e-commerce user or predicting the weather. This is the only kind of Artificial Intelligence that exists today. ANI systems can come close to human performance in very specific contexts, and even surpass it in many instances, but they excel only in very controlled environments with a limited set of parameters.
Artificial General Intelligence (AGI) is still a theoretical concept. It is defined as AI with a human level of cognitive function across a wide variety of domains such as language processing, image processing, computational functioning, reasoning and so on. An AGI system would need to comprise thousands of Artificial Narrow Intelligence systems working in tandem, communicating with each other to mimic human reasoning.
Artificial Super Intelligence (ASI) is seen as the logical progression from AGI. An ASI system would surpass all human capabilities, including rational decision making, and would even extend to things like making better art and building emotional relationships.
In supervised learning, the machine is taught by example. The operator provides the machine learning algorithm with a known dataset that includes desired inputs and outputs, and the algorithm must find a method to determine how to arrive at those outputs from the inputs. While the operator knows the correct answers to the problem, the algorithm identifies patterns in the data, learns from observations and makes predictions. The algorithm's predictions are corrected by the operator, and this process continues until the algorithm achieves a high level of accuracy/performance. Under the umbrella of supervised learning fall classification, regression and forecasting.
1. Classification: In classification tasks, the machine learning program must draw a conclusion from observed values and determine to which category new observations belong.
2. Regression: In regression tasks, the machine learning program must estimate – and
understand – the relationships among variables. Regression analysis focuses on one
dependent variable and a series of other changing variables – making it particularly useful
for prediction and forecasting.
3. Forecasting: Forecasting is the process of making predictions about the future based
on the past and present data, and is commonly used to analyse trends.
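The regression and forecasting tasks above can be illustrated with a tiny least-squares line fit. The data points below are made up for illustration:

```python
# Minimal least-squares linear regression, illustrating the supervised
# "regression" task described above: fit y = a*x + b from example pairs.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope a = cov(x, y) / var(x); intercept b = mean_y - a * mean_x.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# Toy "forecasting" use: observations at times 1..4, predict time 5.
a, b = fit_line([1, 2, 3, 4], [2.1, 4.0, 6.2, 7.9])
forecast = a * 5 + b
```

Fitting on past observations and evaluating the line at a future time step is exactly the forecasting pattern described above, just on made-up numbers.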
Unsupervised learning is the type of machine learning where there are no defined or labelled classes and the algorithm draws inferences from the datasets by itself. Unsupervised learning studies how systems can infer a function to describe hidden structure from unlabelled data. Under the umbrella of unsupervised learning fall clustering and dimensionality reduction.
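As a sketch of the clustering family of unsupervised methods, here is a minimal 1-D k-means on made-up, unlabelled points; the algorithm infers the two groups by itself:

```python
# A minimal k-means sketch (clustering, one family of unsupervised
# learning): group 1-D points around k centroids with no labels given.
def kmeans_1d(points, centroids, iters=10):
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

# Two obvious groups around 1 and 9; start the centroids far away.
centroids = kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 8.5], [0.0, 10.0])
```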
Deep learning models are capable of learning to focus on the right features by themselves, requiring little guidance from the programmer. Basically, deep learning mimics the way our brain functions, i.e. it learns from experience. Our brain is made up of billions of neurons that allow us to do amazing things, and it has subconsciously trained itself to do them over the years. How, then, does deep learning mimic the functionality of the brain? Deep learning uses the concept of artificial neurons that function in a similar manner to the biological neurons present in our brain. We can therefore say that deep learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain, called artificial neural networks.
The main objective of the system is to provide the best user experience. Companies therefore strive to connect users with the most relevant items according to their past behaviour and get them hooked on their content. The recommender system suggests which text should be read next, which movie should be watched, and which product should be bought, creating a stickiness factor for any product or service. Its algorithms are designed to predict a user's interest, suggest different products in many different ways, and retain that interest till the end. Needless to say, we see this system implemented in our daily lives. Many online sellers implement recommender systems to generate sales through machine learning (ML), and many retail companies generate a high volume of sales by adopting this system on their websites. Pioneering organizations such as Netflix and Amazon have introduced their own recommendation algorithms to hook their customers. Before diving into the in-depth mechanics, it is necessary to know that this system removes useless and redundant information: it intelligently filters information before showing it to end users. To understand the recommender system better, one must know that there are three approaches to it:
1. Content-based filtering
2. Collaborative filtering
3. Hybrid model
2.8.1. Content-based filtering
Content-based filtering relies on product features rather than user feedback or interaction. It is a machine learning technique used to decide outcomes based on product similarities. Content-based filtering algorithms are designed to recommend products based on accumulated knowledge about users. The technique is all about comparing user interests with product features, so it is essential to supply significant product features to the system; selecting each buyer's favourite features should be a first priority when designing such a system. Two strategies can be applied in combination. First, a list of features is provided to the user, who selects the most interesting ones. Second, the algorithm keeps a record of all the products chosen by the user in the past, making up the customer's behavioural data. The buyer's profile revolves around the buyer's choices, tastes and preferences and shapes the buyer's ratings. It includes how many times a buyer clicks on interesting products or how many times they add those products to wishlists.
Content-based filtering rests on the resemblance between items: the proximity and similarity of products are measured from similar item content, which includes genre, item category, and so on. Take the example of recommender systems for movies. Suppose there are four movies and the user starts off liking only two of them. Still, the 3rd movie is similar to the 1st movie in terms of genre, so the system will automatically suggest the 3rd movie. Such suggestions are generated automatically by a content-based recommender system based on the similarity of content.
Just imagine the power of content-based recommender systems, and the possibilities are
endless. For example, when we have a drama film that the user has not seen or liked before,
this genre will be excluded from their profile altogether. Therefore, a user only gets their
recommendation of the genre that is already existing in their profile. The system would never
suggest any movie out of their genres to present the best user experience.
Let's get back to the movie example. Imagine that there are only six movies in the data set and, for the sake of clarity, that the user has seen all six. A genre is assigned to each movie, i.e. superhero, adventure, comedy or sci-fi, with each movie carrying one genre or a combination of genres. Now suppose the user has rated three of the movies, giving 2 out of 10 to the 1st movie, 10 to the 2nd movie, and 8 out of 10 to the 3rd movie. From these ratings, the recommender system builds a user profile and then recommends the best-suited movie according to its calculations. A content-based filtering system does not require information about other buyers, since each suggestion is specific to one buyer, which makes it easier to scale to many buyers. The system captures the user's interests and can suggest items that few other buyers use.
Collaborative filtering needs a set of items together with the users' historical choices. This system does not require a large set of product features to work. An embedding, or feature vector, describes each item and each user, and it places both the items and the users in the same embedding space. The system creates these embeddings for items and users on its own.
Other purchasers' reactions are taken into consideration while suggesting a specific product to the primary user. The system keeps track of the behaviour of all users to establish which items are most liked, and it relates similar users by their similarity in preference and behaviour towards similar products when proposing a product to the primary customer. Two sources are used to record a user's interaction with a product. First, through implicit feedback, likes and dislikes are inferred from actions such as clicks, music tracks listened to, searches, purchase records, page views, etc. On the other hand, explicit feedback is when a customer states likes or dislikes by rating or reacting to a specific product, for example on a scale of 1 to 5 stars. This is direct feedback from users about the product and includes both positive and negative feedback.
Collaborative filtering is the most famous suggestion-engine approach and is based on the calculated guess that people who liked a product will enjoy similar products in the future. One variant of this algorithm is user-based collaborative filtering: users, rather than items, are compared and associated with each other. In this system only users' behaviour is considered; their content and profile information alone is not enough. A user giving a positive rating to a product is associated with other users whose behaviour produced similar ratings. The main idea behind this approach is to suggest new items based on the closeness in behaviour of similar customers. If you plan to watch a new movie, you will generally ask your friends and seek their recommendations. This is based on the premise that users trust their friends, confident that their friends know their taste in movies. We therefore usually follow and watch whatever is recommended by a good friend with a similar taste.
Thus collaborative filtering focuses on relationships between items and users; in the item-based variant, the similarity of two items is determined by the ratings given by customers who rated both items.
The two kinds of similarity calculation can also be aggregated. This is called the hybrid approach, in which both methods are used together to produce the results. Compared with the individual approaches, a hybrid system achieves higher suggestion accuracy, mainly because collaborative filtering lacks knowledge of the domain's dependencies while a content-based system has only a narrow view of the user's interests.
Memory-based CF is a method that calculates the similarity between users or items from the users' previous rating data. The main objective of this method is to quantify the degree of resemblance between users or items and to use homogeneous ratings to suggest unseen items. Memory-based CF consists of the following two methods:
In the user-based method, users who have given similar rankings to the same items are identified first. The target user's rating for an item the user has never interacted with is then predicted from those similar users' ratings.
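A minimal sketch of this user-based approach, under made-up ratings and a simple inverse-distance similarity standing in for cosine or Pearson similarity:

```python
# User-based collaborative filtering sketch: find users whose ratings are
# close to the target user's, then predict an unseen item's rating from
# those neighbours. The ratings dictionary is made-up illustrative data.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 5, "m2": 3, "m3": 4, "m4": 4},
    "carol": {"m1": 1, "m2": 5, "m4": 1},
}

def similarity(u, v):
    # Inverse-distance similarity over co-rated items (a simple stand-in
    # for the cosine or Pearson measures discussed later).
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dist = sum((ratings[u][m] - ratings[v][m]) ** 2 for m in common) ** 0.5
    return 1.0 / (1.0 + dist)

def predict(user, movie):
    # Weighted average of the neighbours' ratings for the movie.
    neighbours = [(similarity(user, v), ratings[v][movie])
                  for v in ratings if v != user and movie in ratings[v]]
    total = sum(s for s, _ in neighbours)
    return sum(s * r for s, r in neighbours) / total if total else None

# alice has not seen m4; bob (very similar) rated it 4, carol (dissimilar) 1.
prediction = predict("alice", "m4")
```

Because bob's ratings match alice's exactly on the co-rated movies, his opinion of m4 dominates the prediction.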
In item-based CF, we find items similar to those the target user has already viewed.
The most common technique is a numerical measure computed in a similarity matrix. Popular measures include the dot product, cosine similarity, Pearson similarity, and Euclidean distance.
Model-based collaborative filtering does not need to keep the full rating matrix in memory. Instead, machine learning models are used to forecast how a customer would rate each product. These algorithms predict unrated products from customer ratings, and are further divided into different subsets, i.e. matrix factorization-based algorithms and clustering algorithms.
The matrix factorization technique analyses and explores the rating matrix in a linear-algebra context and has two main goals. The first is to reduce the dimension of the rating matrix. The second is to identify latent features underlying the rating matrix, which then drive the recommendations.
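A matrix factorization sketch trained by stochastic gradient descent; the tiny rating matrix, latent dimension and hyperparameters are illustrative assumptions:

```python
import random

# Matrix factorization sketch: learn a latent vector per user (P) and per
# item (Q) so that their dot product approximates each observed rating,
# compressing the rating matrix down to k hidden features.
random.seed(0)
observed = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0, (2, 1): 1.0}
n_users, n_items, k = 3, 2, 2

P = [[random.uniform(0, 0.1) for _ in range(k)] for _ in range(n_users)]
Q = [[random.uniform(0, 0.1) for _ in range(k)] for _ in range(n_items)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

lr, reg = 0.05, 0.01
for _ in range(2000):  # stochastic gradient descent over observed cells
    for (u, i), r in observed.items():
        err = r - dot(P[u], Q[i])
        for f in range(k):
            pu, qi = P[u][f], Q[i][f]
            P[u][f] += lr * (err * qi - reg * pu)
            Q[i][f] += lr * (err * pu - reg * qi)

# After training, dot(P[u], Q[i]) approximates each observed rating, and
# unobserved cells (e.g. user 2, item 0) also receive a predicted value.
```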
Since the similarity measure plays a significant role in improving the accuracy of prediction algorithms, it can be effectively used to weight the significance of ratings. A couple of popular similarity measures have been used in CF recommendation algorithms.
Cosine Vector Similarity
Cosine vector similarity is one of the popular metrics in statistics. Since it considers only the angle between two vectors and not their magnitude, it is a very useful measurement for data with missing preference information, as long as the number of times a term appears in the data can be counted.
The cosine vector similarity looks at the angle between two vectors of ratings (the target item i and the other item j) in n-dimensional item space:
sim(i,j) = Σk (Rk,i × Rk,j) / ( √(Σk Rk,i²) × √(Σk Rk,j²) )
Rk,i is the rating of item i by user k, Rk,j is the rating of the other item j by user k, and n is the total number of users who rated both item i and item j.
When the angle between two vectors is near 0 degrees (they point in the same direction), the cosine similarity value sim(i,j) is 1, meaning very similar. When the angle is near 90 degrees, sim(i,j) is 0, meaning irrelevant. When the angle is near 180 degrees (opposite directions), sim(i,j) is -1, meaning very dissimilar. In information retrieval using CF, sim(i,j) ranges from 0 to 1, because the angle between two term-frequency vectors cannot be greater than 90 degrees.
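The cosine measure described above can be written directly in code; the rating vectors below are made-up examples:

```python
import math

# Cosine vector similarity between two items' rating vectors:
#   sim(i,j) = sum_k(R[k,i] * R[k,j])
#              / (sqrt(sum_k R[k,i]^2) * sqrt(sum_k R[k,j]^2))
def cosine_sim(item_i, item_j):
    num = sum(ri * rj for ri, rj in zip(item_i, item_j))
    den = (math.sqrt(sum(r * r for r in item_i))
           * math.sqrt(sum(r * r for r in item_j)))
    return num / den if den else 0.0

# Ratings of two items by the same users (illustrative values).
same_direction = cosine_sim([1, 2, 3], [2, 4, 6])  # angle ~0 degrees
orthogonal = cosine_sim([1, 0], [0, 1])            # angle 90 degrees
```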
Pearson Correlation Coefficient
The Pearson correlation coefficient is one of the popular methods in CF for measuring how much larger the numbers in one series are, relative to the corresponding numbers in another. It measures the linear correlation between two vectors (item i and item j):
sim(i,j) = Σk (Rk,i − Ai)(Rk,j − Aj) / ( √(Σk (Rk,i − Ai)²) × √(Σk (Rk,j − Aj)²) )
It measures the tendency of two series of numbers, paired up one-to-one, to move together. When the two vectors have a strong common tendency, sim(i,j) is close to 1; when they have little common tendency, sim(i,j) is close to 0; and when they have opposite tendencies, sim(i,j) is close to -1. The item-based similarity is computed over the co-rated items, i.e. those rated in both item i and item j.
Rk,i is the rating of the target item i given by user k, and Rk,j is the rating of the other item j given by user k. Ai is the average rating of the target item i over all the co-rated users, and Aj is the average rating of the other item j over all the co-rated users. n is the total number of users who rated both item i and item j.
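The same calculation in code, centring each item's ratings by its average over the co-rated users; the rating vectors are made-up examples:

```python
import math

# Pearson similarity over co-rated items: each rating is centred by the
# item's average (Ai, Aj) before taking the cosine of the centred vectors.
def pearson_sim(item_i, item_j):
    n = len(item_i)  # ratings by the same co-rated users, paired one-to-one
    ai = sum(item_i) / n
    aj = sum(item_j) / n
    num = sum((ri - ai) * (rj - aj) for ri, rj in zip(item_i, item_j))
    den = (math.sqrt(sum((ri - ai) ** 2 for ri in item_i))
           * math.sqrt(sum((rj - aj) ** 2 for rj in item_j)))
    return num / den if den else 0.0

together = pearson_sim([1, 2, 3], [2, 4, 6])  # series move together
opposite = pearson_sim([1, 2, 3], [6, 4, 2])  # series move oppositely
```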
CHAPTER 3
SYSTEM DESIGN
3.1 INTRODUCTION
System design is a "how to" approach to the creation of a new system. It provides the understanding and procedural details necessary for implementing the proposed system.
Fig. Block diagram of the proposed system
1. Raw Data
2. Pre-processing
3. Structured Data
4. EDA
5. User-Movie Matrix
Pre-processing
Data preprocessing is a process of preparing the raw data and making it suitable for a
machine learning model. We used the following Pre-processing components in our Project:
Raw Data - Raw data, also called source data, atomic data or primary data, is data that has not been processed for use.
Structured Data - Data that adheres to a predefined data model and is therefore straightforward to analyse.
Exploratory Data Analysis (EDA) - The critical process of performing initial investigations on data so as to discover patterns, spot anomalies, test hypotheses and check assumptions with the help of summary statistics and graphical representations.
User-Movie Matrix - A matrix with users as rows and movies as columns, whose cells hold the ratings.
Recommendations - The ranked list of movies that the system finally suggests to the user.
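Constructing the user-movie matrix from raw rating records can be sketched as follows; the rating triples are made-up illustrative data:

```python
# Build the user-movie matrix from raw (user, movie, rating) records:
# rows are users, columns are movies, and cells hold the ratings
# (None where a user has not rated a movie).
raw = [("u1", "m1", 4), ("u1", "m2", 5), ("u2", "m2", 3), ("u3", "m1", 2)]

users = sorted({u for u, _, _ in raw})
movies = sorted({m for _, m, _ in raw})
matrix = {u: {m: None for m in movies} for u in users}
for u, m, r in raw:
    matrix[u][m] = r
```

This sparse dictionary-of-dictionaries form is the structure the similarity computations above operate on.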
3.2.1 Application Building
After the model is built, we integrate it into a web application so that ordinary users can also use it. In this section, we build a web application using the Django framework, integrated with the model we built. A UI is provided where the user supplies their input; the input is given to the saved model and the prediction is showcased on the UI. Once the server is running, navigate to the localhost (http://127.0.0.1:8000/), where you can view your web page; it runs on localhost:8000.
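One way to glue the model to the web page is to keep the recommendation logic in a plain function and call it from a thin Django view. The view, URL and template names below are hypothetical and would need a configured Django project:

```python
# Sketch of the web-app glue: the recommendation logic lives in a plain
# function so the Django view stays thin and the logic stays testable.
def recommend(title, similar):
    # 'similar' maps each movie to its precomputed most-similar movies.
    return similar.get(title, [])

# Hypothetical Django wiring, shown as comments:
#   # views.py
#   from django.shortcuts import render
#   def recommend_view(request):
#       title = request.GET.get("movie", "")
#       return render(request, "results.html",
#                     {"results": recommend(title, SIMILAR)})
#
#   # urls.py
#   path("", recommend_view)
#
# `python manage.py runserver` then serves it at http://127.0.0.1:8000/.

SIMILAR = {"Movie A": ["Movie C", "Movie D"]}  # made-up precomputed table
results = recommend("Movie A", SIMILAR)
```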
Input design is the process of converting user-oriented input into a computer-based format. The goal of input design is to make data entry as easy and error-free as possible. Input to the system is entered through forms: a form is any surface on which information is to be entered, the nature of which is determined by what is already on that surface. If the data going into the system is incorrect, processing and output will magnify those errors, so the designer should ensure that the forms are accessible and understandable to the user. End users communicate with the system frequently through the user interface, so the design of the input screens should follow their recommendations. Data is validated wherever the project requires it, which ensures that only correct data enters the system. HTML is the interface used in input design. All input data is validated in order, and if any value violates a condition the user is warned by a message and asked to re-enter the data. If the data satisfies all conditions, it is transferred to the appropriate tables in the database. This project uses text boxes and drop-downs to accept user input; if the user enters a wrong format, a message is shown. The user is never left in confusion as to what is happening: appropriate error messages and acknowledgments are displayed.
3.4 OUTPUT DESIGN
Computer output is the most important deliverable to the user. A major form of output is the display of the information gathered by the system while servicing the user's requests. Output generally refers to the results or information generated by the system, and can take the form of operational documents and reports. Since some users may not operate the system themselves but merely use its output to aid their decision making, much importance is given to output design. Output generation serves two main purposes: communicating information properly to the users, and providing data in a form suited for permanent storage and later use. The output design phase consists of two stages, output definition and output specification. Output definition takes into account the types of outputs, their contents, formats, frequency and volume; the output specification describes each type of output in detail. The objective of output design is to convey information about past activities and current status and to emphasize important events. A quality output is one which meets the requirements of the end user and presents the information clearly.
CHAPTER-4
SYSTEM ENVIRONMENT
4.1 SOFTWARE ENVIRONMENT
Purpose: To understand the nature of the program to be built, software engineers must understand the information domain of the software. This document specifies the software requirements for automating the functions and gives the different software and hardware requirements of the system. It will help users understand their own needs, and it serves as the basis for validating the whole project.
Scope
This document is the only one that describes the requirements of the system to be developed.
OVERVIEW OF PYTHON
Often, programmers fall in love with Python because of the increased productivity it
provides. Since there is no compilation step, the edit-test-debug cycle is incredibly fast.
Debugging Python programs is easy: a bug or bad input will never cause a segmentation
fault. Instead, when the interpreter discovers an error, it raises an exception. When the
program doesn't catch the exception, the interpreter prints a stack trace. A source level
debugger allows inspection of local and global variables, evaluation of arbitrary expressions,
setting breakpoints, stepping through the code a line at a time, and so on. The debugger is
written in Python itself, testifying to Python's introspective power. On the other hand, often
the quickest way to debug a program is to add a few print statements to the source: the fast
edit-test-debug cycle makes this simple approach very effective.
HTML5
HTML5 is a markup language used for structuring and presenting content on the World Wide
Web. It is the fifth and last[3] major HTML version that is a World Wide Web Consortium
(W3C) recommendation. The current specification is known as the HTML Living Standard. It
is maintained by the Web Hypertext Application Technology Working Group (WHATWG), a
consortium of the major browser vendors (Apple, Google, Mozilla, and Microsoft). HTML5
was first released in a public-facing form on 22 January 2008,[2] with a major update and
"W3C Recommendation" status in October 2014.[4][5] Its goals were to improve the
language with support for the latest multimedia and other new features; to keep the language
both easily readable by humans and consistently understood by computers and devices such
as web browsers, parsers, etc., without XHTML's rigidity; and to remain backward-compatible
with older software. HTML5 is intended to subsume not only HTML 4 but also XHTML 1
and DOM Level 2 HTML.[6] HTML5 includes detailed processing models to encourage
more interoperable implementations; it extends, improves, and rationalizes the markup
available for documents and introduces markup and application programming interfaces
(APIs) for complex web applications.[7] For the same reasons, HTML5 is also a candidate
for cross-platform mobile applications because it includes features designed with low-
powered devices in mind.
Django
Django is a high-level Python web framework that encourages rapid development and
clean, pragmatic design. Built by experienced developers, it takes care of much of the hassle
of web development, so you can focus on writing your app without needing to reinvent the
wheel. It’s free and open source. Django was designed to help developers take applications
from concept to completion as quickly as possible. Django takes security seriously and helps
developers avoid many common security mistakes. Some of the busiest sites on the web
leverage Django’s ability to quickly and flexibly scale.
Jupyter Notebook
The Jupyter Notebook is an open source web application that you can use to create and share
documents that contain live code, equations, visualizations, and text. Jupyter Notebook is
maintained by the people at Project Jupyter. Jupyter Notebooks are a spin-off project from
the IPython project, which used to have an IPython Notebook project itself. The name
Jupyter comes from the core programming languages it supports: Julia, Python, and R.
Jupyter ships with the IPython kernel, which allows you to write your programs in Python,
but there are currently over 100 other kernels that you can also use.
5.1 CODING
CODING STANDARDS
Coding standards are important because they lead to greater consistency within the code of
all developers. Consistency leads to code that is easier to understand, which in turn results
in code that is easier to develop and maintain. Code that is difficult to understand and
maintain runs the risk of being scrapped and rewritten.
Unit Testing
Unit testing is a concept that will be familiar to anyone coming from software development.
It is a very useful technique that can help you prevent obvious errors and bugs in your code.
It involves testing individual units of the source code, such as functions, methods, and
classes, to ascertain that they meet the requirements and behave as expected. Unit tests are
usually small and don't take much time to execute. The tests cover a wide range of inputs,
often including boundary and edge cases. The expected outputs for these inputs are usually
calculated by the developer manually and compared against the output of the unit under test.
For example, for an adder function, we would have test cases something like the following.
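A minimal sketch of such adder tests, written with Python's standard `unittest` module (the `add` function here is the illustrative unit under test, not project code):

```python
# Unit tests for a simple adder function, covering positive, zero,
# negative, and mixed-sign inputs.
import unittest

def add(a, b):
    """The unit under test: a trivial adder."""
    return a + b

class TestAdd(unittest.TestCase):
    def test_positive_inputs(self):
        self.assertEqual(add(2, 3), 5)

    def test_zero_input(self):
        self.assertEqual(add(0, 7), 7)

    def test_negative_inputs(self):
        self.assertEqual(add(-2, -3), -5)

    def test_mixed_signs(self):
        self.assertEqual(add(-4, 9), 5)

if __name__ == "__main__":
    # exit=False lets the script continue after the test run finishes.
    unittest.main(argv=["adder_tests"], exit=False)
```

Running the file reports exactly which test method failed, which is what makes the failure easy to investigate.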
You write test cases with positive inputs, inputs with zero, negative inputs, and mixed
positive and negative inputs. If the output of the function or method being tested equals the
output defined in the unit test for every input case, the unit passes the test; otherwise it
fails. You would know exactly which test case failed, which can then be investigated further
to find the problem. This is a valuable sanity check to have in your code, especially when
multiple developers are working on a large project. Imagine someone wrote a piece of code
based on certain assumptions and data sizes, and a new developer changes something in the
codebase that no longer meets those assumptions. The code is then bound to fail. Unit tests
help avoid such situations.
Unit testing also improves confidence in the unit itself: if it passes the unit tests, we can
be sure there is nothing obviously wrong with the logic and the unit is performing as
intended. Debugging becomes easier as well, since you know which unit failed and which
particular test cases failed.
Integration Testing
Data can be lost across an interface, one module can have an adverse effect on another, and
sub-functions, when combined, may not produce the desired overall function. Integration
testing is the systematic testing used to uncover errors within these interfaces. This testing
was done with simple data, and the developed system ran successfully with it. The need for
integration testing is to verify overall system performance.
The modules of this project were connected and tested. After splitting the program into
units, the units were tested together to expose defects in the interfaces between modules and
functions. Completed after unit or functional testing, integration testing involves putting
groups of modules and functions together with the goal of verifying that the combined
system meets the requirements.
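As a hedged illustration (the module names and logic are assumptions, not the project's actual code), the idea of exercising two units together through their shared interface can be sketched like this:

```python
# Hypothetical integration test: two units (a candidate scorer and a
# ranking step) are exercised together, so interface defects between
# them would surface here even if each unit passes its own tests.
def score_candidates(user_ratings, candidate_ratings):
    """Unit 1: score candidate movies the user has not rated yet."""
    scores = {}
    for movie, rating in candidate_ratings.items():
        if movie not in user_ratings:
            scores[movie] = rating
    return scores

def top_n(scores, n):
    """Unit 2: rank scored movies and keep the best n."""
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Integration: data flows from one module into the other.
user = {"Heat": 5.0}
candidates = {"Heat": 5.0, "Ronin": 4.5, "Speed": 3.0, "Taxi": 4.0}
recommended = top_n(score_candidates(user, candidates), 2)
print(recommended)  # ['Ronin', 'Taxi']
```

If the scorer changed its output format (say, to a list of tuples) without the ranker being updated, this combined test would fail even though each unit's own tests might still pass.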
System Testing
System testing focuses on testing the system as a whole and is a crucial step in the quality
management process. In the software development life cycle, system testing is the first level
at which the system is tested in its entirety. The system is tested to verify whether it meets
the functional and technical requirements.
The system was tested by a small client community to see if the program met the
requirements from the analysis stage, and it was found to be satisfactory. In this phase the
system is fully tested by the client community against the requirements defined in the
analysis and design stages, corrections are made as required, and the production system is
built. User acceptance of the system is a key factor in its success.
Different users may use the software application in different ways, and it is impossible for
the developer or tester to predict every scenario or test input an end user will try, or how
the customer will actually use the application. This is why most software vendors use alpha
testing and beta testing, which help uncover errors that may occur in the actual usage
environment. In these testing methods the software application is released to a limited set of
end users, rather than testing professionals, to get feedback from them.
Alpha Testing
Alpha testing is sometimes carried out by the client or an outsider in the presence of the
developer and tester. The version of the release on which alpha testing is performed is
called the "Alpha Release".
Beta Testing
We often hear the term "beta release/version", which is linked to beta testing. Beta testing
is carried out at the end users' site by the end users themselves, without any help from the
developers, and is therefore performed in an uncontrolled environment. Beta testing is also
known as field testing, and it is used to get feedback from the market. This testing is
conducted by a limited set of users, and all issues found during it are reported on a
continuous basis, which helps to improve the system. Developers take action on all issues
reported in beta testing after bug triage, and then the software application is ready for the
final release. The version released after beta testing is called the "Beta Release".
SOFTWARE TESTING
Software testing is a critical element of software quality assurance and represents the
ultimate review of specification, design, and coding. Testing is actually a series of
different tasks whose primary objective is to fully exercise the computer-based system; if
conducted successfully, it will uncover errors in the software. Testing is the process of
executing a program with the intention of finding errors, and a good test case is one that
has a high probability of finding an undiscovered error.
VALIDATION TESTING
In validation testing, all the relevant fields are checked to see whether they contain data
and whether they hold the right data format, guaranteeing that all the independent paths
within a module have been exercised at least once.
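A minimal, hypothetical sketch of such a field-validation check for a movie-rating form (the field names and rating range are assumptions, not the project's actual code):

```python
# Hypothetical validation check: every relevant field must be present
# and hold data in the expected format before the form is accepted.
def validate_rating_form(form):
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    # Field must contain data (non-empty after stripping whitespace).
    if not form.get("movie_id", "").strip():
        errors.append("movie_id is required")
    # Field must hold the right data format (a number in range).
    rating = form.get("rating")
    try:
        value = float(rating)
        if not 0.5 <= value <= 5.0:
            errors.append("rating must be between 0.5 and 5.0")
    except (TypeError, ValueError):
        errors.append("rating must be a number")
    return errors

print(validate_rating_form({"movie_id": "m101", "rating": "4.5"}))  # []
print(validate_rating_form({"movie_id": "", "rating": "ten"}))
```

The two sample calls exercise both independent paths through the validator: one fully valid form and one that fails both checks.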
TEST CASES
A test case is a specific set of steps and data along with expected results for a particular
test objective. A test case should test only one limited subset of a feature or
functionality. Test case documents for each functionality/testing area of our project were
written, reviewed, and maintained separately. Test cases that check error conditions are
written separately from the functional test cases and include steps to verify the error
messages.
Implementation is the process by which personnel check out and install the required
equipment and applications and train users accordingly. Depending on the size of the
organization and its requirements, implementation can be divided into three approaches:
Stage Implementation
Here the system is implemented in stages; the whole system is not implemented at once. Once
the user starts working with the system and is familiar with it, the next stage is introduced
and implemented. The system is also updated regularly until a final system is settled on.
Direct Implementation
The proposed new system is implemented directly and the user starts working on the new
system. Any shortcomings encountered are then rectified later.
Parallel Implementation
The old and the new system are used simultaneously. This helps in comparing the results from
the two systems. Once the user is satisfied and his intended objectives are achieved by the
new system, he stops using the old one.
In my project I have used the direct implementation method. The client is given the fully
developed system. The system was developed to provide movie recommendations in order to
support the decision-making process on a movie website.
CHAPTER-6
SYSTEM MAINTENANCE
6.1 MAINTENANCE
Software maintenance is the process of modifying a software product after it has been
delivered to the client. The main purpose of software maintenance is to modify and update
the software application after delivery to correct faults and to improve performance.
CHAPTER-7
SYSTEM SECURITY
7.1 INTRODUCTION
System-level security refers to the architecture, policies, and processes that ensure data
and system security on individual computer systems. It facilitates the security of
standalone and/or networked computer systems/servers against events and processes that
could exploit or violate their security.
Recommender systems have been developing for many years, even passing through a low point.
In the past few years, advances in machine learning, large-scale networks, and
high-performance computing have driven new progress in this field. We will consider the
following aspects in future work. After enough user data has been gathered, hybrid filtering
recommendation will be introduced, along with more precise and appropriate movie features.
In the future we should extract features such as subtitles from movies, which can provide a
more accurate description of each movie.
In the future we will also collect more user data and add a list of movies each user
dislikes. We will feed this dislike list into the recommender system as well and generate
scores that are added to the previous result; in this way we can improve the recommender's
output. We also plan to make the recommender system an internal service. The recommender
will then no longer be an external website used only for testing; we will expose it as an
internal API for developers to invoke.
CHAPTER-9
CONCLUSION
In this paper, in order to avoid the drawbacks of content-based filtering, the collaborative
filtering (CF) approach is used to obtain better results. A collaborative filtering
recommendation system is proposed using Pearson correlation/cosine similarity on the
MovieLens dataset. The existing systems were compared, and the proposed system was found to
be more reliable and accurate. It was also found that when the proposed methodology is
applied to larger datasets, both accuracy and efficiency increase, which shows that our
system is both accurate and efficient. The main aim was to improve the regular
recommendation algorithm and provide better results. The research work was successful, as it
fulfilled the aim of the project. In the future, more features can be added to the datasets
to make recommendations more reliable and innovative.
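As an illustrative sketch only (not the project's actual implementation), the two similarity measures named above can be computed on toy user rating vectors as follows; the sample ratings are invented:

```python
# Illustrative sketch of the two similarity measures used in user-based
# collaborative filtering: cosine similarity and Pearson correlation.
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def pearson_correlation(a, b):
    """Pearson correlation = cosine similarity of mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)

u = [4.0, 5.0, 1.0]   # one user's ratings on three co-rated movies
v = [5.0, 5.0, 2.0]   # another user's ratings on the same movies
print(round(cosine_similarity(u, v), 3))
print(round(pearson_correlation(u, v), 3))
```

Pearson correlation mean-centers the vectors first, which compensates for users who consistently rate higher or lower than others; that is why it is often preferred over raw cosine similarity for rating data.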