
A Mini Project Report
On
SENTIMENT ANALYSIS AND OPINION MINING: A SURVEY
Submitted in partial fulfillment of the requirement for the award of the degree of
BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE AND ENGINEERING
By
N.USHARANI (15UP1A0577)
T.BHAVANI (15UP1A0594)
B.ROOPASRI (15UP1A0559)
Under the guidance of

Mr. T. SRAJAN KUMAR
Assistant Professor
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

VIGNAN’S INSTITUTE OF MANAGEMENT AND TECHNOLOGY FOR
WOMEN
(Affiliated to Jawaharlal Nehru Technological University Hyderabad)
Kondapur Village, Ghatkesar Mandal, Telangana, India
www.vmtw.ac.in
2018-2019
DECLARATION
We hereby declare that the project entitled “SENTIMENT ANALYSIS AND OPINION MINING:
A SURVEY” is a bona fide work duly completed by us. It does not contain any part of a
project or thesis submitted by any other candidate to this or any other institute of the
university.
All materials that have been obtained from other sources have been duly acknowledged.
CERTIFICATE
This is to certify that the thesis work titled “SENTIMENT ANALYSIS AND OPINION
MINING: A SURVEY” submitted by N. USHARANI (15UP1A0577), T. BHAVANI
(15UP1A0594), B. ROOPASRI (15UP1A0559) in partial fulfillment of the requirements
for the award of the degree of Bachelor of Technology in Computer Science and
Engineering to the Vignan’s Institute of Management and Technology for Women is a
record of bona fide work carried out by them under my guidance and supervision.
The results embodied in this project report have not been submitted to any university
for the award of any degree, and the results were achieved satisfactorily.
Mrs. A. GAUTHAMILATHA
Head of the Department
Vignan’s Institute of Management and Technology for Women

Mr. T. SRAJAN KUMAR
Assistant Professor
Vignan’s Institute of Management and Technology for Women

(External Examiner)
ACKNOWLEDGEMENT
We would like to express our sincere gratitude to Dr. P. Sudhakara Rao, Principal,
Vignan’s Institute of Management and Technology for Women, for his timely suggestions,
which helped us to complete the project in time.
We would also like to thank Mrs. A. Gauthamilatha, Head of the Department, Computer
Science and Engineering, for providing us with constant encouragement and resources,
which helped us to complete the project in time.
We would like to thank our project guide, Mr. T. Srajan Kumar, Assistant Professor,
Computer Science and Engineering, for his timely cooperation and valuable suggestions
throughout the project. We are indebted to him for the opportunity to work under his
guidance.
Our sincere thanks to all the teaching and non-teaching staff of the Department of
Computer Science and Engineering for their support throughout our project work.
CONTENTS
ABSTRACT i
LIST OF FIGURES ii
LIST OF TABLES iii
1. INTRODUCTION 1-4
1.1. How Sentiment Analysis Works 2
1.2. Approaches for Feature-Based Opinion Mining 4
2. LITERATURE SURVEY 5-6
3. SYSTEM ANALYSIS 7-8
3.1. Existing System 7
3.2. Proposed System 7
3.3. System Requirements 8
3.3.1. Hardware Requirements 8
3.3.2. Software Requirements 8
4. IMPLEMENTATION 9-12
4.1. Modules 9
4.2. Modules Description 9
4.2.1. Login Module 9
4.2.2. Sentiment Analysis 9
4.2.3. Customer feedback 10
4.2.4. Analyzing of neutral sentiments 10
4.3. Input Design 10
4.4. Output Design 11
5. SYSTEM ARCHITECTURE 13-14
5.1. Opinion Retrieval 13
5.2. Opinion Classification 13
5.3. Opinion Summarization 13

6. SYSTEM STUDY 15
6.1. Technical Feasibility 15
6.2. Economic Feasibility 15
6.3. Operational Feasibility 15
7. SOFTWARE ENVIRONMENT 16-32
7.1. Features of Java 16
7.1.1. Distributed 16
7.1.2. Robust 16
7.1.3. Secure 17
7.1.4. Architecture Neutral 17
7.1.5. Portable 17
7.1.6. Interpreted 18
7.1.7. High Performance 18
7.1.8. Multithreaded 18
7.2. The Java Framework 19
7.2.1. Java Packages 19
7.3. HTML Document 19
7.4. Cascading Style Sheets (CSS) 25
7.5. Servlet 27
7.6. Servlet Life Cycle 29
8. SYSTEM DESIGN 33-38
8.1. Data Flow Diagram 34
8.2. UML Diagrams 35
8.3. Use Case Diagram 35
8.4. Class Diagram 37
8.5. Sequence Diagram 38

9. SAMPLE CODE 39-48
9.1. Sample code for User Registration 39
9.2. Sample code for User Login 43
9.3. Sample code for Admin Login 46
10. SYSTEM TESTING 49-51
11. RESULT SCREENSHOTS 52-61
11.1. Home Page 52
11.2. Product 53
11.3. Login Page 54
11.4. Register Page 55
11.5. Products 56
11.6. Product Rating 57
11.7. Admin Login 58
11.8. Admin Page 59
11.9. Product Registration 60
11.10. Products of Admin 61
12. CONCLUSION 62
13. FUTURE ENHANCEMENT 63
BIBLIOGRAPHY 64
ABSTRACT
Sentiment analysis, also called opinion mining, is the field of study that analyses
people’s opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards
entities such as products, services, organizations, individuals, issues, events, topics, and their
attributes. Nowadays e-commerce is a well-developed resource. People purchase items
from e-commerce sites such as Amazon, Flipkart, Snapdeal, etc., and they choose goods
online based on review ratings. People depend on review ratings not only for purchasing
goods but also for choosing service providers such as hotels, shops, restaurants, etc. To
address the problems that arise from this dependence, we propose a technique called
content-based opinion mining, which provides the user with an accurate review rating of
the service. The opinion of each user is categorized based on the content that the user
provides.
LIST OF FIGURES
Fig. 5.1 Architecture of Opinion Mining 14

Fig. 7.1 Servlets Architecture 28
Fig 9 Architecture Diagram 33
Fig 9.1 Data Flow Diagram 34
Fig 9.2.1 Use Case Diagram 36
Fig 9.2.2 Class Diagram 37
Fig 9.2.3 Sequence Diagram 38
Fig.11.1 Home Page 52
Fig.11.2 Product 53
Fig.11.3 Login Page 54
Fig.11.4 User Register 55
Fig.11.5 All Products 56
Fig.11.6 User Product Rating 57
Fig.11.7 Admin Login 58
Fig.11.8 Admin Page 59
Fig.11.9 Admin Product Registration 60
Fig.11.10 Admin Registered Products 61
LIST OF TABLES
Table 7.1 Creating and Using Sessions 31

Table 7.2 JavaBeans Properties 32
Sentiment Analysis and Opinion Mining: A Survey MN15-B9
1. INTRODUCTION
Sentiment analysis refers to determining the semantic orientation of an individual
opinion, or of the opinions of a group of people, in a particular context. Today’s World Wide
Web is not just a medium for communication; it also has a great influence on society.
The Internet has opened a path to knowing others’ opinions and has become a resource for
activities such as online business, information acquisition, community operations, etc.
These opinions therefore have an impact on our decision making.
Many companies and organizations depend on social media to gather information about
public opinion on their products and services, and technology has advanced with the
growing popularity of social media. For example, many data marts are now available for
extracting opinions from social media, such as Facebook Insights, YouTube Insights, the
Twitter Firehose, etc. Organizations analyze this digital opinion data for their specific
purposes using commercial social listening tools such as Radian6, Viralheat, SMR, and
Sysomos, which provide company reports, text relevant to the company, the sentiment of
customer opinions, visitor counts, and the online workflow of the business.
Thus a good number of companies, large and small, have opinion mining and sentiment
analysis as part of their mission. The many applications of opinion mining and sentiment
analysis in areas such as review-related websites, sub-component technology (detection of
flames, monitoring of ad services, etc.), and business and government intelligence have
driven research in this field to gain importance rapidly.
The main objective of sentiment analysis and opinion mining is to classify the
polarity of mined opinion text at the document level, the sentence level, or the feature
(aspect) level, that is, to find out whether the polarity of the opinionated text is positive,
negative, or neutral.
The possibility of automatically analyzing consumers’ opinions, and the rules for
designing and building such systems for the Polish language, are crucial questions for
specialists in web marketing. In the paper, theoretical and practical aspects of sentiment
analysis (also known as opinion mining) are discussed. According to (Jansen, 2010), 58% of
Americans have looked online for opinions about products or services and 24% of them have
Dept of CSE 1 VMTW

posted comments or opinions online about goods or services they have bought. Opinions
published online form a very significant source of information, not only for customers but also
for companies. The wide distribution and colossal volume of online opinions and comments
make their manual analysis impossible. Therefore various kinds of automated opinion
analysis are performed. Computer-aided opinion analysis can be considered a part of
computer-aided text analysis, but it is necessary to take into account the most important
feature of opinions: they are subjective. This subjectivity distinguishes opinions from
objective statements of information.
1.1 How Sentiment Analysis Works:

Preparing Review Database:
Reviews are extracted from different websites and stored in a review database. Each
website has its own structure. Web crawlers can be used to download reviews. Web
crawlers, also known as spiders or robots, are programs that automatically download Web
pages. A crawler can visit many sites to collect information that can be analyzed and mined
in a central location. After this, preprocessing is done, in which unwanted text (anything
other than product reviews) is removed, and the reviews are then stored in the database.
Part-of-Speech Tagging (POS Tagging):
The aim of feature-based opinion mining is to find product features and opinion
words (words that express an opinion), and then to find the polarity of each opinion word.
In general, opinion words are adjectives and product features are nouns. Consider the
following example:
“This is good phone”
In the above sentence, phone (product feature) is a noun and good (opinion word) is an
adjective. In part-of-speech tagging (POS tagging), each word in a review is tagged with its
part of speech (such as noun, adjective, adverb, verb, etc.). After POS tagging it is possible to
retrieve nouns as product features and adjectives as opinion words. Several POS taggers are
freely available, such as the Stanford POS Tagger. Fig. 1 shows how the above sentence
will be tagged using the Stanford POS Tagger.
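The retrieval step described above can be sketched in Java as follows. This is only a minimal illustration, not the project's implementation: it assumes the tagger output is available as plain text in the form word_TAG (for example "good_JJ"), and that tags beginning with NN mark nouns and tags beginning with JJ mark adjectives, as in the Penn Treebank tag set. The class name TaggedSentence is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits tagged text into product features and opinion words.
class TaggedSentence {
    final List<String> features = new ArrayList<>();      // nouns (tags starting with NN)
    final List<String> opinionWords = new ArrayList<>();  // adjectives (tags starting with JJ)

    // Parses text of the assumed form "word_TAG word_TAG ...".
    static TaggedSentence parse(String tagged) {
        TaggedSentence result = new TaggedSentence();
        for (String token : tagged.trim().split("\\s+")) {
            int sep = token.lastIndexOf('_');
            if (sep <= 0) {
                continue; // no tag attached to this token, so skip it
            }
            String word = token.substring(0, sep).toLowerCase();
            String tag = token.substring(sep + 1);
            if (tag.startsWith("NN")) {
                result.features.add(word);
            } else if (tag.startsWith("JJ")) {
                result.opinionWords.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TaggedSentence s = parse("This_DT is_VBZ good_JJ phone_NN");
        System.out.println("features=" + s.features + " opinionWords=" + s.opinionWords);
    }
}
```

For the example sentence, "phone" is collected as a product feature and "good" as an opinion word.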
Feature Extraction
In feature extraction, product features are extracted from each sentence. Product
features are generally nouns, so each noun is extracted from the sentence. In a review,
features may be mentioned explicitly or implicitly by the reviewer. Features that are
mentioned directly in a sentence are called explicit features, and features that are not
mentioned directly are called implicit features. For example,
“Battery Life of a phone is less”
In this sentence the reviewer has mentioned battery life directly, so it is an explicit
feature. It is easy to extract such features. Now consider the following sentence,
“This phone needs to charge many times in a day”
In this sentence the reviewer is talking about the battery of the phone, but it is not
mentioned directly in the sentence. So here battery is an implicit feature. It is difficult to
understand and extract such features from a sentence.
Opinion Word Extraction:
In opinion word extraction, opinion words are identified. If a sentence contains one or
more product features and one or more opinion words, then the sentence is called an opinion
sentence. As stated above, opinion words are generally adjectives.
Opinion Word Polarity Identification:
In opinion word polarity identification, the semantic orientation of each opinion word
is identified. Semantic orientation means identifying whether an opinion word expresses a
positive, negative, or neutral opinion.
Opinion sentence polarity identification predicts the orientation of an opinion
sentence. Consider the following sentence:
“This is not good phone”
The above sentence contains the opinion word ‘good’, which expresses a positive
opinion. But the sentence expresses a negative opinion because of the negation word ‘not’.
Therefore, after identifying opinion word polarity, it is necessary to find the polarity of the
opinion sentence. For opinion sentence polarity identification, a list of negation words such
as ‘no’, ‘not’, ‘but’, etc. can be prepared and negation rules can be formed. For example, if a
sentence contains an odd number of negation words, then its polarity will be the opposite of
the polarity of the opinion word in that sentence. Otherwise the sentence will have the same
polarity as the opinion word in it.
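The odd-number-of-negations rule can be illustrated with a short Java sketch. It is a simplified example under stated assumptions: the negation list holds only the three words named above, polarity is encoded as +1 (positive), -1 (negative) or 0 (neutral), and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper applying the negation rule to one opinion sentence.
class NegationRule {
    // Illustrative negation list, taken from the examples in the text.
    static final Set<String> NEGATIONS = new HashSet<>(Arrays.asList("no", "not", "but"));

    // wordPolarity: +1 positive, -1 negative, 0 neutral.
    static int sentencePolarity(String sentence, int wordPolarity) {
        int negations = 0;
        for (String token : sentence.toLowerCase().split("[^a-z']+")) {
            if (NEGATIONS.contains(token)) {
                negations++;
            }
        }
        // An odd number of negation words flips the polarity of the opinion word.
        return (negations % 2 == 1) ? -wordPolarity : wordPolarity;
    }

    public static void main(String[] args) {
        System.out.println(sentencePolarity("This is good phone", 1));
        System.out.println(sentencePolarity("This is not good phone", 1));
    }
}
```

With the opinion word ‘good’ given polarity +1, the first sentence keeps +1 while the second, which contains one negation word, becomes -1.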
Summary Generation:
A summary is generated after opinion sentence orientation identification. This
summary is based on the features of the product and is built from the information
discovered in the previous steps. The summary can be presented in the form of tables or
graphs. A table or graph gives a summary of all the reviews related to a product.
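A feature-wise summary table of this kind could be built as sketched below. This is a minimal sketch, assuming that the earlier steps deliver (feature, polarity) pairs with polarity encoded as +1 or -1; the class FeatureSummary and its methods are hypothetical names used only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper that counts positive and negative opinions per product feature.
class FeatureSummary {
    // feature -> {positive count, negative count}, kept in alphabetical order
    private final Map<String, int[]> counts = new TreeMap<>();

    void add(String feature, int polarity) {
        int[] c = counts.computeIfAbsent(feature, k -> new int[2]);
        if (polarity > 0) {
            c[0]++;
        } else if (polarity < 0) {
            c[1]++;
        }
    }

    int positive(String feature) {
        int[] c = counts.get(feature);
        return c == null ? 0 : c[0];
    }

    int negative(String feature) {
        int[] c = counts.get(feature);
        return c == null ? 0 : c[1];
    }

    // Renders the counts as a plain-text table, one row per feature.
    String toTable() {
        StringBuilder sb = new StringBuilder(
                String.format("%-12s %8s %8s%n", "Feature", "Positive", "Negative"));
        for (Map.Entry<String, int[]> e : counts.entrySet()) {
            sb.append(String.format("%-12s %8d %8d%n",
                    e.getKey(), e.getValue()[0], e.getValue()[1]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        FeatureSummary summary = new FeatureSummary();
        summary.add("battery", -1);
        summary.add("battery", -1);
        summary.add("battery", 1);
        summary.add("screen", 1);
        System.out.print(summary.toTable());
    }
}
```

The same counts could equally be passed to a charting component to produce the graph form of the summary.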
1.2 Approaches for Feature-Based Opinion Mining:

Statistical Approach:
One statistical approach proposes a supervised information extraction system that
extracts features and associated opinions. It uses a frequency-based Bayesian classification
technique to calculate the probability distribution. The PMI (Pointwise Mutual Information)
algorithm uses mutual information as a measure of the strength of semantic association
between two words. PMI-IR uses Pointwise Mutual Information (PMI) and Information
Retrieval (IR) to measure the similarity of pairs of words or phrases. The semantic
orientation of a given phrase is calculated by comparing its similarity to a positive reference
word with its similarity to a negative reference word.
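The calculation can be made concrete with a small Java sketch. It is only an illustration under stated assumptions: probabilities are estimated from hit counts (for example, the number of documents returned by a search engine), PMI is taken as log2(p(a, b) / (p(a) p(b))), and the semantic orientation is the difference between the PMI with the positive reference word and the PMI with the negative reference word. The smoothing constant 0.01 and all names are choices made for this example.

```java
// Hypothetical helper computing PMI and semantic orientation from hit counts.
class PmiIr {
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // PMI(a, b) = log2( p(a, b) / (p(a) * p(b)) ), estimated from counts over n documents.
    static double pmi(long hitsBoth, long hitsA, long hitsB, long n) {
        double pBoth = (double) hitsBoth / n;
        double pA = (double) hitsA / n;
        double pB = (double) hitsB / n;
        return log2(pBoth / (pA * pB));
    }

    // SO(phrase) = PMI(phrase, pos) - PMI(phrase, neg)
    //            = log2( hits(phrase near pos) * hits(neg) / (hits(phrase near neg) * hits(pos)) )
    static double semanticOrientation(long nearPos, long nearNeg, long hitsPos, long hitsNeg) {
        double smoothing = 0.01; // assumed constant, avoids division by zero for unseen pairs
        return log2(((nearPos + smoothing) * hitsNeg) / ((nearNeg + smoothing) * hitsPos));
    }

    public static void main(String[] args) {
        // Made-up counts: the phrase co-occurs more often with the positive reference word.
        System.out.println(semanticOrientation(80, 20, 1000, 1000) > 0);
    }
}
```

A positive result indicates that the phrase is more strongly associated with the positive reference word, and a negative result indicates the opposite.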
Intelligent Feature Selection Approach:
The IFS approach uses a feature relation network (FRN). The FRN utilizes two important
syntactic n-gram relations: (1) subsumption and (2) parallel. These two relations occur
between two n-gram feature categories. IFS can also be combined with larger feature sets
for enhanced opinion-classification performance.
2. LITERATURE SURVEY
Balamurali (2011) presents an innovative idea: sense-based sentiment analysis. This
implies shifting from the lexeme feature space to the semantic space, i.e. from simple
words to their synsets. Work in Sentiment Analysis had long concentrated on the lexeme
feature space or on identifying relations between words using parsing. Integrating sense
into Sentiment Analysis became necessary because of the following scenarios, as
identified by the authors:
• A word may have some sentiment-bearing and some non-sentiment-bearing senses.
• There may be different senses of a word that bear sentiment of opposite polarity.
• The same sense can be manifested by different words (appearing in the same synset).
Using senses as features helps to exploit the idea of senses/concepts and the hierarchical
structure of WordNet. The following feature representations were used by the authors, and
their performance was compared to that of lexeme-based features:
• A group of word senses that have been manually annotated (M)
• A group of word senses that have been annotated by an automatic WSD (I)
• A group of manually annotated word senses and words (both separately as features)
(Sense + Words(M))
• A group of automatically annotated word senses and words (both separately as
features) (Sense + Words(I))
Sense + Words (M) and Sense + Words (I) were used to overcome the non-coverage of
WordNet for some noun synsets. The authors used synset-replacement strategies to deal with
non-coverage, in case a synset in a test document is not found in the training documents. In
that case the unknown target synset is replaced with its closest counterpart among the
WordNet synsets by using some metric.
Support Vector Machines were used for classification of the feature vectors, and
IWSD was used for automatic WSD. Extensive experiments were done to compare the
performance of the 4 feature representations with the lexeme representation. The best
performance, in terms of accuracy, was obtained by using sense-based SA with manual
annotation (with an accuracy of 90.2 percent and an increase of 5.3 percent over the
baseline accuracy), followed by Sense(M), Sense + Words(I), Sense(I) and the lexeme
feature representation. LESK was found to perform the best among the 3 metrics used in
the replacement strategies. One of the reasons for the improvements was attributed to
feature abstraction and dimensionality reduction leading to noise reduction. The work
achieved its target of bringing a new dimension to Sentiment Analysis by introducing
sense-based Sentiment Analysis.
3. SYSTEM ANALYSIS
3.1. EXISTING SYSTEM:
Most existing sentiment analysis systems are based on machine learning methods.
These methods often derive sentiment classification in terms of binary labels (positive or
negative). Their common requirement is labeled data to train the classifier. Although these
learning-based methods can be trained on labeled data to create models for different
domains and contexts, the limited availability of labeled data is a major drawback, and the
applicability of the methods to new data is consequently low. Moreover, the sentiment
determined can be outright irrational with respect to an individual’s perception at a given
instant (that is, his or her opinion may be ironic or hyperbolic in a particular context).
Machine learning methods therefore have the further drawback of not filtering such
outliers (irrational opinions) out of the aggregate opinion summary.
Disadvantages of Existing System:
• Most existing systems are optimized to detect attacks with high accuracy. However,
they still have various disadvantages that have been outlined in a number of
publications, and a lot of work has been done to analyze them in order to direct future
research.
• Among others, one drawback is the large number of alerts produced.
3.2. PROPOSED SYSTEM:
The proposed system overcomes the drawbacks of the cost of obtaining labeled data
for different contexts and the limited applicability of machine learning methods to new data
by employing a content-based approach.
Advantages of Proposed System:
• Higher detection accuracy
• Fewer false alarms
• Accurate characterization of traffic behaviors and detection of known and unknown
attacks.
3.3. SYSTEM REQUIREMENTS:
3.3.1 Hardware Requirements:

• PROCESSOR : Intel Core i3
• RAM : 3 GB DDR RAM
• HARD DISK : 40 GB
3.3.2. Software Requirements:
The software requirements document is the specification of the system. It should
include both a definition and a specification of requirements. It is a statement of what the
system should do rather than how it should do it. The software requirements provide a basis
for creating the software requirements specification. It is useful in estimating cost, planning
team activities, performing tasks, and tracking the team’s progress.
• OPERATING SYSTEM : Windows XP / 7 / 10
• WEB TECHNOLOGIES : HTML, CSS, JavaScript
• BACK END : Java (Servlets, JDBC)
• DATABASE : MySQL
• FRONT END : HTML, JSP
4. IMPLEMENTATION
4.1 MODULES:
• Login module
• Sentiment Analysis
• Customer feedback
• Analyzing of neutral sentiments
4.2 MODULES DESCRIPTION:
4.2.1 Login Module:
In this module, the client first registers by entering the required details and creates a
login ID and password for authentication to the application. The registered details are
stored in the centralized MySQL database. The client logs in to the application through the
Login module. The controller then checks the client’s credentials and authenticates the
client. In a similar manner, the administrator/job manager registers with the application
and is authenticated to view and process the client requests.
4.2.2 Sentiment Analysis:
Once the login/registration process is completed, the user can look up any issue
concerning different products through a keyword search, which sends a request to the
centralized MySQL database via the controller. After receiving the request from the client,
the controller responds with the top ten recent or popular comments posted by users on that
particular issue.
Finally, the user is shown the individual sentiment of each comment and a generated
summary of the overall sentiment of the comments. This helps the user learn the opinions
of other users for decision-making purposes.
4.2.3 Customer feedback:
Once after the login/registration process of a user is completed, the user can check for
the sentiment on any product reviews/feedback posted by the users in the application. These
users are able to view the number of people being interested towards the product and vice-
versa through the summary generation report delivered to the user.
4.2.3 Analyzing of neutral sentiments:
The admin has privilege to check for the possibility of status of neutral sentiments, if
they fall in either of positive category classification or negative category classification, since
neutral sentences may consist of words which might not include in the lexical resources. The
neutral sentence grabbed from Centralized MYSQL database is enormous and the syntactical
approach employed by users in quoting their opinions is not good enough.
4.3 Input Design:
The input design is the link between the information system and the user. It comprises
developing specifications and procedures for data preparation and the steps necessary to
put transaction data into a usable form for processing. This can be achieved by instructing
the computer to read data from a written or printed document, or by having people key the
data directly into the system. The design of input focuses on controlling the amount of
input required, controlling errors, avoiding delay, avoiding extra steps, and keeping the
process simple. The input is designed so that it provides security and ease of use while
retaining privacy. Input design considered the following things:
• What data should be given as input?
• How should the data be arranged or coded?
• The dialog to guide the operating personnel in providing input.
• Methods for preparing input validations and the steps to follow when errors occur.
Objectives:
• Input design is the process of converting a user-oriented description of the input into
a computer-based system. This design is important to avoid errors in the data input
process and to show the management the correct direction for getting correct
information from the computerized system.
• It is achieved by creating user-friendly screens for data entry to handle large
volumes of data. The goal of designing input is to make data entry easier and free
from errors. The data entry screen is designed in such a way that all the data
manipulations can be performed. It also provides record viewing facilities.
• When data is entered, it is checked for validity. Data can be entered with the
help of screens. Appropriate messages are provided as and when needed so that the
user is never left in a maze. Thus the objective of input design is to create an input
layout that is easy to follow.
4.4 Output Design:
A quality output is one that meets the requirements of the end user and presents the
information clearly. In any system, the results of processing are communicated to the users
and to other systems through outputs. In output design it is determined how the information
is to be displayed for immediate need, and also as hard copy output. It is the most important
and direct source of information for the user. Efficient and intelligent output design
improves the system’s relationship with the user and helps decision-making.
Designing computer output should proceed in an organized, well thought out manner;
the right output must be developed while ensuring that each output element is designed so
that people will find the system easy and effective to use. When analysts design computer
output, they should:
• Identify the specific output that is needed to meet the requirements.
• Select methods for presenting information.
• Create documents, reports, or other formats that contain information produced by the
system.
The output form of an information system should accomplish one or more of the
following objectives:
• Convey information about past activities, current status, or projections of the future.
• Signal important events, opportunities, problems, or warnings.
• Trigger an action.
• Confirm an action.
5. SYSTEM ARCHITECTURE
Opinion mining, also called sentiment analysis, is the process of finding users’ opinions
about a topic or a product. Opinion mining concludes whether the user’s view is positive,
negative, or neutral about a product, topic, event, etc. The opinion mining and
summarization process involves three main steps: Opinion Retrieval, Opinion
Classification, and Opinion Summarization. Review text is retrieved from review websites.
Opinion text in blogs, reviews, comments, etc. contains subjective information about a
topic. Reviews are classified as positive or negative. An opinion summary is generated
from feature-bearing opinion sentences by considering the frequent features of a topic.
5.1 OPINION RETRIEVAL:

Opinion retrieval is the process of collecting review text from review websites.
Different review websites contain reviews of products, movies, hotels, and news.
Information retrieval techniques such as web crawlers can be applied to collect review text
from many sources and store it in a database. This step involves the retrieval of reviews,
microblogs, and user comments.
5.2 OPINION CLASSIFICATION:

The primary step in sentiment analysis is the classification of review text. Given a set
of review documents D = {d1, ..., dn} and a predefined set of categories C = {positive,
negative}, sentiment classification assigns each di in D a label from C. The approach
classifies review text into two classes, positive and negative. Machine learning and
lexicon-based approaches are the most popular.
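As an illustration of the lexicon-based approach, the following Java sketch assigns each review a label from C = {positive, negative} by counting lexicon words. It is a simplified example, not the project's classifier: the four-word lexicon is made up for the demonstration, negation handling is left out, and a tie is arbitrarily labelled positive because C has no neutral category.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical lexicon-based classifier for review text.
class LexiconClassifier {
    private final Set<String> positiveWords;
    private final Set<String> negativeWords;

    LexiconClassifier(Collection<String> positiveWords, Collection<String> negativeWords) {
        this.positiveWords = new HashSet<>(positiveWords);
        this.negativeWords = new HashSet<>(negativeWords);
    }

    // Adds 1 for every positive lexicon word and subtracts 1 for every negative one.
    String classify(String review) {
        int score = 0;
        for (String token : review.toLowerCase().split("[^a-z]+")) {
            if (positiveWords.contains(token)) {
                score++;
            } else if (negativeWords.contains(token)) {
                score--;
            }
        }
        return score >= 0 ? "positive" : "negative"; // assumption: ties count as positive
    }

    public static void main(String[] args) {
        LexiconClassifier classifier = new LexiconClassifier(
                Arrays.asList("good", "great"), Arrays.asList("bad", "poor"));
        String[] reviews = {"Great phone, good battery", "Bad screen and poor sound"};
        for (String d : reviews) {
            System.out.println(d + " -> " + classifier.classify(d));
        }
    }
}
```

A machine learning classifier would replace the fixed word lists with a model trained on labeled reviews, at the cost of needing that labeled data.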
5.3 OPINION SUMMARIZATION:

Summarization of opinions is a major part of the opinion mining process. The summary
of reviews should be based on the features or subtopics mentioned in the reviews. Much
work has been done on the summarization of product reviews. The opinion summarization
process mainly involves the following two approaches. Feature-based summarization finds
the frequent terms (features) that appear in many reviews; the summary is presented by
selecting sentences that contain information about a particular feature. Features present in
review text can be identified using the Latent Semantic Analysis (LSA) method. Term
frequency is the count of a term’s occurrences in a document; a higher frequency means
that the term is more important for the summary. In many product reviews certain product
features appear frequently and are associated with user opinions about them. Fig. 5.1 shows
the architecture of opinion mining, that is, how the input is classified in various steps to
summarize the reviews.
ARCHITECTURE OF OPINION MINING:
Fig. 5.1 Architecture of Opinion Mining
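The term-frequency counting used in the summarization step can be sketched in Java as follows. This is a minimal sketch under simple assumptions: terms are lower-cased runs of letters, no stop-word removal or stemming is applied (so common words such as "is" are counted too), and ties in frequency are broken alphabetically. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for counting term occurrences in a review document.
class TermFrequency {
    // Counts how often each term occurs in the document.
    static Map<String, Integer> count(String document) {
        Map<String, Integer> tf = new HashMap<>();
        for (String term : document.toLowerCase().split("[^a-z]+")) {
            if (!term.isEmpty()) {
                tf.merge(term, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Returns the k most frequent terms, highest count first.
    static List<String> topTerms(Map<String, Integer> tf, int k) {
        List<String> terms = new ArrayList<>(tf.keySet());
        terms.sort((a, b) -> {
            int byCount = Integer.compare(tf.get(b), tf.get(a)); // higher count first
            return byCount != 0 ? byCount : a.compareTo(b);      // ties alphabetically
        });
        return new ArrayList<>(terms.subList(0, Math.min(k, terms.size())));
    }

    public static void main(String[] args) {
        Map<String, Integer> tf =
                count("Battery is good. Battery life is short, battery drains.");
        System.out.println(topTerms(tf, 2));
    }
}
```

In practice the noun filter from POS tagging would be applied first, so that only candidate product features are ranked by frequency.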
6. SYSTEM STUDY
A feasibility study is conducted once the problem is clearly understood. The feasibility
study is a high-level capsule version of the entire system analysis and design process. The
objective is to determine quickly, and at minimum expense, how to solve a problem. The
purpose of the feasibility study is not to solve the problem but to determine whether the
problem is worth solving.
The system has been tested for feasibility on the following points.
1. Technical Feasibility
2. Economic Feasibility
3. Operational Feasibility
6.1. TECHNICAL FEASIBILITY:

The project entitled “Sentiment Analysis and Opinion Mining: A Survey” is technically
feasible because of the features mentioned below. The project was developed in Java,
which supports a graphical user interface. It provides a high level of reliability, availability,
and compatibility. All of these make Java, an existing and powerful language, appropriate
for this project.
6.2. ECONOMIC FEASIBILITY:
The computerized system will help automate the selection process, leading to profits
for the organization. With this software, machine and manpower utilization are expected to
go up by approximately 80-90%. The cost of not creating the system is considered great,
because precious time would be wasted on manual work.
6.3. OPERATIONAL FEASIBILITY:
In this project, the management can see the details of each project as they are
presented. The data is maintained in a decentralized manner, and any inquiry about a
particular contract can be answered according to the management’s requirements and needs.
7. SOFTWARE ENVIRONMENT
7.1 FEATURES OF JAVA:
7.1.1. Distributed:
Java has an extensive library of routines for coping with TCP/IP protocols like HTTP
and FTP. Java applications can open and access objects across the Net via URLs with the
same ease as when accessing a local file system.
We have found the networking capabilities of Java to be both strong and easy to use.
Anyone who has tried to do Internet programming using another language will revel in how
simple Java makes onerous tasks like opening a socket connection.
7.1.2. Robust:
Java is intended for writing programs that must be reliable in a variety of ways. Java
puts a lot of emphasis on early checking for possible problems, later dynamic (runtime)
checking, and eliminating situations that are error prone. The single biggest difference
between Java and C/C++ is that Java has a pointer model that eliminates the possibility of
overwriting memory and corrupting data.
The Java compiler detects many problems that in other languages would only show up
at runtime. As for the second point, anyone who has spent hours chasing a memory leak
caused by a pointer bug will be very happy with this feature of Java.
Java gives you the best of both worlds. You do not need pointers for everyday
constructs like strings and arrays. You have the power of pointers if you need it, for
example, for linked lists. And you always have complete safety, since you can never access
a bad pointer or make memory allocation errors.
7.1.3. Secure:
Java is intended to be used in networked/distributed environments. Toward that end, a
lot of emphasis has been placed on security. Java enables the construction of virus-free,
tamper-free systems.
Here is a sample of what Java’s security features are supposed to keep a Java program
from doing:
1. Overrunning the runtime stack.
2. Corrupting memory outside its own process space.
3. Reading or writing local files when invoked through a security-conscious class loader
like a Web browser.
7.1.4 Architecture Neutral:
The compiler generates an architecture-neutral object file format: the compiled code is
executable on many processors, given the presence of the Java runtime system. The Java
compiler does this by generating bytecode instructions which have nothing to do with a
particular computer architecture. Rather, they are designed to be both easy to interpret on
any machine and easily translated into native machine code on the fly.
Twenty years ago, the UCSD Pascal system did the same thing in a commercial
product and, even before that, Niklaus Wirth’s original implementation of Pascal used the
same approach. By using bytecodes, performance takes a major hit. The designers of Java
did an excellent job developing a bytecode instruction set that works well on today’s most
common computer architectures. And the codes have been designed to translate easily into
actual machine instructions.
7.1.5 Portable:
Unlike C and C++, there are no "implementation dependent" aspects of the
specification. The sizes of the primitive data types are specified, as is the behavior of
arithmetic on them.
For example, an int in Java is always a 32-bit integer. In C/C++, int can mean a 16-bit
integer, a 32-bit integer, or any size the compiler vendor likes. The only restriction is that it
must have at least as many bytes as a short int and cannot have more bytes than a long int.
The libraries that are a part of the system define portable interfaces. For example,
there is an abstract Window class and implementations of it for UNIX, Windows, and the
Macintosh.
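The fixed sizes of the primitive types can be checked from a program. The following is a minimal sketch, not part of the project code; the class name PrimitiveSizes and its helper methods are chosen here only for illustration.

```java
class PrimitiveSizes {
    // Width of int in bits, fixed by the Java language on every platform
    static int intBits() {
        return Integer.SIZE;
    }

    // Width of long in bits, also fixed by the language
    static int longBits() {
        return Long.SIZE;
    }

    // Integer arithmetic is specified too: adding 1 to the largest int
    // wraps around to the smallest int on every platform
    static int maxPlusOne() {
        return Integer.MAX_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println("int bits:  " + intBits());   // 32
        System.out.println("long bits: " + longBits());  // 64
        System.out.println(maxPlusOne() == Integer.MIN_VALUE); // true
    }
}
```

Because these values come from the language specification rather than from the compiler vendor, the same output is expected on UNIX, Windows, and the Macintosh.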
7.1.6 Interpreted:
The Java interpreter can execute Java bytecodes directly on any machine to which
the interpreter has been ported. Since linking is a more incremental and lightweight process,
the development process can be much more rapid and exploratory.
One problem is that, in the current version, the JDK is fairly slow at compiling your source
code to the bytecodes that will ultimately be interpreted.
7.1.7. High Performance:
While the performance of interpreted bytecodes is usually more than adequate, there
are situations where higher performance is required. The bytecodes can be translated on the
fly into machine code for the particular CPU the application is running on.
Native code compilers for Java are not yet generally available. Instead there are
just-in-time (JIT) compilers. These work by compiling the bytecodes into native code once,
caching the results, and then calling them again if needed. This speeds up loops tremendously,
since the interpretation has to be done only once. Although still slightly slower than a true
native code compiler, just-in-time compilers can give you a 10- or even 20-fold speedup for
some programs and will almost always be significantly faster than the Java interpreter.
7.1.8 Dynamic:
In a number of ways, Java is a more dynamic language than C or C++. It was designed
to adapt to an evolving environment. Libraries can freely add new methods and instance
variables without any effect on their clients.... In Java, finding out runtime type information
is straightforward.
This is an important feature in those situations where code needs to be added to a
running program. A prime example is code that is downloaded from the Internet to run in a
browser.
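As an illustration of how runtime type information can be obtained, the sketch below uses the standard getClass() method and the instanceof operator. The class name RuntimeTypeDemo and the sample values are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;

class RuntimeTypeDemo {
    // Returns the fully qualified name of the object's runtime class
    static String describe(Object value) {
        return value.getClass().getName();
    }

    public static void main(String[] args) {
        Object text = "great product";
        Object number = 42;                       // autoboxed to Integer
        List<String> reviews = new ArrayList<>();

        System.out.println(describe(text));       // java.lang.String
        System.out.println(describe(number));     // java.lang.Integer
        System.out.println(describe(reviews));    // java.util.ArrayList
        System.out.println(reviews instanceof List); // true
    }
}
```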
7.2. THE JAVA FRAMEWORK:
The framework of Java involves the following:
7.2.1 Java Packages:
Packages are used in Java to prevent naming conflicts, to control access, and to
make searching for, locating, and using classes, interfaces, enumerations and annotations
easier.
A package can be defined as a grouping of related types (classes, interfaces,
enumerations and annotations) providing access protection and namespace management.
Some of the existing packages in Java are:
• java.lang - bundles the fundamental classes
• java.io - bundles the classes for input and output functions
Programmers can define their own packages to bundle groups of
classes, interfaces, etc. It is good practice to group related classes that you implement
so that a programmer can easily determine that the classes, interfaces,
enumerations and annotations are related. Since a package creates a new namespace,
there won't be any name conflicts with names in other packages. Using packages, it is
easier to provide access control and it is also easier to locate the related classes.
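The two packages named above can be seen in use in the short sketch below. It is an illustration only; the class name PackageDemo and its helper methods are hypothetical and do not come from the project code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class PackageDemo {
    // Returns the name of the package a class belongs to
    static String packageOf(Class<?> type) {
        return type.getPackage().getName();
    }

    // Reads the first line of a text using classes from java.io
    static String firstLine(String text) throws IOException {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // java.lang is imported automatically; java.io must be imported explicitly
        System.out.println(packageOf(String.class));         // java.lang
        System.out.println(packageOf(BufferedReader.class)); // java.io
        System.out.println(firstLine("good phone\nbad battery")); // good phone
    }
}
```

A user-defined package is declared in the same spirit: the source file starts with a package statement (for example a hypothetical `package com.example.reviews;`), and other code then refers to its classes through an import statement.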
7.3. HTML Document:
An HTML document consists of HTML tags and some text.
HTML Tags
1. HTML tags are used to mark up HTML elements.
2. HTML tags are surrounded by the < and > characters. These characters are called angle
brackets.
3. HTML tags usually come in pairs. That is, an element has a starting tag and an ending tag.
4. The text between the starting tag and the ending tag is called the element content.
5. HTML tags are predefined tags.
6. HTML tags are not case sensitive. That is, upper-case and lower-case tag names are
treated the same.
7. The <html> tag in a document indicates that the document is an HTML document.
8. The entire HTML document must be written between the starting tag <html> and the
ending tag </html>.
9. An HTML document is divided into two sections.
(I) Head section:
This section is used to provide general information about the HTML document. This section is
represented by <head>.
Ex: Title, Meta etc.
(II) Body section:
This section is used to display text and images in the browser. This section is
represented by <body>.
HTML document structure:
<html>
<head>
<title></title>
</head>
<body>
Welcome to any web page
</body>
</html>
HTML comments:
Comments are used to make the code more readable or to explain the
code.
HTML comments begin with <!-- and end with -->.
EX :
<hr />
<center>
<br><br><br><br><br><br><br><br><br>
<pq> welcome to </pq><pp> ${uname} (admin)</pp><br/><br/><br/><br/>

<ul>
<li><a href="ProductRegistration.jsp">Product Registration</a></li><br/>
<li><a href="DeleteProduct.jsp">Delete the product</a></li><br/>
</ul>
</div>
</center>
<hr />
<div id="footer">
<p>(c) For College. All rights reserved. </p></div></body></html>

10. SYSTEM TESTING
The testing phase is an important part of software development. It is the process of
finding errors and missing operations, and also a complete verification to determine whether
the objectives are met and the user requirements are satisfied.
Software testing is carried out in three steps:
The first step is unit testing, wherein each module is tested to prove its
correctness and validity, to find any missing operations, and to verify whether the
objectives have been met. Errors are noted down and corrected immediately. Unit testing is
an important and major part of the project, because errors are rectified easily within a
particular module and program clarity is increased. In this project the entire system is
divided into several modules that are developed individually, so unit testing is conducted
on the individual modules.
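A unit test for a single module might look like the sketch below. It is an assumption-laden illustration, not the project's actual test code: the class SentimentScorer, its tiny word lists, and the check helper are all hypothetical, and a real project would more likely use a testing framework.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SentimentScorer {
    // Hypothetical, deliberately tiny word lists used only for this sketch
    private static final Set<String> POSITIVE =
            new HashSet<>(Arrays.asList("good", "great", "excellent"));
    private static final Set<String> NEGATIVE =
            new HashSet<>(Arrays.asList("bad", "poor", "terrible"));

    // Score = number of positive words minus number of negative words
    static int score(String review) {
        int score = 0;
        for (String word : review.toLowerCase().split("\\W+")) {
            if (POSITIVE.contains(word)) {
                score++;
            } else if (NEGATIVE.contains(word)) {
                score--;
            }
        }
        return score;
    }

    // Maps the numeric score to a sentiment label
    static String label(String review) {
        int s = score(review);
        if (s > 0) return "positive";
        if (s < 0) return "negative";
        return "neutral";
    }

    // Fails loudly when an expectation about the module is not met
    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    public static void main(String[] args) {
        // Each check exercises the module in isolation
        check(label("Great phone, excellent battery").equals("positive"), "positive case");
        check(label("Terrible screen and poor sound").equals("negative"), "negative case");
        check(label("It arrived on Monday").equals("neutral"), "neutral case");
        check(score("good but bad") == 0, "mixed words cancel out");
        System.out.println("All unit checks passed");
    }
}
```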
The second step is integration testing. Software whose modules show perfect results
when run individually will not necessarily show perfect results when run as a whole.
This is due to poor interfacing, which may result in data being lost across an interface.
A module can also have an inadvertent, adverse effect on another module or
on the global data structures, causing serious problems. The individual modules are
therefore combined under one major module, tested again, and the results verified.
The final step is validation testing, which determines whether the software
functions as the user expected. Here also some modifications were made. On completion of the
project, the end user was fully satisfied.
Maintenance and Enhancement:
As the number of computer-based systems grew, libraries of computer software
began to expand. In-house developed projects produced tens of thousands of program
source statements. Software products purchased from the outside added hundreds of
thousands of new statements. A dark cloud appeared on the horizon. All of these programs,
all of those source statements, had to be corrected when faults were detected, modified as user
requirements changed, or adapted to new hardware that was purchased. These activities were
collectively called software maintenance.
The maintenance phase focuses on change that is associated with error correction,
adaptations required as the software's environment evolves, and changes due to enhancements
brought about by changing customer requirements. Four types of change are encountered
during the maintenance phase:
• Correction
• Adaptation
• Enhancement
• Prevention
Correction:
Even with the best quality assurance activities, it is likely that the customer will
uncover defects in the software. Corrective maintenance changes the software to correct
defects.
Maintenance is a set of software engineering activities that occur after software has
been delivered to the customer and put into operation. Software configuration management is
a set of tracking and control activities that begin when a software project begins and
terminate only when the software is taken out of operation.
We may define maintenance by describing four activities that are undertaken after a
program is released for use:
• Corrective Maintenance
• Adaptive Maintenance
• Perfective Maintenance or Enhancement
• Preventive Maintenance or Reengineering
Only about 20 percent of all maintenance work is spent "fixing mistakes". The
remaining 80 percent is spent adapting existing systems to changes in their external
environment, making enhancements requested by users, and reengineering an application for
future use.
Adaptation:
Over time, the original environment (e.g., CPU, operating system, business rules,
external product characteristics) for which the software was developed is likely to change.
Adaptive maintenance results in modification to the software to accommodate changes to its
external environment.
Enhancement:
As software is used, the customer/user will recognize additional functions that will
provide benefit. Perfective maintenance extends the software beyond its original functional
requirements.
Prevention:
Computer software deteriorates due to change, and because of this, preventive
maintenance, often called software re-engineering, must be conducted to enable the
software to serve the needs of its end users. In essence, preventive maintenance makes
changes to computer programs so that they can be more easily corrected, adapted, and
enhanced.

11. RESULT SCREENSHOTS
11.1 Home Page:
Fig.11.1 Home Page

11.2 Product:
Fig.11.2 Product

11.3 Login Page:
Fig.11.3 Login Page

11.4 Register Page:
Fig.11.4 User Register

11.5 Products:
Fig.11.5 All Products

11.6 Product Rating:
Fig.11.6 User Product Rating

11.7 Admin Login:
Fig.11.7 Admin Login

11.8 Admin Page:
Fig.11.8 Admin Page

11.9 Product Registration:
Fig.11.9 Admin Product Registration

11.10 Products of Admin:
Fig.11.10 Admin Registered Products

12. CONCLUSION
It is difficult to overestimate the importance of computer-aided opinion analysis solutions.
Individual consumers search for online opinions manually, using standard search engines.
The time required for manual searching makes this solution unattractive for
companies that are interested in large-scale, automatic opinion searching and opinion
processing.
The large number of approaches to automatic text analysis can make choosing the right
alternative difficult. Literature research and the authors’ experience show that in the opinion
mining field the following factors influence the methods and tools that are used:
• Expected effectiveness – it seems that machine learning and sentence-based solutions
have higher effectiveness than solutions based on a frequency matrix or regular
expressions,
• Time for the design and implementation process – traditional models based on the
bag-of-words representation are favored,
• Domain characteristics – for narrow fields of interest it is possible to design simple
mining methods; for broad domains more complex and more advanced solutions
must be used,
• Cost of system design and implementation – more advanced solutions require
greater expenditures.
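The bag-of-words representation mentioned in the factors above can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the class name BagOfWords, the tokenization rule (split on non-word characters, lower-case everything), and the sample review are chosen here and are not taken from the project code.

```java
import java.util.Map;
import java.util.TreeMap;

class BagOfWords {
    // Counts how often each lower-cased word occurs; word order is discarded
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {bad=1, battery=1, camera=1, good=2, price=1}
        System.out.println(count("Good camera, good price, bad battery"));
    }
}
```

The simplicity of this representation is the reason it is quick to design and implement; the price is that word order and sentence structure are lost, which is where sentence-based methods may do better.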

13. FUTURE ENHANCEMENT
An opinion word can express different meanings when used in different domains and
might raise complex disambiguation problems, which lead to misclassification by the
classifier. In addition, translating natural languages such as Chinese, Arabic, and
European languages into a machine-processable form is a complex process for linguistic
approaches. Solving these problems is a good challenge for opinion mining and sentiment
analysis, and hence future work is progressing in the area of new classification algorithms
and linguistic approaches.
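The domain dependence of opinion words can be made concrete with a small sketch. Everything in it is hypothetical: the class DomainLexicon, the domain names, and the polarity values assigned to the word "unpredictable" are assumptions chosen only to illustrate why one shared word list may misclassify.

```java
import java.util.HashMap;
import java.util.Map;

class DomainLexicon {
    // Maps a domain name to that domain's own word polarities
    private final Map<String, Map<String, Integer>> polarityByDomain = new HashMap<>();

    // Registers a polarity (+1 positive, -1 negative) for a word in one domain
    void put(String domain, String word, int polarity) {
        polarityByDomain.computeIfAbsent(domain, d -> new HashMap<>()).put(word, polarity);
    }

    // Returns the polarity of the word in the given domain, or 0 if unknown
    int polarity(String domain, String word) {
        Map<String, Integer> words = polarityByDomain.get(domain);
        if (words == null) {
            return 0;
        }
        return words.getOrDefault(word, 0);
    }

    public static void main(String[] args) {
        DomainLexicon lexicon = new DomainLexicon();
        // Assumed polarities: the same word is praise for a film plot
        // but a complaint about a car's steering
        lexicon.put("movies", "unpredictable", 1);
        lexicon.put("cars", "unpredictable", -1);

        System.out.println(lexicon.polarity("movies", "unpredictable")); // 1
        System.out.println(lexicon.polarity("cars", "unpredictable"));   // -1
        System.out.println(lexicon.polarity("books", "unpredictable"));  // 0
    }
}
```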


