

Semantic Vector Space Model for Text Classification using

Progressive Learning Network Algorithm


Shikhar Pathak, Harsh Aggarwal, Lakshit Sharma, Mr. Roshan Lal
Department of Computer Science and Engineering, Amity School of Engineering and Technology, Noida, Uttar Pradesh, India

Abstract
The exponential increase in digital content necessitates advanced methods to efficiently classify and organize extensive documents. While a myriad of techniques have been proposed for text classification, a significant portion of them are tailored predominantly to English texts. Our current research introduces an innovative classification method using neural networks, specifically designed to overcome existing classification challenges. In our model, we integrate both sentence-level and lexical-level data, ensuring that deeper linguistic nuances are captured. Additionally, we have refined the process of dynamic routing by using context derived from the sentence states. What sets our model apart is its ability to sidestep the intricate process of optimizing individual node characteristics and to directly ascertain the width of the receptive area. As a further improvement, we have incorporated language models into graphs to extract a more comprehensive level of semantic information that does not rely on visual elements.
Keywords: Neural network, Text classification, Digital text, Graph node embedding.

Experimental Results
Beyond semantic attributes, the three other statistical measures derived from the dataset diverge significantly from the weights apt for straightforward classification, underscoring the necessity of this particular feature set. We employed normalization to align the attributes.
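The attribute-alignment step mentioned in the Experimental Results can be illustrated with a simple min-max normalization. This is a generic sketch, not the authors' exact procedure; the function name and sample values are illustrative:

```python
def min_max_normalize(values):
    """Rescale a list of numeric attribute values into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: map everything to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Example: statistics such as word counts can sit on very different
# scales before alignment (sample numbers are made up).
word_counts = [120, 45, 300, 87]
print(min_max_normalize(word_counts))
```

Rescaling each attribute to a common range keeps features with large raw magnitudes from dominating the classifier's weights.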
Proposed Technique
The Progressive Learning Network Algorithm is a deep learning technique designed for ongoing learning, encompassing phases of learning, advancement, and termination. Within a set of tasks, one is chosen for analysis, typically employing the learning method. The model's capacity is incrementally expanded using the augmentation technique, incorporating new parameters derived from previously unstudied tasks. This expansion allows the model to learn from the current task's data without degrading what was learned from earlier tasks.

The E-Shop Text Dataset serves as a comprehensive repository of e-commerce-related textual data, meticulously organized into taxonomic categories that span a diverse range of product offerings. These categories include Electronics, Home, Books, and Clothing & Accessories, providing a nuanced representation of the e-commerce landscape. It is worth noting that the dataset covers nearly one percent of all existing e-commerce platforms, offering a substantial and representative sample of the broader online retail ecosystem.

Fig.2: Positive and negative sentiment distribution
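The expand-without-overwriting idea behind the capacity augmentation step can be sketched as follows. This is a schematic illustration of per-task parameter growth, not the actual network; the class and method names are hypothetical:

```python
import random

class ProgressiveModel:
    """Toy progressive learner: each new task adds its own parameter
    block, while blocks learned for earlier tasks are left untouched."""

    def __init__(self):
        self.task_params = {}  # task name -> list of parameters

    def expand(self, task, width):
        # Augmentation step: allocate `width` fresh parameters for the
        # selected task; existing blocks are never modified.
        if task in self.task_params:
            raise ValueError(f"task {task!r} already has parameters")
        self.task_params[task] = [random.gauss(0.0, 0.1) for _ in range(width)]

    def capacity(self):
        return sum(len(p) for p in self.task_params.values())

model = ProgressiveModel()
model.expand("task_1", width=8)
frozen = list(model.task_params["task_1"])  # snapshot before expansion
model.expand("task_2", width=4)
assert model.task_params["task_1"] == frozen  # old task untouched
print(model.capacity())  # 12
```

Because each task only ever appends parameters, training on a new task cannot overwrite weights that earlier tasks depend on.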

Fig. 4: Results of Fading Channels

Fig.1. Flow Diagram of Proposed Algorithm
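The E-Shop dataset described in the Experimental Results column is a two-column .csv (category label first, product text second) and can be read with Python's standard csv module. The inline sample rows below are assumptions matching that described layout, not actual dataset contents:

```python
import csv
import io

# Stand-in for the real dataset file: label column first, text column
# second, as described in the Experimental Results column.
sample = io.StringIO(
    "Electronics,Wireless mouse with USB receiver\n"
    "Books,Hardcover edition of a classic novel\n"
    "Clothing & Accessories,Cotton crew-neck t-shirt\n"
)

labels, texts = [], []
for row in csv.reader(sample):
    labels.append(row[0])
    texts.append(row[1])

print(labels)  # ['Electronics', 'Books', 'Clothing & Accessories']
```

For the real file, `io.StringIO` would be replaced by `open(...)` on the dataset path.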


Fig. 3: Pie chart illustrating various emotional tones conveyed in headlines

Text Preprocessing
During the preprocessing phase, filtering is essential. This involves removing less relevant portions of the text, including punctuation marks such as ".", "!", ";" and other symbols. Eliminating these elements enhances the overall precision of classification. To facilitate this process, the Natural Language Toolkit (NLTK) is employed. Once the text is cleaned, a word count is conducted. Feature selection refers to the choice of variables, identifying the right attributes with which to build an effective model, aiming to achieve higher accuracy in the outcome.

Structured in .csv (Comma-Separated Values) format, the dataset is composed of two distinct columns. The first column is dedicated to classifying the type of product, employing a taxonomy that categorizes items into specific genres, ensuring a systematic and organized arrangement of information. The second column contains the associated content, which primarily comprises product names and their detailed descriptions.

Feature           Precision   Recall     F1 score   Support
Word Embedding    0.957210    0.860020   0.906135   None
TF-IDF            0.984989    0.880757   0.931204   None
Table 1: Feature extraction in text

Fig 5: Confusion matrix

Conclusions
In our thorough investigation, we delved into the complexities, obstacles, and potential remedies in text categorization. Despite the apparent simplicity of the task in certain languages, we emphasized the crucial role of refining preprocessing stages to significantly boost categorization effectiveness. A significant aspect of our work involved introducing a unique word embedding technique tailored to identify semantic connections between terms at the sub-word level. This inventive embedding approach demonstrated exceptional proficiency in handling typos and spelling irregularities arising from segmentation errors.
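The punctuation filtering described under Text Preprocessing and the TF-IDF weighting compared in Table 1 can be sketched in a dependency-free way. The poster itself uses NLTK; the tokenizer and formulas below are a generic approximation, not the authors' exact pipeline:

```python
import math
import string

def clean_tokens(text):
    """Strip punctuation such as . ! ; and lowercase the text."""
    table = str.maketrans("", "", string.punctuation)
    return text.translate(table).lower().split()

def tf_idf(docs):
    """Plain TF-IDF: term frequency times inverse document frequency."""
    tokenized = [clean_tokens(d) for d in docs]
    n = len(tokenized)
    df = {}  # document frequency per term
    for toks in tokenized:
        for term in set(toks):
            df[term] = df.get(term, 0) + 1
    weights = []
    for toks in tokenized:
        w = {}
        for term in toks:
            tf = toks.count(term) / len(toks)
            idf = math.log(n / df[term])
            w[term] = tf * idf
        weights.append(w)
    return weights

docs = ["Great phone, great battery!", "Boring book; weak plot."]
w = tf_idf(docs)
print(sorted(w[0], key=w[0].get, reverse=True))
```

Terms frequent in one document but rare across the corpus receive the highest weights, which is why TF-IDF serves as a strong baseline feature set in Table 1.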

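The sub-word matching highlighted in the Conclusions, which tolerates typos and segmentation errors, can be approximated with fastText-style character n-grams. This is a minimal sketch of the general idea, not the authors' embedding technique:

```python
def char_ngrams(word, n=3):
    """FastText-style sub-word units: pad the word with boundary
    markers, then slide a window of size n across it."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def subword_overlap(a, b, n=3):
    """Jaccard overlap of n-gram sets: robust to small spelling errors."""
    ga, gb = set(char_ngrams(a, n)), set(char_ngrams(b, n))
    return len(ga & gb) / len(ga | gb)

# A typo still shares most sub-word units with the correct spelling:
print(subword_overlap("classification", "clasification"))
```

Whole-word lookups treat a misspelling as an unknown token, whereas sub-word units let the misspelled form land close to the intended word.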