Unit 5-1
Social network mining, also known as social network analysis (SNA), is the
process of analyzing and extracting meaningful insights from social networks.
These insights can be used for various applications across different domains.
Here's a detailed exploration of the introduction and applications of social network
mining:
Recommendation Systems:
Personalized Recommendations: Tailoring recommendations to
individual users by considering their social network connections,
preferences, and behaviors.
Online Communities and Forums:
Data Privacy and Ethics: Social network mining raises concerns about
data privacy, user consent, and ethical implications of analyzing personal
information.
Data Quality and Bias: Social network data may suffer from quality issues,
including noise, bias, and missing data, which can affect the accuracy and
reliability of analysis results.
daily basis; they just
relate to our daily activities, which are quite predictable. This is what Social
Media Mining is: big companies collect data indirectly and use it to
improve our quality of life. But is this good or bad? Can it be harmful
to customers or helpful in some way?
There are numerous questions that might come to your mind, as many people are
not aware of the concept of Social Media Mining.
This article is for you. In this article, we’ll be discussing Social
Media Mining and the key benefits it offers. So, without further ado,
let’s get started!
the users, as this plays a major role in today’s world. Social Media
Mining has become a must-have technique in every business. Here are some
of the benefits you can derive from using Social Media Mining:
It is a process that starts with identifying the target audience and ends with
digging into what they are passionate about. Businesses may analyze the
keywords, search results, comments, and mentions to
identify the current trend, and a deeper study of behavior change can
also help in predicting future trends. This data is very useful for
businesses to make informed decisions when the stakes are high.
2. Sentiment Analysis
Sentiment Analysis is the process of identifying positive or negative sentiments
portrayed in information posted on social media platforms. Businesses use Social
Media Mining to identify the same sentiments associated with their brand and
product lines.
When combined with social media monitoring, sentiment analysis can help you
analyze your brand image and bring negative aspects
of the business to your attention. With this information, you can
prioritize the negative sentiments and address them properly to improve
the customer experience.
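To make the idea concrete, here is a minimal sketch of word-list (lexicon) based sentiment scoring. The two word lists, the function names, and the sample posts are all illustrative assumptions; production systems typically rely on much larger lexicons or trained models.

```python
import re

# Hypothetical mini-lexicons; real projects use far larger curated word lists.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}


def sentiment_score(post: str) -> int:
    """Return (number of positive words) - (number of negative words)."""
    words = re.findall(r"[a-z']+", post.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(post: str) -> str:
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(post)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up posts mentioning a brand.
posts = [
    "I love the new phone, the camera is excellent",
    "Terrible support and slow delivery",
    "Ordered it yesterday",
]
labels = [label(p) for p in posts]

# Negative posts are the ones a business would want to look at first.
negative_posts = [p for p, l in zip(posts, labels) if l == "negative"]
```

Counting words is crude (it ignores negation and sarcasm, for instance), but it shows how negative mentions can be surfaced and prioritized.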
3. Keyword Identification
In a world where more than 90% of businesses function online, the importance of
using the right words
cannot be emphasized enough. A business has to stand out to compete
in a world where its sales team cannot charm customers with their looks
and cheesy talk.
Keywords can give your business an edge over its competitors.
Keywords are those words that reveal the behavior of users and highlight the
frequently used and popular terms related
to their products. Social Media Data Mining can be highly effective in
finding these keywords. The process is as basic as
scanning the list of the most frequent words or phrases used by customers to
search for or define your product.
Using these keywords to define your product in digital media and implementing
SEO can yield pretty good results. Your product will rank higher, and by
implementing frequent and popular terms, you can make your product listings
better.
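The "scan the most frequent words" step described above could be sketched as follows. The stop-word list, the function name `top_keywords`, and the sample posts are assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical stop-word list; extend it for real data.
STOP_WORDS = {"the", "a", "an", "is", "for", "and", "to", "my", "this", "with", "i"}


def top_keywords(posts, n=3):
    """Return the n most frequent non-stop-words across all posts."""
    counts = Counter()
    for post in posts:
        for word in re.findall(r"[a-z]+", post.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(n)


# Made-up customer posts about a product.
posts = [
    "Looking for a wireless headphones deal",
    "These wireless headphones are great for running",
    "Best headphones for running?",
]
keywords = top_keywords(posts, n=3)
```

The resulting terms are candidates for product titles and descriptions; a real pipeline would probably also handle phrases, plurals, and misspellings.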
seeking advice and opinions from millions of users.
By using so much data, you are essentially tweaking your product in
such a way that the probability of its success is very high. By
analyzing the userbase information, you can
target the social media platform with the highest number of users.
5. Competitor Analysis
You are not wrong to assume that your competitors are
already using Data Mining techniques to monitor the market. To
compete with them, it becomes essential to analyze what they are doing.
Improving yourself by analyzing others’ mistakes is often less painful
than learning from your own.
6. Event Identification
Also known as Social Heat Mapping, this technique is a part of
Social Media Mining that helps researchers and agencies to be prepared for
unexpected outbursts.
An excellent example of implementing heat mapping on social media was seen
during the Farmer Protests. During the protest, huge crowds were approaching
the venue of the Republic Day celebration.
department identify big issues as they use heat mapping or any other
technique to access social media sources. They detect the events and
figure out information faster than traditional sensor approaches. Many
users publish the information using their cell phones, so event
identification is real-time and up to date. Organizations can respond
faster as people share information during disasters or social events.
9. Recognize Behavior
Social media mining analyzes our real behavior even
when we are not present and helps to learn about humans. Organizations
use some techniques to understand customers. The government provides
facilities for companies to identify the right members and scientists to
explain the events. Therefore, social media mining helps us understand
how events link together in ways that we may not have noticed earlier.
(e.g., user ID) and may contain additional attributes such as user
demographics, interests, or behavior.
4. Community Detection:
medium of interaction. Here are some common types of social networks:
3. Interest-Based Networks:
Photo Sharing Networks: These networks focus on sharing and
discovering photos, images, and visual content. Users can upload, edit,
and share photos with their network of friends or followers. Examples
include Instagram and Flickr.
These are just a few examples of the diverse types of social networks that exist,
each serving different purposes and catering to various user needs and
preferences. The social networking landscape continues to evolve, with new
platforms emerging and existing platforms evolving to meet changing trends and
demands.
What are recommender systems?
First things first, let’s define recommender systems.
Recommender systems are sophisticated algorithms designed to provide product-
relevant suggestions to users.
Recommender systems play a paramount
role in enhancing user experiences on various online platforms,
including e-commerce websites, streaming services, and social media.
Essentially, recommender systems aim to analyze user data and behavior to make
tailored recommendations.
Data processing: Once collected, they process the data to extract meaningful
patterns
and insights. This involves techniques like data cleaning,
transformation, and feature engineering.
Item profiling: Similarly, items or content available on the platform are also
profiled based on their characteristics. Think of attributes like genres,
keywords, or product features.
profiles. For example, collaborative filtering identifies users with
similar preferences and recommends items liked by others with similar
profiles. Content-based filtering recommends items based on the
attributes of items users have previously interacted with.
Ranking and presentation: Finally, the recommended items are ranked based
on their relevance to
the user. The top-ranked items are then presented to the user through
interfaces like recommendation lists, personalized emails, or pop-up
suggestions.
Top-10 movies,
New products.
However, non-personalized
recommendation systems have their limitations, including the inability
to provide highly tailored recommendations. They may be a good option
for a first step in the process of personalization, but you shouldn’t
stop there.
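A non-personalized "top-N" list can be as simple as counting purchases. The sketch below assumes a hypothetical log of `(user, item)` pairs; the function name and the data are made up for illustration.

```python
from collections import Counter


def top_n_popular(purchases, n=10):
    """Rank items by how many purchase events they appear in."""
    counts = Counter(item for _user, item in purchases)
    return [item for item, _count in counts.most_common(n)]


# Hypothetical (user, item) purchase log.
purchases = [
    ("u1", "book_a"), ("u2", "book_a"), ("u3", "book_a"),
    ("u1", "book_b"), ("u2", "book_b"),
    ("u3", "book_c"),
]
top_items = top_n_popular(purchases, n=2)
```

Every user sees the same list here, which is exactly the limitation described above.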
Once you gather enough data about the user in question, personalized offers and
recommendations are the logical next step.
This is especially important if you
don’t want to put off your potential buyer by failing to recognize what
they like and what to recommend next. Or even worse, by recommending a
product they have already bought.
This can all be handled well with a suitable personalized recommender system.
Doesn’t recommend the same items to users for the second time, and
techniques for providing tailored recommendations.
These include:
Content-based filtering,
Collaborative filtering,
Hybrid recommenders.
Content-based filtering
Content-based recommender systems use
item or user metadata to create specific recommendations. To do this,
we look at the user’s purchase history.
For example, if a user has already
read a book by a certain author or bought a product from a certain brand,
you assume that they have a preference for that author or that brand, and
that they are likely to buy a similar product in the future.
Aristoi book, then her recommended book will be Angel Station, also a
sci-fi book written by Walter Jon Williams.
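A rough sketch of this kind of metadata matching is shown below. It uses Jaccard similarity over attribute tags, which is only one possible choice. "Aristoi" and "Angel Station" come from the example above; the third title, the tag format, and the function names are assumptions added for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Share of attributes two items have in common."""
    return len(a & b) / len(a | b) if a | b else 0.0


# Item metadata. The first two titles are from the example in the text;
# "Hypothetical Cookbook" and all attribute tags are illustrative assumptions.
books = {
    "Aristoi": {"author:Walter Jon Williams", "genre:sci-fi"},
    "Angel Station": {"author:Walter Jon Williams", "genre:sci-fi"},
    "Hypothetical Cookbook": {"author:Someone Else", "genre:cooking"},
}


def recommend_similar(read_title, catalog, n=1):
    """Rank the other items by metadata similarity to the item the user read."""
    profile = catalog[read_title]
    scored = [
        (jaccard(profile, attrs), title)
        for title, attrs in catalog.items()
        if title != read_title
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _score, title in scored[:n]]


recommended = recommend_similar("Aristoi", books)
```

Because only item attributes are compared, no data from other users is needed, but the suggestions stay close to what the user has already read.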
The “Filter bubble”: Content filtering can recommend only content similar to
the user’s past preferences. If a user reads a book about a political ideology
and books related to that ideology are recommended to them, they will be in
the “bubble of their previous interests”.
In the first case scenario, 20% of items attract the attention of 70-80%
of users and 70-80% of items attract the attention of 20% of users. The
recommender’s goal is to introduce other products that are not visible to
users at first glance.
Collaborative filtering
Collaborative filtering is a popular
technique used to provide personalized recommendations to users based on
the behavior and preferences of similar users.
The fundamental idea behind
collaborative filtering is that users who have interacted with items in
similar ways or have had similar preferences in the past are likely to
have similar preferences in the future, too.
Collaborative filtering relies on the collective wisdom of the user community to
generate recommendations.
There are two main types of collaborative filtering: memory-based and model-
based.
Memory-based recommenders
Memory-based recommenders rely on the direct similarity between users or items
to make recommendations.
Usually, these systems use raw,
historical user interaction data, such as user-item ratings or purchase
histories, to identify similarities between users or items and generate
personalized recommendations.
The biggest disadvantage of
memory-based recommenders is that they require a lot of data to be
stored and comparing every item/user with every item/user is extremely
computationally demanding.
Memory-based recommenders can be categorized into two main types: user-based
and item-based collaborative filtering.
User-based
Jenny buys that book, that same book will be recommended to Tom, since
he also likes sci-fi books.
Item-based
e.g. whether the user liked or rated an item or whether the item was
liked or rated by a certain user.
For example, the idea is to recommend Robert the new sci-fi book. Let’s look at
the steps in this process:
Cosine similarity
1. Look up similar users: In the user-user matrix, we observe users that are most
similar to Robert.
3. Candidate scoring: Depending on the other users’ ratings, books are ranked
from the ones
they liked the most, to the ones they liked the least. The results are
normalized on a scale from 0 to 1.
The item-item similarity calculation is done in an identical way and has all the
same steps as user-user similarity.
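The user-user process could be sketched roughly as below. The ratings are made up, and since the second step is not spelled out here, the candidate-collection step in the code is an assumption about what it does. Scores are scaled into the 0 to 1 range by dividing by the best score.

```python
import math

# Hypothetical ratings (1-5); a missing key means "not rated".
ratings = {
    "Robert": {"Dune": 5, "Neuromancer": 4},
    "Jenny": {"Dune": 5, "Neuromancer": 5, "Aristoi": 5, "Hyperion": 3},
    "Tom": {"Dune": 1, "Cooking 101": 5, "Hyperion": 2},
}


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity of two sparse rating vectors."""
    dot = sum(u[i] * v[i] for i in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(target, ratings, k=2):
    # 1. Look up the k users most similar to the target user.
    sims = {
        other: cosine(ratings[target], ratings[other])
        for other in ratings if other != target
    }
    neighbours = sorted(sims, key=sims.get, reverse=True)[:k]

    # 2. (Assumed step) Collect items the neighbours rated but the target
    #    has not, weighting each rating by the neighbour's similarity.
    scores = {}
    for other in neighbours:
        for item, rating in ratings[other].items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sims[other] * rating

    # 3. Candidate scoring: scale into the 0-1 range and rank best first.
    top = max(scores.values(), default=0.0)
    if top > 0:
        scores = {item: s / top for item, s in scores.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


robert_recs = recommend("Robert", ratings)
```

For an item-item variant, the same cosine function would be applied to item vectors (the ratings each item received) instead of user vectors.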
The similarity between items is more stable than the similarity between the users.
Why?
Well, a math book will always be a
math book, but a user can easily change their mind – something they liked
last week might not be interesting next week.
Model-based recommenders
Model-based recommenders make use of machine learning models to generate
recommendations.
These systems learn patterns,
correlations, and relationships from historical user-item interaction
data to make predictions about a user’s preferences for items they
haven’t interacted with yet.
There are different types of model-based recommenders, such as matrix
factorization, Singular Value Decomposition (SVD), or neural networks.
However, matrix factorization remains the most popular one, so let’s explore it a bit
further.
Matrix factorization
Matrix factorization is a mathematical technique used to decompose a large matrix
into the product
of multiple smaller matrices.
In the context of recommender systems, matrix factorization is commonly
employed to uncover latent patterns or features in user-item interaction data,
allowing for personalized recommendations. Latent information can be
revealed by analyzing user behavior.
If there is feedback from the user,
for example – they have watched a particular movie or read a particular
book and have given a rating, that can be represented in the form of a
matrix. In this case,
User latent factor matrix (U), which contains information about users and their
relationships with latent factors.
Item latent factor matrix (V), which contains information about items and their
relationships with latent factors.
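As a rough illustration of how U and V might be learned, the sketch below fits both matrices with plain stochastic gradient descent so that the dot product of a user row and an item row approximates each known rating. The hyperparameters (`k`, `steps`, `lr`, `reg`), the function names, and the toy ratings are all assumptions; libraries use more refined optimizers.

```python
import random


def factorize(ratings, n_users, n_items, k=2, steps=5000, lr=0.01, reg=0.02, seed=0):
    """Learn U (n_users x k) and V (n_items x k) so that U[u] . V[i] ~ rating."""
    rng = random.Random(seed)
    U = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in ratings:
            pred = sum(U[u][f] * V[i][f] for f in range(k))
            err = r - pred
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # Gradient step on squared error with L2 regularization.
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V


def predict(U, V, u, i):
    """Predicted rating = dot product of user and item latent vectors."""
    return sum(a * b for a, b in zip(U[u], V[i]))


# Hypothetical observed ratings as (user index, item index, rating).
observed = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 1), (2, 2, 5)]
U, V = factorize(observed, n_users=3, n_items=3)

# Largest absolute error on the ratings we trained on.
train_error = max(abs(r - predict(U, V, u, i)) for u, i, r in observed)

# Estimate for a pair that was never rated (user 0, item 2).
missing = predict(U, V, 0, 2)
```

The value of `missing` is the kind of estimate a recommender would use to decide whether to show item 2 to user 0.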
Matrix factorization
The matrix factorization process includes the following steps:
The ratings matrix is obtained by multiplying the user and the transposed item
matrix,
No need for item attributes: Collaborative filtering works solely based on
user-item interactions,
making it applicable to a wide range of recommendation scenarios where
item features may be sparse or unavailable. This is especially useful in
content-rich platforms.
User cold start occurs when a new user joins the system without any prior
interaction history. Collaborative filtering relies on historical
interactions to make recommendations, so it can’t provide personalized
suggestions to new users who start with no data.
Item cold start happens when a new item is added, and there’s no user
interaction data for it. Collaborative filtering has difficulty
recommending new items since it lacks information about how users have
engaged with these items in the past.
already popular items receive even more attention, while niche or
less-known items are overlooked.
Hybrid recommenders
Hybrid recommendation systems combine
multiple recommendation techniques or approaches to provide more
accurate, diverse, and effective personalized recommendations.
They are particularly valuable in
real-world recommendation scenarios because they can provide more
robust, accurate, and adaptable recommendations.
The choice of which hybrid approach
to use depends on the specific requirements and constraints of the
recommendation system and the nature of the available data.
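One simple hybrid approach, shown here only as an example of the idea, is a weighted blend of scores from two recommenders. The weight, the function name, and the score dictionaries below are assumptions; other hybrid designs (switching, cascading, feature combination) work differently.

```python
def hybrid_scores(cf_scores, content_scores, weight_cf=0.7):
    """Blend collaborative and content-based scores with a fixed weight."""
    items = cf_scores.keys() | content_scores.keys()
    return {
        item: weight_cf * cf_scores.get(item, 0.0)
        + (1 - weight_cf) * content_scores.get(item, 0.0)
        for item in items
    }


# Hypothetical scores in the 0-1 range from two separate recommenders.
# "new_item" has no interaction data, so only the content-based model scores it.
cf = {"a": 0.9, "b": 0.2}
content = {"b": 0.8, "new_item": 0.6}

blended = hybrid_scores(cf, content)
ranking = sorted(blended, key=blended.get, reverse=True)
```

Note how `new_item` still gets a score through the content-based side, even though collaborative filtering knows nothing about it.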
Enhanced robustness and flexibility: Hybrid models are often more robust in
handling various recommendation
scenarios. They can adapt to different data characteristics, user
behaviors, and recommendation challenges. This flexibility is valuable
in real-world recommendation systems.
techniques. For example, they can overcome the “cold-start” problem for
new users and items by incorporating content-based recommendations,
providing serendipitous suggestions, and reducing popularity bias.
Data and computational demands: Hybrid models often require more data
and computational resources
because they use multiple recommendation algorithms. This can be
challenging, especially in large-scale systems with vast user-item
interactions and a diverse catalog of items.
They can help you measure how well a
recommendation algorithm or model is performing and provide insights
into its strengths and weaknesses.
There are several categories of evaluation metrics, depending on the specific
aspect of recommendations being assessed.
Some common evaluation metrics include:
Ranking metrics evaluate how well a recommender system ranks items for a
user, especially in
top-N recommendation scenarios. Think of hit rate, average reciprocal
hit rate (ARHR), cumulative hit rate, or rating hit rate.
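Two of these ranking metrics can be sketched in a few lines. The setup assumed here is a leave-one-out evaluation: one known-liked item per user is held out, and we check whether (and at which rank) it shows up in that user's top-N list. The data and function names are made up for illustration.

```python
def hit_rate(top_n_by_user, held_out):
    """Fraction of users whose held-out item appears in their top-N list."""
    hits = sum(held_out[u] in recs for u, recs in top_n_by_user.items())
    return hits / len(top_n_by_user)


def arhr(top_n_by_user, held_out):
    """ARHR: a hit at rank r contributes 1/r, averaged over all users."""
    total = 0.0
    for user, recs in top_n_by_user.items():
        if held_out[user] in recs:
            total += 1.0 / (recs.index(held_out[user]) + 1)
    return total / len(top_n_by_user)


# Hypothetical top-3 lists and one held-out item per user.
top_n = {
    "u1": ["a", "b", "c"],
    "u2": ["d", "e", "f"],
    "u3": ["g", "h", "i"],
    "u4": ["j", "k", "l"],
}
left_out = {"u1": "a", "u2": "e", "u3": "z", "u4": "y"}

hr = hit_rate(top_n, left_out)   # 2 hits out of 4 users
score = arhr(top_n, left_out)    # (1/1 + 1/2) / 4
```

Hit rate treats every position in the list the same, while ARHR rewards hits that appear near the top.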
Which metric will be used depends on the business problem being solved.
If we think that we have made the
best possible recommender and the metric is great, but in practice it is
bad, then our recommender is not good. For example, the winning algorithm
of the Netflix Prize competition was never fully put into production,
because its accuracy gains did not justify the engineering effort required.
The most important thing is that the
user gains trust in the recommender system. If we recommend to them the
top 10 products, but only 2 or 3 are relevant to them, they will
consider this a bad recommendation.
For this reason, the idea is not to always recommend the top 10 items but to
recommend items above a certain threshold.
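A threshold-based cut-off might look like the sketch below. The threshold value, the function name, and the relevance scores are assumptions; in practice the threshold would be tuned on real feedback.

```python
def recommend_above_threshold(scores, threshold=0.6, max_items=10):
    """Show at most max_items items, and only those scoring above the threshold."""
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return [item for item, score in ranked if score >= threshold][:max_items]


# Hypothetical predicted relevance scores for one user.
scores = {"a": 0.91, "b": 0.75, "c": 0.40, "d": 0.10}
shown = recommend_above_threshold(scores)
```

Here the user sees two strong suggestions instead of a padded list of ten.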
In such cases, recommender systems can initially recommend either the top 10
best-selling products or the top 10 products on promotion as a starting point.
Alternatively,
conducting user interviews can help gather information about the user’s
preferences.
Another aspect of the cold start
problem pertains to introducing new products to users. This can be
achieved by leveraging content-based attributes and periodically adding
new products to user recommendations while actively promoting them.
Furthermore, churn
poses another challenge, as users’ preferences and behaviors evolve
over time. To address this, recommender systems should incorporate a
degree of randomization to refresh the top N list of recommended items
periodically.
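One possible way to add such randomization, sketched under assumptions: keep the few best items fixed and fill the remaining slots with a random sample from the next-best candidates. The parameter names and default values below are hypothetical.

```python
import random


def refresh_top_n(ranked_items, n=5, n_fixed=3, pool_size=10, rng=None):
    """Keep the n_fixed best items and sample the rest from the next candidates."""
    rng = rng or random.Random()
    fixed = ranked_items[:n_fixed]
    pool = ranked_items[n_fixed:pool_size]
    explore = rng.sample(pool, min(n - n_fixed, len(pool)))
    return fixed + explore


# Hypothetical ranked list, best item first.
ranked = [f"item_{i}" for i in range(1, 21)]
refreshed = refresh_top_n(ranked, rng=random.Random(42))
```

Calling the function again with a different random state changes the last slots while the strongest recommendations stay in place.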
It is also crucial to ensure that
recommender systems are designed with sensitivity in mind, avoiding
content that may offend or discriminate against users.