Social Network Analysis (SNA) - 1
Chapter-1
CONTENTS
Social Network perspectives
Fundamental concepts of network analysis
Motivation
Erdos Number Project
Centrality measures
Social Network perspectives
1. Formal Descriptions
Network analysis provides a vocabulary and a set
of formal definitions for expressing theoretical
concepts and properties.
Fundamental Concepts in Network Analysis
There are several key concepts at the heart
of network analysis that are fundamental to
the discussion of social networks.
These concepts are: actor, relational tie,
dyad, triad, subgroup, group, relation, and
network.
Actor:
Actors are discrete individual, corporate, or collective social
units.
Examples of actors are people in a group,
departments within a corporation, public service
agencies in a city, or nation-states in the world
system.
One-mode networks: most social network
applications focus on collections of actors that are all of
the same type (for example, people in a work group).
Relational Tie:
Actors are linked to one another by social ties.
Some of the more common examples of ties
employed in network analysis are:
Evaluation of one person by another (for
example expressed friendship, liking, or
respect)
Transfers of material resources (for example
business transactions, lending or borrowing
things)
Association or affiliation (for example jointly
attending a social event, or belonging to the same
social club)
Behavioral interaction (talking together, sending
messages)
Movement between places or statuses (migration,
social or physical mobility)
Physical connection (a road, river, or bridge
connecting two points)
Formal relations (for example authority)
Biological relationship (kinship or descent)
Dyad:
At the most basic level, a linkage or relationship
establishes a tie between two actors.
A dyad consists of a pair of actors and the
(possible) tie(s) between them
Dyadic analyses focus on the properties of
pairwise relationships, such as whether ties are
reciprocated or not, or whether specific types of
multiple relationships tend to occur together.
Example: A conversation between two friends
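As a minimal sketch of a dyadic analysis, the snippet below classifies every pair of actors by whether the tie between them is reciprocated. The directed tie list and the actor names are hypothetical, made up for illustration only.

```python
from itertools import combinations

# Hypothetical directed "likes" ties (sender, receiver); not data from the slides.
ties = {("Ann", "Bob"), ("Bob", "Ann"), ("Ann", "Cat"), ("Cat", "Dan")}
actors = sorted({actor for tie in ties for actor in tie})

def dyad_type(a, b, ties):
    """Classify the dyad (a, b) as mutual, asymmetric, or null."""
    forward, backward = (a, b) in ties, (b, a) in ties
    if forward and backward:
        return "mutual"
    if forward or backward:
        return "asymmetric"
    return "null"

# Count how many dyads of each kind the network contains.
census = {}
for a, b in combinations(actors, 2):
    kind = dyad_type(a, b, ties)
    census[kind] = census.get(kind, 0) + 1

print(census)
```

Only the Ann-Bob dyad is reciprocated here; a dyadic analysis would compare such counts across relations or groups.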
Triad:
Relationships among larger subsets of actors may
also be studied.
Many important social network methods and
models focus on the triad:
a subset of three actors and the (possible) tie(s)
among them.
Example: Balance theory (a theory of attitude
change, based on the urge to maintain one's values and
beliefs over time)
BALANCE THEORY
[Figure: triad of actors A, B, and C with one positive (+VE) tie and two negative (-VE) ties.]
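One common formalization, which the slide does not spell out, is structural balance: a triad is called balanced when the product of its three tie signs is positive. The sketch below assumes that rule. Which pair carries the +VE tie in the figure is not recoverable, so the sign assignment is hypothetical.

```python
def is_balanced(sign_ab, sign_bc, sign_ac):
    """Return True when the product of the three tie signs (+1 or -1) is positive."""
    return sign_ab * sign_bc * sign_ac > 0

# Hypothetical assignment: one positive and two negative ties, as in the figure.
print(is_balanced(+1, -1, -1))
# For contrast: two positive ties and one negative tie.
print(is_balanced(+1, +1, -1))
```

Under this rule, a triad with two negative ties and one positive tie counts as balanced, while a triad with exactly one negative tie does not.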
Subgroup:
We can define a subgroup of actors as any
subset of actors, together with all ties among them.
Locating and studying subgroups using
specific criteria has been an important
concern in social network analysis.
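A minimal sketch of the subgroup definition: given a subset of actors, keep every tie whose endpoints both lie in the subset. It also checks one possible criterion for locating subgroups, namely whether the subset forms a clique (every pair tied). The tie list is hypothetical and not taken from the slides.

```python
from itertools import combinations

# Hypothetical undirected ties among actors A-E (not taken from the slides).
ties = {frozenset(pair) for pair in
        [("A", "B"), ("A", "D"), ("B", "D"), ("D", "E"), ("C", "E")]}

def subgroup_ties(members, ties):
    """Return all ties whose two endpoints both belong to the subgroup."""
    members = set(members)
    return {tie for tie in ties if tie <= members}

def is_clique(members, ties):
    """A subgroup is a clique when every pair of its members is tied."""
    return all(frozenset(pair) in ties for pair in combinations(members, 2))

print(len(subgroup_ties({"A", "B", "D"}, ties)), is_clique({"A", "B", "D"}, ties))
print(len(subgroup_ties({"A", "D", "E"}, ties)), is_clique({"A", "D", "E"}, ties))
```

In this toy data, {A, B, D} is a clique with three ties, while {A, D, E} has only two ties because A and E are not tied.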
Social Networking Basics: Cohesive Sub-group
Cohesive sub-group: a well-connected group with strong, direct, intense, frequent, positive ties.
[Figure: network of actors A to I with a clique and a cluster marked, e.g. A, B, D and E.]
Clustering is the task of grouping a set of objects in
such a way that objects in the same group (called
a cluster) are more similar, in some sense, to each
other than to those in other groups (clusters).
Group:
A group is the collection of all actors on
which ties are to be measured.
A group, then, consists of a finite set of
actors who, for conceptual, theoretical, or
empirical reasons, are treated as a finite set of
individuals on which network measurements
are made.
Relation:
The collection of ties of a specific kind
among members of a group is called a
relation.
For example, the set of friendships among
pairs of children in a classroom, or the set of
formal diplomatic ties maintained by pairs of
nations in the world, are ties that define
relations.
Social Network:
A social network consists of a finite set or
sets of actors and the relation or relations
defined on them.
Summary
These terms provide a core working vocabulary
for discussing social networks and social network
data.
Social network analysis not only requires a
specialized vocabulary, but also deals with
conceptual entities that are quite difficult to study
using a more traditional statistical and data-analytic
framework.
Motivation
Empirical Motivations
Theoretical Motivations
Mathematical Motivations
Motivation: Empirical Motivations
Invention of the term “sociogram”
A sociogram is a visual depiction of the relationships
within a specific group.
Its purpose is to discover the underlying relationships between
persons.
Criterion (what you want to measure): a specific type of
social interaction
1. Positive (+ve) criterion: choosing something that you either
enjoy or would love to participate in with others
2. Negative (-ve) criterion: choosing something that you would not
enjoy; it reveals resistance in interpersonal relationships
Example: Positive criterion
1. Which three
classmates would you
most like to go on a
vacation with?
Example: Negative criterion
1. Which three
classmates would you
least enjoy going on
a vacation with?
2. Which three classmates do you like to be
around the least?
Motivation: Theoretical Motivations
Theoretical notions also motivated the development of network methods.
EXAMPLES:
social group, isolate, popularity, liaison, prestige,
balance, transitivity, clique, subgroup, social
cohesion, social position, social role, reciprocity,
mutuality, exchange, influence, dominance,
conformity
Motivation: Mathematical Motivations
In social network analysis, researchers found use for
mathematical models.
The three major mathematical foundations of
network methods are graph theory, statistical and
probability theory, and algebraic models
Erdős Number Project
The Erdős number describes the "collaborative distance" between
mathematician Paul Erdős and another person, as
measured by authorship of mathematical papers.
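Collaborative distance can be computed as a shortest-path length in a co-authorship graph. The sketch below uses breadth-first search for that. The graph is a toy example, and every author name other than "Erdos" is hypothetical.

```python
from collections import deque

# Hypothetical co-authorship graph; the names other than "Erdos" are made up.
coauthors = {
    "Erdos": {"Alice", "Bob"},
    "Alice": {"Erdos", "Carol"},
    "Bob": {"Erdos"},
    "Carol": {"Alice", "Dave"},
    "Dave": {"Carol"},
    "Eve": set(),  # never co-authored with anyone in this toy graph
}

def collaborative_distance(graph, source):
    """Breadth-first search: fewest co-authorship links from source to each author."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        author = queue.popleft()
        for other in graph[author]:
            if other not in distance:
                distance[other] = distance[author] + 1
                queue.append(other)
    return distance

erdos_number = collaborative_distance(coauthors, "Erdos")
print(erdos_number.get("Dave"))
print(erdos_number.get("Eve"))  # None: no chain of co-authorships reaches Erdos
```

Authors with no chain of co-authorships to the source simply do not appear in the result; by convention their Erdős number is treated as undefined or infinite.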
Centrality measures
Centrality measures identify the most important vertices
within a graph.
EXAMPLES:
identifying the most influential person(s) in a
social network
identifying key infrastructure nodes in the Internet or urban
networks
identifying super-spreaders of disease
Centrality
• Finding out which is the most central node is
important:
– It could help disseminate information in the
network faster
– It could help stop epidemics
– It could help protect the network from
breaking
Centrality measures
Degree centrality
Closeness centrality
Betweenness centrality
Eigenvector centrality
PageRank centrality
Centrality: visually
• Centrality can have various
meanings:
[Figure: example networks with nodes labelled X and Y, illustrating different meanings of centrality.]
Centrality measures : Degree centrality
Historically first and conceptually simplest
Definition:
Degree centrality refers to the number of ties
a node has to other nodes.
Actors who have more ties may have
multiple alternative ways and resources to
reach goals—and thus be relatively
advantaged.
Undirected graph: the degree centrality of a node is the number of ties attached to it.
Directed graph:
In the case of a directed network (where ties
have direction), we usually define two
separate measures of degree centrality,
namely indegree and outdegree.
1. In-degree centrality: the number of ties a node receives from other nodes.
2. Out-degree centrality: the number of ties a node sends to other nodes.
Actors who have high out-degree centrality
may be relatively able to exchange with
others, or disperse information quickly to
many others.
So actors with high out-degree centrality are
often characterized as influential.
[Figure: undirected network of six actors: Anna, Ben, Cara, Dana, Evan, and Frank.]
Degree centrality:
ANNA: 2
BEN: 1
CARA: 3
DANA: 3
EVAN: 3
FRANK: 2
RANKING:
1. CARA, DANA, EVAN
2. ANNA, FRANK
3. BEN
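The degree counts and ranking above can be reproduced with a few lines of code. The original figure is not recoverable, so the edge list below is an assumption: it is inferred from the degree counts here and from the paths worked through in the closeness and betweenness slides.

```python
# Edge list inferred from the degree counts and worked paths in this deck;
# the original figure is not recoverable, so treat it as an assumption.
edges = [("Anna", "Ben"), ("Anna", "Cara"), ("Cara", "Dana"), ("Cara", "Evan"),
         ("Dana", "Evan"), ("Dana", "Frank"), ("Evan", "Frank")]

degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1  # each undirected tie counts for both ends
    degree[v] = degree.get(v, 0) + 1

# Highest degree first; ties in degree are listed alphabetically.
ranking = sorted(degree, key=lambda name: (-degree[name], name))
for name in ranking:
    print(name, degree[name])
```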
EXAMPLE: Freeman's approach
EXAMPLE
Consider the network. Which nodes (actors) are
more “central” than others?
Nodes 2, 5, and 7 appear relatively “central”.
Node 7 has an in-degree centrality with an absolute value
of 9 (all 9 other nodes send a tie to node 7).
The normalized value is 100 (100% of the possible other
nodes are connected to node 7).
Its out-degree centrality has an absolute value of 3
(node 7 is connected out to nodes 2, 4, and 5) and a
normalized value of 33.33 (3 nodes is 33.33% of the
9 nodes to which node 7 could extend out).
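The absolute and normalized values quoted for node 7 can be checked with a short sketch. The slide's network figure is not reproduced here, so the arc list below is hypothetical: it is constructed only to match the stated counts (10 nodes, 9 incoming ties to node 7, outgoing ties to nodes 2, 4, and 5).

```python
# Hypothetical 10-node directed network built only to match the counts quoted
# above: every other node sends a tie to node 7, and node 7 sends ties to 2, 4, 5.
nodes = range(1, 11)
arcs = [(n, 7) for n in nodes if n != 7] + [(7, 2), (7, 4), (7, 5)]

def degree_centrality(node, arcs, n_nodes):
    """Return in-degree, out-degree, and both as a percentage of the n-1 other nodes."""
    indeg = sum(1 for _, target in arcs if target == node)
    outdeg = sum(1 for source, _ in arcs if source == node)
    possible = n_nodes - 1
    return indeg, outdeg, 100 * indeg / possible, 100 * outdeg / possible

indeg, outdeg, indeg_pct, outdeg_pct = degree_centrality(7, arcs, len(nodes))
print(indeg, outdeg, round(indeg_pct, 2), round(outdeg_pct, 2))
```

Normalizing by n-1 makes scores comparable across networks of different sizes.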
[Figure: two four-node networks. NETWORK A: node 1 is tied to nodes 2, 3, and 4. NETWORK B: every node is tied to every other node.]
Degree centrality
Network A
Node 1 - centrality score 3
Node 2 - centrality score 1
Node 3 - centrality score 1
Node 4 - centrality score 1
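The maximum score is 3.
The degree centralization score of Network A is 1: the differences from the
maximum sum to (3-3)+(3-1)+(3-1)+(3-1) = 6, which equals the largest
possible sum for four nodes, (n-1)(n-2) = 6.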
Network B
Node 1 - centrality score 3
Node 2 - centrality score 3
Node 3 - centrality score 3
Node 4 - centrality score 3
The maximum score is 3.
The degree centralization score of Network B is 0, because no node differs from the maximum.
Thus, Network A is more centralized than Network B
for degree centrality.
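The comparison of the two networks can be sketched with Freeman's degree centralization: the sum of each node's gap to the most central node, divided by the largest sum possible for a network of that size. The edge lists are assumptions inferred from the node scores listed above (a star for Network A, a fully connected graph for Network B).

```python
def degree_centralization(edges, nodes):
    """Freeman degree centralization for an undirected graph with at least 3 nodes."""
    degree = {n: 0 for n in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(nodes)
    top = max(degree.values())
    # Sum of gaps to the most central node, divided by the largest possible sum,
    # (n - 1)(n - 2), which is reached by a star.
    return sum(top - d for d in degree.values()) / ((n - 1) * (n - 2))

nodes = [1, 2, 3, 4]
network_a = [(1, 2), (1, 3), (1, 4)]                          # star centred on node 1
network_b = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]  # every pair tied

print(degree_centralization(network_a, nodes))
print(degree_centralization(network_b, nodes))
```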
Centrality measures: Closeness centrality
Degree doesn't factor in distance.
Distance refers to the number of links on a path between nodes.
Path: a set of links connecting two people (not unique)
Shortest path: a path between two nodes with the shortest
distance (not unique)
Diameter: the longest of the shortest paths, considering all
pairs of nodes.
Closeness centrality for a node:
Find the shortest path lengths to all other nodes
Take the average of these
Closeness centrality = 1 / (average shortest path length), i.e. $C(v) = \frac{n-1}{\sum_{u \neq v} d(v,u)}$ for a network of $n$ nodes
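[Figure: six-actor network of Anna (A), Ben (B), Cara (C), Dana (D), Evan (E), and Frank (F), annotated with paths starting at Ben.]
Paths from Ben to Frank: (B,A,C,D,F), (B,A,C,E,F), (B,A,C,D,E,F), (B,A,C,E,D,F)
Path from Ben to Anna: (B,A)
Diameter: 4 links; the shortest Ben-Frank paths, (B,A,C,D,F) and (B,A,C,E,F), each visit 5 nodes.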
CLOSENESS CENTRALITY: Cara (C)
C-A = (C,A) = 1
C-B = (C,A,B) = 2
C-D = (C,D) = 1
C-E = (C,E) = 1
C-F = (C,D,F) = 2
Average for C = (1+2+1+1+2)/5 = 1.4, so closeness = 1/1.4 ≈ 0.71
CLOSENESS CENTRALITY: Dana (D)
D-A = (D,C,A) = 2
D-B = (D,C,A,B) = 3
D-C = (D,C) = 1
D-E = (D,E) = 1
D-F = (D,F) = 1
Average for D = (2+3+1+1+1)/5 = 1.6, so closeness = 1/1.6 = 0.625
[Figure: the six-actor network annotated with closeness scores.]
Anna: 0.55
Ben: 0.38
Cara: 0.71
Dana: 0.625
Evan: 0.625
Frank: 0.45
Closeness is a measure of the degree to
which an individual is near all other
individuals in a network.
It is the inverse of the sum of the shortest
distances between each node and every
other node in the network.
Closeness is the reciprocal of farness
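A minimal sketch of the closeness computation on the six-actor example, using breadth-first search for shortest-path lengths and the normalized form (n-1 divided by the sum of distances). The edge list is an assumption inferred from the worked distances in this deck, since the figure itself is not recoverable.

```python
from collections import deque

# Edge list inferred from the worked distances in this deck (an assumption;
# the original figure is not recoverable). A=Anna, B=Ben, ..., F=Frank.
edges = [("A", "B"), ("A", "C"), ("C", "D"), ("C", "E"),
         ("D", "E"), ("D", "F"), ("E", "F")]

neighbors = {}
for u, v in edges:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)

def distances_from(source):
    """Breadth-first search returning the number of links to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def closeness(node):
    """Reciprocal of the average shortest-path length to the other nodes."""
    dist = distances_from(node)
    farness = sum(dist.values())  # the distance from a node to itself is 0
    return (len(dist) - 1) / farness

for node in sorted(neighbors):
    print(node, round(closeness(node), 3))
```

The sketch assumes a connected network; with unreachable nodes the sum of distances is not defined in this simple form.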
Centrality measures : Betweenness
Betweenness is a centrality measure of a vertex within a
graph.
Betweenness centrality quantifies the number of times a node
acts as a bridge along the shortest path between two other
nodes.
It was introduced as a measure for quantifying the control of
a human on the communication between other humans in a
social network by Linton Freeman
A node with high betweenness centrality has a large
influence on the transfer of items through the network, under
the assumption that item transfer follows the shortest paths.
The betweenness of a vertex $v$ in a graph
$G=(V,E)$ with vertex set $V$ is computed
as follows:
1. For each pair of vertices (s,t), compute the
shortest paths between them.
2. For each pair of vertices (s,t), determine the
fraction of shortest paths that pass through the
vertex in question (here, vertex $v$).
3. Sum this fraction over all pairs of vertices (s,t).
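Betweenness of vertex $v$:
$C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$
where $\sigma_{st}$ is the number of shortest paths between $s$ and $t$, and $\sigma_{st}(v)$ is the number of those paths that pass through $v$.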
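The three steps can be sketched directly: count shortest paths with breadth-first search, then sum, over all pairs not containing the vertex, the fraction of shortest paths that pass through it. The edge list is an assumption inferred from the worked examples in this deck (A=Anna through F=Frank), and each unordered pair is counted once.

```python
from collections import deque
from itertools import combinations

# Edge list inferred from the worked examples in this deck (an assumption).
edges = [("A", "B"), ("A", "C"), ("C", "D"), ("C", "E"),
         ("D", "E"), ("D", "F"), ("E", "F")]

neighbors = {}
for u, v in edges:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)

def shortest_path_counts(source):
    """BFS returning (distance, number of shortest paths) from source to each node."""
    dist, count = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                count[nxt] = 0
                queue.append(nxt)
            if dist[nxt] == dist[node] + 1:
                count[nxt] += count[node]
    return dist, count

info = {n: shortest_path_counts(n) for n in neighbors}

def betweenness(v):
    """Sum, over pairs (s, t) not containing v, of the share of shortest paths via v."""
    total = 0.0
    dist_v, count_v = info[v]
    for s, t in combinations(sorted(set(neighbors) - {v}), 2):
        dist_s, count_s = info[s]
        if dist_s[v] + dist_v[t] == dist_s[t]:  # v lies on a shortest s-t path
            total += count_s[v] * count_v[t] / count_s[t]
    return total

print({n: betweenness(n) for n in sorted(neighbors)})
```

This brute-force version is fine for small connected networks; larger graphs usually call for a dedicated algorithm such as Brandes' method.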
Cara:
For each pair of people other than Cara, consider two questions:
1. How many shortest paths are there between the pair of people?
2. How many of these shortest paths contain Cara?
A to B = 0/1 = 0, A to D = 1/1 = 1, A to E = 1/1 = 1,
A to F = 2/2 = 1
B to D = 1/1 = 1, B to E = 1/1 = 1, B to F = 2/2 = 1
D to E = 0/1 = 0, D to F = 0/1 = 0,
E to F = 0/1 = 0
Betweenness(C) = 0+1+1+1+1+1+1+0+0+0 = 6
Dana:
A to B = A to C = A to E = 0/1 = 0
A to F = 1/2 = 0.5
B to C = B to E = 0/1 = 0,
B to F = 1/2 = 0.5,
C to E = 0/1 = 0, C to F = 1/2 = 0.5
E to F = 0/1 = 0
Betweenness(D) = 0.5+0.5+0.5 = 1.5
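[Figure: the six-actor network annotated with betweenness scores.]
Anna: 4
Ben: 0
Cara: 6
Dana: 1.5
Evan: 1.5
Frank: 0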
Centrality measures: Eigenvector Centrality
A natural extension of degree centrality is eigenvector centrality.
In-degree centrality awards one centrality point for every link a
node receives.
But not all vertices are equivalent:
some are more relevant than others, and, reasonably, endorsements
from important nodes count more
Eigenvector centrality differs from in-degree centrality: a node
receiving many links does not necessarily have a high eigenvector
centrality (it might be that all linkers have low or null eigenvector
centrality).
Moreover, a node with high eigenvector centrality is not necessarily
highly linked (the node might have few but important linkers).
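Using the adjacency matrix to find eigenvector centrality
For a given graph $G = (V, E)$ with $|V|$ vertices, let $A = (a_{v,t})$ be the adjacency matrix, i.e. $a_{v,t} = 1$ if vertex $v$ is linked to vertex $t$, and $a_{v,t} = 0$ otherwise.
The relative centrality score $x_v$ of vertex $v$ can be defined as:
$x_v = \frac{1}{\lambda} \sum_{t \in M(v)} x_t = \frac{1}{\lambda} \sum_{t \in V} a_{v,t} x_t$
where $M(v)$ is the set of neighbours of $v$ and $\lambda$ is a constant.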
Eigenvector Centrality, cont.
Let T
Now let's look at what happens when we multiply the
vector x by the matrix A. The result, of course, is another
5x1 vector.
This has, in effect, "spread out" the degree centrality. That this is
moving in the direction of a reasonable metric for centrality can be
seen better if we rearrange the graph a little bit:
Suppose we multiplied the resulting vector by A again.
We'd be allowing this centrality value to once again
"spread" across the edges of the graph.
The spread is in both directions (vertices both give to and
get from their neighbors).
The process might eventually reach an equilibrium, when the amount
coming into a given vertex is in balance with the
amount going out to its neighbors.
The numbers would keep getting bigger, but we could
reach a point where the share of the total at each node
remains stable.
At that point we might imagine that all of the
"centrality-ness" of the graph had equilibrated and the
value of each node completely captured the centrality
of all of its neighbors, all the way out to the edges of
the graph.
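The repeated multiplication described above is known as power iteration. The sketch below rescales the vector after every multiplication so that the entries are shares of the total, and stops once the shares no longer change. The deck's own example graph is not recoverable, so the 5-vertex adjacency matrix is hypothetical; the step limit and tolerance are also illustrative choices.

```python
# Hypothetical 5-vertex undirected graph (the deck's own example graph is not
# recoverable). Row i of A lists the ties of vertex i.
A = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 1, 0],
]

def multiply(matrix, vector):
    """Matrix-vector product: each vertex collects the values of its neighbours."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def eigenvector_centrality(matrix, steps=200, tol=1e-12):
    """Repeatedly spread values along ties until each vertex's share is stable."""
    n = len(matrix)
    x = [1.0 / n] * n              # start with equal shares that sum to 1
    growth = 0.0
    for _ in range(steps):
        y = multiply(matrix, x)
        growth = sum(y)            # x sums to 1, so this is the growth factor
        y = [value / growth for value in y]
        converged = max(abs(a - b) for a, b in zip(x, y)) < tol
        x = y
        if converged:
            break
    return x, growth

shares, growth = eigenvector_centrality(A)
print([round(s, 3) for s in shares], round(growth, 3))
```

Once the shares are stable, multiplying by A only scales the vector, and the growth factor approximates the largest eigenvalue of A.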
Eigenvector Centrality: Example
$\lambda_{\max} = 2.68$
Centrality measure: Page Rank
PageRank is a variant of eigenvector centrality.
A potential problem with Katz centrality is the
following: if a node with high centrality links to many
others, then all those others get high centrality. In many
cases, however, it means less if a node is only one
among many to be linked.
Like eigenvector centrality, PageRank can help uncover
influential or important nodes whose reach extends
beyond just their direct connections.
PageRank is an adjustment of Katz centrality that takes
into consideration this issue. There are three distinct
factors that determine the PageRank of a node:
(i) the number of links it receives,
(ii) the link propensity of the linkers, and
(iii) the centrality of the linkers
PageRank computes a ranking of the
nodes in the graph G based on the
structure of the incoming links.
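The three factors can be seen in a small power-iteration sketch: a node's score grows with the number of incoming links, each linker splits its score over all the nodes it links to, and the linker's own score determines how much it passes on. The damping factor of 0.85, the step count, and the toy link structure are assumptions for illustration and do not come from the slides.

```python
def pagerank(links, damping=0.85, steps=100):
    """Power iteration for PageRank on a dict {node: [nodes it links to]}."""
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(steps):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in links.items():
            if targets:
                # A linker splits its score evenly over everything it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its score over all nodes.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank

# Hypothetical web of five pages: "hub" links to three pages, "fan" only to "star".
links = {
    "hub": ["star", "a", "b"],
    "fan": ["star"],
    "star": ["hub"],
    "a": [],
    "b": [],
}

rank = pagerank(links)
for page in sorted(rank, key=rank.get, reverse=True):
    print(page, round(rank[page], 3))
```

In this toy graph "star" outranks "a" and "b": all three receive a third of the hub's score, but "star" also receives the whole score of "fan".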
PageRank is an algorithm used by Google Search to
rank websites in their search engine results.
“PageRank works by counting the number and
quality of links to a page to determine a rough
estimate of how important the website is. The
underlying assumption is that more important
websites are likely to receive more links from
other websites”
The main difference from eigenvector centrality
is that PageRank takes link direction and weight into
account.
Example applications:
Understanding citations (e.g. patent citations,
academic citations)
Visualizing network activity / propagation of
malware
Modeling the impact of SEO and link building
activity (although PageRank is now just one of
many ranking algorithms used by Google)
PageRank is a link analysis algorithm and it assigns a
numerical weighting to each element of a hyperlinked
set of documents, such as the World Wide Web, with
the purpose of "measuring" its relative importance
within the set.