Actors network

This network has as nodes all the actors who appeared in movies released from 1989 to 2016, and two actors are linked when they have at least one movie in common. Our dataset contains exactly 8950 actors, which gives 8950 nodes, 81042 edges and an average degree of about 18 links per actor. The network looks like this:
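As a rough illustration of how a co-appearance network like this could be built, here is a minimal sketch. The `casts` dictionary and its movie and actor names are hypothetical toy data, not the real dataset, and the helper names are our own. It also checks that the reported average degree matches the usual formula 2E/N.

```python
from itertools import combinations

# Hypothetical toy data: movie title -> list of credited actors.
casts = {
    "Movie A": ["Actor 1", "Actor 2", "Actor 3"],
    "Movie B": ["Actor 3", "Actor 4"],
    "Movie C": ["Actor 5"],  # a single-actor movie adds a node but no edge
}

def build_actor_edges(casts):
    """Return (nodes, edges): actors are linked if they share a movie."""
    nodes = set()
    edges = set()
    for cast in casts.values():
        nodes.update(cast)
        # Sorting makes each pair canonical, so repeated pairs are stored once.
        for a, b in combinations(sorted(set(cast)), 2):
            edges.add((a, b))
    return nodes, edges

def average_degree(n_nodes, n_edges):
    """Average degree of an undirected graph: each edge adds 2 to the degree sum."""
    return 2 * n_edges / n_nodes

nodes, edges = build_actor_edges(casts)
toy_avg = average_degree(len(nodes), len(edges))

# Figures reported in the text: 8950 nodes and 81042 edges.
reported_avg = average_degree(8950, 81042)
```

With the reported figures, 2 × 81042 / 8950 is roughly 18.1, which agrees with the average degree of 18 quoted above.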

network_actors

We can see a large cluster containing most of the actors, which forms the Giant Connected Component (GCC). On the other hand, some actors are not linked to this GCC. Contrary to the movies network, these external actors are usually connected to a few other actors, forming small clusters, since movies seldom star only one actor.

The GCC looks like this:

network_gcc_actors
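A giant connected component like the one above could be extracted with NetworkX roughly as follows. This is a sketch on a hypothetical toy graph, and we do not know exactly how the original figure was produced.

```python
import networkx as nx

# Hypothetical toy graph: one larger component plus a small isolated pair.
G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("C", "D"), ("X", "Y")])

# Connected components as sets of nodes, largest first.
components = sorted(nx.connected_components(G), key=len, reverse=True)

# The GCC is the subgraph induced by the largest component.
gcc = G.subgraph(components[0]).copy()

# Fraction of all nodes that belong to the GCC.
gcc_share = gcc.number_of_nodes() / G.number_of_nodes()
```

The remaining entries of `components` correspond to the small external clusters described earlier.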

In general, there are so many nodes and edges that it is difficult to interpret anything from this picture. However, some nodes seem to form small clusters, since they are more connected among themselves than with the rest of the network. The following picture shows the communities found by the Python-Louvain clustering algorithm:

actors_network_louvain_communities

The different colors represent the 54 communities found by this algorithm. The resulting modularity is about 0.4, so the network has a moderate community structure. This structure is more significant than in the movies network.
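For readers who want to reproduce this kind of analysis, the sketch below runs Louvain community detection and computes the modularity of the result. Note that it uses the Louvain implementation built into NetworkX (`louvain_communities`, available in NetworkX 2.8 and later) rather than the Python-Louvain package used for the figure, and that the barbell graph is only a hypothetical stand-in for the actors network.

```python
import networkx as nx
from networkx.algorithms import community as nx_comm

# Hypothetical toy graph: two 5-node cliques joined by a single edge.
G = nx.barbell_graph(5, 0)

# Louvain returns a list of node sets, one per community.
# The seed is fixed only to make the run repeatable.
communities = nx_comm.louvain_communities(G, seed=42)

# Modularity of the partition: higher values mean denser links inside
# communities than expected at random.
modularity = nx_comm.modularity(G, communities)

n_communities = len(communities)
```

Louvain is a randomized heuristic, so on a real network the number of communities and the modularity can vary slightly between runs.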

Degree distributions

Now we analyse the node degrees of the network. The top 10 actors by degree are shown in the following figure:

top_actors_degree

This ranking is led by Samuel L. Jackson, who has appeared in movies with 266 other actors. The podium is completed by Bruce Willis, with a degree of 241, and Robert De Niro and John Goodman, who share 3rd place with a degree of 233.
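A ranking like this one can be obtained by counting how many edges touch each node. The following is a minimal sketch with a hypothetical edge list; the node labels are placeholders, not actors from the dataset.

```python
from collections import Counter

# Hypothetical undirected edge list (each pair appears once).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]

def degree_counts(edges):
    """Count the number of edges incident to each node."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def top_by_degree(edges, k=10):
    """Return the k nodes with the highest degree as (node, degree) pairs."""
    return degree_counts(edges).most_common(k)

top = top_by_degree(edges, k=3)
```

Here `top[0]` is the node with the most co-stars, the role Samuel L. Jackson plays in the real network.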

But what does the degree distribution of this network look like? The following picture gives us the answer on both linear and log axes:

actors_degree_distribution

We can see that the bulk of the degree distribution resembles a Poisson distribution: the most common degree is 6, i.e., actors who have appeared in movies with only 6 other actors, and few actors have a high degree. Beyond this degree, the distribution appears to follow a power law. With such a skewed distribution, we next check whether this network obeys the friendship paradox.
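The quantities read off the plot, the share of nodes at each degree and the most common degree, can be computed as in this sketch. The `degrees` list is hypothetical toy data and does not reproduce the real distribution.

```python
from collections import Counter

def degree_distribution(degrees):
    """Map each degree k to the fraction of nodes that have degree k."""
    counts = Counter(degrees)
    n = len(degrees)
    return {k: c / n for k, c in sorted(counts.items())}

# Hypothetical degree sequence: many low-degree nodes, a few high-degree ones.
degrees = [1, 2, 2, 2, 3, 3, 6, 12]

dist = degree_distribution(degrees)

# The mode is the degree shared by the largest fraction of nodes.
mode = max(dist, key=dist.get)
```

Plotting `dist` on log-log axes is the usual way to judge by eye whether the tail looks like a power law, although a visual check alone is not a statistical test.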

Friendship paradox

We now explore whether the actors network obeys the friendship paradox. This paradox states that most people have fewer friends than their friends have, on average. In this case, the ‘friends’ are the actors who have worked with each other.

By running a simulation with 10,000 trials, we find that the friendship paradox holds for the actors 81.5 % of the time. That is, the neighbours of a given actor have a higher average degree than the actor does with a probability of 81.5 %.
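One way such a simulation could be set up is sketched below: pick a random node, compare its degree with the average degree of its neighbours, and repeat. The adjacency dictionary is a hypothetical toy star graph, the function name is our own, and skipping nodes without neighbours is an assumption, since the text does not say how the original simulation handled them.

```python
import random

def friendship_paradox_rate(adj, n_trials=10_000, seed=0):
    """Estimate how often a random node has a lower degree than the
    average degree of its neighbours."""
    rng = random.Random(seed)
    # Assumption: nodes without neighbours are excluded from sampling.
    candidates = [node for node, nbrs in adj.items() if nbrs]
    hits = 0
    for _ in range(n_trials):
        node = rng.choice(candidates)
        nbrs = adj[node]
        avg_nbr_degree = sum(len(adj[m]) for m in nbrs) / len(nbrs)
        if avg_nbr_degree > len(nbrs):
            hits += 1
    return hits / n_trials

# Hypothetical toy graph: a hub connected to five leaves.
adj = {
    "hub": {"a", "b", "c", "d", "e"},
    "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}, "e": {"hub"},
}

rate = friendship_paradox_rate(adj)
```

In this toy star the paradox holds for the five leaves but not for the hub, so the estimated rate should be close to 5/6.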

Degree assortativity

Another question that comes to mind is whether high-degree actors tend to link with other high-degree actors, and low-degree actors with other low-degree actors, i.e., whether there is any degree assortativity in the network.

In the following picture, we can see the correlation between the degree of a node and the average degree of its neighbours:

actors_degree_assortativity

It seems clear that there is a positive correlation between the degree of a node and the average degree of its neighbours, i.e., there is positive degree assortativity in the network.

We can confirm this by calculating the assortativity coefficient with the help of the NetworkX Python library, as done with the movies network. In this case, the result is 0.18, so there is also positive degree assortativity here, as the linear regression in the previous plot also illustrates.
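The NetworkX calls involved look roughly like this. The sketch uses a small path graph as a hypothetical stand-in, so its coefficient differs from the 0.18 reported for the actors network.

```python
import networkx as nx

# Hypothetical toy graph: a path 0 - 1 - 2 - 3.
G = nx.path_graph(4)

# Pearson correlation between the degrees at the two ends of each edge.
# Positive values mean high-degree nodes tend to link to high-degree nodes.
r = nx.degree_assortativity_coefficient(G)

# Average degree of each node's neighbours, i.e. the quantity plotted
# against the node's own degree in an assortativity scatter plot.
avg_nbr_degree = nx.average_neighbor_degree(G)
```

For this path graph the coefficient is negative, because the degree-1 end nodes attach only to degree-2 nodes; a social network such as the actors network typically gives a positive value.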

Betweenness centrality

One of the most interesting characteristics of a node in a network is its betweenness centrality. According to the Wikipedia definition, “the betweenness centrality is an indicator of a node’s centrality in a network. It is equal to the number of shortest paths from all vertices to all others that pass through that node. A node with high betweenness centrality has a large influence on the transfer of items through the network, under the assumption that item transfer follows the shortest paths.” In our betweenness centrality computation, the value is not equal to the number of shortest paths that pass through the node but proportional to it.
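The difference between a raw path count and a proportional value can be seen with NetworkX's `betweenness_centrality`. We assume here that the proportionality mentioned above corresponds to NetworkX's default normalization, which the text does not state explicitly; the path graph is a hypothetical toy example.

```python
import networkx as nx

# Hypothetical toy graph: a path 0 - 1 - 2 - 3 - 4.
G = nx.path_graph(5)

# Raw betweenness: number of shortest paths between other node pairs
# that pass through each node.
raw = nx.betweenness_centrality(G, normalized=False)

# Normalized betweenness: for an undirected graph the raw value is
# multiplied by 2 / ((n - 1) * (n - 2)), so it lies between 0 and 1.
norm = nx.betweenness_centrality(G, normalized=True)

# Node with the highest betweenness (the middle of the path).
most_central = max(norm, key=norm.get)
```

The ranking of nodes is the same in both versions, since normalization only rescales every value by the same factor.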

This is the list of our top 10 actors by betweenness centrality.

top_betweenness_centrality_actors

A large degree helps a node reach a higher betweenness centrality, which is why Samuel L. Jackson appears in 3rd place. However, as can be seen, it is not the only factor, since the list is led by Jackie Chan. One possible explanation is that, for a set of Asian actors, the shortest paths to American actors can only go, or usually go, through Jackie Chan, who has worked with actors from both continents and thus becomes the most central node in the network.