Diffusion

In networks, we can model the spread of information, disease, or other phenomena as a diffusion process. The diffusion process usually starts with an initial node or a set of initial nodes. The goal is then to model how the information spreads through the network. You can imagine why this would be important for modeling the spread of a disease or an advertising campaign on social media where the goal is to reach as many people as possible.

Innovation Diffusion

Already in 1962, Everett Rogers published a book called "Diffusion of Innovations" where he describes the spread of a new idea or technology through a population. He split the adoption of a new idea into five stages:

Knowledge/Awareness: The individual is exposed to the innovation and gains knowledge of the innovation.
Persuasion: The individual is interested in the innovation and actively seeks information about the innovation.
Decision: The individual makes a decision to adopt or reject the innovation.
Implementation: The individual implements the innovation and uses it as a trial.
Confirmation: The individual finalizes his/her decision to continue using the innovation.

When analyzing the spread of a new innovation, Rogers found that the adoption of a new innovation follows a normal distribution.

Innovators 2.5%: Innovators are the first individuals to adopt an innovation. Innovators are most often young and willing to take risks and have a high social status.
Early Adopters 13.5%: This is the second-fastest category of individuals who adopt an innovation. These individuals have the highest degree of opinion leadership among the other adopter categories. Early adopters take more time to adopt an innovation than innovators due to more careful deliberation.
Early Majority 34%: Individuals in this category adopt an innovation after a varying degree of time. Most often, the early majority waits to adopt an innovation until they see that the innovation has proven useful for others and are in contact with the early adopters.
Late Majority 34%: Individuals in this category will adopt an innovation after the average member of the society. These individuals approach an innovation with a high degree of skepticism.
Laggards 16%: Individuals in this category are the last to adopt an innovation. Most often bound by traditions.

The normal distribution of the adoption of a new innovation.

Example

We can easily give some examples for the above distribution for when the first iPhone was released:

Innovators (2.5%): These were the tech enthusiasts who camped outside Apple stores. They were excited and were willing to embrace the new technology despite its high price and limited features compared to today's standards.
Early Adopters (13.5%): The early adopters included individuals who closely followed tech trends and were quick to purchase the iPhone once they saw the positive reviews and early adopter experiences. They recognized the iPhone's potential to change the way people communicate and access information.
Early Majority (34%): As the iPhone gained popularity and started to prove its utility, the early majority joined in. These individuals might have been initially hesitant but were swayed by the success stories of the early adopters.
Late Majority (34%): The late majority were more cautious and waited until the iPhone became a mainstream product. They wanted to ensure that any initial bugs or issues were resolved and that the price had become more affordable. Their decision to adopt the iPhone was influenced by its widespread acceptance and integration into daily life.
Laggards (16%): Laggards were the last to adopt the iPhone, often sticking with their traditional cell phones or resisting smartphones altogether. They were skeptical of the technology's benefits and preferred to maintain their existing routines and devices.

ICM - Independent Cascade Model

The Independent Cascade Model (ICM) is a probabilistic diffusion model that is based on the idea that the spread of information travels through neighbors in a network and therefore has a cascading effect. The model is based on the following assumptions:

A node can only effect its neighbors.
A node can only be in one of two states: active or inactive. For example, a node can be infected or not infected.
A node only has one chance to activate its neighbors.
A node can only go from inactive to active.

The initial setup of the model is as follows:

Each edge has an attribute $p \in [0,1]$ , which is the probability that the node will take over the state of its neighbor. How this probability is calculated depends on the application. For example, in the case of a disease, the probability could be based on a persons age and immune system. In the case of an advertising campaign, the probability could be based on the number of friends that have already seen the ad. You could also just use random probabilities.
A set of nodes $S$ is selected as the initial set of active nodes. All other nodes are inactive.

The model then proceeds in discrete time steps. In each time step, the following happens:

For each node $v \in S$ , the node tries to activate each of its neighbors $u$ . The activation is successful with probability $p_{vu}$ so if we generate a random value $r \in [0,1]$ and it is smaller or equal to $p$ . If the activation is successful, $u$ is added to the set $S_{new}$ .
If $S_{new}$ is empty then the process terminates. Otherwise, $S$ is updated to $S_{new}$ and the process repeats from step 1.

Spread Maximization

When working with the ICM model, we are often interested in finding the set of nodes $S$ that maximizes the spread, for example in an advertising campaign. This is a NP-Hard problem to solve, but we can use a greedy algorithm to find a good but not necessarily optimal solution. (How is this an NP-Hard problem?)

We can denote the spread after the ICM model as $f(S)$ where $S$ is the set of initial nodes. The output of the function is the number of nodes that are active after the ICM model has finished. Using this we can then implement a greedy algorithm that wants to maximize the spread, i.e. find the set of nodes $S$ that maximizes $f(S)$ .

However, we first need to change a few things about the ICM model to make it easier to work with because the model is non-deterministic. Instead of using a random probability $p$ for each edge and then using a random number generator to determine if the edge is activated. We can instead use a fixed $p$ and fixed $r$ for each edge. Another possible approach could be to define an "activation function" that takes the two nodes as input and defines if the edge is activated or not.

For example, we could define the activation function as follows:

a(u,v) = |u - v| \leq 2

Most often, when wanting to maximize the spread, for example of an advertising campaign, we are also on a budget. This means that we can only select a limited number of nodes $k$ as the initial set of active nodes, i.e. $|S| \leq k$ .

The greedy algorithm then works as follows:

Initialize $S = \emptyset$ .

For each vertex $v \in V \land v \notin S$ compute $f(S \cup \{v\})$ .

Select the vertex $v$ where $f(S \cup \{v\})$ is the highest and add it to $S$ . If there are multiple vertices with the same value, select one of them randomly.

If $|S| = k$ then terminate, otherwise repeat from step 2.

Linear Threshold Model

The threshold model is a diffusion model that is based on the idea that a node can only be activated if a certain proportion of its neighbors are already activated. The model is based on the same assumptions as the ICM model:

A node can only effect its neighbors.
A node can only be in one of two states: active or inactive. For example, a node can be infected or not infected.
A node only has one chance to activate its neighbors.
A node can only go from inactive to active.

In the model we define a threshold $t_v$ for each node $v$ . The threshold is a value between $0$ and $1$ and defines the proportion of neighbors that need to be active for the node to be activated. For example, if $t_v = 0.5$ then at least half of the neighbors of $v$ need to be active for $v$ to be activated.

For the algorithm we then define an initial set of active nodes $S$ and then in each time step we do the following:

For each node $v \in V \land v \notin S$ we compute the proportion of active neighbors $p_v$ .

If $p_v \geq t_v$ then we add $v$ to the set $S_{new}$ .

If $S_{new}$ is empty then the process terminates. Otherwise, $S$ is merged with $S_{new}$ , i.e. $S = S \cup S_{new}$ , and the process repeats back to the initial step 1.

An example of the linear threshold model, where the threshold is $0.5$ for all nodes.

Voter Model

The voter model is a simple probabilistic diffusion model. To start the model, each node is assigned a random state which is either $0$ or $1$ . In each time step, a node is selected at random and then one of its neighbors is also selected at random. The node then adopts the state of the selected neighbor. The process repeats until all nodes have the same state.

Node Embeddings General Definition