The Ubiquity of Graph Theory

Ruby Abrams
5 min readJun 19, 2022

--

A brief demonstration of how Graph Theory can be applied.

Introduction: I discovered this theory back in 2016 and got immediately hooked on the consequences of thinking in this way. In this article, a graph is a set of nodes coupled with a set of edges, where edges are defined as pairs of nodes.

A picture is worth a thousand words. The number of the node in the graph on the left corresponds to each row and column of the matrix on the right. The i^th node is connected to the j^th node with weight w_ij. It’s called the Adjacency (go figure). source

The matrix representation is a topological invariant that captures all necessary information about the graph. Studying these matrices reveals hidden characteristics about the graph. Properties we care about include: number of connected components, connectivity, is it planar, etc. Let me show you a few applications of how I’ve used this theory.

Spread of infectious disease. Back in 2015, just after the Ebola epidemic in North Africa, I worked at Brown University on modeling the spread of infectious disease. We took a unique approach to this. We used SIR and SIS epidemic models to simulate Meningitis, the common cold, and even Ebola. The assumed difference between these diseases is the reproductive number, R0, the average number of people who would get infected if an infected person is dropped in a pool of susceptible people. If R0 > 1, the disease grows exponentially. If R0 < 1, the disease deteriorates exponentially. We spread these simulated infections on graphs representing communities extracted from diary-based data. We sought to answer the question: Through what common daily interaction (home, social, work, school, travel, shops, etc) is one most likely to contract an infectious disease? Spoiler alert: In Warwick, England, it’s through work interactions, where people spend most of their time.

Diary-based data: the participants of the study in Warwick tracked: who they came into contact with, in what environment the interaction took place, and when the interaction occurred. We analyzed statistics of interactions to generate graphs that had the same statistical characteristics. Then we super imposed all graphs to create an approximation of the connections in the community and ran our simulation of SIR and SIS models on this graph. You can find the published paper here.

This represents the structure of mid to large sized business corporations (each color graph is a different business). This example network was generated using the Barabasi-Albert algorithm. The philosophy of the algorithm is “the rich get richer”: whenever adding a node to the graph, it is more likely to be attached to a more popular node.

Stock Market Analysis. In 2016–17, I worked on treating stocks as nodes in a graph. Here edges between the nodes represented correlation between stocks. We used the Spearman Rank Correlation Coefficient which gives a number between -1 and 1 as a measure of correlation between stocks. If the number is close to 0, there is little to no correlation. Using this idea, we created graphs reflecting the correlation between every pair of stocks and applied techniques to identify clusters of stocks that are most correlated. We sought to find the stock(s) that are most indicative of stock market movement.

Each node represents a stock in S&P500. Edges represent correlations between stocks. Network analysis was applied to find stocks that are most indicative of market movement.

Using this technique, we characterized how the Market recovered after the 2008 GFC. Immediately after the crash, there was a huge drop in correlation between finance sector and all other sectors — finance sector was affected immediately — and high correlation between financial sector stocks. Later, all other sectors dropped in value as the crash propagated to other sectors, resulting in an increased correlation of other sectors to that of finance companies. This shows how the finance sector can be an indicator of market direction. You can find this paper here.

Scheduling / flow problems. During the summer of 2019, I was lucky to spend a summer in the amazing Berlin where my team was tasked with creating ticket inspector schedules for the Deutsche Bahn trains that maximize the number of tickets inspected (that’s a mouthful…). We did this by formulating a combinatorial optimization problem on a network flow graph. I will focus on the network flow bit.

We created a graph whose nodes were events and edges were train trips between events. An event consists of a train arrival or departure at a given time and place. A train trip is an edge with an associated number of people on board, a trip duration, and connects events. Since time is embedded into the nodes of the graph, the graph has a very important property: a direction of flow of time. No future nodes can point to past nodes. You can find the full article here.

Example information embedded in trips. A train leaves station A at time t1 and arrives at station B at time t2. The edge contains information too: x is a binary decision variable determining if an inspector is on the train or not, V is the number of passengers on edge e, and t is the duration of the trip.
Example inspector schedule generated on with the network flow. Source and Sink nodes are dummy nodes. Our software maximized number of passenger tickets inspected subject to constraints (time flows forward, inspector must work at most 8 hours a day, etc). Please excuse the unrealistic destination times.

Conclusion: Graphs capture relationships and connectivity between objects of the same type. Prolific Hungarian mathematician Paul Erdös made contributions to graph theory, namely, the Erdös-Renyi graph. A few other applications of graph theory and large scale computing in the information age follow. Consider processing/sifting through all the tweets on Twitter. Fundamentally, our economy is based off the exchange of value between individuals and businesses. In neuroscience, transmission of information (through neurotransmitters into dendrites and action potentials along axons) through neurons results in consciousness. And the protein folding problem!

In this article, I demonstrated three examples of applications of graphs that represent people, stocks, and train arrival/departure events. Many different problems can be formulated on graphs. The examples I provide here are ones I’ve used in my own work. My goal is to unlock my reader’s imagination and to let them see the world through these new glasses.

I would like to thank my collaborators who have generated some of these graphs: José Celaya-Alcalá, Hai Nguyen, and Nathan May.

--

--

Ruby Abrams
Ruby Abrams

Written by Ruby Abrams

Applied Mathematician working on Digital Health Technologies for rare diseases.

No responses yet