The Ubiquity of Graph Theory
A brief demonstration of how Graph Theory can be applied.
Introduction: I discovered this theory back in 2016 and got immediately hooked on the consequences of thinking in this way. In this article, a graph is a set of nodes coupled with a set of edges, where edges are defined as pairs of nodes.
The matrix representation is a topological invariant that captures all necessary information about the graph. Studying these matrices reveals hidden characteristics about the graph. Properties we care about include: number of connected components, connectivity, is it planar, etc. Let me show you a few applications of how I’ve used this theory.
Spread of infectious disease. Back in 2015, just after the Ebola epidemic in North Africa, I worked at Brown University on modeling the spread of infectious disease. We took a unique approach to this. We used SIR and SIS epidemic models to simulate Meningitis, the common cold, and even Ebola. The assumed difference between these diseases is the reproductive number, R0, the average number of people who would get infected if an infected person is dropped in a pool of susceptible people. If R0 > 1, the disease grows exponentially. If R0 < 1, the disease deteriorates exponentially. We spread these simulated infections on graphs representing communities extracted from diary-based data. We sought to answer the question: Through what common daily interaction (home, social, work, school, travel, shops, etc) is one most likely to contract an infectious disease? Spoiler alert: In Warwick, England, it’s through work interactions, where people spend most of their time.
Diary-based data: the participants of the study in Warwick tracked: who they came into contact with, in what environment the interaction took place, and when the interaction occurred. We analyzed statistics of interactions to generate graphs that had the same statistical characteristics. Then we super imposed all graphs to create an approximation of the connections in the community and ran our simulation of SIR and SIS models on this graph. You can find the published paper here.
Stock Market Analysis. In 2016–17, I worked on treating stocks as nodes in a graph. Here edges between the nodes represented correlation between stocks. We used the Spearman Rank Correlation Coefficient which gives a number between -1 and 1 as a measure of correlation between stocks. If the number is close to 0, there is little to no correlation. Using this idea, we created graphs reflecting the correlation between every pair of stocks and applied techniques to identify clusters of stocks that are most correlated. We sought to find the stock(s) that are most indicative of stock market movement.
Using this technique, we characterized how the Market recovered after the 2008 GFC. Immediately after the crash, there was a huge drop in correlation between finance sector and all other sectors — finance sector was affected immediately — and high correlation between financial sector stocks. Later, all other sectors dropped in value as the crash propagated to other sectors, resulting in an increased correlation of other sectors to that of finance companies. This shows how the finance sector can be an indicator of market direction. You can find this paper here.
Scheduling / flow problems. During the summer of 2019, I was lucky to spend a summer in the amazing Berlin where my team was tasked with creating ticket inspector schedules for the Deutsche Bahn trains that maximize the number of tickets inspected (that’s a mouthful…). We did this by formulating a combinatorial optimization problem on a network flow graph. I will focus on the network flow bit.
We created a graph whose nodes were events and edges were train trips between events. An event consists of a train arrival or departure at a given time and place. A train trip is an edge with an associated number of people on board, a trip duration, and connects events. Since time is embedded into the nodes of the graph, the graph has a very important property: a direction of flow of time. No future nodes can point to past nodes. You can find the full article here.
Conclusion: Graphs capture relationships and connectivity between objects of the same type. Prolific Hungarian mathematician Paul Erdös made contributions to graph theory, namely, the Erdös-Renyi graph. A few other applications of graph theory and large scale computing in the information age follow. Consider processing/sifting through all the tweets on Twitter. Fundamentally, our economy is based off the exchange of value between individuals and businesses. In neuroscience, transmission of information (through neurotransmitters into dendrites and action potentials along axons) through neurons results in consciousness. And the protein folding problem!
In this article, I demonstrated three examples of applications of graphs that represent people, stocks, and train arrival/departure events. Many different problems can be formulated on graphs. The examples I provide here are ones I’ve used in my own work. My goal is to unlock my reader’s imagination and to let them see the world through these new glasses.
I would like to thank my collaborators who have generated some of these graphs: José Celaya-Alcalá, Hai Nguyen, and Nathan May.