Doctor of Philosophy (PhD)
The behavior of a modern system lives in a complex landscape that is unique to its particular application. In this work we describe and analyze the behavior of two modern computational systems: a Linux server and the National Market System (NMS). Though this work is diverse in both the type and scale of the systems under study, it is unified through the design and implementation of computationally tractable quantitative metrics aimed at defining the behavioral state of these systems. Understanding the behavior of these systems allows us to ensure their desired operation. In the case of a server, we need to be alerted quickly when the system is compromised. Similarly, we need to know when a systematic or structural change in the NMS has unintended side effects.
We first explore methods for host-based intrusion detection. Host-based Intrusion Detection Systems (HIDS) automatically detect events that indicate compromise of the host by adversarial applications. We propose and implement a full pipeline for HIDS development on an arbitrary host system. Our methodology first learns the sequence structure in system calls on an uncompromised host by predicting future calls. We then use predictions from this model to detect anomalies at the application level. Our pipeline is evaluated on an existing event sequence corpus and on PLAID. The PLAID Lab Artificial Intrusion Dataset is a new corpus for HIDS development that we created to be more representative of modern systems. In addition, we characterize differences in attack and baseline behavior using allotaxonographs.
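The idea of learning baseline call structure by predicting the next call, then flagging poorly predicted traces, can be illustrated with a minimal sketch. The abstract does not name the predictive model, so this example substitutes a simple bigram model with add-one smoothing purely for illustration; the function names `fit_bigram` and `anomaly_score` and the toy traces are hypothetical and are not taken from the dissertation.

```python
import math
from collections import Counter, defaultdict


def fit_bigram(traces):
    """Count call -> next-call transitions over baseline (uncompromised) traces."""
    counts = defaultdict(Counter)
    vocab = set()
    for trace in traces:
        vocab.update(trace)
        for prev, nxt in zip(trace, trace[1:]):
            counts[prev][nxt] += 1
    return counts, vocab


def anomaly_score(trace, counts, vocab):
    """Mean negative log-probability of each next call under the baseline model.

    Assumption: add-one smoothing, with one extra slot for calls that never
    appeared in the baseline. Higher scores mean the trace was harder to predict.
    """
    v = len(vocab) + 1
    total = 0.0
    steps = 0
    for prev, nxt in zip(trace, trace[1:]):
        row = counts.get(prev, Counter())
        p = (row[nxt] + 1) / (sum(row.values()) + v)
        total -= math.log(p)
        steps += 1
    return total / steps if steps else 0.0


# Toy data (hypothetical): a repetitive baseline and two traces to score.
baseline = [["open", "read", "read", "close"]] * 20
counts, vocab = fit_bigram(baseline)
normal_score = anomaly_score(["open", "read", "close"], counts, vocab)
odd_score = anomaly_score(["open", "execve", "socket", "connect"], counts, vocab)
```

In this toy setting the trace containing calls unseen in the baseline receives the higher score. A real pipeline would need a threshold chosen on held-out baseline data and a richer sequence model than bigrams.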
Next we turn our attention to the NMS, for which we propose measures to quantify inefficiencies resulting from the geographic fragmentation of the equity marketplace. Using the most comprehensive commercially available dataset of trading activity in U.S. equity markets, we catalog and analyze quote dislocations between the SIP National Best Bid and Offer (NBBO) and a synthetic BBO constructed from direct feeds. We observe a total of over 3.1 billion dislocation segments in the Russell 3000 during trading in 2016, roughly 525 per second of trading. These dislocations do not behave as expected, often persisting meaningfully longer and with higher magnitude than physical constraints suggest. They exhibit a characteristic structure that features more dislocations near the open and close. Around 23% of observed trades executed during dislocations, leading to estimated opportunity costs on the order of $2 billion USD. A subset of the constituents of the S&P 500 index experiences the greatest amount of opportunity cost and appears to drive inefficiencies in other stocks.
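To make the notion of a dislocation segment concrete, the following sketch groups consecutive timestamps at which two quote series disagree into segments with a duration and a peak magnitude. This is an assumed, simplified definition for illustration only; the abstract does not give the exact construction used in the dissertation. The names `Segment` and `dislocation_segments`, the `tol` parameter, and the toy prices (in integer cents) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float    # time the two quotes first disagree
    end: float      # time they agree again
    max_gap: float  # largest absolute price difference within the segment


def dislocation_segments(times, sip, direct, tol=0):
    """Group consecutive timestamps where |sip - direct| > tol into segments.

    Assumption: both series are already aligned on the same timestamps.
    """
    segments = []
    start = None
    max_gap = 0
    for t, s, d in zip(times, sip, direct):
        gap = abs(s - d)
        if gap > tol:
            if start is None:
                start, max_gap = t, gap
            else:
                max_gap = max(max_gap, gap)
        elif start is not None:
            segments.append(Segment(start, t, max_gap))
            start = None
    # A dislocation still open at the end of the data is left out here.
    return segments


# Toy best-bid series in integer cents (hypothetical values).
times = [0, 1, 2, 3, 4, 5]
sip_bid = [1000, 1000, 1000, 1001, 1001, 1001]
direct_bid = [1000, 1001, 1002, 1001, 1000, 1001]
segments = dislocation_segments(times, sip_bid, direct_bid)
durations = [seg.end - seg.start for seg in segments]
```

Summing such segments across symbols and trading days is the kind of tally that statistics like segment counts per second, durations, and magnitudes would be built on.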
Ring, John Henry, "Establishing behavioral baselines for computational systems: two case studies" (2021). Graduate College Dissertations and Theses. 1411.