Financial Market Trade Data Analysis

Introduction

Professor Dave Lesmond (Freeman School of Business) and research assistant Chuan Wang work on identifying and developing profitable trading strategies. This research involves analyzing market activity data to search for patterns and trends. Software tools such as Stata, Python, and R are used.

Researchers conduct analyses in a familiar Python environment while taking advantage of big data processing technologies under the hood to accelerate the research cycle.

Tulane HPC Contribution

Professor Lesmond’s research involves processing tens of terabytes of high-frequency trade data comprising hundreds of billions of trade records. Initial analysis of these data on Cypress showed that a typical computation took around half a day of runtime. Professor Lesmond and research assistant Chuan Wang approached us about using Cypress more efficiently to analyze these data. Tulane HPC staff member Hoang Tran analyzed the existing code and workflow, and recoded the computationally intensive portions to make better use of Cypress resources and technologies. This cut the runtime of a 10-hour process to 5 minutes, roughly a 120-fold speedup. Technologies used included Lustre for fast, parallel access to market data, Apache Spark for distributed big data processing, and Apache Arrow for integration with pandas and NumPy for data analysis in Python. The quick turnaround has greatly eased data exploration and, for the first time, provides a means to properly test the viability of trading strategies using high-frequency trade data. Professor Lesmond expects to expand this project as additional data are obtained.
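To illustrate how Spark, Arrow, and pandas can fit together in a workflow of this kind, here is a minimal sketch. It is not the project's actual code: the Parquet storage format, the column names (symbol, trade_date, price, size), the aggregation, and both function names are assumptions made for illustration. The one real setting shown is the Spark 3.x option that routes Spark-to-pandas conversion through Apache Arrow.

```python
# Hypothetical sketch: the data layout, column names, and aggregation are
# illustrative assumptions, not the project's actual code.

# Spark 3.x setting that routes Spark-to-pandas conversion through Apache Arrow.
ARROW_CONF = {"spark.sql.execution.arrow.pyspark.enabled": "true"}


def build_session(app_name="trade-analysis-sketch"):
    """Create a SparkSession with Arrow-based pandas conversion enabled."""
    from pyspark.sql import SparkSession  # imported lazily; requires PySpark

    builder = SparkSession.builder.appName(app_name)
    for key, value in ARROW_CONF.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def daily_summary(spark, trades_path):
    """Aggregate trades per symbol and day in Spark; return a pandas DataFrame."""
    from pyspark.sql import functions as F

    # Assumed: trade records stored as Parquet on the parallel filesystem.
    trades = spark.read.parquet(trades_path)
    summary = trades.groupBy("symbol", "trade_date").agg(
        F.count("*").alias("n_trades"),
        F.sum("size").alias("volume"),
        # Volume-weighted average price for the day.
        (F.sum(F.col("price") * F.col("size")) / F.sum("size")).alias("vwap"),
    )
    # Only the small aggregated result is collected to the driver as pandas.
    return summary.toPandas()
```

In use, a researcher would call build_session() once and then daily_summary(spark, path) with the location of the trade files, continuing the analysis on the returned DataFrame with ordinary pandas and NumPy code. The design point is that the heavy scan and aggregation stay distributed in Spark, and only the reduced result crosses into Python.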

Participants

  • Dave Lesmond (Principal Investigator, Professor, Freeman School of Business)
  • Chuan Wang (Research Assistant, Freeman School of Business)
  • Hoang Tran (Tulane HPC)

More information on Professor Lesmond’s research

Contact Tulane HPC for Assistance