Levenshtein-distance Stock Typo Analysis
Overview
Link: GitHub Repository
Designed a data analysis pipeline to study the prevalence of buying pressure due to typos made by retail traders.
Key Features
- Automated Pipeline: Automated the retrieval of the latest trade data and the analysis of likely ticker pairs.
- Smart Filtering: Filtered candidate ticker pairs by company name and by edit distance between symbols (Levenshtein distance) to isolate alpha from genuine typos rather than spurious correlations.
- Event Detection: Implemented analysis to detect typo-driven trading on high-volume, high-likelihood days, such as IPO days.
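
The edit-distance filter described above can be illustrated with a minimal sketch. This is not the repository's code: the function names `levenshtein` and `candidate_pairs`, the `max_distance` threshold, and the ticker symbols are all hypothetical placeholders chosen for illustration.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes, and substitutions."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def candidate_pairs(tickers, max_distance=1):
    """Return ticker pairs within max_distance edits (threshold is an assumption)."""
    unique = sorted(set(tickers))
    return [
        (a, b)
        for a, b in combinations(unique, 2)
        if levenshtein(a, b) <= max_distance
    ]


if __name__ == "__main__":
    # Placeholder symbols, not real market data.
    print(candidate_pairs(["ABC", "ABD", "XYZ"]))
```

Comparing every pair is quadratic in the number of tickers, which is fine for a sketch. A larger universe would likely need a pre-filter, for example on symbol length, before computing distances.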
