Terminology and Definitions
Cross-Asset Alpha Engine Glossary
Alpha Generation Terms
- Alpha
- Excess return generated by a trading strategy relative to a benchmark, representing the value added by active management.
- Cross-Asset Alpha
- Alpha generated by exploiting relationships and inefficiencies across multiple asset classes (equities, bonds, commodities, currencies).
- Regime-Aware Alpha
- Alpha generation that adapts to different market conditions or regimes, using different models or parameters for different market environments.
- Information Ratio
- Risk-adjusted measure of alpha generation, calculated as excess return divided by tracking error (standard deviation of excess returns).
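As an illustration of the Information Ratio definition above, the following is a minimal sketch in Python. The function name `information_ratio` and the sample return series are hypothetical, and the result is per period (not annualized).

```python
import statistics

def information_ratio(strategy_returns, benchmark_returns):
    """Mean excess return divided by tracking error (per period, not annualized)."""
    excess = [s - b for s, b in zip(strategy_returns, benchmark_returns)]
    tracking_error = statistics.stdev(excess)  # sample standard deviation
    if tracking_error == 0:
        raise ValueError("tracking error is zero; information ratio is undefined")
    return statistics.mean(excess) / tracking_error

# Hypothetical per-period returns, for illustration only.
strategy = [0.012, -0.004, 0.009, 0.003]
benchmark = [0.010, -0.006, 0.004, 0.004]
ir = information_ratio(strategy, benchmark)
print(round(ir, 4))
```

Multiplying the per-period figure by the square root of the number of periods per year is a common way to annualize it, under the assumption that excess returns are uncorrelated across periods.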
Market Regime Terms
- Market Regime
- Distinct period in financial markets characterized by its own risk-return dynamics, volatility patterns, and asset correlations.
- Hidden Markov Model (HMM)
- Statistical model assuming that market observations are generated by an underlying, unobservable regime state that follows a Markov process.
- Regime Detection
- Process of identifying and classifying different market regimes using statistical or machine learning methods.
- Regime Transition
- The process of moving from one market regime to another, often triggered by economic events or structural changes.
- State Space Model
- Mathematical framework where the system state (regime) is not directly observable but can be inferred from observable variables.
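To make the Hidden Markov Model and Regime Detection entries concrete, here is a minimal sketch of the forward (filtering) recursion for a two-regime Gaussian HMM. It assumes the regime parameters are already known; in practice they would be estimated from data (for example with the Baum-Welch algorithm). All names and numeric values below are hypothetical and are not taken from the engine itself.

```python
import math

def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def filter_regimes(returns, means, stds, transition, initial):
    """Forward recursion: P(regime at t | returns up to t) for every t."""
    n_states = len(means)
    belief = list(initial)
    history = []
    for t, r in enumerate(returns):
        if t > 0:
            # Predict: push the previous belief through the transition matrix.
            belief = [sum(belief[i] * transition[i][j] for i in range(n_states))
                      for j in range(n_states)]
        # Update: weight each regime by how well it explains the new return.
        weighted = [belief[j] * gaussian_pdf(r, means[j], stds[j])
                    for j in range(n_states)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
    return history

# Hypothetical parameters: regime 0 is calm, regime 1 is turbulent.
means = [0.0005, -0.001]
stds = [0.007, 0.025]
transition = [[0.97, 0.03],   # row i holds P(next regime = j | current regime = i)
              [0.10, 0.90]]
returns = [0.002, -0.001, 0.003, -0.045, 0.038, -0.030]
beliefs = filter_regimes(returns, means, stds, transition, initial=[0.5, 0.5])
for r, b in zip(returns, beliefs):
    print(f"{r:+.3f}  P(turbulent) = {b[1]:.3f}")
```

The small early returns keep most of the probability on the calm regime, while the large moves at the end shift it to the turbulent regime; a regime transition in the glossary's sense corresponds to that shift in the filtered probabilities.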
Feature Engineering Terms
- Technical Features
- Quantitative indicators derived from price and volume data, including momentum, volatility, and mean reversion signals.
- Daily Microstructure-Inspired Features
- Features inspired by microstructure concepts but computed from daily OHLCV bars only. Includes VWAP deviations, volume anomalies, and daily price patterns. Note: True intraday trading patterns and bid-ask dynamics require intraday/tick data, which is not used in the current experiment.
- Cross-Asset Features
- Indicators that capture relationships between different asset classes, such as correlations, volatility spillovers, and risk sentiment.
- Feature Importance
- Measure of how much each feature contributes to model predictions, typically calculated using methods like Gini importance or permutation importance.
- Z-Score Normalization
- Statistical technique to standardize features by subtracting the mean and dividing by standard deviation, ensuring features have zero mean and unit variance.
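The Z-Score Normalization entry can be sketched as follows. The helper names `fit_zscore` and `apply_zscore` are hypothetical; splitting the step into "fit" and "apply" is an assumption made here so that the mean and standard deviation can be estimated on training data only and then reused on later data.

```python
import statistics

def fit_zscore(train_values):
    """Estimate the mean and (population) standard deviation from training data."""
    mean = statistics.mean(train_values)
    std = statistics.pstdev(train_values)
    if std == 0:
        raise ValueError("feature is constant; z-score is undefined")
    return mean, std

def apply_zscore(values, mean, std):
    """Standardize values using previously fitted statistics."""
    return [(v - mean) / std for v in values]

# Hypothetical feature values, for illustration only.
train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, std = fit_zscore(train)
z_train = apply_zscore(train, mean, std)
z_new = apply_zscore([6.0, 11.0], mean, std)  # later data reuses the training statistics
print(z_train)
print(z_new)
```

Only the data used for fitting is guaranteed to end up with zero mean and unit variance; later data standardized with the same statistics will generally deviate from that.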
Statistical and ML Terms
- Random Forest
- Ensemble machine learning method that combines multiple decision trees to make predictions, providing robustness and feature importance rankings.
- Ensemble Method
- Machine learning technique that combines predictions from multiple models to improve overall performance and reduce overfitting.
- Walk-Forward Validation
- Time series validation technique where models are trained on historical data and tested on subsequent out-of-sample periods.
- Cross-Validation
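- Model validation technique that divides data into multiple folds, training on some and testing on the others, to estimate out-of-sample performance and detect overfitting. For time series, the folds should preserve chronological order (see Walk-Forward Validation).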
- Overfitting
- Phenomenon where a model performs well on training data but poorly on new, unseen data due to excessive complexity.
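The Walk-Forward Validation entry can be illustrated with a small index generator. This is a sketch under stated assumptions: a fixed-length rolling training window that advances by one test block each step. The function name and window sizes are hypothetical; an expanding training window is an equally common variant.

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices); each test block follows its training block."""
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll the window forward by one test block

splits = [(list(tr), list(te)) for tr, te in walk_forward_splits(10, 4, 2)]
for train_idx, test_idx in splits:
    print("train", train_idx, "test", test_idx)
```

Because every test index is later than every training index in the same split, the model is never evaluated on data that precedes what it was trained on.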
Risk Management Terms
- Value at Risk (VaR)
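- Statistical measure estimating the loss that a portfolio is not expected to exceed over a specific time horizon at a given confidence level. For example, a 95% one-day VaR is the loss exceeded on roughly 5% of days; it is not a bound on the worst possible loss.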
- Expected Shortfall (ES)
- Risk measure that estimates the expected loss beyond the VaR threshold, also known as Conditional Value at Risk (CVaR).
- Maximum Drawdown
- Largest peak-to-trough decline in portfolio value, representing the worst-case loss scenario during the analysis period.
- Sharpe Ratio
- Risk-adjusted return measure calculated as excess return over the risk-free rate divided by the standard deviation of returns.
- Market Neutrality
- Portfolio construction approach that maintains approximately zero net market exposure (beta ≈ 0) to isolate alpha from market movements.
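The VaR, Expected Shortfall, and Maximum Drawdown entries can be sketched with historical (empirical) estimators. The function names, the tail-count convention, and the sample returns are assumptions for illustration; other quantile conventions and parametric methods are also in common use.

```python
import statistics

def historical_var_es(returns, confidence=0.95):
    """Return (VaR, ES) as positive loss numbers from the empirical distribution."""
    losses = sorted((-r for r in returns), reverse=True)  # worst loss first
    # Number of observations in the tail; at least one so the tail is never empty.
    tail = max(1, round(len(losses) * (1.0 - confidence)))
    var = losses[tail - 1]               # smallest loss inside the tail
    es = statistics.mean(losses[:tail])  # average loss inside the tail
    return var, es

def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

# Hypothetical daily returns, for illustration only.
daily = [0.01, -0.02, 0.005, 0.015, -0.05, 0.02, -0.01, 0.0, 0.03, -0.03]
var_80, es_80 = historical_var_es(daily, confidence=0.80)
mdd = max_drawdown([0.10, -0.20, 0.05])
print(var_80, es_80, mdd)
```

Expected Shortfall is never smaller than VaR at the same confidence level, since it averages the losses at and beyond the VaR threshold.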
Portfolio Construction Terms
- Position Sizing
- Process of determining the appropriate allocation to each asset in a portfolio based on expected returns, risk, and constraints.
- Risk Parity
- Portfolio construction approach where each asset contributes equally to total portfolio risk. It is often approximated by inverse volatility weighting, which yields exactly equal risk contributions only when all pairwise correlations are equal.
- Kelly Criterion
- Mathematical formula for optimal position sizing that maximizes long-term growth rate based on win probability and payoff ratios.
- Gross Exposure
- Sum of absolute values of all portfolio positions, representing total capital deployed regardless of direction.
- Net Exposure
- Sum of all portfolio positions considering direction (long minus short), a rough proxy for directional market exposure. It is not the same as beta, which also depends on each position's sensitivity to the market.
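Several of the portfolio construction terms above reduce to short formulas. The sketch below shows inverse volatility weighting (the approximation to risk parity mentioned above), the Kelly fraction for a simple binary bet, and gross and net exposure. Function names, tickers used as dictionary keys, and all numbers are hypothetical.

```python
def inverse_volatility_weights(volatilities):
    """Weights proportional to 1 / volatility, normalized to sum to 1."""
    inverse = {asset: 1.0 / vol for asset, vol in volatilities.items()}
    total = sum(inverse.values())
    return {asset: value / total for asset, value in inverse.items()}

def kelly_fraction(win_probability, payoff_ratio):
    """Kelly fraction for a binary bet: f = p - (1 - p) / b."""
    return win_probability - (1.0 - win_probability) / payoff_ratio

def exposures(positions):
    """Return (gross, net) exposure from signed position weights."""
    gross = sum(abs(w) for w in positions.values())
    net = sum(positions.values())
    return gross, net

# Hypothetical annualized volatilities and position weights.
weights = inverse_volatility_weights({"SPY": 0.16, "TLT": 0.12, "GLD": 0.24})
kelly = kelly_fraction(win_probability=0.55, payoff_ratio=1.0)
gross, net = exposures({"SPY": 0.5, "QQQ": -0.3, "TLT": 0.2})
print(weights, kelly, gross, net)
```

The binary-bet Kelly formula assumes a fixed win probability and payoff ratio; practitioners often size at a fraction of the Kelly amount because those inputs are estimated with error.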
Market Data Terms
- OHLCV
- Standard market data format containing Open, High, Low, Close prices and Volume for each time period.
- VWAP (Volume Weighted Average Price)
- Average price weighted by volume, representing the average price at which shares traded during the period; commonly used as an execution benchmark.
- Bid-Ask Spread
- Difference between the highest price a buyer is willing to pay (bid) and the lowest price a seller is willing to accept (ask).
- Market Impact
- Price movement caused by executing a trade, typically modeled as a function of trade size and market liquidity.
- Slippage
- Difference between expected execution price and actual execution price, caused by market movement and liquidity constraints.
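The VWAP and Slippage entries can be illustrated with two small helpers. The function names, the sign convention (positive slippage is a cost), and the sample trades are assumptions for illustration.

```python
def vwap(trades):
    """Volume-weighted average price from (price, volume) pairs."""
    total_volume = sum(volume for _, volume in trades)
    if total_volume == 0:
        raise ValueError("no volume traded")
    return sum(price * volume for price, volume in trades) / total_volume

def slippage_bps(expected_price, fill_price, side=1):
    """Slippage in basis points; side is +1 for a buy and -1 for a sell.

    Positive values mean the fill was worse than expected.
    """
    return side * (fill_price - expected_price) / expected_price * 10_000

# Hypothetical trades as (price, volume).
trades = [(100.0, 200), (101.0, 100), (99.0, 100)]
period_vwap = vwap(trades)
buy_slippage = slippage_bps(expected_price=100.0, fill_price=100.05, side=1)
print(period_vwap, buy_slippage)
```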
Asset Class Definitions
- SPY
- SPDR S&P 500 ETF Trust, tracking the S&P 500 index and representing large-cap US equity exposure.
- QQQ
- Invesco QQQ Trust, tracking the NASDAQ-100 index and representing technology-heavy large-cap growth stocks.
- IWM
- iShares Russell 2000 ETF, tracking the Russell 2000 index and representing small-cap US equity exposure.
- VIX
- CBOE Volatility Index, measuring implied volatility of S&P 500 options and serving as a "fear gauge" for market sentiment.
- TLT
- iShares 20+ Year Treasury Bond ETF, representing long-term US government bond exposure and interest rate sensitivity.
- GLD
- SPDR Gold Trust, providing exposure to gold prices and serving as an inflation hedge and safe-haven asset.
- DXY
- US Dollar Index, measuring the value of the US dollar against a basket of major foreign currencies.
- USO
- United States Oil Fund, an ETF that tracks crude oil prices through futures contracts and represents commodity exposure.
Performance Metrics
- Annualized Return
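- Return scaled to represent performance over a full year. For daily data it is calculated as (1 + period_return)^(252/periods) - 1, where period_return is the cumulative return over the whole sample and periods is the number of trading days in it.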
- Volatility
- Standard deviation of returns, typically annualized by multiplying daily volatility by √252.
- Win Rate
- Percentage of trading periods with positive returns.
- Calmar Ratio
- Risk-adjusted return measure calculated as annualized return divided by the absolute value of maximum drawdown.
- Sortino Ratio
- Modified Sharpe ratio that only considers downside volatility in the denominator, focusing on harmful volatility.
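The performance metrics above can be sketched for a series of daily returns as follows. The 252-day year follows the glossary; the function names, the zero default for the risk-free rate and target return, and the sample returns are assumptions for illustration.

```python
import math
import statistics

TRADING_DAYS = 252

def annualized_return(daily_returns):
    """(1 + cumulative return)^(252 / number of days) - 1."""
    growth = math.prod(1.0 + r for r in daily_returns)
    return growth ** (TRADING_DAYS / len(daily_returns)) - 1.0

def annualized_volatility(daily_returns):
    return statistics.stdev(daily_returns) * math.sqrt(TRADING_DAYS)

def sharpe_ratio(daily_returns, risk_free_daily=0.0):
    excess = statistics.mean(daily_returns) - risk_free_daily
    return excess / statistics.stdev(daily_returns) * math.sqrt(TRADING_DAYS)

def sortino_ratio(daily_returns, target=0.0):
    """Like Sharpe, but the denominator uses only returns below the target."""
    squared_shortfalls = [min(0.0, r - target) ** 2 for r in daily_returns]
    downside_dev = math.sqrt(sum(squared_shortfalls) / len(daily_returns))
    if downside_dev == 0:
        raise ValueError("no returns below target; Sortino ratio is undefined")
    return (statistics.mean(daily_returns) - target) / downside_dev * math.sqrt(TRADING_DAYS)

def win_rate(daily_returns):
    return sum(r > 0 for r in daily_returns) / len(daily_returns)

# Hypothetical daily returns, for illustration only.
sample = [0.01, -0.005, 0.002, 0.0, 0.004]
print(annualized_return(sample), annualized_volatility(sample))
print(sharpe_ratio(sample), sortino_ratio(sample), win_rate(sample))
```

With only five observations the annualized figures are not meaningful; the sample is there to show the arithmetic, not to suggest realistic magnitudes.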
Execution Terms
- TWAP (Time Weighted Average Price)
- Execution strategy that spreads trades evenly over time to minimize market impact.
- Implementation Shortfall
- Difference between the decision price and the final execution price, including market impact and timing costs.
- Participation Rate
- Percentage of total market volume that a trading algorithm is allowed to consume during execution.
- Transaction Costs
- Total cost of executing trades, including commissions, bid-ask spreads, market impact, and opportunity costs.
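The TWAP and Implementation Shortfall entries can be illustrated with a simple schedule and a cost calculation. This sketch covers only the executed part of an order and ignores the opportunity cost of unfilled shares; function names and numbers are hypothetical.

```python
def twap_schedule(total_shares, n_slices):
    """Split an order into near-equal child orders, one per time slice."""
    base, remainder = divmod(total_shares, n_slices)
    # Spread any remainder over the earliest slices so the total is preserved.
    return [base + (1 if i < remainder else 0) for i in range(n_slices)]

def implementation_shortfall_bps(decision_price, fills, side=1):
    """Shortfall of the average fill versus the decision price, in basis points.

    fills is a list of (price, shares); side is +1 for a buy and -1 for a sell.
    """
    shares = sum(quantity for _, quantity in fills)
    average_price = sum(price * quantity for price, quantity in fills) / shares
    return side * (average_price - decision_price) / decision_price * 10_000

schedule = twap_schedule(total_shares=1000, n_slices=3)
shortfall = implementation_shortfall_bps(100.0, [(100.1, 500), (100.3, 500)], side=1)
print(schedule, shortfall)
```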
Statistical Terms
- Autocorrelation
- Correlation of a time series with a delayed copy of itself, measuring the persistence of trends or mean reversion.
- Stationarity
- Statistical property where the mean, variance, and autocorrelation structure remain constant over time.
- Heteroskedasticity
- Condition where the variance of errors is not constant across observations, common in financial time series.
- Multicollinearity
- High correlation between predictor variables that can cause instability in model coefficients.
- P-Value
- Probability of observing results at least as extreme as those observed, assuming the null hypothesis is true.
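The Autocorrelation entry can be illustrated with the standard sample estimator. The function name and the two toy series are hypothetical: one alternates in sign (mean-reverting behavior) and one trends steadily (persistence).

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    covariance = sum((series[t] - mean) * (series[t - lag] - mean)
                     for t in range(lag, n))
    return covariance / variance

alternating = [1, -1, 1, -1, 1, -1]
trending = [1, 2, 3, 4, 5, 6]
print(autocorrelation(alternating))  # negative: moves tend to reverse
print(autocorrelation(trending))     # positive: moves tend to persist
```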
Backtesting Terms
- In-Sample Period
- Historical data used to train and optimize models, also known as the training set.
- Out-of-Sample Period
- Historical data reserved for testing model performance, simulating real-world deployment conditions.
- Look-Ahead Bias
- Error in backtesting where future information is inadvertently used to make historical decisions.
- Survivorship Bias
- Bias that occurs when analysis only includes assets that survived the entire period, ignoring delisted or failed assets.
- Data Snooping
- Bias that results from testing multiple strategies on the same dataset and selecting the best-performing one.
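Look-ahead bias often enters a backtest through a one-day alignment error between signals and returns. The sketch below contrasts a lagged calculation with a leaky one; the function names and data are hypothetical, and it assumes each signal is computed from information available at the close of its own day.

```python
def lagged_strategy_returns(signals, asset_returns):
    """Apply the signal known at the close of day t-1 to the return of day t."""
    return [signals[t - 1] * asset_returns[t] for t in range(1, len(asset_returns))]

def leaky_strategy_returns(signals, asset_returns):
    """Incorrect: uses the day-t signal on the day-t return it may have been built from."""
    return [s * r for s, r in zip(signals, asset_returns)]

# Hypothetical signals (+1 long, -1 short) and daily asset returns.
signals = [1, -1, 1]
asset_returns = [0.01, 0.02, -0.01]
lagged = lagged_strategy_returns(signals, asset_returns)
leaky = leaky_strategy_returns(signals, asset_returns)
print(lagged, leaky)
```

The two versions can produce very different results from the same inputs, which is why the alignment between signal timestamps and return timestamps is worth checking explicitly.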
This glossary defines the key terms used throughout the Cross-Asset Alpha Engine documentation and analysis.