
Factor-Based Investing Model
We develop a Factor-Based Investing Model to construct and evaluate portfolios based on fundamental and quantitative factors. Factor-based investing aims to systematically capture specific sources of returns, or "factors," that have historically provided excess returns over the market.

Methodology:
We will use a multifactor approach, combining factors such as value, momentum, quality, and size, to construct portfolios that aim to outperform the market. This methodology leverages quantitative analysis and factor modeling techniques to identify and exploit market anomalies.
Project Duration: 05.02.24 - 11.02.24 (One Week)
Industry: Finance & Stock Market
My Role: Quantitative Trader, Python Programmer, Data Analyst
Deliverables: Executable scripts that collect financial data, analyse it, and place trades with proper risk management

Methodology
- Data Collection: Obtain historical price data for a pair of stocks from the NSE using the yfinance library.
- Pair Selection: Identify a suitable pair of stocks exhibiting cointegration, a prerequisite for pairs trading. This can be done using statistical tests such as the Augmented Dickey-Fuller (ADF) test.
- Spread Calculation: Calculate the spread between the prices of the two stocks in the pair. This can be done by taking the difference between their prices or using more sophisticated methods like the Kalman Filter.
- Spread Z-score Calculation: Compute the z-score of the spread to standardize it and identify deviations from its mean.
- Trading Signal Generation: Generate trading signals based on the z-score exceeding predefined thresholds, indicating potential entry or exit points.
- Trade Execution: Execute trades when the trading signals are triggered, taking into account considerations such as transaction costs and position sizing.
- Risk Management: Implement risk management measures to limit losses and manage portfolio exposure.
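The z-score and signal-generation steps above can be sketched as follows. This is a minimal illustration, not the project's actual script: the function names zscore and generate_signals, the entry threshold of 2.0, the exit threshold of 0.5, and the synthetic spread are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd


def zscore(spread: pd.Series) -> pd.Series:
    """Standardize the spread using its full-sample mean and standard deviation."""
    return (spread - spread.mean()) / spread.std()


def generate_signals(z: pd.Series, entry: float = 2.0, exit_: float = 0.5) -> pd.Series:
    """Return the target position in the spread for each bar.

    +1 = long the spread (buy stock 1, sell stock 2), -1 = short the spread, 0 = flat.
    The thresholds are illustrative assumptions, not values from the project.
    """
    position = 0
    positions = []
    for value in z:
        if position == 0:
            if value > entry:
                position = -1   # spread unusually high: short it
            elif value < -entry:
                position = 1    # spread unusually low: long it
        elif abs(value) < exit_:
            position = 0        # spread has moved back towards its mean: close
        positions.append(position)
    return pd.Series(positions, index=z.index, name="position")


# Synthetic mean-reverting spread (AR(1) process), used only for illustration
rng = np.random.default_rng(0)
noise = rng.normal(size=250)
values = np.zeros(250)
for t in range(1, 250):
    values[t] = 0.9 * values[t - 1] + noise[t]
spread = pd.Series(values, index=pd.bdate_range("2023-01-02", periods=250))

z = zscore(spread)
signals = generate_signals(z)
print(signals.value_counts())
```

Using the full-sample mean and standard deviation keeps the sketch short, but it uses information from the whole period; a rolling window would be one way to avoid that in a backtest.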
Code Implementation

Step 1: Data Collection (Script One)

Python code:

import yfinance as yf
# Define stock symbols and date range
symbol1 = 'RELIANCE.NS'
symbol2 = 'TCS.NS'
start_date = '2023-01-01'
end_date = '2024-01-01'
# Fetch historical price data from Yahoo Finance
data1 = yf.download(symbol1, start=start_date, end=end_date)
data2 = yf.download(symbol2, start=start_date, end=end_date)
Explanation:
- This section imports the yfinance library, which is a Python package for fetching historical market data from Yahoo Finance.
- You specify the symbols of two stocks (RELIANCE.NS for Reliance Industries and TCS.NS for Tata Consultancy Services) and the date range for which you want historical data.
- The yf.download() function is used to fetch historical price data for each stock within the specified date range and store it in data1 and data2.
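Before the two downloads are compared, it can help to place both closing-price series on a common set of dates. The sketch below is one possible way to do that with pandas. The helper names extract_close and align_closes are hypothetical, and the small hand-built frames merely stand in for data1 and data2 so the example runs without a network connection. It assumes that, depending on the yfinance version, the 'Close' selection may come back either as a Series or as a one-column DataFrame keyed by ticker.

```python
import pandas as pd


def extract_close(data: pd.DataFrame) -> pd.Series:
    """Return the 'Close' column as a 1-D Series, whatever the column layout."""
    close = data["Close"]
    if isinstance(close, pd.DataFrame):
        # Multi-level columns: 'Close' selects a one-column frame keyed by ticker
        close = close.iloc[:, 0]
    return close


def align_closes(data1: pd.DataFrame, data2: pd.DataFrame) -> pd.DataFrame:
    """Combine both close series on their dates and drop incomplete rows."""
    prices = pd.concat(
        {"stock1": extract_close(data1), "stock2": extract_close(data2)}, axis=1
    )
    return prices.dropna()


# Stand-in frames (assumed shapes, not real market data)
dates = pd.date_range("2023-01-02", periods=4, freq="D")
flat = pd.DataFrame({"Close": [100.0, 101.0, 102.0, 103.0]}, index=dates)
multi = pd.DataFrame(
    [[50.0], [51.0], [52.0]],
    index=dates[1:],
    columns=pd.MultiIndex.from_tuples([("Close", "TCS.NS")]),
)

prices = align_closes(flat, multi)
print(prices)
```

With the real downloads, the call would be align_closes(data1, data2); only dates present in both series are kept.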
Step 2: Pair Selection & Spread Calculation (Script Two)

Python code:

# Check for cointegration using the ADF test
from statsmodels.tsa.stattools import adfuller

def test_cointegration(series1, series2, alpha=0.05):
    # Simplified check: ADF test on the raw price difference (hedge ratio of 1).
    # Subtraction aligns on dates; dropna() removes dates missing in either series.
    diff = (series1 - series2).dropna()
    p_value = adfuller(diff)[1]
    return p_value < alpha

# Assuming 'Close' prices are used. squeeze() turns a one-column DataFrame
# (returned by some yfinance versions) into a Series.
stock1_close = data1['Close'].squeeze()
stock2_close = data2['Close'].squeeze()

# Check for cointegration
is_cointegrated = test_cointegration(stock1_close, stock2_close)
if is_cointegrated:
    spread = (stock1_close - stock2_close).dropna()
    # Calculate spread mean and standard deviation for z-score calculation
    spread_mean = spread.mean()
    spread_std = spread.std()
else:
    print("Selected pair is not cointegrated. Choose another pair.")
Explanation:
- In this step, we define a function test_cointegration() that applies the Augmented Dickey-Fuller (ADF) test to the difference between the two stock price series. A p-value below 0.05 is treated as evidence that the difference is stationary.
- We extract the 'Close' price data for both stocks from the fetched historical data.
- The test_cointegration() function is then used to check if the pair of stocks is cointegrated (i.e., if there is a long-term relationship between their prices). Testing the raw difference assumes a hedge ratio of 1 between the two stocks.
- If the pair is cointegrated, we calculate the spread between the prices of the two stocks (spread) and compute the mean and standard deviation of the spread for later use in z-score calculation. Otherwise, a message is printed asking for another pair.
