Retrieve data for various multi-factor asset pricing models.

x512/getfactormodels


getfactormodels


A command-line tool to retrieve data for multi-factor asset pricing models.

Models

  • The 3-factor, 5-factor, and 6-factor models of Fama & French [1] [3] [4]
  • Mark Carhart's 4-factor model [2]
  • Pastor and Stambaugh's liquidity factors [5]
  • Mispricing factors of Stambaugh and Yuan [6]
  • The $q$-factor model of Hou, Mo, Xue and Zhang [7]
  • The augmented $q^5$-factor model of Hou, Mo, Xue and Zhang [8]
  • Intermediary Capital Ratio (ICR) of He, Kelly & Manela [9]
  • The DHS behavioural factors of Daniel, Hirshleifer & Sun [10]
  • The HML $^{DEVIL}$ factor of Asness & Frazzini [11]
  • Betting Against Beta, Frazzini & Pedersen (2014) [12]
  • Quality Minus Junk, Asness, Frazzini & Pedersen (2017) [13]
  • The 6-factor model of Barillas and Shanken [14]

Thanks to: Kenneth French, Robert Stambaugh, Lin Sun, Zhiguo He, AQR Capital Management (AQR.com) and Hou, Xue and Zhang (global-q.org), for their research and for the datasets they provide.

Installation

Important

getfactormodels is pre-alpha (until version 0.1.0); don't rely on it for anything.


But a huge thanks to anyone who has tried it!

Requires:

  • Python >=3.10

The easiest way to install getfactormodels is with pip:

pip install getfactormodels

Quick start

CLI

# Fama-French 5-Factor model, monthly
getfactormodels --model ff5 --frequency m

# q-factor model's weekly 'R_IA' factor since 2000 (using -x/--extract)
getfactormodels -m q -f w --start 2000 -x "R_IA"

# Australia, Quality Minus Junk (daily) saved to file:
getfactormodels -m qmj -f d --region aus --output aus_qmj.ipc

# Get multiple models in one table, e.g., ff3 + liquidity factors:
getfactormodels -m ff3 liq

# Attach a Fama-French portfolio (30 industry portfolios)
getfactormodels -m ff6 -f d --industry 30

# Alternatively, --portfolio/-p accepts industry as the sort:
getfactormodels -m ff5 -f m --portfolio industry 12

# 10 portfolios formed on Momentum
getfactormodels -m ff6 -f m --portfolio 10 --by mom

# Univariate sorts accept 'decile', 'quintile', 'tertile', or 10, 5, 3
getfactormodels -m ff3 -f m -p quintile -b size

# Add a FF bivariate sort to model/s
getfactormodels -m ff3 -f d -p 2x3 -b mom size

# Accepts an int instead of a sort ('25' for '5x5', etc.)
getfactormodels -m ff3 liq -f m -p 25 -b mom,size
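
To drive the CLI from a Python script, the flags shown above can be assembled into an argument list and handed to `subprocess.run`. The following is a minimal sketch and not part of the package: `build_command` is a hypothetical helper, it only uses flags that appear in the examples above, and it assumes the `getfactormodels` executable is on `PATH` after `pip install`.

```python
import shlex
import subprocess


def build_command(models, frequency="m", start=None, region=None, output=None):
    """Assemble a getfactormodels command line (hypothetical helper).

    Only flags shown in the CLI examples are used: -m, -f, --start,
    --region and --output.
    """
    cmd = ["getfactormodels", "-m", *models, "-f", frequency]
    if start is not None:
        cmd += ["--start", str(start)]
    if region is not None:
        cmd += ["--region", region]
    if output is not None:
        cmd += ["--output", output]
    return cmd


cmd = build_command(["q"], frequency="w", start=2000, output="q_weekly.ipc")
print(shlex.join(cmd))
# prints: getfactormodels -m q -f w --start 2000 --output q_weekly.ipc

# Uncomment to run it; assumes the getfactormodels executable is on PATH.
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with factor names such as "R_IA".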

Example

getfactormodels -m qmj -f d --output qmj.ipc
View output
Data saved to: qmj.ipc

date            Mkt-RF           QMJ           SMB           HML           UMD    RF_AQR
1957-07-01    0.001784     -0.001566     -0.002166      0.001984      0.000651    0.0001
1957-07-02    0.008514     -0.000484     -0.005030     -0.004436      0.002705    0.0001
1957-07-03    0.007938      0.000869     -0.001245     -0.003676      0.002294    0.0001
1957-07-05    0.007755      0.001975     -0.000769     -0.002781     -0.001137    0.0001
  [...]
2025-10-28    0.001157      0.004138     -0.003897     -0.006681      0.011682    0.0002
2025-10-29   -0.002034     -0.003446     -0.007738     -0.002619      0.015875    0.0002
2025-10-30   -0.010841      0.009058     -0.000406      0.005180     -0.006211    0.0002
2025-10-31    0.003867     -0.006546      0.000620      0.001108      0.000869    0.0002

[17574 rows x 7 columns, 905.3 kb]

Another:

getfactormodels -m q -f q -o qfactors_qtrly.md

Data saved to: qfactors_qtrly.md

               Mkt-RF         R_ME         R_IA         R_EG        R_ROE       RF
date
1967-03-31   0.134805     0.114866    -0.053626    -0.015750     0.084400   0.0114
1967-06-30   0.018500     0.087544    -0.026375    -0.018427     0.021278   0.0092
1967-09-30   0.068962     0.055625     0.040412    -0.009986    -0.006436   0.0096
1967-12-31   0.002605     0.052229    -0.052616     0.017556     0.048551   0.0107
  [...]
2024-03-31   0.088848    -0.048872    -0.012363     0.001792     0.034865   0.0132
2024-06-30   0.022145    -0.051355    -0.025798     0.086070     0.084276   0.0135
2024-09-30   0.046601     0.020398     0.017324    -0.058132     0.025571   0.0138
2024-12-31   0.021465    -0.025093    -0.062185     0.024003    -0.038607   0.0116

[232 rows x 7 columns, 12.0 kb]
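
The values are returns in decimal form (0.088848 is 8.8848%), and `Mkt-RF` is the market return in excess of `RF`. As a worked example, the sketch below adds `RF` back to the four 2024 quarters shown above and compounds them into a full-year figure. The numbers are copied from the output; the variable names are illustrative only.

```python
import math

# (Mkt-RF, RF) for the four quarters of 2024, copied from the output above.
quarters_2024 = [
    (0.088848, 0.0132),
    (0.022145, 0.0135),
    (0.046601, 0.0138),
    (0.021465, 0.0116),
]

# Total market return per quarter = excess return + risk-free rate.
market = [excess + rf for excess, rf in quarters_2024]

# Compound the quarterly returns: (1 + r1)(1 + r2)(1 + r3)(1 + r4) - 1.
annual_market = math.prod(1.0 + r for r in market) - 1.0
annual_rf = math.prod(1.0 + rf for _, rf in quarters_2024) - 1.0

print(f"2024 market return:    {annual_market:.2%}")
print(f"2024 risk-free return: {annual_rf:.2%}")
```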

Python

getfactormodels.model()

import getfactormodels as gfm

m = gfm.model(
    model='dhs',
    frequency='m',
    start_date='2000-01-01',
    end_date='2024-12-31',
    output_file='data.csv',
    cache_ttl=86400,
)
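
With `output_file='data.csv'` the table is also written to disk. A minimal sketch of reading such a file back with only the standard library follows. It assumes the CSV has a header row with a `date` column followed by numeric factor columns, as in the printed tables above; the real file layout may differ. `read_factors` is a hypothetical helper, and the sample text (two rows from the QMJ output above) stands in for a real file.

```python
import csv
import io
from datetime import date


def read_factors(handle):
    """Parse a factor CSV into {date: {factor name: return}} (hypothetical helper)."""
    rows = {}
    for record in csv.DictReader(handle):
        day = date.fromisoformat(record.pop("date"))
        rows[day] = {name: float(value) for name, value in record.items()}
    return rows


# Stand-in for open("data.csv", newline=""); the layout is an assumption.
sample = io.StringIO(
    "date,Mkt-RF,QMJ,SMB,HML,UMD,RF_AQR\n"
    "1957-07-01,0.001784,-0.001566,-0.002166,0.001984,0.000651,0.0001\n"
    "1957-07-02,0.008514,-0.000484,-0.005030,-0.004436,0.002705,0.0001\n"
)

factors = read_factors(sample)
print(factors[date(1957, 7, 1)]["QMJ"])  # prints -0.001566
```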

Model classes

from getfactormodels import FamaFrenchFactors

# Initialize model instance
m = FamaFrenchFactors(model='3', frequency='m',
                      region='developed', start_date='2020-01-01')
m.end_date = '2020'

# Download the data 
m = m.load()

# Access/download the Arrow Table:
table = m.data

# As a dataframe:
df = m.to_polars() # Helper method, see also `.to_pandas()`
  • Some other examples:
from getfactormodels import QFactors, BABFactors, QMJFactors

# QFactors has a `classic` boolean; when True, it returns the classic 4-factor model.
q = QFactors(classic=True, frequency='w').load()

# AQR Models for different countries:
nor_qmj_table = QMJFactors(frequency='m', region='nor').load()

# Extract the Japan Betting Against Beta daily 'BAB' factor:
bab_jpn_df = BABFactors(frequency='d', region='JPN',
                        start_date='2000-02-20', end_date='2010').load().extract("BAB").to_polars()

A list of model classes available:

  • FamaFrenchFactors
  • CarhartFactors
  • QFactors
  • ICRFactors
  • DHSFactors
  • LiquidityFactors
  • MispricingFactors
  • HMLDevilFactors
  • BarillasShankenFactors
  • BABFactors
  • QMJFactors

Data Interoperability

getfactormodels uses PyArrow internally and supports the Dataframe Interchange Protocol. This allows for zero-copy data sharing with most modern Python data tools.

Create a model instance:

from getfactormodels import QFactors
m = QFactors(frequency='m')
  • DuckDB can query a table without conversion
import duckdb
# DuckDB resolves Python variables by name, so bind the Arrow table first
q_table = m.data
duckdb.sql("SELECT date, R_ROE, R_IA FROM q_table LIMIT 7").show()
  • Polars has first-class support for Arrow:
import polars as pl
df = pl.from_arrow(m.data)
  • Pandas/NumPy
# Pandas DataFrame
df = m.to_pandas()

# or NumPy Array (via Pandas)
array = m.to_pandas().to_numpy()

The Interchange Protocol

  • If you use libraries like Ibis, Modin, or Vaex, you can use the interchange protocol directly:
import vaex
vdf = vaex.from_arrow_table(m.data)
print(vdf.mean(vdf.R_ROE))


Data Availability

This table shows each model's start date, available frequencies, and the latest datapoint if not current. The id column contains the shortest identifier for each model. These should all work in Python and the CLI.

id       Factor Model                 Start                             D W M Q Y      End
3        Fama-French 3                1926-07-01                        -
4        Carhart 4                    1926-11-03                        -
5        Fama-French 5                1963-07-01                        -
6        Fama-French 6                1963-07-01                        -
icr      ICR                          1970-01-31 (daily: 1999-05-03)                   2025-06-27
dhs      DHS                          1972-07-03                                       2023-12-29
mis      Mispricing                   1963-01-02                                       2016-12-30
liq      Liquidity                    1962-08-31                                       2024-12-31
q, q4    $q^5$-factors, $q$-factors   1967-01-03                        $\checkmark$   2024-12-31
bs       Barillas-Shanken 6           1967-01-03                                       2024-12-31
hmld     HML $^{DEVIL}$               1926-07-01                                       2025-10-31
qmj      Quality Minus Junk           1957-07-01                                       2025-10-31
bab      Betting Against Beta         1930-12-01                                       2025-10-31

  • Fama-French: data is available up to the end of the prior month.
  • Fama-French: most international/emerging factors (accessed with the region param) begin between 1985-1990.
  • AQR models: non-US data begins around 1990 (accessed with the region param).

References

Publications:

  1. E. F. Fama and K. R. French, ‘Common risk factors in the returns on stocks and bonds’, Journal of Financial Economics, vol. 33, no. 1, pp. 3–56, 1993.
  2. M. Carhart, ‘On Persistence in Mutual Fund Performance’, Journal of Finance, vol. 52, no. 1, pp. 57–82, 1997.
  3. E. F. Fama and K. R. French, ‘A five-factor asset pricing model’, Journal of Financial Economics, vol. 116, no. 1, pp. 1–22, 2015.
  4. E. F. Fama and K. R. French, ‘Choosing factors’, Journal of Financial Economics, vol. 128, no. 2, pp. 234–252, 2018.
  5. L. Pastor and R. Stambaugh, ‘Liquidity Risk and Expected Stock Returns’, Journal of Political Economy, vol. 111, no. 3, pp. 642–685, 2003.
  6. R. F. Stambaugh and Y. Yuan, ‘Mispricing Factors’, The Review of Financial Studies, vol. 30, no. 4, pp. 1270–1315, Dec. 2016.
  7. K. Hou, H. Mo, C. Xue, and L. Zhang, ‘Which Factors?’, National Bureau of Economic Research, Inc, 2014.
  8. K. Hou, H. Mo, C. Xue, and L. Zhang, ‘An Augmented q-Factor Model with Expected Growth’, Review of Finance, vol. 25, no. 1, pp. 1–41, Feb. 2020.
  9. Z. He, B. Kelly, and A. Manela, ‘Intermediary asset pricing: New evidence from many asset classes’, Journal of Financial Economics, vol. 126, no. 1, pp. 1–35, 2017.
  10. K. Daniel, D. Hirshleifer, and L. Sun, ‘Short- and Long-Horizon Behavioral Factors’, Review of Financial Studies, vol. 33, no. 4, pp. 1673–1736, 2020.
  11. C. Asness and A. Frazzini, ‘The Devil in HML’s Details’, The Journal of Portfolio Management, vol. 39, pp. 49–68, 2013.
  12. A. Frazzini and L. H. Pedersen, ‘Betting Against Beta’, Journal of Financial Economics, vol. 111, no. 1, pp. 1–25, Jan. 2014.
  13. C. S. Asness, A. Frazzini, and L. H. Pedersen, ‘Quality Minus Junk’, Review of Accounting Studies, vol. 24, no. 1, pp. 34–112, Nov. 2019.
  14. F. Barillas and J. Shanken, ‘Comparing Asset Pricing Models’, Journal of Finance, vol. 73, no. 2, pp. 715–754, 2018.

Data sources:

  • K. French, "Data Library," Tuck School of Business at Dartmouth.
  • R. Stambaugh, "Liquidity" and "Mispricing" factor datasets, Wharton School, University of Pennsylvania.
  • Z. He, "Intermediary Capital Ratio and Risk Factor" dataset, zhiguohe.net.
  • K. Hou, C. Xue, and L. Zhang, "The Hou-Xue-Zhang q-factors data library," global-q.org.
  • AQR Capital Management's Data Sets.
  • L. Sun, DHS behavioural factors dataset.


License

GitHub License

Known issues

  • AQR models (HML Devil, Betting Against Beta, Quality Minus Junk) download slowly, particularly the daily datasets. A progress bar still needs to be implemented.

Todo

  • Documentation
  • Example notebook
  • Error handling
  • README
  • metadata on models (copyright, construction, factors)
  • Refactor of FF models
