Package to build universes for factor pricing models. For further details, please refer to the documentation.
Install this via pip (or your favourite package manager):
pip install factor-pricing-model-universe
The library contains the pipelines to build the universe. You can run the pipelines interactively in Jupyter Notebook.
from fpm_universe import pipeline
Alternatively, for scheduled runs, you can create a configuration and run the command line entry point to create the universe.
The configuration is in YAML format and contains the following inputs:
Name | Description |
---|---|
output_filename | Output filename |
intermediate_directory | Intermediate directory to which the pipeline outputs are exported |
start_datetime | Start datetime of the universe |
last_datetime | Last datetime of the universe |
frequency | Frequency of the universe. For further details, please see "Offset aliases" in the pandas documentation |
pipeline | List of pipelines to filter the universe |
data | Defines the data used by the pipelines, or referenced with the YAML tag !data |
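As a small illustration of the frequency input, the sketch below uses pandas.date_range with the offset alias "B" (business day), the same value used in the example configuration further down. The dates are chosen for illustration only, and how the package itself builds its date index is an assumption, not something this README states.

```python
import pandas as pd

# "B" is the pandas offset alias for business days (Monday to Friday).
dates = pd.date_range(start="2022-11-17", end="2022-11-22", freq="B")

# The weekend of 2022-11-19 / 2022-11-20 is skipped.
business_days = [d.strftime("%Y-%m-%d") for d in dates]
print(business_days)
```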
Each pipeline returns a pandas dataframe indicating whether each instrument is included in the universe on the specified date / time. For example, the pipeline returns the following dataframe
+------------+--------+-------+
| date | AAPL | GOOGL |
+------------+--------+-------+
| 2022-11-17 | True | False |
+------------+--------+-------+
| 2022-11-18 | True | True |
+------------+--------+-------+
and it indicates AAPL is included in the universe on both 2022-11-17 and 2022-11-18 while GOOGL only on 2022-11-18.
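The same table can be reproduced directly in pandas to see how such a result might be consumed. This is a minimal sketch: the helper name `members` is hypothetical and not part of the package, and the index layout (dates as the index, one column per instrument) is assumed from the table above.

```python
import pandas as pd

# Boolean membership table: one row per date, one column per instrument.
universe = pd.DataFrame(
    {"AAPL": [True, True], "GOOGL": [False, True]},
    index=pd.to_datetime(["2022-11-17", "2022-11-18"]).rename("date"),
)


def members(frame, date):
    """Return the instruments flagged True on the given date (hypothetical helper)."""
    row = frame.loc[pd.Timestamp(date)]
    # Keep only the True entries of the row, then read off their labels.
    return sorted(row[row].index.tolist())


print(members(universe, "2022-11-17"))
print(members(universe, "2022-11-18"))
```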
By default, the pipeline functions are imported from the module fpm_universe.pipeline.
Each data entry defines either a method to retrieve data from a source or an operator applied to the source data. The return type of each data entry is unconstrained: it can be a JSON-like dict, a list, a pandas series, or even a pandas dataframe.
In the configuration, each data entry can be referenced with the YAML tag !data, and it is loaded lazily, only when another data object or a pipeline refers to it.
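The lazy-loading behaviour can be pictured with a small registry that builds each entry on first access and caches the result. This is only a sketch of the general technique: the class `LazyData` and its methods are hypothetical and do not describe the package's actual implementation.

```python
class LazyData:
    """Hypothetical registry: each entry is built on first access, then cached."""

    def __init__(self):
        self._builders = {}
        self._cache = {}
        self.load_log = []  # records the order in which entries are actually built

    def register(self, name, builder):
        # Registering stores the builder only; nothing is loaded yet.
        self._builders[name] = builder

    def get(self, name):
        if name not in self._cache:
            self.load_log.append(name)
            self._cache[name] = self._builders[name](self)
        return self._cache[name]


registry = LazyData()
registry.register("symbols", lambda reg: ["AAPL", "GOOGL"])
registry.register(
    "initial_validity",
    lambda reg: {symbol: "2015-01-01" for symbol in reg.get("symbols")},
)
registry.register("unused", lambda reg: "never loaded")

validity = registry.get("initial_validity")
print(registry.load_log)
```

Only the entries reachable from the requested one are built; `unused` is registered but never evaluated.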
The entry point factor-pricing-model-universe generates the universe from the given configuration and writes it to the destination. Parameters passed on the command line are used to format the configuration at run time.
The arguments of the entry point are
Argument | Description |
---|---|
-c, --config TEXT | Required. Configuration file path. |
-p, --parameter TEXT | Parameters to be formatted in the configuration. |
For example, given the configuration as follows,
output_filename: "{output_directory}/{date}.parquet"
intermediate_directory: "{output_directory}/{date}"
start_datetime: "2015-01-01"
last_datetime: "{date}"
frequency: "B"
pipeline:
  - name: range_validity
    function: range_validity
    parameters:
      values: !data initial_validity
data:
  symbols:
    function: jq_compile
    parameters:
      json_filename: "{data_directory}/index/sp500/default/{date}.json"
      pattern: "[.[] | .tickers[]] | sort | unique | .[]"
  initial_validity:
    function: jq_compile
    parameters:
      json_filename: "{data_directory}/listings/{date}.json"
      pattern: ".[] | {{ symbol: .symbol, valid_start_datetime: .ipoDate, valid_last_datetime: .delistingDate }}"
      includes:
        symbol: !data symbols
and run the following command
factor-pricing-model-universe \
--config <path> \
--parameter output_directory=$HOME/output \
--parameter data_directory=$HOME/data \
--parameter date=2022-10-20
the universe dataframe is output to $HOME/output/2022-10-20.parquet (formatted with the parameters output_directory and date).
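The substitution step can be sketched with Python's built-in str.format. This assumes the placeholders are resolved with str.format-style formatting, which the doubled braces in the jq pattern suggest but this README does not confirm. The helper `parse_parameters` and the /home/user paths are illustrative only.

```python
def parse_parameters(pairs):
    """Turn ["key=value", ...] into a dict, splitting on the first "=" only."""
    return dict(pair.split("=", 1) for pair in pairs)


params = parse_parameters(
    [
        "output_directory=/home/user/output",
        "data_directory=/home/user/data",
        "date=2022-10-20",
    ]
)

# Single braces are placeholders that get replaced by the parameters.
output_filename = "{output_directory}/{date}.parquet".format(**params)

# Doubled braces are escapes: they survive formatting as literal braces,
# which is what a jq object constructor needs.
pattern = ".[] | {{ symbol: .symbol }}".format(**params)

print(output_filename)
print(pattern)
```

Under this assumption, any literal brace in the configuration (such as those in a jq pattern) has to be doubled, as in the initial_validity entry of the example configuration.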