PROPOSED IMPROVEMENTS to the FINMATH.XLS MS-EXCEL SPREADSHEET PROGRAM
v.0.0, v0.1 (based on discussions T.Petrie, AV.Toy Spr05)
v0.2 (based on discussions T.Petrie, AV.Toy Spr05 Feb 16)
change summary
reorder the numerical position of the Proposed Improvements to
reflect the work sequence,
move current item 5) "Help pop-up..."
to immediately after current item 2)
move current item 7) "Password ..."
to immediately after current item 1)
partition current item 4) "Create a new set of internal ..."
into 2 work items
+one item reflecting the new set of internal variables work
+one item creating or consolidating onto one sheet
the setting of user inputs that characterize the experiment,
such as values, parameters, check boxes to a single
and move this item to before current item 3a).
v0.3 (based on discussions T.Petrie, Derya, AV.Toy Spr05 Mar 02)
Add to item 3) another (4th) "button" option form in the Reference sheet
which would clear the file contents in the folder with the prior saved
time series. Simplifying the user interface design for editing the
selected series.
***
INTRODUCTION:
Based on recent experiments,
the FinMath.xls spreadsheet program appears to be algorithmically correct
and a faithfull representation of the CAPM trading strategy described
in R.Roll (Journal of Financial Economics 1977v4).
However, it is difficult to generate results that can be compared across experiments due
to several operational attributes and assumptions of the program,
encumbering its utility for the analysis of related experiments.
This proposal for improvements to FinMath.xls first characterizes the problems followed
by a proposed set of changes and their associated motivation or problems they address.
***
PROBLEM CHARACTERIZATION:
The difficulties arise from 4 primary aspects of the design.
1) The various sheets are interdependent.
They interact to both "feed" entered values and parameter settings
(e.g. trading period length, start and end dates, reference time series, etc.)
and perform calculations (e.g. inverse covariance matrix of a training period,
assest weights, etc.) for each other.
However, some values and external parameters are changed after
a cycle through the spreadsheets.
Unless the desired value and parameter settings are diligently re-verified
and re-established, the results from multiple experiments cannot be compared
to derive useful conclusions.
Inaddition, results of a sheet calculation depend on internal variables
set from calculations in another sheet, so to generate consistent results
requires an ordered flow through the sheets.
Moreover, some parameters can be set from different sheets
(e.g. start date, trading period length, etc.).
However, changing the parameter in a subsequent sheet may break a
necessary association with the "feed" data from a prior sheet,
unless one returns to the prior sheet and reinvokes the calculations.
This situation can occur when a Reset is invoked in a subsequent sheet,
unintentionally, changing a date value set in a prior sheet.
Examples:
For a trivial example, changing the trading period length in the Dynamic
Optimal Portfolio sheet would affect the sampling, calculations, and
internal variable settings in both the Asset Browser and CoVariance sheets.
Consider the following more interesting
Covariance Analysis and Dynamic Optimal Portfolio sheets interaction
due to the value of the training interval.
When the covariance matrix is determined directly in the Covariance Analysis
sheet by Reset+Calculate (twice if necessary to clearout the stray data
of a prior run), the values of the matrix, the profolio member weights, etc.
reflect using ALL the samples between the Start and End dates,
not the Training Interval.
The Dynamic Optimal Portfolio sheet reuses the Covariance Analysis sheet to
perform part of its calculations and in the process overwrites many of the
entries in the Covariance Analysis sheet.
When the Covariance Analysis sheet calculations are invoked
from the Dynamic Optimal Portfolio sheet,
the values of the matrix, the profolio member weights, etc.
reflect using a "rolling" window of the samples within the Training Interval
set in the Dynamic Optimal Portfolio sheet,
except for the last window which may use 1 less sample.
After performing a Reset+Calculate from the Dynamic Portfolio sheet,
the Covariance Analysis sheet will show the results of using the last
rolling window's samples as determined by the Dynamic Optimal Portfolio sheet.
2) The design assumed a linear usage flow,
first (Reference sheet) to last (Dynamic Optimal Portfolio sheet),
that the datasets are not reused and the resulting spreadsheet changes
are not saved to disk across experiments,
and new datasets are imported at the beginning of each scenario run.
Examples:
The time series imported into the Reference sheet (and reflected in the
separate individual time series spreadsheets) are not saved upon saving
of the main spreadsheet book.
When the save spreadsheet is re-invoked, the last results of the prior
calculations populate the sheets, however the Reference sheet does not
contain any time series.
A more complex interaction across runs of the spreadsheets involves
how the values of the time series are populated across the sheets in
a particular order.
In the Covariance Analysis sheet,
the problem of retained prior (stray garbage) values in rolls B8->, , , B12->
occurs whenever fewer time series are imported (in the Reference sheet)
for the new experiment than the number used in the prior set of analyses.
Those stray values are applied in the subsequent calculations
of the variance and efficiency set constants.
The initial Reset + Calculate from the Covariance Analysis sheet
appears to populate the CA_vectors' components based on an index count,
set by the number of time series imported in the Reference sheet.
It erroneously ignores any prior values located further to the right.
The results of any calculations are invalid.
*However, the effects are benign, if one follows this recommended procedures.*
The values are correctly updated when at least 2 (sometimes 3, but I could
not consistently recreate the need for 3) Reset + Calculate are performed.
The second Reset + Calculate (possibly third) changes (and I believe
erroneously) the index count (however, fortuitously), resulting in the
subsequent "clean up".
A Reset + Calculate from the Dynamic Optimal Portfolio spreadsheet
(after a Reset + Calculate from the Covariance sheet) will also "clean up"
the stray values.
So for manual operation that follows the recommended procedures,
the effects of the problem are benign.
Note: Going to the Dynamic Optimal Portfolio sheet and performing a
Reset+Calculate without first going through a Reset+Calculate (again twice
if necessary) directly from within the Covariance Analysis sheet
is NOT recommended.
The results will be inconsistent because several internal variables will be
defaulted or left at the prior saved values of the spreadsheet and these
may be different than what are the intended values.
These idiosyncrasies also apply to the Statistical Analysis sheet
interactions with the Covariance Analysis sheet.
3) The Statistical Analysis sheet operates as an control overlaid,
selectively re-ordering subsets of the time series, values, and parameter
settings and invoking the various other sheets to perform calculations
multiple times.
It assumes that a) certain parameter settings are persistent or
re-established after each set of calculation, b) the order of the time
series will not affect the calculations, and c) the number of time series
involved from a prior set of calculations has no affect on the calculations
of a subsequent set. All these assumptions are invalid.
Example:
Sometimes a much more volatile stock (in this example, INTC)
forces a parameter boundary constraint to be "hit",
such as the maximum weight of an asset as calculated for particular
training period. When such constraints become "active", the sequence
order in which the time series are imported will affect the results.
Moreover, the degree of effect may depend on other parameter settings.
The period is 02/27/01 through 01/28/05.
The parameter settings are:
Trading interval 5 days, Training interval 20 periods (100 days),
Risk free return 2.5%, Fixed desired return 0.5%, and Optimize for Return.
From the Asset Browser sheet, the period volatility are
OIH 4.78%, AIG 4.17%, VAR 4.05%, INTC 6.69%.
Time series order OIH, AIG, VAR, INTC
(in fact for any case where INTC is not first)
BM weight =< 200% portfolio volatility 3.48% with end value 1.069
100 1.84
50 0.70
Time series order INTC, OIH, AIG, VAR
(in fact for any case where INTC is first)
BM weight =< 200% portfolio volatility 3.20% with end value 2.065
100 1.66
50 0.78
So the maximum weight constraint seems to have the desired effect.
However, the parameter setting of the Risk free return has a greater
influence on the results as the boundary constraint on the asset weights
becomes more stringent.
With a risk free return at 5% and the BM adjust limited to 50%,
the order (leading INTC vs not) does not impact the end value or volatility
because the 5% dominates.
If the risk free return is dropped to 2.5% and the BM adjust limited to 50%,
the order (leading INTC vs not) has a slight effect,
case: leading INTC end value 1.267 portfolio volatility 0.78%
non ld INTC end value 1.237 portfolio volatility 0.70%
4) While the CAPM strategy is represented algorthmically correct,
the program assumes that the user is familiar with the applicable theory
and numerical analysis limitations and that the select set of time series
imported are self-consistent.
Examples:
An Excel VBA Error 13 problem due to a type match problem of DOP_bal
can occur that is indeed correct algorithmic operation.
Poor user interface, but valid.
The inverted covariance matrix (R2xR2) of the Covariance Analysis sheet,
defines the "efficient" hyperbola bounding the (return, sigma) set of the
market component.
In one experiment, just mentally estimating the volume of the inverted
covariance matrix by multiplying using only the 2 significant digits per
element displayed, results in a size of ~1.5.
After expanding the columns, reformatting displayed digits to the
right of the decimal point to 6 digits and recalculating,
the size collapses to 0.002 or less. The volume of the matrix collapsed.
This error teaches me not to ignore a numerical analysis rule of thumb
when it was obvious that the range between the smallest and largest elements
was 5 orders of magnitude.
The collapse is due to the use of single stock.
Since the Covariance Analysis sheet follows R.Roll's formulation,
with a single stock, all linear combinations of the stock are correlated,
so the eigenvectors collapsed.
The covariance matrix must be positive semi-definite for the quadratic
optimization problem to work.
From the formulation, it is also necessary to watch out for cases where
the expected returns of each of the stocks are very close in absolute value
to each other within any given training period, again correlated.
It is possible to construct a multiple stock portfolio to have separated
linear combinations that are very highly correlated (e.g. linear combinations
of a subset of the time series is correlated to the non member (of the subset)
time series) resulting in a collapse.
Such a portfolio is unlikely to occur in normal practice.
However, since the estimate of the covariance matrix is determined by the subset
of returns in a rolling window sized by the Training Interval length parameter.
As the Training Interval length is reduced, a collapse could occur because the
returns are correlated in that shorter interval which spans a Training Interval
length. In practice with 3 time series, even when the time series are from the
same industry sector, such a Training Interval related collapse usually does not
occur in Training Interval lengths of more than 30 days (6 x 5 trading day).
Another example occurs when there are missing trading day samples in one of
the time series due to corporate conditions, SEC stop limits, market maker
declarations of halting of trading, etc.) relative to the other series.
The program collapses (shifts) the samples in that particular series
to close the gap. However, the resulting time series is nolonger correctly
cross-correlated against the other time series.
*However, experiments appear to show that for gaps of 1-2 days when the
training intervals are large (~50days) and the overall portfolio period
extends across more than 2 training interval in length,
there is no practical effect on the portfolio results.*
***
PROPOSED IMPROVEMENT:
1) Remove or disable the Statistical Analysis sheet.
Motivation:
The reason is more practical, than technical.
For consistent results, the operation of the other sheets would have to be
made considerably more robust against changes to in the number of time series,
to the order of the time series, and to non-linear invocation sequences
of the sheets. Insufficient resources (staff X time) exist.
The lost of this sheet would only affect a small class of experiments,
which if necessary could be performed manually.
2) Password protect (read, edit) the module VBA macros, functions, and
subroutines.
Motivation:
Produce more consistent results due to reducing the opportunity
for accidental changes to the macros, functions, and subroutines.
The trade-off is that when errors occur, it will be more difficult to
replicate and resolve.
[I am not sure about the property rights standing for this work,
if you wish I can talk to Vince at Rutgers OTLAP.]
3) Save datasets imported across cycles in the use of the spreadsheets
and across cycles in FinMath.xls exit and reinvocations.
Motivation:
This improvement would relieve the burden of remembering and re-importing
the particular set of time series desired, before an related experiment
could be performed. Since each imported series populates a separate
"sub" sheet external to the book, a possible implementation could consist
of saving those sheets into a folder. Upon re-invocation, the contents
of the folder would be checked and if not-null, those sheets containing the
prior imported series would be used to repopulate the Reference and Asset
Browser sheets. A 3rd separate "button" option form in the Reference sheet
to select the option may be desirable.
Inaddition, a 4th separate "button" option form in the Reference sheet may
be also desirable. Upon selecting this "button" option, the file contents
in the folder with the prior imported and saved series would be cleared.
Creating this "button" simplifies the design of the user interface by
removing the need for developing an editing procedure for deleting the
associated file contents whenever a selected series reference is cleared
or cut in the Reference sheet.
4) Consolidate onto a single sheet the user input settings for all the values,
parameters, check boxes, etc. that characterize the experiment.
Reducing the flexibility of setting the various parameters from multiple
sheets to a single designated sheet, especially the start date, end date
and the trading interval length.
Motivation:
Presently, it is not always clear that the displayed value of a parameter
represents the "set" value or the result of modifications during calculations.
This improvement would increase the consistency necessary when trying to
associate the results across sheets when the value of the parameters
have been changed in one sheet.
5) "Help pop-up" boxes for each of the sheets.
Motivation:
By identifying the purpose of the sheet and which parameters, if any,
may be set within a given sheet and whether
or not the new setting will require re-calculations in other other sheets,
more consistent results would be obtained for the casual user and new student.
6a)Eliminate the need for the explicit double Reset + Calculate
used to clear the residual / spurious data from prior saved executions
in the Covariance sheet.
Motivation:
This improvement would reduce the occurrence of inconsistent results
due to an oversight in manually cleaning up the Covariance sheet via
a double Reset + Calculate.
With the removal or disabling of the Statistical Experiments sheet,
it is probably possible to apply the "clear" method on the activate sheet
when entering the Covariance sheet the first time.
An alternative is to implicitly invoke a Reset + Calculate upon first
activation of the sheet.
However, it is important not to invoke multiple Reset + Calculate's
after the first instance of the first training period calculation from
the Dynamic Optimal Portfolio sheet. Otherwise the values for the rolling
window will not be calculated correctly due to the resetting of the start
and end dates.
6b)Create a separate Covariance2 sheet as a server for the Dynamic Optimal
Portfolio sheet calculations.
Motivation:
This improvement breaks the need for a linear flow whenever new values for
Dynamic Optimal Portfolio are set. This sheet would be cleanly repopulated
when values from the time series and the Dynamic Optimal Portfolio sheets
upon invocation from the Dynamic Optimal Portfolio sheet.
7) Create a new set of internal variables (initialized to the associated
external parameters settings) to represent the parameters when they
are modified during calculations, rather than using the external parameter
variable itself.
Motivation:
Presently, it is not always clear that the displayed value of a parameter
represents the "set" value or the result of modifications during calculations.
This improvement would increase the consistency necessary when trying to
associate the results across sheets when the value of the parameters
have been changed in one sheet.
For example, the Covariance sheet and the Covariance2 sheet, if implemented
would use a different set of internal variables to designate the span of the
rolling window of the training period, rather than changing the start date
and end date set by the user.)
8) Fill the values for a missing trading date gap in a time series
by replicating the trading data of the prior date.
(already implemented in version mod_3)
Motivation:
Compared to the current shift up collapsing of the time series samples,
the results should be more accurate.
However, the change knownly biases the mean of the affect series slightly
toward the replicated data value and also changes the cross correlation,
though less so than the current implementation.
Experiments appear to show that for gaps of 1-2 days when the
training intervals are large (~50days) and the overall portfolio period
extends across more than 2 training intervals in length,
the effect on the asset weights is <0.1% when a single trading gap is
adjusted. There are no practical effect on the portfolio results.
If a trading gap of say 5 days occurs relative to the other time series
in the set, then the stationary assumption is most likely violated
and the CAPM trading strategy is nolonger applicable as defined.
9) Plot the determinant of the CoVariance matrix on the Optimal Portfolio Chart.
Motivation:
This information would help identify potential intervals where the CAPM trading
strategy may have failed to generate the optimal results.