Harry Markowitz & the Discretionary Wealth
Hypothesis
By
Jarrod Wilcox
Wilcox Investment,
Inc.
950 Centre Street
Newton, MA
02459
Tel. 617-332-4666
jwilcox@wilcoxinvest.com
DRAFT
Draft Copyright 2000
Jarrod W. Wilcox
January 7, 2003
ABSTRACT
In his 1959 book, Harry Markowitz showed how return mean and
variance combine to determine the expected long-term growth rate of capital. But
the maximization of that growth rate seemed to fit the risk preferences of only
a narrow range of aggressive investors with no concern for shortfalls. This
paper generalizes that goal to both conservative and aggressive investors by mapping
the distribution of returns on total wealth to that of returns on discretionary
wealth. It also broadens the definition of risk to include return skew and
kurtosis where required, fully encompassing the concept of downside risk. The
resulting change in frame of reference extends Markowitz’s criterion to many
practical investment decisions involving maximizing long run wealth while
controlling the probability of shortfalls along the way.
INTRODUCTION
We often wish to
better understand the long-run impact of our short-term investment policies.
One possible tool is Monte Carlo simulation, the random generation of many
alternatives to discover the probability distribution of multi-period
outcomes. Not many investors find this easy to implement. There is an
unfilled need for practical guidance.
For single periods,
we have the mean-variance criterion developed in the 1950s by
Harry Markowitz.
He proposed that in each investment period investors should strive for portfolio
returns having the greatest difference between their mean and the product of 1)
the return’s variance, or expected squared difference from the mean, and 2) a
risk aversion coefficient specific to each investor. This is a very useful
yardstick, but it is inadequate for constructing policies that will lead to
maximum long-term results with acceptable safety against shortfalls along the
way.
In his 1959 book, Markowitz showed
how return mean and variance combine to affect expected long-term growth rate
of capital. But the maximization of expected portfolio growth rate seemed to
fit the risk preferences of only a narrow range of aggressive investors. The purpose
of the present paper is to show how to better use Markowitz’s ideas for
achieving longer-run objectives. To do so, his criterion will be extended
with optimal growth and shortfall avoidance concepts. This task has been
attempted before with limited success, most notably by Hakansson [1971]; here
we take a different approach, the discretionary wealth hypothesis, to overcome
the remaining obstacles. By the end of the paper, we will have clarified not
only the long-run impact of short-run policies, but also the perceived need to
distinguish between variance and “downside variance,” referred to by Markowitz
in his later writings as the semivariance.
To keep the scope of the paper within
manageable limits, the possible implications for market pricing if investors as
a whole act so as to achieve better long-term results will not be considered.
The paper will also refrain from adding to the investment literature on implementation
problems arising from erroneous risk estimates.
I. GROWTH MODEL CONTEXT
An appealing source
for conceptual cross-fertilization with Harry Markowitz’s mean-variance
criterion is the optimal growth theory introduced by John Kelly [1956]. In his
framework, the investor maximizes the expected rate of long-run return in an
investment process with independent returns by maximizing the expected
logarithm of each single-period return multiple. Kelly was writing from
the vantage point of information theory, and made no attempt to fit his concept
to the utility theory for risks that was then beginning to take hold in modern
finance. Hakansson [1971] modified Kelly’s model in an attempt to bring it
into the discourse of economists and to generalize it to the preferences of
conservative investors. Hakansson also made the important observation that
such a strategy would tend to maximize one’s median wealth along the way.
However, after
generating considerable controversy, the growth-optimal model was not generally
adopted within finance, for two reasons. On the practical side, it was found
that resulting portfolio optimization of weights of stocks in diversified
portfolios gave results hard to distinguish from those of Markowitz
mean-variance optimization. On the theoretical side, no adequate response was
made to the objection by Merton and Samuelson [1974]. They argued that maximizing
expected log return, or even Hakansson’s proposed linear combination of the
mean and variance of log return, could not account for the preferences
of conservative investors in a way consistent with utility theory. In the
aftermath of that academic exchange, nearly all investors with a quantitative
bent focused on an efficient tradeoff between single-period mean and variance.
They did so despite the handicap that there seemed no objective way to
determine a best risk aversion coefficient to construct their tradeoff. They
also were willing to put aside everyday experience that suggested risk aversion
to downside risk in the form of negatively-skewed and fat-tailed return
distributions.
Recent work in this
arena most accessible to readers who are not mathematicians includes Wilcox [2000]
and Kritzman and Rich [2002], who note the crucial role of interim shortfalls
in determining risk aversion. Recent academic work on the implications of
optimal growth for pricing includes Bekaert et al. [1998] and Harvey and Siddique
[2000]. They have found strong evidence that the third and fourth moments of
returns, risk features not captured by variance, can be priced. A
representative study of multi-period investment policy under specialized assumptions
is that by Barberis [2000].
II. GROWING DISCRETIONARY WEALTH
Let us first clarify
why logarithmic returns are important in understanding long-run investment
results, and why the median result is of great practical importance. We
will then discuss bridging steps between Markowitz mean-variance optimization
and optimal growth models. With these fundamentals in place, the discretionary
wealth hypothesis completes the connection by showing how to manage the risk of
interim shortfalls.
The Centrality of Logarithmic Returns
Though many
investors are familiar with returns based on natural logarithms only as the
continuous interest rate, any fractional return can be converted to such a log
return. For example, a 50% return gives a return multiple of 1.5 and a log
return of 0.405. (In this case, 0.405 is the power to which the natural
constant e, or 2.718, must be raised to give 1.5 as a result.) Log
returns are important because they convert the representation of a compounding
process from multiplication to addition. The result of compounding fractional
returns r is the product of their return multiples 1+r. Alternatively,
the same answer can be obtained by converting each return multiple into its
natural logarithm, adding these together, and taking the antilog of the result
(raising e to that power). This perspective explains why the wealth
outcomes of long-run compounding usually display a particular positively-skewed
statistical distribution (the log-normal).
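The equivalence between compounding by multiplication and compounding by adding logarithms can be checked numerically. The following is a minimal Python sketch; the three-period return sequence is a hypothetical illustration and is not taken from the paper.

```python
import math

# Hypothetical fractional returns for three successive periods (illustrative only).
returns = [0.50, -0.20, 0.10]

# Route 1: compound by multiplying the return multiples 1 + r.
multiples = [1.0 + r for r in returns]
compounded = math.prod(multiples)

# Route 2: convert each multiple to a log return, add them, and take the antilog.
log_returns = [math.log(m) for m in multiples]
via_logs = math.exp(sum(log_returns))

print(f"log return of a 50% gain: {log_returns[0]:.3f}")
print(f"compounded multiple:      {compounded:.4f}")
print(f"antilog of summed logs:   {via_logs:.4f}")
```

Both routes give the same ending multiple, up to floating-point rounding, which is why the sum of log returns can stand in for the compounding process.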
One does not need
to reject active investing to note that, although stock prices may be determined
by predictable factors, the market appears relatively efficient in
incorporating changes in these factors into prices.
Consequently, successive investment returns of traded securities appear to us,
to a first approximation, as practically independent random events. We know
from the Central Limit Theorem of statistics that the distribution of a sum of
independent random numbers tends toward a bell-shaped normal distribution as more
numbers are included. This is true for identically-distributed returns with
finite variance and it is usually true in real-world practice where successive
returns are drawn from somewhat different probability distributions.
Although it is
possible to create a compounding process where the Central Limit Theorem does
not hold, as from a dynamic hedging program, extreme departures are readily
identified and isolated. For most practical purposes, the sum of the log
returns created by long-run compounding of investment returns approaches the
neighborhood of a normal, bell-shaped distribution as the number of periods is
increased. Taking the antilog of such distributions, one discovers that the
ratio of ending wealth to starting wealth will be positively skewed, approximating
the log-normal distribution.
Because long-run
wealth outcomes are nearly log-normally distributed, average wealth is strongly
influenced by low-probability sequences in which unusually high single-period
returns are compounded. Consequently, mean wealth will be greater, often much
greater, than the median, or 50th percentile, wealth from a long-run
compounding process. Median wealth is closer to what most outcomes will be.
Happily, we can estimate median wealth in advance, because of the following
relationship.
Since the normal distribution
is symmetric, after a sufficient number of periods, the mean and median of the distribution
of logarithms of wealth must converge. Consequently, the mean
log return each period, by determining mean log return for the long-term,
determines the median logarithm of long-term wealth and thus the median long-term
wealth. When we maximize mean log return, we maximize median long run
wealth, a desirable outcome in itself, and we tend, other things equal, to
reduce the probability of an interim low-performance shortfall.
We can clarify
these ideas with a concrete example. Begin with $1, which is to be compounded
for 10 periods. Each period there is either a gain of 20%, giving a log return
of 0.182, or a loss of 10%, with a log return of -0.105. The probability of a
gain is 60%. After one period, the mean return is 8%. What will be the result
after compounding?
The final mean
wealth is 1.08 raised to the 10th power, or $2.16. The median
requires a different approach. The mean single-period log return is 0.6*.182 +
0.4 * (-.105) = 0.0672. Multiplying that by 10, we arrive at the estimate for final
median log wealth of 0.672. Taking the antilog of that result, we estimate
median final wealth at $1.96.
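The arithmetic of this example can be reproduced in a few lines. This is a sketch that uses only the figures stated above; the variable names are illustrative.

```python
import math

p_gain, gain, loss, periods = 0.6, 0.20, -0.10, 10

# Mean single-period return and the mean wealth it compounds to.
mean_return = p_gain * gain + (1 - p_gain) * loss
mean_wealth = (1 + mean_return) ** periods

# Mean single-period log return and the median wealth it implies.
mean_log_return = p_gain * math.log(1 + gain) + (1 - p_gain) * math.log(1 + loss)
median_wealth = math.exp(periods * mean_log_return)

print(f"mean return      {mean_return:.2%}")
print(f"mean wealth      ${mean_wealth:.2f}")
print(f"mean log return  {mean_log_return:.4f}")
print(f"median wealth    ${median_wealth:.2f}")
```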
Let us use computer
simulation to check our estimate. We randomly generate 100,000 sequences of 10
periods of compounded returns, a Monte Carlo simulation. Exhibit 1,
essentially a cumulative probability chart turned on its side, plots compounded
wealth on a vertical logarithmic scale and its percentile rank (100th
being the highest) on the horizontal scale. The S-shaped heavy line (actually
the overlapping of 100,000 small circles) shows the distribution of outcomes.
Its intersection with the centered vertical marker shows the median outcome.
The three horizontal lines show, starting at bottom, the original wealth of $1,
the median wealth of $1.96, and the mean wealth of $2.16. These amounts closely
confirm those predicted earlier.
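A simulation of this kind needs very little code. The sketch below is one possible way to run it with the Python standard library; the random number generator, the seed, and the function name are assumptions, since the paper does not say how its simulation was implemented.

```python
import random
import statistics

def simulate_final_wealth(n_paths=100_000, n_periods=10, p_gain=0.6,
                          gain=0.20, loss=-0.10, seed=0):
    """Compound $1 through n_periods of random gains or losses, n_paths times."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    finals = []
    for _ in range(n_paths):
        wealth = 1.0
        for _ in range(n_periods):
            step = gain if rng.random() < p_gain else loss
            wealth *= 1.0 + step
        finals.append(wealth)
    return finals

finals = simulate_final_wealth()
sim_mean = statistics.fmean(finals)
sim_median = statistics.median(finals)

print(f"simulated mean wealth:   ${sim_mean:.2f}")
print(f"simulated median wealth: ${sim_median:.2f}")
```

Because each period has only two outcomes, the simulated median falls on the path with six gains and four losses, while the simulated mean is pulled upward by the rarer paths with many gains.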

Why Variance Does Not Fully Capture Risk
There is a deep
relationship between the statistical moments that describe a
distribution of returns and the successive terms of a Taylor series
whose sum is mean log return. By making expected log return more transparent, this
relationship shows how to improve long-run median outcomes by showing how they
depend on the scale and shape of the single-period return distribution and on
the leverage we apply to it. It will also help us understand why some investors
may be especially unsatisfied with return variance as the entire measure of risk.
Statistical
Moments: The mean, E, is sometimes called the first moment
of a statistical distribution. The expected value of the difference between a
random outcome and its mean is zero. The expected value of that difference
squared is called the second central moment, or for short, the variance V.
The expected value of that difference raised to the third power is called the third
central moment; raised to the fourth power, it is termed the fourth central
moment, and so on. These successive central moments describe the
distribution’s dispersion, its lopsidedness and its tendency to have both a
central spike and abnormally long tails (more extreme events).
Taylor Series:
Many common mathematical functions of a number can be expressed as the sum of
an infinite series of terms of increasing powers of that number, beginning with
a constant. The natural logarithm of 1+r can be expressed as the natural
log of the mean return multiple, ln(1+E), plus a series of terms
involving increasing powers of the difference between the return r and
its mean, E.
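Written out, the expansion just described is the standard Taylor series of the natural logarithm taken about the mean return multiple 1+E. It is shown here as a sketch, without the convergence conditions:

```latex
\ln(1+r) = \ln(1+E) + \frac{r-E}{1+E} - \frac{(r-E)^2}{2(1+E)^2}
         + \frac{(r-E)^3}{3(1+E)^3} - \frac{(r-E)^4}{4(1+E)^4} + \dots
```

The linear term drops out in expectation because the expected deviation of r from its mean is zero.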
The expected value
of the log return is the sum of the expected values of the terms in this Taylor
series, making it a function of the central moments of return. We can go
further toward linking the formula to commonly-used statistical parameters as
follows. The third central moment can be decomposed to a shape parameter,
skewness S, multiplied by variance V raised to the 3/2 power.
The fourth central moment can be decomposed to a shape parameter, kurtosis K,
multiplied by the variance V squared. We then obtain the rather
fearsome-looking formula of Equation 1. It links expected log return to statistical
parameters that can be easily calculated in a spreadsheet such as Excel or in
any statistical software package. The incremental information sought by
many investors in avoiding “downside risk” or semivariance is captured by the
third and fourth terms.
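$$\text{Expected } \ln(1+r) \;\approx\; \ln(1+E) \;-\; \frac{V}{2(1+E)^2} \;+\; \frac{S\,V^{3/2}}{3(1+E)^3} \;-\; \frac{K\,V^{2}}{4(1+E)^4} \;+\; \dots \qquad (1)$$
Where:
ln – the natural log function
r – fractional return, the conventional return measure
E – mean of r
V – variance of r
S – skewness of r
K – kurtosis of r; for a normal distribution, K = 3.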
For diversified
portfolios not incorporating derivatives or leverage, only the leftmost two
terms of the formula in Equation 1 are required to produce a good estimate of
mean log return. This two-parameter version is the form derived by Markowitz.
When the mean return is small, the Markowitz version can be approximated as mean
return less half the variance, or E – V/2. In the example illustrated
in Exhibit 1, it provides an estimate of mean log return of 0.0690 as compared
to the true 0.0672.
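The truncated series can be evaluated directly for the two-outcome example. The sketch below assumes the four-term form of Equation 1, with the coefficients 1/2, 1/3 and 1/4 that come from the Taylor series of the logarithm, and compares it with the exact mean log return; the variable names are illustrative.

```python
import math

# Two-outcome return distribution from the compounding example:
# (probability, fractional return)
outcomes = [(0.6, 0.20), (0.4, -0.10)]

E = sum(p * r for p, r in outcomes)              # mean
V = sum(p * (r - E) ** 2 for p, r in outcomes)   # variance
m3 = sum(p * (r - E) ** 3 for p, r in outcomes)  # third central moment
m4 = sum(p * (r - E) ** 4 for p, r in outcomes)  # fourth central moment
S = m3 / V ** 1.5                                # skewness
K = m4 / V ** 2                                  # kurtosis

# Exact mean log return, computed from the outcomes themselves.
true_value = sum(p * math.log(1 + r) for p, r in outcomes)

# Truncations of the series (assumed form of Equation 1).
two_terms = math.log(1 + E) - V / (2 * (1 + E) ** 2)
four_terms = (two_terms
              + S * V ** 1.5 / (3 * (1 + E) ** 3)
              - K * V ** 2 / (4 * (1 + E) ** 4))
shortcut = E - V / 2                             # small-mean approximation

for label, value in [("exact", true_value), ("two terms", two_terms),
                     ("four terms", four_terms), ("E - V/2", shortcut)]:
    print(f"{label:<11}{value:.4f}")
```

Under these inputs the shortcut E – V/2 comes out near 0.069, the two-term form is closer to the exact value, and adding the skewness and kurtosis terms closes most of the remaining gap.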
Equation 1 provides
us with important insights even before we introduce the discretionary wealth
hypothesis. Each succeeding term contributes additional information about
events that are more extreme and have smaller probability. Note also that if time
periods for measuring return are kept short, the variance V is reduced.
That implies that the third, fourth and higher central moments, which include higher
powers of V, being reduced in greater proportion, will contribute less
and less to the determination of median long-term results. This taming of
unruly return distributions offers a sound theoretical basis for the
oft-criticized practice of close attention to short-term results.
An investor whose
risk aversion coefficient in the Markowitz mean-variance framework happens to
be near 1/2 will pursue a policy approximating maximum median
wealth over the long run. However, this policy is more aggressive than
the preferences most of us seem to exhibit. This disparity raises the question
of what the rest of us are doing, a question to be answered later by the
discretionary wealth hypothesis.
Truncated Growth Models
To achieve its
remarkable simplification, Kelly’s growth-optimal model assumes an infinite
number of time periods. It says nothing about time preference or finite
lifetimes. The same is true for Markowitz’s single-period mean-variance
criterion. Sometimes the single-period Markowitz mean-variance criterion pays too
little attention to possible disasters with small single-period probabilities.
In contrast, Kelly’s growth model, because it is based on an infinite number of
periods, can pay too much attention to extremely low probability
disasters. That is, our investment interest is usually limited to the impact
of events likely during one or two lifetimes.
Consider an analogy.
Suppose the annual probability of a driver fatality in the
US is about one one-hundredth of a percent. If a driver were to drive for
thousands of years, the median driving outcome would be grim. But since the
cumulative probability of a driving fatality over a realistic lifetime is only about
0.5%, and since driving helps us with many other goals, most of us rationally
decide to drive.
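The figures in this analogy follow from simple compounding of probabilities. The sketch below assumes independent years and a driving lifetime of 50 years; the paper states neither assumption, so both are illustrative.

```python
import math

annual_fatality_prob = 0.0001  # one one-hundredth of a percent, as in the text
driving_years = 50             # assumed length of a driving lifetime

# Probability of at least one fatality over the assumed lifetime.
lifetime_prob = 1 - (1 - annual_fatality_prob) ** driving_years

# Years of driving after which a fatality becomes more likely than not.
median_years = math.log(0.5) / math.log(1 - annual_fatality_prob)

print(f"lifetime probability: {lifetime_prob:.2%}")
print(f"years until a fatality is more likely than not: {median_years:,.0f}")
```

Under these assumptions the lifetime probability is close to the 0.5% quoted above, while the horizon at which the median outcome turns grim is several thousand years.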
We automatically
reduce the influence of tiny-probability extreme events whenever we approximate
mean log return with a truncated Taylor series. If we start with the linear
mean-variance criterion introduced by Markowitz, we take most dispersion into
account. When we go further by using the first four terms of the Taylor series
for mean log return, we take unusual events seriously. When we go even further
and work directly with expected log returns, we avoid any risk with disastrous
consequences, no matter how small its probability during the lifetime of our
contemplated investment policy.
The Discretionary Wealth Hypothesis
Now we proceed to
adapt Kelly’s model, whether as originally published, or in truncated Taylor
series form, to the needs of the great majority of investors too conservative
to maximize growth in total wealth by assuming a Markowitz risk aversion
coefficient of only 1/2. We will assume that risk aversion is caused by the
need to avoid shortfalls, not only at some far-off ending period, but all along
the way. The discretionary wealth hypothesis asserts that investors will be better
off if they strive to maximize their median discretionary wealth over
the long run. Discretionary wealth is the amount one could afford to
lose without suffering whatever one defines as a shortfall disaster.
By specifying the
shortfall boundary as the zero-point of discretionary wealth, we place it
infinitely far away in logarithmic terms from median discretionary wealth, out
of reach for a log-normal distribution. Consequently, if we truly maximize expected
log return of discretionary wealth, we will have the best possible growth in
median wealth without an interim shortfall (after a warm-up period for the
Central Limit Theorem to take hold). If, on the other hand, we maximize our
truncated Taylor series estimate, with four terms, we convert an exact formula
to a more heuristic guide. That is, we allow a residual probability of eventual
shortfall; but we may by this means achieve something closer to maximum median
discretionary wealth during our lifetimes.
The addition of the
discretionary wealth hypothesis answers the two main objections to growth-optimality
as a basis for investing, the practical and the theoretical. First, it is very
often approximated by Markowitz’s mean-variance criterion. This will generally
be true for diversified portfolios without large-scale use of derivatives or
high leverage. In practice, there is no additional mathematical complexity
except in cases where it is needed. Second, it responds to Merton and
Samuelson’s theoretical critique, though in a surprising way.
Classical financial
utility theory represents conservative investors as having utility functions
that are more strongly curved. What we do here is the alternative, varying the
apparent distance between two outcomes by changing the scaling of return from
that on risky assets to that on discretionary wealth. For the utility
theorist, we have said that all investors will be better off if they act as
though their utility were given by the log of their discretionary wealth. Every
investor is advised to have the same-shaped utility function, of the form log(w-c),
where w-c is discretionary wealth. Note that total wealth w
is usually more variable over time than is the shortfall point c.
How Discretionary Wealth Affects Risk
The size of the
risky asset portfolio will not in general be the same as the size of
discretionary wealth. Their ratio, which may be greater or less than one,
will be termed implicit leverage. An investor fully invested in
stocks but with a discretionary wealth fraction D of only 20% has an implicit
leverage of 5 times (1/0.20).
For a given
implicit leverage, one could in principle re-scale each return on risky assets
to an equivalent return on discretionary assets, and from there directly estimate
mean log return on discretionary assets. However, it is much more instructive
for design purposes to start with the Taylor series representation of expected
log return on risky assets as a base, as in Equation 1. Then one can observe the
separate effects on expected log return, and thus preferences, of implicit
leverage applied to the mean and to each central moment of the risky asset
return. Implicit leverage and the manipulation of these statistical moments of
return are the policy design parameters that confront investors.
The discretionary
wealth hypothesis translates the relevant expected log return estimate to that
shown in Equation 2. Note that rescaling Equation 1 for implicit leverage L
is just the multiplication of both return mean and standard deviation (the
square root of variance) by that leverage.
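$$\text{Expected } \ln(1+Lr) \;\approx\; \ln(1+LE) \;-\; \frac{L^{2}V}{2(1+LE)^2} \;+\; \frac{L^{3}S\,V^{3/2}}{3(1+LE)^3} \;-\; \frac{L^{4}K\,V^{2}}{4(1+LE)^4} \;+\; \dots \qquad (2)$$
Where:
ln – the natural log function
r – fractional return, the conventional return measure
L – implicit leverage
E – mean of r
V – variance of r
S – skewness of r
K – kurtosis of r; for a normal distribution, K = 3.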
Equation 2 answers
the question of when higher moments of return matter. The answer is not just
when facing skewed or fat-tailed distributions of the return of underlying
assets. It is the separate conjunction of each moment of underlying asset return
with the increasing powers of the implicit leverage created by the shortfall
constraints for a particular investor. Investors with higher implicit leverage
should be more sensitive to variance, still more sensitive to negative
skewness, and still more sensitive to the existence of fat tails in the return
distribution. This is a fundamental result. It implies
that investors with high implicit leverage in an investment environment that
may have large third and fourth central moments of return should not be
satisfied with return variance as the entire measure of risk.
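The way implicit leverage magnifies each moment can be seen by evaluating the terms of Equation 2 separately. The sketch below assumes the four-term form in which the mean and standard deviation are both multiplied by L; the function name and the return statistics are hypothetical and chosen only for illustration.

```python
import math

def expected_log_terms(E, V, S, K, L):
    """Four leading terms of the leveraged series (assumed form of Equation 2)."""
    base = 1 + L * E
    return [
        math.log(base),                             # leveraged mean term
        -(L ** 2) * V / (2 * base ** 2),            # variance term
        (L ** 3) * S * V ** 1.5 / (3 * base ** 3),  # skewness term
        -(L ** 4) * K * V ** 2 / (4 * base ** 4),   # kurtosis term
    ]

# Hypothetical single-period statistics: 2% mean, 4% standard deviation,
# negative skew and fat tails.
E, V, S, K = 0.02, 0.0016, -0.5, 5.0

for L in (1, 2, 5):
    terms = expected_log_terms(E, V, S, K, L)
    rounded = [round(t, 5) for t in terms]
    print(f"L={L}: terms={rounded} total={sum(terms):.5f}")
```

Moving from L = 1 to L = 5 in this illustration enlarges the variance penalty by a factor of roughly 20, the skewness penalty by roughly 100, and the kurtosis penalty by several hundred, which is the pattern of rising sensitivity described above.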
III. BRIEF EXAMPLES
Appropriate Risk Aversion:
The Markowitz criterion can produce an efficient frontier of
tradeoffs between mean and variance of portfolio return. It says nothing about
which point on the frontier should be selected. Using the simplest version of
Equation 2, one finds that the ideal implicit leverage is approximately the
ratio of expected real return to variance, or E/V, and thus the
proportion of wealth allocated to risky assets should be about D(E/V),
where D is the fraction of assets considered discretionary. Setting
that allocation equal to the one prescribed by Markowitz, we discover that the
best point on the efficient frontier will be obtained if we set the Markowitz
risk aversion coefficient equal, not to ½, but to 1/(2D). Specifying D is usually
easier, and is never harder, than the alternative of specifying the Markowitz
risk aversion coefficient directly.
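The link between the discretionary fraction D and the Markowitz risk aversion coefficient can be verified with a short calculation. The sketch below takes the simplest version of Equation 2 to be L·E – L²·V/2, which is an assumption about the intended form, and uses hypothetical inputs (a 6% mean, a 20% standard deviation, and D = 20%).

```python
def simple_expected_log(E, V, L):
    """Assumed simplest version of Equation 2: L*E - L^2 * V / 2."""
    return L * E - (L ** 2) * V / 2

E, V = 0.06, 0.04  # hypothetical mean and variance of risky asset return
D = 0.20           # hypothetical discretionary fraction of total wealth

best_leverage = E / V                 # maximizes simple_expected_log over L
risky_allocation = D * best_leverage  # share of total wealth in risky assets
risk_aversion = 1 / (2 * D)           # implied Markowitz coefficient

# Markowitz allocation w that maximizes w*E - risk_aversion * w^2 * V.
markowitz_allocation = E / (2 * risk_aversion * V)

print(f"ideal implicit leverage: {best_leverage:.2f}")
print(f"risky allocation D(E/V): {risky_allocation:.2f}")
print(f"risk aversion 1/(2D):    {risk_aversion:.2f}")
print(f"Markowitz allocation:    {markowitz_allocation:.2f}")
```

The two allocations coincide, which is the sense in which choosing D fixes the point on the efficient frontier.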
Comparing Active Managers:
Information ratios are nearly ubiquitous in professional discussion of the
performance of active investment managers. Very few investors know the
restrictive assumptions necessary to make maximizing this ratio consistent with
long-run growth in median wealth. It is not difficult to construct cases where
the information ratio gives a ranking to investment policies or managers that
is quite misleading for that goal.
That the information ratio takes no account of higher return
moments clearly makes it inapplicable for evaluating a portfolio insurance
program. But it can be misleading even in cases involving only return mean and
variance. Here is a case in point.
Assume market returns in excess of cash have an annual mean
of 6% and a standard deviation of 20%, arising from a log-normal distribution
with appropriately translated mean and variance. Two active managers of a
client’s entire all-equity portfolio are compared – one with an extra return of
6% and a tracking error of 8%, the other with an extra return of 5% and a
tracking error of 1%. Log return and variance are appropriately translated and
assumed independent of the market log return. The first manager’s commonly
measured information ratio of 0.75 is far less than the second’s 5.0. Yet if
we calculate mean log return for the total portfolio, it is clear that it is the
first manager who offers superior long run results while avoiding shortfall for
any investor whose discretionary wealth fraction is greater than about 0.35.
Avoiding Dynamic Hedging Pitfalls:
Anecdotally, investors employing dynamic hedging have often found that they become
stuck near a protective floor. This phenomenon might be attributed to unfortunate
jumps in pricing, but there is a more general explanation.
Black and Perold’s [1992] CPPI
procedure allows the production of option-like positions through trading
rules. This procedure dynamically allocates total assets to a risky asset in
proportion to a multiple of the “cushion,” the difference between current
wealth and a desired protective floor. This produces an effect similar to
owning a put option, and if employed without borrowing constraints, also a call
option. If the return distribution statistics are constant, and the
contribution of higher moments of return to results is minimal, it is similar
to the procedure proposed in this paper. The crucial distinction is that CPPI
multiples are driven to high ratios by the need to create strong option effects
rather than the moderate levels appropriate for best long-run median wealth.
Using Equation 2, it becomes
obvious that CPPI multiples of 5 or more necessary to produce a saleable option
effect cause too much implicit leverage for the cushion, resulting in a
negative mean log return and eventual entrapment near the protected floor.
This phenomenon is only partially offset by typical constraints against outside
borrowing. Such constraints lower implicit leverage after a sequence of good
returns, allowing most investors to escape to safer regions unless early
results are negative. However, a substantial fraction of investors will be
left behind.
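The effect of a high CPPI multiple can be illustrated by treating the multiple as the implicit leverage applied to the cushion. The sketch below uses the two-term form of Equation 2 and hypothetical risky asset statistics (a 6% mean excess return and a 20% standard deviation); it ignores borrowing constraints, price jumps, and higher moments, so it is only a rough guide.

```python
import math

def mean_log_two_term(E, V, L):
    """Two-term estimate of mean log return on the cushion at leverage L."""
    base = 1 + L * E
    return math.log(base) - (L ** 2) * V / (2 * base ** 2)

E, V = 0.06, 0.04  # hypothetical mean and variance of risky asset excess return

for multiple in (1.5, 3, 5, 8):
    estimate = mean_log_two_term(E, V, multiple)
    print(f"multiple {multiple:>3}: estimated mean log return {estimate:+.4f}")
```

With these inputs the estimate is positive at moderate multiples near E/V and turns negative by a multiple of 5, consistent with the entrapment near the floor described above.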
CONCLUSION
The purpose of this paper has been to make available to every
interested portfolio manager a method to better manage the long-term results of
investment policies.
We began to build on the Markowitz single-period mean-variance
criterion by examining Kelly’s exponential growth model. His criterion of
maximizing single-period expected log returns achieves maximum long-run median
wealth if no shortfall intervenes to interfere with the process. We used
Markowitz’s logic to translate mean log return to a Taylor series representation in terms of
the statistical moments of return. By extending the series to four terms,
rather than the two at which Markowitz stopped, we showed how the concept of
risk extends naturally beyond variance to include negative skewness and excess
kurtosis, or fat-tailed return distributions, thus encompassing downside risk.
To account for the needs of conservative investors who must
avoid shortfalls along the way to the long run, we re-mapped returns on total
wealth to amplified returns on discretionary wealth, the wealth available
before shortfall. We represent conservatism not by greater curvature of an
abstract utility function, but by greater distance between investment outcomes
based on measuring returns against fractions of wealth considered
discretionary. For example, a 10% loss for an investor who can lose no more
than 20% of total assets without shortfall is represented as a 50% loss.
The addition of the discretionary wealth hypothesis generalizes
Markowitz’s framework to investment policy governing both long-run
outcomes and risk features not captured by return variance. What is new is its
focus on the interaction between shortfall boundaries and leverage in
determining suggested separate investor preferences for return mean, variance,
skew, and kurtosis. What makes it practically very useful is its relative
simplicity.
ENDNOTE
The author is grateful to Gary
Gastineau, Campbell Harvey, Blake LeBaron, Ben Shoval, Dan Rie, Mark Kritzman,
Michael Wilcox and Richard Holmes for their comments and encouragement.
REFERENCES
Barberis, Nicholas. “Investing for
the Long Run when Returns Are Predictable.” Journal of Finance, 55 (2000),
pp. 225-264.
Bekaert, Geert, Claude Erb,
Campbell Harvey and Tadas Viskanta. “Distributional Characteristics of Emerging
Market Returns and Asset Allocation.” Journal of Portfolio Management,
24 (Winter 1998), pp. 102-116.
Black, Fischer, and André Perold. “Theory
of Constant Proportion Portfolio Insurance.” Journal of Economic Dynamics
and Control, 16 (1992), pp. 403-426.
Hakansson, Nils H. “Multi-Period
Mean-Variance Analysis: Toward A General Theory of Portfolio Choice.” Journal
of Finance, 26 (1971), pp. 857-884.
Harvey, Campbell R., and Akhtar
Siddique. “Conditional Skewness in Asset Pricing Tests.” Journal of Finance,
55 (2000), pp. 1263-1295.
Kelly, J. L., Jr. “A New Interpretation
of Information Rate.” Bell System Technical Journal, 35 (1956), pp. 917-926.
Kritzman, Mark, and Don Rich. “The
Mismeasurement of Risk.” Financial Analysts Journal, (May-June 2002),
pp. 91-99.
Markowitz, Harry M. Portfolio
Selection: Efficient Diversification of Investments. New Haven, Conn.: Yale University
Press, 1959.
Merton, Robert C., and Paul A.
Samuelson. “Fallacy of the Log-Normal Approximation To Optimal Portfolio
Decision-Making Over Many Periods.” Journal of Financial Economics, 1
(1974), pp. 67-94.
Wilcox, Jarrod. “Better Risk Management.”
Journal of Portfolio Management, 26 no. 4 (Summer 2000), pp. 53-64.