An analysis of the screening method described in Joel Greenblatt's Little Book That Beats The Market

Joel Greenblatt's pop-hit, The Little Book That Beats The Market (2006, John Wiley and Sons, ISBN 0-471-73306-7), describes a simple screening method to identify stocks that offer good value and low risk. Many web sites, newspaper articles and blogs have commented on this book by now (February 2006). Those commentaries uniformly fail to provide any new insight into Greenblatt's ideas. I found one fairly negative, rather academic-flavored site that was quite critical, but after reading carefully, I concluded that the author of that article didn't pay close attention to what Greenblatt actually says in his book.

The Little Book describes some detailed, retrospective studies of simulated, mechanical investing in stocks selected by his method. The simulations used data from a Compustat "Point in Time" database containing the information that was actually available on a large universe of stocks at each date in the study. Each screening and simulated trade was based on exactly the information that was available historically at that moment in time. This allowed Greenblatt to avoid some forms of selection bias that would have invalidated his research.

I don't have access to the Compustat historical data, but current rankings served up by Greenblatt's free web site can be pretty well replicated using current data from Value Line. Doing so requires reading the book's appendix carefully and taking into account some other comments scattered through the book. Here's what I learned from that exercise.

I have no connection with Joel Greenblatt; I simply read his book and worked out what I think it means. Greenblatt has neither reviewed nor commented on this article.

I do subscribe to the Value Line data service, and I commend it to interested readers. It's a convenient, high-quality service with superior customer support.

Comments, critiques, observations and suggestions welcome!

In a nutshell, what Greenblatt says

Restated briefly, here are some key points from The Little Book.

How I approximated Greenblatt's numbers

I approximated Greenblatt's numbers using Value Line data:

  1. myEBIT = IncomeBeforeTaxes + Depreciation.  (IncomeBeforeTaxes has depreciation subtracted out; I added it back.)
  2. ReturnOnCapitalDenominator = TotalCurrentAssets - Cash + NetPlant. (NetPlant is the depreciated value of long-term assets.)
  3. ROC = myEBIT / ReturnOnCapitalDenominator.
  4. EarningsYieldDenominator = MarketCap + PreferredEquity + LongTermDebt.
  5. EY = myEBIT / EarningsYieldDenominator.
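The five steps above can be sketched in a few lines of Python. This is only an illustration: the `Fundamentals` record and its field names are hypothetical stand-ins for the corresponding Value Line data items, the results are expressed in percent, and the sketch assumes both denominators are positive.

```python
from dataclasses import dataclass


@dataclass
class Fundamentals:
    """Hypothetical record of per-company inputs (all in the same currency units)."""
    income_before_taxes: float
    depreciation: float
    total_current_assets: float
    cash: float
    net_plant: float          # depreciated value of long-term assets
    market_cap: float
    preferred_equity: float
    long_term_debt: float


def my_ebit(f: Fundamentals) -> float:
    # Step 1: add depreciation back to income before taxes.
    return f.income_before_taxes + f.depreciation


def roc_pct(f: Fundamentals) -> float:
    # Steps 2 and 3: return on capital, in percent.
    denominator = f.total_current_assets - f.cash + f.net_plant
    return 100.0 * my_ebit(f) / denominator


def ey_pct(f: Fundamentals) -> float:
    # Steps 4 and 5: earnings yield, in percent.
    denominator = f.market_cap + f.preferred_equity + f.long_term_debt
    return 100.0 * my_ebit(f) / denominator


# Made-up numbers, purely for illustration.
example = Fundamentals(
    income_before_taxes=80.0, depreciation=20.0,
    total_current_assets=300.0, cash=100.0, net_plant=200.0,
    market_cap=700.0, preferred_equity=0.0, long_term_debt=300.0,
)
print(roc_pct(example), ey_pct(example))  # prints 25.0 10.0
```

A real implementation would also need to decide what to do with companies whose denominators are zero or negative; the sketch simply does not handle them.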

In addition, I eliminated stocks with Value Line industry codes "financl", "brokers", "thrift", "water", "reit", and all the codes designating banks, utilities, and insurance companies. I also eliminated stocks with ROC > 300 or EY > 50; extreme values suggest some condition in the company's history or accounting that might make its numbers not properly comparable with the rest of the population.
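As a rough sketch, the exclusions just described might look like this in Python. It assumes ROC and EY are expressed in percent. Only the five industry codes quoted above are hard-coded; the codes for banks, utilities, and insurance companies are not listed in this article, so the caller passes them in through the hypothetical `extra_excluded` parameter.

```python
# Industry codes named explicitly in the text.
EXCLUDED_CODES = frozenset({"financl", "brokers", "thrift", "water", "reit"})

# Outlier cutoffs from the text (assumed here to be percentages).
MAX_ROC = 300.0
MAX_EY = 50.0


def passes_exclusions(industry_code, roc, ey, extra_excluded=frozenset()):
    """Return True if a stock survives the industry and outlier exclusions.

    extra_excluded is a hypothetical hook for the bank, utility, and
    insurance codes, which the caller must supply.
    """
    code = industry_code.strip().lower()
    if code in EXCLUDED_CODES or code in extra_excluded:
        return False
    # Extreme values suggest numbers that are not comparable with the rest.
    return roc <= MAX_ROC and ey <= MAX_EY


# "retail" and "bank" are made-up codes used only for this example.
print(passes_exclusions("retail", roc=40.0, ey=12.0))            # True
print(passes_exclusions("reit", roc=40.0, ey=12.0))              # False
print(passes_exclusions("bank", 40.0, 12.0, extra_excluded={"bank"}))  # False
```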

A company can have "good" ROC and EY by these definitions, yet still have negative shareholder equity, a poor net margin conventionally defined, or "negative" earnings by the book. Therefore I added a "picky" scan criterion which flags stocks that have negative shareholder equity, no earnings per share, or a net profit margin less than 2%.
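A minimal Python sketch of such a flag follows. The function and parameter names are hypothetical, and "no earnings per share" is interpreted here as an EPS figure that is missing, zero, or negative; that interpretation is an assumption, not something spelled out above.

```python
from typing import Optional


def fails_picky_test(shareholder_equity: float,
                     eps: Optional[float],
                     net_margin_pct: float) -> bool:
    """Return True if the stock should be flagged by the picky criterion."""
    negative_equity = shareholder_equity < 0
    # Assumption: missing, zero, or negative EPS all count as "no earnings".
    no_earnings = eps is None or eps <= 0
    thin_margin = net_margin_pct < 2.0
    return negative_equity or no_earnings or thin_margin


print(fails_picky_test(500.0, 1.25, 6.0))   # False: passes all three checks
print(fails_picky_test(-50.0, 1.25, 6.0))   # True: negative equity
print(fails_picky_test(500.0, None, 6.0))   # True: no EPS reported
print(fails_picky_test(500.0, 1.25, 1.5))   # True: margin below 2%
```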

Expanding slightly on the features, I can specify a minimum market cap for the stocks selected, or a range of allowable market caps (e.g. companies with market caps between 25 - 1500 million dollars). I can also select logical combinations of industry codes, although doing so obviously does not find the cheapest stocks in the entire Value Line universe. This helps me notice relatively cheap stocks within an industry or market segment.

Finally, I don't compute a stock's final rank from the sum of its EY rank and ROC rank. Instead, I compute each stock's "distance" from the origin of an (EY rank,ROC rank) scatter plot according to the Pythagorean theorem (see second illustration below). The ideal "best" stock would have EY rank and ROC rank of #1, closest to the origin. The resulting sorted order is not significantly different from Greenblatt's trick.
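To make the comparison concrete, here is a small Python sketch of both orderings: the sum of the two ranks, and the Pythagorean distance of the point (EY rank, ROC rank) from the origin. The tickers and numbers are made up, and ties are broken alphabetically by ticker, which is an arbitrary choice for this illustration.

```python
import math


def ranks(values):
    """Return 1-based ranks, with the largest value ranked #1."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def order_by_distance(tickers, ey, roc):
    """Sort tickers by distance of (EY rank, ROC rank) from the origin."""
    scored = [(math.hypot(e, r), t)
              for t, e, r in zip(tickers, ranks(ey), ranks(roc))]
    return [t for _, t in sorted(scored)]


def order_by_rank_sum(tickers, ey, roc):
    """Sort tickers by the sum of EY rank and ROC rank."""
    scored = [(e + r, t) for t, e, r in zip(tickers, ranks(ey), ranks(roc))]
    return [t for _, t in sorted(scored)]


# Hypothetical tickers with made-up EY and ROC values (percent).
tickers = ["AAA", "BBB", "CCC", "DDD"]
ey = [20.0, 15.0, 10.0, 5.0]
roc = [30.0, 60.0, 50.0, 10.0]

print(order_by_distance(tickers, ey, roc))   # ['BBB', 'AAA', 'CCC', 'DDD']
print(order_by_rank_sum(tickers, ey, roc))   # ['BBB', 'AAA', 'CCC', 'DDD']
```

The two orderings coincide in this tiny example. In general they can differ: the distance measure penalizes a stock that is excellent on one rank and very poor on the other more heavily than the plain sum does.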

While my definitions don't exactly replicate the numbers from Greenblatt's web site, they do sort stocks into pretty much the same order. The result is certainly close enough for most purposes. Greenblatt himself makes the point that the exact definition isn't critical.

The (EY,ROC) scatter plot

It's informative to plot a scatter diagram with EY on the X-axis and ROC on the Y-axis (the raw values as opposed to ranked position in the list). The Value Line universe is big, even after eliminating non-industrial companies (financial etc. as noted above), so I only plot the top 300. Among those, the best 65 are labeled with their ticker symbols and the rest are drawn as dots. Green points pass my "picky" test, red points fail it.
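The selection of what to draw can be sketched in Python without committing to any particular plotting library. The function below is a hypothetical helper: it assumes the stocks arrive already sorted best-first, each as a dict with the made-up keys `ticker`, `ey`, `roc`, and `picky_ok`, and it returns one plain record per point that a plotting routine could then render.

```python
def build_plot_points(stocks, max_points=300, max_labels=65):
    """Prepare scatter-plot points from a best-first list of stocks.

    Each stock is a dict with hypothetical keys: ticker, ey, roc, picky_ok.
    Only the top max_points are kept; the best max_labels carry a ticker
    label, and the rest are unlabeled dots.
    """
    points = []
    for position, stock in enumerate(stocks[:max_points]):
        points.append({
            "x": stock["ey"],     # EY on the X-axis
            "y": stock["roc"],    # ROC on the Y-axis
            "label": stock["ticker"] if position < max_labels else None,
            "color": "green" if stock["picky_ok"] else "red",
        })
    return points


# Synthetic universe of 400 made-up stocks, already sorted best-first.
universe = [
    {"ticker": f"T{i}", "ey": 30.0 - i * 0.05, "roc": 80.0 - i * 0.1,
     "picky_ok": i % 4 != 0}
    for i in range(400)
]
points = build_plot_points(universe)
print(len(points))                              # 300
print(sum(p["label"] is not None for p in points))  # 65
```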

Here is a scatter plot of companies with >= 1500 million market cap:

The top companies from this screen exhibit a sharp ROC/EY boundary. Companies whose EY or ROC is too small just don't make the cut; a firm can't operate with a really good ROC but a disastrously bad EY, for example. Good companies in the above plot, such as AEOS, UST, INTC, and DWA, fall near the 45-degree line, and farther out from the (0,0) origin.

MVL (top of the plot) is a company that was bankrupt. Such companies usually mark some of their capital assets down and shed long-term debt, so they might have abnormally large ROC values. You might think, well, that's OK - the company is operating with advantages after its bankruptcy. But consider that for "normal" companies, there is an implicit presumption that as the business grows, the firm can reinvest some profits to enjoy even more of that great ROC. Unfortunately, as a formerly bankrupt company that wrote off a lot of assets or debt begins to reinvest, its ROC can be expected to fall back toward a value that is more typical for its industry. Its ROC advantage won't last indefinitely. Screening by itself does not reveal such issues.

MT (Mittal Steel) is a good example of a somewhat different ROC issue. MT also scores well on a Greenblatt-type scan of large companies because its ROC is very high. Why? The company bought a lot of former communist-bloc steel factories at steal prices. Now, Mittal can produce steel very inexpensively, but those purchases were a one-time opportunity. Some day when demand slows, those same factories will saturate the market and drive steel prices into a very deep hole. Mittal might close some of them, or it might use its capacity to put less advantaged competitors out of business. One way or another, eventually there will be an ugly scene. Right now, with Asia using all the structural steel it can get, MT seems attractive. Will China keep importing, or will it develop its own capacity? Seeds of a future "capacity catastrophe" might be hidden in MT's present, unnaturally superior ROC.

The (EY rank,ROC rank) scatter plot

Let's have a look at a plot whose coordinates are (EY rank,ROC rank), instead of the raw EY and ROC values. Here are the best members of the Value Line universe with market caps at least 25 million.

This picture is clearer because ranked points by definition can't fall on top of each other. Now the top-ranked companies, such as XJT and VTS, are near the origin. Again, red items fail the "picky" test for one reason or another. Some companies that were nearer the origin in December and January, such as EGY and AEOS, have migrated outward as the share prices went up.

Still a need for analysis

This screen produces some gems, and some stocks with real issues. Veritas (VTS) looks promising. It's essentially a software company that provides subterranean mapping of oil fields worldwide. It has a growing database of the most promising fields. The company managed itself through lean years without debt. It has great EY and ROC. Now, with the advent of what seems to be a new, persistently higher level of oil prices, Veritas is taking on a line of revolving debt and hiring more geophysical scientists. Even though energy-related stocks have run up a long way, VTS might be a really good purchase.

On the other hand, consider XJT. ExpressJet provides commuter services for a larger airline, under contract. I believe there was a recent report that big brother has decided to diversify its outsourcing. The commercial airline game is brutal; XJT may be cheap for good reason - it sells its services to one or a few customers who have every reason to squeeze hard.

These are just some of the issues that can arise if one blindly follows a numerical value screen. Still other problems may arise if a company's books can't be trusted, due to honest or dishonest mistakes.

Why does Greenblatt's method produce such good results?

Despite these limitations, Greenblatt's screen produces an excellent list of candidates to consider. His premise that these companies are inherently less risky than the "average" stock in the universe is probably true, because their raw numbers demonstrate that they have the requisites to succeed, if they are managed properly. And they're cheap buys in a tangible, concrete sense that is hard to dispute.

Greenblatt believes we can earn really remarkable rates of return by buying these companies, yet the widely accepted risk/return principle of investing holds that greater risk is required to get higher rates of return. Is there some hidden risk here? Or is that canonical belief simply wrong?

His explanation is the simplest one: in the long run stock prices do reflect risk in an average way, but they are very inaccurate in the short term. The risk/return principle is intimately related to another axiom of financial theory, which holds that at any moment, the probability of a small price up-move is about equal to the probability of a small down-move. This is closely related to the usual modeling assumption that stock prices have a log-normal probability distribution, which is to say that their logarithmic returns are normally distributed.

That is roughly true for the market as a whole, but it is probably not true for the sub-population of stocks selected by Greenblatt's screen. These stocks are already relatively cheap, and since their EY and ROC indicate that the essential requirements for a successful business are satisfied, the probability of a down-move is not about the same as the probability of an up-move for these stocks. The rate-of-return distribution for this population of stocks is probably quite asymmetric.

Which raises another question ...

We just saw that without additional analysis, depending on screening alone may lead you into mistakes. Yet Greenblatt's mechanical simulations, which demonstrated extraordinary returns, obviously didn't incorporate such analysis - although he might have applied additional, "picky" criteria that are not fully explained in The Little Book.

So ask yourself this. Suppose you accept the risk/return principle. By applying additional analysis that was not done in Greenblatt's simulation, you are selecting even less risky stocks from the screened population. Do you believe that applying such analysis will actually reduce your rate of return?

- Roger Ison, 27 Feb. 2006