Some may know that I recently got a second job as an adjunct professor in the University of Washington system. While the first course I will be teaching is on Operations and Project Management, which is part of the supply chain discipline of the School of Business, I have been encouraged to work on an emphasis of data analytics and visualization. Last month, I was invited to give a guest lecture to a class on visualization. Developing those slides gave me inspiration to consider developing a curriculum around analytics and visualization. To organize my thoughts, I decided to begin here on WordPress.
It seems to me a class should be organized around the steps of developing data to answer questions. The following list could be used for academic or industrial purposes. I don’t think this list is original, but it seems logical and fits with my own approach at work.
1. Have a question without an answer
2. Find a data source that may lead to an answer
3. Look for incompleteness, inconsistencies, and possible errors
4. Perform initial analysis and construct initial visuals
5. Write a story describing the analysis and visuals and include a conclusion
6. Has the initial question been answered? Are there new questions?
I have encountered instances where only steps 2, 4, and 5 are performed. That has resulted in dissatisfaction from the requestor and the analyst. It seems almost all of the dissatisfaction arises from missing step one. If you don’t know what you would like to solve, you don’t know when you have finished. It is the finish that is determined in step six. Sometimes an answer to a question brings up new questions since the conclusion may not be what was expected or the answer doesn’t perfectly answer the initial question.
I have also seen instances where step three is skipped. This can be fatal to any analysis since it can lead to incorrect conclusions or worse, no conclusion when there should have been one. There generally isn’t a way to easily verify the accuracy of a data set. Incompleteness can be easy to check. My background has been in time series analysis which may be the easiest of all types of data sets to verify completeness. My best advice is to spend time with the data and use statistics along with graphs to see if the data looks reasonable. Odd situations can be observed with a basic approach that can lead to questions regarding events that influenced the data.
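As a rough illustration of the completeness check for a daily time series, here is a minimal Python sketch. It assumes one observation is expected per weekday; the function name `missing_weekdays` and the sample dates are made up for the example, and market holidays would show up as gaps that need a second look.

```python
from datetime import date, timedelta

def missing_weekdays(observed, start, end):
    """Return weekdays from start to end (inclusive) that have no observation."""
    seen = set(observed)
    missing = []
    day = start
    while day <= end:
        # weekday() is 0-4 for Monday-Friday; weekends are skipped.
        if day.weekday() < 5 and day not in seen:
            missing.append(day)
        day += timedelta(days=1)
    return missing

# Made-up observation dates for illustration only (Wednesday is absent).
observed = [date(2014, 2, 3), date(2014, 2, 4), date(2014, 2, 6), date(2014, 2, 7)]
gaps = missing_weekdays(observed, date(2014, 2, 3), date(2014, 2, 7))
print(gaps)
```

A gap found this way is only a prompt for a question: the data may be missing, or the market may simply have been closed that day.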
One of my goals will be to also examine the graphs I have constructed and maintained on the data site. I have some opinions on the construction of charts and how to do it well. With regard to software, I have been using Microsoft Excel for twenty years and still consider it to be the best general purpose software for data analysis. For this blog though, I have turned towards Tableau Public for visualization. I used a recent entry to discuss how that software has made it simpler for me to keep the visuals of this blog updated.
Okay, that is a simple introduction. I have no timetable for what I put in this category, but it will keep my imagination active for a while.
I have updated the graph of the 10-year yield to include all of February’s data. The average yield during February was 15 basis points below January’s average. I was surprised by that result since I really haven’t heard much commentary regarding the yield.
I have a table on the graph which shows the monthly average over the past six months. February had the second-lowest average of the six, with only October 2013 being lower. Since June 24, 2013, the yield has stayed within a fairly narrow range of 2.55 to 3.03.
I added a new table today that shows the average over the past six years. The trend from 2009 to 2012 was straight down, but the trend from 2012 to 2014 has been straight up.
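For anyone who wants to reproduce tables like these, a monthly average and a month-over-month change in basis points could be computed roughly as follows. This is only a sketch: the function names are hypothetical, and the yields in `observations` are placeholder values, not the actual Treasury data behind the graph.

```python
from collections import defaultdict
from datetime import date

def monthly_averages(observations):
    """Map (year, month) to the mean of the values observed in that month."""
    buckets = defaultdict(list)
    for day, value in observations:
        buckets[(day.year, day.month)].append(value)
    return {month: sum(vals) / len(vals) for month, vals in buckets.items()}

def change_in_basis_points(earlier, later):
    """Difference between two yields quoted in percent, in basis points.

    One basis point is 0.01 percentage points, so the gap is multiplied by 100.
    """
    return (later - earlier) * 100

# Placeholder values for illustration only, not real yield data.
observations = [
    (date(2014, 1, 2), 3.00), (date(2014, 1, 3), 2.90),
    (date(2014, 2, 3), 2.70), (date(2014, 2, 4), 2.80),
]
averages = monthly_averages(observations)
change = change_in_basis_points(averages[(2014, 1)], averages[(2014, 2)])
print(averages, change)
```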
I have thought about constructing another view of this graph which shows wave analysis. It would be very complicated, but it is an interesting way to look at data. Here’s how to think of it:
- the current period began June 24, 2013 and has lasted 173 trading days with a range of 2.55 to 3.03
- a transition period occurred between May 2, 2013 and June 23, 2013 that lasted 36 trading days
- the previous period began June 1, 2012 and lasted 229 trading days with a range of 1.47 to 2.06
You get the idea. Defined range periods have an indeterminate length. Transition periods are sharp movements between defined range periods and are typically short in duration. The current wave pattern seems to have begun during the fourth quarter of 2008. I will continue to look over this graph to see if there are more interesting patterns.
Note: After thinking about it some, I decided to give it a go. There haven’t been that many changes in the data since 2008. I updated the graph to include some color depicting what period the data is in. Blue data indicate periods of stability. Red data indicate transition periods. Green data indicate wave-based transition periods. Wave-based transitions are typically slower and involve opposite-direction movements of minor-period length. There are currently three green waves in the graph and you can see the saw-tooth pattern of the line.
There is an art to watching economic indicators. They don’t move in the same direction at the same time. Take the following consecutive stories from this morning’s Wall Street Breakfast on Seeking Alpha.
Eurozone consumer prices tumble. Eurozone CPI dropped a record 1.1% on month in January after rising 0.3% in December, with the fall much sharper than the 0.4% decline that was expected. The index was dragged down by a tumble in the cost of non-energy industrial goods. On year, inflation was +0.8%, as in December. The sharp monthly fall in CPI comes amid concerns about deflation in the eurozone, although the ECB has so far been sanguine.
German corporate optimism increases again. The German Ifo institute’s business climate index has increased to its highest level in 2 1/2 years, rising to 111.3 in February from 110.6 in January and topping consensus that was also 110.6. The current-situation reading rose and exceeded forecasts, although the expectations print slipped. “The German economy is holding its own in a changeable global climate,” says Ifo.
Reconciling these two items isn’t easy. Prices should be dropping because demand is bad. But if demand is bad, corporate optimism shouldn’t be increasing. The key is reading the second item carefully. The business climate and current situation readings are better than expected, but the expectations reading slipped. The two items match up if prices continue to fall, since lower expectations for future business would then be the correct attitude.
I saw a remarkable graph two weeks ago and managed to find the data sets and confirm the results.
The Case-Shiller series of indices track the change in price of housing throughout the United States. It is a remarkable set of series that includes the largest 10 cities, largest 20 cities, individual cities, regions, and an average of the entire country. For this analysis, I grabbed the entire country series.
I wanted to compare it to the median family income. At first glance, I thought the original graph was showing median wages, but that was not correct and wouldn’t generate a correct comparison. While individuals may buy houses, it is a family that lives in one. A family can also be defined as an individual. The series on median family income is produced by the Census Bureau which gives me confidence that I am using the term family properly for this data comparison.
I am certain everyone is aware of the bubble that occurred in house prices between 2000 and 2008, and that the leading cause of the recession was the collapse in house prices following the destruction of the sub-prime securities market. What everyone may not be aware of is how closely tied the price of housing was to median family income prior to 2000.
I gathered data from 1990 through November 2013 (the most recent Case-Shiller reported date). Between January 1990 and December 1999, the correlation between the two series was 0.7727. That is reasonably high. Contrast that with the period of January 2000 through November 2013 where the correlation was 0.4127.
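For readers who want to try the same comparison, here is a minimal sketch of a split-period correlation. It assumes a plain Pearson correlation on the levels of the two series, already aligned to the same dates; the post does not say how the two series were aligned, so that part is an assumption. The helper names and the sample rows are made up for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

def correlation_by_period(rows, split_year):
    """rows are (year, house_index, income); returns (before, from split on)."""
    before = [r for r in rows if r[0] < split_year]
    after = [r for r in rows if r[0] >= split_year]
    return (
        pearson([r[1] for r in before], [r[2] for r in before]),
        pearson([r[1] for r in after], [r[2] for r in after]),
    )

# Toy rows for illustration only, not Case-Shiller or Census data.
rows = [
    (1997, 1.0, 10.0), (1998, 2.0, 20.0), (1999, 3.0, 30.0),
    (2000, 1.0, 30.0), (2001, 2.0, 20.0), (2002, 3.0, 10.0),
]
early, late = correlation_by_period(rows, split_year=2000)
print(early, late)
```

Computing the correlation on period-over-period changes instead of levels is a common alternative and can give quite different numbers for trending series.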
The point of the original graph was that the recent increase in the Case-Shiller index is far outpacing the median income again. The chart is below. There are no conclusions to be drawn, but it might be something to watch. I’ll keep the chart on this blog since the median family income number is updated only once per year and the Case-Shiller has a significant lag.
I have read some angst over the past two weeks regarding the “taper”. This refers to the reduction in the amount of assets purchased by the Federal Reserve each month. While the Fed is reducing the size of its monthly purchases, this should not be considered a tightening activity.
Let’s start with Federal Reserve policy. Typically the Fed increases or decreases an interest rate known as the Federal Funds rate to affect the economy and to counter the business cycle. In 2008, the Fed Funds rate had been decreased to the point where it was nearly zero and the economy was still struggling in a recession. The next option was to engage in a pattern of security purchases known as Quantitative Easing (QE). The first three QE programs were limited in size. The QE program we are currently in does not have a specified size or end date.
I have summarized all of this in a new chart on the analytical road blogspot (opens in a new window). The Fed Funds rate is the line in red. Since this chart has a bit of history, it is easy to see where the rate was increased and decreased. The most recent decrease began in 2007 and ended at the zero bound in 2008. The green line is the size of the balance sheet of the Federal Reserve. As the QE programs have taken place, the size of the balance sheet has risen dramatically.
The point of this is the very far right of the balance sheet line. There is a slight turn in the upward movement. That is the effect of the “taper”. Notice, the line is still moving up which indicates that easing is still taking place. Once this line moves down or the red line begins moving up, then and only then can it be said that there is a tightening program taking place by the Fed.
When it comes to analytics, this analysis is very straightforward. I wish I had kept the links to the news reports mentioning the Fed was beginning a tightening program since it would have been useful to include them here. Easing less is not the same as tightening.
One of the valuation measures of the equity market is the price / earnings ratio. In 2008, the companies of the S&P 500, in aggregate, suffered a significant loss one quarter. It badly distorted the P/E ratio for a year since the earnings part of the ratio is the summation of twelve months.
To avoid this, Robert Shiller proposed in his book Irrational Exuberance a new measure called the Cyclically Adjusted Price Earnings Ratio (CAPE Ratio). The idea was to take the price and divide by the average annual earnings over the prior ten years. The calculation is fairly easy since the prior twelve months earnings can be computed each month. Then the average of those earnings over ten years is calculated.
The current CAPE is 25.53. The historical mean is 16.51 and the historical median is 15.90. The conclusion is the price is high relative to the earnings.
I bring this up not just because I have been seeing this measure frequently, but because I think there is a problem with the calculation.
By my calculation, I have a current CAPE of 27.39 (1845.89 / 67.39). The official measure of 25.53 comes from 1838.63 / 72.03. I can’t explain the difference, but I don’t know the source of the earnings for the official measure. I do know my source (us.spindices.com) and I can see the detail rather than the rolling last twelve months. The two CAPE ratios aren’t too different and will not affect the conclusion and thus, I plow forward.
As I mentioned in the last post, earnings are expected to grow 45% over the next two years. This will move the denominator of the CAPE from 67.39 to 80.51, an increase of only 20%. If we assume the numerator remains constant, the CAPE ratio decreases from 27.39 to 22.93. The initial conclusion should be that even a remarkably large increase in earnings (45%) over the next two years still will not move the CAPE ratio back between the mean and median.
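The arithmetic above can be checked with a few lines of Python. This is only a sketch using the figures quoted in this post; the variable names are mine, and the price is held constant as assumed in the text.

```python
def cape(price, average_earnings):
    """Price divided by the ten-year average of annual earnings."""
    return price / average_earnings

price = 1845.89            # index level used in my calculation
current_average = 67.39    # current ten-year average earnings
projected_average = 80.51  # ten-year average after two more years of estimates

current_cape = cape(price, current_average)
projected_cape = cape(price, projected_average)
denominator_growth = projected_average / current_average - 1

print(round(current_cape, 2), round(projected_cape, 2))
print(f"denominator growth: {denominator_growth:.1%}")
```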
Using a ten-year moving average of earnings and comparing it to a spot price presents a mathematical difficulty. The only way to quickly change the ratio is to change the price. The denominator can only change the ratio over the long term. The conclusion that a 45% increase in earnings over two years doesn’t significantly impact the ratio shouldn’t be a surprise because it gets muted by the effect of using an average.
Specifically, 2014 replaces 2004 (which means 120.60 replaces 58.55) and 2015 replaces 2005 (which means 147.50 replaces 69.93). What remains is 2006 through 2013 and totals 565.78. The increases forecasted in 2014 and 2015 only add 139.62 net to the total. Spreading that out across a decade really mutes the impact.
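The replacement arithmetic can be sketched as below, using only the annual figures quoted in the paragraph above. It assumes a simple window of ten calendar years, which is a simplification: the shift it produces in the average is slightly larger than the move from 67.39 to 80.51 quoted earlier, presumably because of how the trailing window is aligned.

```python
# Annual earnings leaving and entering the ten-year window (figures from the text).
dropped = {2004: 58.55, 2005: 69.93}
added = {2014: 120.60, 2015: 147.50}
window_years = 10

# Net change in the ten-year total when two old years are swapped for two new ones.
net_added = sum(added.values()) - sum(dropped.values())

# Spread across the window, this is how far the average itself moves.
shift_in_average = net_added / window_years

print(round(net_added, 2), round(shift_in_average, 2))
```

Even though each new year is roughly double the year it replaces, the average moves by only a tenth of the net change.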
My second conclusion is that I am no longer certain what this measure tells me about the market. It puts too much emphasis on the price relative to the earnings. It also doesn’t seem to provide any additional information beyond the traditional PE. It will be useful in case of another time when earnings are affected by a significant change away from the trend, but I will simply hope we can avoid another 2008.
The first estimates of earnings during 2015 for the S&P 500 were released on Friday. Howard Silverblatt updates an amazing workbook every week, but what is really amazing is the optimism of Wall Street analysts. I intend to track the revision of 2015 estimates over the course of the next two years. We’ll see how well I do at achieving that goal.
Let’s start with a baseline. With 83.9% of the companies in the S&P 500 reporting, the total estimate for As Reported Earnings* for 2013 is currently $101.43. As we have reached the halfway point of the first quarter of 2014, some changes in the 2014 estimates are beginning to appear. Still, the total estimate for 2014 As Reported Earnings is $120.50. That is a healthy gain of 19% over 2013 earnings.
With the first estimate of 2015 As Reported Earnings, Howard is reporting $147.50, a smart little 22% increase over 2014 estimates. By itself, that seems optimistic. Combine that with nominal wage gains from workers and that seems over-optimistic. If I have time tomorrow, I will write up how that affects the P/E ratio of the S&P 500. It is interesting.
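The growth rates are simple to verify. Here is a small sketch using the three estimates quoted above; the helper name `pct_growth` is just for illustration.

```python
def pct_growth(old, new):
    """Percentage change from old to new."""
    return (new / old - 1) * 100

# As Reported Earnings estimates quoted in this post.
earnings = {2013: 101.43, 2014: 120.50, 2015: 147.50}

growth_2014 = pct_growth(earnings[2013], earnings[2014])
growth_2015 = pct_growth(earnings[2014], earnings[2015])
two_year_growth = pct_growth(earnings[2013], earnings[2015])

print(round(growth_2014), round(growth_2015), round(two_year_growth))
```

The two-year figure is the compounded result of the two annual gains rather than their sum.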
* For those who don’t know, there are two types of earnings: Operating Earnings and As Reported Earnings. As Reported Earnings are GAAP based and are what companies report in their financial statements. They seem quite real. Operating Earnings are quite fanciful since they exclude one-time events (which seem to recur) and corporate financial activity which may not result in actual cash.
From Wall Street Breakfast on Seeking Alpha this morning:
As expected, German CPI fell 0.6% on month in January following an increase of 0.4% in December. The drop in prices comes amid increasing concern that the eurozone faces the threat of deflation, although European Central Bank chief Mario Draghi has so far been sanguine about the prospect. Still, Barclays says the risks “are significant,” far greater than markets or policy makers have acknowledged.
I have struggled with creating a method to keep a few charts near the top of the blog. I would prefer to refer to them rather than constantly post them. It has seemed a bit of a chore to update the data in Excel, screen clip the chart, save to a file, upload the file, and create a post. What if I just want to update the chart?
I think I have found an answer using Tableau Public and Blogger. I will keep my commentary here, but I will post the charts periodically on my Blogger site. Now to update a chart on the blog site, I update the data in Excel and refresh the data in Tableau Public. Done! That seems very convenient and quite easy to accomplish.
For now I have two charts on the Blogger site — employment participation rate and the U.S. 10-year treasury yield. Maybe I will spend some time creating a full dashboard of the CPI, which is from a recent post. For now, the charts will always be visible on Blogger.
Two weeks ago, Argentina’s currency dropped significantly as the decline in foreign currency reserves showed no sign of stopping and the government and central bank showed no sign of reversing course.
This morning brought the following item:
S&P cuts Puerto Rico to junk status.
S&P has reduced Puerto Rico’s rating to BB+ and maintained a negative outlook for the debt-laden commonwealth, citing a reduced capacity to access liquidity to fund its operating deficit. Even though Puerto Rico is planning to issue debt, it “will remain constrained in the medium term,” S&P says. Moody’s and Fitch are threatening to drop Puerto Rico to junk status as well.
From Wall Street Breakfast on Seeking Alpha
Puerto Rico’s debt is nine times the amount of debt that Detroit had when it declared bankruptcy. A bankruptcy for the commonwealth may be coming.