When historical sales data is available, statistical forecasting methods are the best way to predict future demand

A variety of statistical forecasting methods exist, each designed for a particular demand pattern: slow-moving products, new product introductions, stable mature products, and products with erratic demand. Determining which statistical forecasting method works best for a product often comes down to trial and error. Because of the confusion surrounding which method(s) to use, some companies bring in forecasting experts to help analyze the data and determine where to start the forecasting process.

Basics

  1. When a company uses statistical sales forecasting techniques, it uses its historical sales or demand data to predict future sales. Because of the complex mathematical formulas used to create the forecast, most companies rely on specialized software to accomplish this task. Each type of demand pattern requires a different statistical method to produce the best forecast.

Seasonal Models

  1. A number of seasonal forecasting methods exist. Seasonal methods, such as Box-Jenkins, Census X-11, decomposition, and Holt-Winters exponential smoothing models, all use the seasonal component of a product's demand profile as a major input when determining the future forecast. Seasonality is a pattern that repeats during specific periods. For example, dining room tables exhibit high seasonal demand in the months leading up to Thanksgiving and Christmas.
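
The idea of extracting a seasonal component can be sketched in a few lines of Python. This is a simplified illustration rather than any of the named methods: it computes average seasonal indices from hypothetical quarterly sales (all figures invented).

```python
def seasonal_indices(series, period):
    """Average each season's observations and divide by the overall mean.
    An index above 1.0 marks a season with above-average demand."""
    overall = sum(series) / len(series)
    return [
        (sum(series[s::period]) / len(series[s::period])) / overall
        for s in range(period)
    ]

# Two years of hypothetical quarterly sales with a third-quarter peak.
sales = [80, 100, 120, 100, 84, 105, 126, 105]
print(seasonal_indices(sales, 4))  # roughly [0.8, 1.0, 1.2, 1.0]
```

Methods such as Holt-Winters refine this idea by updating the seasonal indices over time rather than averaging the whole history at once.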

Simple Models

  1. Businesses that don’t have advanced forecasting software often rely on simple forecasting models managed in a spreadsheet. These include Holt’s double exponential smoothing, adaptive exponential smoothing, the weighted moving average, and the very common moving average method. Although easy to use, the moving average method fails to alert a business to future trends in a product’s data; it only shows trends that have already formed. Each time a new period is added to the moving average formula, the oldest period is removed, so the whole time series “moves” forward one period.
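
The moving average method described above can be sketched in a few lines of Python (the window size and sales figures are invented for illustration):

```python
def moving_average_forecast(series, window):
    """Forecast the next period as the mean of the last `window` observations."""
    if len(series) < window:
        raise ValueError("need at least `window` observations")
    return sum(series[-window:]) / window

sales = [100, 102, 98, 105, 110, 108]
# The forecast uses only the three most recent periods.
print(moving_average_forecast(sales, window=3))  # (105 + 110 + 108) / 3
```

Note how the window “moves”: appending a new observation to `sales` pushes the oldest of the three periods out of the calculation, which is exactly why the method lags behind emerging trends instead of anticipating them.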

New Product Models

  1. Forecasting new products remains one of the toughest tasks in forecasting. New product forecasting requires input from both human and computer-generated sources. New product forecasting methods, such as the Gompertz curve and the probit curve, seek to model the steep ramp-up period associated with a new product introduction. These methods also work for maturing products approaching the end of their life cycle.
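
The Gompertz curve mentioned above is an S-shaped growth function. A minimal sketch follows; the parameter values are invented, and fitting them to real sales data would require nonlinear least squares, which is not shown here.

```python
import math

def gompertz(t, a, b, c):
    """Gompertz growth curve: a is the saturation level (e.g. total market),
    b shifts the curve along the time axis, c controls the growth rate."""
    return a * math.exp(-b * math.exp(-c * t))

# Cumulative demand ramps up steeply after launch, then flattens near saturation.
for t in (0, 5, 10, 20):
    print(t, round(gompertz(t, a=1000, b=5, c=0.5)))
```

The slow start, steep middle, and flattening tail are what make the curve useful for both product launches and end-of-life decline.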

Slow-Moving Models

  1. Products that exhibit slow-moving or sporadic demand require a specific type of statistical forecast model. Croston’s intermittent demand model works for products with erratic demand. Such products do not exhibit a seasonal component; instead, a graph of the product’s demand shows occasional peaks separated by flat periods at intermittent points along the time series. Rather than a period-by-period forecast, Croston’s model produces an average demand rate, which is typically used to set a safety stock value. The safety stock value allows for just enough inventory to cover needs.
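
A minimal sketch of Croston’s method in Python: non-zero demand sizes and the intervals between them are smoothed separately, and their ratio gives an average demand rate per period. The smoothing constant and the demand history below are invented.

```python
def croston(demand, alpha=0.1):
    """Croston's method for intermittent demand.
    Exponentially smooths non-zero demand sizes (z) and the intervals
    between them (p); the per-period demand rate is z / p."""
    z = p = None   # smoothed demand size and smoothed interval
    q = 1          # periods since the last non-zero demand
    for d in demand:
        if d > 0:
            if z is None:                      # first non-zero demand: initialise
                z, p = d, q
            else:
                z = alpha * d + (1 - alpha) * z
                p = alpha * q + (1 - alpha) * p
            q = 1
        else:
            q += 1
    return 0.0 if z is None else z / p

# Erratic demand: mostly zeros with occasional spikes.
print(croston([0, 0, 4, 0, 0, 0, 6, 0, 0, 2], alpha=0.1))
```

The resulting demand rate (roughly 1.3 units per period here) feeds inventory calculations such as safety stock, rather than being read as a prediction for any single period.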

The appropriate forecasting methods depend largely on what data are available.

If there are no data available, or if the data available are not relevant to the forecasts, then qualitative forecasting methods must be used. These methods are not purely guesswork—there are well-developed structured approaches to obtaining good forecasts without using historical data. These methods are discussed in Chapter 4.

Quantitative forecasting can be applied when two conditions are satisfied:

  1. numerical information about the past is available;
  2. it is reasonable to assume that some aspects of the past patterns will continue into the future.

There is a wide range of quantitative forecasting methods, often developed within specific disciplines for specific purposes. Each method has its own properties, accuracies, and costs that must be considered when choosing a specific method.

Most quantitative prediction problems use either time series data (collected at regular intervals over time) or cross-sectional data (collected at a single point in time). In this book we are concerned with forecasting future data, and we concentrate on the time series domain.

Examples of time series data include:

  • Daily IBM stock prices
  • Monthly rainfall
  • Quarterly sales results for Amazon
  • Annual Google profits

Anything that is observed sequentially over time is a time series. In this book, we will only consider time series that are observed at regular intervals of time (e.g., hourly, daily, weekly, monthly, quarterly, annually). Irregularly spaced time series can also occur, but are beyond the scope of this book.

When forecasting time series data, the aim is to estimate how the sequence of observations will continue into the future. Figure 1.1 shows the quarterly Australian beer production from 1992 to the second quarter of 2010.

Figure 1.1: Australian quarterly beer production: 1992Q1–2010Q2, with two years of forecasts.

The blue lines show forecasts for the next two years. Notice how the forecasts have captured the seasonal pattern seen in the historical data and replicated it for the next two years. The dark shaded region shows 80% prediction intervals. That is, each future value is expected to lie in the dark shaded region with a probability of 80%. The light shaded region shows 95% prediction intervals. These prediction intervals are a useful way of displaying the uncertainty in forecasts. In this case the forecasts are expected to be accurate, and hence the prediction intervals are quite narrow.
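
Under the common assumption of normally distributed forecast errors, such intervals are computed from the point forecast, the forecast standard deviation, and a normal quantile (about 1.28 for 80%, 1.96 for 95%). A sketch with invented numbers:

```python
def prediction_interval(forecast, sd, z):
    """Symmetric normal-based interval: forecast +/- z * sd."""
    return (forecast - z * sd, forecast + z * sd)

forecast, sd = 450.0, 20.0  # hypothetical point forecast and forecast std. dev.
print(prediction_interval(forecast, sd, 1.28))  # ~80% interval
print(prediction_interval(forecast, sd, 1.96))  # ~95% interval
```

The 95% interval is necessarily wider than the 80% interval, matching the light and dark shaded regions in the figure.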

The simplest time series forecasting methods use only information on the variable to be forecast, and make no attempt to discover the factors that affect its behaviour. Therefore they will extrapolate trend and seasonal patterns, but they ignore all other information such as marketing initiatives, competitor activity, changes in economic conditions, and so on.

Time series models used for forecasting include decomposition models, exponential smoothing models and ARIMA models. These models are discussed in Chapters 6, 7 and 8, respectively.

Predictor variables are often useful in time series forecasting. For example, suppose we wish to forecast the hourly electricity demand (ED) of a hot region during the summer period. A model with predictor variables might be of the form \[\begin{align*} \text{ED} = & f(\text{current temperature, strength of economy, population,}\\ & \qquad\text{time of day, day of week, error}). \end{align*}\] The relationship is not exact — there will always be changes in electricity demand that cannot be accounted for by the predictor variables. The “error” term on the right allows for random variation and the effects of relevant variables that are not included in the model. We call this an explanatory model because it helps explain what causes the variation in electricity demand.
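
A minimal instance of such an explanatory model, with a single predictor and invented data: ordinary least squares fits ED = b0 + b1 × temperature + error. A real electricity-demand model would also include the other predictors listed above.

```python
def fit_ols(x, y):
    """Ordinary least squares for y = b0 + b1 * x + error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    return my - b1 * mx, b1  # (b0, b1)

# Invented hourly observations: demand rises with temperature.
temperature = [20, 25, 30, 35]   # degrees C
demand = [100, 125, 150, 175]    # MW
b0, b1 = fit_ols(temperature, demand)
print(b0 + b1 * 32)              # predicted demand at 32 degrees
```

The fitted coefficients let us predict demand for any temperature, but only if we can supply (or forecast) that temperature in advance, which is one of the practical limitations discussed below.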

Because the electricity demand data form a time series, we could also use a time series model for forecasting. In this case, a suitable time series forecasting equation is of the form \[ \text{ED}_{t+1} = f(\text{ED}_{t}, \text{ED}_{t-1}, \text{ED}_{t-2}, \text{ED}_{t-3},\dots, \text{error}), \] where \(t\) is the present hour, \(t+1\) is the next hour, \(t-1\) is the previous hour, \(t-2\) is two hours ago, and so on. Here, prediction of the future is based on past values of a variable, but not on external variables which may affect the system. Again, the “error” term on the right allows for random variation and the effects of relevant variables that are not included in the model.
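
The same least-squares idea gives a minimal sketch of this time series form: regress each observation on the previous one, i.e. ED[t+1] = b0 + b1 × ED[t] + error (an AR(1) model, the simplest case of the equation above). The demand series below is constructed so the fit is exact.

```python
def fit_ar1(series):
    """Least-squares fit of series[t+1] = b0 + b1 * series[t] + error."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))
    return my - b1 * mx, b1  # (b0, b1)

# Invented hourly demand generated by ED[t+1] = 10 + 0.9 * ED[t].
demand = [50, 55, 59.5, 63.55, 67.195]
b0, b1 = fit_ar1(demand)
print(b0 + b1 * demand[-1])  # forecast for the next hour
```

Unlike the explanatory model, nothing external needs to be known or forecast here: the next hour's prediction uses only the demand history itself.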

There is also a third type of model which combines the features of the above two models. For example, it might be given by \[ \text{ED}_{t+1} = f(\text{ED}_{t}, \text{current temperature, time of day, day of week, error}). \] These types of “mixed models” have been given various names in different disciplines. They are known as dynamic regression models, panel data models, longitudinal models, transfer function models, and linear system models (assuming that \(f\) is linear). These models are discussed in Chapter 9.

An explanatory model is useful because it incorporates information about other variables, rather than only historical values of the variable to be forecast. However, there are several reasons a forecaster might select a time series model rather than an explanatory or mixed model. First, the system may not be understood, and even if it was understood it may be extremely difficult to measure the relationships that are assumed to govern its behaviour. Second, it is necessary to know or forecast the future values of the various predictors in order to be able to forecast the variable of interest, and this may be too difficult. Third, the main concern may be only to predict what will happen, not to know why it happens. Finally, the time series model may give more accurate forecasts than an explanatory or mixed model.

The model to be used in forecasting depends on the resources and data available, the accuracy of the competing models, and the way in which the forecasting model is to be used.