Calendar Effect in Forecasting

Hi All,

As we know, last weekend was a long holiday weekend in the US, and our sales were affected. Can anybody suggest how to include calendar effects in an ARIMA model, using SPSS or any other tool? I created one event variable, set to 1 for event days and 0 otherwise, and used it as the independent variable in the ARIMA model.

My model fits the past months (April, May, June) well, but it is not doing well for this month, July. I used two years of daily data, and I have to forecast July day by day. Please help.

8 thoughts on “Calendar Effect in Forecasting”

  1. That was great support for me. I am working on the suggested model, and the discussion is ongoing. Once we reach the final result, I will post the procedure for solving this problem.

    Thanks & Regards
    Anuradha

  2. Tom,

    This is awesome. I am sure you have anonymized the results here, with no mention of the company or the product. I appreciate this response.

    Anuradha,

    I hope you found the above useful. Please post your comments back on the blog or on the LinkedIn group if this has helped you progress in your work.

    I recommend you join the DemandPlanning.Net LinkedIn group as well.

    regards,
    Mark

  3. The data is very rich and filled with patterns. Below are the results from the Autobox modelling process. Almost 4 years of daily data were analyzed. The data is from the U.S.

    As for understanding the model, let me describe it for you. When you see “B**-3” in the model for the Christmas variable, it means that Autobox has identified a lead effect 3 days before Christmas, where volume is lower by 3,849. Leads were also identified for 2 days and 1 day before Christmas, and volume on Christmas Day itself is significantly lower than usual.
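
To make the B** notation concrete: a term such as B**-3 on the Christmas variable can be read as a holiday indicator moved three days earlier. The following is a minimal Python sketch (standard library only) of building such lead columns. The helper name `shifted_dummy` and the nine-day sample window are illustrative assumptions, not Autobox internals; the coefficients are copied from the M_CHRISTMAS line of the output.

```python
from datetime import date, timedelta

def shifted_dummy(days, holiday, shift):
    """Return a 0/1 list that is 1 on the day `shift` days after `holiday`.

    A negative shift is a lead (days before the holiday), matching the
    B**-k terms in the output; a positive shift is a lag (B**k).
    """
    target = holiday + timedelta(days=shift)
    return [1 if d == target else 0 for d in days]

# Hypothetical sample window: 20-28 December 2010.
start = date(2010, 12, 20)
days = [start + timedelta(days=i) for i in range(9)]
christmas = date(2010, 12, 25)

# One column per effect found for Christmas: 3, 2, 1 days before, and the day itself.
columns = {shift: shifted_dummy(days, christmas, shift) for shift in (-3, -2, -1, 0)}

# Coefficients copied from the M_CHRISTMAS line of the output.
coef = {-3: -3849.4, -2: -6717.9, -1: -13109.0, 0: -12588.0}

# Holiday contribution to each day's expected volume.
effect = [sum(coef[s] * columns[s][i] for s in coef) for i in range(len(days))]
```

Each day in the window gets at most one non-zero term, so `effect` is zero outside 22-25 December and equals the matching coefficient inside it.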

    You will see that when a holiday falls on a Friday, Autobox detected that volume is higher by 2,372 on the following Monday (the MONDAY_AFTER variable).

    9 monthly dummy variables made it into the model (e.g. MONTH_EFF01), reflecting strong seasonality. June and July had the highest levels, which makes sense for this type of data, which will remain confidential. The first data point is 1/1/2008.

    All 6 day-of-the-week dummy variables were found to be important (FIXED_EFF_N10107 is the first day of the week, counting from the start of the data). 1/1/2008 is a Tuesday, so FIXED_EFF_N10107 represents Tuesday, and Monday, the day without a dummy variable, is the baseline: the intercept is Monday’s level. Because every daily coefficient is negative, Mondays are the highest day of the week. Tuesday is the 2nd highest, as its decrease is the smallest of all the daily impacts.
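
The arithmetic behind that reading can be sketched in a few lines of Python. The coefficients are copied from the FIXED_EFF lines of the output. Mapping them to Tuesday through Sunday is an assumption that the days are numbered from the first data point (Tuesday, 1/1/2008), and the resulting levels deliberately ignore the month, holiday, and intervention terms.

```python
intercept = 25214.0  # Monday's level: the only day without a dummy variable

# FIXED_EFF_N10107 ... FIXED_EFF_N10607, assumed to be Tuesday ... Sunday.
day_effects = {
    "Tuesday": -3874.7,
    "Wednesday": -6180.7,
    "Thursday": -9065.5,
    "Friday": -12644.0,
    "Saturday": -13628.0,
    "Sunday": -9287.8,
}

# Baseline level for each day of the week: intercept plus that day's effect.
levels = {"Monday": intercept}
levels.update({day: intercept + eff for day, eff in day_effects.items()})

# Days ordered from highest to lowest baseline level.
ranked = sorted(levels, key=levels.get, reverse=True)
```

Under these assumptions `ranked` starts with Monday and Tuesday, which matches the description above.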

    Autobox found changes in level; the output below lists 6 level shifts. We will describe one of them for you (i.e. +[X28(T)[(+ 5422.1 )] :LEVEL SHIFT 157/ 1 12/28/2010 I~L01093__010108). A level shift is a dummy variable that is 0 for all time periods before period 157/1 and 1 for all time periods from 157/1 onward.

    Autobox also found changes in the day-of-week pattern; the output below lists 4 seasonal pulses. We will describe one of them for you. The day-of-the-week pattern changed for Mondays starting on May 10, 2010, with an increase in volume of 3,291 (123/7 means week 123, day 7) (i.e. +[X27(T)[(+ 3291.2 )] :SEASONAL PULSE 123/ 7 5/10/2010 I~S00861__010108).

    Autobox found about 30 one-time interventions (pulses). We will describe one of them for you (i.e. +[X50(T)[(- 5699.5 )] :PULSE 162/ 6 2/ 6/2011 I~P01133__010108). These observations need to be adjusted to where they should have been so that the model can be identified. The goal is to separate signal from noise, and these are noise. The level shifts and seasonal pulses are more about adapting to changes in the data.
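
The three intervention types can be written as simple 0/1 columns. The sketch below is one plausible reading of the output, not Autobox code: the function names, the `position` helper, and the series length `n` are assumptions made for illustration. The week/day labels and the start date are taken from the output and the post.

```python
from datetime import date, timedelta

FIRST_DAY = date(2008, 1, 1)  # first data point, as stated in the post

def position(week, day):
    """Convert a week/day label such as 123/7 to a 0-based day index."""
    return (week - 1) * 7 + (day - 1)

def pulse(n, t0):
    """1 at t0 only: a one-time intervention."""
    return [1 if t == t0 else 0 for t in range(n)]

def level_shift(n, t0):
    """0 before t0, 1 from t0 onward: a lasting change in level."""
    return [1 if t >= t0 else 0 for t in range(n)]

def seasonal_pulse(n, t0, period=7):
    """1 at t0 and every `period` days after it: a change for one weekday."""
    return [1 if t >= t0 and (t - t0) % period == 0 else 0 for t in range(n)]

n = 1300  # hypothetical series length (about 3.5 years of days)
shift_start = position(157, 1)   # LEVEL SHIFT 157/ 1
monday_start = position(123, 7)  # SEASONAL PULSE 123/ 7
outlier = position(162, 6)       # PULSE 162/ 6

ls = level_shift(n, shift_start)
sp = seasonal_pulse(n, monday_start)
p = pulse(n, outlier)

# Convert the positions back to calendar dates for comparison with the output.
shift_date = FIRST_DAY + timedelta(days=shift_start)
monday_date = FIRST_DAY + timedelta(days=monday_start)
outlier_date = FIRST_DAY + timedelta(days=outlier)
```

The recovered dates can be compared with the dates printed next to each intervention in the output, which is a quick check that the week/day labels are being read correctly.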

    AUTOMATIC FORECASTING SYSTEMS
    HATBORO PA 19040
    215-675-0652
    VERSION: 07/18/2011 10:55

    Y(T) = 25214.
    +[X1(T)][(- 3849.4 B**-3- 6717.9 B**-2- 13109. B**-1
    - 12588. )] M_CHRISTMAS
    +[X2(T)][(- 4431.9 B** 2)] M_GOODFRIDAY
    +[X3(T)][(- 3393.7 )] M_HALLOWEEN
    +[X4(T)][(- 9863.1 - 6806.3 B** 1)] M_JULY4TH
    +[X5(T)][(- 6864.9 B**-1- 6640.6 + 2845.0 B** 1)] M_LABORDAY
    +[X6(T)][(- 4162.5 B**-2- 9945.5 B**-1- 11929.
    + 2144.2 B** 1)] M_MEMORIALDAY
    +[X7(T)][(+ 2133.2 B**-3- 8762.5 B**-1- 9985.6 )] M_NEWYEARS
    +[X8(T)][(- 2332.3 )] M_STPATRICKS
    +[X9(T)][(- 3782.8 )] M_STVALENTINES
    +[X10(T)[(- 6843.6 B**-1- 14000. )] M_THANKSGIVING
    +[X11(T)[(+ 2372.7 )] MONDAY_AFTER
    +[X12(T)[(+ 3618.9 )] MONTH_EFF01
    +[X13(T)[(+ 2503.8 )] MONTH_EFF02
    +[X14(T)[(+ 2697.8 )] MONTH_EFF03
    +[X15(T)[(+ 1893.6 )] MONTH_EFF04
    +[X16(T)[(+ 3259.4 )] MONTH_EFF05
    +[X17(T)[(+ 4023.1 )] MONTH_EFF06
    +[X18(T)[(+ 5413.1 )] MONTH_EFF07
    +[X19(T)[(+ 3590.7 )] MONTH_EFF08
    +[X20(T)[(+ 1482.1 )] MONTH_EFF09
    +[X21(T)[(- 3874.7 )] FIXED_EFF_N10107
    +[X22(T)[(- 6180.7 )] FIXED_EFF_N10207
    +[X23(T)[(- 9065.5 )] FIXED_EFF_N10307
    +[X24(T)[(- 12644. )] FIXED_EFF_N10407
    +[X25(T)[(- 13628. )] FIXED_EFF_N10507
    +[X26(T)[(- 9287.8 )] FIXED_EFF_N10607
    +[X27(T)[(+ 3291.2 )] :SEASONAL PULSE 123/ 7 5/10/2010 I~S00861__010108
    +[X28(T)[(+ 5422.1 )] :LEVEL SHIFT 157/ 1 12/28/2010 I~L01093__010108
    +[X29(T)[(+ 12330. )] :PULSE 158/ 2 1/ 5/2011 I~P01101__010108
    +[X30(T)[(+ 4338.2 )] :LEVEL SHIFT 135/ 7 8/ 2/2010 I~L00945__010108
    +[X31(T)[(+ 2683.5 )] :SEASONAL PULSE 125/ 1 5/18/2010 I~S00869__010108
    +[X32(T)[(- 1383.9 )] :LEVEL SHIFT 115/ 1 3/ 9/2010 I~L00799__010108
    +[X33(T)[(+ 3450.8 )] :LEVEL SHIFT 72/ 7 5/18/2009 I~L00504__010108
    +[X34(T)[(+ 1634.4 )] :SEASONAL PULSE 127/ 2 6/ 2/2010 I~S00884__010108
    +[X35(T)[(+ 8455.9 )] :PULSE 158/ 1 1/ 4/2011 I~P01100__010108
    +[X36(T)[(- 11028. )] :PULSE 180/ 6 6/12/2011 I~P01259__010108
    +[X37(T)[(+ 3995.9 )] :PULSE 48/ 7 12/ 1/2008 I~P00336__010108
    +[X38(T)[(- 5103.7 )] :PULSE 123/ 6 5/ 9/2010 I~P00860__010108
    +[X39(T)[(- 1820.1 )] :SEASONAL PULSE 122/ 5 5/ 1/2010 I~S00852__010108
    +[X40(T)[(+ 5750.0 )] :PULSE 80/ 7 7/13/2009 I~P00560__010108
    +[X41(T)[(+ 2358.8 )] :LEVEL SHIFT 8/ 4 2/22/2008 I~L00053__010108
    +[X42(T)[(+ 4139.6 )] :PULSE 166/ 6 3/ 6/2011 I~P01161__010108
    +[X43(T)[(+ 5024.0 )] :PULSE 81/ 1 7/14/2009 I~P00561__010108
    +[X44(T)[(- 8005.2 )] :PULSE 13/ 4 3/28/2008 I~P00088__010108
    +[X45(T)[(- 4230.7 )] :PULSE 56/ 7 1/26/2009 I~P00392__010108
    +[X46(T)[(- 4668.5 )] :PULSE 129/ 6 6/20/2010 I~P00902__010108
    +[X47(T)[(+ 3724.0 )] :LEVEL SHIFT 40/ 2 10/ 1/2008 I~L00275__010108
    +[X48(T)[(- 4348.2 )] :PULSE 175/ 6 5/ 8/2011 I~P01224__010108
    +[X49(T)[(- 3875.2 )] :PULSE 161/ 4 1/28/2011 I~P01124__010108
    +[X50(T)[(- 5699.5 )] :PULSE 162/ 6 2/ 6/2011 I~P01133__010108
    +[X51(T)[(+ 4803.0 )] :PULSE 166/ 5 3/ 5/2011 I~P01160__010108
    +[X52(T)[(+ 4595.8 )] :PULSE 179/ 3 6/ 2/2011 I~P01249__010108
    +[X53(T)[(+ 5542.4 )] :PULSE 75/ 7 6/ 8/2009 I~P00525__010108
    +[X54(T)[(- 3874.8 )] :PULSE 57/ 6 2/ 1/2009 I~P00398__010108
    +[X55(T)[(+ 6330.4 )] :PULSE 137/ 7 8/16/2010 I~P00959__010108
    +[X56(T)[(+ 6900.2 )] :PULSE 157/ 7 1/ 3/2011 I~P01099I~P 0000
    +[X57(T)[(+ 4798.7 )] :PULSE 177/ 7 5/23/2011 I~P01239I~P 0000
    +[X58(T)[(- 6603.5 )] :PULSE 163/ 7 2/14/2011 I~P01141I~P 0000
    +[X59(T)[(- 6131.3 )] :PULSE 110/ 6 2/ 7/2010 I~P00769I~P 0000
    +[X60(T)[(- 7449.0 )] :PULSE 181/ 6 6/19/2011 I~P01266I~P 0000
    +[X61(T)[(+ 5430.0 )] :PULSE 80/ 1 7/ 7/2009 I~P00554I~P 0000
    +[X62(T)[(- 5853.1 )] :PULSE 161/ 7 1/31/2011 I~P01127I~P 0000
    +[X63(T)[(+ 6124.0 )] :PULSE 77/ 7 6/22/2009 I~P00539I~P 0000
    + [(1- .538B** 1)]**-1 [A(T)]
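
The last line of the output is the noise model. Reading B as the backshift operator, the printed term describes first-order autoregressive errors. Written out in LaTeX, as an interpretation of the printed notation rather than Autobox documentation (the symbols $N_t$, $a_t$, and $\omega_i(B)$ are introduced here for illustration):

```latex
% Backshift operator: B^k x_t = x_{t-k}; a negative power is a lead.
% Overall structure of the model: intercept, input terms, and noise.
Y_t = 25214 + \sum_{i=1}^{63} \omega_i(B)\, X_{i,t} + N_t

% Noise term printed as [(1 - .538B**1)]**-1 [A(T)]:
N_t = \frac{a_t}{1 - 0.538\,B}
\quad\Longleftrightarrow\quad
N_t = 0.538\, N_{t-1} + a_t
```

Here $a_t$ is the unexplained error at time $t$, so roughly half of yesterday's unexplained deviation carries over into today.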

  4. Thank you all, especially Tom, for the valuable suggestions. The article you sent is very mathematical, and I want to forecast using software. I have also read many articles on this issue, but they discuss it in terms of formulas and equations rather than software, so they were not much use. Your suggestion has worked, but the results from SPSS are not coming out as expected. So I am looking forward to the Autobox results.

  5. Mark,

    In our opinion, you should create a separate holiday variable for each holiday, so yes, Memorial Day and Labor Day would have separate dummy variables.

    I wasn’t clear earlier: the Monday-after and Friday-before effects are each measured across all holidays and not just for July 4th. In this sense it is a ‘portmanteau’ test.

    Yes, you should be working with 3 years of data when dealing with this type of analysis.

  6. Tom,

    Great comment. Thanks for posting. I have a feeling Anuradha is calling her transfer function an ARIMA model because that is what SPSS calls it.

    How about using the same intervention for both Memorial Day and Labor Day? In your view, will this produce better results?

    Also, if you want to model some of the effects you are talking about, including a holiday falling on a Monday or a weekend versus midweek, don’t you need multiple years’ worth of data to differentiate them? How far back do you need to go? If you have daily data for, say, three or four years, will this do?

    In three or four years’ worth of data, the 4th of July could have fallen twice on weekend days and twice midweek. Will the effect be almost insignificant here?

    regards,
    Mark

  7. Anuradha,

    We recommend using daily data to model calendar effects (i.e. holidays, promotions). We also recommend that you DON’T use a single dummy variable to account for all your events: events tend to behave very differently, so you want a separate variable to represent each one clearly. Using one variable for all holidays is an improper way to handle this modeling, and software like SPSS will let you do this at your peril.
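
The difference between one pooled event variable and separate holiday variables can be sketched in Python. The three-holiday calendar, the five-month window, and the variable names below are assumptions chosen for illustration; a real model would carry one column for every holiday and promotion.

```python
from datetime import date, timedelta

# Hypothetical holiday calendar for illustration (2011 US dates).
holidays = {
    "MEMORIALDAY": date(2011, 5, 30),
    "JULY4TH": date(2011, 7, 4),
    "LABORDAY": date(2011, 9, 5),
}

start = date(2011, 5, 1)
days = [start + timedelta(days=i) for i in range(153)]  # 1 May to 30 Sep 2011

# Pooled approach: a single column, so every holiday shares one coefficient.
pooled = [1 if d in holidays.values() else 0 for d in days]

# Separate approach: one column per holiday, so each gets its own coefficient.
separate = {
    name: [1 if d == when else 0 for d in days]
    for name, when in holidays.items()
}
```

The pooled column is just the sum of the separate columns, which is why it cannot tell a July 4th effect apart from a Labor Day effect.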

    As for modeling calendar effects, you should be using a transfer function model and not just ARIMA. You should include “fixed effects” such as day-of-the-week dummies, week-of-the-year dummies, and individual promotions and holidays as separate variables. In addition, you need to look out for fixed days of the month, and for holidays that land on Fridays, weekends, or Mondays, as they have a different impact than they do on Tuesdays, Wednesdays, and Thursdays. This is called a “bridge day”, and you can see more on this here: http://www.google.com/url?sa=t&source=web&cd=1&ved=0CBgQFjAA&url=http%3A%2F%2Fwww.nvc.vt.edu%2Flmili%2Fdocs%2FChakhchoukh_Mili_Robust%2520Short-Term%2520Load%2520Forecasting%2520using%2520projection%2520Statistics.pdf&ei=rusdTrfyJOe50AHskIWuBw&usg=AFQjCNE-jYozo_rC1WVQx2xv781w3rJ93Q
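
Whether a holiday lands next to a weekend can be checked directly from the calendar. The sketch below classifies a holiday date by weekday using Python's standard `datetime` module. The function name `landing_type`, its four labels, and the choice of years 2008 to 2011 are assumptions made for illustration.

```python
from datetime import date

def landing_type(holiday):
    """Classify a holiday date by where it falls in the week."""
    weekday = holiday.weekday()  # Monday == 0 ... Sunday == 6
    if weekday == 0:
        return "monday"
    if weekday == 4:
        return "friday"
    if weekday >= 5:
        return "weekend"
    return "midweek"  # Tuesday, Wednesday, or Thursday

# July 4th in each of four consecutive years (hypothetical sample span).
july4 = {year: landing_type(date(year, 7, 4)) for year in range(2008, 2012)}

# Years in which a separate Friday/weekend/Monday variable would be switched on.
adjacent_years = [year for year, kind in july4.items() if kind != "midweek"]
```

A table like `july4` shows how many observations of each landing type a given history actually contains, which is the information needed to judge whether such an effect can be estimated.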

    We have implemented all of the above automatically in our software Autobox. We would be happy to take a look at your data and send you the model/forecast so you can benchmark vs. your model.

    Tom Reilly
    VP Sales
    215-675-0652
