
Unlocking the Power of Big Data in Sales Forecasting: Insights from Abhishek Borah & Oliver Rutz


Written by Veronica Burbulea, Ph.D. candidate at the University of Groningen (The Netherlands)


A shared passion for music and research brought Abhishek Borah (INSEAD) and Oliver Rutz (Foster School of Business, University of Washington) together. They met during their PhDs in Los Angeles, later worked together at the Foster School of Business at the University of Washington in Seattle, and continued collaborating after Abhishek moved to INSEAD in France. I had the opportunity to discuss with them their paper, “Enhanced sales forecasting model using textual search data: Fusing dynamics with big data,” published in the December 2024 issue of IJRM. The paper illustrates how a new data source can be applied to an existing marketing challenge to improve predictive accuracy.




 

How did everything start?


Many studies have examined Google Trends, a rich source of information, but mainly at an aggregate level. While brainstorming over a beer, Abhishek and Oliver realized that online competitive search data could enhance sales forecasting. Traditionally, sales forecasting models relied on a firm’s own sales data and a limited set of competitors. By incorporating a more complete set of online competitive search data, they saw the potential to take forecasting to another level.



Tackling the large p, small n problem


An early challenge was incorporating data on the full set of competitors without aggregating it. The competitive search data produced hundreds of covariates for each car model analyzed. With monthly data, estimating a conventional model on that many covariates would require 97 years of sales history (97 × 12 = 1,164 monthly observations). This is a statistical issue known as the "large p, small n" problem: far more predictors (p) than observations (n). To address it, Oliver and Abhishek developed a method that retains all the “important” covariates in the model while shrinking the less important ones towards zero. In simple terms, it works as a “stretchable fishing net that retains all the big fish”. With this adjustment, the forecasting model fits and predicts sales better than models that do not leverage the full set of competitive consumer search data.
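To make the shrinkage idea concrete, here is a minimal sketch in Python. It is not the authors' estimator: it uses a plain lasso fitted by coordinate descent as a stand-in for "keep the big fish, shrink the rest", and everything in it (the simulated data, the 60 months, the 300 covariates, the penalty value, the function names) is an assumption chosen for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink z towards zero by t; values within [-t, t] become exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimise (1 / 2n) * ||y - X b||^2 + penalty * ||b||_1 by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-column scale x_j'x_j / n
    residual = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of covariate j with the residual that excludes its own effect
            rho = X[:, j] @ residual / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, penalty) / col_sq[j]
            # Keep the residual in sync with the updated coefficient
            residual += X[:, j] * (beta[j] - new_bj)
            beta[j] = new_bj
    return beta


# Simulated "large p, small n" setting (hypothetical numbers):
# 60 monthly observations, 300 competitor-search covariates,
# of which only the first three truly drive sales.
rng = np.random.default_rng(0)
n_months, n_covariates = 60, 300
X = rng.standard_normal((n_months, n_covariates))
true_beta = np.zeros(n_covariates)
true_beta[:3] = [3.0, -2.0, 1.5]
y = X @ true_beta + 0.1 * rng.standard_normal(n_months)

beta_hat = lasso_coordinate_descent(X, y, penalty=0.2)
n_kept = int(np.count_nonzero(beta_hat))
print(f"covariates kept: {n_kept} of {n_covariates}")
print("largest coefficients:", np.round(beta_hat[:3], 2))
```

Ordinary least squares has no unique solution here because there are more covariates than months. The penalty makes the problem well-posed by setting most coefficients exactly to zero while keeping the strong ones, at the cost of pulling them slightly towards zero.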


Consumer behavior has drastically changed


Consumer search behavior has evolved significantly over the past two decades because of the explosion in choices and information sources. Forecasting models that consider only four or five competitors fail to capture the complete picture. In the past, car buyers relied on dealers, word of mouth, or car magazines, and their search was often confined to a single category, such as sedans. Today, online search makes it easy to explore alternatives across automobile categories. This shift is a crucial factor that sales forecasting often overlooks: the set of competitors a product actually faces is larger than one might assume.


“In the old days, searching for alternatives was costly. Nowadays, with the internet, the process of how people search for products has drastically changed. Search costs are much smaller.”

-  Oliver Rutz


Is forecasting still relevant?


Despite the technical nature of the paper, Abhishek and Oliver’s biggest challenge was convincing reviewers that sales forecasting remains relevant in the automotive industry. Conversations with car company managers confirmed that forecasting remains a critical business practice.


“We received fewer arguments about our model choice. Surprisingly, the most controversial aspect of the review process was something we assumed would be non-controversial: the relevance of sales forecasting.”

-  Abhishek Borah


 

Forecasting & AI


The forecasting model developed by Oliver and Abhishek, which uses Google Trends to determine competitors, outperforms all other models that rely on sources like Wards, CarGurus, Cluster, and ChatGPT. While they believe that GenAI and large language models (LLMs) will eventually improve enough to provide a full set of competitive data, that time has not yet arrived. Asking ChatGPT to provide competitive data typically results in generalized relationships between car models rather than an accurate competitive landscape. Similarly, while AI excels at generating R or Python code for model development, it is still inconsistent at data analysis, which makes it unreliable for now.


This inconsistency marks the current limits of AI-driven data analysis. AI holds great potential, but it still struggles to deliver accurate, consistent, and actionable insights. As Oliver pointed out:


“Will there be a future where GenAI does everything by themselves? Probably yes. Will we all be unemployed in this future? Probably yes. Is that a good future? Probably no.”

-  Oliver Rutz

 

Is bigger always better? The reality of big data


Big data is not inherently valuable; its usefulness depends on what makes it "big." If millions of consumers behave similarly, a small sample often suffices. However, if the data’s richness lies in its diversity, such as hundreds of behavioral measures, then scale adds meaningful insight. This is particularly relevant to Oliver and Abhishek’s forecasting model: it incorporates a large amount of competitive search data, but it is the diversity and granularity of that data, covering a wide range of competitors, that enhance its predictive power. Researchers should therefore ask what differentiates a large dataset from a smaller one, because proper modeling requires understanding which aspect of data scale actually improves forecasts.
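The first half of that argument, that a small sample often suffices when consumers behave similarly, can be illustrated with a short Python sketch. The population below is simulated and entirely hypothetical (one million consumers, a single behavioral measure drawn from one distribution); it only shows the statistical point and is not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical homogeneous population: one million consumers whose
# monthly search counts all come from the same distribution.
population = rng.normal(loc=20.0, scale=4.0, size=1_000_000)

# A random sample of 2,000 consumers, i.e. 0.2% of the population.
sample = rng.choice(population, size=2_000, replace=False)

full_mean = population.mean()
sample_mean = sample.mean()
# Typical size of the sampling error for a mean: sd / sqrt(n)
standard_error = population.std() / np.sqrt(sample.size)

print(f"mean from all consumers: {full_mean:.2f}")
print(f"mean from the sample:    {sample_mean:.2f}")
print(f"standard error:          {standard_error:.3f}")
```

Because the error of a sample mean shrinks with the square root of the sample size, going from 2,000 to 1,000,000 similar consumers adds little here. Extra rows help far less than extra, genuinely different measures.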


Read the paper

Interested in reading all the details about using textual search data to enhance sales forecasting models? Read the full paper here.


Want to cite the paper?

Borah, A., & Rutz, O. (2024). Enhanced sales forecasting model using textual search data: Fusing dynamics with big data. International Journal of Research in Marketing, 41(4), 632-647.


Meet the authors

Abhishek Borah

Associate Professor of Marketing at INSEAD, France


If you were not a marketing researcher, what would you be?

“Without any doubts, I would be a musician.”


What is the best advice you have ever received, and how has it influenced your career or life?

“Writing is a much more important part of our work than we often realize. My advisor is an excellent writer, and he advised me to keep things simple. I used to read and write very complex novels, but now I focus on keeping it simple and making it simpler, no matter how complex the ideas may be.”



Oliver Rutz

Marion B. Ingersoll Professor of Marketing, Foster School of Business, University of Washington, USA


If you were not a marketing researcher, what would you be?

“I would either be a professional skier or a professional golfer, but I am not good enough at either to make any money.”


What is the best advice you have ever received, and how has it influenced your career or life?

“Academic writing is quite strange. Don Morrison at UCLA once told me that a good paper does three things: 'Tell them what you will tell them, tell them, tell them what you told them.' That structure, with its built-in repetition, took me a while to fully appreciate, but it’s really the foundation of a well-structured paper.


“My advisor also had a useful framework, a two-by-two matrix, that categorized papers based on substantive and methodological contributions. You can get published with one or the other, but having both is ideal. Over time, I started evaluating research ideas through this lens: Is it substantively new? Does it introduce a novel method? If a project didn't score well on either, I learned to walk away early.”


This article was written by

Veronica Burbulea

Ph.D. candidate at the University of Groningen (the Netherlands)


 
 
 
