Welcome back to our exploration of statistical methods in engineering and business research. Earlier, we considered in qualitative terms how descriptive statistical analysis helps one make sense of past occurrences. Now we turn to the transition from descriptive analysis to predictive modeling.
From Descriptive to Predictive Analytics
Descriptive analytics is one part of the analytics toolkit: it summarizes historical data to reveal the patterns and relationships it contains. Predictive analytics goes a step further by using that past data to forecast future occurrences. This shift lets organizations move from understanding past events to making better-informed decisions about what is likely to come next.
Key Differences Between Descriptive and Predictive Analytics
- Purpose
- Descriptive Analytics: Uses stored data to describe what happened up to the time of data collection.
- Predictive Analytics: Predicts events likely to happen in the future by analyzing occurrences of similar events in the past.
- Process Involved
- Descriptive Analytics: Pulls together large amounts of data and analyzes them to extract relevant information.
- Predictive Analytics: Employs statistical and forecasting techniques to estimate future outcomes.
- Definition
- Descriptive Analytics: Unveils significant and valuable information through the analysis of large datasets.
- Predictive Analytics: Produces predictions about the future, which are useful in determining an organization's future actions and goals.
- Data Volume
- Descriptive Analytics: Handles large volumes of accumulated data to generate desired information.
- Predictive Analytics: Uses sophisticated algorithms to find patterns in a large amount of past activity data and applies them to forecast future results.
- Examples
- Descriptive Analytics: Sales figures, revenue for the company, the evaluation of the organization.
- Predictive Analytics: Supply and demand reports, consumer opinions on products and services, credit rating reports, sales and market forecasts.
- Accuracy
- Descriptive Analytics: Is precise, because it relies on historical records to present information accurately.
- Predictive Analytics: Provides only forecasts of what could take place but does not guarantee the actual outcomes.
- Approach
- Descriptive Analytics: Enables a reactive strategy providing information and comprehension of past events.
- Predictive Analytics: Assists in taking preventative action and helps prepare for potential developments.
Figure 1: Predictive Vs Descriptive Analysis
(Courtesy: https://www.educba.com/predictive-analytics-vs-descriptive-analytics/)
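To make the contrast concrete, here is a minimal sketch in Python. The sales figures are invented for illustration, and the function names `describe` and `moving_average_forecast` are hypothetical helpers, not part of any library. The descriptive step only summarizes what has already happened; the predictive step uses a simple moving average (one of many possible forecasting choices) to estimate the next value.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative values only).
monthly_sales = [120, 135, 128, 150, 162, 158]


def describe(values):
    """Descriptive step: summarize what has already happened."""
    return {
        "total": sum(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


def moving_average_forecast(values, window=3):
    """Predictive step: estimate the next value as the mean of the last `window` values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return mean(values[-window:])


summary = describe(monthly_sales)
forecast = moving_average_forecast(monthly_sales, window=3)

print("Summary of past sales:", summary)
print("Forecast for next month:", forecast)
```

The summary is exact because it is computed from recorded data, while the forecast is only an estimate and may differ from what actually happens next month.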
Key Predictive Modeling Techniques
- Regression Analysis: Estimates coefficients that describe the relationship between variables. For example, linear regression is a statistical technique that forecasts a dependent variable from one or more independent variables.
- Time Series Analysis: Is well suited to data gathered at regular time intervals; techniques such as ARIMA models are used for financial forecasting and demand planning.
- Classification Algorithms: Are data classification mechanisms categorizing data into predetermined classes. Logistic regression, decision trees, and support vector machines are popular algorithms used to classify data sets and predict results.
- Clustering: Is related to classification but works without predefined classes: it groups similar data points based on the data alone. Techniques such as k-means and hierarchical clustering help identify patterns and segment data for market analysis and customer profiling.
- Neural Networks and Deep Learning: Are state-of-the-art techniques loosely modeled on the structure and function of the brain; they can perform well on large datasets and complex patterns. Areas where these methods are applied include image recognition and natural language processing.
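As an illustration of the first technique in the list, the sketch below fits a simple linear regression (one independent variable) by ordinary least squares using only plain Python. The data values and the names `fit_simple_linear_regression` and `predict` are assumptions made for this example; in practice a statistics or machine learning library would usually do this work.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) that minimize squared error for y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def predict(x, slope, intercept):
    """Forecast the dependent variable for a new value of the independent variable."""
    return slope * x + intercept


# Hypothetical data: advertising spend (independent) vs. units sold (dependent).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_simple_linear_regression(ad_spend, units_sold)
print("slope:", slope, "intercept:", intercept)
print("forecast for spend of 6.0:", predict(6.0, slope, intercept))
```

The fitted slope is the coefficient describing how strongly the dependent variable responds to the independent one, which is what makes the model usable for forecasting values outside the observed data.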
Building Predictive Models
- Data Collection and Preparation: Collect data from the source, preprocess it, impute missing values, and finally engineer features that make the data more meaningful to the model.
- Exploratory Data Analysis (EDA): Explore the underlying patterns and relationships in the data to guide the choice of models and the way they should be tuned.
- Model Selection and Training: Select suitable models, train and fine-tune them on one part of the dataset, and hold out the rest for evaluation.
- Model Evaluation and Validation: Quantify prediction error with metrics such as mean absolute error (MAE) and root mean square error (RMSE). Use cross-validation to check that the results hold up on data the model was not trained on.
- Model Deployment and Monitoring: Use the models for real-time predictions and monitor their accuracy and effectiveness over time, retraining them when performance degrades.
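The evaluation step above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the "model" is a deliberately trivial baseline that always predicts the mean of its training data, the folds are contiguous and unshuffled, and all function names and the `demand` values are hypothetical.

```python
from math import sqrt


def mean_absolute_error(actual, predicted):
    """MAE: average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """RMSE: square root of the average squared difference."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds of near-equal size."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def cross_validate_mean_baseline(ys, k):
    """Average MAE and RMSE over k folds for a model that predicts the training mean."""
    maes, rmses = [], []
    for train_idx, test_idx in k_fold_indices(len(ys), k):
        train_mean = sum(ys[i] for i in train_idx) / len(train_idx)
        actual = [ys[i] for i in test_idx]
        predicted = [train_mean] * len(actual)
        maes.append(mean_absolute_error(actual, predicted))
        rmses.append(root_mean_squared_error(actual, predicted))
    return sum(maes) / k, sum(rmses) / k


# Hypothetical weekly demand values (illustrative only).
demand = [20, 22, 19, 24, 25, 23, 27, 26]
avg_mae, avg_rmse = cross_validate_mean_baseline(demand, k=4)
print("cross-validated MAE:", avg_mae)
print("cross-validated RMSE:", avg_rmse)
```

RMSE penalizes large errors more heavily than MAE, so comparing the two gives a rough sense of whether a model's errors are evenly spread or dominated by a few big misses. For time-ordered data, plain k-fold splitting can leak future information into training, so a split that always tests on later observations is usually a safer choice.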
Conclusion
Predictive modeling builds on descriptive analysis by enabling organizations to forecast and make data-driven decisions. Its techniques include regression analysis, time series analysis, and machine learning algorithms. In our next post, we will continue to explore predictive modeling by discussing individual methods in detail and presenting case studies from the engineering and business domains, as we keep unraveling the power of quantitative analysis together.