Predictive Analytics

Life in a company would be much easier if you could predict the future. As a manager in a company, you often suspect that you could actually foresee upcoming events based on existing data, but you just don't know how.

Examples:

Marketing campaigns: Advertising letters sent to the broad mass of potential customers often achieve only a minimal response. If you could identify and write only to the roughly 10% of customers who respond with a probability of 2% rather than 0.1%, you would save 90% of the costs, raise the response rate of the mailing from 0.29% overall to 2% and lose only 9 out of 29 responding customers.
Next best offer in the web shop: «Customers who bought this product were also interested in the following products: ...»
Drug development: A drug works, but only in certain patients. It is assumed that genetic factors play a role. The question is, which factors?
Customers are migrating to the competition. Does their behaviour foreshadow this so that you can counteract it if you recognise it in good time?
Fraud detection: Insurance companies are defrauded with fictitious claims. Identifying these cases is very time-consuming. Can pattern recognition narrow them down to a small number of suspicious cases, in order to limit the effort required for the final decision?
Predictive maintenance: A company has many identical or similar devices, machines or vehicles in use that can provide measurement data. These devices need to be serviced, ideally before they fail. Can the measurement data be used to predict an impending failure, so that it can be avoided by carrying out maintenance in good time?
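The figures in the marketing example can be checked with a little arithmetic. The sketch below assumes a hypothetical mailing list of 10,000 customers (the list size is not given in the text); the response rates are the ones quoted in the example.

```python
# Hypothetical list size, chosen only to make the numbers easy to read.
total_customers = 10_000

# Rates quoted in the marketing example.
targeted_share = 0.10    # share of customers selected for the mailing
targeted_rate = 0.02     # response rate inside the selected group
remaining_rate = 0.001   # response rate among everyone else

targeted = round(total_customers * targeted_share)
remaining = total_customers - targeted

responders_targeted = round(targeted * targeted_rate)
responders_missed = round(remaining * remaining_rate)
responders_total = responders_targeted + responders_missed

# Response rate if the letter went to everyone, and the saving from writing to 10% only.
overall_rate = responders_total / total_customers
cost_saving = 1 - targeted / total_customers

print(f"responders reached: {responders_targeted} of {responders_total}")
print(f"responders missed:  {responders_missed}")
print(f"overall rate:       {overall_rate:.2%}")
print(f"cost saving:        {cost_saving:.0%}")
```

Under these assumptions, 20 of the 29 responders are reached with one tenth of the letters, and 9 are missed.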

Artificial intelligence here means pattern recognition, also known as “data mining”. In everyday life, it is familiar from character, speech and image recognition and from translation programs. However, pattern recognition can also be applied for forecasting purposes to the kind of data that many companies already hold. This is still rarely done, even though it is much simpler and requires less “training” than image recognition, for example. When data is evaluated with a view to making predictions, this is referred to as “predictive analytics”, in contrast to traditional evaluations, which use historical data to explain what has happened in a company or why it happened. Forecasts are often more helpful than mere explanations of the past.

There are already many tools for predictive analytics in the IT world today, both open source and commercially licensed. The aim is always to create a forecast model automatically with the help of such a tool. The approach is to use data for which the result is already known. This data set is divided into one set for creating (training) the model and another set that is used to verify the forecasts of the trained model. The data is split, for example, in a ratio of 4:1 or 9:1. Once the model has been verified, it can be applied to new data for which the result (the forecast) is not yet known. As long as the circumstances do not change, the quality of the forecast can be expected to remain about the same.
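The procedure of splitting, training, verifying and applying can be sketched in a few lines. This is only an illustration with invented data and a deliberately trivial "model" (a single threshold); the function names are made up for this sketch and do not come from any particular tool.

```python
import random

def split_data(rows, train_share=0.8, seed=42):
    """Shuffle the labelled rows and split them (here 4:1) into training and verification sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_share)
    return shuffled[:cut], shuffled[cut:]

def train_threshold_model(train_rows):
    """Toy model: the midpoint between the mean feature value of the two classes."""
    yes = [x for x, label in train_rows if label]
    no = [x for x, label in train_rows if not label]
    return (sum(yes) / len(yes) + sum(no) / len(no)) / 2

def predict(threshold, x):
    """Forecast 'yes' (True) if the feature value lies above the learned threshold."""
    return x > threshold

def accuracy(threshold, rows):
    """Share of rows for which the forecast matches the known result."""
    hits = sum(predict(threshold, x) == label for x, label in rows)
    return hits / len(rows)

# Invented data with known results: (feature value, result).
data = [(x, x >= 50) for x in range(100)]

train_rows, verify_rows = split_data(data)      # 80 rows for training, 20 for verification
model = train_threshold_model(train_rows)

print(f"accuracy on verification set: {accuracy(model, verify_rows):.0%}")
print("forecast for a new value of 80:", predict(model, 80))
```

The verification rows play no part in training, so the accuracy measured on them indicates how the model is likely to behave on new data.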

It is important to be aware that a model can never produce absolutely valid forecasts, but only predictions that apply with a certain probability. The trained models are able to show this probability.

Data preparation: Before creating a forecast model, the data must be found, understood, merged and cleansed. Sometimes data must also be normalized (for example, converting all foreign currency amounts into CHF) or values translated (0 = No, 1 = Yes). During cleansing, attention is paid to inconsistent, incomplete or incorrect values and to “outliers”. For repeated tests or forecasts, these steps are automated, if possible with the same tool as for model creation, or otherwise with an ETL tool (ETL = extract, transform, load).
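A minimal sketch of such preparation steps follows. The field names, the exchange rates and the outlier limit are invented for illustration; in practice they would come from the company's own data and rules.

```python
# Hypothetical exchange rates into CHF, for illustration only.
RATES_TO_CHF = {"CHF": 1.0, "EUR": 0.95, "USD": 0.90}
YES_NO = {0: "No", 1: "Yes"}

def clean_record(record):
    """Return a normalized copy of one raw record, or None if it is unusable."""
    amount = record.get("amount")
    currency = record.get("currency")
    flag = record.get("newsletter")
    # Reject incomplete or inconsistent rows.
    if amount is None or currency not in RATES_TO_CHF or flag not in YES_NO:
        return None
    return {
        "amount_chf": round(amount * RATES_TO_CHF[currency], 2),  # normalize currency
        "newsletter": YES_NO[flag],                               # translate 0/1
    }

def drop_outliers(records, max_amount_chf):
    """Remove rows whose amount exceeds a limit chosen by the analyst."""
    return [r for r in records if r["amount_chf"] <= max_amount_chf]

raw = [
    {"amount": 100.0, "currency": "EUR", "newsletter": 1},
    {"amount": 200.0, "currency": "CHF", "newsletter": 0},
    {"amount": None, "currency": "USD", "newsletter": 1},         # incomplete
    {"amount": 50.0, "currency": "XXX", "newsletter": 0},         # unknown currency
    {"amount": 1_000_000.0, "currency": "USD", "newsletter": 1},  # outlier
]

cleaned = [c for c in map(clean_record, raw) if c is not None]
prepared = drop_outliers(cleaned, max_amount_chf=10_000)
print(prepared)
```

Only the first two records survive: the others are incomplete, use an unknown currency or exceed the assumed outlier limit.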

Most tools offer different models. Depending on the purpose of the forecast and the type of data, one or more models may be used. If several models come into question, you can also work with several at first and then choose the one that delivers the best results.
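Choosing between several candidate models can be sketched as a comparison on the same verification data. The two "models" below are hypothetical stand-ins (plain functions with fixed thresholds) and the data is invented; a real tool would compare its trained models in a similar way.

```python
def accuracy(model, rows):
    """Share of verification rows for which the model's forecast matches the known result."""
    return sum(model(x) == label for x, label in rows) / len(rows)

# Two hypothetical, already trained candidate models.
candidates = {
    "threshold_40": lambda x: x > 40,
    "threshold_50": lambda x: x > 50,
}

# Invented verification data: (feature value, known result).
verify_rows = [(30, False), (45, False), (48, False), (55, True), (70, True)]

# Score every candidate on the same rows and keep the best one.
scores = {name: accuracy(model, verify_rows) for name, model in candidates.items()}
best = max(scores, key=scores.get)

print(scores)
print("best model:", best)
```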

The available models can be classified in three ways:

By how they work or decide
By the nature of the underlying data and forecast values
By what is actually predicted

The three classifications in detail:

In terms of how they work, there are three types of models:

Those that work with statistical methods such as linear regression.
Those that find the decisive characteristics in the data and build decision trees from them (for example: cars are more likely to be bought by men aged 25 to 50, while occupation plays no role).
Neural networks, which output probabilities for their forecasts but cannot provide an explanatory component. The disadvantage of such models is therefore that it is not possible to understand which data characteristics the model used to arrive at a particular forecast.
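To illustrate how a tree-based model finds the decisive characteristic, the following sketch evaluates a one-level decision tree (a so-called decision stump) for each feature and picks the feature that predicts the outcome best. It is a strong simplification: real tools build multi-level trees and use more refined split criteria. The customer data is invented.

```python
from collections import Counter, defaultdict

def stump_accuracy(rows, feature, target):
    """Accuracy of predicting the majority outcome within each value of one feature."""
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[feature]][row[target]] += 1
    # For every feature value, the majority outcome counts as the forecast.
    correct = sum(counts.most_common(1)[0][1] for counts in groups.values())
    return correct / len(rows)

def most_decisive_feature(rows, features, target):
    """Return the feature whose one-level split predicts the target best."""
    return max(features, key=lambda f: stump_accuracy(rows, f, target))

# Invented example data.
customers = [
    {"age_group": "25-50",    "occupation": "clerk",    "bought_car": True},
    {"age_group": "25-50",    "occupation": "engineer", "bought_car": True},
    {"age_group": "25-50",    "occupation": "clerk",    "bought_car": True},
    {"age_group": "under 25", "occupation": "engineer", "bought_car": False},
    {"age_group": "over 50",  "occupation": "clerk",    "bought_car": False},
    {"age_group": "over 50",  "occupation": "engineer", "bought_car": False},
]

features = ["age_group", "occupation"]
for f in features:
    print(f, round(stump_accuracy(customers, f, "bought_car"), 2))
print("decisive feature:", most_decisive_feature(customers, features, "bought_car"))
```

In this invented data the age group separates buyers from non-buyers perfectly, while occupation barely helps. Because the chosen split is visible, this kind of model can explain its forecast.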

With regard to the underlying data and the forecast values, the models can be divided into those that can handle continuous decimal values and those that can handle only discrete values such as 1, 2, 3 or “red”, “green”, “blue”. Some models tolerate missing values in the initial data, others do not.

The forecast value is often a Boolean value, i.e. «yes» or «no», but can also be a value on a continuous scale.

The models can also be grouped according to the «what»:

Yes/No forecasts
Associations (if A was bought, then B as well)
Cluster formation (groups with similar characteristic values)
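An association of the form «if A was bought, then B as well» can be quantified with two measures commonly used for such rules: support (the share of all purchases containing both A and B) and confidence (the share of purchases with A that also contain B). The sketch below uses invented shopping baskets; the function name is made up for this illustration.

```python
def association(baskets, a, b):
    """Support and confidence of the rule 'if a was bought, then b as well'."""
    with_a = [basket for basket in baskets if a in basket]
    with_both = [basket for basket in with_a if b in basket]
    support = len(with_both) / len(baskets)
    confidence = len(with_both) / len(with_a) if with_a else 0.0
    return support, confidence

# Invented shopping baskets.
baskets = [
    {"printer", "paper"},
    {"printer", "paper", "ink"},
    {"printer", "ink"},
    {"paper"},
]

support, confidence = association(baskets, "printer", "paper")
print(f"support: {support:.2f}, confidence: {confidence:.2f}")
```

Here two of the four baskets contain both products, and two of the three printer purchases also include paper.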
