The basic custom alert definition is based on static rules. This works well when the threshold for the exception condition is known and the data is generally consistent.
If the data is variable and its pattern changes, static rules may produce too many or too few alerts. With machine learning, planners can set up alerts without needing to know the exact thresholds.
For example, if you define alerts for your product's sales quantity based on the ABC indicator, the rules differ by product class: for product A (a fast seller), the alert could be set at a 5 percent decline in sales, whereas for product C (a slow seller), it could be set at a 20 percent decline. You would therefore need several different alerts to convey the same result, namely that your sales quantity has dropped. With machine learning, the grouping of your ABC products can be done automatically. It also allows you to find the outliers within the groupings.
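The static-rule approach described above can be sketched as follows. This is a minimal illustration, not the product's implementation: the function name `static_alert` and the dictionary `DECLINE_THRESHOLDS` are hypothetical, and only the 5 percent and 20 percent values come from the example.

```python
# Hypothetical static thresholds per ABC class (values from the example above).
DECLINE_THRESHOLDS = {"A": 0.05, "C": 0.20}


def static_alert(abc_class, previous_qty, current_qty):
    """Return True if the relative sales decline reaches the class threshold."""
    if previous_qty <= 0:
        return False  # no baseline to compare against
    decline = (previous_qty - current_qty) / previous_qty
    return decline >= DECLINE_THRESHOLDS[abc_class]


# The same 6 percent decline triggers an alert for a fast seller only.
fast_seller_alert = static_alert("A", 1000, 940)
slow_seller_alert = static_alert("C", 1000, 940)
```

Each class needs its own hand-maintained threshold, which is the maintenance burden that the machine learning alert definitions are meant to remove.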
Parameters can be set to get a "healthy" number of alerts.
Machine learning adjusts to changing data patterns.
Users can either use ML alert definitions on their own or put them on top of existing alerts with static rules.
You can apply one of two clustering-based algorithms: k-means or density-based spatial clustering of applications with noise (DBSCAN). The number of alerts can be reduced or increased. These types of rules are mainly used for detecting outliers when you do not know your threshold values in advance and when the data shifts over time. Additionally, if you change the data, the alert definition is adjusted automatically.
The DBSCAN and k-means clustering methods are used to find and group points on a chart that are close together. They also help in identifying the outliers (isolated points in low-density regions outside the groupings).
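To make the idea of dense groupings and isolated points concrete, here is a minimal pure-Python sketch of the textbook DBSCAN procedure on two-dimensional points. It is an illustration only and does not reflect how the product implements the algorithm; the parameter names `eps` and `min_pts` and the sample points are assumptions.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point of this cluster
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, keep expanding
    return labels


# Hypothetical sample: two dense groups and one isolated point.
points = [(1, 1), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
          (5, 5), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
          (9, 1)]
labels = dbscan(points, eps=0.5, min_pts=3)
outliers = [p for p, lab in zip(points, labels) if lab == -1]
```

With these sample values, the first eight points fall into two clusters and the isolated point `(9, 1)` is labelled as noise, which corresponds to an outlier in the description above.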
Recommendation
When working with very large numbers of records, we recommend that you schedule the jobs to improve the performance of the alerts monitor, because the operation times out after ten minutes. When the job that retrieves outliers is scheduled, the results are stored in a buffer, which makes them faster and easier to retrieve.
Note
Machine learning requires very intensive processing, so it is normal for your operation to take longer than when processing standard alerts.
DBSCAN
DBSCAN requires a minimum of 25 distinct records to complete the clustering accurately. With fewer records, the results are not meaningful. For example, suppose you have defined an alert based on the PRODID, LOCID, and Day aggregation level. You need at least 25 distinct records to use DBSCAN with PRODID as an aggregated attribute. In other words, each product should have records for a combination of at least five days and five locations (5 × 5 = 25).
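Before defining such an alert, it can help to check whether each product has enough distinct records. The following sketch counts distinct location and day combinations per product; the function name, the record layout as `(PRODID, LOCID, day)` tuples, and the sample data are assumptions made for illustration.

```python
MIN_DISTINCT_RECORDS = 25  # minimum stated above


def products_with_enough_records(records, minimum=MIN_DISTINCT_RECORDS):
    """Map each PRODID to True if it has at least `minimum` distinct (LOCID, day) pairs."""
    combos = {}
    for prodid, locid, day in records:
        combos.setdefault(prodid, set()).add((locid, day))
    return {prodid: len(pairs) >= minimum for prodid, pairs in combos.items()}


# Hypothetical data: P1 has 5 locations x 5 days, P2 has only 2 locations x 5 days.
records = [("P1", f"LOC{loc}", day) for loc in range(5) for day in range(5)]
records += [("P2", f"LOC{loc}", day) for loc in range(2) for day in range(5)]
enough = products_with_enough_records(records)
```

Here `P1` reaches the minimum with 25 combinations, while `P2` has only 10 and would not produce meaningful DBSCAN results.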
The DBSCAN algorithm checks attribute groupings and performs clustering on these attributes. In addition to determining outliers on one key figure, it uses multiple attributes during the calculation, which makes it more accurate than the k-means algorithm.
K-means
The second machine learning algorithm that you can use to define custom alerts is the k-means algorithm. In contrast to DBSCAN, it uses only key figure values and does not consider attributes. The algorithm is also used in ABC/XYZ segmentation to find clusters based on the segmentation measure (demand value for ABC and volatility for XYZ).
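The following minimal sketch shows k-means (Lloyd's algorithm) on key figure values alone. It is not the product's implementation. The deterministic choice of starting centroids is a simplification, and the rule in `far_from_centroid` for turning clusters into outliers is purely hypothetical, because this section does not state how outliers are derived from the k-means clusters.

```python
def kmeans_1d(values, k, iterations=20):
    """Lloyd's algorithm on a list of numbers; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: pick k values spread evenly across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, labels


def far_from_centroid(values, centroids, labels, factor=2.0):
    """Hypothetical rule: flag values farther from their centroid than
    `factor` times the mean distance within their cluster."""
    flagged = []
    for c in range(len(centroids)):
        dists = [abs(v - centroids[c]) for v, lab in zip(values, labels) if lab == c]
        if not dists:
            continue
        limit = factor * sum(dists) / len(dists)
        flagged += [v for v, lab in zip(values, labels)
                    if lab == c and abs(v - centroids[c]) > limit]
    return flagged


# Hypothetical key figure values; no attributes are involved.
quantities = [10, 12, 11, 13, 50, 52, 51, 49, 95]
centroids, labels = kmeans_1d(quantities, k=2)
flagged = far_from_centroid(quantities, centroids, labels)
```

Note that the input is a plain list of numbers: nothing in the calculation can distinguish records by an attribute such as the ABC indicator.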
Comparison of DBSCAN and K-means
The following simple example shows how the two algorithms differ in detecting outliers, which are then flagged as alerts.
DBSCAN: The algorithm first checks the quantity key figure and then checks the ABC indicator attribute. As a result, the algorithm identifies the only value of 100 for ABC indicator A as an outlier.
K-means: Because k-means uses only numerical values in its calculations, the algorithm ignores the ABC indicator attribute and considers only the quantity key figure. As a result, no outlier can be identified, because multiple records have the same value.
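The contrast between an attribute-aware and an attribute-blind check can be sketched with a simplified density test. The records below are hypothetical and are not the data of the original example. The test used here (fewer than `min_pts` values within `eps`) is a simplification of DBSCAN's noise rule, and running it per ABC group is only one assumed way to mimic attribute awareness. The attribute-blind pass stands in for any method that sees only the key figure, as described for k-means; it is not k-means itself.

```python
from collections import defaultdict


def sparse_indices(values, eps, min_pts):
    """Indices of values with fewer than min_pts values (itself included) within eps."""
    return [i for i, v in enumerate(values)
            if sum(abs(v - w) <= eps for w in values) < min_pts]


# Hypothetical (ABC indicator, quantity) records.
records = [("A", 10), ("A", 10), ("A", 10), ("A", 100),
           ("C", 100), ("C", 100), ("C", 100)]

# Attribute-blind: only the quantity is considered.
all_quantities = [qty for _, qty in records]
blind = [records[i] for i in sparse_indices(all_quantities, eps=5, min_pts=3)]

# Attribute-aware: the same check is run within each ABC group.
groups = defaultdict(list)
for abc, qty in records:
    groups[abc].append(qty)
aware = [(abc, quantities[i])
         for abc, quantities in groups.items()
         for i in sparse_indices(quantities, eps=5, min_pts=3)]
```

With this data, the attribute-blind pass flags nothing because the value 100 occurs several times overall, whereas the attribute-aware pass flags the single record with quantity 100 under ABC indicator A.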