CEP Outline

The Complex Event Processing Blog

UNDER CONSTRUCTION

Contents

Introduction


Complex event processing (CEP) is an emerging network technology that creates situational knowledge from distributed message-based systems, databases and applications in real time or near real time. CEP can give an organization the capability to define, manage and predict events, exceptional conditions and opportunities in complex, heterogeneous networks. Advances in CEP are expected to improve end-to-end visibility for operational situational awareness in many scenarios, ranging from network management to business optimization, and to bring enhanced situational knowledge, increased business agility, and faster, more accurate fraud detection and regulatory reporting.


Emerging Business Applications for Complex Event Processing Algorithms

In this section we examine how various mathematical techniques for multisensor data fusion apply to emerging business event correlation and CEP requirements.


Business Event Applications


Business Event Causality



Risk Mitigation


Opportunistic Trading


Sensor Networks


Complex Transactions


Fraud Detection


Fraud is a serious, costly, and growing social epidemic for both business and government. The Washington Post recently reported that credit card fraud costs the industry about a billion dollars a year, or about 7 cents out of every $100 spent on plastic. That is down significantly from its peak about a decade ago, in large part because of event-based processing that can recognize unusual spending patterns. Market and credit risk systems are rich repositories of data, but operational risk is often difficult to quantify. Employee fraud can take down even a powerful global bank, as when Nick Leeson caused $1.3 billion of losses at Barings Bank through illegal trading; event-driven fraud detection architectures could have prevented such cases.

Accurate figures for the scale of fraud in the telecommunications industry can be quite difficult to obtain. According to New Science Magazine, more than 15,000 mobile phones are stolen each month in Britain alone, and Swedish cellphone maker Ericsson has said that the fraudulent use of stolen mobile phones costs network operators between two and five per cent of revenue.

(more to come...)


Regulatory & Policy Compliance Monitoring


Intrusion Detection


Network Management


In terms of event processing and correlation, network management experts often define types of correlation in specific terms so they can compare vendors' products on an equal footing, as well as work towards common languages for developing new correlation techniques. With this in mind, all network management messages start out as raw information. The system designer does not know what constitutes an event until the architecture performs some level of categorization and/or normalization. This refinement process enables network management architects to build software processes that translate transitory raw information into triggers or events.


According to a leading network management expert at Cisco, the categories he uses are:


  • Event - Basic categorization and normalization, including event deduplication.
  • Alarm - Translation of one or more events into something more problem-specific.
  • Device - The trigger is associated with a device or a sub-component of a device.
  • Service - The event / alarm is associated with a specific service (e.g., an Oracle DB service).
  • System - The event / alarm is a function of a given business system (e.g., the company's accounting system).
  • Performance - Identifies and associates the event with the performance status of a given managed object.
  • Predictive - The event or event sequence is part of a pattern that signifies a future failure with a given probability.
  • Effectual - The given event horizon signifies an effect on other managed objects (MOs) with a given probability.
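
The first category above (normalization and deduplication of raw messages) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "device|kind|severity" message layout, the Event fields, and the function names are hypothetical and do not come from any particular network management product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A normalized event (hypothetical fields chosen for illustration)."""
    device: str
    kind: str
    severity: str


def normalize(raw: str) -> Event:
    """Parse an assumed 'device|kind|severity' message into a canonical Event."""
    device, kind, severity = (part.strip() for part in raw.split("|"))
    return Event(device.lower(), kind.upper(), severity.lower())


def deduplicate(events):
    """Keep the first occurrence of each distinct event, preserving order."""
    seen = set()
    unique = []
    for ev in events:
        if ev not in seen:
            seen.add(ev)
            unique.append(ev)
    return unique


raw_messages = [
    "Router1 | link_down | Critical",
    "router1|LINK_DOWN|critical",    # same event, different formatting
    "switch7 | fan_fail | warning",
]
events = deduplicate(normalize(m) for m in raw_messages)
for ev in events:
    print(ev)
```

Normalizing before deduplicating is the design choice that matters here: the two router messages only compare equal once case and whitespace differences have been removed. A real system would probably also bound deduplication by a time window, which this sketch leaves out.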


There are also a couple of time domains to be considered.

  • Fault Management - Near Real Time
  • Post-Fault Analysis - Historical

Predictive correlation may be accomplished by recognizing a specific event horizon, then sampling the list of events that occurred in a predetermined time frame prior to the significant event. When this is repeated over multiple samples, events that occur in all or most of the samples are expressed as a ratio of occurrence. The higher the ratio of occurrence, the higher the probability that the event is part of the predictive pattern. Ratios become much more interesting when network managers can characterize the relationship between the MO in the Event Horizon (EH) and the MO in the Sampled Event (SE). In effect, architects can recompute the ratios for different relationships, such as same node, node separation from the compared event's node, or event model type.
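
The sampling procedure described above might look something like the following Python sketch. It assumes the event log is a simple list of (timestamp, event type) pairs; the event names, the window length, and the function name precursor_ratios are all hypothetical.

```python
from collections import Counter


def precursor_ratios(log, target, window):
    """For each occurrence of `target`, sample the event types seen in the
    `window` seconds before it, then return, for each event type, the
    fraction of samples in which that type appeared."""
    target_times = [t for t, kind in log if kind == target]
    counts = Counter()
    for t_target in target_times:
        # Use a set so an event type counts at most once per sample.
        sample = {
            kind
            for t, kind in log
            if t_target - window <= t < t_target and kind != target
        }
        counts.update(sample)
    n = len(target_times)
    return {kind: c / n for kind, c in counts.items()} if n else {}


# Illustrative log: (timestamp in seconds, event type)
log = [
    (0, "HIGH_TEMP"), (5, "FAN_FAIL"), (10, "NODE_DOWN"),
    (100, "HIGH_TEMP"), (104, "LINK_FLAP"), (110, "NODE_DOWN"),
    (200, "LINK_FLAP"), (205, "HIGH_TEMP"), (210, "NODE_DOWN"),
]
ratios = precursor_ratios(log, "NODE_DOWN", window=15)
for kind, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(f"{kind}: {ratio:.2f}")
```

In this toy log, HIGH_TEMP precedes every NODE_DOWN, so it gets the highest ratio. Restricting the sample to events from the same node, or from nodes a given number of hops away, would be a filter added inside the set comprehension.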


Spam Filtering


Mathematical Techniques


Classical Inference


Complex event processing architects help business domain experts recognize and formulate statistical problems in day-to-day business decision-making. In this section we discuss traditional classical inference procedures and the basic statistical concepts that are often essential to the interpretation of business data. Classical inference topics include: discrete and continuous random variables, sampling, confidence intervals, hypothesis testing, and linear regression.


Bayesian Networks

Bayesian networks build on a statistical approach presented by the mathematician Thomas Bayes, whose work was published in 1763. They are used for calculating probabilities among several events that are causally related but whose relationships cannot easily be derived. Bayes provided a mathematical tool that combines prior knowledge with current data to produce a posterior probability distribution. Bayesian networks are directed graphs that organize complex events, or any body of knowledge, by mapping out cause-and-effect relationships among key events and encoding them with values that represent the probability that one event affects another. This approach allows network architects to combine new data with their existing knowledge or domain expertise.
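
As a small worked illustration of combining prior knowledge with current data, the Python sketch below applies Bayes' rule to a single hypothesis ("this transaction is fraudulent") and a single piece of evidence. The numbers and the function name are made up for the example, and the second update assumes the two pieces of evidence are conditionally independent.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Bayes' rule: P(H | E) = P(E | H) P(H) / P(E),
    where P(E) = P(E | H) P(H) + P(E | not H) (1 - P(H))."""
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence


# Illustrative numbers only: 1% of transactions are fraudulent, the alert
# fires on 90% of fraudulent and 5% of legitimate transactions.
p1 = posterior(prior=0.01, likelihood=0.9, likelihood_given_not=0.05)

# A second, independent alert: yesterday's posterior is today's prior.
p2 = posterior(prior=p1, likelihood=0.9, likelihood_given_not=0.05)

print(f"after one alert:  {p1:.3f}")
print(f"after two alerts: {p2:.3f}")
```

A full Bayesian network chains many such conditional probabilities along the edges of a directed graph; this sketch shows only the single update step that the network repeats.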


Decision-making using Bayesian networks has many applications [1]. One of the most visible examples is Microsoft's Office Assistant, which uses Bayesian networks to examine recent user events in a near-real-time attempt to understand what the user is trying to do, updating constantly as new user events occur. Many intrusion and fraud detection experts acknowledge that Bayesian networks are a key technology to facilitate complex event processing. Bayesian networks are at the heart of many decision support methodologies created to solve complex problems. The approach has its roots in symbolic and statistical artificial intelligence and in data mining, and is well established as a leading methodology for distributed knowledge representation that deals particularly well with uncertainty.


For example, Bayesian networks have been used to rapidly detect distributed network attacks, allowing for a generalization of network-based intrusion detection. Learning agents, critical for complex event processing, may deploy Bayesian network approaches to help determine Internet fraud threats. One of the main advantages of having probabilistic capabilities as a service in fraud and intrusion detection architectures is the ability to adjust detection sensitivity. This capability allows fraud detection architects to trade off between accuracy and sensitivity by adjusting probability thresholds. The automatic detection of network-based events by dynamic learning provides a way to train complex event processing algorithms to distinguish between normal and fraudulent activities.
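
The threshold trade-off mentioned above can be made concrete with a toy Python sketch. It assumes some model has already assigned each transaction a fraud probability; the scores, labels, and function name are invented for illustration.

```python
def rates(scored, threshold):
    """Given (fraud_probability, is_fraud) pairs, flag everything at or above
    `threshold` and return (detection rate, false-alarm rate)."""
    flagged_fraud = sum(1 for p, is_fraud in scored if is_fraud and p >= threshold)
    flagged_legit = sum(1 for p, is_fraud in scored if not is_fraud and p >= threshold)
    total_fraud = sum(1 for _, is_fraud in scored if is_fraud)
    total_legit = len(scored) - total_fraud
    return flagged_fraud / total_fraud, flagged_legit / total_legit


# Made-up model outputs: (estimated fraud probability, true label)
scored = [
    (0.95, True), (0.70, True), (0.40, True),
    (0.60, False), (0.30, False), (0.10, False), (0.05, False),
]

strict = rates(scored, threshold=0.80)
loose = rates(scored, threshold=0.35)
print("strict threshold:", strict)
print("loose threshold: ", loose)
```

With the strict threshold no legitimate transaction is flagged, but two of the three frauds are missed. The loose threshold catches all three frauds at the cost of one false alarm in four legitimate transactions.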

Bayesian analysis helps network security experts visualize the contribution of causal information in the detection model and how this information relates to the target assets. Bayesian networks can also be trained by supervised learning. For example, the Markov Blanket Learning algorithm focuses the search for fraudulent activities on only those events that best characterize the target of fraud or other criminal activity.

Researchers have shown that the Markov Blanket Learning algorithm can improve the precision of traditional fraud detection algorithms [2]. Some studies have shown that Bayesian networks perform better in fraud detection than both decision trees and neural networks. The supervised Markov Blanket Learning algorithm is also a very powerful tool for selecting the relevant variables (cf. also microarray analysis). In that study, for example, it focused the analysis on just 21 variables that were really relevant to characterizing the fraud, out of the 224 variables available.
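
To give a feel for why a Markov blanket narrows the variable set, the Python sketch below computes the blanket of a target node in a graph that is already known: its parents, its children, and the other parents of its children. This is not the Markov Blanket Learning algorithm itself, which has to learn that structure from data; the graph and variable names are hypothetical.

```python
def markov_blanket(parents, target):
    """Return the Markov blanket of `target` in a DAG given as a mapping
    node -> set of parent nodes: parents, children, and children's other parents."""
    children = {node for node, ps in parents.items() if target in ps}
    co_parents = set()
    for child in children:
        co_parents |= parents[child]
    blanket = set(parents.get(target, set())) | children | co_parents
    blanket.discard(target)  # the target is a parent of its own children
    return blanket


# Hypothetical fraud model: node -> set of parents
parents = {
    "fraud": {"account_age", "location_mismatch"},
    "large_purchase": {"fraud", "income"},
    "chargeback": {"fraud"},
    "income": set(),
    "account_age": set(),
    "location_mismatch": set(),
    "weather": set(),
}
mb = markov_blanket(parents, "fraud")
print(sorted(mb))
```

Given the variables in the blanket, the target is conditionally independent of every other variable in the network, which is why unrelated variables (here, "weather") can be dropped from the analysis.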



Dempster-Shafer Method


Generalized EPT


Heuristic Methods


Application and Case Studies

RETE Algorithm

References


  • Joseph T. Wells, Corporate Fraud Handbook: Prevention and Detection, ISBN 0-471-49121-7, John Wiley & Sons, 2004.

See also