The rule engine story

"It's easier to bend a rule than to actually evaluate it." - Anonymous

Today, for a change, I'm going to talk about rule engines: the program that scans "the matrix" and sends the alerts. I know every bit and piece of it and how it works. But do I know "the matrix"?

Working with a rule engine is all about expecting the unexpected. You really don't know what's in your way, and you have to act on it, especially if you want to sell it to multiple P&Ls. Do you want an engineering marvel or something that actually works for you now?

Let's keep the stupid questions aside. What are the basic ingredients of a rule? What is a basic rule? A leaf condition: <attribute> <operator> <constant>; something like <me> <=> <outspoken>. You can create your own kitchen sink by combining leaves with AND or OR conditions and parentheses. So what is the standard in rule syntax? Forget about Alrdrin. A simple XML will do.
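To make "a simple XML will do" concrete, here is a minimal sketch of what such a rule could look like and how to read it back. The element and attribute names (and, or, leaf, attribute, operator, constant) and the operator codes GT and EQ are my own assumptions for illustration, not a standard. The parsing uses the DOM parser that ships with the JDK.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical rule format: <and>/<or> wrap child conditions,
// <leaf> carries attribute, operator and constant.
class RuleXml {
    static final String SAMPLE =
          "<and>"
        + "<leaf attribute=\"price\" operator=\"GT\" constant=\"100\"/>"
        + "<or>"
        + "<leaf attribute=\"region\" operator=\"EQ\" constant=\"EU\"/>"
        + "<leaf attribute=\"region\" operator=\"EQ\" constant=\"US\"/>"
        + "</or>"
        + "</and>";

    // Parse the XML text and return its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Bad rule XML", e);
        }
    }

    // Render the rule tree as a parenthesised expression.
    static String describe(Element e) {
        if (e.getTagName().equals("leaf")) {
            return e.getAttribute("attribute") + " "
                 + e.getAttribute("operator") + " "
                 + e.getAttribute("constant");
        }
        StringBuilder sb = new StringBuilder("(");
        NodeList kids = e.getChildNodes();
        boolean first = true;
        for (int i = 0; i < kids.getLength(); i++) {
            Node n = kids.item(i);
            if (n.getNodeType() != Node.ELEMENT_NODE) continue; // skip whitespace text
            if (!first) sb.append(' ').append(e.getTagName().toUpperCase()).append(' ');
            sb.append(describe((Element) n));
            first = false;
        }
        return sb.append(')').toString();
    }
}
```

Calling RuleXml.describe(RuleXml.parse(RuleXml.SAMPLE)) walks the tree and gives back the rule as one readable line, which is handy for checking what you actually saved.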

So now that you have the rule defined, how are we going to process it? I'm talking about the simplest rule processor. Have a RuleRunner class that has a process() method. Let's have a Condition interface (the reincarnation of the XML you saved) with an evaluate() method which takes a HashMap and returns a Boolean. Let the Leaf, OR and AND conditions implement it. Now you have three condition types behind one interface.

In the leaf condition, put the logic to compare the data in the HashMap against the leaf's attribute, operator and constant.

In AndCondition's evaluate(), write getLHS().evaluate(data) && getRHS().evaluate(data), and in OrCondition's evaluate(), write getLHS().evaluate(data) || getRHS().evaluate(data). Having done this much, in the RuleRunner get the condition from the rule and call its evaluate() method. Have fun delegating the work to polymorphism.
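Put together, the pieces described so far can be sketched roughly like this. The class names follow the text; the operator codes (EQ, GT, LT), the Rule class and the constructor shapes are assumptions of mine, and I use Map rather than HashMap in the signature so any map implementation can be passed in.

```java
import java.util.Map;

interface Condition {
    boolean evaluate(Map<String, Object> data);
}

class LeafCondition implements Condition {
    private final String attribute;
    private final String operator; // assumed codes: "EQ", "GT", "LT"
    private final Object constant;

    LeafCondition(String attribute, String operator, Object constant) {
        this.attribute = attribute;
        this.operator = operator;
        this.constant = constant;
    }

    @Override
    public boolean evaluate(Map<String, Object> data) {
        Object actual = data.get(attribute);
        if (actual == null) return false; // missing data never matches
        switch (operator) {
            case "EQ": return actual.equals(constant);
            case "GT": return compare(actual, constant) > 0;
            case "LT": return compare(actual, constant) < 0;
            default: throw new IllegalArgumentException("Unknown operator: " + operator);
        }
    }

    // Assumes both values are mutually comparable (e.g. two Integers).
    @SuppressWarnings("unchecked")
    private static int compare(Object a, Object b) {
        return ((Comparable<Object>) a).compareTo(b);
    }
}

class AndCondition implements Condition {
    private final Condition lhs, rhs;
    AndCondition(Condition lhs, Condition rhs) { this.lhs = lhs; this.rhs = rhs; }
    Condition getLHS() { return lhs; }
    Condition getRHS() { return rhs; }
    @Override
    public boolean evaluate(Map<String, Object> data) {
        return getLHS().evaluate(data) && getRHS().evaluate(data);
    }
}

class OrCondition implements Condition {
    private final Condition lhs, rhs;
    OrCondition(Condition lhs, Condition rhs) { this.lhs = lhs; this.rhs = rhs; }
    Condition getLHS() { return lhs; }
    Condition getRHS() { return rhs; }
    @Override
    public boolean evaluate(Map<String, Object> data) {
        return getLHS().evaluate(data) || getRHS().evaluate(data);
    }
}

// Hypothetical holder for a named rule and its root condition.
class Rule {
    private final String name;
    private final Condition condition;
    Rule(String name, Condition condition) { this.name = name; this.condition = condition; }
    String getName() { return name; }
    Condition getCondition() { return condition; }
}

class RuleRunner {
    // The runner only knows the interface; polymorphism does the rest.
    boolean process(Rule rule, Map<String, Object> data) {
        return rule.getCondition().evaluate(data);
    }
}
```

The runner never checks which kind of condition it holds. A rule of any depth is evaluated by the same single call on the root.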

Now you have a way to evaluate a condition on a given set of data. So how do you get the relevant data? Enter the dragon! The most challenging part of the rule engine, or of its associated services, is getting the right set of data. There are mainly two approaches here: get the data for the whole condition, or get the data for one leaf at a time. The key is how much data we are talking about. Again, let's take a small amount of diversified data. By this I mean the quantity of data is small, but it is difficult to join it all in one query. You will find it easier to get the data at the leaf level and build the chain up than to get the right set of data for the whole rule by dynamically writing n dirty SQL joins.
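A minimal sketch of the leaf-level approach, assuming each attribute has its own small fetcher. LeafDataLoader and its methods are hypothetical names; in a real system each fetcher would wrap one simple single-table query, while here a plain function stands in for it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical loader: one fetcher per attribute, each standing in
// for a small query that needs no joins.
class LeafDataLoader {
    private final Map<String, Function<String, Object>> fetchers = new HashMap<>();

    void register(String attribute, Function<String, Object> fetcher) {
        fetchers.put(attribute, fetcher);
    }

    // Build the evaluation map for just the attributes the rule's leaves mention.
    Map<String, Object> load(Iterable<String> attributes, String entityId) {
        Map<String, Object> data = new HashMap<>();
        for (String attribute : attributes) {
            Function<String, Object> fetcher = fetchers.get(attribute);
            if (fetcher == null) continue;          // no source known for this attribute
            Object value = fetcher.apply(entityId);
            if (value != null) data.put(attribute, value);
        }
        return data;
    }
}
```

The map that load() returns is what you hand to the condition's evaluate() method. The price is one small lookup per leaf instead of one large joined query, which is the trade described above.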

What should trigger the rule? Now you have to think about the events and how to initiate the processing: synchronously or asynchronously. It depends on your need. If you have a choice, go for the asynchronous mode. Go for a JMS queue or a 'database queue + polling thread' model, depending on your technology stack.
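The polling-thread idea can be sketched with nothing but the JDK. In this illustration an in-memory BlockingQueue stands in for the JMS queue or the database table, and EventPoller is a hypothetical name. The place where a real engine would pick and evaluate rules is marked with a comment.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical poller: producers publish events and return immediately,
// a single worker thread drains the queue in the background.
class EventPoller implements Runnable {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> processed = new CopyOnWriteArrayList<>();
    private volatile boolean running = true;

    void publish(String event) { queue.add(event); }

    // Ask the worker to finish; events already queued are still drained.
    void stop() { running = false; }

    List<String> processed() { return processed; }

    @Override
    public void run() {
        try {
            while (running || !queue.isEmpty()) {
                // Wait briefly so the loop can notice stop() without spinning.
                String event = queue.poll(100, TimeUnit.MILLISECONDS);
                if (event != null) {
                    // A real engine would select and evaluate rules here.
                    processed.add(event);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt and exit
        }
    }
}
```

Start it with new Thread(poller).start(), call publish() from whatever raises the event, and call stop() on shutdown. The caller never waits for rule evaluation, which is the whole point of going asynchronous.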

How do you pick the rules to evaluate for a given event? In most cases you can't have a direct rule-to-event mapping. If you can, consider yourself lucky. What I would advise here is: if at least one condition in the rule looks relevant to the event, pick the rule up.
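One way to read "at least one condition looks relevant" is: the rule has a leaf on an attribute that the event touches. Here is a minimal sketch under that assumption. RuleSelector and the idea of indexing each rule by the attributes its leaves mention are mine, not something prescribed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical index: rule name -> attributes mentioned by its leaf conditions.
class RuleSelector {
    private final Map<String, Set<String>> ruleAttributes = new LinkedHashMap<>();

    // Would typically be called once, when a rule is saved.
    void index(String ruleName, String... attributes) {
        ruleAttributes.put(ruleName, new HashSet<>(Arrays.asList(attributes)));
    }

    // Pick every rule with at least one leaf on an attribute the event touches.
    List<String> candidatesFor(Set<String> eventAttributes) {
        List<String> picked = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : ruleAttributes.entrySet()) {
            if (!Collections.disjoint(entry.getValue(), eventAttributes)) {
                picked.add(entry.getKey());
            }
        }
        return picked;
    }
}
```

This deliberately over-selects: a picked rule may still evaluate to false. That is cheaper than missing a rule that should have fired.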

Now it looks like life is easy. Don't jump to conclusions: rule evaluation is not just about returning a Boolean value. We also need to know what data contributed to the alert. That means the ease of getting the data the OOP way comes at the cost of filtering out the data that actually applies to the rule, or 'the kite level data of interest'. So what is the best approach? It depends, really, on the kind of data you have and on whether your team prefers to mess with SQL or with OOP. Either way, things are not easy. Do your homework and find the solution that works for you.
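One possible shape for "more than a Boolean" is to return a small result object that carries the match flag together with the values that made the leaves match. This is only a sketch of one option. Result, TracingCondition and the other names are hypothetical, and the leaf supports equality only, to keep it short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical result: did the condition match, and which values contributed?
class Result {
    final boolean matched;
    final Map<String, Object> contributing;

    Result(boolean matched, Map<String, Object> contributing) {
        this.matched = matched;
        this.contributing = contributing;
    }

    static Result miss() {
        return new Result(false, new LinkedHashMap<String, Object>());
    }
}

interface TracingCondition {
    Result evaluate(Map<String, Object> data);
}

class TracingLeaf implements TracingCondition {
    private final String attribute;
    private final Object expected; // equality only, for brevity

    TracingLeaf(String attribute, Object expected) {
        this.attribute = attribute;
        this.expected = expected;
    }

    @Override
    public Result evaluate(Map<String, Object> data) {
        Object actual = data.get(attribute);
        if (!expected.equals(actual)) return Result.miss();
        Map<String, Object> hit = new LinkedHashMap<String, Object>();
        hit.put(attribute, actual);
        return new Result(true, hit);
    }
}

class TracingAnd implements TracingCondition {
    private final TracingCondition lhs, rhs;
    TracingAnd(TracingCondition lhs, TracingCondition rhs) { this.lhs = lhs; this.rhs = rhs; }

    @Override
    public Result evaluate(Map<String, Object> data) {
        Result left = lhs.evaluate(data);
        if (!left.matched) return Result.miss();
        Result right = rhs.evaluate(data);
        if (!right.matched) return Result.miss();
        Map<String, Object> merged = new LinkedHashMap<String, Object>(left.contributing);
        merged.putAll(right.contributing);
        return new Result(true, merged);
    }
}

class TracingOr implements TracingCondition {
    private final TracingCondition lhs, rhs;
    TracingOr(TracingCondition lhs, TracingCondition rhs) { this.lhs = lhs; this.rhs = rhs; }

    @Override
    public Result evaluate(Map<String, Object> data) {
        // No short-circuit: both sides run so every matching side is reported.
        Result left = lhs.evaluate(data);
        Result right = rhs.evaluate(data);
        Map<String, Object> merged = new LinkedHashMap<String, Object>();
        if (left.matched) merged.putAll(left.contributing);
        if (right.matched) merged.putAll(right.contributing);
        return new Result(left.matched || right.matched, merged);
    }
}
```

Note the design choice in TracingOr: it gives up short-circuiting so the alert can list every branch that matched. Whether that extra evaluation is worth it depends on how expensive your data is to fetch.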

Houston, we have a problem...
Having said this much, let's talk about the next issue that awaits you: irrelevant alerts. Alerts on irrelevant data are a problem everywhere, in rocket science too. How many times have you heard of emergency landings or panic moments in a ground control station due to false alarms? You might have thought, what a stupid system they have. At least I have thought about it that way. But soon you will find out you are walking a very thin line; it's all about finding a compromise between losing a valid alert and getting an unwanted one. Do your best to fine-tune it and hope your tester won't find it, or believe there is always scope for improvement in everything and keep changing it.

We have talked about a small portion of rules here, the ones with straightforward logical operators. What about arithmetic expressions and trackable rules? I'll talk about those in later posts.

"Testing a rule engine is tough; it's tougher to fix the issues you find." - Anonymous

Images and website ©2006 Joseph John. All rights reserved.