Measuring multivariable controller performance: how to tell if your multivariable controller is doing a good job

Jul 16, 2015

Multivariable controllers have been in use in manufacturing and production systems for many years. Multivariable controllers typically cost from $60,000 to more than $500,000, and they can deliver savings that are many times their cost. Yet the full benefits of these controllers are often not realized. Even worse, manufacturing sites may be completely unaware that the performance of the controllers is subpar. This article describes effective ways to measure and improve the performance of these powerful advanced controls.

Introduction to multivariable controls

Multivariable control technology has two benefits: reduced variability and operation closer to constraints. Reduced variability in the production process translates to energy and raw material savings, improved quality, and fewer production losses related to process trips. It is also the precursor for driving the production process closer to constraints to achieve greater overall production efficiency and profitability.

Multivariable controller performance issues come from four general areas:
• Multivariable controller implementation
• Regulatory layer controls
• Operator actions
• Process changes and disturbances
To ensure the performance of the full multivariable control system, we must have metrics to detect and resolve each of the performance issues.

Measuring performance

Assessing the performance of a multivariable controller requires producing metrics that reveal problems, impairment, and loss of benefits.

Metrics for the controller itself

One early metric applied to multivariable controllers is “time in service” (on or off). This metric has proven to be of limited use, because functionality can be significantly impaired while the controller is still technically on. Measuring performance this way is akin to assessing an individual’s performance based on how long the office light is on. Do not discard this metric completely, however; a low time in service is usually a bad sign and should prompt an investigation into the root cause. Without supporting metrics to set the direction of that investigation, it can be time consuming, involving operator interviews and trend analysis. Good supporting metrics help analysts get to the root cause sooner. They are even more important in the opposite scenario, where the multivariable controller has high time in service but other impairments exist.

Supporting metrics can be high-level metrics or detail metrics. A high-level metric flags a performance issue and can be a combination of several detail metrics. The time-in-service metric discussed above becomes more useful when the state of all the controller variables is rolled into it: if a low time-in-service condition occurs, it is then easy to identify the controller variables that are responsible. A detail metric gives information on a specific behavior or condition and is generally applied to an individual controller variable; an example is “time at limit.” An important requirement of a good metric is that it registers deviation from what is normal or optimal. For the time-at-limit metric, we need to know whether being at a limit is good or bad, and this can vary from variable to variable. Good metrics alert us to a performance issue without making us scan historical trends and apply personal and possibly inconsistent interpretations to the information. When a performance issue is identified, a good metric provides information that leads us to the source of the problem.
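
The roll-up described above can be sketched in a few lines of Python. This is a minimal illustration, not a vendor implementation: the sample layout, the function names, and the `expected_fraction` parameter (the share of time a variable is normally expected to sit at a limit) are all assumptions made for the example.

```python
from collections import Counter


def time_in_service(samples):
    """Return (fraction of samples with the controller on, off-counts per variable).

    `samples` is a list of (controller_on, {variable_name: variable_on}) tuples,
    one per sampling interval. This layout is hypothetical.
    """
    if not samples:
        raise ValueError("need at least one sample")
    on_count = 0
    off_by_variable = Counter()
    for controller_on, variable_states in samples:
        if controller_on:
            on_count += 1
        # Roll the state of every variable into the metric so that a low
        # time in service can be traced to the variables responsible.
        for name, is_on in variable_states.items():
            if not is_on:
                off_by_variable[name] += 1
    return on_count / len(samples), off_by_variable


def time_at_limit_deviation(values, low, high, expected_fraction, tol=1e-6):
    """Return the observed time-at-limit fraction minus the expected fraction.

    A positive result means the variable sits at a limit more often than is
    normal for it; what counts as normal is supplied per variable.
    """
    if not values:
        raise ValueError("need at least one value")
    at_limit = sum(1 for v in values if v <= low + tol or v >= high - tol)
    return at_limit / len(values) - expected_fraction
```

Reporting the deviation from an expected value, rather than the raw fraction, is what lets the same metric serve a variable that should ride a constraint and one that should not.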

Regulatory control metrics

Regulatory layer control issues consist of controller tuning changes and measurement and valve problems. A multivariable controller relies on measurements collected by the regulatory layer. If those values are erroneous or upset (e.g., oscillation caused by valve sticking), they can cause less than optimal behavior from the multivariable controller. A multivariable controller is tuned to manage a production process. Regulatory layer controller tuning is embedded in the production process model the multivariable controller uses.

Another source of multivariable controller impairment comes from operator activity. This consists of actions taken to turn variables on or off or to change limits on variables. Some limit changes restrict the controller from achieving a more optimal operating condition; others can set up infeasible conditions, severely impairing the controller. Finally, there are process changes. These can be seasonal or product quality/grade changes. They could be raw-material related: a different type of catalyst or a different grade or purity of additive. Changes that affect a process over time include heat exchanger fouling and catalyst deactivation. There are also physical changes like switching to packed internals in an originally trayed column or bypassing equipment and vessels.

Multivariable controller model inaccuracy can also lead to cycling. As discussed before, the process itself can change over time. A detail metric like controller prediction error can help get to the root cause of this type of change. When the prediction error shows that the process has drifted away from the model, it is time to perform maintenance on the multivariable controller. Maintaining an average prediction error detail metric for each controlled variable permits the identification of the subset of variables most affected by the process changes. A trend of the average prediction error for a controlled variable provides some guidance on the direction a model update should take. For example, if the prediction error (actual – predicted) is consistently negative, the controller is overpredicting, which means one or more of the models affecting this controlled variable should have a decrease in gain. It may also be possible to address the overprediction with adjustments to model time constants. This may be less desirable, however, because it may necessitate additional adjustments to the overall controller time horizons.
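
As a rough sketch of the per-variable average described above, the following Python computes the mean of (actual – predicted) for each controlled variable and ranks the variables by the size of that bias. The `history` layout, the function names, and the tag names in the usage note are hypothetical; a real application would pull these values from the controller or a historian.

```python
from statistics import mean


def mean_prediction_errors(history):
    """Map each controlled variable to its mean (actual - predicted) error.

    `history` is {cv_name: [(actual, predicted), ...]}; this layout is an
    assumption made for illustration.
    """
    return {
        cv: mean(actual - predicted for actual, predicted in pairs)
        for cv, pairs in history.items()
    }


def rank_by_bias(history):
    """Return (cv_name, mean_error) pairs, largest absolute bias first."""
    errors = mean_prediction_errors(history)
    return sorted(errors.items(), key=lambda item: abs(item[1]), reverse=True)
```

For example, with `{"CV_TEMP": [(10.0, 12.0), (11.0, 13.0)], "CV_FLOW": [(5.0, 5.5), (6.0, 5.5)]}`, `CV_TEMP` ranks first with a mean error of -2.0, a consistent overprediction, while `CV_FLOW` averages to zero.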

The absolute value of the prediction error provides two pieces of information. If the prediction error alternates between over- and underprediction, the average prediction error could end up near zero, leading us to assume there are no concerns. The average absolute value of the prediction error shows the magnitude of the error and provides an alert even when the average prediction error ends up close to zero. The standard deviation of the prediction error measures the dispersion of the error. A prediction error on a controlled variable is not necessarily a bad situation. A steady, consistent prediction error causes little harm to controller performance. The opposite is true for a prediction error that is bouncing around. This condition could make it difficult for the controller to keep the variables within limits and very likely will reduce optimizing time. The prediction error standard deviation identifies this situation.
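
The difference between these three statistics is easiest to see on two small data sets. The sketch below uses only the Python standard library; the function name and the use of the population standard deviation (`pstdev`) are choices made for this illustration.

```python
from statistics import mean, pstdev


def prediction_error_stats(actual, predicted):
    """Return mean, mean absolute, and standard deviation of (actual - predicted)."""
    errors = [a - p for a, p in zip(actual, predicted)]
    if not errors:
        raise ValueError("need at least one sample")
    return {
        "mean": mean(errors),                       # signed bias
        "mean_abs": mean(abs(e) for e in errors),   # magnitude, sign ignored
        "std": pstdev(errors),                      # how much the error bounces
    }


# A steady offset: the model is biased but the error does not move.
steady = prediction_error_stats([10.0, 10.0, 10.0, 10.0], [9.0, 9.0, 9.0, 9.0])

# An alternating error: the average hides it, the other two statistics do not.
bouncing = prediction_error_stats([10.0, 10.0, 10.0, 10.0], [9.0, 11.0, 9.0, 11.0])
```

Both cases have the same mean absolute error. Only the standard deviation separates the harmless steady offset from the error that bounces around.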

When operating in its best condition, a multivariable controller can return many times its original investment. Perhaps when the multivariable controller was first implemented there was an audit of the benefits achieved. The company made an effort to justify the original investment in the technology. As the previous paragraphs have shown, there are many ways a multivariable controller can suffer performance losses. These conditions will reduce the return on the original investment. To advocate effectively for funds to maintain the controller, point out the losses in benefits when the controller is not functioning optimally. We have discussed several technical measures to identify performance problems, but an economic indicator may well be the best tool to help justify when a company should conduct maintenance. One common way to establish the value added by the multivariable controller is to compare production profitability from a period when the multivariable controller was not in use to the current profitability while the controller is in use. Should that difference in profitability start to get smaller, investigate the reasons for that decline. When the reason for the difference is identified, use the difference in profitability to justify the expense of correcting the problem.
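
The comparison above is simple arithmetic, sketched here in Python. The function name, its three inputs, and the idea of comparing against the benefit found in an original audit are assumptions for illustration; the profit figures must come from the site's own accounting over comparable periods.

```python
def benefit_erosion(baseline_profit, audited_profit, current_profit):
    """Compare the controller's current benefit with its originally audited benefit.

    baseline_profit: profitability for a period without the controller
    audited_profit:  profitability when the benefits were first audited
    current_profit:  profitability for the current period with the controller on
    All three must use the same units and a comparable period length.
    Returns (current_benefit, lost_benefit, lost_fraction).
    """
    audited_benefit = audited_profit - baseline_profit
    current_benefit = current_profit - baseline_profit
    lost_benefit = audited_benefit - current_benefit
    lost_fraction = lost_benefit / audited_benefit if audited_benefit else 0.0
    return current_benefit, lost_benefit, lost_fraction
```

The lost benefit is the figure to set against the cost of the maintenance work.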

Measuring operator actions

Sometimes operator actions reduce the effectiveness of a multivariable controller. A good high-level metric is a count of the times any variable is turned off or on, or a limit value is changed. Referencing this count against a norm alerts us to the possibility that the operators are having difficulty with the behavior of the multivariable controller. Drilling into the detail metrics for the counts on individual variables reveals which variable or part of the controller is of concern. An excursion in operator activity does not by itself mean the multivariable controller is impaired. Adding another piece of detailed information, like the available operating range of the variables, flags a situation where the operator change restricted the flexibility of the controller. Possibly the worst case is where a feasible operating point does not exist, and the multivariable controller simply saturates at several limits. A high-level metric to monitor the number of constrained variables and the individual time-at-limit metric for those variables would also flag an impaired controller.
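
A minimal sketch of the change-count metric follows. The event layout, the `norm_count` baseline, and the `tolerance` multiplier are hypothetical; in practice the norm would be derived from a period of known good operation.

```python
from collections import Counter


def operator_activity(events, norm_count, tolerance=1.5):
    """Summarize operator changes for one controller over one reporting period.

    `events` is a list of (variable_name, action) tuples, where action is
    "on", "off", or "limit" (a limit change). This layout is an assumption.
    Returns (total_changes, flagged, per_variable_counts), with the
    per-variable counts sorted from most to least active.
    """
    per_variable = Counter(variable for variable, _action in events)
    total = sum(per_variable.values())
    # High-level flag: activity well above the norm for this controller.
    flagged = total > norm_count * tolerance
    # Detail view: which variables the operators keep touching.
    return total, flagged, per_variable.most_common()
```

The flag only says that operators are unusually active; the per-variable counts show where to look next.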

A combination of the three metrics, number of changes, operating range, and constrained variables, provides a good screening tool to identify multivariable controllers that need attention. A person responsible for the performance of half-a-dozen or more multivariable controllers could spend quite some time looking through trends to determine if a controller is out of normal condition. It is much more efficient to have a screening tool point out specific multivariable controllers not meeting their expected performance metrics.
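
One way to picture such a screening tool is a function that checks each controller against its expected values and lists only the ones that need attention. Everything named here (the `ControllerSnapshot` fields, the thresholds, the reason strings) is an assumption for the sketch.

```python
from dataclasses import dataclass


@dataclass
class ControllerSnapshot:
    name: str
    change_count: int        # operator on/off and limit changes in the period
    range_fraction: float    # available operating range / normal operating range
    constrained_count: int   # variables currently held at a limit


def screen(snapshots, max_changes, min_range_fraction, max_constrained):
    """Return {controller_name: [reasons]} for controllers outside their norms."""
    report = {}
    for snap in snapshots:
        reasons = []
        if snap.change_count > max_changes:
            reasons.append("operator changes")
        if snap.range_fraction < min_range_fraction:
            reasons.append("narrow operating range")
        if snap.constrained_count > max_constrained:
            reasons.append("constrained variables")
        if reasons:
            report[snap.name] = reasons
    return report
```

Controllers that pass all three checks are left out of the report, so the person responsible sees only the exceptions.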

Measuring process changes and disturbances

The components of the production process (pumps, vessels, motors, and control systems) need to be evaluated and monitored to maintain operating performance. Bearings are lubricated, valves are repacked, and heat exchangers are cleaned to achieve their expected lifetimes and avoid sudden failures, accidents, and disruption to business. The hard or physical components of our systems generally get the care required; these are components that we can touch (a bearing is too hot) or that we can see (leaks from a seal). There tends to be less attention to the soft components of the production process, because the metrics to assess their condition are not as obvious. Multivariable controllers and control systems in general fall into the soft component category. It is similar to evaluating the condition of your home heating and cooling. If the thermostat setting (metric) is being achieved, is everything good? Not necessarily; the system might be turning on and off more frequently or running for longer periods of time. The performance metrics need to be more sophisticated. Power consumption, outside temperature, and air flow together give a better picture of the condition of the system.

Figure 1. A CV with a 22 minute cycle that was not detected and corrected for 12 hours

Figure 2. CV cycle relevance detects the severe cycle.

Another important metric is oscillation identification; after all, one of the purposes of a multivariable controller is to reduce variability. Oscillation condition is a high-level metric. In general, it needs some supporting metrics to qualify whether a particular cycle is a problem (figure 1). Using the amplitude of the cycle is one way to sort out small, insignificant behavior. Compare the amplitude of the cycling variable to the operating range the operator has allowed. If the amplitude is as large as the span of the operator limits, perhaps someone has overly constrained the controller, and it is just moving from lower bound to upper bound. Has the variance of that variable changed? Some multivariable controllers have an optimizing function that operates when there are free manipulated variables and no controlled variables are predicted to cross limits. This feature can drive the production process to more economically attractive operating points. If a controlled variable’s oscillation is reaching operator limits, the controller must leave the optimizing mode and return to enforcing the limits of the variables. This is inefficient and could be the cause of some cycling itself (figure 2).
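
The amplitude comparison can be expressed as a single ratio. In this sketch the peak-to-peak swing of the variable is divided by the span of the operator limits; the function name and the use of peak-to-peak amplitude are assumptions, and this is not presented as the calculation behind the figures.

```python
def amplitude_to_range_ratio(values, low_limit, high_limit):
    """Return the peak-to-peak swing of `values` as a fraction of the operator span.

    A ratio near 0 means the cycle is small relative to the room the operator
    has allowed; a ratio near or above 1 means the variable is travelling
    from one operator limit to the other.
    """
    if not values:
        raise ValueError("need at least one value")
    span = high_limit - low_limit
    if span <= 0:
        raise ValueError("high_limit must be greater than low_limit")
    return (max(values) - min(values)) / span
```

For a variable cycling between 48 and 52 with operator limits of 46 and 54, the ratio is 0.5; tightening the limits to 48 and 52 raises it to 1.0, the overly constrained case described above.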

Figure 3. Regulatory layer control valve malfunction affecting MPC MV

The period of the oscillation should also be considered. Long-period oscillation is less of a concern, and could be just due to the controller making adjustments for unmeasured disturbances. Another possible long-period oscillation is the multivariable controller responding to day-to-night temperature cycles. In this situation that cycle could be economically favorable; perhaps the operation runs at a cooling or heating constraint most of the time. The cycle is observed as the multivariable controller takes advantage of the temperature change. It can reduce energy consumption or, more likely, increase production to take advantage of the atmospheric temperature changes. A common impairment to a multivariable controller that can cause a cycle is malfunctioning valves. A regulatory layer controller or valve problem can be seen as an oscillation period that is considerably smaller than the control horizon time. These cycles would also be observable at the individual loop level. Because a multivariable controller processes multiple inputs and outputs, the effects of a hardware-induced cycle can be magnified. It is a good practice to monitor regulatory layer control loop performance to quickly sort out the origin of a regulatory layer problem. It is possible that the problem is occurring on a regulatory layer controller that is not part of the multivariable control scheme (figure 3).
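
To compare a cycle's period with the control horizon, the period first has to be estimated. The sketch below uses a simple approach: it measures the average spacing between upward crossings of the signal mean. The function names, the `short_fraction` threshold for "considerably smaller," and the returned messages are assumptions; noisy plant data would normally need filtering before this kind of crossing count is reliable.

```python
from statistics import mean


def estimate_period(values, sample_minutes):
    """Estimate the oscillation period in minutes from upward mean crossings.

    Returns None when fewer than two upward crossings are found.
    """
    center = mean(values)
    crossings = [
        i for i in range(1, len(values))
        if values[i - 1] < center <= values[i]
    ]
    if len(crossings) < 2:
        return None
    samples_per_cycle = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    return samples_per_cycle * sample_minutes


def likely_source(period_minutes, control_horizon_minutes, short_fraction=0.25):
    """Suggest where to look first, based on period relative to the control horizon."""
    if period_minutes is None:
        return "no cycle detected"
    if period_minutes < short_fraction * control_horizon_minutes:
        return "check regulatory layer loops and valves"
    return "review controller tuning, limits, and disturbances"
```

A short period only points toward the regulatory layer; confirming the cause still means looking at the individual loops.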

Conclusion

The metrics discussed in this article provide some key insights into the health of a multivariable control application. Use the high-level metrics to identify impairment, and the detail metrics to sort the root cause of the impairment. The sooner issues are discovered and corrected, the more likely it is that the benefits obtained will be sustained. Knowing the economic penalty for a particular problem allows companies to prioritize and allocate resources.

Text by Steve Obermann.
Article originally published in InTech magazine, May/June 2015 issue.

Text originally published in 2015 and slightly updated in April 2022 due to the company name change to Valmet.