Multi-variate System of Systems Modeling & Analysis

Possible Applications of Parametric Analysis using Similarity Metrics

Pharma batch assurance, RT-qPCR Ct curves, cardiac and EEG signal analysis, mass spectral analysis, and analysis of ensemble model error against observed data.

My friend Dr. Amos Tsai (JHU Chem E) and I discussed, while he was working at Amgen, the benefit of using the methods described below for pharma production analysis.  As an example, Amgen created Neupogen accidentally while trying to replicate the Epogen formulation.

This methodology has been used successfully in EEG analysis (alcoholic vs. normal, drug intake), structural analysis, acoustics, speaker identification, aerodynamics, weather forecasting, battery management systems and data centers.

Analysis of organizations and their key performance indicators (KPIs) would also benefit from this methodology.  So would a mix of organizational, execution performance and technical performance metrics, since organizational changes can modify functional behaviors within the technical realm.

Below are the details of an aircraft flight analysis.  The data used in this example is a subset of a public NASA data resource.

Parameters (747, four engines numbered 1..4), utilizing Similarity Metrics in a Parametric Matrix Topology

The rationale is given below the illustrations.  The system-of-systems parameters are the engines and their primary interfaces.

0-3: Exhaust Gas Temperature: EGT_1, EGT_2, EGT_3, EGT_4

4-7: Fuel Flow: FF_1, FF_2, FF_3, FF_4

8-11: Compressor Speed: N1_1, N1_2, N1_3, N1_4 (tries to match N1C)

12-13: Control Targets: N1C (Compressor Target), N1T (Turbine Target)

14-17: Turbine Speed: N2_1, N2_2, N2_3, N2_4 (tries to match N1T)

18-21: Power Lever Angle: PLA_1, PLA_2, PLA_3, PLA_4 (Human or AutoPilot)

22-25: Engine Vibration: VIB_1, VIB_2, VIB_3, VIB_4

26: True Air Speed: TAS

N1C, N1T: off-diagonal similarity metric squares for these control targets indicate that an engine is in a non-optimal state, possibly caused by a bird strike that bent the compressor blades and prevents on-center synchronous speed between compressor and turbine.

The graphic below represents a system-of-systems analysis of the aircraft propulsion system parameters. The colors indicate the relative associative strength between parameters.  The graphic represents a compressed time window across all parameters of interest.  As time progresses, the corresponding dynamics change.
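The post does not state which similarity metric produces the colored squares. As a minimal sketch only, the block below assumes absolute Pearson correlation over one time window as a stand-in metric; the function names and the sample values are made up for illustration and are not taken from the NASA data.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def similarity_matrix(window):
    """window maps parameter name -> samples in one time window.
    Returns (names, matrix) with matrix[i][j] = |correlation| in [0, 1]."""
    names = list(window)
    matrix = [[abs(pearson(window[a], window[b])) for b in names] for a in names]
    return names, matrix

# Made-up samples: N1_1 and N1_2 track each other, VIB_1 does not.
window = {
    "N1_1": [80.0, 82.0, 85.0, 88.0, 90.0],
    "N1_2": [80.5, 82.2, 85.1, 87.9, 90.3],
    "VIB_1": [0.3, 0.1, 0.4, 0.1, 0.3],
}
names, sim = similarity_matrix(window)
for name, row in zip(names, sim):
    print(name, [round(v, 2) for v in row])
```

Each row of the matrix corresponds to one row of colored squares; recomputing it on successive windows would show how the dynamics change over time.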

Traditional model-based systems engineering analysis only checks whether specific design limits are exceeded, using statistical process control modeling.  Such modeling misses non-linear degradation between systems.




Eight flights are compared using Similarity Metrics, so that similar flights can be identified analytically.

The 8th flight in the picture below (bottom right) is an AOG (Aircraft on Ground) case.
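One way to identify similar flights analytically is to compare their similarity matrices directly. The sketch below assumes a Frobenius distance between matrices, which is an illustrative choice and not necessarily what produced the picture; the flight names and matrix values are hypothetical.

```python
import math

def frobenius_distance(a, b):
    """Root of the summed squared differences between two equal-shape matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

def rank_by_distance(reference, flights):
    """Return (flight_id, distance) pairs sorted from most to least similar."""
    scored = [(fid, frobenius_distance(reference, m)) for fid, m in flights.items()]
    return sorted(scored, key=lambda pair: pair[1])

# Made-up 2x2 similarity matrices for three hypothetical flights.
flights = {
    "flight_1": [[1.0, 0.95], [0.95, 1.0]],
    "flight_2": [[1.0, 0.92], [0.92, 1.0]],
    "flight_8": [[1.0, 0.20], [0.20, 1.0]],  # stands in for the anomalous flight
}
ranking = rank_by_distance(flights["flight_1"], flights)
for fid, dist in ranking:
    print(fid, round(dist, 3))
```

Flights whose matrices sit close to the reference would be grouped as similar, while a flight such as the AOG case would rank far from the others.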





Validation of a near-normal flight: the flight parameters (N1_y, y=1..4) are all synchronously similar and aligned with each other and with the control target.



Validation of an anomalous flight: the flight parameters (N2_y, y=1..4) are all out of sync and misaligned, with nearly all similarity metric profiles shown in "orange" (non-similar).


Multidimensional Conditional Topology

The two inboard turbine engine speeds, N2_2 (X-axis) and N2_3 (Y-axis), are plotted for samples with true airspeed > 120.

The Turbine Target is displayed as the number labelling each point, conditioned on the other two parameters.
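The conditioning step can be sketched as a filter followed by a pairwise check. This is a minimal illustration: the sample values, the `tolerance` parameter and both function names are assumptions made here, and the airspeed threshold is used unit-free because the post does not give its unit.

```python
def conditional_points(samples, tas_min=120.0):
    """Keep samples with TAS above tas_min; return (N2_2, N2_3, N1T) triples."""
    return [(s["N2_2"], s["N2_3"], s["N1T"]) for s in samples if s["TAS"] > tas_min]

def inboard_mismatch(points, tolerance=2.0):
    """Points where the two inboard turbine speeds differ by more than tolerance."""
    return [p for p in points if abs(p[0] - p[1]) > tolerance]

# Made-up samples; speeds are on the normalized 0..100 scale.
samples = [
    {"TAS": 90.0, "N2_2": 62.0, "N2_3": 62.5, "N1T": 61.0},   # below the TAS condition
    {"TAS": 150.0, "N2_2": 75.0, "N2_3": 75.4, "N1T": 75.0},
    {"TAS": 210.0, "N2_2": 88.0, "N2_3": 81.0, "N1T": 88.0},  # engines 2 and 3 disagree
    {"TAS": 230.0, "N2_2": 89.0, "N2_3": 89.2, "N1T": 89.0},
]
points = conditional_points(samples)
off_diagonal = inboard_mismatch(points)
print(len(points), off_diagonal)
```

In a plot, each triple would be drawn at (N2_2, N2_3) and labelled with its N1T value, so points far from the diagonal stand out together with the target they should have tracked.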




Dashboards could show all possible pairs of engineering interest; together, these views would represent a complex data topology.  For the inboard engines 2 and 3, the N2_2 and N2_3 values range between 60 and 90 (turbine speeds are normalized between 0 and 100). The turbine target is the control-system value with which turbines 2 and 3 should also be synchronized.  A turbine shaft sensor error is the likely cause of this functional behavior.  The turbine shaft speeds are measured by magnetic inductance.

In aircraft analysis, the text analysis consists of inferring state from the maintenance events (MX) and pilot logbook (LOG) observations.  In-flight analysis results are interspersed with MX/LOG entries.  This data is usually presented within separate applications. Elasticsearch enables putting all of the analysis onto the same timeline view, which helps humans integrate context. Chief Engineers and customer lead engineers could then easily recognize excursions beyond the design envelope within operational flight data sets.
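The single-timeline idea can be sketched without any search backend. The block below does not use the Elasticsearch API; it only merges already-sorted event streams by timestamp with the standard library, and every event shown is invented for illustration.

```python
import heapq

def merged_timeline(*streams):
    """Merge streams of (timestamp, source, text) tuples.
    Each stream must already be sorted by its ISO-format timestamp."""
    return list(heapq.merge(*streams, key=lambda event: event[0]))

# Hypothetical events from three separate applications.
flight_results = [
    ("2020-01-01T10:05", "FLIGHT", "N2 similarity drop on engines 2 and 3"),
    ("2020-01-02T09:00", "FLIGHT", "N2 similarity back to normal"),
]
pilot_log = [("2020-01-01T11:30", "LOG", "pilot reports speed indication mismatch")]
maintenance = [("2020-01-01T15:00", "MX", "shaft speed sensor replaced")]

timeline = merged_timeline(flight_results, pilot_log, maintenance)
for timestamp, source, text in timeline:
    print(timestamp, source, text)
```

Reading the merged list top to bottom places each analysis result next to the logbook and maintenance entries that surround it in time.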

My primary benefit to Boeing and USAF/ONR engineering platform groups has been in showing that observed data points commonly called outliers are systemic, which ultimately leads to better modeling and to tool sets that can validate models against operational data sets.

Rationale for the methods illustrated above:

Traditional systems modeling and analysis commonly limit observations to a narrow scope, partitioning systems into smaller functional subsystems or components. Limited parametric analysis avoids the “curse of dimensionality”, a phrase associated with combinatorial and computational complexity. Many data mining solutions reduce complexity. Common modeling practices include linear regression and the assumption of independent and identically distributed (IID) samples; with a sufficient number of samples, distributions tend to be modeled as Gaussian.

Current analysis methods predominantly use a time-oriented viewpoint to select features deemed significant by models in the time or frequency domains.  Time and space are natural ordering schemes; i.e. many system parameters are defined relative to a starting time or a starting spatial reference origin.  These referential origins are approximated in real systems.

Across many domains of study, a variety of methods are used to help discover the underlying principles of the systems under study.  Traditional research, design, development, simulation and testing groups use model-based engineering, physics and statistical reasoning.  Model-based engineering relies on domain experts, who generate mathematical models that make simplifying assumptions because of real-world complexity.  The simplifying signal processing methods discussed in this section erase essential information, which inhibits finding non-optimal system states.  Many analytic methods make assumptions about the data distributions, e.g. IID (independent and identically distributed) samples, linearity, random variables and stationarity. Taken together, these assumptions permit closed-form mathematical equations to be used. Closed-form solutions are computationally efficient.

A common way to reduce complexity is to reduce the number of variables considered.  Principal component analysis (PCA) is a commonly taught data analysis method for this kind of dimensionality reduction, which lowers computational and mathematical complexity.  PCA assumes that the observed variables are linearly related.  It uses the mean and the covariance of the observed sample set to find orthogonal linear combinations of the variables (the principal components), ordered from most to least variance explained; the top N components are kept for further analysis.  A limitation of PCA is that parameters with low variance contribute little to the retained components and are effectively eliminated, even when they control changes through sinusoidal or other non-linear parameter dynamics. PCA is one of many methods used to reduce dimensionality.
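The low-variance limitation can be seen in a two-variable case, where the principal components have a closed form. The sketch below is an illustration on synthetic data, not an analysis of the flight data: one made-up ramp with large variance and one made-up small sinusoid, with function names chosen here.

```python
import math

def covariance_2x2(x, y):
    """Sample variances and covariance of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    var_x = sum((u - mx) ** 2 for u in x) / (n - 1)
    var_y = sum((v - my) ** 2 for v in y) / (n - 1)
    cov_xy = sum((u - mx) * (v - my) for u, v in zip(x, y)) / (n - 1)
    return var_x, cov_xy, var_y

def principal_components_2d(var_x, cov_xy, var_y):
    """Eigenvalues (largest first) and leading unit eigenvector of
    the covariance matrix [[var_x, cov_xy], [cov_xy, var_y]]."""
    mid = (var_x + var_y) / 2
    half_gap = math.sqrt(((var_x - var_y) / 2) ** 2 + cov_xy ** 2)
    lam1, lam2 = mid + half_gap, mid - half_gap
    if cov_xy != 0:
        vx, vy = lam1 - var_y, cov_xy
    else:
        vx, vy = (1.0, 0.0) if var_x >= var_y else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return lam1, lam2, (vx / norm, vy / norm)

# Synthetic data: a large ramp and a small sinusoid.
high_variance = [10.0 * i for i in range(20)]
low_variance = [0.1 * math.sin(i) for i in range(20)]

lam1, lam2, direction = principal_components_2d(
    *covariance_2x2(high_variance, low_variance))
explained = lam1 / (lam1 + lam2)
print(round(explained, 6), [round(c, 4) for c in direction])
```

Keeping only the first component here retains almost all of the variance while carrying almost none of the sinusoid. Standardizing each variable before PCA changes this picture, which is why scaling choices matter.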

Current data analysis methods fit observed data into a model.  Models are characterized by summary statistics that include the sample mean, standard deviation, skew and kurtosis.  Physics-based system models further characterize an observed system by integrating functional models of the design elements with the observed statistics.
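As a small worked example of these four statistics, the sketch below computes them from central moments using population (divide by n) formulas, which is one common convention among several. The sample values are arbitrary, and the second call shows that reordering the samples leaves all four numbers unchanged.

```python
import math

def summary_statistics(samples):
    """Mean, population standard deviation, skewness and excess kurtosis."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((v - mean) ** 2 for v in samples) / n
    if m2 == 0:
        raise ValueError("skew and kurtosis are undefined for constant data")
    m3 = sum((v - mean) ** 3 for v in samples) / n
    m4 = sum((v - mean) ** 4 for v in samples) / n
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skew": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis, 0 for a Gaussian
    }

ordered = [1.0, 2.0, 3.0, 4.0, 5.0]      # arbitrary values in time order
reordered = [3.0, 1.0, 5.0, 2.0, 4.0]    # same values, different order

stats_ordered = summary_statistics(ordered)
stats_reordered = summary_statistics(reordered)
print(stats_ordered)
```

Because the statistics ignore sample order, two signals with very different dynamics can share the same model characterization.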

One branch of data-driven analysis considers the observational data without relying on explicit physics-based statistical models.  Instead, data-driven analysis defines new features within the data based on mathematical functions or complex adaptive neural networks.  It is then difficult to interpret results with a clear mapping between those features and the sampled parameter data associated with system functions and key performance indicators.  This complicates validation processes if systems are deemed critical or safety related.

Current methods for analyzing time series data sets include moving averages, whereby the time domain is partitioned to capture local details within a user-defined constant-width window.  For periodic signals, Fourier and wavelet analysis methods transform the time domain into the frequency domain.  Linear system control theory applied in the frequency domain has been used to control many implemented systems.
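Both techniques can be shown on one synthetic signal. The sketch below uses a direct O(n^2) discrete Fourier transform for clarity rather than a fast library routine, and the signal (4 cycles over 32 samples) and window width are chosen here purely for illustration.

```python
import cmath
import math

def moving_average(samples, width):
    """Mean over each full window of constant width (output is shorter by width - 1)."""
    return [sum(samples[i:i + width]) / width
            for i in range(len(samples) - width + 1)]

def dft_magnitudes(samples):
    """Magnitudes of the discrete Fourier transform, direct O(n^2) form."""
    n = len(samples)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples)))
            for k in range(n)]

# Synthetic periodic signal: 4 cycles over 32 samples, so one period is 8 samples.
signal = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]

magnitudes = dft_magnitudes(signal)
peak_bin = max(range(1, 16), key=lambda k: magnitudes[k])  # search positive frequencies
smoothed = moving_average(signal, 8)                       # window equals one period

print(peak_bin, max(abs(v) for v in smoothed))
```

The frequency view recovers the 4-cycle component, while a moving average whose window matches the period flattens the same signal to nearly zero, an example of how the window choice determines which details survive.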

Systems often contain two or more parameters; these systems are called multivariate. Systems contain parameters defined by models or other means such as measurements from the real world or model simulations that derive parameter values.  Parametric measurements may be limited by sensor noise, response time, quantization errors and other real world factors. In some sampling systems, data samples modeled as noise are filtered out and not further considered in analytic solutions.  

Complex coupled systems, including aviation flight dynamics, biological systems, financial systems and manufacturing systems, are controlled and monitored by design or evolutionary principles. Current analytical methods filter, reduce, simplify and partition models, designs, functions, sensors, monitoring and controllers. They use first order statistics and simple mathematical formulations that favor stationarity, univariate condition indicators, thresholds, independent and identically distributed samples, ordering samples in the time domain, creating artificial features from the time domain and adopting linearity in the time and frequency domains.

The performance of the various systems, subsystems and components is periodically sampled. When single parameters exceed established criteria during design, testing or operations, faults are generated.  These faults are conveyed to service and maintenance agents to help determine root cause and apply corrective actions. In complex interdependent systems it is difficult to identify actual root causes. Vehicles may report several hundred faults during daily use.  Some faults are known as nuisance faults, as they may only occur during initial startup.
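Single-parameter fault generation can be sketched as a limit check per sample. The limits, parameter subset and sample values below are hypothetical and do not come from any real design specification.

```python
def threshold_faults(samples, limits):
    """Return (sample_index, parameter, value) for each value outside its (low, high) limit."""
    faults = []
    for index, sample in enumerate(samples):
        for name, value in sample.items():
            low, high = limits[name]
            if not low <= value <= high:
                faults.append((index, name, value))
    return faults

# Hypothetical limits for three parameters.
limits = {"N2_2": (0.0, 100.0), "N2_3": (0.0, 100.0), "VIB_2": (0.0, 1.0)}

# Made-up periodic samples.
samples = [
    {"N2_2": 85.0, "N2_3": 85.2, "VIB_2": 0.2},
    {"N2_2": 88.0, "N2_3": 79.0, "VIB_2": 0.3},  # engines diverge, yet no limit is exceeded
    {"N2_2": 86.0, "N2_3": 86.1, "VIB_2": 1.4},  # vibration exceeds its limit
]
faults = threshold_faults(samples, limits)
print(faults)
```

Only the vibration exceedance is reported; the sample where the two turbine speeds diverge raises no fault because each value is individually within limits, which is the kind of inter-parameter behavior a univariate check cannot see.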






