However, when a metabolite pattern or profile is used as the indicator of a specific physiological state, the data processing and analysis must work in a predictive
way so that this pattern or profile can be verified in new samples and thus serve as a diagnostic tool. ‘Predictive’ here means that the processing algorithm can efficiently detect and quantify, in independently analyzed samples, the metabolites in the generated reference table. To screen large sample sets efficiently, where the aim is to acquire data for all samples, the key issue is the data processing step. Sophisticated processing of GC/MS data, such as curve resolution [14,15,16], is time-consuming, which makes it infeasible for large sample sets. However, the
benefits of such data processing, which can provide reliable metabolite quantification and identification for further sample comparison and biological interpretation, present an incentive to solve this problem. One way of doing this could be to use a fast and crude data processing technique that still retains the variation in the data and then, based on that data, select a representative subset of samples for the more sophisticated processing, i.e., generation of a reference table of putative metabolites. Again, the key is for the sophisticated processing to work predictively for new samples. If this is the case, then the samples not selected for processing, as well as additional samples measured at a later point in time, can be predictively processed to detect and quantify the metabolites in the reference table. GC/MS has proven to be a valuable tool for the global detection of metabolites
in biofluids and tissues [17,18,19,20]. This is mainly due to the combination of high sensitivity and reproducibility, but also to the fact that identification of detected compounds is relatively straightforward. Metabolomic GC/MS data usually require some type of pre-processing before multiple sample comparisons and compound identifications can be carried out. This can be achieved by applying a methodology called curve resolution, or deconvolution, to the data. With the introduction of multivariate curve resolution (MCR) [16], multiple samples could be resolved to generate a common set of descriptors suitable for comparison using, for example, multivariate data analysis. A further development of MCR from our lab, named hierarchical-MCR (H-MCR) [21], allows complex GC/MS data, as generated within metabolomics, to be resolved into pure components. An extension of the H-MCR method made it possible to perform the curve resolution predictively [22]. Combining H-MCR processing with multivariate data analysis yields a strategy for multivariate data processing and analysis that efficiently highlights patterns of resolved and identified metabolites that co-vary systematically over multiple samples [23,24,25].
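To make the subset-selection idea concrete, the following is a minimal sketch of how a crude processing step could feed the selection of representative samples for the more sophisticated processing. It is an illustration under stated assumptions, not the procedure used in this work: the crude step is assumed here to be a summation of intensities in equal retention-time windows, and the selection is assumed to be a Kennard-Stone-style max-min distance rule. The function names `bin_chromatogram` and `select_representative`, and the toy chromatograms, are hypothetical.

```python
from math import dist


def bin_chromatogram(intensities, n_bins):
    """Crude processing (assumed): sum intensities in n_bins equal windows."""
    size = len(intensities) / n_bins
    return [
        sum(intensities[int(i * size):int((i + 1) * size)])
        for i in range(n_bins)
    ]


def select_representative(features, n_select):
    """Kennard-Stone-style max-min selection (assumed rule).

    Returns the indices of the selected samples, in order of selection.
    """
    n = len(features)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")

    # Start from the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist(features[pair[0]], features[pair[1]]),
    )
    selected = [first, second]

    # min_dist[i]: distance from sample i to its nearest selected sample.
    min_dist = [
        min(dist(f, features[first]), dist(f, features[second]))
        for f in features
    ]

    # Repeatedly add the sample that is farthest from everything selected so far.
    while len(selected) < n_select:
        nxt = max(
            (i for i in range(n) if i not in selected),
            key=lambda i: min_dist[i],
        )
        selected.append(nxt)
        for i in range(n):
            min_dist[i] = min(min_dist[i], dist(features[i], features[nxt]))
    return selected


# Toy data (hypothetical): four short chromatograms; samples 0 and 1 are near-duplicates.
chromatograms = [
    [0, 5, 0, 0, 1, 0, 0, 0],
    [0, 5, 0, 0, 1, 0, 0, 1],
    [0, 0, 0, 9, 9, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 8, 8],
]
features = [bin_chromatogram(c, n_bins=4) for c in chromatograms]
subset = select_representative(features, n_select=3)
```

In this sketch, only the samples in `subset` would be passed to the time-consuming curve resolution to build the reference table; the remaining samples, and any measured later, would then be processed predictively against that table. A max-min rule is one reasonable choice because it spreads the selected samples across the variation retained by the crude features, so near-duplicate samples are unlikely to be selected together.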