Effect of physical activity on plasma metabonomics variation using ¹H NMR, anthropometric and modeling methods

The metabolic changes in serum during a sport program were explored using a metabonomic approach based on proton nuclear magnetic resonance (¹H-NMR) spectroscopy and anthropometry. The aim of this study was to classify two groups of female university students with body mass index over 25 kg/m² using multiple measured descriptors. The first group (n = 16) underwent a complex, well-programmed 18-week physical training course, and the second group (n = 8), our control group, did not participate in any training course. The descriptors comprised anthropometric measurements (height, weight, circumferences of arm, waist, hip and thigh, lean body mass and fat mass percentiles) and serum levels of growth hormone, insulin and insulin-like growth factor-1. ¹H-NMR spectra were obtained on a 500-MHz Bruker spectrometer, and integrals over selected chemical shift regions were calculated with Chenomx software for all individuals in both groups. These descriptors were measured both before and after the training program for the experimental group. To build a linear model between growth hormone (GH) and the ¹H-NMR data matrix, the most important descriptors were first selected by stepwise multiple linear regression (MLR) and then used for MLR modeling. The R² values obtained for the training and test sets show agreement between experimental and calculated GH values. By applying counterpropagation artificial neural network (CP-ANN) classification, we significantly separated the first group from the second.


INTRODUCTION
Studying human metabolite variations caused by external factors such as diet, drugs and physical activity is a subdivision of metabonomics, which is defined as the quantitative measurement of the dynamic multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification (Gavaghan et al., 2000). Proton nuclear magnetic resonance (¹H-NMR) spectroscopy of biofluids (urine, serum/plasma) and tissue generates comprehensive biochemical profiles of low molecular weight endogenous metabolites (Solanky et al., 2003). Nuclear magnetic resonance (NMR) spectroscopy has also emerged as a key tool for understanding metabolic processes in living systems (Brown et al., 1977).
*Corresponding author. E-mail: zzahrazamani@yahoo.com.
The spectroscopic techniques used in metabonomics are often operated in so-called 'hyphenated' mode, e.g. liquid chromatography/nuclear magnetic resonance/mass spectrometry (LC-NMR-MS). A detailed inspection of NMR spectra and integration of individual peaks can give valuable information on dominant biochemical changes. Pattern recognition methods can then be used to map the NMR spectra from the high-dimensional space implied by the number of points in the digital representation of the spectrum into a lower-dimensional space, such that any clustering of the samples based on similarities of biochemical profiles can be easily determined (Nicholson et al., 1995; Coen et al., 2005).
Obesity is the heavy accumulation of fat in the body to such a degree that it rapidly increases the risk of diseases such as diabetes and heart disease. The fat may be equally distributed around the body or concentrated on the stomach or the hips and thighs (Fujioka et al., 1991). Urinary creatinine, lactate, pyruvate, alanine, β-hydroxybutyrate, acetate and hypoxanthine profiles change after acute and chronic physical exercise, as shown by ¹H-NMR-based metabolomics approaches (Enea et al., 2010). In the present investigation, we used anthropometric variables together with growth hormone (GH), insulin-like growth factor-1 (IGF-1) and serum insulin levels, along with ¹H-NMR profiles, to distinguish a control group of 8 obese girls with body mass index (BMI) > 25 from a group of 16 obese girls who underwent combined aerobic and anaerobic physical exercise for 18 weeks. The aim was to classify two groups of obese students with body mass index over 25 kg/m² using multiple measured descriptors and to identify the key metabolite changes in the experimental group.

Sample collection and preparation
A group of obese female students from Shariff University with BMI > 25 was recruited for this study. They were instructed to follow the same controlled dietary regimen. The students were divided into two groups. The first group (n = 16) underwent a complex, well-programmed 18-week physical training course. This test group (average age 18.9 ± 1.3 years, height 159 ± 0.04 cm, weight 73.46 ± 7.75 kg) followed a 90-min program consisting of running-walking at 60-70% of maximum heart rate (measured with a Polar F1 heart rate monitor, CE0537, made in Finland) with the distance increasing every week (Table 1), light strength-building exercise (using body weight and small weights) and aerobic exercise such as sit-ups and Swedish push-ups, 3 times a week, each session lasting 90 min, for 6 weeks. The second group (n = 8) was our control group and did not participate in any physical training program. The descriptors consisted of anthropometric measurements (height, weight, circumferences of arm, waist, hip and thigh, lean body mass and fat mass percentiles). This study was approved by the Department's Ethical Committee.
Fasting blood serum was collected from these groups in the 6th week at 8 o'clock in the morning and stored at -20°C until assayed. Serum total IGF-1 (Biosource, Belgium), insulin (DRG, Germany) and human growth hormone (Radim, Italy) were assayed by enzyme-linked immunosorbent assay (ELISA) methods. Intra-assay and inter-assay coefficients of variation were 6.6 and 13.3% for IGF-1, 2.2 and 4.4% for insulin, and 4.2 and 5.3% for human growth hormone, respectively.

¹H-NMR spectroscopy
Prior to NMR analysis, serum samples (60 µL) were diluted with 600 µL of 52% deuterium oxide (D2O, 99.9 at.% D, Aldrich Chemicals Company, South Africa) and placed in 5 mm high-quality NMR tubes (Sigma Aldrich, RSA). Conventional ¹H-NMR spectra of each serum sample were measured at 500 MHz on a Bruker Avance spectrometer using the Carr-Purcell-Meiboom-Gill (CPMG) protocol at Shariff University, as described by Lin et al. (2007).

Data reduction of NMR data
Each ¹H-NMR spectrum was corrected for phase and baseline distortion using Chenomx NMR Suite (version 6.0), and the 0.0-10.0 parts per million (ppm) spectral region was reduced to 250 integral segments of equal width (0.04 ppm). This segment width is based on previous studies, which found that regions of 0.04 ppm accommodate small pH-related shifts in signals and variation in shimming quality (Coen et al., 2005).
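The data reduction described above can be illustrated with a minimal sketch. It is not the Chenomx procedure: the function name `bin_spectrum` and the use of NumPy are assumptions made for illustration, and each segment integral is approximated by summing the intensities of the points that fall inside it (equivalent to an integral up to a constant factor when the points are evenly spaced).

```python
import numpy as np

def bin_spectrum(ppm, intensity, lo=0.0, hi=10.0, width=0.04):
    """Reduce a 1D spectrum to equal-width chemical-shift segments.

    ppm, intensity: arrays of the same length (hypothetical inputs).
    Returns the segment edges and the summed intensity per segment.
    """
    n_bins = int(round((hi - lo) / width))       # 250 for 0-10 ppm at 0.04 ppm
    edges = np.linspace(lo, hi, n_bins + 1)
    # Sum the intensities of all points falling into each segment.
    integrals, _ = np.histogram(ppm, bins=edges, weights=intensity)
    return edges, integrals
```

Applied to every spectrum, this would give one row of 250 values per sample, that is, a data matrix with 250 candidate descriptors.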

Statistical analysis
As a preprocessing step for the NMR data, orthogonal signal correction (OSC) was applied to remove variation orthogonal to the class of interest (Gavaghan et al., 2002). In OSC, the class membership of the samples is encoded in a vector Y. The first step is the calculation of the first principal component, or score vector t, which describes the maximum separation based on classes. This score vector is then corrected so that it is orthogonal to Y, and the loading vector p* is calculated from the corrected vector. The product of the orthogonal score and loading vectors is subtracted from the spectral data. The residual matrix represents the filtered spectral data and was used for multiple linear regression (MLR) and CP-ANN calculations. Stepwise MLR was used to select the best descriptors among the 250 segments, and the selected variables were then used to build a linear model.
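The OSC steps outlined above can be sketched as follows. This is a simplified, single-pass illustration under stated assumptions, not the exact procedure used in the study: published OSC algorithms typically refine the score vector iteratively, and the function name `osc_one_component` is hypothetical.

```python
import numpy as np

def osc_one_component(X, y):
    """Remove one y-orthogonal component from X (simplified single-pass OSC)."""
    Xc = X - X.mean(axis=0)                  # column-centre the spectra
    yc = (y - y.mean()).reshape(-1, 1)       # centred class vector Y
    # Score vector t of the first principal component of the data.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    t = U[:, [0]] * s[0]
    # Corrected score: subtract the projection of t onto Y so that it is orthogonal to Y.
    t_star = t - yc @ (yc.T @ t) / (yc.T @ yc)
    # Loading vector p* that belongs to the corrected score.
    p_star = Xc.T @ t_star / (t_star.T @ t_star)
    # Filtered data: subtract the product of orthogonal score and loading.
    X_osc = Xc - t_star @ p_star.T
    return X_osc, t_star, p_star
```

The returned `X_osc` plays the role of the residual matrix that is passed on to the regression and classification steps.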
Multiple linear regression (MLR) techniques based on least-squares procedures are very often used for estimating the coefficients involved in the model equation (Porter et al., 1981). Artificial neural networks (ANNs) can solve both supervised and unsupervised problems, such as clustering and modeling of qualitative responses (classification). Among ANN learning strategies, Kohonen maps (Figure 1) and counterpropagation ANNs (CP-ANNs, Figure 2) are two of the most popular approaches, and both were applied to our data. Kohonen maps are self-organizing systems applied to unsupervised problems (cluster analysis and data structure analysis). CP-ANNs are very similar to Kohonen maps and are essentially based on the Kohonen approach, but combine characteristics of both supervised and unsupervised learning; that is, CP-ANNs can be used to build both regression and classification models (Ballabio et al., 2007).
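As an illustration of least-squares MLR combined with stepwise variable selection, the following sketch performs a greedy forward selection. It is an assumption-laden simplification: classical stepwise MLR uses F-tests for entering and removing variables, whereas this sketch simply adds the variable that raises R² the most; the names `fit_mlr`, `forward_stepwise`, `max_vars` and `min_gain` are hypothetical.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares MLR with an intercept; returns coefficients and R^2."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, r2

def forward_stepwise(X, y, max_vars=3, min_gain=1e-3):
    """Greedy forward selection: repeatedly add the column that raises R^2 the most."""
    selected, best_r2 = [], 0.0
    while len(selected) < max_vars:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        scores = [fit_mlr(X[:, selected + [j]], y)[1] for j in candidates]
        k = int(np.argmax(scores))
        if scores[k] - best_r2 < min_gain:    # stop when the improvement is negligible
            break
        selected.append(candidates[k])
        best_r2 = scores[k]
    return selected, best_r2
```

With a response vector of GH values and a matrix of segment integrals, `forward_stepwise` would return the indices of the retained segments, and `fit_mlr` on those columns would give the coefficients of the linear model.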

RESULTS AND DISCUSSION
To identify the important variables affecting GH serum levels in the two groups (48 samples in total, 32 and 16 of them related to pre-activity and post-activity tests, respectively), we performed stepwise multiple linear regression (MLR) and genetic algorithm-partial least squares (GA-PLS) on the data matrix (X) after the application of orthogonal signal correction (OSC). Both methods yielded the same frequency ranges as significant descriptors for the participant and non-participant groups. IGF-1, insulin serum levels and anthropometric data did not lead to an acceptable classification or calibration model. For the physical activity participant group, the integral values of low density lipoprotein (LDL), very low density lipoprotein (VLDL) and choline (lipid) were obtained as the most important descriptors; for the control group, 3 sets of important descriptors in specific chemical shift ranges of the NMR spectrum were likewise obtained (Table 2). All these variables were used in the next section to classify the two groups of obese girls.
Table 2 shows the most important descriptors, with their properties, selected by stepwise MLR. The next step (modeling) aimed to predict GH from the best descriptors for each of the two groups. The regression results for the selected MLR models are presented in Table 3. The R² values for the training sets were 0.998 and 0.989 for the experimental and control groups, respectively; the corresponding values for the test sets were 0.995 and 0.950. Figures 3 and 4 show the correlation between the MLR-calculated and experimental GH values in the training and test sets for the experimental (physical activity) and control groups, respectively. The results indicate agreement between the experimental and calculated GH values.
The most important descriptors for the samples from the obese girls were calculated with the stepwise MLR method. The samples belong to two different groups; consequently, the final aim of the classification model was to determine the influence of physical exercise on obese girls. Kohonen maps, which deal with unsupervised problems, are not treated directly here since they are implicitly calculated as the Kohonen layer of CP-ANNs. To find the optimal CP-ANN settings, several networks are usually evaluated by changing the number of neurons and training epochs, and the settings are then selected by optimizing a classification parameter, such as the non-error rate on cross-validated samples. Here, settings were chosen based on personal experience, by selecting a reasonable number of epochs (200) and a number of neurons close to the number of samples (49, constituting a square map with 7 neurons on each side). Data were auto-scaled.
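The settings described above (auto-scaled data, a 7 × 7 map, 200 epochs) can be illustrated with a minimal CP-ANN sketch. This is not the toolbox implementation used in the study: the learning-rate schedule, the triangular neighbourhood and the function names `autoscale`, `train_cpann` and `predict_cpann` are all assumptions chosen to keep the example short.

```python
import numpy as np

def autoscale(X):
    """Centre each column and scale it to unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def train_cpann(X, classes, side=7, epochs=200, seed=0):
    """Train a small counterpropagation network; classes are integers 0..G-1."""
    rng = np.random.default_rng(seed)
    n, n_vars = X.shape
    n_classes = int(classes.max()) + 1
    Y = np.eye(n_classes)[classes]                               # one-hot class membership
    W = rng.uniform(X.min(), X.max(), (side * side, n_vars))     # Kohonen layer weights
    Wout = np.full((side * side, n_classes), 1.0 / n_classes)    # output layer weights
    grid = np.array([(r, c) for r in range(side) for c in range(side)])
    for e in range(epochs):
        frac = e / epochs
        lr = 0.5 * (1 - frac) + 0.01 * frac            # learning rate decreases linearly
        radius = max(side / 2 * (1 - frac), 0.5)       # neighbourhood shrinks over time
        for i in rng.permutation(n):
            # The winner is chosen from the Kohonen layer only (X information).
            winner = np.argmin(((W - X[i]) ** 2).sum(axis=1))
            d = np.abs(grid - grid[winner]).max(axis=1)          # square-ring distance on the map
            h = lr * np.clip(1 - d / (radius + 1), 0, None)      # triangular neighbourhood
            # Both layers are corrected with the same neighbourhood weights.
            W += h[:, None] * (X[i] - W)
            Wout += h[:, None] * (Y[i] - Wout)
    return W, Wout

def predict_cpann(W, Wout, X):
    """Assign each sample the class with the largest output weight in its winning neuron."""
    winners = np.argmin(((X[:, None, :] - W[None]) ** 2).sum(axis=2), axis=1)
    return Wout[winners].argmax(axis=1)
```

Repeating the training for several values of `side` and `epochs` and comparing the cross-validated performance would correspond to the optimization strategy mentioned above, which was not followed here.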
Table 4 shows part of the calculated classification indices (error rate and non-error rate) for the fitted model and for cross-validation. It is important to gain insight into the model by interpreting the relationships between samples and variables. This can be done by analyzing the Kohonen top map, that is, the neurons constituting the network. Samples can be projected onto the top map in order to evaluate the data structure and the presence of clusters or outliers, while variable importance can be analyzed by coloring the neurons according to the neuron weights, which lie between 0 and 1 (Ballabio et al., 2009). The Kohonen top map (Figure 5) represents the space defined by the neurons in which the samples are placed. Samples are visualized by randomly scattering their positions within the squares; different samples are placed far apart, while similar samples occupy the same neuron. This allows visual investigation of the data structure by analyzing the sample positions and their relationships. Since the neurons can be colored according to the weight values, it is possible to interpret the sample relationships as well as the influence of the variables.
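For reference, the two classification indices can be computed as in the sketch below. It assumes the common chemometric definition in which the non-error rate (NER) is the average of the class sensitivities and the error rate (ER) is its complement; the source does not spell out the definition, and the function names are hypothetical.

```python
import numpy as np

def non_error_rate(true, pred):
    """NER: average over classes of the fraction of samples correctly assigned."""
    true, pred = np.asarray(true), np.asarray(pred)
    sensitivities = [np.mean(pred[true == c] == c) for c in np.unique(true)]
    return float(np.mean(sensitivities))

def error_rate(true, pred):
    """ER: complement of the non-error rate."""
    return 1.0 - non_error_rate(true, pred)
```

Because each class contributes equally to the average, this definition is not dominated by the larger group when class sizes differ, as they do here (16 versus 8 samples).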
An alternative consists of performing principal component analysis (PCA) on the Kohonen weights, in order to investigate the relationships between variables and samples in a global way rather than one variable at a time (Scampicchio et al., 2008). PCA is a well-known pattern-recognition technique, which projects the data into a reduced hyperspace defined by the first significant principal components (Wold et al., 1987). The weights of the Kohonen layer can be arranged as a data matrix W with N² rows and J columns, where N is the number of neurons on each side of the map and J the number of variables. Each element w_rj of W therefore represents the weight of the j-th variable in the r-th neuron. By applying PCA to W, a loading matrix (with dimension J × F, where F is the number of significant principal components) and a score matrix (N² × F) are obtained. By comparing the corresponding loading and score plots, the relationships between variables and neurons can be evaluated. Each neuron can be assigned to a class (when dealing with CP-ANNs), so the relationships between variables and classes can also be investigated (Ballabio et al., 2007). For this reason, a GUI for calculating PCA on the Kohonen weights is provided (Figure 6).
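The matrix dimensions described above can be checked with a short sketch of PCA on a weight matrix. It uses a singular value decomposition of the column-centred weights; the function name `pca_on_weights` is hypothetical and the sketch is independent of the GUI shown in Figure 6.

```python
import numpy as np

def pca_on_weights(W, n_components=2):
    """PCA of the Kohonen weight matrix W (N^2 rows, J columns)."""
    Wc = W - W.mean(axis=0)                              # centre each variable
    U, s, Vt = np.linalg.svd(Wc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]      # N^2 x F, one row per neuron
    loadings = Vt[:n_components].T                       # J x F, one row per variable
    explained = (s ** 2 / np.sum(s ** 2))[:n_components] # fraction of variance per component
    return scores, loadings, explained
```

Plotting the first two columns of `scores` (colored by the class assigned to each neuron) next to the first two columns of `loadings` gives the pair of plots used to relate variables to classes.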
In Figures 4 to 6, the score and loading plots of the first two components (explaining 96.12% of the total information) are shown. In the score plot, each point represents a neuron of the previous CP-ANN model. It is easy to see that the neurons assigned to class 1 (control group) are all clustered at the right side of the score plot. By comparing the score and loading plots, one can then evaluate how the variables characterize this specific class. Variables 3, 4 and 5 are placed at the right of the loading plot and are thus directly correlated with class 1. In contrast, variables 1 and 2 have the largest negative loadings on the first component, so samples of class 2 are characterized by the values of these variables.

CONCLUSION
The metabolite profile was significantly altered after the 18-week training program. The same classification method can be used to evaluate several physical activity courses, ascertain their similarities and differences, and thereby design an optimal physical activity course that takes the NMR profile and GH level alterations into account. Furthermore, we were able to predict GH serum levels effectively with our linear models. NMR descriptors can therefore be used to quantitatively predict alterations in the serum level of GH, and probably of other hormones, during physical activity. Since the 0.84-0.88 ppm chemical shift region is attributable mainly to low density lipoprotein (LDL) and very low density lipoprotein (VLDL), the linear model can also support the hypothesis that increasing LDL and VLDL amounts lead to an increase in GH, and that the elevated GH, by increasing lipoprotein lipase activity, will decrease LDL and VLDL levels.

Figure 1. Structure of a Kohonen layer.

Figure 2. Structure of a counter-propagation network.

Figure 3. MLR plot of experimental vs. calculated values of GH (yA/ycalc.) for the training and test sets in physical activity experimental group 1.

Figure 4. MLR plot of experimental vs. calculated values of GH (yA/ycalc.) for the training and test sets in physical activity experimental group 2.

Figure 5. Top map of the classification of samples by Kohonen artificial neural network. 1 is assigned to control samples and 2 to experimental samples.

Figure 6. Interface for PCA on weights.

Table 2. Specifications of the selected MLR descriptors.
a Mean effect of a descriptor is the product of its mean and regression coefficient in the MLR model.

Table 3. The results for the MLR model.

Table 4. Results of analysis: error rate (ER) and non-error rate (NER) for model and cross-validation.