Determination of flavonoids and phenolic acids in the extract of bamboo leaves using near-infrared spectroscopy and multivariate calibration

1 State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital of Medical College, Zhejiang University, Hangzhou 310058, China. 2 Department of Food Science and Nutrition, Zhejiang University, Hangzhou 310029, China. 3 Department of Applied Technology, Zhejiang Economic and Trade Polytechnic, Hangzhou 310053, China. 4 Bureau of Quality and Safety Supervision for Agri-products, Ministry of Agriculture, Beijing 100125, China.


Abbreviations: EBL, extract of bamboo leaves; TF, total flavonoids; TP, total phenolic acids; RMSE, root mean square error.

INTRODUCTION
Bamboo is a giant, woody grass that has a tropical and subtropical (cosmopolitan) distribution and represents an important commodity. Bamboo leaves have been used in traditional Chinese medicine for treating fever and for detoxification for over 1000 years. Recently, some biologically active components in bamboo leaves and their potential health benefits have been widely studied. Extract of bamboo leaves (EBL) is a polyphenol-rich preparation made from the leaves of the Phyllostachys Sieb. et Zucc. family by extraction, separation, concentration and spray drying. The main bioactive components of EBL are flavonoids, phenolic acids and coumaric lactones and include the compounds orientin, homoorientin, vitexin, isovitexin, naringin-7-rhamnoglucoside, quercetin, luteolin, rutin, tricin, caffeic acid, chlorogenic acid and p-hydroxy coumaric acid (Lu et al., 2005, 2006). Animal experiments and clinical trials have confirmed that EBL has medicinal properties, such as antioxidant, anti-aging, antibacterial, antiviral and neuroprotective potential (Zhang, 1995). For this reason, EBL has been added to foods, beverages, wine, cosmetics and animal feed and has a large prospective market in Asia. However, as with the production of other plant extracts, the content of these bioactive components in EBL preparations can be affected by the origin of the bamboo leaves, the preparation technology and the storage method.
Given the structures and properties of these bioactive components in EBL, many can be determined quantitatively by photocolorimetric methods, high performance liquid chromatography (HPLC) and high performance capillary electrophoresis. Zhang (2002) determined the total content of flavonoids and phenolic acids by a photocolorimetric method using rutin and parahydroxybenzoic acid as standards. Zhang et al. (2005) and Lu et al. (2005) determined the composition of other bioactive components in EBL, including naringin-7-rhamnoglucoside, isovitexin, rutin, vitexin, homoorientin, tricin, orientin and chlorogenic acid, by comparing retention times with those of pure standards using HPLC and high performance capillary electrophoresis. However, these methods are time consuming, laborious, costly and inconvenient for online, rapid evaluation of flavonoids and phenolic acids in EBL. For improved quality control, it is necessary to develop a rapid, real-time and non-destructive detection method for the evaluation of flavonoids and phenolic acids in EBL.
Near-infrared spectroscopy records the overtone (multi-frequency) and combination (co-frequency) bands of hydrogen-containing groups in organic molecules, such as C-H, N-H and O-H (Liu et al., 2009). Although near-infrared spectroscopy is not as accurate as chromatographic and photocolorimetric methods, it is rapid, non-destructive, simple and cost-effective, and so is suitable for high-throughput and real-time product control. Near-infrared spectroscopy has been widely used as an alternative to wet chemistry procedures for qualitative and quantitative analysis in the agricultural, pharmaceutical, food, textile, cosmetic and polymer production industries (Wang et al., 2008a; Woodcock et al., 2008; Sinija and Mishra, 2008; Chen et al., 2009; Shao and He, 2009). However, there have been few reports on the simultaneous determination of flavonoid and phenolic acid content by near-infrared spectroscopy.
The objectives of this study were threefold: (1) to validate the feasibility of using near-infrared spectroscopy to determine the flavonoids and phenolic acids in EBL; (2) to acquire the best calibration models and to confirm the effective wavelengths and validate the prediction performance; (3) to realize the potential of this method for commercial production.

Extract of bamboo leaves (EBL)
In this study, EBL samples were obtained from a local factory (Hangzhou U-mate technology Co., Ltd.). The raw material was allowed to dry in the sun, then was pulverized and sieved through 20 mesh to yield a fine powder. The powder was extracted with deionized water at a ratio of 1:10 (w/v) at 100 ± 1°C for 1.5 h and then filtered. The extract was concentrated by removing the water under vacuum and then spray-dried. The extract samples (each about 1 L) were obtained from different sections of the factory production line. All EBL samples were stored in the laboratory at a constant temperature of 25 ± 1°C for more than 48 h to equalize the temperature and then filtered through one layer of filter paper. Sixty-one samples were used directly for near-infrared spectroscopy and for the physical and chemical measurements that provided the reference values.
Fractions of these samples, which came from 10 different batches of bamboo leaf extract, were also used as the calibration (46 samples) and validation (15 samples) sets. Samples were diluted 1:5 with water before HPLC analysis.

Reference methods for flavonoids and phenolic acids
Total flavonoids and phenolic acids were determined using a photocolorimetric method (Zhang, 2002) with rutin and parahydroxybenzoic acid as standards (Sigma-Aldrich Co., Ltd., USA).
In order to determine whether one pure constituent or the sum of four constituents could be determined by near-infrared spectroscopy, reversed phase HPLC was used to separate and quantify the main flavonoids (homoorientin, orientin, isovitexin and vitexin) and phenolic acids (chlorogenic acid, caffeic acid, p-coumaric acid and ferulic acid) against known standards (Sigma-Aldrich Co., Ltd., USA). To detect the main polyphenol components of EBL, HPLC separations were performed using a Waters 2695 HPLC chromatography system (Waters, Milford, MA, USA) equipped with a Luna C18 reversed phase column (5 µm, 250 mm × 4.6 mm ID; Phenomenex, Torrance, CA, USA) at 40°C and a detection wavelength of 330 nm. The mobile phase was methyl cyanide (A) and acetic acid solution diluted 1:100 in water (B). The flow rate was 1.0 ml/min and the gradient elution program was as follows: 0 to 15 min, A 15%, B 85%; 15 to 25 min, A 15 to 40%, B 85 to 60%; 25 to 34 min, A 40%, B 60%; 34 to 40 min, A 40 to 15%, B 60 to 85%.
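The gradient program above can be written as a small lookup, which is convenient when checking the mobile phase composition at a given time. This is a minimal sketch: the breakpoints come from the text, but the assumption that each ramp is linear, and the names `GRADIENT`, `percent_a` and `composition`, are illustrative and not part of the original method description.

```python
# Breakpoints of the gradient program: (time in min, percent of mobile phase A).
# Percent B is taken as the remainder, 100 - A.
GRADIENT = [(0.0, 15.0), (15.0, 15.0), (25.0, 40.0), (34.0, 40.0), (40.0, 15.0)]


def percent_a(t_min):
    """Return the percentage of A at t_min, assuming linear ramps between breakpoints."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the 0 to 40 min gradient program")
    for (t0, a0), (t1, a1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)


def composition(t_min):
    """Return (percent A, percent B) at t_min."""
    a = percent_a(t_min)
    return a, 100.0 - a


if __name__ == "__main__":
    for t in (0, 10, 20, 30, 37, 40):
        print(t, composition(t))
```

Under the linear-ramp assumption, the composition at 20 min is halfway up the first ramp (27.5% A, 72.5% B).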

Spectral acquisition
A Nexus FT-IR spectrometer (Thermo Nicolet, Madison, WI, USA) was used for spectral scanning from 12,000 to 4,000 cm−1. The resolution of the spectrometer was 4.0 cm−1 in transmission mode.
The spectral data were obtained by first adding an appropriate volume of sample (about 4/5 of the cell volume) into the fixed liquid cell (2.0 mm light path length). The background value for spectral scanning was obtained using an empty cell (air). A new background value was obtained every 100 min. The scan result for each sample was the average of up to 64 individual scans, depending on the signal-to-noise ratio. The spectrometer was equipped with OMNIC spectral acquisition software (Thermo Nicolet, Madison, WI, USA) that controlled acquisition, storage and analysis.

Preprocessing and partial least squares analysis
For partial least squares analysis of the results, TQ Analyst V6 analysis software (Thermo Nicolet, Madison, WI, USA) was used.
To obtain the optimal performance of the partial least squares models, the spectra were preprocessed prior to calibration. The pretreatments included 1st and 2nd derivatives, the Norris derivative filter and the Savitzky-Golay filter. The 1st and 2nd derivatives were used to reduce baseline shift, whereas the Norris derivative filter and the Savitzky-Golay filter were applied to decrease noise. The Norris derivative filter, however, could be employed only after the 1st or 2nd derivative was applied. Thus, the single or combined pretreatments included 1st or 2nd derivative only, 1st or 2nd derivative combined with the Norris derivative filter, the Savitzky-Golay filter only, or all filtering methods. Partial least squares is a widely utilized regression method in spectroscopic analysis (Geladi and Kowalski, 1986) that considers both the spectral data matrix and the target chemical properties matrix.
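The effect of the derivative and smoothing pretreatments can be illustrated with a short sketch. It is not the TQ Analyst implementation: the 5-point window and quadratic polynomial of the Savitzky-Golay smoother are assumptions (the window and order used in the study are not stated), the Norris derivative filter is not shown, and the function names are hypothetical.

```python
def first_derivative(y, step=1.0):
    """Central-difference 1st derivative; the two end points use one-sided differences.

    A constant baseline offset added to y cancels out in the differences.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two points")
    d = [0.0] * n
    d[0] = (y[1] - y[0]) / step
    d[-1] = (y[-1] - y[-2]) / step
    for i in range(1, n - 1):
        d[i] = (y[i + 1] - y[i - 1]) / (2.0 * step)
    return d


# Standard Savitzky-Golay smoothing coefficients for a 5-point window and a
# quadratic (or cubic) polynomial; the weights are divided by 35.
SG5 = (-3.0, 12.0, 17.0, 12.0, -3.0)


def savgol5(y):
    """Smooth y with the 5-point quadratic Savitzky-Golay filter.

    The first and last two points are left unchanged in this sketch.
    """
    out = [float(v) for v in y]
    for i in range(2, len(y) - 2):
        out[i] = sum(c * y[i + k - 2] for k, c in enumerate(SG5)) / 35.0
    return out


if __name__ == "__main__":
    spectrum = [0.10, 0.12, 0.18, 0.30, 0.26, 0.15, 0.11]  # illustrative values only
    print(savgol5(spectrum))
    print(first_derivative(spectrum))
```

Because the smoother fits a local quadratic, it reproduces any quadratic signal exactly while damping point-to-point noise, whereas differencing removes a constant baseline at the cost of amplifying noise, which is consistent with combining a derivative with a smoothing filter.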
In the development of the calibration models, partial least squares was chosen and full cross-validation was used to validate model quality and to prevent over-fitting of the calibration model. Model assessment and predictive capability were evaluated by the following indices: correlation coefficient (R), root mean square error of prediction (RMSEP), bias, slope and offset (Liu et al., 2009). The main evaluation indices in this paper were R and RMSEP. In calculating the two indices, Ci' is the value calculated by multivariate calibration methods; Ci is the value determined by reference methods; C̄ is the mean of Ci; and n is the number of samples.
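The expressions for R and RMSEP are not reproduced in this text. As an assumption, and not necessarily the authors' exact equations, one common form that uses only the symbols defined above (Ci', Ci, the mean of Ci and n) is:

```latex
R = \sqrt{1 - \frac{\sum_{i=1}^{n} \left( C_i' - C_i \right)^2}{\sum_{i=1}^{n} \left( C_i - \bar{C} \right)^2}},
\qquad
\mathrm{RMSEP} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( C_i' - C_i \right)^2}
```

In this form, R approaches 1 and RMSEP approaches 0 as the calculated values approach the reference values.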
Generally, a good model should have a high correlation coefficient and a low RMSEP. However, RMSEP should be judged relative to the root mean square error of the reference values, and the former should be less than 1/2 of the latter.
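The two indices and the rule of thumb above can be computed as in the following sketch. Several points are assumptions: the formula used for R (one common form based on the residual and total sums of squares), the reading of "root mean square errors of the reference values" as the sample standard deviation of the reference values, and all function names and example numbers.

```python
import math


def rmsep(pred, ref):
    """Root mean square error between calculated and reference values."""
    n = len(ref)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / n)


def corr_coef(pred, ref):
    """R = sqrt(1 - SS_res / SS_tot), an assumed form of the correlation coefficient."""
    mean_ref = sum(ref) / len(ref)
    ss_res = sum((p - r) ** 2 for p, r in zip(pred, ref))
    ss_tot = sum((r - mean_ref) ** 2 for r in ref)
    return math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))


def std_dev(ref):
    """Sample standard deviation of the reference values (n - 1 in the denominator)."""
    mean_ref = sum(ref) / len(ref)
    return math.sqrt(sum((r - mean_ref) ** 2 for r in ref) / (len(ref) - 1))


def passes_half_sd_rule(pred, ref):
    """True when RMSEP is below half the spread of the reference values."""
    return rmsep(pred, ref) < 0.5 * std_dev(ref)


if __name__ == "__main__":
    reference = [40.0, 50.0, 60.0, 70.0, 80.0]  # hypothetical reference values, mg/ml
    predicted = [41.0, 49.0, 61.0, 69.0, 81.0]  # hypothetical model output, mg/ml
    print(corr_coef(predicted, reference), rmsep(predicted, reference))
    print(passes_half_sd_rule(predicted, reference))
```

A model whose RMSEP exceeds the spread of the reference values predicts no better than quoting the mean for every sample.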

Least squares-support vector machine
Least squares-support vector machine, a state-of-the-art learning algorithm, has a solid theoretical foundation in statistical learning methods (Liu et al., 2009). It is capable of dealing with linear and nonlinear multivariate analysis and of resolving these problems relatively quickly (Vapnik, 1995; Suykens and Vandewalle, 1999). Furthermore, a support vector machine can learn in a high-dimensional feature space with fewer training data. Instead of quadratic programming and the traditional empirical risk minimization principle, the least squares-support vector machine employs a set of linear equations and the structural risk minimization principle to obtain the support vectors and to avoid over-fitting. In this study, the full length spectrum was applied as the input data and the performance of the models was assessed mainly by R and RMSEP. The free least squares-support vector machine lab 1.5 toolbox (Suykens, Leuven, Belgium) was applied with Matlab V7.0 (The MathWorks, Natick, USA) to develop the calibration models.
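To make the "set of linear equations" concrete, the sketch below trains a least squares-support vector machine regressor by solving the standard linear system of Suykens and Vandewalle, in which the first row constrains the coefficients to sum to zero and the remaining rows contain the kernel matrix plus 1/gamma on the diagonal. It is a toy illustration and not the toolbox code used in the study: the RBF kernel, the parameter names `gamma` and `sigma2`, their values and the data are all assumptions.

```python
import math


def rbf_kernel(a, b, sigma2):
    """RBF kernel exp(-||a - b||^2 / sigma2); the kernel choice is an assumption."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / sigma2)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def lssvm_fit(xs, ys, gamma, sigma2):
    """Return (bias, alphas) from the LS-SVM linear system.

    Row 0:      sum(alphas) = 0
    Row i >= 1: bias + sum_j alpha_j K(x_i, x_j) + alpha_i / gamma = y_i
    """
    n = len(xs)
    a = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(xs[i], xs[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma
        a.append(row)
    sol = solve(a, [0.0] + list(ys))
    return sol[0], sol[1:]


def lssvm_predict(x, xs, bias, alphas, sigma2):
    """Predict the target for one input vector x."""
    return bias + sum(al * rbf_kernel(x, xi, sigma2) for al, xi in zip(alphas, xs))


if __name__ == "__main__":
    xs = [[0.0], [1.0], [2.0], [3.0]]  # toy inputs; real inputs would be full spectra
    ys = [0.0, 1.0, 4.0, 9.0]          # toy nonlinear targets
    bias, alphas = lssvm_fit(xs, ys, gamma=1e6, sigma2=1.0)
    print([lssvm_predict(x, xs, bias, alphas, 1.0) for x in xs])
```

Training reduces to one linear solve, with no quadratic programming step; the regularization parameter controls how closely the training targets are reproduced.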

Reference values
The reference values of total flavonoids (TF) and phenolic acids (TP) were obtained using photocolorimetric methods as described. The main flavonoids (homoorientin, orientin, isovitexin and vitexin) and phenolic acids (chlorogenic acid, caffeic acid, p-coumaric acid and ferulic acid) were also determined by HPLC methods as described. The calibration parameters for TF, TP, homoorientin, orientin, isovitexin, vitexin, chlorogenic acid, caffeic acid, p-coumaric acid and ferulic acid assayed by the photocolorimetric method or by HPLC are shown in Table 1. The five-point calibration curve for each analyte was obtained by linear regression analysis, which revealed high linearity (correlation coefficients over 0.999) for all analytes. This indicated that the chemical values of flavonoids and phenolic acids given by these linear regression functions are accurate and of high predictive value within the calibration range.
Using the calibration equations, the concentrations of flavonoids and phenolic acids could be determined for further spectral calibration analysis. The statistical values for TF, TP, the sum of homoorientin, orientin, isovitexin and vitexin (SF) and the sum of chlorogenic acid, caffeic acid, p-coumaric acid and ferulic acid (SP) in EBL are shown in Table 2 and Figure 1. The highest values were 99.5 mg/ml TF, 90.8 mg/ml TP, 21.5 mg/ml SF and 7.5 mg/ml SP, while the lowest values were 36.7, 40.6, 4.9 and 2.3 mg/ml, respectively. A broad range of concentrations was observed in the calibration and validation sets, indicating that the samples in both sets are representative of the range of possible chemical contents. These data will aid in the development of a stable and robust calibration model.
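A five-point calibration curve of the kind described above amounts to an ordinary least squares line, which is then inverted to convert a measured response into a concentration. The sketch below shows this step; the standard concentrations, the responses and the function names are hypothetical and do not come from Table 1.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit y = slope * x + intercept; also returns Pearson r."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


def concentration(response, slope, intercept):
    """Invert the calibration line to get the concentration for a measured response."""
    return (response - intercept) / slope


if __name__ == "__main__":
    standards = [10.0, 20.0, 30.0, 40.0, 50.0]  # hypothetical standard concentrations
    responses = [0.12, 0.22, 0.32, 0.42, 0.52]  # hypothetical detector responses
    slope, intercept, r = fit_line(standards, responses)
    print(slope, intercept, r)
    print(concentration(0.27, slope, intercept))
```

The inversion is only meaningful for responses that fall within the range spanned by the standards, which is why the text restricts the claim of accuracy to the calibration range.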

Spectral features
The transmittance spectra of the samples and of the backgrounds are shown in Figure 2. To obtain the transmittance spectra, a relatively short light path length of 2.0 mm was chosen. In addition, air was used as the background because of the intense absorbance of the solvent water in the near-infrared region (Yu, 2007). A spectral resolution of 4 cm−1 was used in this study (Wang et al., 2008b; Yang et al., 2003; Yu et al., 2007) and individual spectra were based on the average of up to 64 scans to increase the signal-to-noise ratio and to reduce error.
There were troughs in the transmittance spectra around wave numbers 6,897 and 5,128 cm−1 that were related to the first overtone of the O-H stretch and a combination of stretch and deformation of the O-H group in water (Murray, 1986).
The small transmittance troughs around 5,550 and 8,500 cm−1 might be associated with stretch and deformation of the specific structures of flavonoids and phenolic acids.
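Because near-infrared water bands are often quoted in nanometres, it can help to convert the wave numbers above to wavelengths. The conversion is simply wavelength (nm) = 10^7 / wave number (cm−1); the function name below is illustrative.

```python
def wavenumber_to_nm(wavenumber_cm1):
    """Convert a wave number in cm^-1 to a wavelength in nm (1 cm = 1e7 nm)."""
    if wavenumber_cm1 <= 0:
        raise ValueError("wave number must be positive")
    return 1.0e7 / wavenumber_cm1


if __name__ == "__main__":
    for nu in (12000, 8500, 6897, 5550, 5128, 4000):
        print(nu, round(wavenumber_to_nm(nu), 1))
```

The two water troughs at 6,897 and 5,128 cm−1 correspond to about 1,450 and 1,950 nm, and the scanned range of 12,000 to 4,000 cm−1 corresponds to roughly 833 to 2,500 nm.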

Partial least squares models
Partial least squares models with different pretreatments were developed for the determination of TF, TP, SF and SP in EBL. All of the pretreatments have been mentioned earlier, but the performance of partial least squares models with the 1st or 2nd derivative pretreatment and with the 1st or 2nd derivative combined with Savitzky-Golay filtering was inadequate. Thus, we only compared the performance of the partial least squares models with 1st or 2nd derivative combined with or without the Norris derivative (ND) filter, the Savitzky-Golay (SG) filter alone, or SG with all pretreatments (Table 3). For flavonoids, 46 samples were used in the calibration set and 15 samples in the validation set; for phenolic acids, the calibration set and validation set contained 42 and 14 samples, respectively.
Judged by the prediction performance evaluation indices R and RMSEP, the optimal models were those that employed the SG pretreatment for both flavonoids and phenolic acids, possibly because SG can decrease the noise of the spectrum. In contrast, models with the 1st or 2nd derivative pretreatment, or with the 1st or 2nd derivative combined with the Norris derivative filter, may have introduced significant noise into the spectra at the highest and lowest wave numbers. The optimal R and RMSEP for the validation set were 0.9124 and 4.97 for TF and 0.9325 and 4.33 for TP. However, the partial least squares models of SF and SP were not practical, because the R values for the two models were only 0.5229 and 0.5059 and the root mean square errors of calibration and validation were quite high (3.49 and 3.80 for SF and 1.19 and 1.15 for SP), while the RMSEPs were even higher than the standard deviations. These high RMSE values for SF and SP can make the model calibration difficult when the concentrations are below 5% of the totals, as is usually the case.

Least squares-support vector machine models and comparison with partial least squares models
Least squares-support vector machine models for the determination of TF, TP, SF and SP in EBL were developed using the least squares-support vector machine lab 1.5 toolbox with Matlab V7.0. The performances are shown in Table 4. The inputs were the full length spectra for flavonoids and phenolic acids. For flavonoids, 46 samples were used as the calibration set and 15 samples were used for the validation set. For phenolic acids, the calibration set and validation set contained 42 and 14 samples, respectively. As shown, the predicted values in both the calibration sets and the validation sets were quite accurate for both TF and TP, as reflected in particular by the prediction performance evaluation indices R and RMSEP. As with the partial least squares models, the performance of the least squares-support vector machine models of SF and SP was unfavorable.
Compared with the partial least squares models, the results in Table 4 indicated that least squares-support vector machine models performed better. One possible reason for the success of the least squares-support vector machine models could be that there was useful nonlinear information in the spectral data, whereas partial least squares only dealt with the linear relationships between the spectral data and chemical compositions (Liu et al., 2009). However, the R and RMSEC of TP using least squares-support vector machine models were higher than those obtained from the partial least squares models, possibly because least squares-support vector machine can learn in a high-dimensional feature space with fewer training data and deal with linear and nonlinear multivariate analysis, while partial least squares dealt with fewer factors. This is demonstrated clearly when comparing the partial least squares models with the 1st or 2nd derivative pretreatments. In conclusion, the least squares-support vector machine is quite a robust learning algorithm for the determination of TF and TP. The R and RMSEP for the validation set were 0.9418 and 3.91 for TF and 0.9535 and 3.61 for TP.

Conclusion
Near-infrared spectroscopy combined with least squares-support vector machine modeling was successfully utilized for the determination of TF and TP in EBL. The partial least squares models and least squares-support vector machine models were developed and compared to determine TF, TP, SF and SP in EBL, and the optimal prediction performance was achieved using raw spectra for TF and TP. The results indicated that least squares-support vector machine models performed slightly better than partial least squares models for the prediction of TF and TP. For the least squares-support vector machine models of TF, the correlation coefficients of calibration (R(cal)) and validation (R(val)) were 0.9998 and 0.9418, while the RMSEC and RMSEP values were 0.05 and 3.91. For TP, the corresponding values were 0.9778, 0.9535, 2.49 and 3.61. The overall results indicated that near-infrared spectroscopy combined with least squares-support vector machine could be applied as a rapid, real-time and non-destructive technique for the determination of TF and TP in EBL. The results might be useful for in situ monitoring of EBL chemical composition for quality control.

Figure 1. HPLC spectra of two typical samples.

Figure 2. Near infrared spectra of samples and the background.

Table 1. Values of flavonoids and phenolic acids assayed by reference method.

Table 2. Calibration equations of flavonoids and phenolic acids.

Table 3. The prediction results of flavonoids and phenolic acids in calibration and validation sets by partial least squares models.

Table 4. The prediction results of flavonoids and phenolic acids in calibration and validation sets in least squares-support vector machine models and the comparison with the partial least squares models.