Analysis of knee osteoarthritis by using fuzzy c-means clustering and SVM classification

In this study, a technique is established to extract the cross-sectional fat (CSFA), muscle (CSMA), femur (CSFEMA) and bone (CSBA) areas of the thigh in knees presenting signs of osteoarthritis (OA). These morphometric measures are obtained by segmentation based on fuzzy C-means (FCM) clustering and are used as features. MRI data from 103 subjects, covering normal knees and four levels of OA severity, are used. Subjects are allocated into five OA-severity categories according to the Kellgren–Lawrence (KL) scale, with KL values 0 to 4 labelled “normal”, “doubtful”, “minimal”, “moderate” and “severe”, respectively. A support vector machine (SVM) classifier is applied to the morphometric features to detect OA and to examine the relation between the KL scores and the morphology of the thigh. Considering the number of subjects in each class and the difficulty of grading OA severity, different groupings are tried to improve the classification accuracy: five individual groups, and two groups (KL0-1 as group one and KL3-4 as group two; or KL0-1 as group one and KL2, 3 and 4 as group two). The best classification accuracy is achieved when the KL scores are grouped into two main classes. The first class represents the less severe cases, with KL scores of 0 and 1. The second class is composed of cases with KL grades greater than or equal to 2. The SVM classification accuracy of 72% is a satisfactory result given the difficulty of the application domain: the morphology seen in MRI images varies considerably from person to person, which makes the analysis of these morphometric measures hard. This result may encourage further research towards a more precise analysis of OA and hence a more accurate prognosis in clinical practice.


INTRODUCTION
Osteoarthritis (OA) is a widespread joint disease that causes degenerative alterations in the knee as well as other joints. It is a condition in which low-grade inflammation develops in the joints, and it affects mainly older people. It is therefore a highly prevalent chronic health condition that causes substantial disability in late life in most developed countries. It is reported that about 10% of the total world population, and more than half of the people over the age of 50, suffer from OA (Shamir et al., 2009; Alkan et al., 2010). It causes pain, swelling and reduced motion in the joints. Healthy cartilage, the slippery tissue that covers the ends of the bones in a joint, absorbs the shock of movement. OA breaks down the cartilage, and when cartilage is lost in a joint the bones rub together. Over time, this cartilage loss can permanently damage the joint. Knee OA affects about 30% of those over 65 years old and is as frequently associated with disability as heart disease and chronic lung disease. It is characterized by pathological features including joint space narrowing, osteophyte formation and joint angulation (Felson and Zhang, 1998; Yelin and Callahan, 1995). Factors such as being overweight, ageing or injuring a joint may contribute to OA, although no direct link has been established with these factors. OA gradually worsens with time and currently no definitive cure exists. However, OA therapies can relieve pain and help patients remain active (Prescott et al., 2009; He et al., 2002).
Magnetic resonance imaging (MRI) has been used to quantify cartilage morphology, volume and thickness, and focal defects, and may reflect changes in the biochemical composition of articular cartilage in a joint. MRI assessment of knee OA comprises diagnosis, evaluation of severity, and monitoring of the progression of structural alterations related to the disease (Prescott et al., 2009; He et al., 2002). Several qualitative or semi-quantitative grading systems have been proposed for assessing knee OA, with the Kellgren and Lawrence (KL) grading scale (Boniatis et al., 2006) being considered the gold standard despite its deficiencies. The KL system is a validated method of classifying individual joints into one of five grades, with 0 representing normal and 4 being the most severe radiographic disease. Since the parameters used for OA classification are continuous, experts may evaluate OA differently and therefore reach different conclusions regarding its presence and severity. This introduces a certain degree of subjectivity into the diagnosis (Croft, 2005). Many researchers have studied risk factors for knee OA. The known risk factors, derived largely from cross-sectional studies, include obesity, previous knee injury, selected forms of physical activity, a family history of the disease and the role of the quadriceps femoris (Zhou et al., 2003; Changming et al., 2007; Gür and Çakin, 2003; Berry et al., 2008; Prescott et al., 2009b; 2010). A semi-automated segmentation method for MR images of the quadriceps muscles was proposed by Prescott et al. (2010). They used a template-based initialization of a level set-based segmentation approach. The mean ± standard deviation of the Zijdenbos similarity index (ZSI) against two different manual readers was reported as: rectus femoris, 0.78 ± 0.12; vastus intermedius, 0.79 ± 0.10; vastus lateralis, 0.82 ± 0.08; and vastus medialis, 0.69 ± 0.16. They reported that this work will enable researchers to further explore the correlation between individual muscles of the quadriceps and the risk of progression of OA. The relation between cross-sectional area and concentric and eccentric torques in the quadriceps and hamstring muscles has also been investigated (Gür and Çakin, 2003). Gür and Çakin (2003) tried to determine how functional capacity relates to pain, muscle mass and concentric and eccentric knee torques in women who have bilateral OA of the knee. They stated that cross-sectional area could not be considered as a single predictor of peak torque for either the quadriceps or the hamstring muscles.
Semi-automatic meniscus segmentation in series of MR images has been investigated for normal knees and for knees with moderate OA (Swanson et al., 2010). The segmentation method was developed and then evaluated on 10 baseline MR images obtained from subjects with no evidence, symptoms or risk factors of knee OA, and on 14 images from subjects with established knee OA enrolled in the osteoarthritis initiative (OAI). They claimed that the semi-automatic segmentation method produced accurate and consistent segmentations of the meniscus when compared to manual segmentations in the assessment of normal menisci in mild to moderate OA. A fully automated method for the segmentation of the femur in axial MR images, and its use in the analysis of imaging biomarkers for OA, was proposed by Madabhushi and Udupa (2005). They used a method based on anatomical constraints, implemented using morphological operations, to extract the femur medulla, and a level set evolution to extract the femur cortex. It is reported that the average agreement of the automated segmentation algorithm with ground truth manual segmentations was 0.940 ± 0.034, calculated using the Zijdenbos similarity index (ZSI).
A pooled-variance t-test analysis was used, and significant associations between the clinical measure of OA severity (KL grade) and both the cross-sectional area (CSA) of the femur medulla (p = 0.02) and the ratio of the femur medulla CSA to the femur cortex CSA (p = 0.04) were found for women. However, no significant association was found between femur measurements and OA severity for men. Despite the prevalence of knee OA, computer-based tools for OA detection based on single knee MRI images are not sufficient for either clinical or research purposes. Because OA is so common, there is a pressing need for clinical and scientific tools that can consistently detect the presence and severity of OA (Shamir et al., 2009). One potential risk factor that has not been particularly well studied is the role of the cross-sectional fat (CSFA), muscle (CSMA), femur (CSFEMA) and bone (CSBA) areas in knee OA. In this study, a computerized method was developed to explore the relationships between the clinical measure of OA severity (KL grade) and morphometric measures of the thigh for knee OA from MR images. Preprocessing, fuzzy C-means clustering and morphological filtering form the essentials of the proposed segmentation technique. After calculation of the related cross-sectional areas (CSAs) of the thigh, these morphometric measures are used as inputs to a classification system to look for significant associations between the KL grade and the CSAs of the thigh. In this way, a method for the detection of OA using computer-based image analysis of MRI images is described. While at this point it is not suggested that the proposed method can completely replace a human reader, it can serve as a decision-support tool, and it can also be applied to the classification of large numbers of MRIs for clinical research trials.

MATERIALS
In this study, the data are acquired from the progression cohort of the osteoarthritis initiative's (OAI) public use dataset (www.oai.ucsf.edu). MRI data from 103 subjects of the progression cohort are used for the analysis (Alkan et al., 2010; Prescott, 2010). Subjects and their OA descriptions are given in Table 1. T1-weighted axial scans of the thigh were acquired at 5 mm intervals in the range from 10 to 17 cm proximal to the medial femoral epiphysis of the right knee. The KL grades are defined as follows (Koktas, 2008):
Grade 0: None: Indicates a definite absence of OA.
Grade 1: Doubtful: Doubtful narrowing of joint space and possible osteophytic lipping.
Grade 2: Minimal: Definite osteophytes and possible narrowing of joint space.
Grade 3: Moderate: Moderate multiple osteophytes, definite narrowing of joint space, some sclerosis and possible deformity of bone contour.
Grade 4: Severe: Large osteophytes, marked narrowing of joint space, severe sclerosis and definite deformity of bone contour.

METHODS
In this study, the relationship between the cross-sectional areas of the thigh and the severity of OA is explored. For the classification and analysis, the CSFA, CSMA, CSFEMA and CSBA of the thigh are calculated as features for the related level of OA severity. These morphometric features are then applied to the classification system, which can be used in the prognostic stage of the disease. Figure 1 gives an overview of the proposed method.

Intensity standardization and normalization
Some MRI images show fluctuations in intensity levels, which make segmentation harder. To solve this problem, the MR image intensities are standardized in the first analysis step. Because MR image intensities vary across subjects whose images were acquired at different times and at different sites, intensity standardization is essential for the consistent application of the developed algorithm. This process is achieved through bias field correction followed by normalization. Bias field correction removes the gradually changing multiplicative bias field caused by magnetic field inhomogeneities, using the nonparametric non-uniform intensity normalization (N3) algorithm. Normalization rescales the bias-field corrected images to the range [0, 1]; the top 0.05% of intensities (outliers) are excluded from the scaling operation and are instead set directly to one. Bias field correction followed by intensity normalization has been used and suggested for intensity standardization in previous studies on the same data (Prescott et al., 2010; Alkan et al., 2010).
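The normalization step can be sketched as follows. This is a minimal illustration and not the implementation used in the study: it assumes the input has already been bias-field corrected (N3 is not reproduced here), and the function name `normalize_intensity`, the `top_fraction` parameter and the choice of the image minimum as the lower bound are assumptions made for the example.

```python
import numpy as np

def normalize_intensity(image, top_fraction=0.0005):
    """Rescale a bias-corrected image to [0, 1].

    The brightest `top_fraction` of pixels (0.05% by default) is treated
    as outliers: it is left out of the scaling range and set to one.
    """
    img = np.asarray(image, dtype=float)
    low = img.min()                               # assumed lower bound
    high = np.quantile(img, 1.0 - top_fraction)   # upper bound ignoring outliers
    if high <= low:                               # constant image: nothing to scale
        return np.zeros_like(img)
    scaled = (img - low) / (high - low)
    return np.clip(scaled, 0.0, 1.0)              # outliers saturate at one
```

Clipping after the linear rescale is what maps every intensity above the 99.95th percentile to exactly one.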

Fuzzy C-means clustering
Clustering is a technique for separating scattered data into several groups. The patterns that are most similar to each other are assigned to the same cluster. In image segmentation, cluster analysis partitions a collection of data points into a number of subgroups, where the objects inside a cluster (a subgroup) show a certain degree of resemblance. After the intensity standardization and morphological operations, the images are segmented using the well-known fuzzy C-means (FCM) clustering algorithm. Fuzzy clustering takes into account the overlapping of the clusters (Bezdek, 1981). If c is a positive integer greater than one and X = {x1, x2, ..., xn} is a data set, a partition of the data set into c clusters is represented by mutually disjoint sets X1, X2, ..., Xc such that X1 ∪ X2 ∪ ... ∪ Xc = X, or equivalently by the indicator functions µ1, µ2, …, µc. A fuzzy c-partition {µ1, µ2, …, µc} instead allows µi(x) to take on values in the interval [0, 1] such that:

\[ \sum_{i=1}^{c} \mu_i(x) = 1 \quad \text{for all } x \in X \qquad (1) \]

In this case, {µ1, µ2, …, µc} is called a fuzzy c-partition of the data X (Ruspini, 1969). If an object has the highest degree of membership in a cluster, this object is selected as a prototype of that cluster. FCM clustering produces a partition by minimizing the objective function:

\[ J_{FCM}(\mu, v) = \sum_{i=1}^{c} \sum_{j=1}^{n} \mu_{ij}^{m} \, \lVert x_j - v_i \rVert^{2} \qquad (2) \]

where µ is a partition with µij = µi(xj), the weighting exponent m is a fixed number greater than one establishing the degree of fuzziness, and v = (v1, v2, …, vc) is the set of cluster centers (Hung et al., 2006; Ruspini, 1969). The FCM algorithm iterates through the necessary conditions for minimizing JFCM with the following update equations:

\[ v_i = \frac{\sum_{j=1}^{n} \mu_{ij}^{m} \, x_j}{\sum_{j=1}^{n} \mu_{ij}^{m}} \qquad (3) \]

and

\[ \mu_{ij} = \left[ \sum_{k=1}^{c} \left( \frac{\lVert x_j - v_i \rVert}{\lVert x_j - v_k \rVert} \right)^{2/(m-1)} \right]^{-1} \qquad (4) \]

The FCM clustering is used to reveal the cross-sectional areas of the fat, muscle, bone and femur regions in the MRI image.
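The alternating updates of the standard FCM algorithm (cluster centers, then memberships) can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the code used in the study: the function name `fuzzy_c_means`, the random initialization, the default m = 2 and the stopping tolerance are all choices made for the example. For image segmentation, `x` would be the flattened pixel intensities.

```python
import numpy as np

def fuzzy_c_means(x, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard FCM. Returns memberships u (c x n) and centers v (c x d)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:                       # treat 1-D input as n samples of 1 feature
        x = x[:, None]
    rng = np.random.default_rng(seed)
    u = rng.random((c, x.shape[0]))       # assumed: random initial memberships
    u /= u.sum(axis=0, keepdims=True)     # memberships of each sample sum to one
    for _ in range(n_iter):
        um = u ** m
        # Center update: membership-weighted mean of the samples.
        v = (um @ x) / um.sum(axis=1, keepdims=True)
        # Distance of every sample to every center, shape (c, n).
        d = np.linalg.norm(x[None, :, :] - v[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)             # avoid division by zero
        # Membership update: normalized inverse distances.
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(u_new - u).max() < tol
        u = u_new
        if converged:
            break
    return u, v
```

A hard segmentation can then be obtained by assigning each pixel to the cluster with the largest membership, for example `labels = u.argmax(axis=0)`.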

Morphological filtering and morphometric measurements
Morphological operations such as image filling, opening and closing are applied to eliminate the unwanted small areas in the segmented regions (Gonzalez et al., 2004), so that the fat, muscle, femur and bone regions are segmented clearly. The areas of these regions are calculated and used as morphometric measures for the classification step. These morphometric results are used as the features for the related OA severity levels and are applied to a support vector machine (SVM) classifier to see whether these measures are related to the severity levels of OA.
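A minimal sketch of binary opening and closing, followed by an area measurement, is given below. It is only an illustration: the 3x3 structuring element, the helper names (`erode`, `dilate`, `opening`, `closing`, `cross_sectional_area`) and the `pixel_area_mm2` parameter are assumptions, the study does not state the structuring element it used, and hole filling is not shown.

```python
import numpy as np

def _neighbourhoods(mask):
    """Return the nine 3x3-shifted views of a boolean mask (padded with False)."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    h, w = mask.shape
    return [padded[r:r + h, c:c + w] for r in range(3) for c in range(3)]

def erode(mask):
    # A pixel survives only if its whole 3x3 neighbourhood is set.
    return np.logical_and.reduce(_neighbourhoods(mask))

def dilate(mask):
    # A pixel is set if any pixel in its 3x3 neighbourhood is set.
    return np.logical_or.reduce(_neighbourhoods(mask))

def opening(mask):
    """Erosion then dilation: removes small isolated regions."""
    return dilate(erode(np.asarray(mask, dtype=bool)))

def closing(mask):
    """Dilation then erosion: closes small gaps inside a region."""
    return erode(dilate(np.asarray(mask, dtype=bool)))

def cross_sectional_area(mask, pixel_area_mm2=1.0):
    """Area of a segmented region: pixel count times the area of one pixel."""
    return float(np.count_nonzero(mask)) * pixel_area_mm2
```

Because the mask is padded with background, regions touching the image border are eroded there as well; for thigh cross-sections surrounded by background this should rarely matter.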

Support vector machine (SVM)
Support vector machines (SVM), introduced by Vapnik, have been effectively applied to classification and function estimation problems within the context of statistical learning theory and structural risk minimization. The standard SVM is constructed to separate training data into two classes (Wang et al., 2009; Wang et al., 2011). The SVM treats the input set as an n-dimensional feature vector space and tries to find the (n-1)-dimensional hyperplane that separates the space into two parts while maximizing the minimum distance between the hyperplane and any data point. The n-dimensional input data xi (i = 1, 2, …, l, where l is the number of samples) are labeled as yi = 1 for class 1 and as yi = -1 for class 2. For linearly separable data, a hyperplane f(x) = 0 can be defined as:

\[ f(x) = w \cdot x + b = 0 \]

where w is an n-dimensional vector and b is a scalar. These parameters determine the location of the hyperplane, which has to satisfy definite constraints. The function sgn(f(x)) is the decision function, and a completely separating hyperplane has to obey the constraints:

\[ y_i \, (w \cdot x_i + b) \ge 1, \quad i = 1, 2, \ldots, l \]

The hyperplane that maximizes the minimum distance is called the optimal hyperplane. In the standard soft-margin formulation, with C denoting the regularization parameter, it is the solution that minimizes:

\[ \frac{1}{2} \lVert w \rVert^{2} + C \sum_{i=1}^{l} \xi_i \]

depending on:

\[ y_i \, (w \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, l \]

where ξi measures the distance between the margin and the example xi lying on the wrong side of the margin. Using the Kuhn-Tucker conditions, this calculation can be simplified into the equivalent Lagrange dual problem, which maximizes over the multipliers αi:

\[ \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j) \]

depending on:

\[ \sum_{i=1}^{l} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C, \quad i = 1, 2, \ldots, l \]

The function K(xi, xj), which returns a dot product of the feature space mappings of the original data points, is called a kernel function. Several types of kernels, such as linear, polynomial, spline, radial basis function and "multiple layer perceptron" kernels, can be used within the SVM. Further details of the SVM classifier can be found in the literature (Wang et al., 2009; Wang et al., 2011).
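As a hedged illustration of the soft-margin idea, the sketch below trains a linear SVM by batch subgradient descent on the primal hinge-loss objective. This is a swapped-in, simplified technique chosen so that the example stays self-contained: it is not the solver or the kernel used in the study, which are not specified. The names `train_linear_svm` and `predict`, the learning rate and the epoch count are assumptions, and features are assumed to be standardized to a comparable scale beforehand.

```python
import numpy as np

def train_linear_svm(x, y, c=1.0, lr=0.01, n_epochs=500):
    """Minimize 0.5*||w||^2 + c * sum(max(0, 1 - y*(w.x + b))).

    x: (l, n) feature matrix; y: labels in {+1, -1}.
    Returns the weight vector w and the bias b.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(n_epochs):
        margins = y * (x @ w + b)
        violators = margins < 1.0          # samples inside or beyond the margin
        # Subgradient of the objective with respect to w and b.
        grad_w = w - c * (y[violators, None] * x[violators]).sum(axis=0)
        grad_b = -c * y[violators].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(x, w, b):
    """Decision function sgn(w.x + b), returned as +1 / -1 labels."""
    return np.where(np.asarray(x, dtype=float) @ w + b >= 0.0, 1, -1)
```

With the four morphometric features (CSFA, CSMA, CSFEMA, CSBA) as columns of `x`, `predict` would return the estimated class of each subject. A kernel SVM would instead solve the dual problem with a chosen K.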

RESULTS
The MRI images are segmented, and the muscle, fat, bone and femur regions are extracted by using FCM clustering and morphological filtering operations. These segmented regions are used to calculate the cross-sectional areas, giving the CSMA, CSFA, CSBA and CSFEMA measurements, respectively. Figure 2 shows the segmentation results for the four types of CSAs for a sample MRI image of the thigh. Since the KL values of the data are given only for the right leg, the morphometric measurements are computed for the right leg only. Subjects are divided into five KL groups: a baseline group (KL score equal to 0) and four levels of OA severity (KL scores 1 to 4). Table 3 gives the average morphometric measurements for each group; means and standard deviations are computed for each group. From Table 3, one can notice that the fat area increases when the KL grade is greater than or equal to 2 (when the OA becomes severe), while the muscle area decreases for the same group. The same measures change in the opposite direction when the KL grade is less than 2. The obtained morphometric measures are applied to an SVM classifier. Considering the number of subjects in each class, different groupings are tried: five individual groups, and two groups (KL0-1 as group one and KL3-4 as group two; or KL0-1 as group one and KL2, 3 and 4 as group two).
The best classification accuracy is achieved when the KL scores are grouped into two main classes. The first class represents the less severe cases, with KL scores of 0 and 1. The second class is composed of cases with KL grades greater than or equal to 2. For each class, four features (CSFA, CSMA, CSFEMA and CSBA) are calculated as morphometric measures and used as inputs to an SVM classifier to see whether these measures are related to the severity levels of OA. For the SVM classifier, the overall data set is randomly divided into two equal subsets, with one half of the data used for training and the other half for testing. The SVM classification accuracy of 72% is a satisfactory result given the difficulty of the application domain: the morphometric measures used in this study vary considerably from person to person, which makes their analysis hard. The results demonstrate that the two groups are classified with 72% classification accuracy.
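The two-class grouping and the random half split described above can be sketched as follows. The helper names (`kl_to_two_classes`, `split_half`, `accuracy`), the +1/-1 label coding and the fixed random seed are assumptions for illustration. Since 103 subjects cannot be split into exactly equal halves, this sketch places 51 in the training set and 52 in the test set, which is also an assumption.

```python
import numpy as np

def kl_to_two_classes(kl_grades):
    """Map KL 0-1 to class -1 (less severe) and KL 2-4 to class +1."""
    kl = np.asarray(kl_grades)
    return np.where(kl >= 2, 1, -1)

def split_half(n_samples, seed=0):
    """Randomly split sample indices into a training half and a test half."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    half = n_samples // 2
    return order[:half], order[half:]

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
```

A classifier would be fitted on the rows selected by the training indices and scored with `accuracy` on the rows selected by the test indices.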

DISCUSSION
In this study, a method to extract morphometric measures of the thigh in knees presenting signs of OA is investigated by using image processing algorithms. The fat, muscle, femur and bone CSAs are segmented and calculated for use as features for an SVM classifier. Since classification accuracy depends strongly on the application domain, the classification result for this task is not very high. The results demonstrate that the two groups are classified with 72% accuracy. This means that if these four CSAs are used as features, with the KL scores treated as two distinct groups, there is a classifiable relationship between the morphology of the thigh and the severity of OA. It is also demonstrated that the fat area expands and the muscle area shrinks when the KL grade is greater than or equal to 2, which corresponds to the severe stages of OA. The results reinforce the findings of earlier research (Alkan et al., 2010). Considering the amount of data and the difficulty of the application domain, the results are acceptable and give a good outlook for future studies in this area.

Figure 1. Scheme of the analysis steps.

Figure 2. (a) An original sample image, (b) results of the FCM clustering on the sample image, and segmented cross-sectional areas of (c) muscle, (d) fat, (e) femur and (f) bone.

Table 2. Parameters for acquisition of thigh MRIs.
In this study, the slice at 17 cm from the right thigh was analyzed. Specifications of the MRI acquisition parameters can be seen in Table 2. According to the features of OA, subjects are divided into five KL grades (a healthy group and four levels of OA severity).

Table 3. Group statistics of the data.