ITTDG: Integrated T-way test data generation strategy for interaction testing



INTRODUCTION
Our growing dependence on software raises dependability issues. In the last 20 years, software has grown tremendously in size, both in lines of code (LOC) and in functionality. With this growth, more and more unwanted interactions amongst software systems, hardware components, and operating systems are to be expected, increasing the possibility of faults. While traditional static and dynamic testing strategies (for example, boundary value analysis, cause and effect analysis and equivalence partitioning) are useful for fault detection and prevention (Zamli et al., 2008), they may not be sufficient to tackle bugs due to interaction.
To address this issue, researchers are now focusing on a sampling method termed the t-way testing strategy (where t indicates the interaction strength). Despite its inherent benefit of minimizing the test data for consideration, the adoption of existing t-way strategies in industry appears to be lacking (Czerwonka, 2006). Rather than giving test engineers (as domain experts) the flexibility to choose amongst all interaction possibilities, some strategies dictate only uniform t-way interactions (for example, GTWay (Klaib, 2009), in-parameter-order general (IPOG) (Lei et al., 2007), multi-core modified IPOG (MC-MIPOG) (Younis, 2010), TConfig (Williams, 2010), Jenny (Jenkins, 2010), and IBM's test case handle (ITCH) (Hartman et al., 2010)), while others impose variable strength interaction (for example, simulated annealing (SA) (Cohen et al., 2003), ant colony system (ACS) (Chen et al., 2009) and pairwise independent combinatorial testing (PICT) (Czerwonka, 2006)). There are also strategies that prescribe interactions based on input-output relationships (for example, ReqOrder (Wang et al., 2007), Union (Schroeder, 2001) and Greedy (Schroeder et al., 2002)). As such, there is a need for a strategy that is sufficiently flexible to integrate all forms of interaction possibilities. In this manner, test engineers can readily exercise their creativity depending on the testing problem at hand.
Only recently have a number of such strategies started to appear (for example, Density (Wang et al., 2008), ParaOrder (Wang et al., 2007) and TVG (Arshem, 2010)). However, as t-way test data generation is an NP-hard problem (Klaib, 2009; Cohen et al., 1997), no single existing strategy can claim dominance as far as optimality of test size is concerned. Motivated by these challenges, this paper discusses the development of a new strategy, called integrated t-way test data generation (ITTDG), that seamlessly integrates all interaction possibilities, including uniform strength, variable strength, and input-output based relationships. Empirical evidence demonstrates that ITTDG produces competitive test sizes on a number of benchmark configurations as compared to existing strategies.

Related work
In general, existing t-way strategies can be categorized as either one-test-at-a-time (OTAT) strategies or one-parameter-at-a-time (OPAT) strategies. An OTAT strategy builds the final test suite iteratively, adding one complete test data at a time. Based on the method of generating test data, OTAT strategies can be further characterized into three categories as follows.
The first category is the iterative based OTAT strategy. Strategies in this category execute a certain process iteratively to produce the test data. GTWay (Klaib, 2009; Zamli et al., 2011a), ITCH (Hartman et al., 2010), Jenny (Jenkins, 2010), TVG (Arshem, 2010), PICT (Czerwonka, 2006), Union (Schroeder, 2001; Schroeder and Korel, 2000) and Greedy (Schroeder et al., 2002) are examples of iterative based OTAT strategies. GTWay is a t-way strategy that iteratively uses a backtracking algorithm to generate complete test data, while ITCH uses an exhaustive search algorithm to construct the test data.
As for Jenny, TVG and PICT, these three tools are public domain tools available for download from their developers' sites. Jenny generates test data by first constructing a test suite that covers 1-way interaction. The strategy then extends the test suite to cover 2-way interaction, repeating the process until the test suite covers t-way interaction, where t is the interaction strength requested by the user. Unlike Jenny, PICT generates test data by selecting one uncovered tuple (a parameter-value combination produced by the interaction) and iteratively filling each "do not care" (a parameter not involved in the current tuple) with the best value found (the one that covers the most uncovered tuples). Unlike the other strategies, TVG generates test data using either its T-Reduced, Plus-One or Random Sets algorithm. Due to limited description in the literature, it is not clear how each of these algorithms is actually implemented. In our experience with TVG, T-Reduced often produces a smaller test suite than Plus-One and Random Sets. As for Union, the strategy generates complete test data to satisfy every input-output relation and removes any repeated test data (that is, it takes the union of the test data) to form the final test suite. Unlike Union, Greedy selects the test data for its final test suite using a greedy algorithm.
Othman and Zamli 3639
The second category is the artificial life-based (AL) OTAT strategy. As the name suggests, an AL-based OTAT strategy adopts an artificial life technique for generating the test data. The genetic algorithm (GA) and the ant colony algorithm (ACA) are amongst the most common AL techniques adopted for generating interaction test data. A GA-based strategy typically starts by randomly creating a number of test cases (m), as chromosomes, in a test candidate list. The chromosomes inside the candidate list go through a series of mutation processes until the desired interaction criteria are met. In the end, the best chromosomes inside the test candidate list are taken into the final test suite. As for ACA, the candidate test cases are searched by colonies of ants along possible paths. The quality of each path is evaluated in terms of pheromones, which signify convergence. Here, the optimum paths correspond to the best test candidates to be included in the final test suite. Concerning related work, GA (Shiba et al., 2004) and ACA (Shiba et al., 2004) are Shiba's implementations of the genetic algorithm and the ant colony algorithm respectively, while GA-N (Nie et al., 2005) is Nie's implementation based on the genetic algorithm. In other work, Chen et al. (2009) also introduce a variant ant colony based strategy, called ACS.
The last category of OTAT strategy is the heuristic based strategy, which uses a heuristic method in deciding the final test data. Density (Wang et al., 2008), simulated annealing (SA) (Cohen et al., 2003) and automatic efficient test generator (AETG) (Cohen et al., 1994) can be characterized as heuristic based OTAT strategies. Density uses global and local density calculations (an extension of the "density" concept introduced by Colbourn (2009, 2007)) for constructing the final test data. Formulae to calculate both global and local density can be found in Wang et al. (2008). As for SA, the strategy starts by constructing a feasible solution and then applies a series of transformations to the solution until it covers all generated pairs. A binary search algorithm is adopted in order to find the smallest possible solution. In the case of AETG, the strategy generates a number of test data candidates based on the parameter-value configuration that covers the most uncovered tuples. From the list of test data candidates, the one that covers the most uncovered tuples is selected for the final test suite.
3640 Sci. Res. Essays
OPAT strategies are essentially strategies that adopt the vertical and horizontal extension of Lei et al. (2007) in order to generate test data. Unlike an OTAT strategy, an OPAT strategy generates an initial test suite (an exhaustive test suite for a selected number of parameters only) and extends the test suite by adding one parameter at a time (horizontal extension). To ensure interaction coverage, completely new test data may be added to the test suite during the horizontal extension (vertical extension). In-parameter-order (IPO), a pairwise strategy (that is, 2-way interaction) proposed by Lei and Tai (1998), is the pioneer of the OPAT strategy. Later, Lei et al. (2007) introduced IPOG, the general version of IPO, to support higher orders of interaction. Younis and Zamli (2010) introduced MC-MIPOG, a multi-core version of IPOG. In addition, Nie et al. (2005) and Williams (2010) came out with their own versions of IPOG, called IPO-N and TConfig respectively. Apart from that, Wang et al. (2007) introduced two new strategies, called ParaOrder and ReqOrder. ParaOrder and ReqOrder differ from their predecessor, IPOG, in how the initial test case is generated. In IPOG, the initial test case is generated following the defined order of parameters whilst, in ParaOrder, the initial test case is generated based on the first defined input-output relationship. In the case of ReqOrder, the selection of the initial test case does not necessarily follow the first defined input-output relationship.
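To make the horizontal and vertical extension of an OPAT strategy more tangible, the following is a minimal pairwise sketch in Python. It is written in the spirit of IPO only and does not reproduce the published algorithm of Lei and Tai (1998); the function name ipo_pairwise, the tie-breaking rule and the way "do not care" slots are filled are assumptions made for illustration.

```python
from itertools import product

def ipo_pairwise(values):
    """Pairwise suite for `values`, a list of per-parameter value lists (at least two)."""
    # Initial test suite: exhaustive combinations of the first two parameters.
    suite = [list(t) for t in product(values[0], values[1])]
    for p in range(2, len(values)):
        # Pairs between the new parameter p and every earlier parameter q.
        uncovered = {(q, u, v) for q in range(p) for u in values[q] for v in values[p]}
        # Horizontal extension: give each existing test the value of p that
        # covers the most still-uncovered pairs (the first value wins ties).
        for test in suite:
            best = max(values[p],
                       key=lambda v: sum((q, test[q], v) in uncovered for q in range(p)))
            uncovered -= {(q, test[q], best) for q in range(p)}
            test.append(best)
        # Vertical extension: cover what is left, reusing "do not care" slots
        # (None) of tests added earlier before creating a new test.
        for q, u, v in sorted(uncovered):
            if any(t[q] == u and t[p] == v for t in suite):
                continue
            host = next((t for t in suite if t[q] is None and t[p] == v), None)
            if host is None:
                host = [None] * p + [v]
                suite.append(host)
            host[q] = u
    # Replace any remaining "do not care" slot with the first value of that parameter.
    return [tuple(values[i][0] if x is None else x for i, x in enumerate(t)) for t in suite]

# Hypothetical configuration: four parameters with two symbolic values each.
example = ipo_pairwise([["a0", "a1"], ["b0", "b1"], ["c0", "c1"], ["d0", "d1"]])
```

The sketch only handles 2-way interaction; a general-strength variant in the style of IPOG would track t-way tuples instead of pairs.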

Theoretical background
To introduce the concept of interaction (or t-way) testing, consider the integrated home security system shown in Figure 1. The system consists of three inputs from various sensors (that is, to detect both fire and intrusion) and one input from the security control panel, which is used to activate or deactivate the intrusion detection system. The system operates as follows. If fire is detected through the smoke detector, the system rings the security bell. If intrusion is detected (either from the glass break detector or the door open sensor while the security control panel is set to activate), the system alerts the house owner via short messaging services (SMS). The parameter configuration for the system is summarized in Table 1.
As far as testing the integrated home security system is concerned, three types of interaction can be considered (that is, uniform strength interaction, variable strength interaction and input-output based relations). Uniform strength interaction is the most common interaction type. In uniform strength interaction, it is assumed that every parameter in the system interacts uniformly with a single value of interaction strength. For example, the integrated home security system given earlier can be tested using 2-way testing (with the interaction strength = 2). Uniform strength interaction can be represented mathematically using the covering array notation of [...] (2004) and Zekaoui (2006) as in Equation 1:
\[ CA(N, t, C) \qquad (1) \]
where N = the number of test data inside the final test suite, t = the interaction strength, and C = the value configuration, which can be represented as follows:
\[ C = v_1^{p_1} \, v_2^{p_2} \cdots v_n^{p_n} \]
which indicates that there are p1 parameters with v1 values, p2 parameters with v2 values and so on.
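As a hedged illustration of uniform strength coverage, the Python sketch below enumerates every t-way tuple of a configuration and checks whether a test suite covers them all. Table 1 is not reproduced here, so the four two-valued parameters A to D, the symbolic value names and the helper names t_way_tuples and covers are assumptions made for the example.

```python
from itertools import combinations, product

# Assumed configuration: four parameters with two symbolic values each.
PARAMS = {"A": ["a0", "a1"], "B": ["b0", "b1"], "C": ["c0", "c1"], "D": ["d0", "d1"]}

def t_way_tuples(params, t):
    """Return every t-way tuple as a frozenset of (parameter, value) pairs."""
    tuples = set()
    for chosen in combinations(list(params), t):
        for values in product(*(params[n] for n in chosen)):
            tuples.add(frozenset(zip(chosen, values)))
    return tuples

def covers(suite, params, t):
    """True if every t-way tuple appears in at least one test of the suite."""
    names = list(params)
    uncovered = t_way_tuples(params, t)
    for test in suite:
        assignment = set(zip(names, test))
        uncovered = {tup for tup in uncovered if not tup <= assignment}
    return not uncovered

# Exhaustive testing needs every combination of values.
exhaustive = list(product(*PARAMS.values()))

# A hand-built suite of five tests that covers all 2-way tuples.
pairwise = [
    ("a0", "b0", "c0", "d0"),
    ("a0", "b1", "c1", "d1"),
    ("a1", "b0", "c1", "d1"),
    ("a1", "b1", "c0", "d1"),
    ("a1", "b1", "c1", "d0"),
]

print(len(exhaustive), len(pairwise), covers(pairwise, PARAMS, 2))
```

Under these assumptions, five tests are enough for 2-way coverage where exhaustive testing needs sixteen, which is the kind of reduction that t-way sampling aims for.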
Another variant of interaction is variable strength interaction, which extends the notion of uniform strength interaction. Like uniform strength interaction, variable strength interaction consists of a dominant interaction, which involves all parameters of the system. Unlike uniform strength interaction, however, variable strength interaction allows a number of disjoint covering arrays to be specified within the dominant interaction. This allows the test engineer to specify a higher interaction strength for highly interacting parameters. In the case of the integrated home security system, if it is known that the inputs for intrusion detection (that is, the glass break detector, door open sensor and security control panel) are highly likely to produce errors, a higher interaction strength can be assigned to those inputs accordingly, as shown in Figure 2. Like uniform strength interaction, variable strength interaction can be represented mathematically using a variable strength covering array, as shown in Equation 2:
\[ VCA(N, t, C, S) \qquad (2) \]
where N = the number of test data inside the final test suite, t = the dominant interaction strength, C = the value configuration, represented as
\[ C = v_1^{p_1} \, v_2^{p_2} \cdots v_n^{p_n} \]
which indicates that there are p1 parameters with v1 values, p2 parameters with v2 values and so on, and S = the multi-set of disjoint covering arrays with strength larger than t, each represented using the notation given in Equation 1.
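The set of tuples that a variable strength specification demands can be sketched as the union of the dominant-strength tuples and the tuples of each higher-strength sub-array. The Python example below is illustrative only: it assumes four two-valued parameters, a dominant strength of 2, and a strength of 3 on parameters B, C and D standing in for the three intrusion-related inputs. The function names are hypothetical.

```python
from itertools import combinations, product

# Assumed configuration: four parameters with two symbolic values each.
PARAMS = {"A": ["a0", "a1"], "B": ["b0", "b1"], "C": ["c0", "c1"], "D": ["d0", "d1"]}

def tuples_for(params, names, t):
    """All t-way tuples over the given parameter names."""
    out = set()
    for chosen in combinations(names, t):
        for values in product(*(params[n] for n in chosen)):
            out.add(frozenset(zip(chosen, values)))
    return out

def variable_strength_tuples(params, dominant_t, sub_arrays):
    """Dominant-strength tuples over all parameters plus the tuples of each
    (parameter names, strength) sub-array."""
    required = tuples_for(params, list(params), dominant_t)
    for names, t in sub_arrays:
        required |= tuples_for(params, names, t)
    return required

# Dominant strength 2 everywhere, strength 3 on the assumed intrusion inputs.
required = variable_strength_tuples(PARAMS, 2, [(("B", "C", "D"), 3)])
print(len(required))
```

With these assumed values the requirement consists of 24 pairs plus 8 triples, so a generator has 32 tuples to cover rather than the 24 of plain 2-way testing.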
As for input-output based relations, a set of input-output relationships needs to be obtained first, before the covering array is specified. Consider the integrated home security system given earlier and assume that the system outputs have the following relationships:
(i) "Sound the security bell" depends on the smoke detector input.
(ii) "Alert the house owner" depends on the glass break detector, door open sensor and security control panel inputs.
The input-output relationships, Rel, of the system can be written as:
Rel = {{smoke detector}, {glass break detector, door open sensor, security control panel}}
The ITTDG strategy iterates over all uncovered tuples produced by every interaction or input-output relationship specified by the user. In each iteration, the strategy pushes the visited tuple into a list referred to as the test data candidates list (Q). The candidates in Q are then extended by adding one parameter at a time, with the value that covers the most uncovered tuples. In the case of a tie (that is, more than one value covers the most uncovered tuples), the corresponding test data is duplicated for each of the tied values and all duplicated test data are pushed into Q. Once every candidate in Q forms a complete test data, the candidate with the highest weight (that is, the one that covers the most uncovered tuples) is selected as the final test data. In the case of another tie (that is, more than one candidate has the highest weight), the first found candidate in Q is selected. The selected test data is then pushed into the final test suite and the tuples it covers are removed from the uncovered tuples list. As the size of Q can potentially grow significantly during the parameter extension process, the number of test data candidates in Q is limited to a constant integer value (M) to avoid a possible out-of-memory error (that is, when dealing with a large number of parameters and values).
As an illustration, consider the parameters and values for the integrated home security system given in Table 1, with 2-way interaction as our running example. To simplify the discussion, all the values are converted into symbolic values as shown in Table 2.
Initially, the first uncovered interaction is a0,b0,X,X (where X indicates a do not care value). The ITTDG strategy creates a list of test data candidates (Q) and pushes the first uncovered pair found into Q as the first test data candidate. Since no candidate in Q forms a complete test data as yet, the strategy extends every candidate in Q one parameter at a time. For the first candidate in Q, the first do not care found is Parameter C. Thus, the strategy first extends the candidate with Parameter C. From Table 2, two values can replace the do not care (that is, c0 and c1). The strategy then calculates the number of covered pairs for each value. Here, both c0 and c1 cover the maximum number of uncovered pairs; thus, the candidate is extended with c0 and a duplicate of the candidate extended with c1 is pushed into Q. Now, Q contains two candidates, a0,b0,c0,X and a0,b0,c1,X. Since there are still incomplete candidates, the ITTDG strategy iterates through all candidates in Q to extend them. Firstly, candidate a0,b0,c0,X is extended with the only remaining parameter, Parameter D. As both d0 and d1 cover the maximum number of uncovered pairs, the candidate is extended with d0 and a duplicate extended with d1 is pushed into Q. The same process is repeated for candidate a0,b0,c1,X. Finally, Q contains four candidates: {a0,b0,c0,d0}, {a0,b0,c0,d1}, {a0,b0,c1,d0} and {a0,b0,c1,d1}. Since no incomplete candidate remains, the parameter extension stops and one candidate in Q is selected as the final test data. It should be noted that if all four candidates cover the same number of uncovered tuples, the first found candidate is selected (that is, a0,b0,c0,d0). The tuples covered by the selected candidate are removed from the uncovered tuples list and the selected candidate is pushed into the final test suite list.
The whole process is repeated until all tuples are covered.
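The walkthrough above can be condensed into a minimal Python sketch. It is not the authors' implementation: the function names, the default cap of 64 candidates and the choice to enforce M by truncating Q are assumptions made for illustration, as are the four two-valued parameters.

```python
from itertools import combinations, product

X = None  # "do not care" marker

# Assumed configuration: four parameters with two symbolic values each.
PARAMS = {"A": ["a0", "a1"], "B": ["b0", "b1"], "C": ["c0", "c1"], "D": ["d0", "d1"]}

def all_tuples(params, t):
    """Uncovered-tuple list for uniform strength t, stored as partial tests."""
    names = list(params)
    out = []
    for idx in combinations(range(len(names)), t):
        for values in product(*(params[names[i]] for i in idx)):
            tup = [X] * len(names)
            for i, v in zip(idx, values):
                tup[i] = v
            out.append(tuple(tup))
    return out

def matches(tup, cand):
    """A tuple is covered by cand if every fixed slot of tup agrees with cand."""
    return all(t is X or t == c for t, c in zip(tup, cand))

def weight(cand, uncovered):
    """Number of uncovered tuples that cand covers."""
    return sum(1 for tup in uncovered if matches(tup, cand))

def ittdg_sketch(params, uncovered, max_candidates=64):
    names = list(params)
    uncovered = list(uncovered)
    suite = []
    while uncovered:
        queue = [uncovered[0]]  # seed Q with the first uncovered tuple
        while any(X in cand for cand in queue):
            extended = []
            for cand in queue:
                if X not in cand:
                    extended.append(cand)
                    continue
                pos = cand.index(X)  # first "do not care" slot
                options = [cand[:pos] + (v,) + cand[pos + 1:] for v in params[names[pos]]]
                scores = [weight(o, uncovered) for o in options]
                best = max(scores)
                # Keep every tied value as a separate candidate.
                extended.extend(o for o, s in zip(options, scores) if s == best)
            queue = extended[:max_candidates]  # assumed way of enforcing M
        # max() returns the first candidate with the highest weight.
        chosen = max(queue, key=lambda c: weight(c, uncovered))
        suite.append(chosen)
        uncovered = [tup for tup in uncovered if not matches(tup, chosen)]
    return suite

suite = ittdg_sketch(PARAMS, all_tuples(PARAMS, 2))
print(suite[0], len(suite))
```

Every candidate in Q descends from the seed tuple, so each selected test covers at least that tuple and the loop terminates. Under the assumed configuration the first selected test is a0,b0,c0,d0, in line with the walkthrough.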
There are only slight variations as far as ITTDG's support for variable strength and input-output relationships is concerned. In the former case, instead of storing all uniform tuples, the list of uncovered tuples (µ) stores all tuples produced by the dominant interaction and all disjoint covering arrays. In the latter case, µ holds all exhaustive tuples produced by each requirement or input-output relationship. In both cases (that is, variable strength interaction and input-output based relations), all tuples in µ are sorted based on the number of "do not care" values inside the tuple (that is, tuples with fewer "do not care" values are listed first). It should be noted that the iterative and selection process remains the same as described earlier. The summary of the ITTDG strategy is shown in Figure 3.
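For input-output based relations, the construction and ordering of µ might look like the following Python sketch. The mapping of parameter A to the smoke detector and of B, C and D to the intrusion-related inputs, the two-valued parameters and the function name relation_tuples are assumptions made for the example.

```python
from itertools import product

DONT_CARE = "X"

# Assumed configuration: four parameters with two symbolic values each.
PARAMS = {"A": ["a0", "a1"], "B": ["b0", "b1"], "C": ["c0", "c1"], "D": ["d0", "d1"]}

def relation_tuples(params, relations):
    """Exhaustive tuples for each input-output relationship, with tuples that
    contain fewer "do not care" slots listed first."""
    names = list(params)
    tuples = []
    for rel in relations:
        for values in product(*(params[n] for n in rel)):
            assigned = dict(zip(rel, values))
            tup = tuple(assigned.get(n, DONT_CARE) for n in names)
            if tup not in tuples:
                tuples.append(tup)
    # list.sort is stable, so tuples with equal counts keep their order.
    tuples.sort(key=lambda tup: tup.count(DONT_CARE))
    return tuples

# Assumed relations: one output depends on A, the other on B, C and D.
mu = relation_tuples(PARAMS, [("A",), ("B", "C", "D")])
for tup in mu:
    print(",".join(tup))
```

Here the eight tuples over B, C and D come first because each has a single "do not care" slot, followed by the two tuples over A, which have three.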

RESULTS
The main goals of our evaluation are twofold. Firstly, we wish to demonstrate the capability of ITTDG to integrate all interaction possibilities. Secondly, we benchmark the performance of ITTDG (that is, in terms of generated test data size) against its competing strategies. As different t-way strategies support different types of interaction, Table 2 summarizes the types of interaction supported by existing t-way strategies.
Our evaluation proceeds as follows. First, we compare ITTDG with other uniform strength strategies. Next, we compare ITTDG with existing variable strength strategies. Finally, we demonstrate the performance of ITTDG as an input-output based t-way strategy. It should be noted that, in all experiments, shaded cells mark the best result.

ITTDG as uniform strength t-way strategy
Since support for interaction strength varies between strategies (that is, some strategies deal only with small interaction strengths), we performed two different experiments based on the benchmark configurations in Wang et al. (2008) and Klaib (2009). The first experiment adopts a lower interaction strength (that is, t = 3) while the second experiment used [...]
Table 3. Generated test data size for 3-way testing.
The results for Groups 1 to 4 are depicted in Tables 4 to 7 respectively. Results for MC-MIPOG are taken from Younis and Zamli (2010) while results for the other strategies are taken from Klaib (2009). It should be noted that not all strategies are available for both experiments due to limited published results in the literature.

ITTDG as variable strength t-way strategy
To demonstrate ITTDG as a variable strength t-way strategy and benchmark its performance (in terms of generated test data size), we adopted an experiment from Chen et al. (2009) and Cohen et al. (2003). The experiment consists of three system configurations:
(i) a system with 15 3-valued parameters;
(ii) a system with 3 4-valued parameters, 3 5-valued parameters and 2 6-valued parameters;
(iii) a system with 20 3-valued parameters and 2 10-valued parameters.
The sizes of the test data generated are shown in Table 8. Results for the other strategies are obtained from Chen et

ITTDG as uniform strength t-way strategy
Based on the results shown in Table 3, in terms of generated test data size, ITTDG outperforms TVG, PICT, TConfig and Jenny in almost all system configurations.
Compared to Density, ParaOrder and GA-N, ITTDG produces the best results in all system configurations except S8 (where Density and GA-N outperform ITTDG) and S7 (where ParaOrder outperforms ITTDG).
As for AETG, ITTDG outperforms AETG in S2, S7 and S8 while AETG outperforms ITTDG in S1, S4, S5 and S6. Lastly, GA, ACA and IPO-N outperform ITTDG in all system configurations except S2, where ITTDG equals the three strategies, and S7, where ITTDG outperforms IPO-N.
Based on the results shown in Tables 4 to 7, one clear observation is that no single strategy produces the best result for every system configuration. Overall, ITTDG produces the best result for just over half of the system configurations (that is, 18 out of 35 configurations). For the other system configurations, ITTDG produces competitive test sizes.

ITTDG as variable strength t-way strategy
Referring to Table 8

ITTDG as input-output based t-way strategy
From the results of the first experiment (Table 9), ITTDG outperforms most other strategies except when |R| = 40. In the case of |R| = 40, ParaOrder produces the smallest test size (that is, 120 test data). Overall, Union produces the worst result.
Table 9. Comparison of size generated by different strategies for IOR(N, 3^10, R) (surviving column headings: |R|, N, Density, TVG, ReqOrder).
In the second experiment (Table 10), Greedy produces the smallest test size for |R| = 10 and |R| = 20 while Density produces the best result for the remaining configurations. Although it does not produce the smallest test size, ITTDG is competitive (that is, second to the best strategy) for all configurations. For |R| = 10 and |R| = 20, ITTDG produces the same number of test data as Density. For the other configurations, ITTDG comes second to Density. As in the first experiment, Union produces the worst result for all configurations.

CONCLUSIONS AND RECOMMENDATION
In this paper, we have described the development of a new t-way strategy, called ITTDG, which provides seamless integration of all interaction possibilities. Empirical evidence demonstrates sound performance as far as the generated test size is concerned, especially for configurations with a uniform number of parameter values. As future work, we are now investigating sequence based interaction, where the sequence of input (or the order of input arrivals) is first checked for interaction considerations (Zamli et al., 2011b). Moreover, we are also investigating a statistical approach based on Design of Experiments to automatically determine the required interaction strength as well as the input-output relationships of a system of interest, so as to further simplify the test generation process.