New statistical approaches to analyze data from intercropping experiments

This study discusses statistical methods for processing simulated data from the Jatropha x Cowpea intercropping system, comparing them with one another and with the conventional method, in order to recommend a generally applicable method for experiments with intercropping systems. The experimental design was randomized blocks with four replications. The treatments were arranged in a 2 x 7 factorial scheme, corresponding to two levels (90 and 100% of the recommended population in sole cropping, RPSC) of jatropha, at 4 x 2 m spacing, and seven levels (40, 50, 60, 70, 80, 90 and 100% of RPSC) of cowpea. The variables analyzed in jatropha were intercropping yield (t ha-1), mass of seeds (t ha-1), oil yield (L ha-1) and gross income (R$). In cowpea, the variables were yield (t ha-1) and gross income (R$). The evaluated methodologies were ANOVA with application of the Tukey test, multivariate analysis of variance, bivariate analysis of variance, land use efficiency index (LUE), competitiveness index, DEA efficiency index, the multicriteria decision support analysis of Copeland and Ward's hierarchical cluster analysis. The index LUETOTAL, associated with bivariate graphical analysis or hierarchical cluster analysis, can be successfully applied in the processing of data from any intercropping system. Thus, LUETOTAL and multivariate analysis are recommended as complementary methods.


INTRODUCTION
Defining hypotheses, planning, conducting the experiment and analyzing the data are essential procedures for the reliability of the results obtained in an experiment.
Experimental design and analytical procedures for data processing are defined in the experimental planning (DIAS; BARROS, 2009). Defining these procedures in advance is straightforward for the basic designs and their arrangements. However, it can be complex when more than one source of variation is evaluated in the same experiment. Trials with intercropped species generate controversy about the most adequate methodology for data processing.
Intercropping consists of growing two or more plant species (treatments) simultaneously in the same experimental plot. Studies evaluating methods to determine the efficiency of intercropping systems are recent and scarce. In general, the methods are non-parametric and multivariate, owing to the multivariate nature of the data resulting from the joint evaluation of agronomic and economic parameters (BEZERRA NETO et al., 2007a). In contrast with the parametric methods presented by Federer (1993), Bezerra Neto et al. (2007a) suggested statistical analysis at plot level with data envelopment analysis (DEA). This method takes into account the possible existence of positive or negative correlation between plots.
Conversely, analytical methods that use only the arithmetic mean to process data from intercropping systems are not robust (MELO et al., 2005).
Bezerra Neto et al. (2007a), Bezerra Neto et al. (2007b) and Bezerra Neto et al. (2007c) observed, respectively, the effectiveness and ease of use of the DEA method, as well as the robustness of the bivariate analysis of variance; the viability of the multicriteria decision support method; and the viability of agro-economic indicators for the biological productivity of intercropping systems. The recommended methods, except bivariate analysis, are used in production engineering to study the functional relation between resources and products. In plant science, these methods are rarely applied to process data from intercropping systems.
This study aimed to evaluate new analytical methods to process simulated data of an intercropping system of a perennial species (Jatropha curcas L.) and an annual species [Vigna unguiculata (L.) Walp.], and to recommend one or more of them.

MATERIAL AND METHODS
The experimental design used in the simulation was randomized complete blocks, with four replications. The simulated treatments were arranged in a 2 x 7 factorial scheme. The first factor corresponded to two levels (90 and 100%) of the recommended population for sole cropping (RPSC) of the perennial oilseed species jatropha (Jatropha curcas L.), at 4 x 2 m spacing (DIAS et al., 2007). The second factor was composed of seven levels (40, 50, 60, 70, 80, 90 and 100% of RPSC) of cowpea [Vigna unguiculata (L.) Walp.], whose standard was 50,000 plants, as recommended for the crop (CARDOSO et al., 1997).
The variables analyzed in J. curcas were yield in intercropping (t ha-1), mass of seeds (t ha-1), oil yield (L ha-1), based on the maximum content (38.9%) reported by Kaushik et al. (2007), and gross income, estimated from R$ 1.85 L-1 of oil (BARROSO et al., 2014). In the cowpea crop, the analyzed variables were yield and gross income, estimated from the value of R$ 2.30 kg-1.
The experimental data were obtained by simulation using the rnorm() function of the R software, version 3.2.0. The function was entered into the expression y <- matrix(rnorm((), mean = (), sd = sqrt())) to obtain the estimated values of the plots, where y is a vector of four pseudo-random numbers, mean is the established mean and sd = sqrt() is the square root of the variance (the standard deviation) of the normal distribution, so that the distribution of plots (Yi) in each treatment is approximately normal, i.e., (Y11, Y12, ..., Y27) ~ N(µ, σ²).
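The simulation step can be illustrated with a minimal sketch. The study used R's rnorm(); the version below is an analogue in Python's standard library, not the authors' script. The treatment means, the variance, the seed and the function name simulate_plots are all hypothetical values chosen for illustration.

```python
import random
import statistics


def simulate_plots(mean, variance, n_plots=4, rng=None):
    """Draw n_plots pseudo-random plot values from N(mean, variance)."""
    rng = rng or random.Random()
    sd = variance ** 0.5  # like rnorm(), gauss() expects the standard deviation
    return [rng.gauss(mean, sd) for _ in range(n_plots)]


rng = random.Random(2015)  # fixed seed so the run is reproducible
# Hypothetical treatment means (t ha-1) and a hypothetical common variance.
treatment_means = {"T1": 2.0, "T2": 2.5}
variance = 0.04

simulated = {
    name: simulate_plots(mean, variance, rng=rng)
    for name, mean in treatment_means.items()
}
for name, plots in simulated.items():
    print(name, [round(v, 2) for v in plots], "mean:", round(statistics.mean(plots), 2))
```

Each treatment receives four values, one per replication, which matches the four-replication design described above.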
The efficiency indices based on efficient land use (LUE) (WILLEY, 1979) were calculated using Excel. LUE was obtained by Equation 4:

$$\mathrm{LUE}_{\mathrm{TOTAL}} = \frac{Y_{JIC}}{Y_{JSC}} + \frac{Y_{CIC}}{Y_{CSC}} \qquad (4)$$

where $Y_{JIC}$ is the grain yield of J. curcas in intercropping and $Y_{JSC}$ is its yield in sole cropping; $Y_{CIC}$ is the grain yield of cowpea in intercropping and $Y_{CSC}$ is its yield in sole cropping. The partial LUE of J. curcas (LUEPARTJ) and cowpea (LUEPARTC) were respectively obtained by Equations 5 and 6:

$$\mathrm{LUE}_{\mathrm{PARTJ}} = \frac{Y_{JIC}}{Y_{JSC}} \qquad (5)$$

$$\mathrm{LUE}_{\mathrm{PARTC}} = \frac{Y_{CIC}}{Y_{CSC}} \qquad (6)$$

where LUEPART close to 1 indicates efficiency of the intercropping system in comparison to the sole cropping system. The index of competitiveness between crops of Willey and Rao (1981) was obtained by Equation 7. The productive efficiency of each plot was obtained using the DEA model with constant returns to scale (COOPER et al., 2004), using the application ISYDS - Integrated System for Decision Support (ANGULO MEZA et al., 2005a), with unit inputs (LOVELL; PASTOR, 1999). For each treatment mean, model 8 was processed, with restrictions 9, 10 and 11. Pearce and Gilliver (1979).
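As a worked illustration of the land use efficiency index of Willey (1979), the sketch below computes each partial index as the ratio of intercropped to sole-crop yield and the total index as their sum. The function name and all yield values are hypothetical and are not taken from the study's tables.

```python
def land_use_efficiency(y_jic, y_jsc, y_cic, y_csc):
    """Return (partial LUE of J. curcas, partial LUE of cowpea, total LUE).

    y_jic / y_jsc: J. curcas yield in intercropping / in sole cropping.
    y_cic / y_csc: cowpea yield in intercropping / in sole cropping.
    """
    lue_part_j = y_jic / y_jsc
    lue_part_c = y_cic / y_csc
    return lue_part_j, lue_part_c, lue_part_j + lue_part_c


# Hypothetical yields in t ha-1, for illustration only.
part_j, part_c, total = land_use_efficiency(y_jic=4.0, y_jsc=5.0, y_cic=0.3, y_csc=1.0)
print(round(part_j, 2), round(part_c, 2), round(total, 2))
# A total above 1 suggests the intercrop uses land more efficiently than sole crops.
print("intercropping advantageous:", total > 1)
```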
Multicriteria analysis through the Copeland method was performed with the software Preferências Ordinais Agregadas -WebPROA. Pair-wise comparison of means was made and the Condorcet decision matrix was obtained for 14 treatments.
Each element aij of the matrix was obtained by attributing 0 when alternative i (i = 1, 2, ..., i-th alternative) is equivalent to alternative j (j = 1, 2, ..., j-th alternative); 1 when i is preferred over j; and -1 when j is preferred over i. Ranks were defined based on the row sums of the Condorcet decision matrix for each treatment. The method was applied to 14 criteria referring to the treatments (T1, T2, ..., T14).
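The construction of the Condorcet matrix and the Copeland ranking can be sketched as follows. This is an illustrative implementation, not the WebPROA software: it assumes that alternative i is preferred over j when i is better on more criteria than j, and that higher values are better on every criterion. The alternatives A, B and C and their scores are hypothetical.

```python
def condorcet_matrix(scores):
    """Build the pairwise matrix: 1 if i beats j, -1 if j beats i, 0 otherwise.

    scores maps each alternative to a list of criterion values (higher is better).
    """
    matrix = {i: {} for i in scores}
    for i in scores:
        for j in scores:
            wins = sum(a > b for a, b in zip(scores[i], scores[j]))
            losses = sum(a < b for a, b in zip(scores[i], scores[j]))
            matrix[i][j] = (wins > losses) - (wins < losses)
    return matrix


def copeland_ranking(scores):
    """Rank alternatives by the row sums of the Condorcet matrix."""
    matrix = condorcet_matrix(scores)
    totals = {i: sum(row.values()) for i, row in matrix.items()}
    ranking = sorted(totals, key=totals.get, reverse=True)
    return ranking, totals


# Hypothetical alternatives scored on two criteria (e.g. yield and gross income).
scores = {"A": [5.9, 0.35], "B": [4.0, 0.30], "C": [3.0, 0.32]}
ranking, totals = copeland_ranking(scores)
print(ranking, totals)
```

Ties in the row sums are possible (here B and C tie), which is a known property of Copeland scores rather than a defect of the sketch.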
Cluster analysis was carried out with the hierarchical method of Ward (1963).
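Ward's criterion merges, at each step, the two clusters whose union gives the smallest increase in the within-cluster sum of squares. The sketch below is a minimal standard-library illustration of that idea, not the software used in the study. The treatment labels and mean values are hypothetical, and in practice variables on different scales would usually be standardized first.

```python
def ward_clusters(points, n_clusters):
    """Agglomerate points until n_clusters remain, using Ward's criterion.

    points maps a label to a list of coordinates (e.g. treatment means).
    """
    # Each cluster is (member labels, centroid).
    clusters = [([name], list(vec)) for name, vec in points.items()]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                na, nb = len(ma), len(mb)
                dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * dist2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append((ma + mb, centroid))
    return sorted(sorted(members) for members, _ in clusters)


# Hypothetical treatment means for two variables, for illustration only.
means = {
    "A": [5.9, 0.35],
    "B": [4.0, 0.30],
    "C": [4.1, 0.31],
    "D": [2.0, 0.20],
    "E": [2.1, 0.21],
}
groups = ward_clusters(means, n_clusters=3)
print(groups)
```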

RESULTS AND DISCUSSION
Based on the analysis of variance (ANOVA), the interaction between sole cropping systems of J. curcas and cowpea had a significant effect (p < 0.01) on all variables. For J. curcas in intercropping, a significant effect occurred on yield, gross income, mass of seeds and oil yield (p < 0.05). For cowpea in intercropping, yield and gross income (p < 0.01) were affected (Table 1). For the interaction, the best yields (t ha-1) of J. curcas at 90 and 100% RPSC were obtained with 40% RPSC of cowpea. However, there was no statistical difference between the two population densities. Yields of J. curcas at 90 and 100% RPSC were up to 2.5 times higher than the yields with 100% RPSC of cowpea.
The same contrast was observed for mass of seeds, oil yield and gross income of J. curcas in intercropping, because the analysis of biological efficiency of intercropping systems considers yield as a character of main importance. However, some species have particularities that must be considered before the recommendation of an intercropping system. In the case of J. curcas, the yield of oil extracted from the seed is directly associated with the viability of the system. Therefore, the hypothesis that yield cannot be associated proportionally with the maintenance of the mean oil content in the seeds, after introducing cowpea in the plot, needs to be verified. In the present study, however, the correlation between yield and oil yield of J. curcas in intercropping was r = 0.99.
The best yields of cowpea were obtained with 40% RPSC at 90 and 100% RPSC of J. curcas.
Intercropping with forage species has also been studied. Nonetheless, as for food crops, it still requires multidisciplinary investigation.
For graphical evaluation, multivariate analysis of variance (MANOVA) was carried out (Table 2). The descriptive analysis of the dispersion graph (Figure 1) ranks T14 as the best combination for the yield of J. curcas in intercropping. Two other treatments (T7 and T13) were also grouped in QI, representing alternatives for yield gains.
The quality and applicability of the two-dimensional analysis of intercropping data were reinforced by Bezerra Neto et al. (2007a) in a comparative study between analysis methodologies for the carrot x lettuce system. However, studies evaluating the applicability and efficiency of statistical methods to process data from intercropping systems are still incipient.
Efficiency indices and non-parametric multivariate methods have been used as auxiliary tools for inference on intercropping systems. Indices based on the efficient use of the land exploited per plot (WILLEY, 1979; WILLEY; RAO, 1981) and on data envelopment analysis (CHARNES; COOPER, 1962), as well as the ranking of treatments obtained by the multicriteria decision support method of Copeland (GOMES; ...), were obtained in the present study (Table 3).
Table 3. Treatment order as extracted: T14, T7, T13, T12, T11, T10, T6, T4, T5, T9, T3, T8, T2, T1. CP (%): cowpea population.
The descriptive analysis of the absolute values of the indices DEA, LUETOTAL and IC kept T14 as the most efficient combination. The mean efficiency of the treatments decreased as CP (%) increased. The highest DEA efficiency indices were found for T7 and T14, respectively 0.87 and 1.00.
For LUETOTAL, T7 and T14 showed the best performance with respect to land use by the intercropped species, respectively 1.90 and 12.53. LUETOTAL higher than 1 suggests that intercropping is advantageous over sole cropping (CAMILI, 2013). According to Melo et al. (2005), DEA models and the multicriteria decision support method of Copeland are non-parametric alternatives for processing data from intercropping systems. In the present study, the DEA CCR model was applied to treatment means rather than to individual plots. This procedure was adopted because the experiment was simulated according to an experimental design and met the basic assumptions of ANOVA and MANOVA. Processing at plot level would therefore be redundant with the use of mean values.
Lastly, the result of the multicriteria decision support method of Copeland did not differ from those obtained with the other methodologies evaluated. The method added value to the analysis because it yields linear indices from the joint analysis of variables of different natures, in this case the combination of an economic variable (RBPC) and an agronomic variable (RPC). The viability of using this non-parametric technique was also found by Bezerra Neto et al. (2007b) for carrot-lettuce intercropping.
Finally, the applicability of the multivariate hierarchical clustering method of Ward was evaluated (Figure 2). No divergence was observed between the results obtained with the cluster analysis and with the other methodologies evaluated. Among the five groups formed in dendrogram A, T14 defined group I and had the highest means for J. curcas (5.90) and cowpea (0.35). The other groups were composed of the treatments T7 and T13 (group II), T2, T1 and T8 (group III), T4, T10, T11 and T12 (group IV) and T5, T6, T3 and T9 (group V). Although the treatments T14, T7 and T13 were grouped in QI of the bivariate analysis graph, that method is not a cluster analysis. Thus, after applying the procedure of Ward, only T14 defined group I in Figure 2A and B.
Dendrogram A showed group II formed by T7 and T13, indicating similarity between these treatments, as already suggested by QI of the bivariate analysis.
Considering all variables determined, in dendrogram B, T14 remained the best combination between J. curcas and cowpea. In B, only three dissimilar groups formed, suggesting that the increase in the number of variables analyzed simultaneously favored the grouping of similar combinations and highlighted the divergent one, T14.
The hierarchical clustering method allowed the multivariate response of the intercropping system to be observed. Hierarchical methods are frequently used to discriminate groups of individuals with similar agronomic attributes. In the present study, the method was applied to discriminate combinations with similar mean scores and generated coherent and consistent results.
The DEA index and the multicriteria decision support analysis of Copeland also led to coherent and consistent results. However, both are used to evaluate production systems in operations research, a research line of production engineering. Consequently, further studies are needed on their use in the processing of data from intercropping experiments in agriculture.

CONCLUSION
The index LUETOTAL, associated with bivariate graphical analysis or hierarchical cluster analysis, can be applied in the processing of data from any intercropping system.