Modelling of Work Efficiency in Cable Traction with Tractor Implementing the Least-Squares Methods and Robust Regression

Wood-harvesting activities are carried out by contractors through tenders, with prices determined by the amount of wood transported, the terrain conditions and the transport method. Before the tender, managers should accurately determine the average completion time of the work and the base price, and state both in the tender contract, to prevent losses to both the work and the contractor. Predicting productivity in wood production is therefore of great importance for determining work duration and cost. In this context, the aim of the present study was to determine the most accurate estimation model for predicting productivity (Pe) from log volume (Vt), route slope (P) and winching distance (D) in uphill cable skidding with a drum tractor. Harvesting units were selected among pure spruce (Picea orientalis (L.) Link) stands in the north-east of Turkey where uphill cable skidding with a tractor was applied. Route slope, winching distance, log volume and time-consumption data were collected in the chosen harvesting units, and productivity prediction models were developed from these data with two methods: linear regression in SPSS, which uses all data, and robust regression prepared in the MATLAB environment, which minimizes the effect of outliers. The coefficients calculated by these methods were statistically tested, and the winching distance coefficient was found to be insignificant with both methods. The productivity estimation model was therefore re-determined with both methods based on the slope and log volume parameters, and the findings were compared. Additionally, the standard errors of the coefficients of both models were compared, and it was concluded that the robust method was more sensitive than the SPSS regression method.


Introduction
Forestry includes different activities, such as establishment of the stand, silviculture and maintenance activities, production of wood material or secondary products, marketing and sales, planning and construction of forest buildings, control of forest fires and pests (i.e., insects, fungi) and recreational activities. One of the most important forestry activities is wood production, which is the main source of income for forest enterprises and requires periodic interventions in the forest. may not be used in countries with less favourable economic conditions (Blakeney 1992, Wang et al. 2005). In Turkey, where approximately half of the forests are dispersed in areas with a slope higher than 40%, the cable-skidding method with drummed forestry tractors is predominantly used uphill, while tower yarders are used occasionally (Acar et al. 2015).
In cable skidding with a forestry tractor, the tractor is stationed at the roadside. A worker pulls the cable from the tractor's drum to the location of the logs and ties it to a log, and the tractor engine then winches the log to the roadside. The utility of the cable-skidding method depends on the presence of a forest road; the maximum winching distance of tractors is between 100 and 120 m (Erdas et al. 2014).
Certain studies report that the main factors limiting both the choice of extraction method in a harvesting unit and the productivity of the machine are silviculture, forest operation management, terrain conditions, winching distance, tree size and route slope (Akay et al. 2004, Ghaffariyan et al. 2012, Spinelli et al. 2010). These parameters have a negative effect on productivity, especially in uphill extraction operations. Kovácsová and Antalová (2010) emphasized that forests should be operated with optimum productivity to meet the requirements of both present and future generations.
Wood-production activities must be completed within a certain period of time because they are carried out under natural conditions and with living material. Under the forestry regulations in Turkey, wood-logging operations are tendered to contractors by the stumpage sale method or to forest villagers on a unit-price basis. In both cases, the operators should make an accurate work plan that specifies the completion time of the work and the open tender price. Thus, it is of great importance to anticipate work productivity and to plan adequately in wood production. Working time studies are among the most commonly used methods to analyse the productivity of harvesting systems (Gallis 2004, Gallis and Spyroglou 2012, Savelli et al. 2010). Regression analysis is one of the most commonly used statistical methods for building estimation models that describe the relationship between two or more variables (Khamis and Razak 2017). As calculation with the least-squares method (LSM) is rather easy, it is the preferred method for most regression applications (Wu and Yu 2018). However, various works have demonstrated that outliers or multi-polar/missing data within a data set may adversely affect regression results (Berber 2003, Wen et al. 2013). In such cases, it is more appropriate to use the robust method, which can provide more reliable results by limiting the weight of outliers (Al-Amleh 2015). With the robust regression method, the result for each measurement is affected as little as possible by gross errors in that measurement or in the others, so the negative impact of measurement errors on the results is reduced. This method allows gross errors to be identified more accurately because the effect of outliers is not spread over the other measurements. Therefore, a model that produces reliable results from consistent data could be designed.
The objective of the study was to design a model that accurately estimates effective productivity ($P_e$) from known log volume ($V_t$), route slope ($P$) and winching distance ($D$) in a harvesting unit where uphill tractor cable skidding was carried out. For this purpose, the effective productivity estimation for cable skidding was modelled with two methods: 1) linear regression in the Statistical Package for the Social Sciences (SPSS); and 2) robust regression, followed by a comparison of the two methods.

Material and Methods
The study was performed in the East Black Sea forests in the north-east of Turkey during 2016-2017. Route slope, winching distance, log volume and winching time data were collected from harvesting units that use uphill cable winching with a forest tractor. All harvesting units were pure spruce (Picea orientalis (L.) Link) stands of a middle age class with a canopy closure of 0.71 to 1.00. The slopes of the skidding routes, which ranged from 40 to 80%, were measured with an inclinometer, and the winching distances, which ranged from 38 to 115 m, were measured with a steel tape measure.
In the study, 247 logs were extracted uphill using an MB-Track 900 model forest tractor. One log was skidded each time. The diameters ($d$) in cm and lengths ($L$) in m of the logs were measured. The volume of the transported logs ($V_t$) in m³ was calculated with Huber's formula (Castellanos et al. 2007, The Forest Service 1999):

$$V_t = \frac{\pi}{4}\left(\frac{d}{100}\right)^2 L \qquad (1)$$

Time measurements were performed with a stopwatch, and the winching time ($t$) in h for each log was determined by the reset time measurement technique. Effective productivity of the system ($P_e$) in m³/h was calculated as (Mederski 2006):

$$P_e = \frac{V_t}{t} \qquad (2)$$

The regression equation for effective productivity ($P_e$) included skidded log volume ($V_t$), route slope ($P$) and winching distance ($D$):

$$P_e = b_0 + b_1 V_t + b_2 P + b_3 D \qquad (3)$$

The equation was solved with the SPSS linear regression and robust regression methods. Significant coefficients were determined for each solution, and the regression equations were written with these coefficients.
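The two field calculations above are simple enough to check by hand. The following minimal sketch shows them in Python; the function names and the sample log are hypothetical, and it assumes that the diameter is the mid-length diameter in cm, as Huber's formula normally requires.

```python
import math


def huber_volume(d_cm: float, length_m: float) -> float:
    """Log volume in m^3 from diameter (cm, assumed mid-length) and length (m)."""
    d_m = d_cm / 100.0  # convert cm to m
    return math.pi / 4.0 * d_m ** 2 * length_m


def effective_productivity(volume_m3: float, winching_time_h: float) -> float:
    """Effective productivity P_e in m^3/h: volume divided by winching time."""
    if winching_time_h <= 0:
        raise ValueError("winching time must be positive")
    return volume_m3 / winching_time_h


# Hypothetical log (not from the study): 40 cm diameter, 4 m long, winched in 3 min.
v_t = huber_volume(40.0, 4.0)
p_e = effective_productivity(v_t, 3.0 / 60.0)
print(f"V_t = {v_t:.3f} m^3, P_e = {p_e:.2f} m^3/h")
```

Keeping the unit conversion inside the volume function avoids the most common slip with this formula, which is mixing centimetres and metres.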

Regression Analysis
Regression is a popular method that describes the relationship between one dependent variable and one or more independent variables with a mathematical function (Uyanık and Güler 2013). In the literature, there are statistical estimation methods such as the least-squares method (LSM) (Koch 1999), weighted least squares (Draper and Smith 1998), robust regression (Chen and Pinar 1998, Gross 2003, Rousseeuw and Leroy 1987, Staudte and Sheater 1990), genetic algorithm (Pan et al. 1995) and artificial neural networks (Stern 1996).
The relationship between the inputs and outputs in a multivariate linear regression model was expressed as: Where: $e$ is the random error and $t$ is the number of unknown parameters.

The Least-Squares Method (LSM)
In the estimation of model parameters, the LSM was applied, and the objective function was written as: As the error ($e$) is the negative of the correction ($v$), replacing the error by the correction in Eq. 4 gave the following equation: This equation could be written in matrix form as: Where: $P$ is the weight matrix of the input variables. The inverse weight matrix of the coefficients ($Q_{bb}$) was obtained with Eq. 8 as: The average error ($m_0$) of the unit measure was determined with the correction vector ($V$), calculated by substituting the coefficient vector ($b$) into Eq. 7, as: Where: $n$ is the number of measurements and $u$ is the number of unknowns. The standard error (SE) of the regression parameters ($m_b$) was calculated as: The inverse weight matrix of the corrections ($Q_{vv}$) was calculated as: The standard error of the corrections was calculated as: The significance of the regression coefficients obtained with Eq. 8 was tested as:
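The quantities named in this section follow the usual weighted least-squares pattern, which can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' software: it takes the corrections as $v = Xb - y$, the coefficients as $b = (X^T P X)^{-1} X^T P y$, the unit error as $m_0 = \sqrt{v^T P v / (n - u)}$, the standard errors as $m_b = m_0 \sqrt{q_{bb}}$ and the test value as $|b| / m_b$. The function names and the eight sample observations are hypothetical, and the linear algebra is written out with the standard library only so that the block runs without extra packages (NumPy would be the more usual choice).

```python
import math


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("normal matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]


def weighted_lsm(X, y, p=None):
    """Return coefficients b, standard errors m_b, test values |b|/m_b and m_0."""
    n, u = len(X), len(X[0])
    p = p or [1.0] * n  # unit weights unless given
    # Normal equations: N = X^T P X and rhs = X^T P y
    N = [[sum(p[k] * X[k][i] * X[k][j] for k in range(n)) for j in range(u)]
         for i in range(u)]
    rhs = [sum(p[k] * X[k][i] * y[k] for k in range(n)) for i in range(u)]
    q_bb = invert(N)
    b = [sum(q_bb[i][j] * rhs[j] for j in range(u)) for i in range(u)]
    v = [sum(X[k][j] * b[j] for j in range(u)) - y[k] for k in range(n)]
    m0 = math.sqrt(sum(p[k] * v[k] ** 2 for k in range(n)) / (n - u))
    m_b = [m0 * math.sqrt(q_bb[j][j]) for j in range(u)]
    t_values = [abs(b[j]) / m_b[j] for j in range(u)]
    return b, m_b, t_values, m0


# Hypothetical observations (not the study data): columns are 1, V_t, P (%), D (m).
X = [[1, 0.4, 45, 40], [1, 0.9, 60, 80], [1, 1.5, 50, 60], [1, 2.1, 75, 110],
     [1, 0.6, 70, 95], [1, 1.2, 40, 50], [1, 1.8, 65, 70], [1, 2.3, 55, 100]]
y = [4.6, 6.9, 10.8, 13.2, 5.1, 9.4, 11.9, 15.0]
b, m_b, t_values, m0 = weighted_lsm(X, y)
for name, bj, sj, tj in zip(("b0", "b1", "b2", "b3"), b, m_b, t_values):
    print(f"{name} = {bj:8.4f}  SE = {sj:7.4f}  |b|/SE = {tj:6.2f}")
```

Each test value would then be compared with the t-table value for $n - u$ degrees of freedom; a coefficient whose test value falls below it is dropped and the model is refitted, which is the procedure the study applies to the winching distance.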

Robust Regression Method
Although LSM assumes normally distributed errors, this cannot be ensured in most applications. An incorrect measurement adversely affects the LSM estimator and the estimated values. Owing to the spillover effect, outliers impair all LSM findings and thus all test sizes, and they cause the regression curve obtained with LSM to shift towards the outliers. Outliers are therefore a serious problem in LSM analysis. One of the methods to overcome this problem is the robust regression method (Gallegos and Ritter 2005, Maronna and Zamar 2002, Meer et al. 1991, Wilcox 1997).
The robust regression method categorizes the measurements into reliable data and outliers. Reliable data have random errors, while outliers have gross errors. The robust regression method differs from LSM in that it limits the influence of outliers: the parameters are calculated by iteratively assigning smaller weights to the outliers (Caspary and Borutta 1987, Gao et al. 1992, Hampel et al. 1986, Yang 1994, Yang et al. 2002). The objective function ($\rho$) was written in the robust estimation method as: The derivative of the objective function was taken to obtain the influence function ($\psi$) as: and the weight function ($W$) as: The estimation was carried out iteratively by re-weighted least squares. In each iteration, the standardized corrections were compared with a limit value and new weights of the data were determined based on the selected weight function. The iteration was continued until the desired convergence was reached, and the new weights of the outliers were gradually reduced (Hekimoğlu and Berber 2003).
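For the Huber M-estimation that this study goes on to use, the three functions have a well-known textbook form. It is stated below in the notation of this section, with $v$ a correction and $c$ the limit value, as a reference sketch; the authors' own equations and Table 1 may use a different but equivalent notation.

```latex
\[
\sum_{i=1}^{n} \rho(v_i) \rightarrow \min, \qquad
\psi(v) = \frac{\partial \rho(v)}{\partial v}, \qquad
W(v) = \frac{\psi(v)}{v}
\]
\[
\rho(v) =
\begin{cases}
\tfrac{1}{2} v^{2}, & |v| \le c \\[2pt]
c\,|v| - \tfrac{1}{2} c^{2}, & |v| > c
\end{cases}
\qquad
\psi(v) =
\begin{cases}
v, & |v| \le c \\[2pt]
c\,\operatorname{sign}(v), & |v| > c
\end{cases}
\qquad
W(v) =
\begin{cases}
1, & |v| \le c \\[2pt]
c / |v|, & |v| > c
\end{cases}
\]
```

Inside the limit the estimator behaves exactly like least squares, and beyond it the weight falls off as $1/|v|$, so a gross error keeps pulling on the fit with a bounded force instead of a force that grows with its size.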
The iterated weight matrix was determined as: Where: $i$ is the iteration number. The robust weight matrix in the initial iteration was taken as the unit matrix ($W_0 = I$).
The parameters were calculated with the LSM based on Eqs. (7) and (8) as: Iteration was continued until the difference between the $b_{i+1}$ and $b_i$ parameters was insignificant. The weights of the outliers were observed to diminish after each iteration, and some even approached zero. The significance tests of the regression coefficients were conducted with Eq. 14. Then, the regression equation was obtained with the significant coefficients.
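The iteration described above can be illustrated with a short re-weighted least-squares loop. This is a minimal sketch and not the authors' MATLAB program, and it rests on several assumptions: the weights are recomputed from the current corrections on every pass (the usual form of the algorithm, which may differ from the authors' Eq. 18), the limit value is taken as `k` times a median-based scale estimate instead of the per-observation value of Eq. 25, and all function names and data are hypothetical.

```python
import statistics


def solve(a, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [list(a[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("normal matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def wls(X, y, p):
    """Weighted least squares: solve (X^T P X) b = X^T P y for diagonal P."""
    n, u = len(X), len(X[0])
    N = [[sum(p[k] * X[k][i] * X[k][j] for k in range(n)) for j in range(u)]
         for i in range(u)]
    rhs = [sum(p[k] * X[k][i] * y[k] for k in range(n)) for i in range(u)]
    return solve(N, rhs)


def huber_weight(v, c):
    """Huber weight: 1 inside the limit c, c/|v| outside it."""
    return 1.0 if abs(v) <= c else c / abs(v)


def huber_irls(X, y, k=1.5, tol=1e-8, max_iter=100):
    """Iteratively re-weighted least squares with Huber weights."""
    p = [1.0] * len(X)  # first pass uses unit weights (W_0 = I)
    b = wls(X, y, p)
    for _ in range(max_iter):
        v = [sum(xj * bj for xj, bj in zip(row, b)) - yk for row, yk in zip(X, y)]
        scale = statistics.median(abs(r) for r in v) / 0.6745  # assumed scale
        if scale == 0:
            break
        p = [huber_weight(r, k * scale) for r in v]
        b_new = wls(X, y, p)
        converged = max(abs(bn - bo) for bn, bo in zip(b_new, b)) < tol
        b = b_new
        if converged:
            break
    return b, p


# Hypothetical data: y = 1 + 2x with small alternating noise and one gross error.
x = list(range(10))
X = [[1.0, float(xi)] for xi in x]
y = [1.0 + 2.0 * xi + (0.1 if xi % 2 == 0 else -0.1) for xi in x]
y[5] += 20.0  # the outlier
b, p = huber_irls(X, y)
print("coefficients:", [round(bj, 3) for bj in b])
print("weight of the outlier:", round(p[5], 4))
```

In this example the gross error ends up with a weight close to zero while the remaining observations keep a weight of one, which mirrors the behaviour of the outlier weights reported above.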
Various weight functions are used in robust regression analysis (Gökalp et al. 2008) (Table 1).
The limit value parameter ($c$) for the weight functions in Table 1 can be determined with various techniques. In order for robust methods to provide accurate and reliable results, the limit value parameters should be determined as accurately as possible. The limit value was based on the assumption that observational errors would be scattered within the limits of $\pm c$ with a certain probability. The limit value could be assumed as: and the test size ($T_i$) could be tested as: Where: $q_{v_i v_i}$ is the $i$th diagonal element of the weight matrix of corrections ($Q_{vv}$) and $t_{f,\,1-\alpha/2}$ is the t-table value for $f$ degrees of freedom. Using Eq. 22, the limit value was calculated for each correction as: When the weights of the data were different, the limit value after the first iteration in Eq. 23 was: Based on empirical studies, statisticians suggest that the limit value $c$ could be taken as 1.5 or 2 (Somogyi 1988). However, calculating the limit value $c$ separately for each measurement with Eq. 25 provides a more realistic decision.
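One way to read the per-measurement limit value is as the correction size at which the test size reaches the t-table value. The sketch below assumes the test size has the common form $T_i = |v_i| / (m_0 \sqrt{q_{v_i v_i}})$, so that $c_i = t_{f,\,1-\alpha/2}\, m_0 \sqrt{q_{v_i v_i}}$; this form, the function names and all numbers are assumptions made for illustration and may differ from the authors' Eqs. 20 to 25. The t-table value is passed in as a plain number because the Python standard library has no t-distribution quantile.

```python
import math


def limit_values(m0, q_vv_diag, t_crit):
    """Per-observation limit c_i = t_crit * m0 * sqrt(q_vivi) (assumed form)."""
    return [t_crit * m0 * math.sqrt(q) for q in q_vv_diag]


def exceeds_limit(v, m0, q_vv_diag, t_crit):
    """True where the absolute correction |v_i| is larger than its limit c_i."""
    limits = limit_values(m0, q_vv_diag, t_crit)
    return [abs(vi) > ci for vi, ci in zip(v, limits)]


# Hypothetical values: three corrections, their Q_vv diagonal elements,
# a unit error m0 and a stand-in t-table value.
v = [1.0, -0.5, 0.2]
q_vv_diag = [0.81, 0.64, 0.49]
m0 = 0.5
t_crit = 2.0
c = limit_values(m0, q_vv_diag, t_crit)
flags = exceeds_limit(v, m0, q_vv_diag, t_crit)
print([round(ci, 3) for ci in c], flags)
```

Because each limit scales with the observation's own $\sqrt{q_{v_i v_i}}$, a well-controlled observation is held to a tighter limit than a poorly controlled one, which is the advantage over a single fixed value such as 1.5 or 2.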

SPSS Regression
The relationship between several parameters of harvesting operations and productivity is generally determined with regression models (Gallis 2004). The linear regression function was established according to Eq. 3 using the log volume, slope and winching distance for effective productivity in the uphill tractor cable-skidding method. The coefficients of the parameters and the significance levels of these coefficients were determined by SPSS from Eq. 3 (Table 2). It was found that the distance coefficient was not significant (p > 0.05) (Table 2). Therefore, the regression equation for effective productivity was reconstructed with log volume and slope as:

$$P_e = b_0 + b_1 V_t + b_2 P \qquad (26)$$

The coefficients of the parameters and the significance levels of these coefficients were re-calculated by SPSS from Eq. 26 (Table 3).
Previous studies showed that productivity was generally affected by volume (Proto et al. 2018), distance (Nikooy et al. 2013), slope (Gilanipoor et al. 2012), the number of logs in each cycle (Gholami and Majnounian 2008) and the interaction between them (Mousavi 2009, Naghdi 2004, Pilevar 1996). While distance was found to affect productivity in some studies (Nikooy 2007, Wang 2004), Mousavi and Nikooy (2014) determined that distance had no effect on productivity, similar to this study. In addition, Gilanipoor et al. (2012) determined that productivity depended on slope and volume, similar to the findings in this study.

Robust Regression
The regression coefficients in Eq. 3 were calculated iteratively by the robust regression method. These calculations were performed with software written by the author in the MATLAB environment. The Huber M-Estimation was used as the estimation method (Table 1). In the Huber M-Estimation, the limit value ($c$) was determined by Eq. 25. Robust weights ($W$) were calculated from Eq. 17. The iterated weight matrix ($P_i$) was determined by Eq. 18. The regression coefficients in Eq. 3 were calculated by Eq. 19 after the iterations. With this method, the significance of the coefficients was tested by comparing the test size ($T_{b_i}$) with the t-table value (Table 4). It was found that the winching distance coefficient ($b_3$) in Eq. 3 was insignificant (Table 4). Therefore, the effective productivity regression coefficients were calculated with log volume and route slope, which were determined as significant, and the standard errors of the coefficients ($m_{b_i}$) were calculated with Eq. 11, with significance tests conducted (Table 5). It was found that the winching distance coefficient was not significant (Table 5). For this reason, the log volume and slope coefficients were re-calculated with the robust regression method from Eq. 26, and significance tests were performed. Hence, the regression equation of effective productivity for the uphill forestry tractor cable-skidding technique was determined using the significant coefficients with the robust regression method as follows:

$$P_{e_i} = 3.0940 + 5.5182\,V_{t_i} - 1.3886\,P_i \qquad (27)$$

In the literature, various studies have compared LSM and robust regression methods. Muhlbauer et al. (2009) found that outlying observations might bias the LSM trend estimate and lead to an overly high or low estimate, and therefore concluded that the robust method was more suitable than LSM for estimation modelling. Similar to this study, Hekimoğlu and Erenoğlu (2005) determined that it was more appropriate to use robust methods when outliers are present.

Conclusions
Accurate estimation of productivity is of great importance in wood-extraction activities, which are among the most difficult and expensive stages of wood production. In the literature, productivity prediction models have largely been developed by linear regression analysis in SPSS software. However, the linear regression model in SPSS may not give sufficiently accurate results because the coefficients are calculated according to the LSM from all data, including outliers. Therefore, this research sought a more sensitive determination of productivity. For this purpose, productivity estimation was carried out by the robust estimation method, which minimizes the effects of outliers, and the results were compared with the SPSS estimation.
In this study, spruce logs with volumes of 0.28-2.35 m³ were transported uphill with a forest tractor over distances of 38-115 m in harvesting units with slopes between 40 and 80%. Productivity was estimated from slope, volume and winching distance using SPSS linear regression and robust regression with the Huber M-estimation prepared in MATLAB. With both estimation methods, slope and volume had a significant effect on productivity, while winching distance did not. Productivity increased with volume and decreased with slope. The standard errors of the coefficients of both models were compared, and it was concluded that the robust method was more sensitive than the SPSS regression method.
In future studies, productivity estimations for cable traction with forestry tractors should be investigated for different slopes, volumes and tree species. In addition, it is recommended that productivity estimates for different extraction methods be made with different robust estimation methods.