The Effect of New Silvicultural Trends on Mental Workload of Harvester Operators

Close-to-nature (CTN) forestry offers many advantages, but makes management more complex and generally results in lower harvesting productivity and higher harvesting cost. While the higher harvesting cost of CTN is widely acknowledged, few ever consider the potential impact on operator workload, as the harvesting task becomes more complex. This study aimed to determine the mental workload of harvester operators under two silvicultural regimes: »pure conifer« stand and »mixwood« stand. In total, 13 harvester operators with varying experience levels were monitored for work performance and mental workload when operating a harvester simulator in two virtual stands designed according to the above-mentioned silvicultural regimes. Mental workload was assessed using the NASA Task Load Index (NASA-TLX) interview method and heart rate variability measurements, during two 30-minute test sessions performed in the »pure conifer« and the »mixwood« stand, respectively. As expected, operating in a more diversified »mixwood« stand resulted in a marked productivity loss, estimated between 40 and 57%. The study also confirmed the aggravation of mental demand, effort and frustration experienced by the operators when passing from the »pure conifer« stand to the »mixwood« stand. This increase in mental workload was independent of the age and experience of the operators. Results can be used to paint a more holistic picture of CTN forestry and its implications for harvester operators. Besides increasing the number of subjects being monitored, future studies should focus on live forest operations.


Introduction
In the face of a changing climate, close-to-nature (CTN) forestry can contribute to increasing the resiliency of forest stands to natural disturbances such as pests, pathogens, windthrows, and droughts (Schütz et al. 2006, Morrison et al. 2014, Felton et al. 2016, Schäfer et al. 2017). The popularity of CTN and continuous-cover forestry has been on the rise in German public forests and in several other European countries. In this context, the presence of multi-cohorts with varying stem diameter and height distributions in combination with mixed-species assortments offers strong ecological benefits that have been studied extensively. Key examples are that mixed-species stands often exhibit increased stability and higher yields as compared to mono-specific stands (Kanowski et al. 2005, Pretzsch and Schütze 2006). practice may further tax the strength of machine operators, leading to mental fatigue, loss of focus and increased accident risk. So far, no one has asked whether this is the case, and to what extent the wellbeing of forest operators is affected by silvicultural choices. Furthermore, one may expect that different operators will be affected differently, depending on their personality, their experience and their general adaptation capacities (Cooper and Payne 1991).
Therefore, the goals of this study were to determine the influence of changing to CTN forestry on operator performance and mental workload, and to ascertain whether the possible effects are mediated by operator demographics. In particular, the study was designed to test the following null hypotheses: (1) a shift to CTN forestry has no impact on productive performance (e.g. work productivity and work quality); (2) a shift to CTN forestry has no impact on mental workload, despite the increase in task complexity; (3) even if a shift to CTN forestry resulted in a significant change of productivity and/or mental workload, such change is not mediated by operator demographics.

Work Environment, Machine and Operators
The experimental hypothesis was tested in a computer-generated environment, in order to create even and reproducible experimental conditions, unaffected by the large variability encountered in real-life forest environments. The computer used for the test was a state-of-the-art John Deere harvester simulator, routinely used for operator training at the Forest Operator Training Centre in Arnsberg, Nordrhein-Westfalen, Germany. Harvester simulators produce a reasonably faithful representation of real-life work environments (Ovaskainen 2005), and have been used for operator training by many training centers for a long time (Wiklund 1999, Ranta 2009). Modern simulators allow users to design their own stands with a good level of detail, defining the size, species and gait of each individual tree and placing each tree in one specific point of the simulated stand. Besides, users can define terrain characteristics, strip-road spacing and density of the understory. In particular, the John Deere simulator used for the experiment allowed accurate designing of the test stand through the software Simulator Terrain Editor 3.2.
For the purpose of the study, researchers designed two alternative stands, conventionally called »pure conifer« and »mixwood« (Table 1). The design of the »pure conifer« stand replicated the exact characteristics of a real stand used for a previous test conducted a few years earlier in the Italian Alps. The design of the »mixwood« stand was obtained by changing into broadleaves 40% of the conifer trees in the »pure conifer« stand, thus providing a 66/44 conifer to broadleaved mix. The transformation was indeed arbitrary and just meant to reflect a hypothetical change, not a specific real stand. As far as possible, density and stocking remained the same, although the »mixwood« stand received an additional generous distribution of advance broadleaved regeneration amounting to 250 stems/ha, with height up to 6 m (single stem volume up to 0.02 m³). This was done to reflect the widespread presence of a shade-tolerant understory resulting from the introduction of broadleaved trees. The prescription for the »pure conifer« stand was the same as in the real stand: a selection cut designed to remove 27% of the tree number, or 18% of the standing volume. The same tree number removal rate was applied to the »mixwood« stand, although the volume removal was slightly lower due to the occasional presence of smaller broadleaved removal trees. All removal trees were marked with red paint. Tree distribution in the two simulated stands is shown in Table 2. This reflects the numbers on a 0.58 ha plot, very similar to the real plots used for the Alpine trial of 2013.
Table 1 Characteristics of the real stand used as a model for the simulated stand »conifer«
Researchers selected thirteen volunteer test subjects, chosen to represent a wide range of age, experience and training (Table 3). All subjects were male, which reflected the lack of female volunteers and the characteristics of a male-dominated business sector. Each test subject was asked to conduct a 30 min harvesting simulator session on each of the two simulated stands. The machine selected for the session was the same for all subjects: a John Deere 1270E harvester.
Before starting, test subjects were asked for their consent, were guaranteed anonymity and were informed about the purpose of the test. They also received instructions on how to operate, and were asked to follow their own work pace and technique. All sessions were preceded by a short warm-up, lasting approximately 15 minutes: after that, each subject harvested the »pure conifer« stand first and the »mixwood« stand second. A 15-minute rest pause was included between the two sessions, to allow subjects to cool down and recover, and to prevent fatigue from affecting performance and workload during the second session. In any case, sessions were kept relatively short (30 minutes each) with the main purpose of preventing both fatigue and boredom. The two sessions in each test set (i.e. »conifer« and »mixwood«) were administered in sequence, in order to avoid the effect of daily variations on the energy levels, motivation and general mood of the test subjects.
Depending on the availability of operators, tests were grouped into two separate periods during the year (weeks 13 and 26). Four subjects (# 2-5) were available during both weeks and were tested twice. In that case, the general analyses were conducted on the results of the second iteration only, conducted in week 26. However, the results of the first iteration (week 13) were matched against those obtained from the same subjects during the second iteration (week 26) in order to gauge the reproducibility of the experimental procedure.

Measurements
Subject performance was assessed on the basis of the work productivity and work quality indicators shown on the reports that were generated by the simulator upon closing each individual session. In particular, work productivity was quantified in terms of trees and m³ harvested per work hour, distance covered by the machine in either direction (forward and backwards), time spent driving and time spent harvesting, etc. In turn, work quality was reported as the number of residual trees damaged during work, the number of machine damage events and the number of stumps exceeding the maximum allowed cut height, which was set at 20 cm. Of course, simulator figures must be interpreted with some caution, because real life is more complex than commercial computers can represent. Therefore, one cannot take all numbers on the simulator report at face value. However, the simulator environment is reasonably close to real-life conditions and exact emulation is not necessary for a comparative test, anyway: if the two treatments are compared on the same simulator, then any drift in the simulator report figures will be equal for both treatments on trial and the comparison will remain valid.
Mental workload was assessed using the NASA Task Load Index (NASA-TLX) method, which offers the best combination of reliability, repeatability and convenience (Hart and Staveland 1988). This method is a subjective measure of mental workload, and as such it is easy to implement, non-intrusive and quite sensitive to variations in mental workload (Shick and Hahn 1987). Among the mainstream subjective measure methods, NASA-TLX was chosen because of its higher concurrent validity, i.e. its capacity to correlate mental workload measures with performance measures (Rubio et al. 2004). That makes it well suited to a study covering both performance and mental workload. In essence, the NASA-TLX is calculated by combining the ratings offered by subjects on six subscales. These are designed to represent the most relevant dimensions of mental workload, namely: Mental, Physical, and Temporal Demands, Frustration, Effort, and Performance (NASA 1986). In this study, the NASA-TLX test was administered to subjects immediately after each simulator session, even though a little information may be lost when ratings are given retrospectively (Hart et al. 1986, Haworth et al. 1986).
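As an illustration of how six subscale ratings can be combined into one score, the following minimal sketch shows the two commonly used NASA-TLX variants: the raw (unweighted) mean and the weighted score based on 15 pairwise comparisons. The text does not state which variant was used in this study, and all ratings and weights below are hypothetical values chosen only for the example.

```python
# Minimal NASA-TLX scoring sketch (illustrative; not the study's own procedure).
# Ratings are assumed to be on a 0-100 scale; weights are the number of times
# each subscale was chosen in the 15 pairwise comparisons (0-5 each, sum 15).

SUBSCALES = ("mental", "physical", "temporal",
             "performance", "effort", "frustration")


def raw_tlx(ratings):
    """Unweighted score: plain mean of the six subscale ratings."""
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)


def weighted_tlx(ratings, weights):
    """Weighted score: ratings weighted by pairwise-comparison tallies."""
    total = sum(weights[s] for s in SUBSCALES)
    if total != 15:
        raise ValueError("pairwise-comparison weights must sum to 15")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / total


# Hypothetical operator after one simulator session.
example_ratings = {"mental": 70, "physical": 20, "temporal": 40,
                   "performance": 50, "effort": 60, "frustration": 30}
example_weights = {"mental": 5, "physical": 0, "temporal": 2,
                   "performance": 3, "effort": 4, "frustration": 1}

if __name__ == "__main__":
    print("raw TLX:", raw_tlx(example_ratings))
    print("weighted TLX:", round(weighted_tlx(example_ratings, example_weights), 1))
```

Keeping the subscale ratings alongside the overall score makes it possible to see which dimensions drive a change between treatments, which is how the results for mental demand, effort and frustration are reported here.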
Mental workload was also assessed using heart rate variability (HRV) as an objective physiological measure. HRV reflects the balance between the parasympathetic and sympathetic activity of the autonomic nervous system (ANS) and is sensitive to changes in mental workload (Mulder and Mulder 1981, Aasman et al. 1987, Jorna 1992, Veltman and Gaillard 1993). In particular, increased mental workload tends to decrease HRV (Delliaux et al. 2019). Recently, HRV analysis has become increasingly popular for monitoring the training-readiness of professional and amateur athletes (Kiviniemi et al. 2007, Plews et al. 2013, Tian et al. 2013), which has led to a rapid growth in the market for accurate, affordable and rugged wearable HRV devices (Flat and Esco 2013, Hernando et al. 2019). In particular, the HRV device used in this study was a Polar H8 chest belt heart rate monitor (Polar Electro Oy, Kempele, Finland), coupled with the Elite HRV dedicated smartphone-based software (https://elitehrv.com/). The accuracy and reliability of both the hardware and software are confirmed by previous validation studies (Giles et al. 2016, Caminal et al. 2018). Unfortunately, the HRV device was only available for the second test period on week 26 and was administered to 11 out of 13 test subjects. Therefore, the experimental plan included: thirteen subjects for testing hypotheses 1 and 2, the latter limited to subjective measures (NASA-TLX); ten subjects for testing hypothesis 2 with physiological measures (HRV analysis); thirteen subjects for testing hypothesis 3.
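For readers unfamiliar with HRV indicators, the following minimal sketch computes the usual time-domain measures (SDNN, RMSSD, ln RMSSD, NN50, pNN50 and mean heart rate) from a series of NN intervals. It uses the standard textbook definitions and is not the proprietary Elite HRV processing; the use of the sample standard deviation for SDNN and the interval values in the example are assumptions made for illustration.

```python
# Time-domain HRV indicators from NN intervals in milliseconds.
# Illustrative sketch only: standard definitions, hypothetical input data.
import math
import statistics


def hrv_time_domain(nn_ms):
    """Return common time-domain HRV indicators for a list of NN intervals (ms)."""
    if len(nn_ms) < 3:
        raise ValueError("need at least three NN intervals")
    # Successive differences between adjacent NN intervals.
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    return {
        "mean_hr_bpm": 60000.0 / statistics.mean(nn_ms),
        "sdnn_ms": statistics.stdev(nn_ms),  # sample SD (assumption)
        "rmssd_ms": rmssd,
        "ln_rmssd": math.log(rmssd) if rmssd > 0 else float("-inf"),
        "nn50": nn50,
        "pnn50_pct": 100.0 * nn50 / len(diffs),
    }


if __name__ == "__main__":
    # Hypothetical short recording; real records span many minutes.
    sample = [800, 850, 790, 900, 820]
    for name, value in hrv_time_domain(sample).items():
        print(f"{name}: {value:.2f}")
```

Lower SDNN, NN50 and pNN50 in one session than in another would point in the direction of higher mental workload, which is the comparison made between the two treatments.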

Data Analysis
Since data violated the normality assumption, the significance of any difference between treatments was tested with non-parametric techniques. In particular, the effect of a change in silviculture on operator performance and mental workload was checked with the Wilcoxon Signed Rank test, a paired test matching the results obtained by each operator under each treatment. This was done for the final scores, as well as for each of the individual performance indicators con-
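The paired comparison described above can be sketched as follows. This is a from-scratch illustration of the Wilcoxon signed-rank test with an exact two-sided p-value obtained by enumerating all sign patterns, which is feasible for samples as small as the one used here. The text does not say which software was used, and the productivity values in the example are hypothetical.

```python
# Wilcoxon signed-rank test for paired samples, with an exact two-sided
# p-value by full enumeration (practical for roughly n <= 20 pairs).
# Illustrative sketch only; data below are hypothetical.
from itertools import product


def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, exact two-sided p) for paired samples x, y."""
    # Differences per operator; zero differences are dropped.
    diffs = [b - a for a, b in zip(x, y) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    observed = min(w_plus, w_minus)
    total = sum(ranks)
    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        wp = sum(r for r, s in zip(ranks, signs) if s)
        if min(wp, total - wp) <= observed:
            extreme += 1
    return w_plus, w_minus, extreme / 2 ** n


if __name__ == "__main__":
    # Hypothetical productivity (m3 per hour) of six operators per treatment.
    conifer = [20.0, 18.5, 22.0, 19.0, 21.5, 17.0]
    mixwood = [11.0, 10.0, 12.5, 9.5, 13.0, 10.5]
    print(wilcoxon_signed_rank(conifer, mixwood))
```

In practice an established statistics routine, such as `scipy.stats.wilcoxon`, would normally be preferred; the explicit version is shown only to make the ranking and sign logic visible.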

Impact on Performance
As expected, operating in a more diversified »mixwood« stand resulted in a marked productivity loss, estimated between 40 and 57% (mean value 48%). This ranks alongside a similar reduction in the mean size of the harvested trees, which was also reduced by approx. 50% (Table 4). Work quality also degraded, as machine damage events doubled and residual tree wounding occurrences tripled. All these differences were highly significant, and the results may be taken as conclusive. The occurrence of overly tall stumps was lower in the »mixwood« treatment than in the »pure conifer« treatment: that may be taken as a work quality improvement, possibly related to the smaller diameter of the trees being cut, which makes it easier to slide the harvesting head down to the ground at the time of felling.
Productivity was associated with operator experience, and increased by ca. 1.5 percentage points with each year on the job, on average (Table 5). Subjects with 20 years of experience were 27% more productive than beginners. The percent of work time spent driving and maneuvering the boom was significantly higher for the more experienced operators, which may point to a more sophisticated use of machine functions, or simply to a faster execution of tasks other than driving and boom operation, which would inflate the total share occupied by these two tasks.
On the other hand, no significant relationship was found between operator experience and productivity losses incurred when shifting from the »pure conifer« treatment to the »mixwood« treatment (Table 6). Apparently, changing to a more complex prescription impacted all operators equally, regardless of their experience. This finding justified the use of the »mixwood« indicator variable in isolation (static effect), and not as an interaction variable associated with op-

Impact on Mental Workload
On average, TLX scores increased by 75% when passing from the »pure conifer« stand to the »mixwood« stand (Table 4). This increase was caused by a marked aggravation of mental demand, effort and frustration. All these differences were statistically significant. In contrast, treatment had no significant effects on physical demand, time demand and performance. While physical demand and time demand were rather small contributors to mental workload, performance was the second most important contributor, but it did not change significantly between treatments. This may indicate that performance pressure is an important component of mental workload in harvesting work, in general.
Interestingly enough, TLX scores did not correlate significantly with operator experience, nor did the TLX score increments associated with the treatment change. However, TLX scores were significantly correlated with productivity, confirming the high concurrent validity of this test method (Table 6). Treatment effects were not clearly reflected in the objective measures of mental workload, obtained through the analysis of heart rate variability. Results were analyzed separately for the two 15-minute records obtained from each session, thus comparing the first 15-minute record for the »pure conifer« session with the first 15-minute record obtained from the »mixwood« session, and so on with the second 15-minute records, separately for each subject. Some evidence was obtained from the second iteration only, and limited to SDNN, NN50, PNN50 and Poincaré SD2: all these values decreased when shifting to the mixwood treatment, as generally occurs when mental workload increases (Table 7). However, only the results for SDNN are significant at the 5% level and thus conclusive, while the others are significant at the 10% level and may be taken as suggestive, not conclusive. No significant correlation was found between TLX score and SDNN, NN50, PNN50 or Poincaré SD2 (R² < 0.1).
Notes: NN: time between successive heartbeats; RMSSD: root mean square of the successive differences; ln RMSSD: natural logarithm of RMSSD; SDNN: standard deviation of the NN intervals; NN50: number of pairs of successive NN intervals that differ by more than 50 ms; HR: heart rate; LF Power: frequency activity in the 0.04-0.15 Hz range (low frequency range); HF Power: frequency activity in the 0.15-0.40 Hz range (high frequency range); SD1: dispersion of points perpendicular to the line of identity; SD2: dispersion of points along the line of identity; Elite HRV: score attributed by the Elite HRV device using its own proprietary algorithm

Reproducibility
The figures of performance and subjective mental workload rating were not significantly different between the two repeat sessions administered to subjects 2-5 (Table 8). Productivity actually increased by 28% for the »pure conifer« treatment, but the difference was not statistically significant. A much smaller productivity increase (6%) was also recorded for the »mixwood« treatment but, again, the difference was not significant. These results may still suggest some learning effect over the two months elapsed between the two iterations, which was stronger for the conifer treatment, since this is the treatment routinely carried out and the one with which operators may have gained additional experience. However, in the absence of conclusive evidence, such inference remains highly speculative. The result for mental workload was even clearer: differences were small and devoid of any significance. The TLX scores obtained in the two iterations are essentially the same, pointing to a marked reproducibility of the method.

Limitations of the Study
First of all, it is important to outline the limitations of this study so that any statements made in the following paragraphs are interpreted with the necessary caution. In particular, the main limitations of this study are: the degree to which the simulated stands and treatments reflect the actual silvicultural practice, the capacity of simulated environments to mirror reality, the relatively small number of participants and the risk for transfer effects between treatments.
Even if a conscious effort was made to replicate the actual stands and treatments, the selected prototypes cannot be taken as anything more than one example of the many ways in which silviculture is applied, especially when trying to convert conventional pure or »semi-pure conifer« forests into more complex »mixwoods« (Larsen and Nielsen 2007). The authors are conscious of these limitations, and only wanted to offer some insights into the effect of this type of change on harvesting performance and forest machine operator wellbeing. It stands to reason that such effects may be more or less marked depending on variations in the complexity of stand architecture and silvicultural prescriptions, and that is a good reason why the same experiment should be repeated under a wider range of forest types and silvicultural regimes.
Simulated work environments present obvious differences from the real ones, not least the absence of factual consequences in case of errors, which is bound to reduce fear and anxiety (Diane 1996, Bell et al. 1998, Veltmann 2002). That may affect their capacity to elicit the same stress levels otherwise recorded when performing potentially dangerous tasks like the one at hand. For that reason, it is important to recall that this study addresses mental workload, not stress, and in that regard, the capacity of simulators to reflect real work techniques and demands has been demonstrated in the past (Ovaskainen 2005). The only missing element is the effect of noise and vibration, which is known to affect cognitive performance (Ljungberg and Neely 2007). However, any such bias would affect both treatments equally, so that the comparison is not invalidated. In any case, much caution must be taken when interpreting data in the simulator reports, especially concerning machine and residual stand damage: these figures are often inflated, since the software counts as damage events all impacts between the machine and the tree (felled or standing), regardless of the kinetic energy at the point of contact. Therefore, the total count likely includes many events that would not normally result in damage. In contrast, the productivity data returned by simulators offer a reliable representation of real-world performance, because they generally match the figures from actual operations (Eriksson and Lindroos 2014).
Obviously, it would have been good to include a larger number of participants in the study, but subject availability was especially limited at the time of the study, when most of the workforce was busy trying to tackle the exceptional workload caused by one of the worst bark beetle outbreaks in the history of the region (Niesar et al. 2018). Even so, the number of participants was large enough to disclose highly significant differences in productivity and mental workload, determined with subjective measures. This encouraging result may have been related to a good selection of test subjects, which offered an even coverage of a relatively wide range of age and experience. In any case, similar studies have been successfully completed even when using fewer test subjects than here (Wenhold et al. 2019). On the other hand, the number of test subjects was below the minimum figure of 20 recommended by specialists when conducting physiological measures of mental workload, and that may be the reason why this component of the study only produced suggestive results (Quintana 2017).
Concerning transfer effects, one may surmise that the second test in the sequence (»mixwood«) was affected by the learning and fatigue accumulated during the first test. In order to mitigate this effect, the experimental plan included a 15 min break between the two tests, which could be further extended if the test subjects felt especially tired. In any case, the risk of a transfer effect was weighed against the risk of hitting daily or weekly fluctuations in energy, attention or motivation that would have been incurred if the two tests were further spaced out.

Impact on Productivity and Work Quality
Although the study indicates that the »mixwood« treatment was associated with a marked and significant reduction of harvesting productivity, that may not imply a direct causal relationship with task complexity. The »mixwood« treatment was also associated with a strong reduction in mean tree size, which is known to have a strong effect on productivity. This effect has been variably described with linear (Holzscher and Bossy 1997, Sirén and Aaltio 2003, Nakagawa et al. 2007), quadratic (Kärhä et al. 2004, Nurminen et al. 2006) or power functions (Jirousek et al. 2007, Visser and Spinelli 2012), and it is in the order of magnitude observed in this study. Therefore, the effect on productivity resulting from the treatment change may be due in the largest measure to the change in mean tree size, rather than the change in task complexity. If so, one should pay special attention to the increase in mental workload, as the main direct effect resulting from the new treatment. The other effect that is not mediated by tree size differences is a decrease in work quality, especially concerning damage to the machine or to residual trees: however, that effect cannot be quantified with absolute certainty due to the way in which the software grouped all damage events together, regardless of their severity.
As a collateral benefit, the study offered additional proof about the fundamental relationship between productivity and operator experience. The most experienced operators were between 25% and 50% more productive than their least experienced colleagues, confirming past estimates, all in the range of 40% (Kirk et al. 1997, Ovaskainen and Heikkila 2007, Purfürst and Erler 2011). Of course, these are very general figures that can be strongly affected by changes in working conditions and motivation over time (Leonello et al. 2012). The study also found indications that more experienced and productive operators may use a different work technique than that adopted by their less experienced and productive colleagues, as already suggested in the literature (Ovaskainen et al. 2004).
In contrast, this study did not find any evidence for the effect of age, despite a good balance of the age and experience factors. Therefore, one may conclude that, for the range of ages found in the subject pool, age has very little effect on performance, unless it is associated with experience. Apparently, the cognitive and motor abilities of healthy individuals peak around age 24 (Thompson et al. 2014) and do not decline before 65, if at all (Harada et al. 2013). The absence of any evidence for the effect of gaming habit was likely due to data set imbalance, whereby very few subjects regularly used video games.
The fact that the rate of performance decline was not correlated with experience contradicts the expectation that more experienced and skillful operators may better cope with an increase of task difficulty compared with their less experienced colleagues (McEwan et al. 2016). The study shows that more experienced operators maintain their edge over the rest when task difficulty increases, but they suffer the same productivity decline as their less experienced colleagues. This may indicate that experience brings better performance, but not a better adaptation capacity. Of course, this is true for the first 30 minutes only. When a new challenge is introduced, its effect on performance is likely strongest at the very beginning, and may decrease as the operators become familiar with it and develop their own coping strategies. In that process, more experienced and skillful operators may, or may not, progress more rapidly than the rest. The study was not designed to gauge long-term adaptation and therefore cannot answer that question.

Impact on Mental Workload
The NASA-TLX method proved very effective at detecting differences between tasks and at pinpointing the main sources of mental workload increases, in this case specifically mental demand, effort and frustration. In the present study, mental demand is indeed the dominant component and is strongly affected by task changes, whereas time demand is not affected by the changes, but still is an important component of mental workload. The significant effects of task change on effort and frustration may be related to the dense understory, which impairs visibility and constrains movement, as noted by several of the test subjects during informal debriefing interviews. This assumption is corroborated by past studies that indicate how a dense understory is a handicap to productive harvesting work (Ireland and Kerr 2008, Niemistö et al. 2012) that aggravates the mental workload of the operators engaged in it (Gellerstedt 1997).
The absence of any correlation between TLX scores and operator experience is not surprising: it stands to reason that subjective mental workload depends on both task type and the subject's own personality, which is the result of many factors besides experience with the task at hand. That is why the TLX method is best used for comparison between alternative tasks, not between operators.
While the TLX method offered a strong and clear answer to the study question, the same clarity was not achieved by HRV analysis. Results of the physiological measures were suggestive, not conclusive. It is most likely that HRV analysis was unable to produce conclusive results because the sample was too small for the difference one wanted to detect: HRV analysis is a good indicator of mental workload only when differences in task demand are high (Eilers 1999, Mulder et al. 2000), and even so at least 20 subjects are required (Quintana 2017). In fact, the suitability of HRV analysis to detect changes in mental workload has been challenged by some authors, who state that current HRV methods lack clarity and need improvement (Billman 2013, Heathers 2014). In that regard, it is important to recall that HRV analysis was already used 20 years ago to determine if a change in silviculture would affect harvester operator workload: Yamada (1988) found that a shift from clearcut to thinning would indeed cause a decrease of the HRV in two harvester operators and concluded that their mental workload did increase. However, a similar study conducted at about the same time in Japan could only find differences between HRV at rest and during work, but not between treatments (Imajma 1997). This may support the notion that a few operators are not enough for a conclusive outcome, but may offer suggestive evidence nevertheless. In this study, subjective and objective measurement methods tended towards agreement and that may speak in favor of HRV, since the few significant or quasi-significant differences pointed to increased task complexity resulting in a reduction of heart rate variability (Fairclough et al. 2005, Veltman and Gaillard 1998), especially when expressed through non-linear indicators (Delliaux et al. 2019).

Conclusions
During tests in a controlled virtual environment, the mental workload of harvester operators was significantly higher when engaged in a diverse »mixwood« stand rather than in a more even »pure conifer« stand. This result contributes another important element to making informed decisions about forest management. When weighing the advantages and drawbacks of a silvicultural choice, it is important to include all its consequences, to the best of one's knowledge. The impact on the mental workload of forest machine operators is seldom considered, mostly because few have ever brought it to the attention of decision makers. This paper contributes to a better awareness of how silvicultural decisions affect forest machine operator workload. Increased mental workload is not just about discomfort, but it may have serious effects on worker health and work safety: fatigued operators are more prone to make mistakes, and these may turn into injury and/or damage. Besides, there is a fundamental contradiction between the frequent complaints about the lack of forest workers and the constant increase in the workload imposed on them. One cannot attract people to a job by making it more difficult. If that is the case, the increased demands must be matched by a suitably higher compensation. At the same time, countermeasures should be devised for mitigating the increased workload. Today, better technology can greatly help in that direction, through automation and augmented reality. This new technology may partly offset performance decline and relieve operator workload (Cottrel and Barton 2013). Obviously, this study is just a first contribution to assessing the relationship between silvicultural choices, mechanization and operator workload. Further work should follow, in order to discern long-term trends, overcome the limitations posed by virtual environments and better define individual variability in a wider pool of actual and potential operators.

Acknowledgments
Funding: The research leading to these results has received funding from the AUGUST-WILHELM SCHEER Visiting Professor Program, Technical University of Munich International Centre and from the TECH4EFFECT project funded under the Bio Based Industries Joint Undertaking under the European Union's Horizon 2020 research and innovation program [grant number 720757]. The Authors gratefully acknowledge the support of the Nordrhein-Westfalen Forstliches Bildungszentrum für Waldarbeit und Forsttechnik in Arnsberg, and in particular the assistance provided by Dr. Thilo Wagner and Dipl. Ing. Olaf Müller in organizing and conducting the tests. Thanks are also due to Dr. Giovanna Ottaviani (NiBio) for her valuable advice on experiment planning and methodology.