Abstract
Screen-based assessments of handwriting may ask children to write directly on a screen to measure grapho-motor parameters (such as letter size, line alignment, etc.) characterizing the legibility of their handwritten texts. These tools have recently been considered valid alternatives to paper-based assessments, which are often perceived as time-consuming and over-reliant on subjective coder judgements. However, the use of screen-based tools to assess handwriting legibility in primary school children is still limited, because their effect on child performance is not yet fully understood. To overcome this limitation, in the present study we compared scores on 9 grapho-motor parameters obtained from a screen-based assessment of handwriting in 40 primary school children with equivalent scores derived from validated paper-based tests on the same children. We also explored whether children’s familiarity with screen-based tools would impact their handwriting speed, a parameter that is strongly experience-based. Results showed significant correlations between screen-based assessment scores and validated paper-based tests, supporting future applications of the former. However, screen-based assessments also detected significantly more errors in almost all grapho-motor parameters. Preliminary data on familiarity with screen-based handwriting also suggest that familiarity may have an impact on children’s handwriting speed. Notwithstanding these difficulties, children liked using the new technology and, overall, screen-based assessments seem a promising alternative to traditional tests; their efficient use, however, will require the acquisition of new normative data and more in-depth assessments of tool familiarity.
Introduction
Increased availability of digital tools for handwriting has led to new questions on the impact of technology on the acquisition and assessment of handwriting in childhood (Graham, 2022; Guilbert et al., 2019; Karavanidou, 2017; Wollscheid et al., 2016). In particular, embodied approaches suggest that screen-based technologiesFootnote 1 may preserve the haptics of handwriting and reinstate the value of mark-making (Karavanidou, 2017; Kiefer & Velay, 2016; Mangen & Balsvik, 2016; Mangen & Velay, 2010). For handwriting assessment, screen-based technologies have the advantage of preserving handwritten texts while also capturing the dynamic and temporal characteristics of the handwriting acts that produce them (Asselborn et al., 2018; Gerth et al., 2016a). Therefore, considerable research has been dedicated to screen-based assessments (SBAs)Footnote 2 of handwriting skills in childhood (Danna et al., 2023; Hammer et al., 2021).
In particular, SBAs have been used to assess both legibility and fluency of children’s handwriting. SBAs of legibility usually involve analytic assessments of a handwritten text (i.e., the handwritten product) and provide objective post-hoc measurements of individual grapho-motor parameters (GMPs).Footnote 3 This method is also used in traditional validated paper-based assessments (VPAs) of children’s grapho-motor skills (Rosenblum et al., 2003a; Sparaci et al., 2024). On the other hand, SBAs of fluency focus on the fine-motor movements performed while writing (i.e., the actual visuo-motor and proprioceptive processes enacted while handwriting) and involve objective online measurements of kinematic or dynamic parameters,Footnote 4 which can be captured only to a limited extent in VPAs (Asselborn et al., 2018; Danna et al., 2023). In the present paper, we will focus on using SBAs to assess handwriting legibility in primary school children. Our main aim is to explore whether screen-based technologies and dedicated software solutions (both required for SBAs) may effectively measure individual GMPs as compared to traditional VPAs.
VPAs are gold-standard tools used to measure legibility, usually in primary school (Danna et al., 2023; Rosenblum et al., 2003a). They frequently rely on copying tasks in a specific handwriting style (print or cursive) and in different conditions (e.g., children may be asked to write slowly in their best handwriting or as quickly as possible under time constraints). VPAs provide clinicians with much needed measures of GMPs, allowing comparison of individual performance to normative data (Rosenblum et al., 2003a; Sparaci et al., 2024). But they suffer from multiple limitations, largely because they are post-hoc evaluations of children’s handwritten texts. In particular, multiple studies have pointed out that VPAs show overreliance on subjective coder judgements (i.e., asking coders to make post-hoc inferences on handwriting processes based on handwritten products), extremely time-consuming scoring systems (i.e., requiring fine-grained measurements that have to be carried out by hand) and limited ecological or external validity (Provenzale et al., 2023; Rosenblum et al., 2003a; Sparaci et al., 2024; Sudsawad et al., 2001). Given rising numbers of teacher referrals and a consistent increase in children with handwriting difficulties,Footnote 5 there has been a growing need for objective and fast assessments of handwriting, accompanied by tailored support strategies (Lyon, 1996; Indira & Vijayan, 2015; Marquardt et al., 2016; MI–DGSIS 2022). Some attempts have been made at using SBAs in kindergarten and primary school (Accardo & Perrone, 2008; Chang & Yu, 2022; Dui et al., 2020; Mekyska et al., 2016; Pagliarini et al., 2015; Philip et al., 2023; Polsley et al., 2022; Rosenblum et al., 2003b; Serpa-Andrade et al., 2021), but use of SBAs of legibility is still limited, often encountering multiple drawbacks.
First, there is the issue of available software. Most software for SBAs currently relies on automatic measurements of fluency parameters (i.e., handwriting processes) and aims at detecting children with handwriting difficulties for further referral and clinical evaluation (Accardo & Perrone, 2008; Asselborn et al., 2018; Asselborn et al., 2020; Chang & Yu, 2022; Dui et al., 2021; Kedar et al., 2021; Mekyska et al., 2016; Pagliarini et al., 2015; Philip et al., 2023; Polsley et al., 2022; Rosenblum et al., 2003b; Rosenblum et al., 2006; Rosenblum & Dror, 2016; Šafárová et al., 2021; Serpa-Andrade et al., 2021; Zvoncak et al., 2019). Some attempts have been made at using SBAs to measure some GMPs related to legibility (i.e., the handwritten product), but direct comparisons of children’s scores on individual GMPs with VPAs are still lacking (Asselborn et al., 2020; Gerth et al., 2016b; Simonnet et al., 2019). Therefore, while extremely useful, these software solutions often need to be followed by further VPAs, since they provide limited data on individual GMPs and lack normative data. Assessing difficulties in specific GMPs is extremely relevant for educators and occupational therapists, who often use these parameters to define a child’s personal profile of strengths and weaknesses, select remediation strategies and monitor exercise efficacy (Cramm & Egan, 2015; Feder & Majnemer, 2007). For example, consider letter alignment and letter size (two GMPs measuring whether letters are written on the ruled line and in the appropriate size): these parameters are well known to teachers and occupational therapists, as they are commonly taught in primary schools (e.g., notebooks with different ruled lines are commonly used to teach letter alignment and sizing) and may be hard to tackle for children (Guilbert & Fernandez, 2024; Sparaci et al., 2024).
For a teacher or occupational therapist, it is important to know whether a child has a specific difficulty in letter alignment or sizing, rather than in other GMPs, because tailored exercises may be provided (e.g., facilitating notebooks with highlighted lines delimiting the handwriting space are commercially available) (Guilbert & Fernandez, 2024; Pellegrini & Dongilli, 2010). But software solutions for SBAs currently do not provide measurements of individual GMPs comparable to VPAs. Software solutions measuring GMPs have instead been implemented in systems that do not involve writing directly on a screen: for example, dedicated software for feature extraction from handwritten images (Dimauro et al., 2020; Isa et al., 2019) or graphic tablets,Footnote 6 often with a paper sheet placed on top (Asselborn et al., 2018; Chang & Yu, 2022; Deschamps et al., 2021; Devillaine et al., 2021; Drey et al., 2022; Drotár & Dobeš, 2020; Falk et al., 2011; Gargot et al., 2020; Herstic et al., 2025). These studies are important, but they offer no data on using SBAs to measure individual GMPs, because they do not involve writing directly on a screen.
Only two research attempts have been made at implementing software that measures individual GMPs (rather than an overall score) directly comparable to those measured in a VPA while writing on a screen with a stylus (Provenzale et al., 2022, 2023). Both studies (one on adults, the other on children) aimed to assess the reliability of scoring systems (software vs. human) and asked participants to copy a phrase in cursive handwriting on a screen (Wacom Cintiq 16) in two writing conditions (i.e., using their best handwriting or writing as fast as possible). Texts were later analysed using two scoring systems: directly by the software, which scored screen-acquired texts, or by hand, relying on a human coder scoring paper print-outs of the screen-acquired texts. Comparable scoring methods, derived from multiple VPAs, were used and GMP scores were compared to assess the reliability of the two scoring methods. The first study comprised ten adults and measured 8 GMPs, with results showing good agreement between software- and human-based scoring on all GMPs, with the exception of letter joins, where the software detected more errors (Provenzale et al., 2022). The second study measured 9 GMPs in 10 primary school children (second- and third-graders), with results confirming the absence of significant differences between scoring systems for 6 GMPs (i.e., max amplitude of letter misalignment, max variation of medium letters, max variation of ascending/descending letters, letter height, space between words, margin alignment), while for the other GMPs (i.e., joins, letter alignment, trace direction) software-based scoring detected significantly more errors (Provenzale et al., 2023).
Interestingly, in this second study the authors implemented a human–machine interaction approach: some GMPs (i.e., quantitative GMPs, such as speed, fluctuations, letter dimension, space between words and margin alignment) were automatically scored by the software, while others (i.e., qualitative GMPs, such as letter joins and direction of letter trace) were initially scored by the software, but the output was later checked by a human coder, who could add to or modify it. This human–machine interaction approach afforded a degree of system transparency, offering direct access to errors detected by the software on specific GMPs and providing evaluators with relevant information on children’s handwriting style (Provenzale et al., 2023). These studies, while including only a limited number of participants, show that software solutions for assessing legibility can yield data comparable to VPAs (provided that scoring methods are comparable). More importantly, they suggest that for some GMPs SBA software may detect more errors than a human coder. But these studies provide no information comparing SBAs to VPAs, because they focused on scoring methods and only involved texts written on screens.
This brings us to the second limitation to SBA use: differences between writing on a screen and writing on paper. Using a stylus on a screen is closer to handwriting on paper than typing is, but screens still differ from paper in terms of tactile, propriokinesthetic and even auditory feedback (Alamargot & Morin, 2015; Gerth et al., 2016b; Karavanidou, 2017; Mangen & Balsvik, 2016; Mangen & Velay, 2010; Mayer, 2020; Van der Weel et al., 2024). This is particularly true for cursive handwriting, which initially depends on memorizing letter forms and joins, but with practice becomes a “kinetic melody”, whose automaticity is strongly embedded in the haptics of the instrument with which it is played, so that small variations in surface resistance, visual and auditory feedback may alter it (Lurija, 1973; Mangen & Balsvik, 2016; Mangen & Velay, 2010). Therefore, it is appropriate to hypothesize that use of SBAs to measure legibility may affect children’s cursive handwriting, and studies comparing screen vs. paper use seem to support this hypothesis. Alamargot and Morin (2015) measured fluency and showed that handwriting on a screen (Wacom Cintiq 21UX) in the preferred style with a plastic-tipped pen affects pen pauses in second grade and pen movements in ninth grade, suggesting a need for consistent motor adjustments in screen-based handwriting (Alamargot & Morin, 2015). Gerth et al. (2016b) measured fluency (e.g., speed, overall writing time) and legibility (e.g., overall legibility, letter shape) in 27 second-graders, asking them to either write on a tablet screen (ThinkPad X61) using a digital pen or on a paper sheet placed on a screen (Intuos4 XL DTP), without controlling for writing style or conditions (i.e., children were allowed to write using both print and cursive at their preferred pace).
In a quite paradoxical result, when writing on a screen second-graders showed longer writing time (longer overall duration of the copying task, calculated in milliseconds), but higher writing velocity (calculated as the proportion of millimetres of trace produced per second). This result was explained by legibility data, which highlighted differences in letter size: children produced larger letters in the screen condition, resulting in overall longer writing time (Gerth et al., 2016b). A similar effect is also reported by Alamargot and Morin (2015), who showed that in their sample the distance travelled to form a letter was always longer when children were writing on a screen. Guilbert et al. (2019) showed that changes in proprioceptive and visual feedback impact handwriting speed, letter size and legibility in cursive handwriting, with reductions of visual and proprioceptive feedback leading to a greater effect in children than in adults. These studies indicate that embodied approaches may be right in suggesting the need for more in-depth analyses of the impact of SBAs in childhood, as these may lead to higher cognitive and motor costs in children than in adults (Guilbert et al., 2019; Karavanidou, 2017; Mangen & Velay, 2010). In fact, overall, handwriting on a screen seems to be harder for children, leading to motor adjustments that affect at least some aspects of legibility such as letter size, while less is known about effects on other GMPs (Gerth et al., 2016b; Guilbert et al., 2019; Wollscheid et al., 2016). To date, no study has compared GMP scores resulting from SBAs to those obtained in VPAs, using comparable methods, while controlling for writing style and conditions (see also Danna et al., 2023 for a comprehensive review).
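The apparent paradox dissolves once the two measures are written out separately. As a minimal sketch (the function name and sample format are our own, not those of the acquisition software used in the cited studies), given timestamped pen samples, overall writing time is the elapsed duration while velocity is trace length divided by that duration, so larger letters can lengthen both the trace and the total time even as velocity rises:

```python
from math import hypot

def trace_metrics(samples):
    """Compute overall writing time (s) and average writing velocity
    (mm of trace per second) from pen samples (x_mm, y_mm, t_s).
    Illustrative sketch of the two measures contrasted in the text."""
    if len(samples) < 2:
        return 0.0, 0.0
    # Trace length: sum of Euclidean distances between consecutive samples.
    length = sum(
        hypot(x2 - x1, y2 - y1)
        for (x1, y1, _), (x2, y2, _) in zip(samples, samples[1:])
    )
    duration = samples[-1][2] - samples[0][2]
    return duration, (length / duration if duration > 0 else 0.0)
```

A child producing a longer trace (larger letters) in the same proportion of extra time would show both a longer duration and an unchanged or higher velocity, matching the pattern Gerth et al. (2016b) report.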
Finally, there is the issue of tool familiarity and practice. With practice, both children and adults seem able to adapt just as well to writing on a screen as to writing on paper. Gerth et al. (2016b, pp. 13–14) have shown that over the course of a task requiring participants to write a short phrase multiple times (i.e., 10 repetitions of the phrase “Sonne und Wellen”), handwriting fluency (measured as the number of inversions in velocity, NIV, between first and last repetitions) improved in both adults and second-graders (i.e., both showed a gradual decrease in NIV over the 10 repetitions). However, practice with screen-based technologies requires, at the very least, that the latter are accessible to children, enticing and equipped with appropriate learning environments. Current evidence suggests instead that use of screen-based technologies for handwriting is relatively low: children rarely experience either direct or indirect use of screens for handwriting at home, and even teachers are often not very familiar with appropriate use of this technology for handwriting (Bonneton-Botté et al., 2021; Couse & Chen, 2010; Gerth et al., 2016b; Graham, 2022; Müller et al., 2015). This suggests that it is important to start investigating children’s familiarity and practice with screen-based tools when considering SBAs, especially considering that some GMPs may be affected by practice. For example, handwriting speed, which is often related to fluency (e.g., studies show that fast handwriting is associated with fewer NIVs), is affected by handwriting practice: multiple studies show a gradual increase in handwriting speed between first and fourth grade (Accardo et al., 2013; Gerth et al., 2016b; Graham & Weintraub, 1996; Graham et al., 1998; Loizzo et al., 2023; Tressoldi et al., 2019; Yekeler Gökmen et al., 2022).
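The NIV measure can be illustrated with a small sketch: each switch between an increasing and a decreasing velocity profile counts as one inversion, so a fluent stroke shows few inversions. This is a schematic reading of the measure, not the published algorithm, which typically smooths the velocity signal before counting:

```python
def count_niv(velocities):
    """Count inversions in a velocity profile: one inversion each time
    the profile switches between increasing and decreasing.
    Illustrative sketch of the NIV fluency measure discussed in the text."""
    niv = 0
    prev_sign = 0
    for v1, v2 in zip(velocities, velocities[1:]):
        # +1 if velocity is rising, -1 if falling, 0 if flat.
        sign = (v2 > v1) - (v2 < v1)
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                niv += 1
            prev_sign = sign
    return niv
```

Under this reading, a monotonically accelerating stroke yields zero inversions, while a jerky stroke that repeatedly speeds up and slows down yields many; practice would be reflected in a gradual decrease of the count across repetitions.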
Overall, it seems important, in the future, to consider children’s familiarity with screen-based technologies and the opportunities they have to practice using them, as these may be associated with differences in performance on specific GMPs, such as handwriting speed.
Summing up, to evaluate effective use of SBAs of legibility in primary school children, in the present study we aimed to compare GMP scores obtained using SBAs to those resulting from VPAs in a sample of primary school children, using comparable scoring methods and controlling for writing style and conditions. To this aim, 9 GMP scores obtained in two assessment conditions (SBAs and VPAs) were scored by the same expert coder and compared, investigating both correlations and differences between scores. Based on previous research, we expected to find at least some correlations between GMP scores (Accardo & Perrone, 2008; Dui et al., 2020), while we also expected SBAs to detect a greater number of errors for some GMPs (e.g., joins, letter alignment, trace direction) (Provenzale et al., 2022, 2023). Furthermore, given the documented impact of screen use on child performance, we expected to find more errors on GMP scores related to letter sizing (Bonneton-Botté et al., 2020; Gerth et al., 2016b; Guilbert et al., 2019; Mayer et al., 2020; Wollscheid et al., 2016). No predictions could be made on the effect of SBAs on other GMPs as compared to VPAs, and these data were considered explorative. We also wished to provide some preliminary data on children’s level of familiarity and practice with screen-based technologies in general. Therefore, we added a dedicated questionnaire asking a sub-group of children within our sample which screen-based technologies they had at home (i.e., computer, smartphone, tablet), how often they used them and, if they had a tablet, whether they used it with their fingers or with a stylus. We thus did not limit our questions to specific handwriting tasks, but attempted to provide initial data on general tool familiarity and practice. Given that current literature indicates that children rarely use tablets for handwriting, we expected to find only a few occurrences of stylus use.
But we did expect some children to have some experience with using screen-based technologies at home (e.g., for other tasks not involving handwriting), which may have led them to achieve at least some practice with these tools (e.g., having tactile experiences of screens’ resistance). We then explored whether any association emerged between frequency of use and handwriting speed in SBAs. Finally, we also explored children’s reactions to screen-based technologies, tentatively asking them whether they enjoyed using a screen and stylus. We consider these data on familiarity and practice merely explorative, and further studies are needed to fully understand to what extent familiarity with screen-based tools may impact SBAs. However, we chose to report them in the hope of enriching future research on SBAs of handwriting.
Methods
Participants
Forty-eight Italian primary school children were recruited to the present study as follows: 10 through word of mouth among colleagues and 38 in collaboration with the public primary school Istituto Comprensivo Via Merope, Rome, Italy. To guarantee some expertise in cursive handwriting, inclusion criteria were: being enrolled in the second semester of second grade or in third grade and actively using cursive in school.Footnote 7 Exclusion criteria were: not using cursive in school or not completing study sessions. Based on these criteria 8 children were excluded (7 for not using cursive, 1 for not completing study sessions) and the final sample included 40 children (3 second-graders, 37 third-graders), well balanced for gender and within population means for handedness (7.5% being left-handed) (Perelle & Ehrman, 1994) (Table 1). Sample size is comparable to other studies on similar populations/skills (Gerth et al., 2016b; Guilbert & Fernandez, 2024; Sparaci et al., 2024). All children had normal or corrected-to-normal vision (8 wore glasses). Non-verbal cognitive level, visuo-motor skills and handwriting skills were also measured using: Raven’s Coloured Progressive Matrices (RCPM, Raven et al., 1990), the Beery Visual Motor Integration Test (VMI), including the Visual Perception (VMI-V) and the Motor Coordination (VMI-M) subtests (Beery & Beery, 2004), and the Italian standardized version of the Brave Handwriting Kinder (BHK) test (Di Brina & Rossini, 2010). Performance in these tests was used to describe sample characteristics: all children showed non-verbal cognitive level ≥ 80, absence of visuo-motor coordination difficulties and no dysgraphia (see Table 1). Study procedures were approved by the CNR Ethics Committee (approval n. 0060644/2022) as well as by the Ethics Committee of the Università Campus Bio-Medico di Roma (approval n. PAR 73.21, Rome 28 Sept. 2021), with parents signing an informed consent form before inclusion of their child in the study.
Materials and procedures
Legibility of children’s cursive handwriting was assessed in our sample using an SBA and a VPA (Fig. 1). To control for handwriting conditions, in both assessments children were asked first to copy a phrase in their best handwriting (Best condition) and then to copy it as fast as possible while maintaining legibility (Fast condition). Therefore, the final data sample included 160 texts (80 for SBAs, 80 for VPAs). Order of assessments was counterbalanced: 21 children performed the SBA before the VPA and 19 did the opposite. A questionnaire was administered after the SBA to a sub-group of children within our sample to provide preliminary data on children’s familiarity with and appreciation of screen-based technologies. The other standardized tests (i.e., RCPM, VMI, VMI-V, VMI-M, BHK) were administered to all children in our sample, with test order randomized within participants. Children were evaluated individually in a quiet room at the child’s home, school or parents’ workplace at the Università Campus Bio-Medico di Roma, and assessments were carried out in one day, allowing for pauses between tests to avoid fatigue. Participants sat at a table in good lighting conditions, with the screen/paper sheet placed vertically in front of them at approximately 30 cm from the eyes. At the beginning of each assessment children were explicitly encouraged to rest the wrist of the writing hand on the screen/paper and to keep their non-dominant hand to the side, but while writing they were left free to choose their preferred posture to avoid interruptions/interferences (Fig. 1). Given that ruled paper has been shown to support handwriting legibility, we chose a VPA requiring use of ruled paper and this procedure was included in the SBA (Borean et al., 2012; Guilbert & Fernandez, 2024; Provenzale et al., 2023).
Prior to the writing tasks in the SBA and VPA, children were asked to choose the A4 ruled paper that they commonly used in school from four formatsFootnote 8 (these were shown on the screen in the SBA and physically presented in the VPA). The selected paper format was then used in the writing task: reproduced on the screen, carefully respecting line spacing and proportions,Footnote 9 in the SBA, or using ruled paper sheets in the VPA.
Screen-based assessments (SBAs)
An interactive display (Wacom Cintiq 16 Full HD with 1920 × 1080 pixel resolution) and its stylus (Wacom Pro Pen 2) were used for SBAs (see Fig. 1). This portable technology has a screen size (16 inches) that allows reproducing the exact size and proportions of a ruled A4 paper sheet (when placed vertically). The stylus’ digital trace was set at 2 pixels to mirror the trace of the pens used in the VPA (see below). Furthermore, the screen has a matte finish, reducing reflections and providing friction similar to paper. A laptop was connected to the interactive display for stimuli presentation and data acquisition. SBAs used the Eye and Pen software (Alamargot et al., 2006; Chesnet & Alamargot, 2005) and dedicated research software developed using MATLAB R2021a App Designer for offline extraction of 9 GMPs (see Provenzale et al., 2023 for details on software characteristics).
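Reproducing ruled paper at true physical scale on a display requires a millimetre-to-pixel mapping. A back-of-the-envelope sketch of this mapping (our own illustration, assuming square pixels; the study's exact on-screen calibration procedure is described in its footnotes, not here):

```python
from math import hypot

def px_per_mm(res_x, res_y, diagonal_inch):
    """Pixels per millimetre of a display, from its resolution and
    diagonal size in inches (1 inch = 25.4 mm), assuming square pixels.
    Used to draw ruled-line spacing at its real physical size on screen."""
    diag_px = hypot(res_x, res_y)       # diagonal length in pixels
    diag_mm = diagonal_inch * 25.4      # diagonal length in millimetres
    return diag_px / diag_mm
```

For a nominal 16-inch 1920 × 1080 display this gives roughly 5.4 px/mm, so an 8 mm ruled-line spacing would be drawn about 43 pixels apart.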
SBAs were preceded by a short practice phase during which children wrote their name, made some drawings and copied geometrical shapes on the screen. Then children listened to a short story about an elf that liked to receive and collect handwritten letters (the elf story was recorded, while all subsequent prompts were written on the screen and read out loud by the experimenter). As the story ended, children selected the paper format (see above) and were given the following instructions for the Best condition: “Now you will see a phrase on the topmost part of the screen and you will have to copy it on the paper below in cursive handwriting. You will have to write well, in an orderly fashion and in your best handwriting. Do not rush. The important thing is that you write as best as you can. If, while writing, you make a mistake erase your letters by striking them out with a line. When you have finished use the mailbox below to send your letter to the elf”. The selected ruled paper was then shown full screen and children read and copied in cursive handwriting a typed sentence appearing on the top of the screen, containing all letters of the Italian alphabet (i.e., “L’elefante vide benissimo quel topo che rubava qualche pezzo di formaggio”, literally “The elephant saw very well that mouse who was stealing some piece of cheese”). The same procedure was used immediately afterwards for the Fast condition, but with the following instructions: “In the next page you will do a race. Try to write the sentence as fast as you can. This time do not worry if your handwriting isn’t as nice as before. The important thing is that what you write is legible. If, while writing, you make a mistake erase your letters by striking them out with a line. When you have finished use the mailbox below to send your letter to the elf”.
Throughout SBAs children were allowed to pace themselves by advancing from one screen to the next and when both conditions were completed, they received a thank you note from the elf. The entire SBA was built in accordance with procedures available from a VPA (i.e., the Italian validated Test per la Valutazione delle Difficoltà Grafo-Motorie e Posturali della Scrittura–DGM-P; Borean et al., 2012): using the same phrase, paper formats, handwriting style, conditions and instructions, with only minor adaptations (i.e., the elf story) to maintain child interest in an otherwise more passive task.Footnote 10
Scoring was conducted off-line using research software to extract 9 GMPs relying on human–machine interaction (Provenzale et al., 2023) and scoring rules comparable to the ones used for the VPAs (Table 2). The software initially asked an expert human coder (second author) to segment handwritten texts, by marking the beginning and end of each letter with simple mouse clicks (initial letter sequencing) (see Fig. 2 panel A). Then the software automatically provided output scores for all quantitative GMPs (GMPs 1,3,7,8,9) requiring faster and objective measurements of time or space. These were reported as either a proportion (GMP 1), a number of errors (GMP 3) or a distance in mm (GMPs 7,8,9) (Table 2). For qualitative GMPs, reported as a number of errors (GMPs 2,4,5,6), the software instead provided dedicated text visualizations supporting human–machine interaction and scoring (Table 2; Fig. 2 panels B, C, D). Notwithstanding the time initially required for letter segmentation and labelling, the software reduces coding time (i.e., previous comparisons of coding time between human-based coding and software GMP extraction showed that the software saved on average 17 min of coding time per participant for each condition, for a total of 34 min of coding time per participant) (Provenzale et al., 2023). To guarantee consistency between SBAs and VPAs, all letter sequencing and human–machine interaction scoring for SBAs were conducted by the same expert coder (second author) who scored all VPAs.
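As an illustration of how a quantitative, space-based GMP can be derived once letters have been segmented, the sketch below computes letter heights and the maximum variation in letter size from trace points. Function names and the data format are hypothetical; the study's actual tool was implemented in MATLAB and applied DGM-P-derived rules (Provenzale et al., 2023):

```python
def letter_heights_mm(letters):
    """Given segmented letters (each a list of (x_mm, y_mm) trace points),
    return each letter's height as the vertical extent of its trace.
    Illustrative sketch, not the study's MATLAB implementation."""
    return [
        max(y for _, y in pts) - min(y for _, y in pts)
        for pts in letters
    ]

def max_size_variation(heights):
    """Max variation in letter size (largest minus smallest height),
    analogous to the GMPs reported as a distance in mm."""
    return max(heights) - min(heights)
```

A size-related GMP of this kind needs only the coder's letter boundaries and the raw trace coordinates, which is why such parameters could be scored fully automatically while qualitative GMPs still required human checks.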
Fig. 2 Sample of images provided by the software for GMPs requiring human–machine interaction: A initial letter sequencing for the word “elefante” (elephant); small dots between letters are placed by the coder (second author) to parse out individual letters. B Enlarged image of the letter “a” in the word “elefante” (elephant). This image allows coding of GMP 2 by showing letter traces differentially coloured according to tracing order, with arrows for trace direction, allowing the coder to mark incorrect trace order by checking the appropriate box below the image and to accept/change the software’s evaluation of trace direction as correct/incorrect. It is also used in coding GMP 4, as it shows continuous/discontinuous trace within the letter and allows the coder to indicate the presence of open/overlapping/separate traces or eyelets by checking the appropriate boxes below the image. Finally, it is used to code GMP 5, allowing the coder to indicate the presence of an ambiguous letter by checking the appropriate box. C Enlarged image of the entire phrase showing the presence of interrupted/overlapping joins between letters. D Enlarged image of individual words used by the coder to accept/decline errors pointed out by the software in image C by checking the appropriate accept/decline button
Validated paper-based assessments (VPAs)
VPAs were carried out following materials and procedures from the DGM-P test (Borean et al., 2012). Children were initially asked to select the paper format (see above) and the chosen sheet was attached to a plastic paper holder to control for surface resistance (Fig. 1). Children were given a black Bic Cristal ballpoint pen to use for handwriting, while an experimenter (first, second or third author) used a stopwatch to measure handwriting time. Children were then asked to read a phrase containing all letters of the Italian alphabet shown on a printed card (the same phrase as in SBAs) and copy it in cursive handwriting on the selected paper sheet (Fig. 1). Instructions mirrored the ones described above for SBAs in both writing conditions.
All children’s texts were manually scored by the same expert coder (second author) following procedures derived from the DGM-P test manual (Borean et al., 2012) and comparable to the ones used to score SBAs (see Sparaci et al., 2024 for similar scoring procedures). Coding of handwriting speed (i.e., GMP 1) is based on actual handwriting execution time for each child (measured with the aid of a stopwatch), but all other scoring procedures for the VPAs are based on a post-hoc evaluation of children’s handwritten texts. For some GMPs (i.e., 3,7,8,9) this requires taking exact measures using the transparent graph paper provided in the test materials, while for other GMPs (i.e., 2,4,5,6) the coder needs to observe the handwritten text very carefully and make the necessary evaluations and inferences (the latter resulting in a time-consuming assessment) (see Table 2 for detailed scoring methods). Coder reliability was evaluated by having a second expert coder (first author) code 22.5% of VPAs. The VPA provided legibility scores for 9 GMPsFootnote 11 measured as either a proportion (GMP 1), a number of errors (GMPs 2,3,4,5,6) or a distance in mm (GMPs 7,8,9) (Table 2).
Questionnaire
A short questionnaire containing 11 questions was used to provide explorative data on children’s familiarity and practice with screen-based technologies, as well as their appreciation of these tools. For each question children were instructed to select one viable answer (see Table 3). Children filled in the questionnaire using a pen while an experimenter sat next to them offering guidance and/or clarifications if needed. Questionnaires were introduced based on researchers’ observations (first and second authors) during initial data collection (e.g., some children, upon seeing the SBA, commented that they had never used this technology; others that they had tablets at home). Therefore, questionnaires were available and administered only to a subsample of 30 children (Familiarity sample in Table 1). Following questionnaire results, children were subdivided, based on their answers to questions 1 and 4, into two samples: 13 children that had a tablet at home and used it to some extent (i.e., they answered “sometimes”, “often” or “always” to question 4, assessing frequency of tablet use) were considered as having comparatively higher familiarity or practice with these tools (HF sample, Table 1), while 17 children that did not have a tablet at home or did not use it frequently (i.e., they did not have a tablet or answered “never” or “rarely” to question 4) were considered as having comparatively lower familiarity with this tool (LF sample, see Table 1).
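The HF/LF subdivision described above amounts to a simple filter over questionnaire answers. The sketch below illustrates the rule on hypothetical records; the field names and children are invented for illustration and are not the study’s data:

```python
# Hypothetical questionnaire records (invented for illustration).
children = [
    {"id": 1, "has_tablet": True,  "use_freq": "often"},
    {"id": 2, "has_tablet": True,  "use_freq": "rarely"},
    {"id": 3, "has_tablet": False, "use_freq": None},
    {"id": 4, "has_tablet": True,  "use_freq": "sometimes"},
]

# Answers to question 4 counting as "uses the tablet to some extent".
HIGHER_USE = {"sometimes", "often", "always"}

# HF: owns a tablet AND uses it at least sometimes; all other children are LF.
hf = [c["id"] for c in children if c["has_tablet"] and c["use_freq"] in HIGHER_USE]
lf = [c["id"] for c in children if c["id"] not in hf]
print(hf, lf)  # → [1, 4] [2, 3]
```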
Data analyses
Inter-coder agreement between first (second author) and second coder (first author) on individual GMP scores from VPAs was: 83.3% for speed (Cohen’s kappa = 0.822, 95% CI: 0.641–1.000), 83.3% for letter forming (Cohen’s kappa = 0.740, 95% CI: 0.489–0.992), 66.7% for letter alignment (Cohen’s kappa = 0.622, 95% CI: 0.385–0.860), 27.0% for letter distortions/interrupted overlapping joins (Cohen’s kappa = 0.228, 95% CI: 0.032–0.424),Footnote 12 77.8% for ambiguous letters (Cohen’s kappa = 0.619), 88.9% for unrecognizable letters (Cohen’s kappa = 0.684; 95% CI: 0.293–1.000), 72.2% for max amplitude of letter misalignment (Cohen’s kappa = 0.660; 95% CI: 0.411–0.909), 72.2% for max variation in size of medium letters (Cohen’s kappa = 0.647, 95% CI: 0.387–0.907) and 55.6% for max variation in size of ascending/descending letters (Cohen’s kappa = 0.495; 95% CI: 0.231–0.759). Consistency between scoring of SBAs and VPAs for letter sequencing, as well as for GMPs requiring human–machine interaction (GMPs 2,4,5,6), was guaranteed by having the same expert coder (second author) score both data sets (160 child texts), while consistency between machine and human coding for all other GMPs automatically coded in the SBA (i.e., GMPs 1,3,7,8,9) was documented in a previous study (see Provenzale et al., 2023).
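The Cohen’s kappa values reported above correct raw percentage agreement for the agreement expected by chance. A minimal sketch of the statistic, applied to hypothetical binary error codings (not the study’s data):

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two coders' categorical ratings of the same items."""
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    # Observed proportion of agreement.
    po = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected chance agreement, from each coder's marginal distribution.
    ca, cb = Counter(ratings_a), Counter(ratings_b)
    pe = sum((ca[c] / n) * (cb[c] / n) for c in set(ca) | set(cb))
    return (po - pe) / (1 - pe)

# Hypothetical example: two coders scoring 10 letters as error (1) / no error (0).
coder1 = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
coder2 = [1, 0, 0, 0, 0, 1, 0, 0, 1, 1]
print(round(cohens_kappa(coder1, coder2), 3))  # → 0.583 (80% raw agreement)
```

Note how 80% raw agreement shrinks to kappa ≈ 0.58 once chance agreement between the two coders’ marginal distributions is factored out.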
The distribution of GMP scores was tested for normality using the Shapiro–Wilk test, and for some GMPs the null hypothesis was rejected. Therefore, Spearman’s rank correlation was used to verify the presence/absence of significant correlations between GMP scores from SBAs and VPAs, while the Wilcoxon signed-rank test was used to compare GMP scores from the two assessment conditions (SBA and VPA). Questionnaire answers were calculated as percentages and reported in Table 3. For the Familiarity sample, the Shapiro–Wilk test and Levene test showed normal distribution and homogeneity of variance in handwriting speed (GMP 1) measured in SBAs (for both conditions). Therefore, to explore the presence/absence of an association between comparatively higher practice of tablet use at home and handwriting speed as measured by SBAs, two separate one-way ANOVAs, one for each condition (Best and Fast), were performed, comparing HF sample and LF sample performance.
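The decision sequence described above (normality check, then rank-based correlation and paired comparison) can be sketched with `scipy.stats`; the paired scores below are simulated for illustration only and do not reproduce the study’s values:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated paired GMP scores for 40 children (illustrative only).
vpa = rng.normal(10, 2, 40)            # paper-based scores
sba = vpa + rng.normal(1.5, 1.0, 40)   # screen-based scores, systematically shifted

# 1) Shapiro-Wilk on the paired differences: a small p-value argues for
#    non-parametric tests, as in the analyses reported above.
_, p_norm = stats.shapiro(sba - vpa)

# 2) Spearman rank correlation: do children keep their relative ranking
#    across the two assessment modalities?
rho, p_rho = stats.spearmanr(sba, vpa)

# 3) Wilcoxon signed-rank test: is there a systematic paired difference?
_, p_wilcoxon = stats.wilcoxon(sba, vpa)

print(f"rho = {rho:.2f} (p = {p_rho:.4g}); Wilcoxon p = {p_wilcoxon:.4g}")
```

With data shifted upward as here, a high positive rho can coexist with a highly significant Wilcoxon test, the same pattern (convergent ranking, divergent absolute scores) reported in the Results.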
Results
Legibility assessments
Mean scores on 9 GMPs measured using SBAs and VPAs in the two writing conditions (Best and Fast) are shown in Table 4. Spearman’s rank correlation between GMP scores from SBAs and VPAs showed moderate positive correlations for speed (r(38) = .55, p < .001), letter forming (r(38) = .42, p = .007), letter alignment (r(38) = .55, p < .001), ambiguous letters (r(38) = .64, p < .001), max variation in size of medium letters (r(38) = .49, p = .002) and max variation in size of ascending/descending letters (r(38) = .48, p = .002) in the Best condition (see Table 4). In the Fast condition, moderate positive correlations emerged for speed (r(38) = .71, p < .001), letter forming (r(38) = .41, p = .007), letter alignment (r(38) = .48, p = .002), ambiguous letters (r(38) = .54, p < .001) and max variation in size of medium letters (r(38) = .56, p < .001), and a weak positive correlation was present for max amplitude of letter misalignment (r(38) = .33, p = .041) (Table 4). Wilcoxon signed-rank tests evaluating the presence of significant differences between the 9 GMP scores obtained from SBAs and VPAs showed significant differences in the Best condition for: letter forming (Z = − 4.789, p < .001), letter alignment (Z = − 4.037, p < .001), letter distortions, interrupted/overlapping joins (Z = − 5.513, p < .001), ambiguous letters (Z = − 2.542, p = .011), unrecognizable letters (Z = − 2.632, p = .008), max amplitude of letter misalignment (Z = − 4.181, p < .001), max variation in size of medium letters (Z = − 4.154, p < .001) and max variation in size of ascending/descending letters (Z = − 4.234, p < .001) (Fig. 3A–C).
In the Fast condition, significant differences between GMP scores emerged for speed (Z = − 3.992, p < .001), letter forming (Z = − 4.962, p < .001), letter alignment (Z = − 4.064, p < .001), letter distortions, interrupted/overlapping joins (Z = − 5.516, p < .001), ambiguous letters (Z = − 3.368, p = .001), max amplitude of letter misalignment (Z = − 3.744, p < .001), max variation in size of medium letters (Z = − 4.543, p < .001) and max variation in size of ascending/descending letters (Z = − 3.804, p < .001) (Fig. 3A–C).
Wilcoxon signed-rank tests evaluating differences between GMP scores from SBAs and VPAs, reported as proportion of letters per second (A), as mean millimetres (B) and as number of errors (C), in both conditions (Best and Fast). Numbers refer to GMPs as listed in Tables 2 and 4. Significant differences are indicated by asterisks (**p < .01; ***p < .001)
Questionnaire assessment
Answers to the questionnaire are reported as percentages in Table 3 and discussed below. One-way ANOVAs assessing differences between the HF and LF samples on handwriting speed in SBAs showed a significant difference in the Fast condition (F(1,28) = 4.442, p = .044), but not in the Best condition (F(1,28) = 2.442, p = .129) (Fig. 4).
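With only two groups, a one-way ANOVA such as the one above reduces to an independent-samples t-test (F = t²). A sketch with `scipy.stats.f_oneway` on simulated letters-per-second scores; the group means and spreads below are illustrative values, not the study’s data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Simulated handwriting-speed scores (letters per second), illustrative only:
# 13 higher-familiarity (HF) vs 17 lower-familiarity (LF) children.
hf_speed = rng.normal(0.55, 0.10, 13)
lf_speed = rng.normal(0.45, 0.10, 17)

# One-way ANOVA across the two groups (df between = 1, df within = 28).
f_stat, p_val = stats.f_oneway(hf_speed, lf_speed)
print(f"F(1, 28) = {f_stat:.3f}, p = {p_val:.3f}")

# Sanity check: with two groups, F equals the squared t statistic.
t_stat = stats.ttest_ind(hf_speed, lf_speed).statistic
```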
Discussion
This study investigated the effectiveness of screen-based assessments (SBAs) of handwriting legibility in primary school children as compared to traditional validated paper-based assessments (VPAs). In particular, children’s scores on individual grapho-motor parameters (GMPs) assessing legibility in cursive handwriting obtained from SBAs were compared to equivalent scores from VPAs, in two writing conditions: writing slowly in one’s best handwriting (Best condition) or as fast as possible while maintaining legibility (Fast condition). We also explored, in a sub-group of children, whether greater familiarity and practice with screen-based tools would be associated with higher handwriting speed in SBAs in both conditions.
Before each assessment (SBA and VPA) children were allowed to choose freely, among four paper formats (see above), the format that they were most accustomed to. With the exception of one child, all children proved consistent in their choices and were able to recognize and choose in both conditions the paper format that they commonly used in class (i.e., second-graders consistently chose second grade lines, while third-graders consistently chose third grade lines in both conditions) (see Fig. 1 for an example of third grade lines use). The one exception was a third-grade child who chose third grade lines in the VPA and the comparatively easier second grade lines in the SBA. However, given that choosing a comparatively easier paper format in the SBA did not prove to affect overall performance in this assessment (i.e., the child’s scores were within group means), we chose to include this child in the final sample as an occurrence of a viable behaviour, also to avoid reducing sample size. Such behaviour may arise because children are not always able to recognize their preferred paper format when it is presented on a screen and may instead opt for an easier layout in a less familiar task such as handwriting on a tablet. This interpretation is consistent with studies suggesting that digital media provide fewer material anchors—that is, reduced tactile and spatial cues supporting perception and memory—compared to paper, leading to altered perceptual affordances when interacting with on-screen materials (Schilhab et al., 2018).
Legibility assessments
Our first step was to assess the presence of significant correlations between the 9 GMP scores from SBAs and VPAs in both conditions (Best and Fast). Results showed significant correlations for multiple GMPs in both conditions (significant correlations were present for six GMPs in the Best and Fast conditions), while only two parameters (GMPs 4 and 6) showed no correlation in either condition (Table 4). For letter distortions, interrupted/overlapping joins (GMP 4), we cannot rule out that this result may be due to the software used in scoring SBAs. In fact, previous studies comparing scoring methods (software vs. human) show that similar software solutions are able to detect comparatively more errors for letter joins and trace direction (GMP 4) (Provenzale et al., 2022, 2023). This is understandable, given that during scoring the software provides enlarged images of letters (Fig. 2B), which make this error type comparatively more visible and easier to detect. Therefore, the lack of correlation for this GMP may, at least in part, be ascribed to SBA scoring. We also wish to underscore that VPA coding of GMP 4 led to the lowest inter-rater agreement scoresFootnote 13 (see above), supporting, on the one hand, the hypothesis that human coders may experience significant difficulties in scoring this GMP, but also suggesting that caution must be exercised in interpreting this result, which certainly requires further research to be better understood. As for unrecognizable letters (GMP 6), the absence of a significant correlation may be due to low error rates on this GMP in our sample (see Table 4 and Fig. 3C).Footnote 14 A viable explanation may be that unrecognizable letters are more easily found in younger children (first graders), who are in the process of learning letter shapes, or in older children (fifth graders) who, by developing a personal style, may change letter shapes, making them less recognizable (Hamstra-Blez & Blote, 1990).
Overall, the presence of significant correlations for multiple GMPs may be considered a promising result, suggesting that, with some exceptions, SBAs may be able to outline similar patterns of strengths and weaknesses in GMPs. Previous studies suggest that some GMPs can be harder to tackle than others for children (Hamstra-Blez & Blote, 1990). For example, data on DGM-P test scores show that letter alignment and max variation in size of ascending/descending letters (GMPs 3 and 9) result in higher error rates compared to ambiguous letters and max variation in size of medium letters (GMPs 5 and 8) (Sparaci et al., 2024). A similar pattern was found in our VPAs but, more importantly, also in the SBAs (Table 4), supporting use of the latter as a viable resource for measuring some GMPs. It is also worth noting that all significant correlations between GMP scores obtained from SBAs and VPAs were positive, although their magnitude ranged from weak to moderate. This pattern indicates that, despite systematic differences in absolute scores between the two assessment methods, children who performed relatively better or worse on a given GMP in the paper-based assessment tended to show comparable relative performance in the screen-based assessment. In this sense, the observed correlations support the association of individual GMP scores across assessment modalities. The presence of weak-to-moderate correlations is not unexpected, given that SBAs and VPAs differ in the tools used (screen vs. paper) as well as in the scoring procedures (software-assisted vs. fully human coding).
To better understand differences between SBAs and VPAs, our next step was to compare scores obtained on the 9 GMPs in the two assessments. Results showed that, with minor exceptions,Footnote 15 SBAs always resulted in higher error rates (Fig. 3). For some GMPs (GMPs 2,3,4), based on previous studies, we can hypothesize that differences may be ascribed to the SBA scoring system (as stated above). But for the other parameters (GMPs 1,5,6,7,8,9) this explanation is less viable, given that previous studies comparing scoring systems (software vs. human) detected no differences (Provenzale et al., 2022, 2023). Consequently, for speed (GMP 1), letter shape (GMPs 5 and 6), letter alignment (GMP 7) and variations in letter size (GMPs 8 and 9), a viable alternative is to hypothesize that higher error rates in SBAs may be due to screen-based handwriting itself. This hypothesis is partially supported by previous studies suggesting that writing on a screen with a stylus puts higher demands on motor control, leading to motor adjustments, lower relative speed and larger or longer letter traces (Alamargot & Morin, 2015; Gerth et al., 2016b; Guilbert et al., 2019). In particular, it is interesting to note that children in our sample showed more variation in letter size when writing on a screen in both conditions (Fig. 3B), a phenomenon paralleled by lower handwriting speed in both conditions (Fig. 3A). Unlike other studies, which did not control for handwriting conditions (Gerth et al., 2016b), our data also showed that the difference in handwriting speed was significant only in the Fast condition. Possibly this happened because, when children were explicitly required to write in their best handwriting, they tended to slow down and be more conscious of handwriting quality even on paper, resulting in comparatively similar speeds.
However, these are mostly working hypotheses, and further studies will be needed to fully disentangle effects on GMP scores that are ascribable to the scoring system (software vs. human) from those that are due to the tools used (screen and stylus vs. paper and pen). Overall, it is important to note that screen-based handwriting seems to have an effect on children’s handwriting, as measured by individual GMPs, which extends beyond speed and letter size. Importantly, the coexistence of significant correlations and systematic differences suggests that SBAs may capture similar underlying grapho-motor constructs as VPAs, while remaining more sensitive to certain error types, leading to higher error rates. Taken together, results from correlations and comparisons of GMP scores may be interpreted as evidence of convergent validity at the level of individual performance patterns, but also of a lack of interchangeability between SBA and VPA scores. These new data suggest that SBAs of legibility in primary school, while promising, should not be taken lightly. In particular, for this new technology to be efficiently used, as government agencies and researchers increasingly suggest (Danna et al., 2023; Istituto Superiore di Sanità, ISS, 2022; Philip et al., 2023), new normative data are needed. In other words, if we want to exploit the potential benefits of these new tools, we need to invest further time and resources in providing new normative data, given that population means available from VPAs will not apply.
Questionnaire assessment
Questionnaire data showed that while smart phones are commonly present in children’s homes, computers and tablets are not, and that even when they are present, they are less used. In particular, it is important to acknowledge that, out of the 30 children that were administered the questionnaire, 23 had a tablet at home, but only 13 stated that they used it (i.e., sometimes/often/always, see Table 3). Some children spontaneously volunteered explanations, saying that the tablet belonged to an adult (father or sister) and/or that they were not allowed access to it because it was considered fragile and/or costly (one child described breaking a tablet screen at home by accident, after which he was no longer allowed to use it). Smart phones were not only more present, but also more accessible, with all children stating the presence of at least one smart phone at home (often more than one) and only 7 children declaring that they never/rarely used them (these children explicitly explaining that caregivers limited their smart phone use) (Table 3). We did not assess what children actually used their tablets for, as previous studies suggest that tablets and smart phones are mostly used by children for internet browsing and watching videos online (Radesky et al., 2020). But we did ask whether children that had a tablet were familiar with a stylus, setting aside the type of activity that it may be used for (e.g., writing, drawing). As expected, we found that stylus use was extremely rare (only 3 children in our sample) (Table 3). This is in line with previous studies suggesting that even if tablets are present in the home, they are rarely used for handwriting and/or drawing, but are rather employed to watch videos or play games (Couse & Chen, 2010). These results, even if based on a limited sample, suggest that the introduction of SBAs will require an evaluation of tool familiarity and practice, possibly developing appropriate training strategies.
To explore the impact of familiarity and practice with tablets on a specific GMP (i.e., handwriting speed) as measured in the SBA, children completing the questionnaire were subdivided into two samples (i.e., with comparatively higher or lower tool practice, see HF and LF samples in Table 1) and their performance was compared. We expected children in the HF sample, who had comparatively more occasions to experience and practice tool characteristics, to produce more letters per second, and this prediction was confirmed in both conditions (Fig. 4). This result is in line with previous studies showing that practice with screen-based technologies may lead to more fluent handwriting (i.e., lower NIV) and higher handwriting speed (Gerth et al., 2016b). This preliminary result also suggests that even minimal familiarity (i.e., we included in the HF group even 4 children that responded ‘sometimes’ to the question on frequency of tablet use at home) may mitigate challenges posed by these novel tools, especially when children are asked to write fast (as shown by the significant difference in the Fast condition). However, these data are purely exploratory, and further studies are needed to fully understand the impact of tool familiarity on SBAs of handwriting skills in childhood, possibly considering larger samples and/or more fine-grained questionnaires that do not rely on children’s self-reported perceptions, to avoid the risk of subjective bias. Furthermore, we cannot rule out that the higher handwriting speed found in the HF sample may have been due to other variables (e.g., children in this group may have come from educationally supportive home environments that allowed tablet use as well as other learning activities). In reporting these data, we are therefore attempting to point out the relevance of tool familiarity for SBAs, rather than proposing conclusive data on this issue.
Finally, our preliminary exploration of children’s appreciation of screen and stylus use showed that approximately 80% of children in the Familiarity sample found using this technology more fun than using regular pen and paper. This was not surprising given the novelty effect and considering data from previous research suggesting appreciation of this technology in some student samples (Hammer et al., 2021). But it is interesting to note that on average 43% of these children also declared that using the screen and stylus made handwriting more difficult (Table 3), suggesting that children perceived the difficulties documented by the higher error rates reported above. Finally, while 70% of these children would be willing to use screen-based technologies in school, only 50% would readily use them for homework. This may be due to a general lack of interest in or appreciation of homework, or because children perceived the new tool to be more difficult and therefore better suited for contexts where they could count on adult assistance (i.e., relying on teachers’ help in class). However, we think that these explorative data highlight the importance of considering children’s perception of and willingness to use screen-based technologies in future studies, as well as their familiarity with these new tools, to better understand performance outcomes.
Limitations
The present study presents multiple limitations. First, given that SBAs and VPAs differ in scoring systems (software vs. human) as well as in tools used (screen and stylus vs. pen and paper), our data, while detecting differences in GMP scores, do not allow us to fully disentangle whether these differences are due to one or the other. Given that previous studies directly compared only scoring systems (software vs. human) (Provenzale et al., 2022, 2023), future studies may consider direct comparisons of tools (screen vs. paper) to better understand this point. Second, some GMPs led to low or moderate inter-rater agreement in the VPA coding (i.e., GMPs 4 and 9), suggesting difficulties in achieving inter-coder agreement which have already been documented in the literature (Borean et al., 2012), but should be further explored. Third, while sample size was comparable to other studies, future research may benefit from considering larger samples, possibly allowing use of SBAs at different ages/school grades; this would also allow a better understanding of the impact of different paper formats on children’s performance. Fourth, this study only addressed handwriting legibility parameters, not analyzing the fluency parameters provided by SBAs, which will be the object of a future study. Finally, the selected task, replicating procedures of the DGM-P test, only asked children to copy a phrase, so we are unable to evaluate the effects of SBAs when children are confronted with the production of longer texts.
Notwithstanding these evident limitations, we think that the data presented in this work will be relevant for future studies on the use of screen-based technologies for handwriting assessments in childhood. First, they support the viability of SBAs, as highlighted by correlations with VPAs in some, if not all, GMPs considered. Second, they suggest that some caution must be exercised in introducing SBAs, as normative data will be needed, given that SBAs lead to comparatively higher error rates. Finally, they suggest the relevance of building tools to assess tool familiarity and practice in children, as well as their perception of novel tools. We think that in the future screen-based assessments of GMPs in primary school children may be useful for educators and occupational therapists, who commonly use these parameters to better understand a child’s profile of strengths and weaknesses. Given that screen-based handwriting appears both challenging and pleasant to children, we are confident that future studies may lead to a more conscious exploitation of SBAs, supporting the acquisition of the kinetic melody of handwriting in childhood.
Notes
By “screen-based technologies”, we are referring to digitalizers, tablets and/or any type of interactive displays that allow writing directly on a screen with a stylus.
Throughout the paper the acronym SBAs will be used in reference to handwriting assessments relying on hardware and software solutions that require writing directly on a screen using a stylus, in contrast to validated paper-based assessments (VPAs), which involve handwriting on a sheet of paper using a regular pen or pencil.
GMPs are reliable and objective measures of legibility present in multiple analytic handwriting assessments (Rosenblum et al., 2003a). Common examples of GMPs are letter size, alignment, joins, etc. While most authors agree on the main aspects of handwriting that should be measured by GMPs (size, shape, slant, shaping), there are many differences between validated tests in what GMPs are measured and how (Sparaci et al., 2024).
Common parameters related to handwriting fluency are speed, pressure, irregular letter trace and number of inversions in velocity (NIV). Speed and irregular letter trace are the only two fluency-related parameters that can be captured using VPAs. However, some authors have argued that these are not good measures of fluency, which is better described by parameters such as NIV and/or pressure, which cannot be measured using paper-based tests but require instead the use of dedicated technology (Asselborn et al., 2018; Wicki et al., 2014).
Throughout the paper we will use the expression “handwriting difficulties” in reference to cases of dysgraphia as well as to children defined in the literature as “poor writers”, i.e. children that do not have a dysgraphia diagnosis, but produce scarcely legible or non-fluent handwriting, often affecting their academic performance.
Graphic tablets and screen-based technologies are commonly referred to in the literature as ‘tablets’. This results in confusion between extremely different tools. In fact, while graphic tablets have a writing surface, they have to be connected to a screen, leading to a dissociation between writing surface and text; screen-based technologies, by contrast, allow users to actually write on a screen, so that they engage directly with the displayed content.
On average, Italian children begin using cursive handwriting in the second semester of first grade.
A4 notebooks with ruled lines are commonly used in Italian primary schools. These notebooks come in four types of standard ruled paper well known to children: first grade squares, second-grade lines, third-grade lines and fourth/fifth grade lines. Ruled paper formats change between first and fourth grade, based on teachers’ evaluation of children’s handwriting skills (Borean et al., 2012; Pellegrini & Dongilli, 2010).
First-grade squares paper has squares with 1 cm sides. Second-grade lines and third-grade lines both rely on four differentially spaced lines to delimit the handwriting space. These four lines include a bottom, lower middle, higher middle and top line. The space between the lower middle and higher middle lines is used for the central part of cursive letters (i.e., known as the letter body) and is 5 mm in second-grade lines and 3 mm in third-grade lines. The two spaces between the bottom line and lower middle line and between the higher middle line and top line are used respectively for descending and ascending letter parts and are 7 mm wide in both paper types. Therefore, overall, bottom and top lines are 12 mm apart in second grade lines (7 + 5 + 7) and 10 mm apart in third grade lines (7 + 3 + 7). It is important to note that, as children’s cursive handwriting skills increase, they are taught to use paper that progressively elicits smaller letter bodies, while ascending and descending extensions remain constant. Finally, the fourth/fifth grade lines format only has two lines (i.e., bottom and top line) which delimit a 10 mm space that should be used to write the letter body as well as ascending letter parts, while descending letter parts extend below the bottom line (see also Borean et al., 2012, p. 43 for lined paper samples).
However, some differences must be noted. While the DGM-P test measures 12 GMPs and provides a final legibility score based on comparisons with normative data, in the present study we concentrated only on 9 GMPs that had proven relevant in previous studies comparing different VPA tests (Sparaci et al., 2024). We also avoided any attempt at providing an overall legibility score, due to the lack of population data on SBA use.
As stated above, the DGM-P test measures 12 GMPs, but for the purpose of the present study we limited our assessment to the 9 GMPs also available from SBAs.
Low inter-coder reliability scores for this GMP are reported also in the original DGM-P test manual as extremely common, as this parameter is extremely complex to evaluate, also because letter joins in some cases are hard to observe only based on the handwritten text (see Borean et al., 2012, p. 135).
Difficulties in scoring this GMP in VPA is also shown by lower inter-rater reliability scores reported in the DGM-P test manual (see Borean et al., 2012, p. 135).
Performance of children in our sample on this GMP in the VPA were better than the average population means. In fact, the mean population error rate in third grade, as reported in the DGM-P test manual, is 0.34 in the Best condition and 0.74 in the Fast condition, while in our sample mean scores were respectively 0.18 and 0.45 (see Table 3) (Borean et al. 2012, p. 165).
Lack of a significant difference in unrecognizable letters may be ascribed once again to very low error rates in our sample (see above), while the case of handwriting speed is explained further in the text.
References
Accardo, A., & Perrone, I. (2008). Automatic quantification of handwriting characteristics before and after rehabilitation. In 14th Nordic-Baltic Conference on Biomedical Engineering and Medical Physics: NBC 2008. 16–20 June 2008 Riga, Latvia (pp. 95–98). Springer
Accardo, A. P., Genna, M., & Borean, M. (2013). Development, maturation and learning influence on handwriting kinematics. Human Movement Science, 32(1), 136–146. https://doi.org/10.1016/j.humov.2012.10.004
Alamargot, D., Chesnet, D., Dansac, C., & Ros, C. (2006). Eye and pen: A new device for studying reading during writing. Behavior Research Methods, 38(2), 287–299.
Alamargot, D., & Morin, M. F. (2015). Does handwriting on a tablet screen affect students’ graphomotor execution? A comparison between grades two and nine. Human Movement Science, 44, 32–41.
Asselborn, T., Chapatte, M., & Dillenbourg, P. (2020). Extending the spectrum of dysgraphia: A data driven strategy to estimate handwriting quality. Scientific Reports, 10(1), 3140.
Asselborn, T., Gargot, T., Kidziński, Ł, Johal, W., Cohen, D., Jolly, C., & Dillenbourg, P. (2018). Automated human-level diagnosis of dysgraphia using a consumer tablet. NPJ Digital Medicine, 1(1), 42.
Beery, K., & Beery, N. (2004). The developmental test of visual motor integration. Western Psychological Services.
Bonneton-Botté, N., Beucher-Marsal, C., Bara, F., Muller, J., Corf, L. L., Quéméneur, M., & Dare, M. (2021). Teaching cursive handwriting: A contribution to the acceptability study of using digital tablets in French classrooms. Journal of Early Childhood Literacy, 21(2), 259–282.
Bonneton-Botté, N., Fleury, S., Girard, N., Le Magadou, M., Cherbonnier, A., Renault, M., Anquetil, E., & Jamet, E. (2020). Can tablet apps support the learning of handwriting? An investigation of learning outcomes in kindergarten classroom. Computers & Education, 151, 103831.
Borean, M. (2012). DGM-P: test per la valutazione delle difficoltà grafo-motorie e posturali della scrittura. Edizioni Erickson.
Di Brina, C., & Rossini, G. (2010). BHK. Scala sintetica per la valutazione della scrittura in età evolutiva. Edizioni Erickson.
Chang, S. H., & Yu, N. Y. (2022). Computerized handwriting evaluation and statistical reports for children in the age of primary school. Scientific Reports, 12(1), 15675.
Chesnet, D., & Alamargot, D. (2005). Analyses en temps réel des activités oculaires et graphomotrices du scripteur: Intérêt du dispositif ‘Eye and Pen.’ L’année Psychologique, 105(3), 477–520.
Couse, L. J., & Chen, D. W. (2010). A tablet computer for young children? Exploring its viability for early childhood education. Journal of Research on Technology in Education, 43(1), 75–96.
Cramm, H., & Egan, M. (2015). Practice patterns of school-based occupational therapists targeting handwriting: A knowledge-to-practice gap. Journal of Occupational Therapy, Schools, & Early Intervention, 8(2), 170–179.
Danna, J., Puyjarinet, F., & Jolly, C. (2023). Tools and methods for diagnosing developmental dysgraphia in the digital age: A state of the art. Children, 10(12), 1925.
Deschamps, L., Devillaine, L., Gaffet, C., Lambert, R., Aloui, S., Boutet, J., Brault, V., Labyt, E., & Jolly, C. (2021). Development of a pre-diagnosis tool based on machine learning algorithms on the BHK test to improve the diagnosis of dysgraphia. Advances in Artificial Intelligence and Machine Learning, 1(2), 114–135.
Devillaine, L., Lambert, R., Boutet, J., Aloui, S., Brault, V., Jolly, C., & Labyt, E. (2021). Analysis of graphomotor tests with machine learning algorithms for an early and universal pre-diagnosis of dysgraphia. Sensors, 21(21), 7026.
Dimauro, G., Bevilacqua, V., Colizzi, L., & Di Pierro, D. (2020). TestGraphia, a software system for the early diagnosis of dysgraphia. IEEE Access, 8, 19564–19575.
Drey, T., Janek, J., Lang, J., Puschmann, D., Rietzler, M., & Rukzio, E. (2022). SpARklingPaper: Enhancing common pen- and paper-based handwriting training for children by digitally augmenting papers using a tablet screen. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 6(3), 1–29.
Drotár, P., & Dobeš, M. (2020). Dysgraphia detection through machine learning. Scientific Reports, 10(1), 21541.
Dui, L. G., Calogero, E., Malavolti, M., Termine, C., Matteucci, M., & Ferrante, S. (2021). Digital tools for handwriting proficiency evaluation in children. In 2021 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI) (pp. 1–4). IEEE.
Dui, L. G., Lunardini, F., Termine, C., Matteucci, M., Stucchi, N. A., Borghese, N. A., & Ferrante, S. (2020). A tablet app for handwriting skill screening at the preliteracy stage: Instrument validation study. JMIR Serious Games, 8(4), e20126.
Falk, T. H., Tam, C., Schellnus, H., & Chau, T. (2011). On the development of a computer-based handwriting assessment tool to objectively quantify handwriting proficiency in children. Computer Methods and Programs in Biomedicine, 104(3), e102–e111.
Feder, K. P., & Majnemer, A. (2007). Handwriting development, competency, and intervention. Developmental Medicine and Child Neurology, 49(4), 312–317. https://doi.org/10.1111/j.1469-8749.2007.00312.x
Gargot, T., Asselborn, T., Pellerin, H., Zammouri, I. M., Anzalone, S., Casteran, L., Johal, W., Dillenbourg, P., Cohen, D., & Jolly, C. (2020). Acquisition of handwriting in children with and without dysgraphia: A computational approach. PLoS ONE, 15(9), e0237575.
Gerth, S., Dolk, T., Klassert, A., Fliesser, M., Fischer, M. H., Nottbusch, G., & Festman, J. (2016a). Adapting to the surface: A comparison of handwriting measures when writing on a tablet computer and on paper. Human Movement Science, 48, 62–73.
Gerth, S., Klassert, A., Dolk, T., Fliesser, M., Fischer, M. H., Nottbusch, G., & Festman, J. (2016b). Is handwriting performance affected by the writing surface? Comparing preschoolers’, second graders’, and adults’ writing performance on a tablet vs. paper. Frontiers in Psychology, 7, 1308.
Graham, S. (2022). Teaching writing in the digital age. In D. Fisher (Ed.), Routledge encyclopedia of education (Educational Psychology section, online). Taylor & Francis.
Graham, S., Berninger, V., Weintraub, N., & Schafer, W. (1998). Development of handwriting speed and legibility in grades 1–9. The Journal of Educational Research, 92(1), 42–52.
Graham, S., & Weintraub, N. (1996). A review of handwriting research: Progress and prospects from 1980 to 1994. Educational Psychology Review, 8, 7–87.
Guilbert, J., & Fernandez, J. (2024). The use of lined paper in child education: impact of line presence on handwriting quality. Reading and Writing, 1–19.
Guilbert, J., Alamargot, D., & Morin, M. F. (2019). Handwriting on a tablet screen: Role of visual and proprioceptive feedback in the control of movement by children and adults. Human Movement Science, 65, 30–41.
Hammer, M., Göllner, R., Scheiter, K., Fauth, B., & Stürmer, K. (2021). For whom do tablets make a difference? Examining student profiles and perceptions of instruction with tablets. Computers & Education, 166, 104147.
Hamstra-Bletz, L., & Blöte, A. W. (1990). Development of handwriting in primary school: A longitudinal study. Perceptual and Motor Skills, 70(3), 759–770.
Herstic, A. Y., Bansil, S., Plotkin, M., Zabel, T. A., & Mostofsky, S. H. (2025). Validity of an automated handwriting assessment in occupational therapy settings. Journal of Occupational Therapy, Schools, & Early Intervention, 18(1), 115–127.
Indira, A., & Vijayan, P. (2015). Teaching cursive hand writing as an intervention strategy for high school children with dysgraphia. International Journal of Social Sciences, 2(12), 1–10.
Isa, I. S., Rahimi, W. N. S., Ramlan, S. A., & Sulaiman, S. N. (2019). Automated detection of dyslexia symptom based on handwriting image for primary school children. Procedia Computer Science, 163, 440–449.
Istituto Superiore di Sanità (ISS). (2022). Linee guida per i Disturbi Specifici dell’Apprendimento (DSA). https://www.iss.it/-/snlg-disturbi-specifici-apprendimento
Karavanidou, E. (2017). Is handwriting relevant in the digital era? Antistasis, 7(1), 153–167.
Kedar, S. V., Parab, P. P., Sharma, A. R., Patil, J. M., & Wagh, R. T. (2021). Identifying learning disability through digital handwriting analysis. Turkish Journal of Computer and Mathematics Education, 12(1S), 46–56.
Kiefer, M., & Velay, J. L. (2016). Writing in the digital age. Trends in Neuroscience and Education, 5(3), 77–81.
Loizzo, A., Zaccaria, V., Caravale, B., & Di Brina, C. (2023). Validation of the concise assessment scale for children’s handwriting (BHK) in an Italian population. Children, 10(2), 223.
Lurija, A. R. (1973). The working brain: an introduction to neuropsychology. The Penguin Press.
Lyon, G. R. (1996). Learning disabilities. The Future of Children, 6(1), 54–76.
Mangen, A., & Balsvik, L. (2016). Pen or keyboard in beginning writing instruction? Some perspectives from embodied cognition. Trends in Neuroscience and Education, 5(3), 99–106.
Mangen, A., & Velay, J. L. (2010). Digitizing literacy: Reflections on the haptics of writing. Advances in Haptics, 1(3), 86–401.
Marquardt, C., Diaz Meyer, M., Schneider, M., & Hilgemann, R. (2016). Learning handwriting at school – A teachers’ survey on actual problems and future options. Trends in Neuroscience and Education, 5(3), 82–89. https://doi.org/10.1016/j.tine.2016.07.001
Mayer, C., Wallner, S., Budde-Spengler, N., Braunert, S., Arndt, P. A., & Kiefer, M. (2020). Literacy training of kindergarten children with pencil, keyboard or tablet stylus: The influence of the writing tool on reading and writing performance at the letter and word level. Frontiers in Psychology, 10, 3054.
Mekyska, J., Faundez-Zanuy, M., Mzourek, Z., Galaz, Z., Smekal, Z., & Rosenblum, S. (2016). Identification and rating of developmental dysgraphia by handwriting analysis. IEEE Transactions on Human-Machine Systems, 47(2), 235–248. https://doi.org/10.1109/THMS.2016.2586605
MI–DGSIS – Ufficio di Statistica. (2022). I principali dati relativi agli alunni con DSA aa.ss. 2019/2020–2020/2021. https://www.miur.gov.it/web/guest/pubblicazioni
Müller, H., Gove, J. L., Webb, J. S., & Cheang, A. (2015). Understanding and comparing smartphone and tablet use: Insights from a large-scale diary study. In Proceedings of the Annual Meeting of the Australian Special Interest Group for Computer Human Interaction (pp. 427–436).
Pagliarini, E., Guasti, M. T., Toneatto, C., Granocchio, E., Riva, F., Sarti, D., Molteni, B., & Stucchi, N. (2015). Dyslexic children fail to comply with the rhythmic constraints of handwriting. Human Movement Science, 42, 161–182.
Pellegrini, R., & Dongilli, L. (2010). Insegnare a scrivere: Pregrafismo, stampato e corsivo. Edizioni Erickson.
Perelle, I. B., & Ehrman, L. (1994). An international study of human handedness: The data. Behavior Genetics, 24(3), 217–227.
Philip, B. A., Li, F., Hawkins-Chernof, E., Chen, L., Swamidass, V., & Zwir, I. (2023). Motor assessment with the STEGA iPad app to measure handwriting in children. American Journal of Occupational Therapy, 77(3), 7703205010.
Polsley, S., Powell, L., Kim, H. H., Thomas, X., Liew, J., & Hammond, T. (2022). Detecting children’s fine motor skill development using machine learning. International Journal of Artificial Intelligence in Education, 32(4), 991–1024.
Provenzale, C., Sparaci, L., Fantasia, V., Bonsignori, C., Formica, D., & Taffoni, F. (2022). Evaluating handwriting skills through human-machine interaction: A new digitalized system for parameters extraction. In 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) (pp. 5128–5131). IEEE.
Provenzale, C., Bonsignori, C., Sparaci, L., Formica, D., & Taffoni, F. (2023). Using screen-based technologies to assess handwriting in children: A preliminary study choosing human-machine interaction. IEEE Access, 11, 118865–118877. https://doi.org/10.1109/ACCESS.2023.3326357
Radesky, J. S., Weeks, H. M., Ball, R., Schaller, A., Yeo, S., Durnez, J., & Barr, R. (2020). Young children’s use of smartphones and tablets. Pediatrics, 146(1), e20193518. https://doi.org/10.1542/peds.2019-3518
Raven, J. C., Court, J. H., & Raven, J. (1990). Coloured progressive matrices. Oxford Psychologists Press.
Rosenblum, S., & Dror, G. (2016). Identifying developmental dysgraphia characteristics utilizing handwriting classification methods. IEEE Transactions on Human-Machine Systems, 47(2), 293–298.
Rosenblum, S., Dvorkin, A. Y., & Weiss, P. L. (2006). Automatic segmentation as a tool for examining the handwriting process of children with dysgraphic and proficient handwriting. Human Movement Science, 25(4–5), 608–621.
Rosenblum, S., Parush, S., & Weiss, P. L. (2003b). Computerized temporal handwriting characteristics of proficient and non-proficient handwriters. American Journal of Occupational Therapy, 57(2), 129–138.
Rosenblum, S., Weiss, P. L., & Parush, S. (2003a). Product and process evaluation of handwriting difficulties. Educational Psychology Review, 15, 41–81.
Šafárová, K., Mekyska, J., & Zvončák, V. (2021). Developmental dysgraphia: A new approach to diagnosis. International Journal of Assessment and Evaluation, 28(1), 143.
Schilhab, T., Balling, G., & Kuzmicova, A. (2018). Decreasing materiality from print to screen reading. First Monday, 23(10), 1–10.
Serpa-Andrade, L. J., Pazos-Arias, J. J., López-Nores, M., & Robles-Bykbaev, V. E. (2021). Design, implementation and evaluation of a support system for educators and therapists to rate the acquisition of pre-writing skills. IEEE Access, 9, 77920–77929.
Simonnet, D., Girard, N., Anquetil, E., Renault, M., & Thomas, S. (2019). Evaluation of children cursive handwritten words for e-education. Pattern Recognition Letters, 121, 133–139.
Sparaci, L., Fantasia, V., Bonsignori, C., Provenzale, C., Formica, D., & Taffoni, F. (2024). Handwriting in primary school: Comparing standardized tests and evaluating impact of grapho-motor parameters. Reading and Writing. https://doi.org/10.1007/s11145-024-10562-3
Sudsawad, P., Trombly, C. A., Henderson, A., & Tickle-Degnen, L. (2001). The relationship between the Evaluation Tool of Children’s Handwriting and teachers’ perceptions of handwriting legibility. American Journal of Occupational Therapy, 55(5), 518–523.
Tressoldi, P. E., Cornoldi, C., & Re, A. M. (2019). BVSCO-2. Batteria per la Valutazione della Scrittura e della Competenza Ortografica 2. Giunti Edu.
Van der Weel, F. R., & Van der Meer, A. L. (2024). Handwriting but not typewriting leads to widespread brain connectivity: A high-density EEG study with implications for the classroom. Frontiers in Psychology, 14, 1219945.
Wicki, W., Lichtsteiner, S. H., Geiger, A. S., & Müller, M. (2014). Handwriting fluency in children. Impact and correlates. Swiss Journal of Psychology, 73(2), 87–96.
Wollscheid, S., Sjaastad, J., Tømte, C., & Løver, N. (2016). The effect of pen and paper or tablet computer on early writing–A pilot study. Computers & Education, 98, 70–80.
Yekeler Gökmen, A. D., Yildiz, M., Aktas, N., & Atas, M. (2022). Handwriting speeds of 4th-8th grade students. International Electronic Journal of Elementary Education, 15(2), 123–136.
Zvoncak, V., Mucha, J., Galaz, Z., Mekyska, J., Šafárová, K., Faundez-Zanuy, M., & Smekal, Z. (2019). Fractional order derivatives evaluation in computerized assessment of handwriting difficulties in school-aged children. In 2019 11th International Congress on Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT) (pp. 1–6). IEEE.
Funding
Open access funding provided by Consiglio Nazionale Delle Ricerche (CNR) within the CRUI-CARE Agreement. This article was funded by the HORIZON EUROPE Framework Programme (Grant No. 871803, Domenico Formica).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Sparaci, L., Bonsignori, C., Provenzale, C. et al. Handwriting legibility in primary school: comparing screen-based and validated paper-based assessments. Read Writ (2026). https://doi.org/10.1007/s11145-026-10777-6