Syntactic Complexity and Lexical Density in EFL Academic Writing: A Corpus-Based Investigation of Developmental Patterns

Authors

  • Ayesha Saddique, Department of English, Govt. Post Graduate College for Women, Mandi Bahauddin, Punjab, Pakistan
  • Mohammad Aafaq Nadeem, The University of Lahore, Sargodha Campus
  • Muhammad Danish, The University of Lahore, Sargodha Campus

Keywords:

Syntactic Complexity, Lexical Density, EFL Academic Writing, Corpus-Based Analysis, Arabic L1, Writing Proficiency

Abstract

This study investigates syntactic complexity and lexical density in the academic writing of Arabic L1 learners of English as a Foreign Language (EFL) using a corpus-based methodology. A 350,000-word corpus of argumentative essays was compiled from learners at CEFR levels B1, B2, and C1, evenly distributed across humanities and material sciences. Automated analysis was conducted with the L2 Syntactic Complexity Analyzer (L2SCA) and AntConc, with Stanford POS tagging applied to calculate 14 syntactic indices and lexical density via the Ure formula. Results revealed a progressive developmental trajectory: mean length of T-unit increased from 12.87 at B1 to 19.04 at C1, while complex nominals per clause rose by 84%, outpacing the growth in clausal elaboration. Lexical density also advanced from 48.7 to 56.3, and its relationship with syntactic complexity shifted from competitive at B2 (r = .32) to positively synergistic at C1 (r = .56). At higher proficiency levels, humanities learners produced more nominal structures than science learners. Multiple regression identified complex nominals and lexical density as predictors of writing quality, together accounting for 58% of the variance. The findings identify phrasal complexity as a key marker of academic writing maturity and point to B2 as a critical stage of linguistic restructuring. The research contributes to second language development theory and informs EFL pedagogy by emphasizing nominalization, lexical sophistication, and disciplinary writing performance.
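
For readers unfamiliar with the measure, Ure's lexical density is the proportion of lexical (content) words among all running words, expressed as a percentage. The sketch below illustrates that calculation on POS-tagged tokens. It is not the authors' pipeline: the function name `lexical_density`, the choice of Penn Treebank tag prefixes, and the decision to count every verb (including auxiliaries) as lexical are assumptions made for illustration only.

```python
# Illustrative sketch of Ure-style lexical density; not the study's actual code.
# Assumption: tokens arrive as (word, Penn Treebank tag) pairs, e.g. from a POS tagger.
# Assumption: nouns, verbs, adjectives, and adverbs count as lexical words;
# auxiliaries are not filtered out, which slightly inflates the score.
LEXICAL_TAG_PREFIXES = ("NN", "VB", "JJ", "RB")


def lexical_density(tagged_tokens):
    """Return lexical words / total words * 100 for a list of (word, tag) pairs."""
    # Drop punctuation: Penn Treebank punctuation tags do not start with a letter.
    words = [(tok, tag) for tok, tag in tagged_tokens if tag[:1].isalpha()]
    if not words:
        return 0.0
    lexical = sum(1 for _, tag in words if tag.startswith(LEXICAL_TAG_PREFIXES))
    return 100.0 * lexical / len(words)


# Hypothetical tagged sentence: "The learners wrote complex essays."
sample = [
    ("The", "DT"),
    ("learners", "NNS"),
    ("wrote", "VBD"),
    ("complex", "JJ"),
    ("essays", "NNS"),
    (".", "."),
]
density = lexical_density(sample)  # 4 lexical words out of 5 running words
```

Under these assumptions the sample sentence scores 80, well above the corpus means reported in the abstract, because a single short sentence contains few function words.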

 

Published

2025-12-03