Taking time off: The semantics of holiday and its synonyms in Maltese

Jessica Nieder (L-Università ta’ Malta)

Michael Spagnol (L-Università ta’ Malta)

Abstract
This study investigates the semantic and contextual variation among three Maltese nouns denoting ‘holiday’ or ‘time off’: btala (Semitic), vaganza (Romance), and holiday (English). Using corpus data and computational analysis based on distributional semantics, we show that while the three terms are near-synonyms, they differ in register, contextual scope, and degree of integration into Maltese. The findings illustrate how language contact shapes subtle variation in meaning and usage.
DOI: 10.5281/zenodo.16933098

1 Introduction

Maltese is a contact language shaped by the convergence of Semitic, Romance, and English influences, resulting in a lexicon that reflects centuries of sociolinguistic interaction and lexical borrowing (Vella 2013). While often described as a “high borrower” (Comrie & Spagnol 2016), Maltese resists overly rigid typological labels. As Stolz (2003) has argued, the language occupies a position along a continuum of contact-induced change, defying classification as either a case of extensive borrowing or a prototypical mixed language (see also Bakker & Mous 1994).

One notable consequence of Maltese’s layered linguistic history is the presence of synonymous lexical items drawn from different etymological strata. This is a direct outcome of sustained contact between Semitic, Romance, and English, which has led to a lexicon where words of diverse origins coexist and frequently express the same or very similar meanings. For example, tejjeb (Semitic), immiljora (Romance), and impruvja (English) all convey the idea of ‘improving’ (Vella 2013). Similarly, lewn (Semitic) and kulur (Romance) both mean ‘colour’, while skiet and silenzju express ‘silence’. As Friggieri (2000) points out, these sets or pairs are not functionally identical but diverge along dimensions such as intensity, emotional tone, professional or dialectal register, and stylistic markedness.

Following Geeraerts (2010), synonymy is best understood as a context-sensitive relationship: two words are synonymous if they can replace each other in a given context without changing the overall semantic value. However, true or absolute synonymy is rare. Partial synonymy is more common and can be observed when words share core meaning but differ in register or stylistic tone. For instance, prostituta ‘prostitute’ and qaħba ‘whore’ may refer to the same individual, but the latter is more emotionally charged. Similarly, dijabete (a medical term) contrasts with zokkor (its popular counterpart for diabetes), much like gonorrhoea and clap in English. Even regional variation plays a role, as in trampi (Gozo) and bużullotti (Malta) for ‘odd, funny behaviour’ or żajba żajbona (Rabat, Malta), semperlina (Siġġiewi) and nannakola (Standard Maltese) for ‘ladybird’.

Synonyms in Maltese occur across different parts of speech, such as nouns (lsien and lingwa ‘language’), verbs (spera and ttama ‘to hope’), adjectives (żgħir and ċkejken ‘small’), adverbs (dlonk and immedjatament ‘immediately’), prepositions (fi and ġo ‘in, inside’, see Stolz et al. 2023), and conjunctions (imma, iżda, and però ‘but’, see Stolz et al., accepted). In some cases, the part of speech itself shifts: an adjective like sħun ‘warm, hot’ in the phrase it-taġen sħun ‘the pan is hot’ is synonymous with the verb ħaraq in the Imperfect, it-taġen jaħraq. Similarly, xena sabiħa (‘beautiful scene’) may alternate with xena bellezza (noun), xena tal-istampi (noun phrase, literally ‘a scene of the pictures’), or even xena ma tpinġihiex (‘a scene you can’t even paint’, using a negative verb phrase).

Furthermore, lexical variation may involve multiword expressions versus single lexical items. The adjective stinat ‘stubborn’ coexists with idiomatic alternatives like rasu iebsa and rasu taż-żonqor. Likewise, aktarx ‘probably’ finds synonymous expression in phrases such as jista’ jkun and għandu mnejn.

Synonyms in Maltese often differ in collocational behaviour. For instance, although jum and ġurnata both mean ‘day’, only Jum l-Omm ‘Mother’s Day’ is acceptable, not Ġurnata l-Omm. Likewise, illum il-ġurnata ‘nowadays’ is idiomatic, while illum il-jum is not, and tużżana bajd ‘a dozen eggs’ is well-formed, but tużżana appostli ‘a dozen apostles’ is infelicitous.

In terms of syntactic flexibility, words like beda and qabad ‘to begin’ may both occur in verb chains, e.g. it-tifel beda/qabad jibki ‘the boy began to cry’, but not in all structures. Il-film beda issa ‘the film just started’ is acceptable, but *il-film qabad issa is not, if the intended meaning is to indicate the onset of the film. Similarly, anki ‘also’ may appear sentence-initially, as in anki t-tifla ġiet ‘the girl also came’, whereas its synonym ukoll may not, as in *ukoll it-tifla ġiet.

Polysemy adds further complexity. While domanda and mistoqsija are synonymous in the sense of ‘question’, domanda also means ‘demand’, making it the only valid term in a sentence such as hawn domanda kbira fis-suq ‘there is high demand in the market’. The noun ilsien offers multiple senses beyond ‘language’, including ‘tongue’ (body part), ‘bell clapper’, ‘shoe tongue’, and ‘peninsula’ as in ilsien ta’ art. Similarly, while temp and ajru can both mean ‘weather’ or ‘sky’ in it-temp fetaħ and l-ajru fetaħ ‘the sky has cleared’, only temp is correct in technical or metaphorical uses, such as it-temp tal-verb ‘verb tense’ or il-mużiċist iżomm mat-temp ‘the musician keeps the tempo’.

Maltese features a range of binomial expressions composed of synonymous terms that serve to stylistically reinforce meaning through lexical doubling. These pairings often reflect the language’s hybrid etymological makeup, typically combining a Semitic term with a Romance-derived counterpart, as in paċi u sliem ‘peace’ or rieda u volontà ‘will’. However, some expressions involve elements from the same stratum, whether Romance (e.g. fidil/fidili u leali ‘loyal’) or Semitic (e.g. imkisser u mfarrak ‘smashed’). Such constructions not only intensify the semantic effect but also exemplify the interplay between the language’s core lexical strata.

This study focuses on a specific set of Maltese noun synonyms: btala (Semitic), vaganza (Romance; Aquilina (1987) gives the etymology < Sic. vacanti), and holiday (English), three terms commonly used to express the concept of ‘time off’ or a ‘holiday’ in Maltese. While often treated as interchangeable, their actual usage may vary subtly depending on domain, register, and discourse context. These differences offer a compelling case study in synonymy within a highly bilingual and contact-rich setting.

The semantic phenomenon of synonymy has long been a central topic in lexical semantics, driven by cognitive, semantic, and contextual factors that allow speakers to express similar meanings through different word forms (Sikogukira 1994). At first glance, synonymy may appear straightforward (using different words to say the same thing), but its theoretical underpinnings reveal a much more nuanced picture. Following J. Lyons’ influential classification (Lyons 1995; see also Sikogukira 1994), synonyms can be broadly divided into absolute synonyms and partial synonyms. Absolute synonyms, especially in the sense of complete synonyms in Lyons’ more fine-grained classification, are rare and often debated, requiring complete interchangeability in all dimensions of meaning. Partial synonyms, on the other hand, may overlap in meaning but differ in collocational preferences, stylistic register, or domain-specific usage.

This raises a key question: to what degree must two lexical items overlap in meaning to be considered synonymous? Is mere denotational equivalence enough, or must they also be interchangeable across all contexts? These distinctions are particularly relevant in languages shaped by extensive contact, where synonym candidates may come from different etymological sources and be embedded in distinct contexts. Determining synonymy in such a setting requires not only formal linguistic analysis but also a nuanced understanding of usage. The insight of trained linguists, and ideally native speakers, is indispensable in assessing whether two word forms truly function as synonyms.

Building on this, Friggieri (2000) offers a valuable framework for analysing synonymy in Maltese by adapting and translating the nine-point scale of variation among synonym pairs from W.E. Collinson (spelled out in Ullmann 1962, Chapter 6). This typology captures different axes along which words may diverge, including cases where one word is more general than another (karab/beka ‘to cry, beg’), more intense (wissa/widdeb ‘to warn’), more emotionally charged (parpar/telaq ‘to leave’), more evaluative (berbaq/nefaq ‘to waste, spend’), more professional (omiċidju/qtil ‘homicide, killing’), more literary (ħajr/ringrazzjament ‘thanks’), more current or discursive (xagħar/dliel ‘hair’), more dialectal (manoċċa/tajra ‘kite’), or more associated with children’s speech (pappa/ikel ‘food’). Crucially, following Ullmann (1962), Friggieri (2000) shows for Maltese that synonymous forms often differ not only in semantic content but also in discourse function, stylistic register, emotional valence, or social positioning, factors especially salient in a contact language.

While Ullmann’s framework is primarily grounded in qualitative linguistic analysis and native speaker intuition, it raises compelling questions about how such fine-grained variation might be detected in large-scale corpora. Advances in natural language processing (NLP) and computational linguistics now offer new tools to complement this intuition, especially for low-resourced and morphologically rich languages like Maltese. Large-scale language models can reveal subtle patterns of usage, co-occurrence, and context-sensitive meaning that may not be easily detected through introspection alone. Recent work, for instance, has shown how psycholinguistic effects such as semantic priming or prime-target meaning differences can be modelled computationally, revealing previously unobserved patterns (Nieder et al. 2024).

In this study, we adopt a computational corpus-based approach to investigate whether, and to what extent, the terms btala, vaganza, and holiday exhibit similar contextual profiles. Rather than assuming synonymy, we interrogate it: how do these words behave across different domains, registers, and contexts? What does this tell us about language contact, code-switching, and the nature of meaning in use?

2 Analysis of synonyms

2.1 Synonyms and corpora

Corpus analysis has long been a central method in linguistic research, offering rich empirical evidence for how language is used across registers, genres, and speaker communities. For semantic phenomena such as synonymy, however, surface-level comparisons of word meaning often fall short. Synonymous terms may share a basic definition, but their distribution, register, and pragmatic function are shaped by the contexts in which they occur. Analysing such subtle semantic variation typically requires not only extensive corpora but also deep knowledge of the language and a significant investment of time and interpretive effort.

This challenge is particularly salient in the case of Maltese, a contact language with a lexicon shaped by both Semitic and Romance influences, and more recently, by English. In such a setting, multiple lexical items exist side by side to express similar concepts, yet their actual usage and meaning in context may differ in important and revealing ways.

In this study, we explore the Maltese synonyms btala, vaganza, and holiday, all referring to the concept of a ‘holiday’ or ‘time off’. We propose a computational approach that can assist traditional corpus methods by modelling word meaning through contextual embeddings generated by large language models (see also Nieder & List 2024, Nieder et al. 2024). While not a replacement for human linguistic expertise, such models offer a scalable way to identify semantic patterns across thousands of real-world contexts.

The concept of ‘holiday’ provides a useful test case, not only for its relevance to the occasion of this volume, but also for its ability to illustrate the subtle semantic and sociolinguistic variation that computational tools can help uncover. Although this study focuses on a single lexical field, the methodology we adopt can be readily extended to explore other semantic relations in Maltese and beyond.

2.2 Corpus analysis

This study draws on data from the Korpus Malti v. 4.2, a wide-ranging corpus of contemporary Maltese consisting of approximately 474 million words across 163,395 texts, spanning various genres such as parliamentary proceedings, press articles, academic writing, online blogs, and literature. To analyse their contextual usage and semantic behaviour, we extracted all instances of the target terms btala, vaganza, and holiday, as well as their respective plural forms btajjel, vaganzi, and holidays, each with 20 words of surrounding context in each direction. Our final data set contains a total of 8,581 observations.
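The extraction step can be illustrated with a minimal keyword-in-context sketch. It is not the procedure used for the study: the function name extract_contexts, the whitespace tokenisation, the punctuation handling, and the toy sentence are all assumptions made for illustration; only the six target forms and the 20-word window come from the description above.

```python
# Target forms named in the text (singular and plural).
TARGETS = {"btala", "btajjel", "vaganza", "vaganzi", "holiday", "holidays"}

# Characters stripped from token edges before matching (an illustrative choice).
PUNCTUATION = ".,;:!?()\"'"


def extract_contexts(text, targets=TARGETS, window=20):
    """Return (left context, matched form, right context) for each target hit."""
    tokens = text.split()  # naive whitespace tokenisation, assumed for this sketch
    hits = []
    for i, token in enumerate(tokens):
        form = token.strip(PUNCTUATION).lower()
        if form in targets:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((left, form, right))
    return hits


# Invented toy sentence, not taken from the Korpus Malti.
sample = "Fis-sajf nieħdu vaganza twila u btala oħra fil-Milied."
for left, form, right in extract_contexts(sample, window=3):
    print(" ".join(left), f"[{form}]", " ".join(right))
```

With the default window of 20, each hit carries up to 20 tokens on either side, truncated at the edges of the text.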

The three terms display notable differences in overall frequency and distribution. Vaganza was the most frequent, with 4,065 occurrences across 2,112 texts, yielding a frequency of 8.56 instances per million words. Btala followed closely, occurring 3,339 times in 1,703 texts (7.03 instances per million words). In contrast, holiday appeared just 1,177 times in 592 texts, with a frequency of only 2.48 instances per million words. This lower frequency suggests that holiday, while semantically similar, may play a more marginal or context-specific role in the Maltese lexicon. Another possible reason for this lower frequency is the composition of the Korpus Malti. Despite its high frequency in spoken Maltese, holiday might be underrepresented in a corpus that is primarily composed of written texts. This reflects not only the diglossic nature of Maltese but also the normative pressures associated with written language, where speakers may consciously avoid English-derived forms in favour of Semitic or Romance alternatives perceived as more appropriate or authentic, especially for written Maltese.
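The normalised frequencies can be reproduced approximately from the raw counts. The sketch below assumes a corpus size of exactly 474 million words, which is the rounded figure given in Section 2.2 rather than the exact token count, so the results may differ from the reported values in the second decimal place.

```python
# Rounded corpus size from the text ("approximately 474 million words").
CORPUS_WORDS = 474_000_000

# Raw occurrence counts reported above.
counts = {"vaganza": 4065, "btala": 3339, "holiday": 1177}


def per_million(count, corpus_words=CORPUS_WORDS):
    """Normalise a raw count to instances per million words."""
    return count / corpus_words * 1_000_000


for term, count in counts.items():
    print(f"{term}: {per_million(count):.2f} per million words")

# The three counts add up to the size of the final data set.
total = sum(counts.values())
print("total observations:", total)
```

The sum of the three counts is 8,581, matching the size of the final data set.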

The distribution across genres further underscores these differences. The lexeme btala shows a broad and productive usage pattern, with high frequencies in online blogs (15.14 instances per million words), literature (22.22 instances per million words), and the press (19.58 instances per million words). Similarly, vaganza appears extensively in online content (21.93 instances per million words), literature (36.36 instances per million words), and press texts (21.50 instances per million words), suggesting it is well-integrated and semantically flexible. Holiday, on the other hand, is largely restricted to institutional domains such as parliamentary texts (11.96 instances per million words) and jurisprudence (3.26 instances per million words), with minimal representation in more informal genres. This suggests that holiday may function as a domain-specific or code-switched term, rather than a fully integrated synonym in written Maltese.

The most common collocations for each term offer further insight into their usage profiles. The lexeme btala frequently appears with Ħdud ‘Sundays’, pubbliċi ‘public’, and jitħallsu ‘are paid’, reflecting its use in references to official or paid leave. Vaganza co-occurs most often with sajf ‘summer’, Milied ‘Christmas’, and tas- (a genitive particle), suggesting its association with seasonal or festive breaks. Holiday, meanwhile, tends to appear in more technical or bureaucratic contexts, collocating with English words such as tax, complex, and flats.
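How collocates of this kind might be tallied is sketched below. The text does not specify the association measure or window size behind the collocation lists, so raw co-occurrence counts within a five-token window are an assumption here, as are the function name collocates and the invented, pre-tokenised toy contexts.

```python
from collections import Counter


def collocates(tokenised_texts, node, window=5):
    """Count tokens occurring within `window` positions of `node`."""
    counts = Counter()
    for tokens in tokenised_texts:
        for i, token in enumerate(tokens):
            if token == node:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(left + right)
    return counts


# Invented, pre-tokenised toy contexts (not corpus data).
toy = [
    ["vaganza", "tas-", "sajf"],
    ["vaganza", "tal-", "milied"],
    ["il-", "vaganza", "tas-", "sajf"],
]
print(collocates(toy, "vaganza").most_common(2))
```

A frequency-based ranking like this favours function words such as tas-; an association measure would typically be used to counter that effect.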

While this corpus-based overview already suggests subtle semantic and stylistic distinctions among the three terms, the full picture requires a more fine-grained analysis of the contexts in which these words appear. To this end, we turn to a computational method that allows us to model word meaning as it emerges through real usage.

2.3 Computational analysis

To investigate the contextual behaviour of Maltese near-synonyms for holiday, we carried out a computational analysis using the large language model mBERTu (Micallef et al. 2022), a Maltese-specific adaptation of multilingual BERT (Bidirectional Encoder Representations from Transformers). Large language models like BERT are trained on vast amounts of text and are able to represent words in terms of the contexts in which they appear. This allows us to model meaning not in isolation, but as it emerges through usage (Nieder et al. 2024).

We followed a multi-step process:

  1. Corpus extraction: Sentences containing instances of the three near-synonymous terms under investigation, btala, vaganza, and holiday, including their plural forms (btajjel, vaganzi, holidays), were extracted from the Korpus Malti v. 4.2.
  2. Contextual embeddings: Each sentence was passed through the mBERTu model, which generated a high-dimensional vector (embedding) representing the meaning of each token as shaped by its surrounding context. These embeddings capture subtle shifts in meaning that would be difficult to identify manually.
  3. Similarity measurement: We calculated semantic similarity (i.e. cosine similarity) between the contextual embeddings to assess how semantically close the usages of each word were, both within and across synonym groups.
  4. Variance analysis: To assess the internal semantic consistency of each term, we measured the variance in cosine similarity scores within each word group. Lower variance suggests that the term is used in a more stable, consistent manner across contexts.
  5. Semantic space visualisation: Finally, we applied UMAP (Uniform Manifold Approximation and Projection), a dimensionality reduction technique that preserves relative distances between data points, to visualise the distribution of contextual embeddings in a two-dimensional space.

This methodology provides a way to observe semantic behaviour at scale, offering a systematic perspective on how near-synonyms are used across varied contexts in Maltese. The combination of cosine similarity, variance, and spatial distribution allows us to identify both overlaps and distinctions in the usage patterns of the target terms.
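Steps 3 and 4 can be illustrated with a small self-contained sketch. The toy vectors stand in for contextual embeddings, and the function names are hypothetical. Two choices are assumptions rather than details from the study: comparing each unordered pair of distinct vectors once, and using the population variance.

```python
import math
from itertools import combinations
from statistics import mean, pvariance


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def within_group_stats(vectors):
    """Number of pairs, mean, and variance of pairwise cosine similarities."""
    # Assumption: each unordered pair of distinct vectors is compared once.
    sims = [cosine(u, v) for u, v in combinations(vectors, 2)]
    return len(sims), mean(sims), pvariance(sims)


# Toy three-dimensional vectors standing in for contextual embeddings.
toy_group = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.8, 0.0, 0.2]]
n_pairs, avg, var = within_group_stats(toy_group)
print(n_pairs, round(avg, 4), round(var, 6))
```

A group whose occurrences sit close together in embedding space yields a high mean and a low variance; a group spread over more varied contexts yields a lower mean, a higher variance, or both.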

The data and code for our study are openly available at https://osf.io/ebp2a/.

2.4 Results

2.4.1 Intra-group similarity and contextual consistency

The results of the similarity measurement analysis within the separate groups are summarised in Table 1, which reports descriptive statistics for each of the three synonym groups: btala, vaganza, and holiday. For each group, we calculated the cosine similarity between all pairs of contextual embeddings generated by the mBERTu model. These values reflect how consistently each term is used across its various occurrences in the corpus, that is, how similar the contexts are in which the word appears.

Table 1: Descriptive statistics of cosine similarity scores within each synonym group (btala, vaganza, holiday), based on contextual embeddings generated by the mBERTu model. Higher mean scores indicate more semantically consistent usage across contexts. The number of pairwise comparisons (no. of pairs) reflects the frequency of each term in the corpus.

token      no. of pairs   mean cosine similarity   variance
btala      11,148,921     0.8560                   0.0039
vaganza    16,524,225     0.8635                   0.0029
holiday     1,385,329     0.8721                   0.0035
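The pair counts in Table 1 can be checked against the occurrence counts reported in Section 2.2. The arithmetic below shows that each count equals the squared number of occurrences, which would be consistent with a full n × n similarity matrix that includes self-comparisons and both orderings of each pair. This is an inference from the numbers, not something the text states.

```python
# Occurrence counts from Section 2.2 and pair counts from Table 1.
occurrences = {"btala": 3339, "vaganza": 4065, "holiday": 1177}
reported_pairs = {"btala": 11_148_921, "vaganza": 16_524_225, "holiday": 1_385_329}

matches = {}
for term, n in occurrences.items():
    ordered_with_self = n * n            # full n x n matrix
    unordered_distinct = n * (n - 1) // 2  # each distinct pair counted once
    matches[term] = ordered_with_self == reported_pairs[term]
    print(term, ordered_with_self, unordered_distinct, matches[term])
```

If self-comparisons are indeed included, each group contributes n similarity scores of exactly 1, a share that shrinks as n grows and is therefore unlikely to change the comparison between groups materially.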

All three groups show high average similarity scores, with holiday having the highest mean (0.8721), followed by vaganza (0.8635) and btala (0.8560). These values indicate that, overall, the words are used in semantically coherent ways, with relatively little variation in meaning across contexts. However, small differences in the variance of similarity scores offer further insight. Btala displays the highest variance (0.0039), suggesting that it appears in a slightly wider range of contexts. In contrast, vaganza has the lowest variance (0.0029), indicating that it is used in a more narrowly defined set of contexts. Holiday, while having the highest mean similarity (0.8721), also shows moderate variance (0.0035). Taken together, these results suggest that holiday is used in semantically consistent and relatively uniform contexts. To visualise the distribution of contextual similarities within each group, we plotted density estimates of the pairwise cosine similarity scores for btala, vaganza, and holiday (see Figure 1).

Figure 1: Density plots of pairwise cosine similarity scores within each synonym group (btala, vaganza, holiday), based on contextual embeddings. Higher and narrower peaks indicate more semantically consistent usage across contexts; broader distributions suggest more contextual flexibility.

All three curves show a strong peak in the high similarity range (above 0.8), confirming that each term is used in a consistent and semantically cohesive way across contexts. Notably, btala’s curve (green) is slightly broader, with a few more lower-similarity pairs, supporting its higher variance. Vaganza’s curve (red) is narrower and more concentrated, reflecting its more restricted usage. Holiday (blue) shows a similarly tight distribution, although its shape and location on the x-axis suggest even more consistency across contexts. These visual patterns align with the statistical results in Table 1 and reinforce the interpretation that btala is used more flexibly, while vaganza and holiday are semantically more constrained. Taken together, these results suggest that although the three terms function as near-synonyms, they differ subtly in terms of semantic scope and contextual flexibility.

2.4.2 Inter-group similarity and synonym overlap

To complement the analysis of internal consistency, we also examined the degree of semantic overlap between the synonym groups by calculating the average cosine similarity between the contextual embeddings of each pair of terms. These between-group comparisons indicate how closely the words align in usage, and whether they occupy overlapping or distinct regions in semantic space.

Table 2: Inter-group cosine similarity scores based on contextual embeddings generated by the mBERTu model. Higher average scores indicate a higher semantic overlap.

token pair           average cosine similarity
btala - vaganza      0.8585
vaganza - holiday    0.8528
holiday - btala      0.8477
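A between-group average of the kind reported in Table 2 can be sketched as the mean cosine similarity over all cross-group pairs. The function names and toy vectors below are hypothetical, and averaging over every cross-group pair is an assumption about how the scores were aggregated.

```python
import math
from itertools import product
from statistics import mean


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def between_group_mean(group_a, group_b):
    """Mean cosine similarity over every pair (one vector from each group)."""
    return mean([cosine(u, v) for u, v in product(group_a, group_b)])


# Toy two-dimensional vectors standing in for two groups of embeddings.
group_a = [[1.0, 0.0], [0.9, 0.1]]
group_b = [[0.8, 0.2], [0.7, 0.3]]
print(round(between_group_mean(group_a, group_b), 4))
```

Comparing such a between-group average with the within-group means in Table 1 indicates whether occurrences of two different words are, on average, nearly as close to each other as occurrences of the same word.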

The results in Table 2 reveal a high level of semantic proximity across all three groups. The highest similarity was observed between btala and vaganza (0.8585), followed by vaganza and holiday (0.8528), and finally btala and holiday (0.8477). These figures suggest that the terms are frequently used in similar contexts, reinforcing their functional synonymy. At the same time, however, the small differences we observe are linguistically meaningful. The slightly lower similarity between btala and holiday reflects differences in language of use, with holiday appearing more frequently in English-medium or code-switched discourse, while btala is more common in native Maltese usage. The close alignment between btala and vaganza, on the other hand, suggests that vaganza, originating from a Romance background, is well-integrated into the Maltese lexicon and used in much the same way as its native Semitic counterpart, even though there is a considerable difference in intra-group variance.

2.4.3 Semantic variance and flexibility

Beyond average similarity scores, the variance in similarity within each synonym group offers further insight into how flexibly each term is used across different contexts (see Table 1, column ‘variance’). Variance here refers to the spread of contextual similarity scores: lower variance indicates that a word tends to appear in highly similar, stable environments, while higher variance points to greater contextual diversity, and potentially greater semantic flexibility.

Among the three terms, btala exhibited the highest variance (0.0039), suggesting less contextual consistency, i.e. it is used in a broader range of linguistic and situational settings. This is consistent with its status as the native Semitic term, which may be used productively across multiple registers — from informal conversation to institutional discourse — and may refer to various types of leave (e.g., sick leave, school holidays, time off work). Vaganza, by contrast, showed the lowest variance (0.0029), indicating a narrower contextual usage. As a word of Romance origin, its semantic function and context may be more specialised, or influenced by stylistic or sociolinguistic factors. Its contextual consistency suggests that it carries a slightly more fixed or constrained meaning, despite its apparent synonymy with btala. Holiday showed moderate variance (0.0035) and the highest mean similarity, which suggests highly stable usage, likely within English-medium discourse, as established through our earlier collocation check. Its moderate variance could reflect its appearance in fixed expressions or more formulaic structures, further reinforcing a more constrained contextual scope.
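Because variance is expressed in squared units, it can help to convert the Table 1 values to standard deviations, which are on the same scale as the similarity scores themselves. The short calculation below uses only the reported means and variances; presenting them as standard deviations is an interpretive aid added here, not part of the original analysis.

```python
import math

# (mean cosine similarity, variance) as reported in Table 1.
table1 = {
    "btala": (0.8560, 0.0039),
    "vaganza": (0.8635, 0.0029),
    "holiday": (0.8721, 0.0035),
}

# Standard deviation is the square root of the variance.
sds = {term: math.sqrt(variance) for term, (_, variance) in table1.items()}

for term, (mean_sim, _) in table1.items():
    print(f"{term}: mean={mean_sim:.4f}, sd={sds[term]:.3f}")
```

On these figures the standard deviations are roughly 0.062 for btala, 0.054 for vaganza, and 0.059 for holiday, so the differences in spread between the groups amount to less than 0.01 on the cosine similarity scale.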

2.4.4 Visualising meaning: Semantic space projection

To provide a more intuitive view of the semantic relationships between the synonym groups, we visualised the contextual embeddings using UMAP (Uniform Manifold Approximation and Projection). UMAP projects high-dimensional vectors into two dimensions while preserving their relative distances, allowing us to represent semantic similarity in a visual format. Each point in Figure 2 represents a single occurrence of btala, vaganza, or holiday in the corpus, coloured by group. Overlapping data points suggest highly similar usage contexts, while separate clusters indicate a slightly different meaning.

Figure 2: UMAP projection of contextual embeddings for btala, vaganza, and holiday, with each point representing one occurrence in the corpus. Colours indicate token group. The spatial distribution reflects contextual similarity: denser clusters suggest consistent usage, while more dispersed clusters indicate greater semantic flexibility.

The visualisation shows that all three terms occupy a shared semantic space, but with distinct patterns of distribution. Btala (turquoise) is the most widely dispersed, spreading across both dimensions and showing less tightly clustered usage. This visual pattern aligns with its higher variance in the similarity analysis and suggests that btala is used in a broader variety of contexts, likely reflecting its flexibility and high productivity in native Maltese discourse. A closer inspection of the spread-out data points at the top of the plot for btala revealed that some of them reflect the broken plural form btajjel, which seems to have a more distinct usage pattern than any other singular or plural form in the data.

Vaganza (grey), by contrast, forms a more compact cluster, concentrated in a central zone of the semantic space. Its relatively tight grouping indicates more restricted usage, likely tied to specific registers. Holiday (light green) also forms a relatively compact but slightly more peripheral cluster, suggesting stable but even more contextually distinct usage.

The partial overlap between btala and vaganza reflects their high semantic similarity and supports their functional interchangeability. Holiday is less central, indicating that while it is related in meaning, it plays a more specialised role in the corpus data.

2.4.5 Summary

This analysis demonstrates that large language models such as mBERTu can effectively capture subtle patterns of synonym usage in real corpus data. While btala, vaganza, and holiday are functionally similar and often interchangeable, their contextual behaviour reveals meaningful distinctions. All three terms are used in semantically coherent ways, as shown by high average similarity scores, but differ in semantic scope and distributional flexibility. The Semitic term btala shows the greatest internal variation and the widest spread in semantic space, suggesting broader usage across registers and meanings. Vaganza, a term of Romance origin, displays more constrained and centralised usage, indicating its integration into the Maltese lexicon with a narrower functional role despite a considerable meaning overlap with btala (see Figure 2 again). Holiday, though semantically close to the others, is used in more peripheral and consistent ways, likely reflecting its presence in English or code-switched contexts.

3 Discussion

Synonyms are traditionally defined as words that share the same meaning, yet in practice, true or complete synonymy is rare. Words that are functionally equivalent often differ subtly in their connotations, frequency, register, or contextual distribution. The results of our corpus and computational analysis illustrate this principle clearly in the case of the Maltese terms btala, vaganza, and holiday. While all three are used to express the concept of a ‘holiday’ or ‘time off’, they exhibit measurable differences in how, where, and to what extent they are used.

Our findings show that although the words are highly similar in meaning — as evidenced by high cosine similarity scores — they differ in semantic flexibility, register, and distributional range. Btala, the native Semitic term, displays the broadest contextual usage, appearing in a wider range of registers and syntactic environments. Its higher variance and broader spread in semantic space reflect its adaptability and productivity across domains such as education, work, and health.

Vaganza, derived from Romance, shows slightly higher semantic consistency and occupies a more central, tightly clustered region in semantic space. This suggests that it has become well-integrated into the Maltese lexicon, functioning as a stable synonym for btala in many contexts. The overlap between these two terms highlights the internal convergence between Maltese’s Semitic and Romance linguistic layers, a reflection of the language’s historical hybridity and internal lexical integration. Holiday, by contrast, while also semantically aligned with the other two, shows slightly more peripheral and constrained usage. Its high average similarity and moderate variance suggest contextual stability, but also limited flexibility. It appears to function primarily in English-language or code-switched discourse, often embedded in English expressions or formal usage. This suggests that while the English term holiday has entered the Maltese lexicon, it may not have the same range of semantic or pragmatic uses as its Semitic or Romance counterparts, btala and vaganza. In this sense, it exhibits a kind of partial integration, functioning as a near-synonym with register-specific limitations.

These findings demonstrate that contextual meaning, captured here through embeddings and similarity metrics, can reveal subtle distinctions that are often difficult to observe through traditional manual analysis alone (Nieder et al. 2024). While all three terms refer to the same conceptual domain, their actual use in language reflects historical origin, degree of integration, and sociolinguistic function.

In this regard, Friggieri’s (2000) typology of Maltese synonymy offers a valuable interpretive lens. Rather than simply defining the meaning relationship between words as synonymous or not, his framework classifies relationships between word pairs, capturing the ways in which synonymous terms can differ in nuance, intensity, register, or stylistic effect. When viewed through this lens, the differences observed between btala, vaganza, and holiday suggest that these terms may stand in asymmetric relationships of the sort Ullmann (1962) and Friggieri (2000) describe — where one is broader or more neutral, and others are more contextually or stylistically marked. While our computational methods do not explicitly classify synonym pairs according to Ullmann’s and Friggieri’s scale, the variation we observe in distribution and semantic consistency echoes the kinds of relational patterns their typology captures.

Moreover, this study highlights the value of combining corpus data with computational tools in linguistic analysis. By modelling meaning as it emerges through usage, we are able to quantify and visualise how speakers employ synonymous terms in different ways. This approach complements traditional corpus analysis by making it possible to examine large-scale patterns of variation, and to better understand the lexical dynamics of a contact-rich language like Maltese. Together, the similarity metrics, variance measures, and semantic space visualisation presented in this work offer insight into how meaning is shaped by register, origin, and language contact.
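To make the notions of average similarity and variance concrete, the following minimal sketch computes both over a set of vectors. It is an illustration under stated assumptions rather than the pipeline used in this study: the function names cosine and consistency are hypothetical, and the toy three-dimensional vectors stand in for contextual embeddings and are not corpus data.

```python
import math
from itertools import combinations
from statistics import mean, pvariance


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def consistency(vectors):
    """Mean and population variance of all pairwise cosine similarities."""
    sims = [cosine(u, v) for u, v in combinations(vectors, 2)]
    return mean(sims), pvariance(sims)


# Hypothetical toy vectors (assumption: invented for illustration only).
toy = {
    "tight": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [1.0, 0.1, 0.1]],
    "spread": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]],
}

for label, vecs in toy.items():
    avg, var = consistency(vecs)
    print(f"{label}: mean similarity = {avg:.3f}, variance = {var:.4f}")
```

In this toy setting, the "tight" set yields a higher mean similarity and a lower variance than the "spread" set, which is the kind of contrast that the consistency measures discussed above are meant to capture.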

4 Conclusion

To conclude, while the Maltese nouns btala, vaganza, and holiday all point to time away from work, they do so in subtly different ways — shaped by linguistic history, usage patterns, and social context. If nothing else, this analysis reminds us that the concept of a holiday is not only semantically rich, but also pragmatically embedded in our everyday life. We can think of no better way to honour the occasion than by encouraging our celebrated colleague to embrace all three meanings — and to enjoy a well-earned btala, a relaxing vaganza, and perhaps even a proper holiday.

5 Acknowledgement

We want to express our gratitude to Dr. Thomas Haider from the University of Passau for early discussions on this project. Additionally, we thank our colleagues Slavomír Čéplö and Adam Ussishkin not only for conceiving the idea for this volume, but also for their ongoing editorial support during the preparation of this contribution. Finally, and most importantly, we are honoured to contribute to this Festschrift dedicated to Professor Thomas Stolz, whose work has profoundly influenced our thinking on language contact and linguistic typology.

6 Data Availability Statement

The code and data used in this study are openly available at https://osf.io/ebp2a/.

7 References

Aquilina, Joseph. 1987–1990. Maltese-English Dictionary. Malta: Midsea Books.

Bakker, Peter & Maarten Mous. 1994. Mixed Languages: 15 Case Studies in Language Intertwining. Amsterdam: Institute for Functional Research into Language and Language Use.

Comrie, Bernard & Michael Spagnol. 2016. “Maltese loanword typology.” In Gilbert Puech & Benjamin Saade (eds.), Shifts and Patterns in Maltese, 315–330. Berlin: De Gruyter.

Friggieri, Oliver. 2000. Dizzjunarju Ta’ Termini Letterarji (3rd edition). Malta: Publishers Enterprises Group.

Geeraerts, Dirk. 2010. Theories of lexical semantics. New York: Oxford University Press.

Lyons, John. 1995. Linguistic Semantics. Cambridge: Cambridge University Press.

Micallef, Kurt, Albert Gatt, Marc Tanti, Lonneke van der Plas & Claudia Borg. 2022. “Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and BERT Models for Maltese.” In Colin Cherry, Angela Fan, George Foster, Gholamreza (Reza) Haffari, Shahram Khadivi, Nanyun (Violet) Peng, Xiang Ren, Ehsan Shareghi & Swabha Swayamdipta (eds.), Proceedings of the Third Workshop on Deep Learning for Low-Resource Natural Language Processing, 90–101. Association for Computational Linguistics. Available online at https://aclanthology.org/volumes/2022.deeplo-1/.

Nieder, Jessica & Johann-Mattis List. 2024. “A Computational Model for the Assessment of Mutual Intelligibility Among Closely Related Languages.” In Michael Hahn, Alexey Sorokin, Ritesh Kumar, Andreas Shcherbakov, Yulia Otmakhova, Jinrui Yang, Oleg Serikov, Priya Rani, Edoardo M. Ponti, Saliha Muradoğlu, Rena Gao, Ryan Cotterell & Ekaterina Vylomova (eds.), Proceedings of the 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, 37–43. St. Julian’s, Malta: Association for Computational Linguistics. Available online at https://aclanthology.org/2024.sigtyp-1.0/.

Nieder, Jessica, Ruben van de Vijver & Adam Ussishkin. 2024. “Emerging Roots: Investigating Early Access to Meaning in Maltese Auditory Word Recognition.” Cognitive Science 48 (11). e70004. Available online at https://doi.org/10.1111/cogs.70004.

Sikogukira, Matutin. “Measuring Synonymy as an Intra-Linguistic and Cross-Linguistic Sense Relation.” Edinburgh Working Papers in Applied Linguistics 5. 109–118.

Stolz, Thomas. 2003. “Not quite the right mixture: Chamorro and Malti as candidates for the status of mixed language.” In Yaron Matras & Peter Bakker (eds.), The Mixed Language Debate, 271–316. Berlin/New York: Mouton de Gruyter.

Stolz, Thomas. 2025. “But, but, but. Three competing adversative connectors in contemporary Maltese.” In Maike Vorholt, Raffaello Bezzina & Michela Vella (eds.), The Next Century of Maltese Linguistics. Berlin: De Gruyter.

Stolz, Thomas, Nataliya Levkovych & Maike Vorholt. 2023. “Variable overt marking of Place/Goal with place names as complements: On the competition between fi, ġewwa, and ġo.” Journal of Maltese Studies 30. 177–215.

Ullmann, Stephen. 1962. Semantics: an introduction to the science of meaning. Oxford: Blackwell.

Vella, Alexandra. 2013. “Languages and language varieties in Malta.” International Journal of Bilingual Education and Bilingualism 16 (5). 532–552.


  1. Note that for the purpose of this paper we grouped singulars and plurals together into one concept group.