A revised reconstruction of the Proto-Tupian vowel system

Andrey Nikulin; Fernando Carvalho

Abstract: This contribution is concerned with the reconstruction of the vowel qualities of Proto-Tupian, the ancestral language of the Tupian language family. The study is grounded in a bottom-up application of the comparative method and seeks to offer a more balanced reconstruction that avoids an overreliance on the Tupí-Guaraní branch. It is first shown that the height opposition traditionally reconstructed for the rounded vowel series (*o vs. *u) is best interpreted as an opposition between an unrounded vowel and a rounded one (*ə vs. *o). It is also argued that multiple instances of *e in the traditional reconstruction should be rather attributed to *ə. Finally, it is shown that two vowels (symbolized as *ɨ and *ɯ) must be reconstructed in lieu of the traditional *ɨ. The resulting proposal has consequences for the subgrouping of the Tupian family.

Keywords: Tupian languages, Comparative reconstruction, Vowels, Sound change.

Resumo: O foco do presente trabalho é a reconstrução do sistema vocálico do Proto-Tupi, a língua ancestral da família linguística Tupi. A investigação baseia-se em uma aplicação bottom-up do método comparativo e busca oferecer uma reconstrução mais equilibrada, evitando uma influência desproporcional do ramo Tupi-Guarani. Mostraremos primeiro que a diferença de altura tradicionalmente reconstruída para a série de vogais arredondadas (*o vs. *u) pode ser melhor compreendida como uma oposição entre uma vogal não arredondada e uma vogal arredondada (*ə vs. *o). Argumentamos também que múltiplas instâncias de *e na reconstrução tradicional devem ser atribuídas a *ə. Por fim, duas vogais (simbolizadas *ɨ e *ɯ) devem ser reconstruídas em lugar do segmento *ɨ da reconstrução tradicional. Estas propostas têm implicação para a classificação interna da família Tupi.

Palavras-chave: Línguas Tupi, Reconstrução comparativa, Vogais, Mudança sonora.

Research Articles

O sistema vocálico do Proto-Tupi: uma nova proposta reconstrutiva

Andrey Nikulin

Universidade Federal de Goiás, Brasil

Fernando Carvalho fernaoorphao@gmail.com

Universidade Federal do Rio de Janeiro, Brasil

Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, vol. 17, no. 2, e20210035, 2022
MCTI/Museu Paraense Emílio Goeldi

Received: 15 March 2021

Accepted: 22 December 2021

DOI: https://doi.org/10.1590/2178-2547-BGOELDI-2021-0035

INTRODUCTION

This article addresses the reconstruction of the vowels of Proto-Tupian (PT). Tupian is a major language family of South America, which comprises as many as ten universally recognized close-knit branches: Tupí-Guaraní, Awetí, Sateré-Mawé, Mundurukú, Juruna, Mondé, Tuparian, Arikém, Karo (= Ramarama), and Puruborá. Of these, four consist of only one language (Awetí, Sateré-Mawé, Karo, Puruborá), while Mundurukú, Juruna, and Arikém are similar in that each of them comprises one living language (Mundurukú, Yudjá, and Karitiana, respectively) and one extinct (or dormant), less well-attested language (Kuruaya, Xipaya, and Arikém, respectively); the Juruna branch also includes a third extinct language, Manitsawá, of which very little is known.

It is now universally accepted that Tupí-Guaraní, Awetí, and Sateré-Mawé are particularly closely related to each other, thus constituting a node within Tupian; furthermore, Tupí-Guaraní and Awetí share some innovations that did not affect Sateré-Mawé, suggesting an early split of the latter language off Proto-Mawé-Awetí-Tupí-Guaraní (Rodrigues, 1984-1985, p. 35; Rodrigues & Dietrich, 1997; Corrêa da Silva, 2010; Meira & Drude, 2015). In the remainder of this paper, we will label the node that comprises Tupí-Guaraní and Awetí ‘Awetí-Guaraní’, and the node that comprises Awetí-Guaraní and Sateré-Mawé will be accordingly called ‘Mawé-Guaraní’. Another proposal related to the internal classification of Tupian advances a hypothesis according to which Puruborá and Karo would form a subgroup (‘Ramarama–Puruborá’, see Rodrigues, 2007, p. 168, fn. 2; Galucio & Gabas Jr., 2002).

In this article, we critically assess the only reconstructive proposal that has ever been put forward for the PT segmental inventory (cf. Rodrigues, 1999, 2005, 2007), focusing on the vowel system as reconstructed in Rodrigues (2005). Rodrigues’ (2005) proposal is presented as established common knowledge in later reference works on the Tupian family (see e.g. Rodrigues & Cabral, 2012). However, we will argue that although the author’s cognacy judgments are, for the most part, accurate, his analysis of the attested correspondence sets is problematic and, for this reason, in need of revision. Specific problems which we attempt to resolve in this contribution include the sound correspondence in the pan-Tupian word for ‘cultivated field’ (cf. Sateré-Mawé ŋo/ko, Awetí ko, Mundurukú kə, Yudjá kúá, but Makurap and Wayoró ŋge, Karitiana ŋga), which is considered irregular by Rodrigues (2005, p. 40), as well as the reason why the vowel *o in Rodrigues’ (1999) reconstruction occurs predominantly after labial consonants (cf. Rodrigues, 1999, pp. 110–111).

After a brief presentation of the relevant aspects of Rodrigues’ (2005) PT reconstruction (“Earlier scholarship”), we discuss some vowel correspondences that are especially troublesome in Rodrigues’ (2005) framework, and put forward an alternative account of the evolution of the vowel system of PT (sections “PT *ə” and “PT *ɯ vs. *ɨ”). The scenario we propose requires positing some non-trivial innovations shared by specific groups of Tupian, as discussed in “Implications for the subgrouping of Tupian”. After the “Conclusions”, the “List of abbreviations”, and the “References”, we include two appendices. Appendix 1 contains all the cognate sets discussed in this contribution, whereas Appendix 2 summarizes our reconstruction of Proto-Tupian consonants.

NOTATION

Before proceeding to the bulk of the discussion, we make explicit our conventions regarding the representation of linguistic data. In order to warrant the comparability of the data, which are extracted from a variety of sources on multiple Tupian languages, we have unified the representation of all cited forms according to the following principles.

First of all, the flap ɾ is represented as r throughout this paper, and the open-mid vowels ɔ and ɛ as o and e, respectively, as is in fact commonly done in Tupian linguistics. An exception is made for Kuruaya, where /ɔ/ and /o/ contrast and are thus represented as ɔ and o. The vowels of the extinct language Arikém, as well as of its direct ancestor Proto-Arikém, are represented as /i e æ ɒ ʉ/. The choice of the less common characters /æ ɒ ʉ/ is suggested by the variable representation of these vowels as ‹a› ~ ‹e›, ‹o› ~ ‹ḁ› ~ ‹a›, ‹u› ~ ‹u̥› ~ ‹i› in our only sources on the language (Rondon & Faria, 1948; Nimuendajú, 1932).

In most Tupian languages, consonants in the coda position do not contrast for features other than the place of articulation. This is true for Sateré-Mawé, Awetí, Proto-Tupí-Guaraní (and most contemporary Tupí-Guaraní languages), Proto-Tuparian (and all contemporary Tuparian languages), Proto-Arikém, and Puruborá. We write such codas in small caps: p (labial), t (dental/alveolar), c (palatal), k (velar). Their precise surface realizations vary depending on the language, on the point of articulation, and on the phonological environment, and include [p̚ m β] for p, [t̚ n ɾ] for t, [(ʲ)c̚ ɲ j] for c, [k̚ ŋ ɰ g] for k.

The phonetic inventory of many Tupian languages has two-phase stops that start out with a lowered velum, which is raised during the occlusion. These are variably analyzed as postoralized allophones of underlying nasals or prenasalized allophones of underlying voiced stops (cf. Wetzels & Nevins, 2018), or (more rarely) as underlying prenasalized stops (cf. González, 2008). In this paper, such segments are always spelt as mb, nd, nʤ, ŋg, regardless of their phonological status in a given language.

Nasal vowels are always explicitly marked with a tilde, even if our sources leave nasality unmarked in some environments (usually following or preceding nasal consonants).

In languages where [s] and [ts] are free, idiolectal, dialectal, or chronological variants of one phoneme, we have normalized the data from our sources in order to warrant consistency across examples (e.g., only s is used in Tuparí, and only ts in Sakurabiat), following the current transcription practices in the most recent expert source in each case.

Proto-Mawé-Awetí-Tupí-Guaraní (= Proto-Mawetí-Guaraní = PMG) reconstructions are mostly those by Meira and Drude (2015), with the following modifications. Meira and Drude’s (2015) *tʲ is rewritten as *c, except when it follows an *i or a *j in their reconstruction, in which case we employ an ad hoc character *ć in order to capture the fact that the reflexes of *ć in all daughter languages are different from those of *c (this also allows us to give up the reconstruction of ‘phantom’ instances of PMG *i and *j, which in Meira and Drude’s (2015) proposal are hypothesized to have been lost in all daughter languages). For example, Meira and Drude’s (2015) reconstructions such as *itʲet ‘his/her name’ and *tʲajtʲu ‘armadillo’ are replaced with *ćet and *caću, given that no daughter language preserves any segmental trace of the alleged PMG segments *i and *j (Sateré-Mawé het, sahu; Awetí tet, tatu[pep]; PTG *tset, *tatu). We also posit PMG *tʲ alongside *c and *ć in order to account for the sound correspondence involving Sateré-Mawé t (in unstressed syllables) / ɾ(j) (in stressed syllables), Awetí ʐ, and PTG *t, found in the stem for ‘fire’ (reconstructed as *atia, *atja in Meira & Drude, 2015; *(c)atʲa in our proposal) as well as before *i (in complementary distribution with *t)1. We also diverge from Meira and Drude (2015) in reconstructing *w instead of their *kʷ, whereas the segment they reconstruct as *w is considered to have been independently epenthesized in the environment *u_V after the split of PMG (that way, Meira & Drude’s (2015) reconstructions such as *tʲuwaj ‘tail’ and *tʲuwɨ(k) ‘blood’ are replaced here with *cuac, *cuɨ > Sateré-Mawé suwac-po, suː; Awetí -uwac, -uwɨ[k]; PTG *tuwac/*-ruwac, *tuwɨ/*-ruwɨ)2. We accept Schleicher’s (1998, pp. 18–24) suggestion, reinforced by Meira and Drude (2015, pp. 278–279), whereby only one PTG affricate is reconstructed instead of the traditional *c (*ts) and *č (*ʧ); we symbolize it as *ts3.
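The rewriting of *tʲ described above can be sketched mechanically. The Python snippet below is only an illustration under simplifying assumptions: the function name `rewrite_tj` is hypothetical, and the blanket rule that every *i or *j immediately preceding *tʲ is a ‘phantom’ segment is a simplification made for the sake of the example. The snippet does not cover the separately posited PMG *tʲ (as in ‘fire’), the treatment of *w and *kʷ, or the coda notation.

```python
import re


def rewrite_tj(form: str) -> str:
    """Rewrite a Meira & Drude-style *tʲ as *c or *ć (illustrative only)."""
    # Assumption: an *i or *j directly before *tʲ is a 'phantom' segment;
    # it is dropped and the sequence is written with the ad hoc character ć.
    form = re.sub("[ij]tʲ", "ć", form)
    # Every remaining *tʲ is written as plain c.
    return form.replace("tʲ", "c")


# The two reconstructions cited in the text.
for old in ("itʲet", "tʲajtʲu"):
    print(f"*{old} -> *{rewrite_tj(old)}")
```

Applied to the two forms cited in the text, the sketch returns *ćet and *caću.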

Proto-Mundurukú reconstructions are taken from Picanço (2019). For relational stems, whose leftmost consonant often alternates depending on the left context, Picanço (2019) lists all possible allomorphs. In this article, only the allomorphs with *p-, *ð-, and *ʧ- (rather than *b-, *t-, *ɟ-) are given for such stems. We also omit the hyphen, used by Picanço (2019) in order to indicate that the relational stems are bound.

For Proto-Juruna and Proto-Tuparian, a variety of proposals exist; the reconstructions in this article are mostly extracted from the more recent ones (Carvalho, 2019 for Proto-Juruna; Nikulin & Andrade, 2020 for Proto-Tuparian) or adapted from earlier proposals (Fargetti & Rodrigues, 2008 for Proto-Juruna; Moore & Galucio, 1994; Nogueira et al., 2019 for Proto-Tuparian) so as to match the phonological reconstruction of the most recent works.

The reconstruction of the Proto-Tupian consonants adopted here differs from previous proposals and is based on the correspondence sets summarized in Appendix 2. For reasons of space, it is impossible to discuss this problem in detail in this contribution. Note, however, that nothing in our reconstruction of the Proto-Tupian vowels hinges on our interpretation of the PT consonants, and the validity of our proposal would remain intact if one adopted a different interpretation (such as that of Rodrigues, 2007).

The acute accent symbolizes the high tone in tonal languages (including the languages of the Mundurukú, Juruna, and Mondé branches, Karo, and maybe also Makurap); the low tone is left unmarked. Tones other than high and low are found only in the Mondé languages, and the transcription of our sources is retained in such cases. In the Tuparian languages Tuparí and Akuntsú, contrastive stress has been described, which is also symbolized by means of an acute accent (when its position is known).

Data quoted from premodern sources, which are not expected to faithfully represent all the relevant phonological oppositions, are given ‘verbatim’ enclosed in chevrons. Subscript letters after such forms indicate the ultimate source of the data: ‹›_S and ‹›_L refer to Emilie Snethlage’s and Lopes’ data on Kuruaya (Snethlage, 1932); ‹›_B and ‹›_N refer to Barbosa’s and Nimuendajú’s data on Arikém (Rondon & Faria, 1948; Nimuendajú, 1932). Forms followed by ‹›_KG come from Koch-Grünberg (1932) on Puruborá, and those with ‹›_ES from Snethlage (1934), again on Puruborá. Forms in chevrons without subscript letters are from Steinen (1886) for Manitsawá, Nimuendajú (1923–1924, 1928, 1929) for Xipaya, and Sekelj (1948) for Aruá and Makurap.

Much of the discussion in this paper is based on analyzing cognate sets. In some cases, a given form is not synchronically segmentable, but only a part of it is cognate with the material of other languages. The part which is deemed non-cognate is then given in brackets. In premodern attestations (enclosed in chevrons), the cognate part is given in boldface.

EARLIER SCHOLARSHIP

The vocalic system of Proto-Tupian has been reconstructed by Rodrigues and Dietrich (1997, p. 268) and Rodrigues (1999, p. 110, 2005) as comprising six vowel qualities (*a, *ɨ, *o, *u, *e, *i), each of which would also have a nasal counterpart (*ã, *ɨ̃, *õ, *ũ, *ẽ, *ĩ; see Rodrigues & Cabral, 2012, p. 502). In Table 1, we list the reflexes of the Proto-Tupian oral vowels according to the proposal by Rodrigues (2005).

Table 1
The reflexes of PT oral vowels (after Rodrigues, 2005, p. 37).

It can be easily seen from the table above that, according to Rodrigues (2005), the evolution of Proto-Tupian *e and *ɨ in the constituent families involved some phoneme splits. In the case of PT *e, Rodrigues (2005) posits a split in the so-called ‘Eastern’ Tupian languages (Mawé-Awetí-Tupí-Guaraní, Mundurukú, and Juruna) allegedly conditioned by an adjacent consonant. Examples 1-2 illustrate the default development of PT *e, whereas 3 instantiates the development of PT *e affected, according to Rodrigues’ (2005, p. 40) proposal, by a following labialized stop (the labialization is reconstructed here exclusively in order to account for what are thought to be the divergent reflexes of PT *e)4.

For *ɨ, Rodrigues (2005, pp. 40–41) posits a split in Juruna and in the so-called ‘Western’ Tupian languages of the Arikém, Tuparian, Karo (Ramarama), and Puruborá groups. This time, however, he does not identify a phonological environment which could have conditioned the alleged split (beyond a generic reference to the ‘immediate consonantal context’), nor is he explicit about whether the alleged split proceeded in the same way in all the aforementioned languages. Examples 4-6 illustrate this.

In the subsequent sections, we will argue against the proposal by Rodrigues (2005), suggesting instead that the observed sound correspondences are best accounted for by reconstructing a phonemic inventory of seven (rather than six) vowel qualities for Proto-Tupian and positing a number of mergers in the daughter languages, in addition to one conditioned split. That way, the examples in 1-6 are reconstructed in our proposal as *kʲet ‘to sleep’, *mẽt ‘husband’, *jəp ‘leaf’, *ḳɯp ‘tree; stick-like’, *mbɨ/*pɨ ‘foot’, *pətɨc ‘heavy’. Tables 2 and 3 show the oral vowel inventories of Proto-Tupian in Rodrigues’ (2005) and our proposals, respectively.

Table 2
PT vowel inventory in Rodrigues’ (2005) proposal.

Table 3
PT vowel inventory in our proposal.

PT *ə

This section deals with the reconstruction of a vowel we chose to represent as *ə. We start by stating its proposed reflexes in the daughter branches and listing the relevant cognate sets. In subsequent sections, we discuss how our findings relate to Rodrigues’ (2005) reconstruction of the PT vowels and consonants. We conclude that the recognition of *ə as a contrastive unit makes it possible to reduce the phonological inventory of Proto-Tupian by three phonemes (*pʷ, *kʷ, *kʷˀ), to account for the sound correspondences in a number of cognate sets which are unexplainable in Rodrigues’ (2005) proposal, and to explain the limited distribution of the sound correspondence which underlies Rodrigues’ (2005) reconstruction of PT *o (which occurs exclusively following labial consonants). Our proposal also entails that the vowels traditionally reconstructed as *o and *u should be reinterpreted as PT *ə and *o, respectively.

PROPOSAL

The vowel we reconstruct as PT *ə has evolved in the following way in the daughter languages5. In the Mawé-Guaraní branch, it has acquired rounding and changed to PMG *o (in fact, in our proposal PT *ə is the only source of PMG *o). In addition, it has been raised to *u before a vowel (as in ‘blood’, ‘sun’) or before a glottal stop and a vowel (as in ‘arrow’). In the Mundurukú branch, PT *ə is regularly reflected as Mu ə, Ku ɨ (< PMu *ɨ in Picanço’s (2019) reconstruction) and is the only source of this PMu vowel. Before a pre-PMu vowel, however, PT *ə is reflected as PMu *o (‘blood’, ‘arrow’, ‘sun’, provided that PMu *ðoj and *op/*ðop go back to pre-PMu *ðoi, *oip/*ðoip). The sequence *mə̃ is reflected as Mu mə̃, Ku mã (< PMu *mã in Picanço’s (2019) reconstruction). In the Juruna branch, one finds PJu *a except next to a labial consonant in the final syllable of the PT stem (‘leaf’, ‘snake’, ‘hand’, ‘wing’; exception: ‘what’) or before a vowel (‘sun’), where the regular reflex is *u. In the Tuparian branch, PT *ə is fronted to PTpr *e except after labial consonants, in which case the regular reflex is PTpr *o (‘to return’, ‘hand’, ‘wing/feather’, ‘heavy’, ‘causative’). Likewise, in the Arikém branch PT *ə is reflected as a low vowel (Kt a, Ari æ) except after labial consonants, in which case one finds Kt ɨ, Ari ʉ (‘hand’, ‘vine’, ‘wing/feather’, ‘heavy’). Puruborá retains PT *ə as ə (or ə̃, if nasality is present). In the Mondé languages, PT *ə is reflected as a except after coronal consonants and word-initially, in which case e is found (‘leaf’, ‘larva’, ‘house/village’).
Only in Karo do the reflexes seem chaotic at present: one finds ɨ in two examples (‘cylindrical and small’, ‘to hold’), i in two examples (‘heavy’, ‘to go’, in both cases next to a *t), o in two examples (‘leaf’, ‘third person coreferential’), a in two prefixes (‘causative’, ‘sociative causative’), as well as the following vowels in one example each: ə (‘to go, to walk’), ə̃ (‘snake’), and u (‘blood’, a likely result of contraction of the unique vowel sequence *əɯ).
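The conditioned developments listed above can be summarized as a small lookup routine. The sketch below is illustrative only: the names `Env` and `schwa_reflex` are hypothetical, the environment flags have to be set by hand for each etymon, and the nasal sequence *mə̃ in Mundurukú as well as the exception ‘what’ in Juruna are deliberately left out. Karo returns `None` because no regular pattern has been identified there.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Env:
    """Phonological environment of PT *ə in a given etymon (set by hand)."""
    after_labial: bool = False
    after_coronal: bool = False
    word_initial: bool = False
    before_vowel: bool = False           # directly before a vowel
    before_glottal_vowel: bool = False   # before ʔ plus a vowel
    next_to_labial_final_syll: bool = False  # next to a labial, final syllable of the stem


def schwa_reflex(branch: str, env: Env = Env()) -> Optional[str]:
    """Return the reflex of PT *ə proposed in the text for one branch."""
    if branch == "Mawé-Guaraní":
        raised = env.before_vowel or env.before_glottal_vowel
        return "PMG *u" if raised else "PMG *o"
    if branch == "Mundurukú":
        # The vowel context is to be evaluated at the pre-PMu stage.
        return "PMu *o" if env.before_vowel else "PMu *ɨ"
    if branch == "Juruna":
        rounded = env.next_to_labial_final_syll or env.before_vowel
        return "PJu *u" if rounded else "PJu *a"
    if branch == "Tuparian":
        return "PTpr *o" if env.after_labial else "PTpr *e"
    if branch == "Arikém":
        return "Kt ɨ, Ari ʉ" if env.after_labial else "Kt a, Ari æ"
    if branch == "Puruborá":
        return "Pu ə"
    if branch == "Mondé":
        fronted = env.after_coronal or env.word_initial
        return "Mo e" if fronted else "Mo a"
    if branch == "Karo":
        return None  # reflexes currently unexplained
    raise ValueError(f"unknown branch: {branch}")


# 'heavy' has *ə after a labial consonant.
print(schwa_reflex("Tuparian", Env(after_labial=True)))
print(schwa_reflex("Arikém", Env(after_labial=True)))
```

Keeping the environments as explicit flags, rather than deriving them from a transcription, avoids committing the sketch to any particular segmentation of the reconstructed forms.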

ADVANTAGES WITH RESPECT TO RODRIGUES (2005)

In what follows, we discuss four correspondence sets derived by Rodrigues (2005) from three Proto-Tupian vowels. Two of them, identified by Rodrigues (2005) with PT *e and *u, show no overlap at all; these correspondence sets appear as (a) and (d), respectively, in Table 4 below. The remaining two correspondences, given as (b) and (c) in Table 4, are attributed in our account to PT *ə, yet an entirely different account is proposed by Rodrigues (2005). The correspondence (c) has the same reflexes as (d) in Karitiana and Tuparí, but other languages show distinct reflexes, which are typically lower than those of (d); it is associated by Rodrigues (2005) with PT *o. The correspondence (b) shows significant overlaps with (a) and (c): in Tupí-Guaraní, Awetí, Sateré-Mawé, Mundurukú, and Yudjá, the observed reflexes are identical to those of (c), whereas in all other branches the correspondence set in question coincides completely with (a) in Rodrigues’ (2005) account.

Table 4
Four correspondence sets for Tupian vowels (adapted from Rodrigues, 2005, p. 37).

According to Rodrigues (2005, pp. 40, 42, 2007, pp. 175–176, 181–182, 186), the overlapping pattern that involves the correspondence (b), which coincides with (c) in Tupí-Guaraní, Awetí, Sateré-Mawé, Mundurukú, and Yudjá, but with (a) in the remaining branches, can be explained by reconstructing a secondary rounding feature for the consonant that immediately follows the vowel (the available options in Rodrigues’ (2005) reconstruction are *pʷ, *kʷ, *kʷˀ). In the proto-language of Mawé-Guaraní, Mundurukú, and Yudjá (‘Eastern’ Tupian languages in Rodrigues’ (2005) terms), this contextual factor would have induced the merger of the correspondence set in (b) with the *o series, Yudjá being later subject to *o > a and Mundurukú undergoing *o > ə. The remaining branches (that is, Arikém, Tuparian, Mondé, Ramarama, and Puruborá) would not have been subject to any contextual coloring and show reflexes identical to those of PT *e, which leads Rodrigues (2005, 2007) to reconstruct *e for the correspondence set in question and to posit a conditioned split in his ‘Eastern’ languages. In sum, Rodrigues’ (2005) proposal for the PT segments whose reflexes appear in the correspondences in Table 4 above is as follows: *e for the correspondences (a) and (b), *o for (c), and *u for (d). Moreover, the context-dependent merger of the series (b) and (c) in Mawé-Guaraní, Mundurukú, and Juruna is attributed to the influence of a secondary rounding feature hosted on the following consonant.

Under closer scrutiny, however, it appears that the available evidence does not support either the identification of the correspondence (b) as a context-dependent offshoot of (a) or the reconstruction of a labialized stop series for Proto-Tupian. Although we concur with Rodrigues (2005) in reconstructing PT *e for the correspondence (a), we disagree with his diachronic interpretation of the remaining three correspondences in that:

we consider (b) to be the default development of PT *ə (rather than a positional development of *e);
we consider (c) to be a positional development of PT *ə (rather than the default reflex of a PT phoneme of its own, symbolized as *o by Rodrigues, 2005);
we derive (d) from PT *o, which in our account is the only rounded vowel of PT (as opposed to Rodrigues’ (2005) reconstruction, whereby PT had both *u and *o).
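The two competing interpretations of the correspondence sets (a) to (d) can be laid side by side in a small data structure. The sketch below merely restates the assignments given in the text; the dictionary and function names are hypothetical, and the conditioning notes are abbreviated paraphrases.

```python
# Proto-vowel assigned to each correspondence set, with a conditioning note.
RODRIGUES_2005 = {
    "a": ("*e", "default"),
    "b": ("*e", "before a labialized consonant in 'Eastern' Tupian"),
    "c": ("*o", "default"),
    "d": ("*u", "default"),
}
PRESENT_PROPOSAL = {
    "a": ("*e", "default"),
    "b": ("*ə", "default"),
    "c": ("*ə", "positional development"),
    "d": ("*o", "default"),
}


def proto_vowels(analysis):
    """Set of proto-vowels an analysis needs for the four correspondence sets."""
    return {vowel for vowel, _note in analysis.values()}


def reassigned_sets(old, new):
    """Correspondence sets whose proto-vowel differs between two analyses."""
    return [label for label in sorted(old) if old[label][0] != new[label][0]]


print(proto_vowels(RODRIGUES_2005))
print(proto_vowels(PRESENT_PROPOSAL))
print(reassigned_sets(RODRIGUES_2005, PRESENT_PROPOSAL))
```

Both analyses need three proto-vowels for the four sets, but only the earlier one additionally requires labialized consonants to condition the split of *e.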

Rodrigues’ (2005) account is seriously undermined by the following facts.

1. First of all, the consonants reconstructed as labialized by Rodrigues (2005) in etyma that instantiate the correspondence set (b) appear not to have reflexes distinct from those of non-labialized consonants.
2. Furthermore, the correspondence in (b) may also occur morpheme-finally or morpheme-internally before vowels, making it impossible to attribute the emergence of the correspondence to a following consonantal segment.
3. The correspondence set in (c) may be explained away as a conditioned offshoot of (b).
4. The reflexes listed by Rodrigues (2005) for (b) and (c) in Suruí-Paiter, Karo, and Puruborá are partially based on non-cognate material and are thus incorrect.
5. Finally, there is typological evidence that renders Rodrigues’ (2005) hypothesis implausible.

Each of these five points is discussed in the subsequent sections.

PURPORTED LABIALIZED CONSONANTS HAVE THE SAME REFLEXES AS PLAIN CONSONANTS

Let us consider the reflexes of the PT segments that Rodrigues (2007) reconstructs as labialized consonants. As will become clear, their reflexes do not differ from those of their plain (non-labialized) counterparts, and the only reason for positing such phonemes in Rodrigues’ (2007) reconstruction is to account for the correspondence set (b). Once it is recognized that the (b) series does not result from a conditioned split of *e, it is no longer necessary to reconstruct labialized consonants for Proto-Tupian.

We will start by examining the occurrences of *pʷ that are supposed to account for the alleged rounding of PT *e in ‘Eastern’ Tupian. Rodrigues (2007) reconstructs it for two roots, *epʷ ‘leaf’ and *epʷa ‘face’ (as well as in its derivative *epʷa-pokˀ ‘to appear’). In the former case, all Tupian branches have a reflex with a plain labial stop p (in some languages, which lack an opposition between oral and nasal codas, it is symbolized as P). In most Tupí-Guaraní languages, as well as in the Arikém language before the suffix -ɒ, the stop is lenited to β or a similar sound (cf. Schleicher, 1998, pp. 29–32; Storto & Baldi, 1994); this development regularly targets word-final stops of any origin in these languages8. In our reconstruction, the vowel correspondence between PMG *o, PMu *ɨ, PTpr *e, Kt a, Ari æ, Pu ə, and Mo e is derived from PT *ə, whereas the correspondence between the word-final consonants straightforwardly continues PT *-p. That way, there is no need to reconstruct a labialized stop for ‘leaf’ in our proposal.

An identical rhyme is found in the word for ‘bitter’. Rodrigues (2007, p. 196) reconstructs its PT etymon as *rʲop and lists its reflexes in Sateré-Mawé, Awetí, PTG, and Mundurukú. Had he considered the Tuparí and Karitiana cognates, he would have likely reconstructed *rʲepʷ.

In Rodrigues’ (2007) account, *pʷ is claimed to have a divergent reflex between vowels in the languages of the Mawé-Guaraní (PT *pʷ > PMG *β, as opposed to PT *p > PMG *p) and Mundurukú (PT *pʷ > PMu *p, as opposed to PT *p > PMu *b) branches. Rodrigues (2007) gives only two cognate sets that instantiate the sequence *epʷ: PT *epʷa ‘face’ and *epʷapokˀ ‘to appear’; Corrêa da Silva (2010, p. 128) claims the latter to be a derivative of the former. Rodrigues (2007, p. 186) lists the following reconstructions and reflexes (quoted verbatim).

Note, however, that even within Rodrigues’ own framework the proposed etymology for ‘face’ presents serious irregularities. In his account, PT *e before a labialized consonant would be expected to yield Mw o (rather than e), Mu ə (rather than o), Ku ɨ (rather than o — note that u in the datum cited by Rodrigues (2007) is a phonetic variant of /o/), and Kt a (rather than ɨ). We surmise that in this case Rodrigues (2007) has failed to distinguish between two unrelated cognate sets, which we derive from PT *jəβa ‘forehead’ and *jopʔa ‘face’ (in addition, Mw -ewa, or sewa in our notation, appears to be unrelated to either etymon).

PT *β in *jəβa ‘forehead’ is reconstructed based on the correspondence PMG *β ~ PTpr *β, otherwise found in the cognate set for ‘wind’, PMG *ɨβɨću ~ PTpr *ɨβijo (cf. Nikulin & Andrade, 2020, p. 292). In *jopʔa, the consonant cluster is reconstructed in order to account for the voiceless intervocalic stop *p in PMu, also found in etymologies such as PMu *óropo ~ PTpr *oropʔo ~ PMG *uruβu ‘vulture’9. In our account, the correspondence between PMG *o and PTpr *e need not be conditioned by any feature hosted on the following consonant.

As for the cognate set for ‘to appear’, we believe that PTG *oβapo should be excluded from it (no other examples are known where a velar coda in Mundurukú or Tuparian would correspond to zero in PTG)10. Moreover, the Mundurukú cognate does not actually contain an initial vowel (the root is papə́k ‘to be visible’, with the allomorph bapə́k occurring after vowels; cf. Picanço, 2005, p. 17). That way, this cognate set does not instantiate the vowel correspondence which interests us in this section, nor does it back up the reconstruction of *pʷ.

Now we turn to the purported PT *kʷ. According to Rodrigues (2007), this phoneme induces contextual coloring in the preceding vowel in the following cognate sets.

To the best of our knowledge, no PT reconstructions for the following cognate sets are proposed in the works by Rodrigues (2007). He would probably reconstruct *tˀakˀekʷ ‘army ant’ and *kekʷ ‘to hold’.

As can be seen in these examples, the putative PT *kʷ does not have labialized reflexes in any Tupian branch. In fact, in all cases it is reflected precisely in the same way as PT *k (represented as *k in the coda position). The only example which, according to Rodrigues (2007, p. 182; 2008, p. 6), instantiates a labialized reflex of PT *kʷ in a Tupian language is PT *ekʷ-at ‘plaza’ > Xi koað-á, Sk ekʷat, Tu ekoat-pe ‘area around the house’11. We reconstruct the etymon in question as PT *ək-at and reject the inclusion of the cited words in this cognate set. First of all, the ultimate sources of the Xipaya and Sakurabiat words provide glosses which are quite distant from ‘(village) plaza’: ‹ku̥aẓá› ‘village of foreigners’ (Nimuendajú, 1928, p. 827), ‹hekʷat› ‘field’ (Hanke et al., 1958, p. 205). Second, Sakurabiat kʷ does not regularly correspond to Tuparí ko (cf. Nikulin & Andrade, 2020).

The third purported labialized stop of Proto-Tupian is reconstructed by Rodrigues (2007) as *kʷˀ for one single etymon.

Rodrigues (2007, p. 186) states explicitly that *kʷˀ is reflected in the daughter languages precisely in the same way as *kˀ (we prefer to symbolize the segment in question with the ad hoc character *ḳ), and the labialization in the etymology for ‘arrow’ is reconstructed only in order to account for the correspondence between a front vowel in the Tuparian languages and rounded vowels in the ‘Eastern’ Tupian languages. Also note that the PT stem for ‘arrow’ almost certainly contains the formative for tree- or stick-like objects *-ḳɯp (*-kˀɨp in Rodrigues’ (2007) reconstruction), with reflexes in all Tupian languages, none of which shows any trace of labialization. In this sense, our proposal is superior to Rodrigues’ (2007) in that no need arises to reconstruct an extra consonant found in only one stem.

THE CORRESPONDENCE SET (B) BEFORE VOWELS OR PAUSE

In this subsection, we examine the instances of the correspondence set (b) in environments in which no consonant follows the vowel in question. These receive no explanation in the framework of Rodrigues (2007): his proposal attributes the emergence of the series (b) to the rounding conditioned by a following labialized consonant, whereas in the etyma under consideration there is no consonant which could have triggered the alleged conditioned rounding. The relevant data are provided below.

Regarding the etymology for ‘field’, Rodrigues (2005, p. 40) explicitly notes that the vowel of the Karitiana reflex (Kt ŋga /ŋa/) corresponds to that of the Makurap (Tuparian) reflex (Ma ŋge /ŋe/), but is not the regular outcome of PT *o, reconstructed by Rodrigues (2005) based on the reflexes in other branches of Tupian12. Rodrigues (2005, pp. 39-40) also notes that his proposal fails to account for the Yudjá reflex (Yu kú-á), as PT *o in his proposal is supposed to be reflected as a in the Juruna branch; he tentatively suggests that the Yudjá word is a loan from Língua Geral Amazônica, a Tupí-Guaraní language. In our account, none of these problems arises, as we hypothesize that PT *ə regularly yields Kt a, Ma e, and (before vowels) Yu u.

ROUNDING OF PT *ə IN TUPARÍ AND ARIKÉM

Rodrigues (1999, pp. 110-111) notes that PT *o, as reconstructed by him, occurs almost exclusively after labial consonants, and tentatively suggests that this vowel arose as a positional variant of PT *u. We argue that the correspondence set which Rodrigues (1999, 2005) equates with PT *o is indeed secondary: it may be explained away as an offshoot of another PT vowel in the environment C_[+labial](ʔ)_, that is, after a labial consonant optionally followed by a glottal stop. However, our account differs from Rodrigues’ (1999) suggestion in that the correspondence set in question is derived from PT *ə (rather than from another rounded vowel) and in that the proposed conditioned split is attributed to a shallower level than Proto-Tupian: its effect is clearly visible in only two Tupian branches, Tuparian and Arikém. Instead of yielding PTpr *e, Kt a, Ari æ, one finds PTpr *o, Kt ɨ, Ari ʉ in the aforementioned context. The following examples illustrate.

After non-labial consonants or word-initially, the reflexes are PTpr *e, Kt a, Ari æ.

Not a single exception has been identified.
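The conditioned split described in this subsection can be restated as an explicit rule. The Python fragment below is a minimal sketch for illustration only: the function name `schwa_reflex`, the representation of the preceding context as a list of segments, and the set `LABIALS` are assumptions made for the sketch (in particular, `LABIALS` is a placeholder and not the reconstructed PT consonant inventory). Only the reflexes themselves are taken from the text.

```python
# Placeholder inventory of labial consonants (an assumption for this sketch).
LABIALS = {"p", "mb", "m", "w"}

# Reflexes of PT *ə as stated in the text.
DEFAULT = {"PTpr": "e", "Kt": "a", "Ari": "æ"}
AFTER_LABIAL = {"PTpr": "o", "Kt": "ɨ", "Ari": "ʉ"}


def schwa_reflex(preceding, branch):
    """Return the reflex of PT *ə in PTpr, Kt, or Ari.

    `preceding` lists the segments immediately before the vowel,
    e.g. ["p"] or ["p", "ʔ"]; an empty list means word-initial position.
    """
    context = list(preceding)
    # An intervening glottal stop is transparent: C[+labial](ʔ)_
    if context and context[-1] == "ʔ":
        context.pop()
    if context and context[-1] in LABIALS:
        return AFTER_LABIAL[branch]
    # After non-labial consonants or word-initially.
    return DEFAULT[branch]
```

For instance, `schwa_reflex(["p"], "Kt")` gives the rounded-context outcome ɨ, while `schwa_reflex([], "Ari")` gives the default æ.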

REFLEXES OF (B) AND (C) IN KARO, PURUBORÁ, AND MONDÉ

According to Rodrigues (2005), the correspondence sets (b) and (c) have different reflexes not only in Tuparí and Karitiana (as we have shown in the previous section, they are in fact in complementary distribution in these languages), but also in Karo, Puruborá, and Mondé (represented by Suruí-Paiter in Rodrigues’ (2005) study). The (b) series (derived from PT *e before a labialized consonant in Rodrigues’ (2005) proposal) is supposed to be reflected as e in Karo, Puruborá, and Suruí-Paiter, whereas the correspondence set (c) (< PT *o according to Rodrigues, 2005) is expected to yield a in the three languages.

In reality, however, the Karo and Puruborá reflexes listed by Rodrigues (2005, pp. 39-40) for the correspondence sets (b) and (c) turn out to be nonexistent, and the Mondé reflexes may be equally accounted for if one accepts our reconstruction of PT *ə. Consider the following etymologies (R = Rodrigues, 2007):

The reflex e, listed by Rodrigues (2005) for Karo and Puruborá, simply does not occur in the available data. In Karo, one finds o, ɨ, i, ə, ə̃, a, and u (with no obvious distribution), and in Puruborá, ə is found in all cases. Currently we have no explanation for the reflexes in Karo (but note that Rodrigues’ proposal also fails to account for them).

In the Mondé languages, one does indeed find a and e in accordance with Rodrigues’ (2005) predictions (PT *e > e; PT *o > a). However, it is also possible to account for the Mondé reflexes if one recognizes that the etyma of all the aforementioned cognate sets contained one and the same vowel, PT *ə. Note that all instances of e occur following a coronal consonant (‘leaf’, ‘larva’) or word-initially (‘house/village’), whereas all instances of a (‘hand’, ‘snake’, ‘cultivated field’, ‘sun’, ‘heavy’) occur following a peripheral (labial or velar) consonant. We propose, therefore, that PT *ə was fronted to Proto-Mondé *e following coronal consonants or word-initially and yielded Proto-Mondé *a elsewhere, and that the Mondé languages lend no support to the Proto-Tupian age of the distinction between the correspondence sets (b) and (c). We parenthetically note that the fronting of the type *ə > e following coronal consonants is also known from the history of Djeoromitxí, a Macro-Jê language of the Jabutian branch (Voort, 2007, p. 147), which is, like the Mondé languages, spoken in the Rondonian East. Typologically, the functioning of labial and velar consonants as a natural class in processes triggering vowel backing (as in *ə > a) is amply documented (Hyman, 1973; Vago, 1976)13.
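The proposed Mondé development can be written down as a two-way conditioned rule. The following is a minimal sketch under stated assumptions: the function name `monde_reflex`, the use of `None` for word-initial position, and the set `CORONALS` are illustrative choices, and `CORONALS` is a placeholder rather than a reconstructed inventory.

```python
# Placeholder set of coronal consonants (an assumption for this sketch).
CORONALS = {"t", "nd", "n", "c"}


def monde_reflex(preceding):
    """Return the Proto-Mondé reflex of PT *ə.

    `preceding` is the consonant immediately before the vowel,
    or None if the vowel is word-initial.
    """
    # Fronting after coronals or word-initially.
    if preceding is None or preceding in CORONALS:
        return "e"
    # Elsewhere, i.e. after peripheral (labial or velar) consonants.
    return "a"
```

Under these assumptions, a preceding t or a word-initial position yields e, whereas a preceding p or k yields a, which is the distribution described above.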

GENERAL PHONETIC CONSIDERATIONS

The development PT *eCʷ > *oC, posited by Rodrigues (2005), inter alia for his ‘Eastern’ Tupian languages, conjoined with the reconstruction of consonants with labial off-glides (the factor that accounts for these environmentally restricted vocalic outcomes), raises two issues regarding the phonetic plausibility of historical reconstructions: one related to the directionality of the presumed coloring effect of the PT consonants, the other related to the distribution of the labialized consonants. First of all, a baffling aspect of the labializing effect exerted by these consonants is that it always affects the preceding, not the following vowel: a pre-vocalic labialized stop has no effect on a following vowel. From a phonetic point of view this is extremely counterintuitive. If labialization, or, more precisely, a labial release feature, is to play the role of a contrastive feature distinguishing between these consonants (i.e. *pʷ and *kʷ) and their plain counterparts (i.e. *p and *k), one would expect its ‘coloring’ influence upon adjacent vowels to be realized more strongly (if not exclusively) on a following rather than a preceding vowel (that is, in a Cʷ-to-V transition, as opposed to the V-to-Cʷ boundary).

As to their distribution, Rodrigues’ (2007) PT labialized stops are found quite frequently in word-final position. In fact, the most significant phonotactic gap in their distribution in Rodrigues’ (2007) proposal is the absence of *pʷ from word-initial position. The expectation, commented on above, that a consonant with a secondary articulation will exert a stronger coarticulatory effect on a following rather than a preceding vowel derives from the fact that such segments depend, for their realization, on a following resonant element. As a consequence, we also expect such consonants to be less optimally realized (qua contrastive segments) in word-final or pre-consonantal position, with no vocoid to work as a base for their contrastive release features to be imposed on. As a matter of fact, plenty of evidence suggests that this is the case (see, e.g., Blevins, 2004, p. 116). In the words of Ladefoged and Maddieson (1996, p. 357):

Thus we can say that labialization is typically concentrated on the release phase of the primary articulation it accompanies. This observation has both phonetic and phonological significance. Many more languages have a restriction between the presence of labialization and the choice of the following vowel, than between its presence and the choice of the preceding vowel, and in many languages with labialized consonants the set of syllable-final consonants, if any, does not include labialized ones.

Aside from general considerations stemming from principles of acoustics and perceptual phonetics, it is not hard to find cross-linguistic evidence supporting the contention that such secondary articulations found in stop consonants behave phonologically as if ‘looking for’ a supporting vowel. Thus, in Khwarshi, an Eastern Caucasian language, labialization is found as a secondary articulation feature, mostly in velar and uvular consonants (Khalilova, 2009, pp. 17-18). The contrast is restricted basically to word-initial and word-medial position preceding a vowel, as in etʷa ‘fly’ vs. eta ‘touch’, lakʷa ‘see’ vs. laka ‘lick’. The dynamic phonology of the language also demonstrates a preference for such labialized release consonants to occur preceding a vowel. Labialization is either lost (89a) or transferred to another consonant, one that precedes a vowel (89b), whenever a -C(V) suffix is added to a root containing a final labialized stop.

The joint effect of these generalizations, both static phonotactic patterns and processes in the dynamic phonology of Khwarshi, is to suggest that having consonants with a labial release preceding something other than a vowel is a highly undesirable or marked configuration in this language. Similar regularities are found in the phonologies of many unrelated languages, and can be understood more broadly in terms of the acoustic and phonetic constraints mentioned above, the same that make Rodrigues’ (2007) proposal of stop consonants with secondary offglides that are almost always realized in contexts other than that of a following vowel very implausible.

INTERIM SUMMARY

Above we have presented evidence against positing a sound change whereby PT *e would have acquired rounding (> *o) preceding labialized consonants as well as against reconstructing these labialized consonants for PT. Instead, we have proposed that the suspect correspondence should be derived from PT *ə. Moreover, in our reconstruction, PT *ə accounts for some sound correspondences deemed irregular in Rodrigues’ (2005) proposal as well as for the correspondences which underlie his reconstruction of *o. Our proposal is summarized in Table 5, where we list the reflexes of PT *ə as well as of two other vowels which do not present split reflexes: PT *e and *o (in Rodrigues’ (2005, 2007) interpretation, *e and *u).

Table 5
PT *ə, *e, and *o and their reflexes. A = before PT *ɨ or *ɯ in the next syllable;14 B = before a vowel; C = next to a labial in a stem-final syllable; D = after a labial; E = after a coronal.

The notational change in the reconstruction of the sole rounded vowel of Proto-Tupian (as *o rather than *u) does not affect the correctness of the sound correspondences identified by Rodrigues (2005) in any way. It is suggested by the fact that the typical realization of its reflex is a mid vowel in most Tupian branches (Mundurukú, Tuparian, Karo, Puruborá, and Mondé). Since Rodrigues’ (2005) *o is reinterpreted as an unrounded vowel *ə in our account, it is now unproblematic to reconstruct *o in PT stems such as *amẽko ‘jaguar’, *jacjo ‘armadillo’, *jaḳo ‘lizard’, *jeko ‘monkey’, *jõk ‘flea’, *jopi(-ʔa) ‘egg’, *ḳo ‘to ingest’, *ndo ‘hill, rock’, *ndok ‘to eat (intr.)’, *õp ‘to give’, *õt ‘I’ (and the first person prefix *o-), *toḳo ‘to bite’, *top ‘to see’, *waco ‘alligator’, among many others (we do not list these well-established etymologies in Appendix 1 for reasons of space).

PT *ɯ VS. *ɨ

In this section, we will argue that it is necessary to reconstruct two distinct vowel phonemes in place of Rodrigues’ (2005) Proto-Tupian *ɨ. According to Rodrigues (2005), Proto-Tupian *ɨ would have undergone a number of splits in the daughter languages depending on the immediate consonantal environment, yielding ɨ/i in Yudjá, e/i in Karitiana, i/ʉ in Tuparí, i/ɨ in Karo, and ɨ/i in Puruborá. Unfortunately, Rodrigues (2005) does not specify the consonantal environments which would have triggered the putative fronting of *ɨ in the daughter languages, nor is he explicit on whether these environments were identical for each constituent branch (Juruna, Arikém, Tuparian, Karo, and Puruborá). In what follows, we show that Rodrigues’ (2005) reconstruction collapsed two correspondence sets into one and that two distinct vowels must therefore be reconstructed for Proto-Tupian. We symbolize them as PT *ɯ and *ɨ15, as in the minimal pair PT *jɯ ‘liquid’ vs. PT *jɨ ‘urine’, still retained in Karitiana as se ‘liquid’ and si ‘urine’. Their reflexes are identical in some branches (PMG *ɨ, PMu *i, PJu *ɨ, Mo i); for this reason, in what follows we are concerned only with the remaining branches (that is, Tuparian, Arikém, Karo, and Puruborá). At the end of the section, however, we will see that the distinction between the PT vowels in question is indirectly preserved in the Mundurukú branch as well.

PT *ɯ is reconstructed for the correspondence set which involves the following reflexes: PTpr *ɨ (> Ma/Sk/Ak ɨ, Wy/Tu ʉ), Kt/Ari e, Pu ɨ. In Karo, one usually finds i in the word-initial position (i-cɨ ‘water’, itɨ ‘deer’), ə̃ if the syllable is nasal (nə̃p ‘louse’, wakə̃ja ‘agouti’), and ɨ elsewhere (i-cɨ ‘water’, ma-ʔɨp ‘tree’, tɨt ‘to cook’, itɨ ‘deer’, jaɨ ‘howler monkey’); we are as of now unable to account for the apparently aberrant reflex pək ‘to burn’. In Karo ju ‘blood’, the vowel u continues the PT sequence *əɯ and is thus not necessarily irregular. Below we list the PT etyma where evidence from multiple branches converges to the reconstruction of PT *ɯ as opposed to PT *ɨ (reflexes in branches which do not distinguish between them are omitted).

The following example can be considered regular if it turns out that Lemos Barbosa’s (1951) transcription ‹mixon›_B of the Arikém cognate (see Rondon & Faria, 1948, p. 199) stands for mʉ̃sɒ̃ (for the development PT *w > Kt/Ari m in nasal environments, compare Kt mĩɲõ ‘Brazil nut’).

In the following examples, the PT etymon is preserved in Tuparian, but not in other branches which distinguish between *ɯ and *ɨ. Based on the Tuparian evidence, we reconstruct *ɯ.

There are also two PT etyma which have reflexes in Karitiana, but not in other languages which distinguish between *ɯ and *ɨ: PT *mbVʔɯt ~ *mbVḳɯt ‘necklace’,18 *tɨʔɯt ~ *tɨḳɯt ‘maternal aunt’ > Kt mboʔet ~ mõet, teʔet. Based on the Karitiana evidence, we reconstruct PT *ɯ (we assume that the latter stem is derived from PT *tɨ ‘mother’ and thus reconstruct *ɨ in the first syllable; translaryngeal assimilation of vowels is not unheard of in Karitiana. Alternatively, the Karitiana word could be related to Ari ‹utaíră›_N ‘id.’).

PT *ɨ is reconstructed for the correspondence set which involves the following reflexes: PTpr *i (preserved as i in all daughter languages), Kt/Ari i, Pu i. Below we list the PT etyma where evidence from multiple branches converges to the reconstruction of PT *ɨ as opposed to PT *ɯ (reflexes in branches which do not distinguish between them are omitted).

To these one may add the verbs Tu [e]pik-, Kt (a-)mbik and possibly Pu [t]api-a ‘to sit’, if these are related to PT *apɨk ‘to sit’.

In the following examples the apparent mismatch between the reflexes in Tuparian and Arikém is explained by vowel fusion: PT *ewɨ > *æhi > Kt eː (instead of *ai), Ari e (instead of *æi); PT *ɨ(p)ḳɯ or *ɨ(p)ʔɯ > *iʔe > Kt eː (instead of *ie).

In the following examples, the reconstruction of PT *ɨ as opposed to PT *ɯ is based on non-converging evidence from only one branch, as no cognates in other diagnostic branches have been identified.

Finally, one etymology has been found with a mismatch between Tuparian and Karo regarding the reconstruction of *ɨ or *ɯ. It is possible that the Karo word is ultimately unrelated to PT *mẽpɨt, as the vowel correspondence in the first syllable and the nasal reflex of PT *p appear to be irregular.

We now turn to the Juruna branch, for which Rodrigues (2005) also posited a split of PT *ɨ into Yu ɨ, i. However, this alleged split does not have anything to do with the reconstruction of PT *ɯ and *ɨ proposed in this section: in our account, both vowels yield PJu *ɨ, usually preserved in the daughter languages (PT *kʲɯt ‘green’, *ŋgɯp ‘louse’, *nẽcɯk ‘horsefly’, *mbɨʔa/*pɨʔa ‘liver’, *(j)atɨ ‘pain, sour’, *ewɨt ‘bee, honey’, *tɨk ‘resin’, *pɨtɨk ‘to take, to grab’, *mẽpɨt ‘son (female ego)’, *apɨk ‘to sit’, *wɨp ‘cooked’ > PJu *[a]kɨ́ɮ-ú, *kɨpá, *nãtɨ́k-á, *bɨʔá, *ʃadɨ́ ‘sour’, *awɨɮ-á, *dɨ́k-á, *pɨdɨ́k-ú, (?) *mãbɨ-a ‘daughter’, *abɨ́k-ú, *[u]wɨp-u ‘to cook’). A complication arises from the fact that PJu *ɨ has a special reflex in the word-initial position in Yudjá (e-) and Xipaya (i-): PT *ḳɯc ‘earth’ > PJu *ɨt-á > Yu etá ‘sand, beach’; PT *ḳɯc-pɨ ‘earth’ > PJu *ɨpɨ́-á > Yu epɨ́á, Xi ipɨa; PT *ḳɯp ‘tree; stick-like’ > PJu *ɨpá ‘stick’ > Yu epá, Xi ipa. Note that the extinct Manitsawá appears to have preserved a non-front vowel in this case, as in ‹upá› ‘Holz’ (Steinen, 1886, p. 361). In the following example, Yudjá appears to have raised the vowel in the prevocalic position, feeding glide epenthesis: PT *ʔɯ ‘water’ > PJu *ɨ-á > *e-á > *i-á > Yu ijá (cf. Xi ija). Finally, in the following two etymologies PT *ɨ is reflected in unexpected ways in the Juruna languages: PT *pətɨc ‘heavy’, *tɨ ‘mother’ > PJu *padét-ú, *di-á. At present, we are unable to account for the development of these words.

We conclude this section by presenting a piece of indirect evidence for the opposition between PT *ɯ and *ɨ from the Mundurukú branch. In Proto-Mundurukú, PT *p is lost before front vowels, including *i < *ɨ, as in PT *pe ‘path’, *peo ‘pus’, *pe ‘tobacco’, *pɨ ‘foot’, *pɨcja ‘heel’, *pɨtɨk ‘to take, to grab’, *jepɨ ‘payment’, *w-epɨk ‘to revenge’ > PMu *e, *eɨ ‘to swell’, *e, *i, *(ʔ)iða, *iʧik, *ðéi, *w-eik (cf. Rodrigues, 2007, p. 173). However, in all available examples *p is retained before a PT *ɯ: PT *pɯk ‘to burn’, *kɨp-ḳɯt ~ *kɨp-ʔɯt ‘brother’ > PMu *pik, *kipit, suggesting that PT *ɨ and *ɯ were still distinct at the time when PT *p was lost before front vowels (i.e., PT *ɨ had already become front, but PT *ɯ had not). This argument is unfortunately not as strong as it could have been in light of the existence of cognate sets in which PT *p was not lost even before reflexes of PT *ɨ or *i (PT *jupi-ʔa ‘egg’, *pɨcjo ‘breath’, *kɯpɨ(-ḳɯt ~ -ʔɯt) ‘younger sister (female ego)’, *pɨ ‘inner part’ > PMu *ðopia̰, *piðo, *kibḭt, *pi),20 but the fact that there is not a single example of loss of PT *p preceding a *i < PT *ɯ is hardly due to chance.
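The relative chronology implied by this argument can be illustrated with a toy derivation. The sketch below is a deliberate simplification and rests on several assumptions: forms are plain strings, the three changes are modelled as string substitutions with hypothetical function names, all other developments (such as changes affecting *t or *j) are ignored, and the unexplained cases of retained *p mentioned above are not modelled. It only shows that ordering the fronting of *ɯ after the loss of *p reproduces PMu *pik, whereas the opposite order would not.

```python
import re


def front_barred_i(form):
    """PT *ɨ > i (assumed to precede the loss of *p)."""
    return form.replace("ɨ", "i")


def lose_p_before_front(form):
    """Loss of *p before the front vowels e and i."""
    return re.sub(r"p(?=[ei])", "", form)


def front_back_unrounded(form):
    """PT *ɯ > i."""
    return form.replace("ɯ", "i")


def derive(form, changes):
    """Apply the sound changes to a form in the given order."""
    for change in changes:
        form = change(form)
    return form


# Ordering suggested in the text: *ɯ is fronted only after *p has been lost.
COUNTERFEEDING = [front_barred_i, lose_p_before_front, front_back_unrounded]
# Hypothetical alternative ordering, in which the fronting of *ɯ feeds the loss of *p.
FEEDING = [front_barred_i, front_back_unrounded, lose_p_before_front]
```

With the first ordering, `derive("pɯk", COUNTERFEEDING)` keeps the stop, in line with PMu *pik, while `derive("pɨ", COUNTERFEEDING)` loses it, in line with PMu *i ‘foot’. The alternative ordering would wrongly delete the stop in ‘to burn’.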

IMPLICATIONS FOR THE SUBGROUPING OF TUPIAN

Rodrigues and Cabral (2002, 2012) and Rodrigues (2007, p. 170) put forward a hypothesis whereby the Tupian family is considered to be split, in a binary manner, into two large branches: ‘Eastern’ (comprising Sateré-Mawé, Awetí, Tupí-Guaraní, Mundurukú, and Juruna) and ‘Western’ (comprising Mondé, Tuparian, Arikém, Ramarama, and Puruborá). These two proposed branches, however, have remained insufficiently demonstrated in the sense that they have not yet been supported by sets of identified shared innovations.

Figure 1 summarizes the distribution of the innovations identified in this paper. Only phonologically significant innovations shared by more than one low-level group are shown, namely: (i) the merger of PT *ɨ and *i (Puruborá, Arikém, Tuparí, Karo, Mondé, Mundurukú), (ii) the merger of PT *ə and *e in the default environment (Arikém, Tuparí), (iii) the merger of PT *ə and *o after labials (Arikém, Tuparí), (iv) the merger of PT *ɯ and *ɨ (Mondé, Mundurukú, Juruna, Sateré-Mawé, Awetí, Tupí-Guaraní), and (v) the merger of PT *ə and *o before vowels (Mundurukú, Juruna, Sateré-Mawé, Awetí, Tupí-Guaraní).

Figure 1
Distribution of innovations affecting PT *ə and *ɨ.

The following groups share more than one innovation related to the evolution of PT *ə and *ɨ: Arikém and Tuparí (3 innovations), Mundurukú, Juruna, Sateré-Mawé, Awetí, and Tupí-Guaraní (2 innovations), and Mundurukú and Mondé (2 innovations). Of these, the first two sets are strong candidates for valid clades: both include non-trivial, positionally conditioned innovations (merger of *ə and *o following labials in Tuparí and Arikém, preceding vowels in Mundurukú, Juruna, Sateré-Mawé, Awetí, and Tupí-Guaraní). In contrast, the set comprising Mondé and Mundurukú is in all likelihood spurious (or paraphyletic): in addition to being incompatible with the proposal which links Mundurukú to Juruna and Mawé-Guaraní, there is indirect evidence which suggests that the fronting of PT *ɯ in Mundurukú counterfed the loss of *p before front vowels, an innovation specific to that branch. Therefore, the triple merger of PT *i, *ɨ, and *ɯ as *i has probably occurred independently in the phonological history of Mondé and Mundurukú.
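The counts of shared innovations given above follow directly from the distribution of innovations (i) to (v) listed before Figure 1. The short sketch below simply recomputes them; the data are copied from that list, while the variable and function names are illustrative assumptions.

```python
# Innovations (i)-(v) and the branches that show them, as listed in the text.
INNOVATIONS = {
    "i": {"Puruborá", "Arikém", "Tuparí", "Karo", "Mondé", "Mundurukú"},
    "ii": {"Arikém", "Tuparí"},
    "iii": {"Arikém", "Tuparí"},
    "iv": {"Mondé", "Mundurukú", "Juruna", "Sateré-Mawé", "Awetí", "Tupí-Guaraní"},
    "v": {"Mundurukú", "Juruna", "Sateré-Mawé", "Awetí", "Tupí-Guaraní"},
}

# Candidate groupings discussed in the text.
TUPARIKEM = ["Arikém", "Tuparí"]
EASTERN = ["Mundurukú", "Juruna", "Sateré-Mawé", "Awetí", "Tupí-Guaraní"]
MONDE_MUNDURUKU = ["Mondé", "Mundurukú"]


def shared_innovations(branches):
    """Return the labels of the innovations found in every listed branch."""
    group = set(branches)
    return [label for label, members in INNOVATIONS.items() if group <= members]
```

Here `shared_innovations(TUPARIKEM)` returns three labels, and the other two groupings return two each. The function counts innovations only; it says nothing about whether an innovation is positionally conditioned, which is the criterion used above to separate likely clades from spurious groupings.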

Thus, evidence from the development of the PT vowels supports the identification of two mid-level clades within Tupian. The node comprising Tuparí and Arikém is defined by the sound change *ə > *e (default) / *o (after labials); we suggest the label Tuparikém for this subgrouping hypothesis.21 The vowel inventory of Proto-Tuparikém (*/i ɨ e a o/) is preserved without changes in Proto-Tuparian, whereas in Proto-Arikém these vowels yielded /i e æ ɒ ʉ/ (> Karitiana /i e a o ɨ/) by means of a vowel shift identified by Storto and Baldi (1994). The second clade includes Mundurukú, Juruna, and Mawé-Guaraní (thus, our findings partially corroborate Rodrigues’ (2005) hypothesis regarding the validity of his Eastern branch) and has the merger of PT *ə and *o before vowels as its defining innovation. It may have proceeded in two stages: first, PT *ə and *o may have changed into *o and *u in a chain shift (this is precisely the state reconstructed by Rodrigues, 2005 for Proto-Tupian); in turn, the vowel *o (from PT *ə) may have been raised to *u in prevocalic contexts (thus merging with *u from PT *o). After that, Proto-Eastern Tupian *o, *u yielded PMG *o, *u; PMu *ɨ, *o; PJu *a (*u next to labials in stem-final syllables), *u. As the Eastern Tupian languages reach their greatest diversity between the Lower Madeira and the Lower Iriri, the Proto-Eastern Tupian Urheimat has to be sought in that region.
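The Arikém vowel shift mentioned above can be pictured as two successive one-to-one mappings. The sketch below assumes that the three inventories cited in the text are listed in corresponding order (so that, for example, Proto-Tuparikém *a corresponds to Proto-Arikém ɒ and Karitiana o); this positional alignment is an assumption of the sketch, although it agrees with the correspondences PTpr *e : Ari æ : Kt a and PTpr *o : Ari ʉ : Kt ɨ stated earlier. The names used are illustrative.

```python
# Inventories as cited in the text, assumed to be listed in corresponding order.
PROTO_TUPARIKEM = ["i", "ɨ", "e", "a", "o"]
PROTO_ARIKEM = ["i", "e", "æ", "ɒ", "ʉ"]
KARITIANA = ["i", "e", "a", "o", "ɨ"]

# Each stage is a one-to-one mapping, so no vowels merge.
TO_PROTO_ARIKEM = dict(zip(PROTO_TUPARIKEM, PROTO_ARIKEM))
TO_KARITIANA = dict(zip(PROTO_ARIKEM, KARITIANA))


def karitiana_reflex(vowel):
    """Map a Proto-Tuparikém vowel to Karitiana via Proto-Arikém."""
    return TO_KARITIANA[TO_PROTO_ARIKEM[vowel]]
```

Because both mappings are one-to-one, the five-way contrast of Proto-Tuparikém survives in Karitiana even though most individual vowel qualities change.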

CONCLUSION

This paper has presented a reconstruction of the Proto-Tupian (PT) inventory of oral vowels alternative to that advanced by Rodrigues (2005), which has clearly been the accepted view on PT vocalism since its adoption in reference works on the family such as Rodrigues (1999) and Rodrigues and Cabral (2012). Our proposal is summarized in Table 6.

Table 6
PT vowels and their reflexes (proposal). ^A = before PT *ɨ or *ɯ in the next syllable; ^B = before a vowel; ^C = next to a labial in a stem-final syllable; ^D = after a labial; ^E = after a coronal.

We have argued that this new proposal is superior to the Rodrigues (2005) reconstruction in that it avoids the postulation of unexplained bifurcations of reflexes and the proposal of exception-ridden splits that, moreover, lack phonetic plausibility. Rodrigues’ (2007) proposal of a series of labialized consonants for PT is rejected too, as the segments in question lack reflexes different from those of their plain counterparts and because the positional developments of contextual vowels were shown to be spurious.

Supplementary material

ABBREVIATIONS

1: First person

3: Third person

3CRF: Coreferential third person

IV: Class IV

abs.: Absolute

Ak: Akuntsu

Ar: Aruá

Ari: Arikém

Aw: Awetí

CAUS: Causative

Cl: Cinta-larga

excl.: Exclusive

Gv: Gavião

incl.: Inclusive

INF: Infinitive

Kp: Kepkiriwat

Kr: Karo

Kt: Karitiana

Ku: Kuruaya

Ma: Makurap

Mn: Manitsawá

Mo: Mondé

Mu: Mundurukú

Mw: Sateré-Mawé

Pa: Suruí-Paiter

PJu: Proto-Juruna

PL: Plural

PMG: Proto-Mawetí-Guaraní

PMu: Proto-Mundurukú

PRS: Present

PT: Proto-Tupian

(P)TG: (Proto-)Tupí-Guaraní

PTpr: Proto-Tuparian

Pu: Puruborá

rel.: Relational

SG: Singular

Sk: Sakurabiat (Mekéns)

Sl: Salamãy

Tu: Tuparí

Wy: Wayoró

Xi: Xipaya

Yu: Yudjá

Zo: Zoró

Appendices

Appendix 1

Etymologies

This appendix includes all etymologies that contain the relevant vowels in at least two branches of the family, as well as several other etymologies that were mentioned in the body of the text. In what follows, cognates which cannot be regularly derived from the reconstructed etyma are marked with (!), and the irregular reflexes of specific segments are underlined (except in cases of irregular deletion of segments).

Appendix 2

PT consonants

ACKNOWLEDGMENTS

We are grateful to two anonymous reviewers for their comments on the presentation and the substance of the paper. These have certainly improved the quality of our submission, and the authors are fully responsible for any remaining errors or shortcomings. We are especially grateful to the editors and technical staff of the Boletim for their swift and high-quality work in preparing the proofs and dealing with our observations on necessary adjustments and revisions.

REFERENCES

Alarcon, D. F., Millikan, B., & Torres, M. (2016). Apresentação. In D. F. Alarcon, B. Millikan & M. Torres (Orgs.), Ocekadi: Hidrelétricas, conflitos socioambientais e resistência na Bacia do Tapajós (pp. xv–xvii). Programa de Antropologia e Arqueologia da Universidade Federal do Oeste do Pará.

Alves, P. M. (2004). O léxico do Tuparí: Proposta de um dicionário bilíngue [Ph.D. dissertation, Universidade Estadual Paulista Júlio de Mesquita].

Aragon, C. C. (2008). Fonologia e aspectos morfológicos e sintáticos da língua Akuntsú [MA thesis, Universidade de Brasília].

Aragon, C. C. (2014). A grammar of Akuntsú, a Tupían language [Ph.D. dissertation, University of Hawai‘i at Mānoa].

Betts, L. V. (1981). Dicionário Parintintin–português, português–parintintin. Sociedade Internacional de Lingüística.

Blevins, J. (2004). Evolutionary Phonology. Cambridge University Press.

Bontkes, W. (1978). Dicionário preliminar: Suruí–português, português–suruí. Summer Institute of Linguistics.

Braga, A. O. (1992). A fonologia segmental e aspectos morfofonológicos da língua Makurap (Tupí) [MA thesis, Universidade Estadual de Campinas].

Braga, A. O. (2005). Aspects morphosyntaxiques de la langue Makurap/Tupi [Ph.D. dissertation, Université de Toulouse–Le Mirail].

Carvalho, F. O. (2019). Revisitando o Proto-Jurúna: A reconstrução da série de oclusivas orais. In E. S. Oliveira, E. A. Vasconcelos & R. D. Sanches (Eds.), Estudos Linguísticos na Amazônia (pp. 215–236). Pontes Editores.

Caspar, F. (n.d.). German-Tupari dictionary: An unpublished manuscript. http://www.etnolinguistica.org/caspar:german-tupari-a.

Corrêa da Silva, B. C. (2010). Mawé/Awetí/Tupí–Guaraní: Relações lingüísticas e implicações históricas [Ph.D. dissertation, Universidade de Brasília].

Costa, R. N. V. (2002). Fonologia segmental da língua Kuruaya. Moara, (17), 85–101.

Dietrich, W. (2009). Correspondências fonológicas e lexicais entre Karitiána (Arikém, Tupí) e Tupí–Guaraní. Revista Brasileira de Linguística Antropológica, 1(2), 25–48. https://doi.org/10.26512/rbla.v1i2.12365

Drummond, C. (1952–1953). Vocabulário na Língua Brasílica (2 Vols.). Boletim da Faculdade de Filosofia, Ciências e Letras da Universidade de São Paulo.

Fargetti, C. M. (2001). Estudo fonológico e morfossintático da língua Juruna [Ph.D. dissertation, Universidade Estadual de Campinas].

Fargetti, C. M., & Rodrigues, C. L. R. (2008). Consoantes do Xipaya e do Juruna: Uma comparação em busca do proto-sistema. Alfa, 52(2), 535–563.

Felzke, L. F., & Moore, D. (2019). Terminologias de parentesco dos grupos da família linguística Mondé. Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 14(1), 15–32. https://doi.org/10.1590/1981.81222019000100003

Figueiredo, M. V. (2010). A flecha do ciúme: O parentesco e seu avesso segundo os Aweti do Alto Xingu [Ph.D. dissertation, Universidade Federal do Rio de Janeiro].

Franceschini, D. (1999). La langue sateré-mawé : Description et analyse morphosyntaxique [Ph.D. dissertation, Université Paris VII–Denis Diderot].

Gabas Jr., N. (1989). Estudo fonológico da língua Karo (Arara de Rondônia) [MA thesis, Universidade Estadual de Campinas].

Gabas Jr., N. (1999). A Grammar of Karo, Tupi (Brazil) [Ph.D. dissertation, University of California, Santa Barbara].

Galucio, A. V. (2001). The morphosyntax of Mekens (Tupi) [Ph.D. dissertation, University of Chicago].

Galucio, A. V., & Gabas Jr., N. (2002). Evidências de agrupamento genético Karo-Puruborá, tronco Tupi. XVII Encontro Nacional da ANPOLL, Universidade Federal do Rio Grande do Sul.

Galucio, A. V. (2005). Puruborá: Notas etnográficas e lingüísticas recentes. Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 1(2), 159–192.

Galucio, A. V., Meira, S., Birchall, J., Moore, D., Gabas Jr., N., Drude, S., . . . Rodrigues, C. R. (2015). Genealogical relations and lexical distances within the Tupí linguistic family: A lexicostatistical and phylogenetic approach. Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 10(2), 229–274. https://doi.org/10.1590/1981-81222015000200004

Galvão, E. (1953). Cultura e sistema de parentesco das tribos do Alto Xingu. Boletim do Museu Nacional, 14, 1–56.

Gavião, I. K. S. (2019). Nomes, verbos, adjetivos, posposições e predicações na lingua dos Ikólóéhj (Gavião, fam. Mondé, tronco Tupí) [MA thesis, Universidade de Brasília].

Gomes, D. M. (2006). Estudo morfológico e sintático da língua Mundurukú (Tupí) [Ph.D. dissertation, Universidade de Brasília].

González, H. A. (2008). Una aproximación a la fonología del tapiete (Tupí–Guaraní). LIAMES, 8(1), 7–43. https://doi.org/10.20396/liames.v8i1.1469

Hanke, W., Swadesh, M., & Rodrigues, A. D. (1958). Notas de fonologia Mekens. In J. Comas (Ed.), Miscellanea Paul Rivet octogenario dicata (Vol. II, pp. 187–217). Universidad Nacional Autónoma de México.

Hyman, L. (1973). The feature [grave] in phonological theory. Journal of Phonetics, 1, 329–337.

Kakumasu, J. Y., & Kakumasu, K. (2007 [1988]). Dicionário por tópicos Kaapor–português. Associação Internacional de Lingüística–SIL Brasil.

Karitiana, N. (2016). Cultura, memória e aspectos da variação linguística da língua do povo Byyjyty Osop Aky na aldeia Kyõwã da terra indígena Karitiana [Licentiate Degree Monograph, Universidade Federal de Rondônia].

Khalilova, Z. (2009). A Grammar of Khwarshi (LOT Dissertation Series). LOT, Netherlands Graduate School of Linguistics.

Koch-Grünberg, T. (1932). Wörterlisten “tupý”, maué und purúborá. Journal de la Société des Américanistes, 24(1), 31–50.

Lacerda, M. C. (2014). Bekã Pamakube (lugar de aprender). Aprendendo com os Zoró: Análise da identidade indígena através da experiencia das escolas nas aldeias do povo indígena Zoró [Ph.D. dissertation, Universidad de Salamanca].

Ladefoged, P., & Maddieson, I. (1996). The sounds of the world’s languages. Blackwell.

Landin, D. (2005). Dicionário e léxico Karitiana / português (2 ed.). Sociedade Internacional de Lingüística.

Lemos Barbosa, P. A. (1951). Pequeno vocabulário Tupi–português. Com quatro apêndices: Perfil da língua Tupi; palavras compostas e derivadas; metaplasmos; síntese bibliográfica. Livraria São José.

Lima, S. O. (2008). A estrutura argumental dos verbos na língua Juruna (Yudjá) [MA thesis, Universidade de São Paulo].

Meer, T. H. (1982). Fonologia da língua Suruí [MA thesis, Universidade Estadual de Campinas].

Meira, S., & Drude, S. (2015). A summary reconstruction of proto-maweti–guarani segmental phonology. Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 10(2), 275–296. https://doi.org/10.1590/1981-81222015000200005

Mello, A. A. S. (2000). Estudo histórico da família lingüística Tupi–Guarani: Aspectos fonológicos e lexicais [Ph.D. dissertation, Universidade Federal de Santa Catarina].

Mendes Jr., D. G. (2007). Comparação fonológica do Kuruáya com o Mundurukú [MA thesis, Universidade de Brasília].

Monserrat, R. M. F. (2005). Notícia sobre a língua Puruborá. In A. D. Rodrigues & A. S. A. C. Cabral (Eds.), Novos estudos sobre línguas indígenas (pp. 9–22). Editora UnB.

Moore, D. (1984). Syntax of the language of the Gavião Indians of Rondônia, Brazil [Ph.D. dissertation, City University of New York].

Moore, D. (2005). Classificação interna da família lingüística Mondé. Estudos Lingüísticos, 34, 515–520.

Moore, D., & Galucio, A. V. (1994). Reconstruction of Proto-Tupari consonants and vowels. In M. Langdon & L. Hinton (Eds.), Proceedings of the Meeting of the Society for the Study of the Indigenous Languages of the Americas, July 2–4, 1993, and the Hokan-Penutian Workshop, July 3, 1993, both held at the 1993 Linguistic Institute at Ohio State University in Columbus, Ohio (pp. 119–137). Survey of California and other Indian Languages.

Nikulin, A., & Carvalho, F. O. (2019). Estudos diacrônicos de línguas indígenas brasileiras: Um panorama. Macabéa: Revista Eletrônica do Netlli, 8(2), 255–305.

Nikulin, A., & Andrade, R. (2020). The rise and fall of approximants in the Tuparian languages. Journal of Language Relationship, 18(4), 284–319.

Nimuendajú, C. (1923–1924). Zur Sprache der Šipáia-Indianer. Anthropos, 18–9(4–6), 836–857.

Nimuendajú, C. (1928). Wortliste der Šipáia-Sprache. Anthropos, 23(5–6), 821–850.

Nimuendajú, C. (1929). Zur Sprache der Šipáia-Indianer. Anthropos, 24(5–6), 863–896.

Nimuendajú, C. (1930). Zur Sprache der Kuruáya-Indianer. Journal de la Société des Américanistes, 22(2), 317–345.

Nimuendajú, C. (1932). Wortlisten aus Amazonien. Journal de la Société des américanistes, 24(1), 93–119.

Nogueira, A. F. S. (2011). Wayoro ẽmẽto: Fonologia segmental e morfossintaxe verbal [MA thesis, Universidade de São Paulo].

Nogueira, A. F. S. (2019). Predicação na língua Wayoro (Tupi): Propriedades de finitude [Ph.D. dissertation, Universidade de São Paulo].

Nogueira, A. F. S., Galucio, A. V., Soares-Pinto, N., & Singerman, A. R. (2019). Termos de parentesco nas línguas Tuparí (família Tupí). Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 14(1), 33–64. https://doi.org/10.1590/1981.81222019000100004

Payne, D. L. (1991). A classification of Maipuran (Arawakan) languages based on shared lexical retentions. In D. C. Derbyshire & G. K. Pullum (Eds.), Handbook of Amazonian languages (Vol. 3, pp. 355–499). Mouton de Gruyter.

Picanço, G. L. (2005). Mundurukú: Phonetics, phonology, synchrony, diachrony [Ph.D. dissertation, University of British Columbia].

Picanço, G. L. (2019). A fonologia diacrônica do Proto-Mundurukú (Tupí). Appris.

Restivo, P. (1893 [1722]). Lexicon Hispano-Guaranicum. Wilhelm Kohlhammer.

Ribeiro, M. J. P. (2010). Dicionário Sateré-Mawé/português [MA thesis, Universidade Federal de Rondônia, campus de Guajará-Mirim].

Rocha, I. (2011). A estrutura argumental da língua Karitiana: Desafios descritivos e teóricos [MA thesis, Universidade de São Paulo].

Rocha, I. (2014). Processos de causativização na língua Karitiana. Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 9(1), 183–197. https://doi.org/10.1590/S1981-81222014000100012

Rodrigues, A. D. (1984–1985). Relações internas na família lingüística Tupí–Guaraní. Revista de Antropologia, 27–28, 33–53.

Rodrigues, A. D., & Dietrich, W. (1997). On the linguistic relationship between Mawé and Tupí-Guaraní. Diachronica, 14(2), 265–302.

Rodrigues, A. D. (1999). Tupí. In R. M. W. Dixon & A. Y. Aikhenvald (Eds.), The Amazonian Languages (pp. 107–124). Cambridge University Press.

Rodrigues, A. D. (2002). Correspondências lexicais e fonológicas entre Tupí–Guaraní e Tuparí. In A. S. A. C. Cabral & A. D. Rodrigues (Orgs.), Línguas indígenas brasileiras: Fonologia, gramática e história. Atas do I Encontro Internacional do Grupo de Trabalho sobre Línguas Indígenas da ANPOLL (Vol. 1, pp. 288–297). Editora da Universidade Federal do Pará.

Rodrigues, A. D. (2005). As vogais orais do Proto-Tupí. In A. D. Rodrigues & A. S. A. C. Cabral (Orgs.), Novos estudos sobre línguas indígenas (pp. 35–46). Editora UnB.

Rodrigues, A. D. (2007). As consoantes do Proto-Tupí. In A. S. A. C. Cabral & A. D. Rodrigues (Orgs.), Línguas e culturas Tupí (pp. 167–203). Curt Nimuendajú.

Rodrigues, A. D. (2008). Linguistic reconstruction of elements of prehistoric Tupi culture. In E. B. Carlin & S. Kerke (Eds.), Linguistics and Archaeology in the Americas: The Historization of Language and Society (Brill’s Studies in the Indigenous Languages of the Americas, Vol. 2). Brill.

Rodrigues, A. D., & Cabral, A. S. A. C. (2002). Revendo a classificação interna da família Tupí-Guaraní. In A. D. Rodrigues & A. S. A. C. Cabral (Eds.), Línguas Indígenas Brasileiras: Fonologia, Gramática e História (pp. 327–337). UFPA.

Rodrigues, A. D., & Cabral, A. S. A. C. (2012). Tupían. In L. Campbell & V. Grondona (Eds.), The Indigenous languages of South America: A comprehensive guide (Vol. 2, pp. 495–574). Mouton de Gruyter.

Rodrigues, C. L. R. (1995). Étude morphosyntaxique de la langue xipaya (Brésil) [Ph.D. dissertation, Université Paris VII–Denis Diderot].

Rondon, C. M. S., & Faria, J. B. (1948). Glossário Geral das tribos silvícolas de Mato-Grosso e outras da Amazônia e do Norte do Brasil (Tom. I). Imprensa Nacional.

Sabino, W. K. (2016). Awetýza tiʔíngatú: Construindo uma gramática da língua Awetý, com contribuições para o conhecimento do seu desenvolvimento histórico [Ph.D. dissertation, Universidade de Brasília].

Santos, C. A. B. (2013). Aspectos da fonologia do Mundurukú do Madeira (AM) [MA thesis, Universidade de Brasília].

Schleicher, C. O. (1998). Comparative and internal reconstruction of the Tupi–Guarani language family [Ph.D. dissertation, University of Wisconsin–Madison].

Sekelj, T. (1948). Wordlist Aruá-Makurap-Zaboti-Arikapú-Tupari. Unpublished manuscript. http://www.etnolinguistica.org/sekelj:1

Silva, E. B. (2009). Estruturas fonéticas e fonológicas de vogais e consoantes da língua Kuruaya [MA thesis, Universidade Federal do Pará].

Singerman, A. R. (2018). The morphosyntax of Tuparí, a Tupían language of the Brazilian Amazon [Ph.D. dissertation, University of Chicago].

Snethlage, E. (1932). Chipaya- und Curuaya-Wörter. Anthropos, 27, 65–93.

Snethlage, E. H. (1934). Wörterverzeichnis der Boo̯roo̯buo̯rá-Sprache. Unpublished manuscript. http://www.etnolinguistica.org/emil:4

Steinen, K. (1886). Durch Central-Brasilien: Expedition zur Erforschung des Schingú im Jahre 1884. F. A. Brockhaus.

Storto, L. (1999). Aspects of a Karitiana grammar [Ph.D. dissertation, Massachusetts Institute of Technology].

Storto, L., & Baldi, P. (1994). The Proto-Arikém vowel shift. Paper presented at the 68th Annual Meeting of the Linguistic Society of America.

Vago, R. (1976). More evidence for the feature [grave]. Linguistic Inquiry, 7(4), 671–674.

Voort, H. V. (2007). Proto-Jabutí: Um primeiro passo na reconstrução da língua ancestral dos Arikapú e Djeoromitxí. Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 2(2), 133–168. https://doi.org/10.1590/S1981-81222007000200007

Wetzels, W. L., & Nevins, A. (2018). Prenasalized and postoralized consonants: The diverse functions of enhancement. Language, 94(4), 834–866. http://dx.doi.org/10.1353/lan.2018.0055

Zoró, T. K., & Camargos, Q. F. (2019). Estruturas interrogativas polares e informacionais na língua Pangyjẽj (Zoró, família Mondé, tronco Tupí). Revista Brasileira de Linguística Antropológica, 11(2), 111–133. https://doi.org/10.26512/rbla.v11i02.28508

Notes

1 The reconstruction of *tʲ as one segment in the word for ‘fire’ (as opposed to *ti ~ *tj) is supported by the fact that not only PTG *tata / *-rata, Awetí taʐa / -aʐa, but also Mundurukú daʃá, Kuruaya láʃa, Yudjá aʃí, Xipaya aʃi have precisely one segment as their correspondence. Only Sateré-Mawé would have unfolded *tʲ to rj in arja ‘fire’.

2 The epenthetic nature of the w in these stems is likewise confirmed by the fact that no corresponding consonant is found in branches such as Mundurukú (*ðoj ‘blood’, *t-oaj-bɨ ‘tail’; see Picanço, 2019) or Tuparian (*jeɨ ‘blood’, *joac ‘tail’; see Nikulin & Andrade, 2020). Our amendment to Meira and Drude’s (2015) reconstruction spares us from the necessity of positing a typologically improbable ‘zigzag’ development in Sateré-Mawé, whereby PT *w > PMG *kʷ > Sateré-Mawé w (h before u in stressed syllables; ∅ before u in unstressed syllables). In our account, Sateré-Mawé w (h/∅ before u) simply continues PT *w > PMG *w. Only the ancestral language of Awetí and PTG would thus have innovated by transforming PT *w > PMG *w into a stop.

3 The diverging reflexes in the Guaraní varieties that were earlier seen as warranting the reconstruction of two affricates *c and *č for PTG are now explained as late developments involving diffusion or dialect borrowing among Guaraní dialects.

4 When citing comparanda after Rodrigues (2005) and Rodrigues and Cabral (2012), we leave unchanged their transcription conventions, morphological segmentations, cognation judgments, and reconstructions. Our analysis, which may differ significantly, is available in Appendix 1. We have been unable to confirm the existence of some of the forms given by these authors (such as Makurap men ‘husband’).

5 Note that in this article we take nasality to be an autosegmental feature in Proto-Tupian and most Tupian languages. It rarely interacts with the development of the vocalic segments (we note it explicitly when it does).

6 Galucio (2005) gives the phonetic form as [mɐ̃ˈɲũm] and analyzes [ɐ̃] as a realization of /ã/. In Galucio et al. (2015, p. 271), the form mə̃jũp(m) is given instead. We assume the representation /ə̃/ for the vowel in question.

7 Word-initially, PT *t is expected to yield PJu *ʧ > Xi t, not d. Also, the regular reflex of PT *ə is u only before vowels, not before consonants. Together, these facts cast doubt on the inclusion of the Xipaya prefix in this cognate set. Note, however, that at least the apparently irregular vowel reflex could be attributed to leveling: at an earlier stage, *da- may have occurred before consonants and *du- before vowels. The allomorph du- is still used in some pre-vocalic contexts in Xipaya (du-ázi ‘his (own) wife’), but shows up as d- before unaccented vowels (d-aká ‘his/her (own) house’; Rodrigues, 1995, p. 12).

8 At the PTG level, at least, only *-p and *-t were subject to lenition, but the dorsal stop *-k was not (Schleicher, 1998, p. 32; Meira & Drude, 2015, p. 281). A few languages like Kayabí and members of the Kagwahiva cluster extended the process to reflexes of PTG *-k as well.

9 An anonymous reviewer notes that positing PMu *p as a regular reflex of PT *pʔ could be problematic if one accepts that PMu *eba ‘wing’ (Picanço, 2019, p. 136) is related to PT *pepʔə ‘wing’. Although we find the comparison in question tempting, the mismatch between PMu *a and PT *ə precludes us from considering PMu *eba a reflex of PT *pepʔə. Instead, it is possible that the Mundurukú form continues a different derivative of a hypothetical PT root *pep-, whose erstwhile existence is suggested by Tuparí pépʔe ‘fin’ and pépʔa ‘butterfly’ (Alves, 2004, p. 236) alongside pépʔo ‘feather’ (< PT *pepʔə).

10 Of relevance to this issue, note that the lack of an explicit bottom-up lexical reconstruction of PTG is a significant drawback of the Tupian comparative literature. Although much of the work by Rodrigues uses Old Tupi (or some other conservative/well-attested language) as a kind of proxy for PTG, we were unable to find either PTG *oβapo ‘to appear’ (Rodrigues, 2007, p. 186), or any presumed reflex of this etymon, after searching the PTG vocabulary of Mello (2000) and the extensive lexical documentation of conservative languages such as Old Guaraní (Restivo, 1893 [1722]) and Old Tupi (Drummond, 1952–1953). These are strong enough reasons to believe that Rodrigues’ (2007) PTG *oβapo is a lexical ghost.

11 Rodrigues (2007, p. 182) and Rodrigues and Cabral (2012, p. 508) also reconstruct word-initial instances of PT *kʷ in some cognate sets, which are allegedly preserved in the Awetí–Guaraní languages as Aw kw, PTG *kʷ. Meira and Drude (2015) show that the structure (*)kʷV in these languages continues earlier sequences of the type *koV, which may or may not result from elision of an erstwhile intervocalic consonant. Some examples are PMG *kocap ‘to pass’ > Aw kwap, PTG *kʷap (Meira & Drude, 2015, p. 294); PT *ŋgəat ‘sun’ > Aw kwat, PTG *kʷat (this paper). It is, therefore, doubtful that PT had *kʷ word-initially.

12 Cf. the original: “. . . AR [= Arikém branch] Karitiána ŋa (a vogal desta forma corresponde à do Makuráp, mas não à do étimo **ŋko, TP [= Tuparian branch] Kepkiriwat go, mas Makuráp ke (e não é o reflexo regular de **o nesta língua) . . .” (Rodrigues, 2005, p. 40). Note that Rodrigues (2005) takes Kepkiriwat to be a Tuparian language, a classification with which we do not concur.

13 In principle, it is still possible that the regular Mondé reflex of PT *ə is a even after t/d. This would allow us to propose two new Tupian etymologies for Mondé roots at the expense of the etymology for ‘larva’ shown above: PT *tək ‘to pound, to grind’ > Pa -tagá ‘to smash’ (as in ɬo-dagá ‘to pound’), Gv tágá ‘to beat’; PT *ðəp ‘bitter’ > Pa [pe]ʧáp, Ar ‹petab›, Gv [pe]tɨ́ɨ̀p (note that Gv ɨ is usually derived from /a/ in diminutives). We thank an anonymous reviewer for bringing the Paiter form [pe]ʧáp ‘bitter’ to our attention.

14 It has not been previously suggested that the combination PT *e…ɨ/ɯ has special reflexes in Juruna (*a…ɨ) and Mondé (i...i). Examples include PT *ewɨt ‘bee, honey’, *ejɯ ‘marico bag’, *nẽcɯk ‘horsefly’ and, in Juruna only, *mẽpɨt ‘son (female ego)’, which are reflected as PJu *awɨɮ-á, –, *nãtɨ́k-á, *mãbɨ-a; Aruá ‹ivirej ~ ividei›, ‹itji›, ‹digá› (but *mẽpɨt > ‹mambid›).

15 Note that we do not claim that the phonetic values of these segments were necessarily a back and a central unrounded vowel, respectively. At this time, other interpretations (such as */ɨ/ vs. */ɪ/) cannot be discarded.

16 If Kr ci ‘liquid’ and Pu ʃi ‘chicha; blood, menstruation’ turn out to be related, they would constitute an exception. However, it appears possible to derive them from PT *jɨ ‘urine’ (> ‘liquid’).

17 This root has a plausible cognate in Karitiana, kɨpeet, but the vowel of the first syllable is in any case irregular.

18 Rodrigues (2007, p. 190) lists Tu oir-pot as a cognate, which could be a counterexample to our proposal. However, PT *mb is expected to be preserved as p in Tuparí, casting doubt on the validity of the etymology.

19 Semantically, Kr ci ‘liquid’ and Pu ʃi ‘chicha; blood, menstruation’ are closer to PT *jɯ ‘liquid’, but the vowel *ɯ would be expected to yield Karo and Puruborá ɨ in this position.

20 PT *mbɨʔa/*pɨʔa ‘liver’ > PMu *pia̰ is not necessarily an exception, as in this case one might suspect that the PT absolute form with *mb- was generalized in PMu.

21 It is interesting to note that the defining innovation of this branch did not affect Kepkiriwat, an extinct language of Rondônia sometimes classified as Tuparian (cf. Hanke et al., 1958, p. 188; Rodrigues, 1999, p. 109; Galucio, 2001, pp. 5–6; Aragon, 2008, pp. 6, 10–11, 2014, pp. 3, 15, 19–20; Rodrigues & Cabral, 2012, p. 497, inter alia). The default reflex of PT *ə in Kepkiriwat appears to be o rather than e, as in ‹uóque›_R, ‹uóc›_B ‘house, village’, ‹óp›_B ‘leaf’, ‹gó›_B ‘cultivated field’ (Rondon & Faria, 1948, pp. 181, 187, 191) < PT *ək ‘house’ or maybe the first person form *o-jək ‘my house’, *jəp ‘leaf’, *ŋgə ‘cultivated field’. This suggests that Kepkiriwat is not a Tuparian language but rather forms a branch of its own. The issue awaits further investigation.

22 An anonymous reviewer suggests that this form could be a Tuparian loan. Note, however, that the correspondences are regular and non-trivial, a fact which is consistent with its being an inherited form.

23 With the verb to ‘to go.sg’, this prefix unexpectedly takes the form ere- (Franceschini, 1999, p. 234). In addition, its final vowel is raised to u before a w (wat ‘to go.pl’ → eru-wat) and undergoes regressive translaryngeal harmony, thus behaving identically to the vowel of the prefix to- ‘3crf’.

24 We were unable to locate this form in any primary source on Puruborá. However, it corresponds well to the data of other Tupian languages and thus we provisionally leave it here, given that Rodrigues (2005) may have had access to unpublished Puruborá data.

25 Rodrigues (2005, p. 40) also includes Kt s-ak ‘cave’ and Kr ek ‘inside’ in this cognate set. We were unable to locate such forms in the available sources on these languages.

26 An anonymous reviewer suggests that this form could be a Makurap loan.

27 An anonymous reviewer notes that the basic Gavião form for ‘arrow’ is idzáp or ijáp, and that adzáp ~ ajáp derives from a process of reanalysis of the initial i as an affix and its change to a.

28 Reconstruction based on Old Tupí taʔók-a (Lemos Barbosa, 1951, p. 148), Ka’apor toʔok (Kakumasu & Kakumasu, 2007 [1988], p. 30).

29 Corrêa da Silva (2010, p. 402) lists Aw ʔãpɨ̃c, PTG *apɨ̃c as cognates, but their palatal coda does not match the absence of a coda in Sateré-Mawé.

30 Arikém has a very similar root, which differs from the Karitiana one in having a labial coda: ɒ-sʉpɒβ-ɒ ‹axupáua›_B / ‹u-asupáua›_N ‘eye’, sʉpɒβ-ɒ ‹ixipáua›_B ‘seed’ (Rondon & Faria, 1948, pp. 193, 200; Nimuendajú, 1932, p. 109). It is plausibly cognate with Proto-Tuparian *jopap ‘grain’ > Proto-Core Tuparian *opap ‘grain’, *eβaopap ‘eye’ (Nikulin & Andrade, 2020, pp. 292, 297). It is likely that Proto-Arikém inherited both roots from Proto-Tupian (say, Proto-Arikém *sʉpɒ ‘eye’, *sʉpɒp ‘seed’), which were later contaminated in both Arikém languages due to their phonetic and semantic similarity: Karitiana retained only *sʉpɒ, and Arikém only *sʉpɒp.

31 Root preserved in Old Tupi i-ʔɨr-a (Drummond, 1952–1953, p. 119), Kamayurá je-ɨt (Galvão, 1953).

32 Meira and Drude (2015, p. 293) mention only Sateré-Mawé po and PTG *po (and reconstruct PMG *po). These allomorphs are found when a possessor is specified. The unpossessed form PMG *mbo is reconstructed based on Sateré-Mawé mo, PTG *mbo (Ribeiro, 2010, p. 72; Schleicher, 1998, p. 347).

33 Rodrigues (2005, p. 39) and Galucio et al. (2015, p. 255) consider Karo pá- (as in pá-peʔ ‘hand’, pá-ro ‘hands’; Gabas Jr., 1999, p. 90) and Puruborá ba ‘arm’ (Galucio, 2005, p. 167) to belong here. From a phonetic point of view, these roots are rather comparable with PMu *pa̰ ‘arm’ > Mu/Ku pa̰ (Picanço, 2019, p. 136).

34 Meira and Drude (2015, p. 293) mention only Sateré-Mawé pɨ and PTG *pɨ (and reconstruct PMG *pɨ). These allomorphs are found when a possessor of this noun is specified. The unpossessed form PMG *mbɨ is reconstructed based on Mw/Aw mɨ, PTG *mbɨ (Ribeiro, 2010, p. 75; Sabino, 2016, p. 111).

35 Root preserved in Old Tupí (Lemos Barbosa, 1951, p. 87), Parintintin (Betts, 1981), Ka’apor (Kakumasu & Kakumasu, 2007 [1988], p. 53), and other languages.

36 In all Tuparian languages except Makurap, this prefix has the allomorph õ- which occurs before consonant-initial stems. The loss of PTpr *m is not known to be a regular sound change; it can nevertheless be identified as an innovation shared by the Core Tuparian languages (Wayoró, Tuparí, Sakurabiat, Akuntsú).

37 Rodrigues (2005, p. 40) lists Kr na-cɛj as a member of this cognate set; however, na- is not a root but rather a prefix in Karo.

38 Meira and Drude (2015, p. 293) attest Sateré-Mawé ko and reconstruct PMG *ko. While the relational stem ko is indeed used in Sateré-Mawé, there is also its absolute counterpart ŋgo (Franceschini, 1999, p. 29; Ribeiro, 2010, p. 56), which prompts the reconstruction of PMG *ŋgo.

39 PMu *kaʧi ‘sun’ (Mu káʃi ‘sun/moon’, Ku kaʤi) (Picanço, 2019, p. 142) is sometimes seen as a cognate of PTG *kʷat ~ *kʷaratsɨ and Yu kuadɨ́, Xi kuazadɨ (Rodrigues, 2007, p. 192), but it might be better accounted for as an Arawakan borrowing (cf. Proto-Arawakan *kečɨ ‘sun’, as reconstructed in Payne, 1991, p. 420).

40 The rounded vowel found in Mundurukú is irregular (*pəʃí would be expected). The Kuruaya form ‹ipidy›_L, attested by Antonio Lopes da Costa (Snethlage, 1932, p. 80), shows an unrounded vowel in the first syllable of the root; guided by the PMG cognate with *o, we phonologize this Kuruaya form as /i-pɨdi/ i-pɨʤi.

41 Root preserved in Old Tupí (Lemos Barbosa, 1951, p. 132), Parintintin (Betts, 1981), and other languages.

42 Rodrigues (2007, p. 182, 2008, p. 6) includes Tuparí tek in this cognate set. However, other sources attest the Tuparí verb for ‘to pound, to grind’ as tet- (Alves, 2004, p. 258), which goes back to PTpr *ndet- (Nikulin & Andrade, 2020, p. 306).

43 The form nɨ (not given by Meira and Drude, 2015) is used both in Sateré-Mawé and in Awetí as the absolute equivalent of the relational stem tɨ. It is attested in Franceschini (1999, p. 29), Ribeiro (2010, p. 76) and Sabino (2016, p. 113).

44 This form is documented only in the Madeira dialect of Mundurukú. In the Tapajós dialect, this term has been replaced with káʃi ‘sun/moon’ (Picanço, 2005, p. 27). Interestingly, the form ‹wasuptá›_N ‘star’, attested for the Madeira dialect in Nimuendajú (1932, p. 107), has been likewise replaced with kasoptá (Picanço, 2005, p. 128).

45 We were unable to locate this form in any primary source on Gavião. However, it corresponds well to the data of other Tupian languages and thus we provisionally leave it here, given that Rodrigues (2007) may have had access to unpublished Gavião data. An anonymous reviewer remarks, however, that the Gavião form in question is composed of i ‘chicha’ and -áàp ‘convex or concave object’ and, thus, might not be cognate with the remaining forms.

Nikulin, A., & Carvalho, F. (2022). A revised reconstruction of the Proto-Tupian vowel system. Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 17(2), e20210035. doi: 10.1590/2178-2547-BGOELDI-2021-0035

Author notes

Editorial responsibility: Ana Vilacy Galúcio

AUTHORS’ CONTRIBUTIONS A. Nikulin contributed to conceptualization, data curation, formal analysis, investigation, methodology, and project administration; and F. Carvalho to conceptualization, data curation, formal analysis, investigation, and methodology.

Corresponding author: Fernando Carvalho. Museu Nacional/Universidade Federal do Rio de Janeiro. Quinta da Boa Vista, São Cristóvão, Rio de Janeiro, RJ, Brazil. CEP 20940-040 (fernaoorphao@gmail.com).

Table 1
The reflexes of PT oral vowels (after Rodrigues, 2005, p. 37).

Table 2
PT vowel inventory in Rodrigues’ (2005) proposal.

Table 3
PT vowel inventory in our proposal.

Table 4
Four correspondence sets for Tupian vowels (adapted from Rodrigues, 2005, p. 37).

Figure 1
Distribution of innovations affecting PT *ə and *ɨ.