Building a Corpus of Ancient Venetic

For the past few weeks I’ve been doing something which has been rewarding and frustrating in almost equal measure (in between some weekends visiting family in Sussex and Yorkshire, with some Roman sites and a grinning she-wolf thrown in): namely, building up a complete corpus of all the extant inscriptions in the ancient Venetic language.

In principle, this is a simple exercise. The last full collection of the Venetic inscriptions was written in the 1970s, but plenty more inscriptions have been found since then. So, starting with the 1970s volume, I add each text to a spreadsheet, making sure to include all its metadata (such as the material it’s written on, the kind of inscription it is, whether it contains a personal name, and so on). I also copy out the wording of each text, so that I’ll be able to search them for words or phrases later if I want to. Then I have to try to work my way forwards through all the subsequent finds, adding them into the spreadsheet as new inscriptions are found or (sometimes) as the interpretation of an older inscription changes drastically because of new knowledge of the language. Sometimes more than one book has included the same inscriptions, and so one inscription might have been given several different numbers – I need to include all of those too, so that I can figure out which inscription an author is talking about no matter what numbering system they’re using.
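For anyone who would rather script this than keep it in a spreadsheet, here is a minimal sketch in Python of what one row and the cross-numbering lookup might look like. It is not my actual spreadsheet: the field names, the edition labels ("Edition B", "Edition C"), the numbers attached to them and the placeholder wordings are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Inscription:
    """One row of the corpus. All field names are illustrative assumptions."""
    corpus_id: str                  # primary number, e.g. "Es 1"
    text: str                       # transcribed wording
    material: str = "unknown"
    kind: str = "unknown"           # the kind of inscription
    has_personal_name: bool = False
    # Numbers given to the same inscription elsewhere: edition -> number
    other_numbers: dict = field(default_factory=dict)


def build_concordance(corpus):
    """Map (edition, number) pairs to the primary corpus_id.

    Keying on the pair matters because the same bare number can mean
    different inscriptions in different editions.
    """
    index = {}
    for ins in corpus:
        for edition, number in ins.other_numbers.items():
            index[(edition, number)] = ins.corpus_id
    return index


def search(corpus, phrase):
    """Return the ids of inscriptions whose text contains the phrase."""
    needle = phrase.lower()
    return [ins.corpus_id for ins in corpus if needle in ins.text.lower()]


# Placeholder rows only: these are NOT the real readings or concordances.
corpus = [
    Inscription("Es 1", "placeholder wording one",
                other_numbers={"Edition B": "17"}),
    Inscription("Pa 1", "placeholder wording two",
                other_numbers={"Edition B": "3", "Edition C": "17"}),
]
concordance = build_concordance(corpus)
```

With this in place, `concordance[("Edition C", "17")]` tells me which inscription an author means, whatever numbering system they happen to be using, and `search(corpus, "two")` finds every text containing a given word or phrase.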

So basically what I’m doing is taking exciting inscriptional material like this:


And turning it into something like this:

[Image: Venetic corpus]

What is sometimes a simple and mechanical exercise can quickly become slow and frustrating. Some authors do not include all the details I want – even quite simple ones, such as an estimate of the date of the inscription. I’m often left wondering – is the date completely unknown, or do they think it’s implied by something in the previous pages? Or are they simply not interested in the chronology just now, so they’ve decided not to include it in this bit of the discussion? When they say this is on a “large dish”, do they mean a ceramic dish? Or will it turn out to be a special bronze one used for a dedication to a god or an important burial?

Finding the newer editions can be a tricky one too – in practice, it’s often better to start at the present day and work backwards, rather than starting at the 1970s and working forwards, since people will tend to refer to the articles and editions that went before them. But many of the articles are in small local journals, mostly published in Italy or France, or even in Festschriften for retiring academics, many of which had tiny print runs and are unlikely to be available anywhere online. (I’m very lucky to belong to several world-class libraries, where I can normally track things down somehow, but even using the golden Oxford-Cambridge-London triangle it’s not always possible.)

Then there’s the problem of the newest inscriptions, which have not yet been assigned a fixed number. Of course, it’s possible to refer to these as “Marinetti 2004 p5” or whatever it might be, but it’s obviously preferable for each inscription to have a number that is more memorable and recognisable. For example, Venetic inscriptions are normally numbered with a letter code, so that Es 1 is from Este and Pa 1 from Padova, and so on. If I assign a temporary number to a recently-discovered inscription, I will almost certainly have to change it – tomorrow, or next week, or next year – so is it worth the trouble? (I’ve often decided that it is, not least because other authors often seem to coincidentally decide on the same temporary numbers as I do, because we’ve all followed more or less the order in which the inscriptions were found.)

And then there are the mistakes. How do you deal with the fact that there is no Pa 32, because the last scholar somehow missed that number out? Why do there seem to be two different inscriptions both labelled Sp 1 in the same book? Are two inscriptions from Este with the same text actually the same inscription, or two separate copies? I’d have to track down photographs to tell, which is sometimes possible but takes time.
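Some of these problems can at least be flagged automatically, even if deciding what to do about them still needs a human. The sketch below assumes that every label has the shape "letter code, then number" (like Pa 32 or Sp 1); the function names and the sample list of labels are made up for the example, and real labels may well be messier than this.

```python
import re
from collections import Counter, defaultdict

# Assumed label shape: a letter code, optional space, then a number.
LABEL = re.compile(r"^([A-Za-z]+)\s*(\d+)$")


def parse_label(label):
    """Split a label such as "Pa 32" into ("Pa", 32)."""
    match = LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised label: {label!r}")
    return match.group(1), int(match.group(2))


def find_duplicates(labels):
    """Return labels that occur more than once, e.g. two texts both called Sp 1."""
    counts = Counter(parse_label(label) for label in labels)
    return sorted(f"{code} {num}" for (code, num), n in counts.items() if n > 1)


def find_gaps(labels):
    """Return, per letter code, the numbers skipped between the lowest and highest seen."""
    numbers = defaultdict(set)
    for label in labels:
        code, num = parse_label(label)
        numbers[code].add(num)
    gaps = {}
    for code, seen in numbers.items():
        missing = [n for n in range(min(seen), max(seen) + 1) if n not in seen]
        if missing:
            gaps[code] = missing
    return gaps


# Invented sample, not the real state of the corpus.
labels = ["Pa 30", "Pa 31", "Pa 33", "Sp 1", "Sp 1"]
duplicates = find_duplicates(labels)
gaps = find_gaps(labels)
```

A check like this only tells me that something is odd. Whether a gap is a number that was skipped by mistake or an inscription I have missed, and whether a repeated label is one text or two, is still a matter of going back to the books and photographs.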

And so no matter how many hours go into collecting inscriptions, the job never quite seems to be finished. Corpora also tend to be bigger than you think – a quick Google may tell you that there are only a couple of hundred Venetic inscriptions, but I can tell you I’m at 540 and counting. Many are very short indeed, but they still need to be collected and accounted for: it’s very important to be as complete as possible when working with scantily attested ancient languages.

But despite the frustrations, building a corpus like this is an essential part of how I work. Time-consuming spreadsheets make a lot of new work possible that previously would have taken much longer, but they’re also a great way of getting to grips with the material in a hands-on and active way. This kind of leg-work is central to how I learn a new ancient language, see patterns in the material and start to see inconsistencies in past explanations. I also feel that in a small way this kind of work can help out colleagues as well as my future self (and I’m happy to share, not least because colleagues often spot my mistakes for me).

Now that the corpus is in a (temporarily) finished state, I can hopefully do something with all the data very soon. For now, my goal is to present some new research from this project at the Cambridge Indo-European seminar during Michaelmas.
