Art by Numbers: the myth of big data in the arts

From online adverts to tackling disease, big data is crucial. But it hasn’t lived up to its potential for the cultural sector, says Michael Nabarro

Remember ‘Big Data’? In a world where information is increasingly important to everything we do, this new approach (new in 2012 terms) became a useful buzzword for data management in virtually any context. ‘Don’t have a big data strategy? Then you’re doomed to fail’ … or so the conventional wisdom seemed to say.

For most of us, of course, big data didn’t quite have such an impact. Big data as a concept (and the methods used to interrogate it) remains important in all kinds of contexts, with potentially massive real-world applications from enabling governments to tackle disease to letting Facebook show you adverts based on how much it thinks you like cats. In the cultural sector, however, it has yet to be proven as a helpful approach to data.

This isn’t to say it couldn’t be. The ACE-funded Digital R&D Fund for the Arts currently has several ‘big data’ research projects under way to experiment with some of these ideas. These projects, however, whilst they remain interesting initiatives, seem to be much more about aggregate data than big data. It seems then that, for now, the promise that the big data revolution would turn data management in the arts completely on its head has failed to come true.

In principle, big data is about analysing monumental volumes of unstructured data, requiring tools like Hadoop to break seemingly unmanageable datasets down into something useful. Yet in the cultural sector, and I suspect in many others, the datasets we’re dealing with are rarely this large. Companies like eBay, with upwards of 90 petabytes of data to analyse (90 x 1000 terabytes), need to approach their data creatively, using big data techniques to manage it effectively.

Yet for arts and cultural organisations, no useful dataset is likely to be this large. By way of reference, at Spektrix we manage all of the ticketing, marketing and fundraising data for over 180 arts organisations; even in aggregate, this raw data equates to no more than 500GB – far below big data territory.
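To put the gap in numbers, here is a minimal back-of-envelope sketch using only the two figures quoted above (90 petabytes and 500GB). It assumes decimal units, in line with the ‘90 x 1000 terabytes’ conversion; the variable names are purely illustrative.

```python
# Decimal units, matching the "90 x 1000 terabytes" conversion in the text.
GB_PER_TB = 1000
TB_PER_PB = 1000

# eBay figure quoted above: 90 petabytes, expressed in gigabytes.
ebay_gb = 90 * TB_PER_PB * GB_PER_TB

# Spektrix aggregate quoted above: no more than 500GB.
spektrix_gb = 500

# How many times larger the eBay dataset is than the aggregate arts dataset.
ratio = ebay_gb / spektrix_gb
print(f"{ratio:,.0f}")  # prints 180,000
```

On these figures the eBay dataset is roughly 180,000 times the size of the combined data for all of those organisations.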

Reviewing Nesta and Magic Lantern’s Counting What Counts report into big data in the cultural sector, Chris Unitt hits on a key issue when he makes the point: ‘I think there’s some conflation going on between the concepts of big data and data-driven decision making’. With this lack of clarity, arts organisations might find themselves encouraged to gather more data than they can usefully manage, about every area of their operation.

In theory, it’s true that arts organisations could begin to use big data-like techniques in order to measure and track every interaction with every person they have a relationship with. In museums and theatres, quantitative data about footfall and time spent in queues could be considered alongside more qualitative data about where individuals travel around the building – which posters they stand in front of, which sculpture they spend most time looking at. All of this could be merged with ticketing data, information about online purchase paths, retargeting marketing data from Google, social data from Facebook, etc., to create a vast and detailed picture of every individual. Whilst this would successfully create a complex and difficult-to-manage dataset, seemingly ideal for picking apart using big data tools, the real value of this would be tiny compared to the effort.

Whilst big data analysis is currently on too great a scale for use in the arts, data management is still crucial. The alternative approach has an inevitable (and equally misleading) title – small data. Already as much of a buzzword as its bigger brother, small data is about interrogating datasets that, whilst potentially still very large, are not on the scale of petabytes and zettabytes that defy standard data management tools. In the arts, small data is about identifying and tackling issues that are human-scale, and answering questions like: how many people opened my last post-show email and what can I do to improve this next time? Small data is about using information like this to achieve a goal – like encouraging first-time bookers to re-attend, or tracking the success of specific time-limited offers – and then measuring this in a useful, actionable way.
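Questions of this kind need very little machinery. The sketch below is one possible way to measure an email open rate and the share of bookers who come back. The function names, the record layout and the sample figures are all hypothetical, made up for illustration; they do not come from any real ticketing system or dataset.

```python
from collections import defaultdict


def open_rate(delivered, opened):
    """Share of delivered emails that were opened (0.0 if none were delivered)."""
    return opened / delivered if delivered else 0.0


def reattendance_rate(bookings):
    """Share of customers who booked for more than one distinct date.

    `bookings` is an iterable of (customer_id, event_date) pairs; this layout
    is an assumption for the sketch, not a real export format.
    """
    dates_by_customer = defaultdict(set)
    for customer_id, event_date in bookings:
        dates_by_customer[customer_id].add(event_date)
    if not dates_by_customer:
        return 0.0
    returned = sum(1 for dates in dates_by_customer.values() if len(dates) > 1)
    return returned / len(dates_by_customer)


# Made-up sample data: four customers, two of whom came back on another date.
bookings = [
    ("a", "2014-01-10"), ("a", "2014-03-02"),
    ("b", "2014-01-10"),
    ("c", "2014-02-14"), ("c", "2014-02-14"),  # duplicate row, same visit
    ("d", "2014-04-01"), ("d", "2014-05-20"),
]

print(open_rate(delivered=1200, opened=300))  # prints 0.25
print(reattendance_rate(bookings))            # prints 0.5
```

Both measures run comfortably on a laptop over a few thousand rows, and each one maps directly onto an action: rewrite the email, or follow up with the first-time bookers who have not yet returned.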

The data management tools we have in the arts are already under-used. The other articles in this Art by Numbers series focus on the practical, everyday tips that we can put to work in the arts, making better use of our data using simple but forward-thinking techniques. Let’s not get bogged down in the myths around big data. There’s smaller data we should be interested in.

Michael Nabarro is co-founder and managing director of Spektrix