Diversity was a theme of the 20th ISMIR annual conference held in 2019; the conference ‘tagline’ was ‘Across the Bridge’, which was taken to reflect the ‘diversity of scientific disciplines, seniority levels, professional affiliations, and cultural backgrounds’ characterising MIR as a field.1 Yet when the conference chairs invited me to give a conference keynote, they asked me to speak to insufficient diversity in two senses: they wanted insights into how to create ‘a more diverse ISMIR in terms of discipline’, and they also noted that ‘[w]e are trying hard to overcome the current bias [towards] Western male engineers’.2 This article is a revision of the keynote address that resulted from their invitation. Together, these observations suggest that the MIR community embraces diversity as a positive value: some believe that the field already embodies this value, while others consider it a goal towards which ISMIR should be moving and acknowledge that it currently has a deficit.
Diversity is one of those values perhaps too often carelessly invoked. It can also be controversial, particularly if the language of ‘diversity’ is employed in ways that occlude older concerns – notably matters of inequality, injustice or bias. In the words of influential writers, the elevation of diversity in recent public and policy debates can mean ‘that other kinds of vocabularies are no longer used,… including terms such as “equal opportunities”, “social justice”, “anti-racism” and “multiculturalism”’, terms with complex histories linked to the histories of political movements such as feminism and anti-racism. For ‘when the terms disappear from policy talk, a concern is that such histories might also disappear’ (Ahmed and Swan 2006: 96). This warning is especially salient in the current moment, when the world is reeling from the events that impelled movements like #MeToo and Black Lives Matter, as well as their sometimes violent consequences.3 Because, of course, it is not so much the events – although they matter in themselves – but their chronic, repeated nature and structural foundations that are the terrain on which efforts towards diversity must be built.
Academic and scientific fields are just as likely to host these structural foundations as other areas of intellectual, cultural and social life. They, too, are likely to be sites in which those inequalities, injustices or biases made more palatable by the term ‘diversity’ may become apparent and may need to be addressed. And if we think music is immune to such issues, then the recent furore that has arisen within the academic music theory community over accusations that it embodies and upholds a ‘white racial frame’ should give pause for thought.4
Evidence of gender imbalances in MIR is provided by Hu et al. (2016), who also describe the organisational response: the creation at the 2011 ISMIR conference of ‘Women in MIR’ or WiMIR sessions ‘in order to identify current issues and challenges female MIR researchers face, and to brainstorm ideas for providing more support to female MIR researchers’ (Hu et al., 2016: 765). A complementary perspective comes from a wide-ranging, reflexive discussion of ‘Ethical dimensions of MIR’ by Holzapfel, Sturm and Coeckelbergh (2018), which identifies bias in several senses. Among them are demographic biases stemming from the fact that the MIR community, ‘as many engineering research communities’, is characterised by researchers who ‘are typically WEIRD (white, educated, industrialized, rich, operating within democracies) (Henrich et al., 2010), from a limited set of geographical origins, and a majority is male’ (Holzapfel et al., 2018: 50). Also significant for these authors are ‘technical biases’ apparent in how ‘[d]atasets are biased towards Eurogenetic forms of music, and consequently MIR tasks are biased towards challenges that are meaningful in these idioms’ (ibid.). The consequence is that ‘[m]usic that is under-represented in MIR datasets, or that does not fit MIR tasks and evaluation measures, is unlikely to be interpreted in a semantically correct way by methods that emerge from the biased MIR community’ (ibid.).
Holzapfel, Sturm and Coeckelbergh link these observations to a further point: they trace the ‘MIR value chain’, suggesting that MIR researchers do not have clear through-lines of influence to the several ensuing stages their research feeds into: software development, product design, publishing and thence to the end-user. ‘MIR research, as most engineering research, is often not immediately involved in the following steps through the value chain. This leads to a barrier between MIR research and the higher levels of constraints on system design’ (ibid.). Certainly, this describes a dilemma; yet while their analysis is valuable, arguably it does not go far enough. They present this problem as an ethical one stemming from a delinked value chain that results in a ‘remoteness from users’ (ibid.). But that understates the principal effect of this fragmentary value chain, which is that MIR, by delivering a music-‘information infrastructure’ (Kornberger et al., 2019), acts as a perhaps unwitting participant in reproducing and favouring the normative, restricted repertoire of commercial popular music and associated types of musical expression proffered to consumers by the global digital music industries. Indeed, the authors detail one such outcome: ‘An example can be conceived of in relation to rhythm, where most MIR tools focus on common time signatures, which finds its continuation in tools within digital audio workstations’ (ibid.). I return later to the risks of a collusive relationship between MIR and the digital music industries.
My task, then, is to offer suggestions about how to diversify ISMIR and, by implication, MIR as a field. I take this as an ambitious challenge: to try to help the MIR community understand the scope, scale and depth of the undertakings entailed by responding to calls for greater diversity in the structural sense. For these challenges may not be obvious and, importantly, they are not singular but several. I frame the concerns I want to raise not by a narrow understanding of diversity but a broad one, asking: how can MIR refresh itself and its endeavours, scholarly and real world, by addressing diversity? What can diversity mean for a field like MIR?
At this point, I return to the 2019 ISMIR conference, which took place in the city of Delft, home of one of the world’s greatest painters, Johannes Vermeer. I want to draw three initial messages, or prompts to thought, from reflections on Vermeer, whose works are striking for their capacity to convey Dutch domestic life in the 17th century. Among their many virtues and innovations is to dwell sympathetically on women’s interiority, their everyday activities and their creativity – for example, when conversing with one another, when writing a letter (Figure 1), or when seated to perform at the virginal (Figure 2) – through a compelling visual humanism focused on facial expressivity and embodied experience. This prompts a first message for the MIR community: women as bearers of diverse modes of subjectivity, and subjectivity as coloured by embodied experience, which in human societies is mediated by such social differences as gender, class, race and ethnicity – so that subjectivities, experience and embodied experience are not everywhere the same.
A second fascinating message for the MIR community, and for the era of big data, stems from the fact that a mere thirty-four works are attributed to Vermeer: it is that cultural value does not equate with scale, size or ubiquity. A third message arises from the ways in which Vermeer experimented with rare pigments, with the portrayal of light, and with perspective, for he is believed to have employed optical aids like the camera obscura to achieve his most spectacular effects. What we witness in Vermeer, then, is early science in the service of art – which prompts a question for ISMIR: which masters or mistresses does the science and engineering of MIR serve?
Before proceeding, it is important to acknowledge that I write as an outsider to MIR – I am not a scientist but a qualitative social scientist – and it is perhaps foolhardy to advise colleagues whose methodologies I would be hard pressed to understand. Nonetheless, I am emboldened because in recent years I have been in dialogue with some colleagues in MIR. On the other hand, as someone who has worked for decades on the anthropology and sociology of music, media and digital cultures, I have had to confront and analyse matters of gender, class, race and ethnicity in my research, issues that arose vividly in my ethnographic studies of computer music (Born, 1995) and the BBC (Born, 2005a), as well as in a European Research Council (ERC)-funded interdisciplinary research program that I led involving ethnographic studies tracing the impact of digitization and digital media on musical practices worldwide (Born and Devine, 2016; Born, 2021). These experiences caused me to develop a theoretical account of how social relations of gender, class, race and ethnicity enter into – or mediate – the musical and media fields and institutions that my team and I researched (Born, 2012). It is on this basis that I developed the framework set out below.
To take one element of the ERC project: in a mixed qualitative and quantitative study, Kyle Devine and I showed that of the young people entering higher education in the UK to study music technology degree courses over a five-year period, 90 per cent were male, and that they came from a lower social class background and had only slightly higher representation of Black, Asian and Minority Ethnicity (BAME) students than the average demographic profile for all British undergraduate students (Born and Devine, 2015). This is a restricted gauge of humanity entering music technology, and it appears to correlate with the gender equality challenges for ISMIR, as shown also by statistics on gender representation at ISMIR that the chairs ran for the 2019 conference. In what follows I will certainly address questions of diversity in this demographic sense. However, I will do more.
I want to draw out four interrelated dimensions of diversity with which I suggest ISMIR should engage. Each has a certain autonomy and matters in itself. But they are also interrelated and together present a formidable lattice of challenges.
1) The first is the one just referred to: who gets to be a member of ISMIR, which is to say, what is the demographic makeup of MIR as a profession? Could it be more diverse? And how do the field’s feeder educational and employment structures result in these ‘typically WEIRD’, male (gendered) and white (raced) demographics? But while matters of educational and employment equality and equity are critical goals in themselves, diversity raises much more than this.
2) The second dimension of diversity is to do with whose music and which music, among the vast ocean of sounds in the world, gets to be the focus of MIR’s influential scientific practices. As shown above, it is an accepted strand of criticism within the MIR community that the techniques and parameters employed in MIR tend to derive from, and reflect, commercially dominant areas of global popular music. Yet those techniques and parameters come to be applied in powerful technologies as though they were universal, with inevitable effects that might be called ‘de-pluralizing’. Why is this the case? Could MIR be more responsive to musical diversity – which is likely to equate with social and cultural diversity? And linking back to the first dimension: might a more diverse population of MIR practitioners favour awareness of and sensitivity to a wider spectrum of the world’s musics?
3) The third aspect of diversity is directly implicated in the previous point: it concerns the foundational epistemological and ontological premises that currently undergird MIR as a field. In the face of greater musical diversity, can such premises be sustained or will they necessarily be pluralised – and therefore fundamentally challenged? How can MIR equip itself with epistemologies and ontologies of music responsive to a greater diversity of musical cultures? Might that demand new interdisciplinary partnerships, bringing areas of humanistic and social scientific music scholarship into dialogue with MIR in ways that are currently undeveloped?
4) The fourth and final dimension of diversity, an overarching question, also follows on. This returns to my earlier question, riffing on Vermeer: which masters or mistresses does MIR serve – the profit-seeking imperatives of commercial music tech corporations and online music services, entangled as they are in the recorded music industries? And which mistresses should MIR serve in order to diversify its goals, partners and worldly effects? In sum: could MIR cultivate a more plural set of orientations and institutional partners so as to include non-commercial, publicly-oriented initiatives aimed at enhancing human musical flourishing, and – given escalating anxieties about impending climate catastrophe – the need to create sustainable music economies? As one of the escalating preoccupations of our time, should this issue be foregrounded within the MIR community?
The remainder of the article elaborates on these four facets of diversity, drawing out connections between them.
Regarding the first dimension of diversity, important insights come from research in science and technology studies (STS) that probes how certain kinds of social relations come to be immanent in technological design. As the STS scholar Madeleine Akrich has argued, design is a key stage in which engineers ‘script’ envisaged uses into their technologies, in this way ‘configuring’ potential user identities and preferring certain patterns of use (Akrich, 1992). To exemplify: the workings of gender, in particular, have been probed by Nelly Oudshoorn and her colleagues, who undertook empirical research on the design of information and communications technologies (ICTs). Through comparative case studies of the design cultures of two online ‘digital cities’ developed in the Netherlands, Oudshoorn et al. found that in both cases the designers worked with an ‘I-methodology’ (Oudshoorn et al., 2004). Although the designers aimed to create technologies with all-embracing appeal and usability – to configure the user as ‘everybody’ and ‘anybody’ – a key slippage was evident in a guiding assumption that the designers themselves, and their own subjective and corporeal experiences of the technologies, represented a universal user. Since ICT designers are predominantly male, their ‘I-methodology’ hindered their ability to imagine the potential and actual diversity of the eventual population of users. The technologies resulting from these processes, emerging from gendered conditions and assumptions, embodied and entrenched existing norms – prominent among them gender norms. In their words,
‘The dominance of the I-methodology…resulted in a gender script: the user who came to be incorporated into the design of [the ICT] matched the preferences and attitudes of male rather than female users. As almost all designers were male and technologically highly competent, they made [the ICT] into a masculine technology.’ (Oudshoorn et al., 2004, p. 44)
To design technologies and interfaces that respond to real social diversity, then, Oudshoorn et al. argue that I-methodology, along with its universalising projections, must be reflexively acknowledged and consciously changed. In this way they offer a powerful cautionary tale relevant to all those fields feeding into the design of new technologies, including MIR.
In suggesting that social relations and imaginaries are scripted into technological design, I am not making an essentialist point that gender identities always determine design; nor do I imply that actual users are entirely constrained to follow the scripts inscribed in the technologies. As Akrich argues, the uses made of any technology cannot be read off design assumptions; user-configurations are not wholly determinant of actual uses. Rather, I am suggesting that wider social relations of gender, race, class and so on, on the one hand, and practices of technological design, on the other, exist in relations of mutual constitution. In other words, they mediate one another and result in the conception and design of certain kinds of technologies. That is why who gets to engineer matters. Now, there is clearly a danger that such ideas can become too crude; nonetheless, the point is that all kinds of experience – including gendered experiences, but not limited to this – will affect design paradigms. So this is not just a matter of equal opportunities for those currently marginalized, whose talents may be unrecognised by the engineering profession, although that is important in itself. It is equally about the likely benefits of enriching, by diversifying, the social ecology of research groups, and thus the collective imagination and design practices of which they are capable – so that everyone gains, potentially including end-users.
The underlying point is that scientific and engineering fields like those of MIR, which are all too easy to envision as spaces isolated from wider social forces, are in fact consequential sites in which these social forces are played out, and therefore ripe for a politics of diversity. A first reflexive challenge for MIR as a field, then, is to recognize MIR as a site in which existing cultural categories, as they relate to social inequalities, injustices and biases, are being reproduced or amplified, and that it could be otherwise. Whoever is doing the science and engineering, I-methodology and its normative drive need to be altered, and consciousness of the diversity of users and user needs – and also of musics and musical communities – should take their place, resulting optimally in more diverse knowledge and technologies.
As to the problem of attracting and educating a next, larger generation of women engineers and engineers from BAME communities, this remains a key challenge for the STEM (science, technology, engineering and mathematics) disciplines at large, and one without easy solutions. The recursive weight of history is great, as Judy Wajcman, a leading feminist scholar in STS, comments regarding the gendering of STEM:
‘In contemporary Western society, the hegemonic form of masculinity is still strongly associated with technical prowess and power.… Notwithstanding the recurring rhetoric about women’s opportunities in the new knowledge economy, men continue to dominate technical work. … These sexual divisions in the labour market are proving intransigent and mean that women are largely excluded from the processes of technical design that shape the world we live in.’ (Wajcman, 2010: 145)
For Wajcman, long-standing ideologies of masculinity are a core force behind the scarcity of women in STEM and the gendering of the cultures of STEM, suggesting that masculinity itself might benefit from being reflexively scrutinized. Is this an exercise for ISMIR and its feeder fields? Can it be avoided?
Finally, here, it is salutary to invoke the philosopher Peter-Paul Verbeek’s work on the politics and ethics of technological design, notably his paper ‘Materializing morality: Design ethics and technological mediation’ (Verbeek, 2006). Verbeek has long argued that STS should enter into direct dialogue with engineers and engineering discourses, thereby helping to foster among engineers a self-critical and self-reflexive paradigm such that ‘the ethics of engineering design… take more seriously the moral charge of technological products and rethink the moral responsibility of designers accordingly’ (379). His eloquent point is one that might inform reflections within MIR about the moral seriousness of matters of diversity.
The second dimension of diversity is the question of whose music and which music become the focus of MIR’s scientific practices and their influential applications, for example in recommendation systems. It is well accepted that, as Emilia Gómez and her colleagues put it, ‘Since the beginning of [MIR],… most of its models and technologies have been developed [on the basis of] mainstream popular music in the so-called “Western” tradition’ (Gómez et al., 2013: 111). They continue, however, that the last few years have seen ‘an increasing interest in applying available techniques to the study of traditional, folk or ethnic music’ (ibid.). This is certainly laudable, and it is clear that computational ethnomusicology has been developing as a key test-bed for opening up and diversifying the musical sounds and cultures with which MIR engages.
Recently, for example, Xavier Serra and his team have put rigorously to the test the limits of the kinds of musical sounds, knowledge and representations dealt with by MIR in the ERC-funded CompMusic project (2011–17). As they describe it: ‘we work on computational approaches to describe music recordings by emphasizing the use of domain knowledge of particular music traditions … focusing on five music cultures’ from the Maghreb, China, Turkey, North and South India (Serra, 2014: 1; Serra, 2017). ‘A target application for this work’, they continue, ‘is a system with which to browse through audio music collections of the chosen cultures; being able to discover specific characteristics of the music and relationships between different musical concepts’ (ibid.). While they assume that ‘there are universal musical concepts, like melody and rhythm’, they stress ‘that many important aspects of a particular music recording can be better understood by considering cultural specificities’ (2014: 2). They focus on non-Western ‘art music traditions, [and] thus on types of music that have been formalised and for which theoretical frameworks have been proposed for their understanding’ (2). The team consulted expert musicians and musicologists in each tradition to select recordings for the corpora being put together, and their efforts seem exemplary in these terms.
Addressing similar challenges, Olmo Cornelis and collaborators aimed to test the presumed universality of existing analytical approaches to pulse and tempo by examining how existing ‘automated tempo estimation approaches perform in the context of Central-African music’ (Cornelis et al., 2013: 1). Advocating a ‘multidisciplinary approach’, they employed a range of musicological and ethnomusicological scholarship to hone the study, from Fred Lerdahl and Ray Jackendoff to Kofi Agawu and Justin London. While they acknowledge that ‘a major difficulty is… the dominance of Western musical concepts in content-based analysis tools’ (2), it is surprising that this team was nonetheless content to check whether existing beat trackers can or cannot be used reliably for Central African music – rather than taking the cue from that music’s difference with regard to pulse and tempo and, in that light, asking: which tools might be needed to address it, and how radically must MIR’s existing conceptual and computational toolset be revised in order to tackle the salient aesthetic features of this non-Western music? There is, in other words, surely a risk of teleology in beginning with existing tools derived from the analysis of Western pop and seeing if they ‘work’ for some fragmentary, decontextualised trait extracted from the total socio-musical existence of Central African music. Indeed, this teleology is registered in their preliminary comment that ‘the research in this paper relies on existing computational tools, and does not aim to introduce novel approaches in beat tracking and tempo estimation’ (2).
Although an auto-critique of the elevation of Western pop music as a universal model for all music is beginning to develop within MIR, then, it seems that the profound challenges posed by ‘other’ musics have not yet been sufficiently registered and worked through (Born and Hesmondhalgh, 2000). One kind of challenge stems from all those acoustic, electronic and computer art, popular and folk musics in which melody, harmony, tempo and rhythm do not capture key aesthetic features, which are likely to include qualities such as timbre, ‘gesture’, microtonality, melisma, spatialisation (Smalley, 1986, 2007), and the rhythmic subtleties identified by Steven Feld and Charles Keil as ‘groove’ or ‘participatory discrepancies’ (Keil, 1987; Keil and Feld, 1994). The problem is that decades after the musicological debate began over these less readily quantized and notated, multidimensional aesthetic qualities, musicology still has great difficulty analysing them. Perhaps the musicologist Anne Danielsen’s work – which combines quantitative and qualitative research on microrhythm in groove-based popular musics (Danielsen, 2010, 2012) – casts light and might be adapted into computational analytical tools of great subtlety. But this seems to be the scale of the problem. It is also notable that, in reaching across disciplines, there is a risk of going backwards into the future: the work of the ethnomusicologist Alan Lomax (Lomax 1968, 2003), for example, is cited approvingly in some computational ethnomusicology; but his reductive quantitative approach to the analysis and classification of non-Western musics has long been controversial and criticised within his own field (Feld, 1984).5
The points made so far about the challenges posed by non-Western and other musics have been limited to their subtle intra-musical features. When it comes to broader cultural understandings of these musics, the problems for MIR are of another order, and can include the very assumption that such musics can or should be represented and made available globally through recordings, and digital corpora, at all.
The nature of these problems can be conveyed by a case study from my ERC research program carried out in North India by Aditi Deo (Deo, 2021). Deo wanted to investigate the digital recording and archiving of North Indian folk musics, particularly music from low caste communities, part of a current wave of archiving initiatives in the region stimulated by ideas of cultural heritage and facilitated by the ease of digital recording. In Rajasthan one focus of her work was a cultural collective called Lokayan run by high caste young male activists. Lokayan was in partnership with a Bangalore-based organization called the Kabir Project led by middle class artists and intellectuals, funded mainly by national and international charities and development agencies. Together, Lokayan and the Kabir Project were digitally recording a number of elderly, illiterate, low caste hereditary women folk singers renowned for their repertoire of songs devoted to Kabir, a 15th century Hindu mystic poet and saint popular among low caste communities. Through a secular pluralist reinterpretation of Kabir, the Kabir Project sought to use ordinary people’s devotion to Kabir’s poetry and related folk music to attract local people to a politics of secular nationalism. However, among the low caste adherents, the saint and related cultural and musical traditions have a different significance, for they are associated with resistance to caste-based discrimination and, thus, with opposition to the pervasive inequalities of caste and class to which these groups are subject. Hence, the very use that the Kabir Project was making of Kabir and related folk traditions entailed a wilful erasure of the affective and political meanings of the saint and the music for local low caste communities.
One of the most renowned singers in the local community was the blind, elderly woman folk singer Gavara-devi Gosayi. In a first film clip, shot by Deo in 2012, Gosayi performs live at an annual festival called the Kabir Yatra. The sheer pleasure and joy of the audience, drawn largely from surrounding rural communities, is palpable as Gosayi and her musicians perform. The second clip, in contrast, shows Gosayi singing, accompanied by her own harmonium playing as well as two young male percussionists, inside a local recording studio where she was being recorded by Lokayan activists. Gosayi is visibly uncomfortable; she had never experienced recording before these recording sessions, and she preferred to play live performances out of doors for her usual audience – local followers of Kabir. What comes across is how social inequalities of gender, caste and age permeate the recording session, shaping the sociality of the studio and its musical results. As Deo notes in her analysis of the sessions, the young men overrode the singer on focal aesthetic matters: ‘Critical aspects of the energy of this genre came from its improvisatory form, interactions between vocalist and instrumentalists, and the open-air contexts of its customary performance. The cramped studio, and Gavara-devi’s unfamiliarity with studio techniques, skewed the recording process…. The recording studio emerged… as a space of negotiation over musical sounds and technical practices between those with unequal social status and power’ (Deo, 2021: 17). In Lokayan recording sessions, then, social relations of gender, caste and age influenced both the recording process and the ensuing musical sounds. Gosayi’s music was made for live performance, not recording, and she was not in a position to understand the implications of recording in the sense either of the sounds being abstracted, frozen, commodified, and lifted into global circulation online, or of any potential monetary reward.
In general, ideas of individual authorship, ownership and copyright are quite alien to these musicians.
It was when we took this research to an anthropology conference in Delhi that we became aware of the intensity of the issues that it raises. For a clamorous debate occurred about all digital archiving projects focused on Indian folk musics, coming as these musics do from low caste and ‘tribal’ groups. The accusation made by critics of these practices was deeper than musical accuracy, cultural sensitivity or even turning this music into a commodity. It was, rather, that the very disembedding of this music from its live, local communities of practice by recording perpetrated a form of ontological violence, an ontological violence fuelled by the social, cultural and economic distance between local musicians – whose music is profoundly embedded in communal socialities and religious cosmologies – and development-aid-funded activists, and a violence that those who value and respect Gosayi, her music and her community must call out.
So the challenge posed to MIR by non-Western musics is not just to get the rhythmic or timbral analysis right, or to take the cues about musical difference from such musics and not impose inappropriate musical values, qualities and systems on them. It is also to recognize that all musics – and most spectacularly those non-Western musics that have as yet resisted incorporation into the global archives of digitized recorded music – have an ontology, that those ontologies are plural and often deeply social as well as religious or cosmological (Bohlman, 1999; Born, 2013; Sykes, 2019), and that they may be antithetical to, or profoundly different from, the universalised music ontologies assumed by MIR and companion disciplines. This poses ethical and political tests akin to those posed by discourses of sustainability – but here, tests of musical and cultural sustainability.
The third aspect of diversity follows on: it concerns certain epistemological and ontological assumptions underpinning MIR as a field, which turn on the way that ontology is understood. If MIR intends to embrace a wider diversity of musical cultures, can such assumptions be maintained? How could MIR equip itself with tools suited to analysing and modelling a greater diversity of musics?
These questions arise from the very different approaches to analysing ontologies of music that characterise MIR, on the one hand, and contemporary musicology and ethnomusicology, on the other. In computer and information sciences, ‘an ontology defines a set of representational primitives with which to model a domain of knowledge or discourse. The representational primitives are typically classes (or sets), attributes (or properties), and relationships (or relations among class members)’ (Gruber, 2009: 1). Similarly, for music informatician Darrell Conklin, applying this approach to a corpus of Basque folk songs, ‘an ontology is an encoding of concepts and their relations in a domain of knowledge’ (Conklin, 2013: 162). In these disciplines, ontology is therefore conceived in terms of modelling concepts, knowledge and representation. In marked contrast, in contemporary musicology and ethnomusicology, such an approach would be identified with the analysis and modelling of an epistemology of music – that is, conceptual knowledge about a certain music. For in these disciplines, ontologies of music are considered to exceed knowledge and representation. If, in philosophy, ontology is ‘the study of what there is’ (Hofweber, 2020), or ‘the study of being or existence’ (Gruber ibid.), then contemporary musicology and ethnomusicology adhere more closely to this approach when analysing ontologies of music: they are interested in analysing ‘what music is’ in any particular musical culture in terms that include but exceed knowledge and representation by taking into account the embodied, social and material aspects of any musical culture (Bohlman, 1999; Born, 2005b, 2013).
Currently, MIR takes a range of digital data as an approximation of the contours of a musical culture. In this sense MIR itself embodies a theory of ‘what music is’, or what might be called an analytical ontology (Born, 2010: 232): one that assumes that music can universally be represented by datasets of digital sound recordings, perhaps with added digitised scores or other kinds of metadata, various kinds of knowledge and representation, perhaps with human annotations, and that such representations capture the most salient aesthetic and ontological features of all musics. But we have just seen through Deo’s Rajasthan case study that this is not the case. For the music of Gavara-devi Gosayi and her community is immanently bound to live performance, to the socialities engendered in such performance situations, and to the cosmologies and poetry associated with a 15th century poet-saint as they are infused with a politics of resistance to caste oppression. All of these qualities – this rich embodied, social and material assemblage – together constitute the actors’ ontology of music, and many of them are shed once the sounds are captured by recording, digitised and put into circulation on the internet. So what gets lost in MIR’s epistemological and ontological assumptions? To presume MIR’s theory of music is to foreclose on many living musics, and to prioritise ontologically those sounds that, through recording, have been disembedded from originating bodies, socialities and locales – a process that R. Murray Schafer named ‘schizophonia’: how recording splits sounds from their originating sources (Schafer, 1994).
Turning this around, the question arises for the MIR community: if more musical diversity is sought, and if respecting the musical ontologies of the source communities is ethically responsible and aesthetically desirable, then what kind of knowledge practices might support MIR to analyse and model these kinds of musical cultures ‘as a whole’ (Serra, 2017: 1), or at least more adequately and less reductively? Would it not make sense to consider whether there are ways to bring the cultural, social and material dimensions immanent in diverse ontologies of music into MIR’s analytical frame? This means going far beyond thinking of ‘a musical culture as a stylistically coherent musical repertory’ that can only be accessed via ‘available digital traces’ (op cit. 2). To tackle this challenge would mean prizing open those base epistemological and ontological premises of MIR through close dialogue with those music scholars and disciplines whose specialism is the analysis of music’s cultural, social and material as well as sonic dimensions: that is, music anthropologists and sociologists. Implicit in this move would be another challenge: to break with ‘mentalist’ conceptions of music, an abstraction that is perhaps so axiomatic in computation, given its long-standing links to cognitive and information-theoretical ontologies, that it has become second nature in MIR.
What, then, if we began again, refreshing MIR by building a new kind of relationship between the field and music anthropology and sociology? And by taking the terms of the ontology at issue from each musical culture, rather than by squeezing it into the existing template of what can most readily and efficiently be formalised, computationally represented and modelled? Doing this entails recalibrating how interdisciplinarity proceeds in this field. Instead of taking computer and information sciences to be keystone disciplines around which other disciplines revolve and to which they are subordinate, another approach would entail enabling the distinct disciplines to dialogue without hierarchy, and asking music informaticians to consider alternative epistemological and ontological grounds, perhaps by inventing novel hybrids of qual-quant modelling in ways as yet unimagined and unforged.
Such a reshaped interdisciplinary practice is actually envisaged in a UK Economic and Social Research Council-funded research project on interdisciplinarity that I co-directed, which examined empirically different kinds of interdisciplinary practice in several major interdisciplinary fields and adduced three basic forms (Barry and Born, 2013a, 2013b). The first, which we call the ‘additive’ or ‘synthesis’ mode of interdisciplinarity (and which is close to what is often called multidisciplinarity), involves bringing different disciplines to the table and allowing each to contribute as they are, without any of them being changed. The second, the ‘subordination’ or ‘service’ mode of interdisciplinarity, is akin to the situation now in MIR: a core discipline or disciplines, the computer and information sciences, supervise inputs from other, subordinate disciplines, so that a ‘dash’ of the social or cultural may be added to the framework without this disturbing the premises of the master disciplines. This subordination mode is a common way in which the physical and natural sciences bring in aspects of the qualitative humanities and social sciences; and this is what MIR is doing when it adds a touch of ethnomusicology to its research, but without this threatening to alter or disturb its core epistemological and ontological premises. In effect, nothing much need change. In contrast, the third mode, the ‘agonistic’ mode of interdisciplinarity, is the most promising because in this mode there is no hierarchy, and the potential is that all contributing disciplines might change through mutual transformations and the genesis of entirely unforeseen, novel methodologies and theories.
This third, agonistic mode of interdisciplinarity therefore takes the form neither of a synthesis nor a hierarchy. Rather, it is driven by an agonistic relationship to existing forms of knowledge – that is, by a common sense of the problematic limits of established disciplines. What we are highlighting is how agonistic interdisciplinarity stems from a collective desire to contest or transcend the prevailing epistemological and ontological assumptions of given or established disciplines through innovative knowledge practices that aspire to render the new hybrid interdiscipline irreducible to the simple addition of its antecedent disciplines – in this case, MIR plus music anthropology/sociology.
The leading information theorist Geoffrey Bowker portrays something akin to this mode of interdisciplinarity as key to the ‘new knowledge infrastructures’ demanded in the present. As he puts it, ‘The epistemic cultures of the academy all have their own historical “ways of knowing”… [But today], the objects of study… require the triangulation of multiple methodologies, both qualitative and quantitative, and call upon… investigators to integrate multiple epistemic viewpoints’ (Bowker, 2018: 207). Indeed, in finding inventive ways to overcome or finesse the qual-quant divide, an agonistic interdisciplinarity between computational musicology and music anthropology and sociology could prototype new methodologies that are urgently required by the digital humanities in general.
Two further points follow. Such a new interdisciplinarity is not limited to music anthropology and sociology. If they can bring social, cultural and material dimensions of music to MIR, then psychoacoustics can bring auditory perception – and more generally, the natural sciences of music can bring greater acuity in analysing features relevant to cognition (Aucouturier and Bigand, 2013) – and music analysis can throw light on higher dimensions of musical structure. And the benefits potentially flow both ways: these disciplines will in turn be nourished by what MIR brings to the interdisciplinary exchange. To take two examples: MIR can ‘bring unprecedented signal-processing sophistication to cognitive neuroscience and psychology’ (ibid.: 495); while the Tarsos platform developed by Joren Six, Olmo Cornelis and Marc Leman offers tools for enhancing the analysis of pitch organization beyond prevalent concepts in Western music theory – ‘octave equivalence, stability of tones, equal tempered scale and so on’ (Six et al., 2013: 126). Tools like Tarsos respond to the extraordinary diversity of pitch distribution in non-Western classical and folk musics and some areas of 20th century art music; and research of this kind conveys vividly how much music theory has to gain, in terms of diversifying its conceptual range and resources, from dialogues with MIR.
I come finally to the fourth dimension of diversity: the question of which masters or mistresses MIR serves. MIR is an international field based mainly in academia, but it has strong links to industry and the burgeoning start-up ecology of music AI. My sense is that its culture is ambiguous, oriented both by the ethos of academia and by responsiveness to the commercial goals of the digital music economy. It is as though, fortuitously, MIR serves both; and this seems to go along with certain commercial precepts being transferred almost unconsciously into the field. In an area of application like music recommendation, the focus is on a series of goals – attracting and retaining consumers, increasing user engagement, boosting revenues – that find their way into scientific analysis and matters of design. So just as a theory of music is manifest in MIR, where music is conceptualized, after Western pop, as everywhere taking the form of a ‘track’ or ‘song’, a theory of the human subject is built into recommendation – a human subject who is existentially overwhelmed by the scale of the global digital music archive, whose evolving taste is structured by a preference for ‘similarity’, who is individualized, and who seeks to maximize her/his listening events (Born et al., 2020). Built into machine learning applications, these models are likely performatively to shape, rather than merely reflect, listener practices (Prey, 2018): another potential reduction of human cultural diversity.
Against this background I want to ask: which mistresses should MIR serve in order to diversify its goals, partners and worldly effects? If MIR’s pursuit of scientific research oriented to technological innovation often comes to be tied, directly or indirectly, to the drive for economic growth, then the escalating criticisms of the FAANG6 corporations along multiple vectors (among them transparency, accountability, privacy and security), and parallel concerns about sustainable (music) economies (Devine, 2019), remind us of the urgent need for other goals and values to guide future science and engineering. We might ask: what would computational genre recognition and recommendation look like if, under public-cultural or non-profit imperatives, the incentives driving them aimed to optimise musical self- or group-development, linked to goals of human flourishing (Nussbaum, 2003; Hesmondhalgh, 2013)? Or if they aimed to foster not a logic of ‘similarity’ but diversity? Or if they were built to enhance the potential cultural and social as well as musical riches and benefits of music discovery? Or if, rather than honing normative models of genre, computational genre systems could be tuned so as to respond to and give insight into different individual and collective perspectives on genre? Such questions would imply rendering the interface both legible or transparent and modifiable; it might mean enabling the user to call up and browse among the genre systems – that is, the musical universes – of, say, Angélique Kidjo, Kim Gordon or Diamanda Galás, of Indonesian noise music scenes or Canadian First Nation Country musicians. Being less normative may be less commercially viable, but it might well enhance and enliven our music-computational tools, experiences and futures, while empowering users and responding to, and stimulating, the sociable nature of our musical lives.
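One narrow, technical reading of the question about fostering diversity rather than ‘similarity’ can be sketched in code. The following Python sketch uses greedy re-ranking of the kind commonly known as maximal marginal relevance: each next item is chosen by trading its relevance against its similarity to items already chosen. The track names, relevance scores, feature vectors and the `diversity_weight` parameter are all invented for illustration; this is not the method of any system discussed here, and it addresses only acoustic or feature-level variety, not the cultural and social senses of diversity at stake in this article.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(candidates: Dict[str, Tuple[float, List[float]]],
           k: int, diversity_weight: float) -> List[str]:
    """Greedily pick k items, penalising similarity to items already picked.

    candidates maps a name to (relevance, feature_vector).
    diversity_weight = 0.0 reproduces a pure relevance ranking.
    """
    remaining = dict(candidates)
    selected: List[str] = []
    while remaining and len(selected) < k:
        def score(name: str) -> float:
            relevance, vector = remaining[name]
            max_similarity = max(
                (cosine(vector, candidates[s][1]) for s in selected),
                default=0.0,
            )
            return ((1 - diversity_weight) * relevance
                    - diversity_weight * max_similarity)

        # Sorting the names first makes tie-breaking deterministic.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        del remaining[best]
    return selected


# Hypothetical data: "b" is a near-duplicate of "a"; "c" is quite different.
tracks = {
    "a": (0.90, [1.0, 0.0]),
    "b": (0.85, [0.99, 0.1]),
    "c": (0.60, [0.0, 1.0]),
}
print(rerank(tracks, k=2, diversity_weight=0.0))  # relevance only
print(rerank(tracks, k=2, diversity_weight=0.5))  # relevance traded against similarity
```

Exposing a parameter such as `diversity_weight` to the listener, rather than fixing it behind the interface, would be one modest step towards the legible and modifiable systems envisaged above.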
Diversity has many potential meanings. It is often understood in terms of the social makeup of a profession or discipline – whether MIR or music anthropology. And this certainly matters. But one of the ways it matters is by fostering a population of practitioners harbouring a more variegated cache of cultural and musical experiences to inform practice – in MIR, ways of computationally analysing and modelling music. This article has set out four key interrelated components of diversity relevant to MIR, each of which has an autonomy, while together they add up to a series of potential interlocking changes.
For progress to occur, it should be clear by now, the new forms of interdisciplinarity envisaged should have ambitions not only of epistemological and ontological kinds but of ethical and social kinds. To be clear: this is not a call to reinvent an already discredited wheel – one example is the repeated return to data-rich but impoverished conceptions of the social in the lineage linking Adolphe Quetelet, the early 19th century inventor of empirical social research and of the idea of ‘social physics’ (Donnelly, 2015; Adolf and Stehr, 2018), through Gabriel Tarde, the early 20th century sociologist who held that ‘society is imitation’ fuelled by collective flows of affect (Tarde, 1903),7 to the ‘social physics’ espoused by MIT’s Alex Pentland (Pentland, 2014).8 Today’s exponents of this lineage risk repeating conceptual and methodological errors while neglecting the abundant resources of contemporary social theory. It is essential, then, to avoid resuscitating outdated paradigms, and a good way to do this is to create an ‘agonistic’ interdisciplinarity integrating current thinking in relevant disciplines: to put MIR into interdisciplinary dialogue with today’s ethnomusicology, music anthropology and sociology.
The time is ripe for sustained interdisciplinary engagements in ways previously untried, and the new hybrid knowledge forms suggested in this article demand cumulative and coordinated efforts. The recent concept of ‘responsible innovation’, the title of a journal founded in 2014, offers a stimulus; it foregrounds the benefits of reflexivity, inclusion and responsiveness in emerging technological design, suggesting that these are favoured by ‘an open organisational culture, emphasising innovation, creativity, interdisciplinarity, experimentation and risk taking; … [and] commitment to public engagement and to taking account of the public interest’ (Stilgoe et al., 2013: 1573). This suggests the importance of creating new institutional ecologies, collective efforts within which such values can be cultivated, for example when transforming MIR through the four facets of diversity. And this in turn prompts a call for action: an invitation to MIR colleagues to join me and others from relevant disciplines in forming a think tank or similar initiative to develop and take forward these ideas. Think tanks in this vein have recently arisen to address ethical issues surrounding the development and application of AI.9 Music informatics, given its prominent position in the ongoing evolution of the data and computational sciences as they affect culture, surely deserves its own initiative of this kind.
1 See the statement about the conference posted on 16 July 2019: https://transactions.ismir.net.
3 I wrote this article (in late August 2020) as the protests erupted in Kenosha, Wisconsin, over the shooting of Jacob Blake by a police officer, one of a series of acts of police violence in the USA that inflamed the Black Lives Matter movement.
4 The paper that set this controversy in motion was written by Dr. Philip A. Ewell, originally a plenary talk at the 2019 Society for Music Theory meeting: https://mtosmt.org/issues/mto.20.26.2/mto.20.26.2.ewell.html. For a journalistic account of what ensued, see: https://www.insidehighered.com/news/2020/08/07/music-theory-journal-criticized-symposium-supposed-white-supremacist-theorist.
5 On the history of responses to Lomax’s cantometrics, see Savage (2018).
7 On Quetelet, Tarde and their relationship in relation to the history of criminology, see Beirne (1993).
8 On ‘social physics’ in Quetelet and Pentland, and the connections between them, see Adolf and Stehr (2018).
I am grateful for dialogues with Bob L. T. Sturm, Fernando Diaz and others from the ‘AI, Recommendation, and the Curation of Culture’ workshop in Paris, October 2019, notably Jeremy Morris and Ashton Anderson. I am also grateful to Emilia Gómez for conversations, information and for inviting me, with Cynthia Liem, to give a keynote at the 20th ISMIR conference. My thanks, finally, to Andrew Barry and Kyle Devine, with whom many of these thoughts were incubated.
The author has no competing interests to declare.
Adolf, M., & Stehr, N. (2018). Information, knowledge, and the return of social physics. Administration and Society, 50(9), 1238–1258. DOI: https://doi.org/10.1177/0095399718760585
Ahmed, S., & Swan, E. (2006). Doing diversity. Policy Futures in Education, 4(2), 96–100. DOI: https://doi.org/10.2304/pfie.2006.4.2.96
Aucouturier, J.-J., & Bigand, E. (2013). Seven problems that keep MIR from attracting the interest of cognition and neuroscience. Journal of Intelligent Information Systems, 41(3), 483–497. DOI: https://doi.org/10.1007/s10844-013-0251-x
Barry, A., & Born, G. (2013a). Introduction – Interdisciplinarity: Reconfigurations of the social and natural sciences. In A. Barry & G. Born (Eds.), Interdisciplinarity: Reconfigurations of the Social and Natural Sciences, pages 1–56. London and New York: Routledge. DOI: https://doi.org/10.4324/9780203584279
Barry, A., & Born, G. (2013b). Interdisciplinarity: Reconfigurations of the Social and Natural Sciences. London and New York: Routledge. DOI: https://doi.org/10.4324/9780203584279
Born, G. (1995). Rationalizing Culture: IRCAM, Boulez, and the Institutionalization of the Musical Avant-Garde. Berkeley, CA: University of California Press. DOI: https://doi.org/10.1525/9780520916845
Born, G. (2005b). On musical mediation: Ontology, technology and creativity. Twentieth-Century Music, 2(1), 7–36. DOI: https://doi.org/10.1017/S147857220500023X
Born, G., & Devine, K. (2015). Music technology, gender, and class: Digitization, educational and social change in Britain. Twentieth-Century Music, 12(2), 135–172. DOI: https://doi.org/10.1017/S1478572215000018
Born, G., & Devine, K. (2016). Gender, Creativity and Education in Digital Musics and Sound Art. Special issue, Contemporary Music Review, 35(1). DOI: https://doi.org/10.1080/07494467.2016.1177255
Born, G., Morris, J., Diaz, F., & Anderson, A. (2020). Artificial intelligence, recommendation, and the curation of culture. Report from the conference Artificial Intelligence, Recommendation, and the Curation of Culture. Paris, France, 4–5 October, funded by the Canadian Institute for Advanced Research (CIFAR).
Bowker, G. (2018). Sustainable knowledge infrastructures. In N. Anand, A. Gupta & H. Appel (Eds.), The Promise of Infrastructure, pages 203–222. Durham, NC: Duke University Press. DOI: https://doi.org/10.1215/9781478002031-009
Conklin, D. (2013). Antipattern discovery in folk tunes. Journal of New Music Research, 42(2), 161–169. DOI: https://doi.org/10.1080/09298215.2013.809125
Cornelis, O., Six, J., Holzapfel, A., & Leman, M. (2013). Evaluation and recommendation of pulse and tempo annotation in ethnic music. Journal of New Music Research, 42(2), 131–149. DOI: https://doi.org/10.1080/09298215.2013.812123
Danielsen, A. (2012). The sound of crossover: Micro-rhythm and sonic pleasure in Michael Jackson’s ‘Don’t Stop ‘Til You Get Enough’. Popular Music and Society, 35(2), 151–168. DOI: https://doi.org/10.1080/03007766.2011.616298
Deo, A. (2021). Oral traditions in the aural public sphere: Digital archiving of vernacular music in North India. In G. Born (Ed.), Music and Digital Media: A Planetary Anthropology. Durham, NC: Duke University Press.
Devine, K. (2019). Decomposed: The Political Ecology of Music. Cambridge, MA: MIT Press. DOI: https://doi.org/10.7551/mitpress/10692.001.0001
Donnelly, K. (2015). Adolphe Quetelet, Social Physics and the Average Men of Science, 1796–1874. London and New York: Routledge. DOI: https://doi.org/10.4324/9781315653662
Gómez, E., Herrera, P., & Gómez-Martin, F. (2013). Computational ethnomusicology: Perspectives and challenges. Journal of New Music Research, 42(2), special issue on ‘Computational Ethnomusicology’, 111–112. DOI: https://doi.org/10.1080/09298215.2013.818038
Gruber, T. (2009). Ontology. In L. Liu & M. Tamer Özsu (Eds.), The Encyclopedia of Database Systems. Berlin: Springer-Verlag. Accessed 29 August 2020. http://tomgruber.org/writing/ontology-definition-2007.htm. DOI: https://doi.org/10.1007/978-0-387-39940-9_1318
Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33(2–3), 61–83. DOI: https://doi.org/10.1017/S0140525X0999152X
Hofweber, T. (2020). Logic and ontology. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy. Accessed 31 August 2020. https://plato.stanford.edu/archives/sum2020/entries/logic-ontology/
Holzapfel, A., Sturm, B. L., & Coeckelbergh, M. (2018). Ethical dimensions of music information retrieval technology. Transactions of the International Society for Music Information Retrieval, 1(1), 44–55. DOI: https://doi.org/10.5334/tismir.13
Hu, X., Choi, K., Lee, J. H., Laplante, A., Hao, Y., Cunningham, S. J., & Downie, J. S. (2016). WiMIR: An informetric study on women authors in ISMIR. Proceedings of the 17th International Society for Music Information Retrieval (ISMIR) Conference. New York, USA.
Keil, C. (1987). Participatory discrepancies and the power of music. Cultural Anthropology, 2(3), 275–283. DOI: https://doi.org/10.1525/can.1987.2.3.02a00010
Kornberger, M., Bowker, G., Elyachar, J., Mennicken, A., Miller, P., Nucho, J., & Pollock, N. (Eds.) (2019). Thinking Infrastructures. Bingley, UK: Emerald. DOI: https://doi.org/10.1108/S0733-558X201962
Oudshoorn, N. E. J., Rommes, E., & Stienstra, M. (2004). Configuring the user as everybody: Gender and design cultures in information and communication technologies. Science, Technology & Human Values, 29(1), 30–63. DOI: https://doi.org/10.1177/0162243903259190
Prey, R. (2018). Nothing personal: Algorithmic individuation on music streaming platforms. Media, Culture and Society, 40(7), 1086–1100. DOI: https://doi.org/10.1177/0163443717745147
Savage, P. E. (2018). Alan Lomax’s cantometrics project: A comprehensive review. Music & Science, 1, 1–19. DOI: https://doi.org/10.1177/2059204318786084
Serra, X. (2014). Creating research corpora for the computational study of music: The case of the CompMusic Project. In Audio Engineering Society 53rd International Conference: Semantic Audio (article number 1–1). New York, NY: Audio Engineering Society.
Six, J., Cornelis, O., & Leman, M. (2013). Tarsos, a modular platform for precise pitch analysis of Western and non-Western music. Journal of New Music Research, 42(2), 113–129. DOI: https://doi.org/10.1080/09298215.2013.797999
Smalley, D. (1986). Spectro-morphology and structuring processes. In S. Emerson (Ed.), The Language of Electroacoustic Music, pages 61–93. London: Macmillan. DOI: https://doi.org/10.1007/978-1-349-18492-7_5
Smalley, D. (2007). Space-form and the acousmatic image. Organised Sound, 12(1), 35–58. DOI: https://doi.org/10.1017/S1355771807001665
Stilgoe, J., Owen, R., & Macnaghten, P. (2013). Developing a framework for responsible innovation. Research Policy, 42, 1568–1580. DOI: https://doi.org/10.1016/j.respol.2013.05.008
Sykes, J. (2019). Sound studies, difference, and global concept history. In G. Steingo & J. Sykes (Eds.), Remapping Sound Studies, pages 203–227. Durham, NC: Duke University Press. DOI: https://doi.org/10.1215/9781478002192-013
Verbeek, P.-P. (2006). Materializing morality: Design ethics and technological mediation. Science, Technology, & Human Values, 31(3), 361–380. DOI: https://doi.org/10.1177/0162243905285847
Wajcman, J. (2010). Feminist theories of technology. Cambridge Journal of Economics, 34(1), 143–152. DOI: https://doi.org/10.1093/cje/ben057