White Noise: On the Limits of Openness (Living Books Mix)

Gary Hall

One of the aims of the Living Books series is to provide something of a bridge between the humanities and the sciences. In this context, this Living Book on open science takes as its starting point the recent interest in data-intensive scholarship in the humanities – a development which has gained the moniker ‘the computational turn’.


As part of this ‘computational turn’, techniques and methodologies drawn from computer science and related fields – including science visualization, interactive information visualization, image processing, network analysis, statistical data analysis, and the management, manipulation and mining of data – are being used to produce new ways of approaching and understanding texts in the humanities. In this way, the humanities are in the process of being redefined as ‘the digital humanities’. The main concern in this emergent field so far has been either with digitizing ‘born analog’ humanities texts and artifacts (e.g. making annotated editions of the art and writing of William Blake available to scholars and researchers online), or with gathering together ‘born digital’ humanities texts and artifacts (videos, websites, games, photography, sound recordings, 3D data). With respect to the latter, complex and often extremely large-scale data analysis techniques are being borrowed from computer science and related fields and applied to these humanities texts and artifacts. As examples of this approach we can mention Lev Manovich and the Software Studies Initiative’s use of ‘digital image analysis and new visualization techniques’ to study ‘20,000 pages of Science and Popular Science magazines… published between 1872-1922, 780 paintings by van Gogh, 4535 covers of Time magazine (1923-2009) and one million manga pages’ (Manovich, 2011), or Dan Cohen and Fred Gibbs’s text mining of ‘the 1,681,161 books that were published in English in the UK in the long nineteenth century’ (Cohen, 2010).


Arguably, such data-focused transformations in research can be seen as part of a major alteration in the status and nature of knowledge – an alteration that, according to the philosopher Jean-François Lyotard, has been taking place since at least the 1950s. It involves a shift away from a concern with questions of what is right and just, and toward a concern with legitimating power by optimizing the social system’s performance in instrumental, functional terms. This shift has some major consequences for our idea of knowledge, argues Lyotard:


The nature of knowledge cannot survive unchanged within this context of general transformation. It can fit into the new channels, and become operational, only if learning is translated into quantities of information. We can predict that anything in the constituted body of knowledge that is not translatable in this way will be abandoned and that the direction of new research will be dictated by the possibility of its eventual results being translatable into computer language. The ‘producers’ and users of knowledge must now, and will have to, possess the means of translating into these languages whatever they want to invent or learn. Research on translating machines is already well advanced. Along with the hegemony of computers comes a certain logic, and therefore a certain set of prescriptions determining which statements are accepted as ‘knowledge’ statements. (1986: 4)


I want to suggest that the turn in the humanities toward data-driven scholarship, science visualization, statistical data analysis, etc., is part of a broader discursive shift that is taking place at the moment, whereby greater openness, transparency, efficiency and accountability are being promoted as necessary for the proper functioning of both the academy and society.


Open Access

The open access movement is a case in point. John Houghton suggests that the open access academic publishing model, in which peer-reviewed scholarly research and publications are made available for free online to all those who are able to access the Internet, is actually the most cost-effective mechanism for scholarly publishing (Houghton, 2009). Others have detailed the increases open access publishing enables in the amount of material that can be published, searched and stored, in the number of people who can access it, in the impact of that material, the range of its distribution, and the speed and ease of reporting and information retrieval. The following announcement, posted on the BOAI (Budapest Open Access Initiative) list in March 2010, is fairly typical in this respect:

Today PLoS released Pubget links across its journal sites. Now, when users are browsing thousands of reference citations on PLoS journals they will be able to get to the full text article faster than ever before.

Specifically, when readers encounter citations to articles as recorded by CrossRef (which are accessed via the ‘CrossRef’ link in the ‘Cited in’ section of any article’s Metrics tab), a PDF icon will also appear if it is freely available via Pubget. Clicking on the icon will take you directly to the PDF.

On launching this new functionality, Pete Binfield, Publisher of PLoS ONE and the Community Journals said: ‘Any service, like Pubget, that makes it easier for authors to quickly find the information they need is a welcome addition to our articles. We like how Pubget helps to break down content walls in science, letting users get instantly to the article-level detail that they seek.’ (Pubget, 2010)


Open Data

Yet it is not just the research literature that is being rendered more accessible by scientists. Even the data created in the course of scientific research is being promoted as something that should be freely and openly available for others to use, analyse and build upon. This includes data sets that are too large to be included in any resulting peer-reviewed publications. Known as a movement towards open data, or data-sharing, this initiative is motivated by the idea that publishing data online on an open basis endows it with a ‘vastly increased utility’. Digital data sets are said to be ‘easily passed around’; they are seemingly ‘more easily reused’, reanalysed and checked for accuracy and validity; and they supposedly contain more ‘opportunities for educational and commercial exploitation’ (Swan, 2009).

Interestingly, certain academic publishers are already promoting the possibility of linking their journals to the underlying data as another of their ‘value-added’ services, to set alongside automatic alerting and sophisticated citation, indexing, searching and linking facilities. The idea behind such promotion is no doubt to help ward off the threat of disintermediation posed by the development of digital technology, which enables academics to take over the means of dissemination and publish their work for and by themselves cheaply and easily. Significantly, a 2009 JISC report identified ‘open-ness, predictive science based on massive data volumes and citizen involvement as [all] being important features of tomorrow’s research practice’ (JISC, 2009).

In a further move in this direction, all Public Library of Science (PLoS) journals are now providing a broad range of article-level metrics and indicators relating to data usage on an open basis. No longer withheld as trade secrets, these metrics reveal which articles are attracting the most views, citations from the scholarly literature, social bookmarks, coverage in the media, comments, responses, ‘star’ ratings, blog coverage, etc. (Patterson, 2009). PLoS has positioned this programme as enabling science scholars to assess ‘research articles on their own merits rather than on the basis of the journal (and its impact factor) where the work happens to be published’, and they encourage readers to carry out their own analyses of this open data. Yet it is difficult not to perceive such article-level metrics and management tools as being part of the wider process of transforming knowledge and learning into ‘quantities of information’ (Lyotard, 1986: 4); quantities, furthermore, that are produced more to be exchanged, marketed and sold (1986: 4) – for example, by individual academics to their departments, institutions, funders and governments in the form of indicators of ‘quality’ and ‘impact’ (1986: 5).


From Open Science to Open Government

The developments around open access and open data are themselves part of the trend or phenomenon that is coming to be known as ‘open science’. As Murray et al. put it:

Open science is emerging as a collaborative and transparent approach to research. It is the idea that all data (both published and unpublished) should be freely available, and that private interests should not stymie its use by means of copyright, intellectual property rights and patents. It also embraces open access publishing and open source software… (Murray et al, 2008)


One of the most interesting and well-known examples of how such open science may work is the Open Notebook Science of the organic chemist Jean-Claude Bradley. ‘[I]n the interests of openness’, Bradley makes the ‘details of every experiment done in his lab freely available on the web’. This ‘includes all the data generated from these experiments too, even the failed experiments’. What’s more, he does so in ‘real time’, ‘within hours of production, not after the months or years involved in peer review’ (Poynder, 2010). Again, we can see how emphasis is being placed on the amount of research that can be shared, and the speed with which this can be achieved. This openness on Bradley’s part is also positioned as a means of achieving usefulness and impact, as is evident from the very title of one of his Open Notebook Science projects, UsefulChem.


To be fair, such discourses around openness, transparency, efficiency and utility are not confined to the sciences – or even the university, for that matter. We can mention here wider political initiatives, dubbed ‘Open Government’ or ‘Government 2.0’, with both the New Labour and the Conservative/Liberal Democrat coalition governments in the UK making a great display of freeing government information. The Labour government passed the Freedom of Information (FOI) Act in 2000, and in January 2010 launched a website, www.data.gov.uk, expressly dedicated to the release of governmental data sets. The Conservative/Liberal Democrat coalition has continued to make extensive use of this website. In a similar vein, The Guardian newspaper has campaigned for the UK government to relinquish its copyright on all local, regional and national data collected with taxpayers’ money and to make such data freely and openly available to the public by publishing it online, where it can be collectively and collaboratively scrutinized, searched, mined, mapped, graphed, cross-tabulated, visualized, audited and interpreted using software tools.


This phenomenon is by no means just confined to the UK. Throughout his election campaign, Barack Obama promised to make government in the United States more open. He followed this up by issuing a memorandum on transparency the very first day after he became President, vowing to make openness one of ‘the touchstones of this presidency’ (Obama, cited in Stolberg, 2009): ‘My Administration is committed to creating an unprecedented level of openness in Government. We will work together to ensure the public trust and establish a system of transparency, public participation, and collaboration. Openness will strengthen our democracy and promote efficiency and effectiveness in Government’ (The White House, 2009).


The connection I am endeavouring to make here between the seemingly diverse movements for open access, open data, open science and open government has already been highlighted by Michael Gurstein in his reflections on OKCon, the 2011 annual conference of the Open Knowledge Foundation. According to Gurstein,

the ‘open data/open government’ movement begins from a profoundly political perspective that government is largely ineffective and inefficient (and possibly corrupt) and that it hides that ineffectiveness and inefficiency (and possible corruption) from public scrutiny through lack of transparency in its operations and particularly in denying to the public access to information (data) about its operations. And further that this access once available would give citizens the means to hold bureaucrats (and their political masters) accountable for their actions. In doing so it would give these self-same citizens a platform on which to undertake (or at least collaborate with) these bureaucrats in certain key and significant activities—planning, analyzing, budgeting that sort of thing. Moreover through the implementation of processes of crowdsourcing this would also provide the bureaucrats with the overwhelming benefits of having access to and input from the knowledge and wisdom of the broader interested public.


Put in somewhat different terms but with essentially the same meaning—it’s the taxpayer’s money and they have the right to participate in overseeing how it is spent. Having “open” access to government’s data/information gives citizens the tools to exercise that right. (Gurstein, 2011)


Interestingly, for Gurstein, we need a much clearer understanding than many open data/open government advocates have displayed to date of what exactly is meant by openness, and of where arguments in favour of open access, open information and open data are likely to lead us in the not too distant future. With this in mind, we could endeavour to put some flesh on the bones of Gurstein’s sketch of the politics of openness and suggest that, from a liberal perspective, freeing publicly funded and acquired information and data – whether it is gathered directly in the process of census collection, or indirectly as part of other activities (crime, healthcare, transport, schools and accident statistics) – is seen as helping society to perform more efficiently. For liberals, openness is said to play a key role in increasing citizen trust, participation and involvement in democracy, and indeed government, as access to information – such as that needed to intervene in public policy – is no longer restricted either to the state or to those corporations, institutions, organizations and individuals who have sufficient money and power to acquire it for themselves.


Such liberal beliefs find support in the idea that making information and data freely and transparently available accords with Article 19 of the Universal Declaration of Human Rights, which states that everyone has the right ‘to seek, receive and impart information and ideas through any media and regardless of frontiers’. Hillary Clinton, the United States Secretary of State, put forward a similar vision when, at the beginning of 2010, she said of her country that ‘We stand for a single internet where all of humanity has equal access to knowledge and ideas’, and spoke out against the authoritarian censorship and suppression of free speech and online search facilities like Google in countries such as China and Iran. Clinton declared:


Even in authoritarian countries, information networks are helping people discover new facts and making governments more accountable.


During his visit to China in November [2009], President Obama held a town hall meeting with an online component to highlight the importance of the internet. In response to a question that was sent in over the internet, he defended the right of people to freely access information, and said that the more freely information flows, the stronger societies become. He spoke about how access to information helps citizens to hold their governments accountable, generates new ideas, and encourages creativity. The United States' belief in that truth is what brings me here today.

… And technologies with the potential to open up access to government and promote transparency can also be hijacked by governments to crush dissent and deny human rights. (Clinton, 2010)

This political sentiment was shared by Jeff Jarvis, author of What Would Google Do?, when, in support of Google’s decision to stop self-filtering its search results in China, he argued in March 2010 for a bill of rights for cyberspace: ‘to claim and secure our freedom to connect, speak, assemble, and act online; to each control our identities and data; to speak our languages; to protect what is public and private; and to assure openness’ (Jarvis, 2010: 4). Yet are Clinton and Jarvis not both guilty here of overlooking (or should that be conveniently forgetting or even denying) the way liberal ideas of freedom and openness (and, indeed, of the human) have long been used in the service of colonialism and neoliberal globalisation? Does freedom for the latter not primarily mean economic freedom, i.e., freedom of the market, freedom of the consumer to choose what to consume – not only in terms of goods, but also lifestyles and ways of being?


Certainly, it is interesting that ‘fifteen years after the Freedom of Information Act law was passed’ in the US in 1966, ‘the General Accounting Office reported that 82 percent of requests [for information] came from business, nine percent from the press, and only 1 percent from individuals or public interest groups’ (Fung et al, 2007: 27-28) – even if this was before the widespread use of networked computers. ‘The truth is that the [UK] FOI Act [2000] isn't used, for the most part, by “the people”’, as Tony Blair acknowledged in his recent memoir. ‘It's used by journalists’ (Blair, 2010) – and by businesses, one might add. In view of this, it is no surprise to find that neoliberals also support making government data freely and openly available to businesses and the public. They do so on the grounds that it provides a means of achieving the best possible ‘input/output ratio’ for a society or community (Lyotard, 1986: 54). This way of thinking is of a piece with the emphasis placed by neoliberalism’s audit culture on accountability, transparency, evaluation, measurement and centralised data management: for example, in the context of UK higher education, it is evident in the emphasis placed on measuring the impact of research on society and the economy, league tables, teaching standards, contact hours, as well as student drop-out rates, future employment destinations and earning prospects. From this perspective, such openness and communicative transparency is perceived as ensuring greater value for (taxpayers’) money, supposedly helping to eliminate corruption, enabling costs to be distributed more effectively, and increasing choice, innovation, enterprise, creativity, competitiveness and accountability.


Meanwhile, some libertarians have gone so far as to argue that there is absolutely no need to make difficult policy decisions about what data and what information it is right to publish online and what to keep secret. Instead, we should work toward the kind of situation the science-fiction writer Bruce Sterling proposes. In Shaping Things, his non-fiction book on the future of design, Sterling advocates retaining all data and information, ‘the known, the unknown known, and the unknown unknown’, in large archives and databases equipped with the necessary bandwidth, processing speed and storage capacity, and simply devising search tools and metadata that are accurate, fast and powerful enough to find and access it (Sterling 2005: 47).


Transparency

Before we go any further, I should perhaps confess at this point that I am a staunch advocate of the open access movement in the humanities. Nevertheless, there are a number of issues that need to be raised with regard to making research and data openly available online for free.


The first point I want to make in this respect is that, far from revealing any hitherto unknown, hidden or secret knowledge, such discourses of openness and transparency are themselves not very open or transparent. Staying with the relationship between politics and science, let us take as an example the response of Ed Miliband, leader of the UK’s Labour Party, to the ‘Climategate’ controversy, in which climate skeptics alleged that emails hacked from the University of East Anglia’s Climatic Research Unit revealed that evidence of global warming was the result of a conspiracy among scientists. Miliband’s response was to advocate ‘maximum transparency – let’s get the data out there’, as he put it. ‘The people who believe that climate change is happening and is man-made have nothing to fear from transparency’ (Miliband, quoted in Westcott, 2009: 7; cited by Birchall, 2012). Yet, actually, complete transparency is impossible. This is because, as Clare Birchall has shown, there is an aporia at the heart of any claim to transparency. ‘For transparency to be known as transparency, there must be some agency (such as the media [or government], say) that legitimises it as transparent, and because there is a legitimising agent which does not itself have to be transparent, there is a limit to transparency’ (Birchall, 2012). In fact, the more transparency is claimed, the more the violence of the mediating agency of this transparency is concealed, forgotten or obscured. Birchall offers the example of ‘The Daily Telegraph and its exposure of MPs’ expenses during the summer of 2009. While appearing to act on the side of transparency, as a commercial enterprise the paper itself has in the past been subject to secret takeover bids and its former owner, Lord Conrad Black, convicted of fraud and obstructing justice’ (Birchall, 2012). To briefly paraphrase a question from Lyotard that I will return to at greater length: who decides what transparency is, and who knows what needs to be transparent (1986: 9)?


Furthermore, merely making such information and data available to the public online will not in itself necessarily change anything. In fact, such processes have often been adopted precisely as a means of avoiding change. Aaron Swartz provides the example of Watergate: ‘after Watergate, people were upset about politicians receiving millions of dollars from large corporations. But, on the other hand, corporations seem to like paying off politicians. So instead of banning the practice, Congress simply required that politicians keep track of everyone who gives them money and file a report on it for public inspection’ (Swartz, 2010).


Openness

Much the same can be said for the idea that making research and data accessible to the public supposedly helps to make society more open and more free. Take the belief we saw expressed above by Hillary Clinton: that people in the United States have free access to the internet while those in China and Iran do not. Those of us who live and work in the West do indeed have a certain freedom to publish and search online.


Yet none of this rhetoric about freedom and transparency prevented the Obama administration from condemning WikiLeaks in November 2010 as ‘reckless and dangerous’, after it opened up access to hundreds of thousands of classified State Department documents (Gibbs, 2010); nor from putting pressure on Amazon.com and other companies to stop hosting the whistle-blowing website, an action which had echoes of the dispute over censorship between Google and the Chinese government earlier in 2010. (Significantly, Obama has also recently closed the United States open government website www.data.gov, which served as an influential precursor to the previously mentioned www.data.gov.uk website in the UK.) What is more, unless you are a large political or economic actor, or one of the lucky few, the statistics show that what you publish online is unlikely to receive much attention. Just ‘three companies – Google, Yahoo! and Microsoft – handle 95 percent of all search queries’; while ‘for searches containing the name of a specific political organisation, Yahoo! and Google agree on the top result 90 percent of the time’ (Hindman, 2009: 59, 79). Meanwhile, one company, Google, reportedly has 65 per cent of the world’s search market, a ‘72 per cent share of the US search market, and almost 90 per cent in the UK’ – a degree of domination that has led the European Union to investigate Google for abusing its power to favour its own products while suppressing those of rivals (Arthur, 2010: 3).


But it is not just that Google’s algorithms rank some websites on the first page of its results and others on page 42 (which means, in effect, that the latter are rarely going to be accessed, since very few people read beyond the first page of Google’s results). It is that conventional internet search engines reach only an extremely small percentage of the total number of available web pages. Ten years ago Michael K. Bergman was already placing the figure at 0.03%, or ‘one in 3,000’, with ‘public information on the deep Web’ even then being ‘400 to 550 times larger than the commonly defined World Wide Web’. Consequently, while according to Bergman as much as ‘ninety-five per cent of the deep Web’ may be ‘publicly accessible information – not subject to fees or subscriptions’, the vast majority of it is left untouched (Bergman, 2001). And that is before we even begin to address the issue of how the recent rise of the app, and the use of the password-protected Facebook for search purposes, may today be annihilating the very idea of the openly searchable Web.


We can therefore see that it is not enough simply to ‘Free Our Data’, as The Guardian has it; or to operate on the basis that ‘information wants to be free’ (Wark, 2004) (although doing so may of course be a start, especially in an era when notions of the open web and net neutrality are under severe threat). We can no doubt put ever more research and data online; we can make it freely available to both other researchers and the public under open access, open data, open science and open government conditions; we can even integrate, index and link it using the appropriate metadata to enable it to be searched and harvested with relative ease. However, none of this means this research and data is going to be found. Ideas of this kind ignore the fact that all information and data is ordered, structured, selected and framed in a particular way. This is what metadata is used for, after all. Metadata is information or data that describes, links to, or is otherwise used to control, find, select, filter, classify and present other data. One example would be the information provided at the front of a book detailing its publisher, date and place of publication, ISBN, etc. However, the term ‘metadata’ is most commonly associated with the language of computing. There, metadata is what enables computers to access files and documents, not just in their own hard drives, but potentially across a range of different platforms, servers, websites and databases. Yet for all its associations with computer science, metadata is never neutral or objective. Although the term ‘data’ comes from the Latin word datum, meaning ‘something given’, data is not simply objectively out there in the world, already provided for us. The specific ways in which metadata is created, organized and presented help to produce (rather than merely passively reflect) what is classified as data and information – and what is not.
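
To make the point concrete, here is a minimal sketch of my own (it is not drawn from any of the projects discussed in this introduction) of how a handful of metadata choices determine what a simple search can and cannot surface; the record schema and the tiny catalogue are invented purely for illustration:

```python
# Illustrative sketch only: a tiny 'catalogue' showing how metadata choices
# govern what can be found. The schema and records are invented for the example.
from dataclasses import dataclass, field

@dataclass
class Record:
    title: str
    creator: str
    date: str
    subjects: list = field(default_factory=list)  # anything the schema omits cannot be searched for

catalogue = [
    Record("Songs of Innocence and of Experience", "William Blake", "1794",
           ["poetry", "illuminated books"]),
    Record("On the Road (Part One)", "Jack Kerouac", "1957", ["fiction"]),
]

def find_by_subject(records, term):
    """Return records whose subject metadata contains the term; anything not catalogued under it stays invisible."""
    return [r for r in records if term.lower() in (s.lower() for s in r.subjects)]

print([r.title for r in find_by_subject(catalogue, "poetry")])          # ['Songs of Innocence and of Experience']
print([r.title for r in find_by_subject(catalogue, "beat generation")]) # [] - never classified that way, so never found
```

The second query returns nothing, not because the material is absent, but because the cataloguer never classified it under that heading. This is precisely the sense in which metadata produces, rather than merely reflects, what counts as data and information.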


Clearly, then, it is not just a question of free and open access to the research and data; nor of providing support, education and training on how to understand, interpret, use and apply it effectively, as Gurstein has argued (2010). It is also a question of who (and what) makes decisions regarding the data and metadata, and thus gets to exercise control over it, and on what basis such decisions are made. To paraphrase Lyotard once more: who decides what data and metadata is, and who knows what needs to be decided (1986: 9)? Who gets to legislate? And who legitimates the legislators (1986: 8)? Will the new ‘ruling class’ – top civil servants and consulting firms with their MBA-holding employees, ‘corporate leaders, high-level administrators, and the heads of the major professional, labor, political, and religious organizations’, including those behind Google, Apple, Facebook, Amazon, JISC, AHRC, OAI, SPARC and COASP – continue to operate as the class of interpreters, gatekeepers and ‘decision makers’, not just with regard to having ‘access to the information these machines must have in storage to guarantee that the right decisions are made’, but with regard to creating and controlling the data and metadata, too (1986: 14)?


If, as demonstrated above, discourses of openness and transparency are themselves not very open or transparent at all, much of the current emphasis on making research and data open and free is also lacking in self-reflexivity and meaningful critique. We can see this not just in those discourses associated with open access, open data, open science and open government that explicitly emphasize the importance of transparency, performativity and efficiency. This lack of criticality is apparent in much of what goes under the name of ‘digital humanities’, too, especially in those quarters associated with the so-called ‘computational turn’.

We tend to think of the humanities as being self-reflexive per se, and as frequently asking questions capable of troubling culture and society. Yet after decades in which humanities scholarship made active use of a variety of critical theories – Marxist, psychoanalytic, post-colonialist, post-Marxist – it seems somewhat surprising that many advocates of this current turn to data-intensive scholarship find it difficult to understand computing and the digital as much more than tools, techniques and resources. As a result, much of the scholarship that is currently occurring under the ‘digital humanities’ agenda is uncritical, naive and often banal (Liu, 2011; Higgen, 2010).


Witness the current injunction amongst some scholars to make data not only visible but also visual. Stefanie Posavec’s Literary Organism, which visualises the structure of Part One of Kerouac’s On the Road as a tree, provides one example; those cited earlier courtesy of Lev Manovich and the Software Studies Initiative provide others. Now, there is a long history of critical engagement within the humanities with ideas of the visual, the image, the spectacle, the spectator and so on: not just in critical theory, but also in cultural studies, women’s studies, media studies, film studies and television studies. This history of critical engagement stretches back to Guy Debord’s influential 1967 work, The Society of the Spectacle, and beyond. For example, in his introduction to a 1995 book edited with Lynn Cooke, Visual Display: Culture Beyond Appearances (Seattle: Bay Press), Peter Wollen writes that an excess of visual display within culture has:

the effect of concealing the truth of the society that produces it, providing the viewer with an unending stream of images that might best be understood, not simply detached from a real world of things, as Debord implied, but as effacing any trace of the symbolic, condemning the viewer to a world in which we can see everything but understand nothing—allowing us viewer-victims, in Debord’s phrase, only ‘a random choice of ephemera’ (1995: 9).


It can come as something of a surprise, then, to discover that this humanities tradition in which ideas of the visual are engaged critically appears to have had comparatively little impact on the current enthusiasm for data visualisation that is so prominent an aspect of the turn toward data-intensive scholarship.


Of course, this (at times explicit) repudiation of criticality could be precisely what makes certain aspects of the digital humanities so seductive for many at the moment. Exponents of the computational turn can be said to be endeavouring to avoid conforming to accepted (and often moralistic) conceptions of politics that have been decided in advance, including those that see it only in terms of power, ideology, race, gender, class, sexuality, ecology, affect etc. Refusing to ‘go through the motions of a critical avant-garde’, to borrow the words of Bruno Latour (2004), they often position themselves as responding to what is perceived as a fundamentally new cultural situation, and to the challenge it represents to our traditional methods of studying culture, by avoiding conventional theoretical manoeuvres and by experimenting with the development of fresh methods and approaches for the humanities instead.


Manovich, for example, sees the sheer scale and dynamics of the contemporary new media landscape as presenting the usually accepted means of studying culture that were dominant for so much of the 20th century – the kinds of theories, concepts and methods appropriate to producing close readings of a relatively small number of texts – with a significant practical and conceptual challenge. In the past, ‘cultural theorists and historians could generate theories and histories based on small data sets (for instance, “classical Hollywood cinema”, “Italian Renaissance”, etc.) But how can we track “global digital cultures”, with their billions of cultural objects, and hundreds of millions of contributors?’, he asks (Manovich, 2010). Three years ago Manovich was already describing the ‘numbers of people participating in social networks, sharing media, and creating user-generated content’ as simply ‘astonishing’:

MySpace, for example, claims 300 million users. Cyworld, a Korean site similar to MySpace, claims 90 percent of South Koreans in their 20s and 25 percent of that country's total population (as of 2006) use it. Hi5, a leading social media site in Central America has 100 million users and Facebook, 14 million photo uploads daily. The number of new videos uploaded to YouTube every twenty-four hours (as of July 2006): 65,000. (Manovich, 2008)


The solution Manovich proposes to this ‘data deluge’ is to turn to the very computers, databases, software and vast amounts of born-digital networked cultural content that are causing the problem in the first place, and to use them to help develop new methods and approaches adequate to the task at hand. This is where what he calls Cultural Analytics comes in. ‘The key idea of Cultural Analytics is the use of computers to automatically analyze cultural artefacts in visual media, extracting large numbers of features that characterize their structure and content’ (Manovich in Kerssens & Dekker, 2009); and what is more, to do so not just with regard to the culture of the past, but also with that of the present. To this end, Manovich (not unlike the Google company) calls for as much of culture as possible to be made available in external, digital form: ‘not only the exceptional but also the typical; not only the few “cultural sentences” spoken by a few “great man” [sic] but the patterns in all cultural sentences spoken by everybody else’ (Manovich in Kerssens & Dekker, 2009).
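
To give a rough, concrete sense of what such automated feature extraction can involve, the following is a minimal sketch, assuming a hypothetical folder of digitised images and the freely available Pillow and NumPy libraries; it computes just two crude global features per image and should not be mistaken for the Software Studies Initiative’s actual pipeline:

```python
# Minimal sketch of Cultural Analytics-style feature extraction (illustrative only;
# not the Software Studies Initiative's actual pipeline).
from pathlib import Path

import numpy as np
from PIL import Image

def extract_features(image_path):
    """Compute two crude global features for one image: mean brightness and contrast (std dev of grey values)."""
    with Image.open(image_path) as img:
        grey = np.asarray(img.convert("L"), dtype=float)  # greyscale pixel values, 0-255
    return {"file": image_path.name, "brightness": grey.mean(), "contrast": grey.std()}

if __name__ == "__main__":
    # 'scans/' is a hypothetical folder of digitised magazine pages, paintings or manga pages.
    for features in (extract_features(p) for p in sorted(Path("scans").glob("*.jpg"))):
        print(f"{features['file']}: brightness={features['brightness']:.1f}, contrast={features['contrast']:.1f}")
```

Plotting such features for tens of thousands of images is what allows patterns across an entire corpus, rather than a handful of canonical works, to be made visible.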


In a series of posts on his Found History blog, Tom Scheinfeldt, managing director at the Center for History and New Media at George Mason University, positions such developments in terms of a shift from a concern with theory and ideology to a concern with methodology (2008). In this respect there may well be a degree of ‘relief in having escaped the culture wars of the 1980s’ – for those in the US especially – as a result of this move ‘into the space of methodological work’ (Croxall, 2010) and what Scheinfeldt reportedly dubs ‘the post-theoretical age’ (cited in P. Cohen, 2010). The problem, though, is that without such reflexive critical thinking and theories, many of those whose work forms part of this computational turn find it difficult to articulate exactly what the point of their work is, as Scheinfeldt readily acknowledges (2010a).


Take one of the projects I mentioned earlier: the attempt by Dan Cohen and Fred Gibbs to text-mine all the books published in English in the Victorian age (or at least those digitized by Google). Among other things, this allows Cohen and Gibbs to show that use of the word ‘revolution’ in book titles of the period spiked around ‘the French Revolution and the revolutions of 1848’ (D. Cohen, 2010). But what argument are they trying to make with this calculation? What is it we are able to learn as a result of this use of computational power that we did not know already and could not have discovered without it? (Scheinfeldt, 2010a/b?)
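
For a rough sense of the kind of computation at issue here, the following is a minimal sketch rather than Cohen and Gibbs’s own code; it assumes a hypothetical CSV export of digitised book metadata with ‘year’ and ‘title’ columns, and simply counts how many titles per year contain a keyword such as ‘revolution’:

```python
# Minimal sketch of keyword counting over book titles by year
# (illustrative assumptions only; not Cohen and Gibbs's actual workflow).
import csv
from collections import Counter

def titles_per_year_containing(csv_path, keyword):
    """Count, per publication year, the titles that contain the given keyword."""
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):  # assumes columns named 'year' and 'title'
            if keyword.lower() in row["title"].lower():
                counts[int(row["year"])] += 1
    return counts

if __name__ == "__main__":
    # 'victorian_titles.csv' is a hypothetical export of digitised book metadata.
    counts = titles_per_year_containing("victorian_titles.csv", "revolution")
    for year in sorted(counts):
        print(year, counts[year])
```

The computation itself is trivial; the critical question raised above is what argument such a tally is supposed to support.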

In an explicit response to Cohen and Gibbs’s project, Scheinfeldt suggests that the problem of theory, or the lack of it, may actually be a question of scale:

It expects something of the scale of humanities scholarship which I’m not sure is true anymore: that a single scholar—nay, every scholar—working alone will, over the course of his or her lifetime ... make a fundamental theoretical advance to the field.


Increasingly, this expectation is something peculiar to the humanities. ...it required the work of a generation of mathematicians and observational astronomers, gainfully employed, to enable the eventual “discovery” of Neptune... Since the scientific revolution, most theoretical advances play out over generations, not single careers. (Scheinfeldt, 2010 a/b/c?)


Now, it is absolutely important that we as scholars experiment with the new tools, methods and materials that digital media technologies create and make possible, in order to bring into play new forms of Foucauldian dispositifs, or what Bernard Stiegler calls hypomnemata, or what I am trying to think of in terms of media gifts. I would include in this ‘experimentation imperative’ techniques and methodologies drawn from computer science and other related fields, such as information visualisation, data mining and so forth. Still, there is something troubling about this kind of deferral of critical and self-reflexive theoretical questions to an unknown point in time, possibly still a generation away. After all, the frequent suggestion is that now is not the right time to be making any such decision or judgement, since we cannot yet know how humanists will eventually come to use these tools and data, and thus what data-driven scholarship may or may not turn out to be capable of, critically, politically, theoretically. One of the consequences of this deferral, however, is that it makes it extremely difficult to judge whether this postponement is indeed acting as a responsible, political and ethical opening to the (heterogeneity and incalculability of the) future, including the future of the humanities; or whether it is serving as an alibi for a naive and rather superficial form of scholarship instead (Meeks, 2010). Indeed, this deferral seems to be producing a form of scholarship that, in uncritically and un-self-reflexively adopting techniques and methodologies drawn from computer science, forms part of the general trend in contemporary society which Lyotard associates with the widespread use of computers and databases, and with the exteriorization of knowledge in relation to the ‘knower’. As we have seen, it is a movement away from a concern with ideals, with what is right and just and true, and toward a concern to legitimate power by optimizing the system’s performance in instrumental, functional terms.


All of this raises some rather significant and timely questions for the humanities. Is it merely a coincidence that such a turn toward science, computing and data-intensive research is gaining momentum at a time when the UK government is emphasizing the importance of the STEM subjects (science, technology, engineering and medicine) and withdrawing support and funding from the humanities? Or is all this happening now because the humanities, like the sciences themselves, are under pressure from government, business, management, industry and increasingly the media to prove they provide value for money in instrumental, functional, performative terms? Is the interest in computing a strategic decision on the part of some of those in the humanities? As the project of Cohen and Gibbs shows, one can get funding from the likes of Google (D. Cohen, 2010). In fact, in the summer of 2010 ‘Google awarded $1 million to professors doing digital humanities research’ (P. Cohen, 2010). To what extent is the take-up of practical techniques and approaches from computer science providing some areas of the humanities with a means of defending (and refreshing) themselves in an era of global economic crisis and severe cuts to higher education, through the transformation of their knowledge and learning into quantities of information – the so-called ‘deliverables’? Can we even position the ‘computational turn’ as an event created to justify such a move on the part of certain elements within the humanities (Frabetti, 2010)?


Where does all this leave us as far as this Living Book on open science is concerned? As the argument above hopefully demonstrates, it is clearly not enough just to attempt to reveal or recover the scientific truth about, say, the environment, in order to counter the disinformation of others involved in the likes of the Climategate controversy. Nor is it enough just to make the scientific research openly accessible to the public. Equally, it is not entirely satisfactory just to make the information, data, and associated tools, techniques and resources freely available to those in the humanities, so they can collectively and collaboratively search, mine, map, graph, model, visualize, analyse and interpret it in new ways – including some that may make it less abstract and easier for the majority of those in society to understand and follow – and, in doing so, help bridge the gap between the ‘two cultures’. It is not so much that there is a lack of information, or of access to the right kind of information, or of information presented in the right kind of way to ensure that the message of the scientific research and data comes across effectively and efficiently. It is rather that there is too much information, too much white noise, as Franco ‘Bifo’ Berardi calls it. As a 2010 Mintel report showed – to stay with the example of climate change – most people in the UK already know what is happening to the environment. They are simply suffering from ‘green fatigue’: bored with thinking about it, they are enacting a backlash against what they perceive as ‘extreme’ pressure from environmentalist groups. This is perhaps one reason why ‘the number of cars on UK roads has risen from just over 26 million in 2005 to more than 31 million in 2009’.


As Gilles Deleuze and Felix Guattari put it, then: ‘We do not lack communication. On the contrary, we have too much of it.’ What we actually lack is creation. ‘We lack resistance to the present’ (Deleuze and Guattari, 1994). It is therefore not simply a case of supplying more scientific research and data; nor of making the research and data that has otherwise been closed, hidden, denied or suppressed openly available for free – by opening the already existing ‘memory and databanks’ to the people, for example (which is what Lyotard ended by suggesting we do). It is more a case of creating work around the scientific research and data that does not simply go along with the shift in the status and nature of knowledge that is currently taking place. As we have seen, this is a shift toward the STEM subjects and away from the humanities; toward a concern with optimizing the social system’s performance in instrumental, functional terms, and away from a concern with questions of what is just and right; and toward an emphasis on openness, freedom and transparency, and away from what is capable of disrupting and disturbing society – away from what, in remaining resistant to a culture of measurement and calculation, maintains a much needed element of inaccessibility, inefficiency, delay, error, antagonism and dissensus within the system.


Can this Living Book on open science be considered one such creation? And can this series of Living Books about Life be considered another? Are they instances of a resistance to the present? Or just more white noise?


References

Arthur, C. (2010) ‘Analysing data is the future for journalists, says Tim Berners-Lee’, MediaGuardian, 22 November, p.7. http://www.guardian.co.uk/media/2010/nov/22/data-analysis-tim-berners-lee.

Arthur, C. (2010) ‘Nottingham University offers masterclasses in dealing with open data - for free of course’, guardian.co.uk Blogposts, 13 October. http://www.guardian.co.uk/technology/datablog/2010/oct/13/free-data-nottingham-classes.

Berardi, F. ‘Bifo’, Jacquemet, M. and Vitali, G. (2009) Ethereal Shadows: Communications and Power in Contemporary Italy. Brooklyn, NY: Autonomedia.

Bergman, M. K. (2001) ‘The Deep Web: Surfacing Hidden Value’, JEP: The Journal of Electronic Publishing 7(1), August. http://quod.lib.umich.edu/cgi/t/text/text-idx?c=jep;view=text;rgn=main;idno=3336451.0007.104.


Birchall, C. (2012) ‘Transparency, Interrupted: Secrets of the Left’, Theory, Culture and Society […].

Blair, T. (2010) A Journey. Hutchinson.

Clinton, H. (2010) ‘Internet Freedom’, prepared text of U.S. Secretary of State Hillary Rodham Clinton’s speech, delivered at the Newseum in Washington, D.C., 21 January. http://www.foreignpolicy.com/articles/2010/01/21/internet_freedom?page=full.

Cohen, D. (2010) ‘Searching for the Victorians’, Dan Cohen, October 4. http://www.dancohen.org/2010/10/04/searching-for-the-victorians/.

Cohen, P. (2010) ‘Digital Keys for Unlocking the Humanities’ Riches’, The New York Times, November 16. http://www.nytimes.com/2010/11/17/arts/17digital.html?_r=1&hp=&pagewanted=all.

Cramer, F. (2009) ‘Re: Digital Humanities Manifesto’, posting to the nettime email list. January 22. http://www.mail-archive.com/nettime-l@kein.org/msg01331.html.

Croxall, B. (2010) response to Tanner Higgen, ‘Cultural Politics, Critique, and the Digital Humanities’, Gaming the System. September 10. http://www.tannerhiggin.com/2010/05/cultural-politics-critique-and-the-digital-humanities/.

Deleuze, G. and Guattari, F. (1988) A Thousand Plateaus: Capitalism and Schizophrenia. London: Athlone.

Deleuze, G. and Guattari, F. (1994) What is Philosophy? New York: Columbia University Press.

Fung, A., Graham, M. and Weil, D. (2007) Full Disclosure: The Perils and Promise of Transparency. Cambridge: Cambridge University Press.

Frabetti, F. (2010) ‘Digital Again? The Humanities Between the Computational Turn and Originary Technicity’, talk given to the Open Media Group, Coventry School of Art and Design. November 9. http://coventryuniversity.podbean.com/2010/11/09/open-software-and-digital-humanities-federica-frabetti/.

Fisher, M. (2009) Capitalist Realism. Winchester: Zero Books.

Gibbs, R. (2010), presidential press secretary, quoted in ‘White House condemns WikiLeaks’ release’, MSNBC.com News, 28 November. http://www.msnbc.msn.com/id/40405589/ns/us_news-security.

Gurstein, M. (2010) ‘Open Data: Empowering the Empowered or Effective Data Use for Everyone?’, Gurstein’s Community Informatics, 2 September. http://gurstein.wordpress.com/2010/09/02/open-data-empowering-the-empowered-or-effective-data-use-for-everyone/.

Gurstein, M. (2010) ‘Open Data (2): Effective Data Use’, Gurstein’s Community Informatics, 9 September. http://gurstein.wordpress.com/2010/09/09/open-data-2-effective-data-use/.

Gurstein, M. (2011) ‘Are the Open Data Warriors Fighting for Robin Hood or the Sheriff?: Some Reflections on OKCon 2011 and the Emerging Data Divide’, posting to the nettime mailing list, 5 July.

Hall, G. (2010) 'We Can Know It For You: The Secret Life of Metadata', How We Became Metadata. University of Westminster, London: Institute for Modern and Contemporary Culture.

Hall, G. (2011) ‘The Digital Humanities Beyond Computing: A Postscript’, Culture Machine 12. http://www.culturemachine.net/index.php/cm/article/view/441/459.

Hall, G., Birchall, C. and Woodbridge, P. (2010b) 'Introduction to Liquid Theory TV, Episode 2: Deleuze’s "Postscript on the Societies of Control"’, Culture Machine 11. http://www.culturemachine.net/index.php/cm/article/view/384/400.

Hall, S. (2007) ‘Epilogue: Through the Prism of an Intellectual Life’, in Meeks, B. Culture, Politics, Race, and Diaspora: The Thought of Stuart Hall. Miami: Ian Rundle Publishers.

Higgen, T. (2010) ‘Cultural Politics, Critique, and the Digital Humanities’, Gaming the System. May 25. http://www.tannerhiggin.com/2010/05/cultural-politics-critique-and-the-digital-humanities/.

Hindman, M. (2009) The Myth of Digital Democracy. Princeton, NJ and Oxford: Princeton University Press.

Houghton, J. (2009) ‘Open Access - What are the Economic Benefits?: A Comparison of the United Kingdom, Netherlands and Denmark’, Centre for Strategic Economic Studies, Victoria University, Melbourne. http://www.knowledge-exchange.info/Admin/Public/DWSDownload.aspx?File=%2fFiles%2fFiler%2fdownloads%2fOA_What_are_the_economic_benefits_-_a_comparison_of_UK-NL-DK__FINAL_logos.pdf.

Jarvis, J. (2010) ‘Time For Citizens of the Internet to Stand Up’, The Guardian: MediaGuardian, 29 March, p.4. For a discussion of the idea of a cyber bill of rights, see http://bit.ly/cyberrights.

JISC (2009) ‘Press Release: Open Science - the future for research?’, posting to the BOAI list, 16 November. The report is available at http://www.jisc.ac.uk/publications/documents/opensciencerpt.aspx.

Kagan, J. (2009) The Three Cultures: Natural Sciences, Social Sciences, and the Humanities in the 21st Century. Cambridge, Cambridge University Press.

Kempf, J. (2010) ‘Social Sciences and Humanities Publishing and the Digital “Revolution”’ (unpublished manuscript). http://perso.univ-lyon2.fr/~jkempf/Digital_SHS_Publishing.pdf. Accessed August 1, 2010.

Kerssens, N. and Dekker, A. (2009) ‘Interview with Lev Manovich for Archive 2020’, Virtueel Platform. http://www.virtueelplatform.nl/#2595.

Kirschenbaum, M. (2010) ‘What Is Digital Humanities and What’s It Doing in English Departments?’, ADE Bulletin. No.150. http://mkirschenbaum.files.wordpress.com/2011/01/kirschenbaum_ade150.pdf.

Knouf, N. (2010) ‘The JJPS Extension: Presenting Academic Performance Information’, Journal of Journal Performance Studies, Vol 1, No 1 (2010). Available at http://journalofjournalperformancestudies.org/journal/index.php/jjps/article/view/6/6. Accessed 20 June, 2010.

Latour, B. (2004) ‘Why Has Critique Run Out of Steam? From Matters of Fact to Matters of Concern’, Critical Inquiry, Vol. 30, Number 2. http://criticalinquiry.uchicago.edu/issues/v30/30n2.Latour.html.

Liu, A. (2011) ‘Where is Cultural Criticism in the Digital Humanities’, paper presented at the panel on ‘The History and Future of the Digital Humanities’, Modern Language Association convention, Los Angeles, January 7, 2011. http://liu.english.ucsb.edu/where-is-cultural-criticism-in-the-digital-humanities.

Lyotard, J-F. (1986) The Postmodern Condition: A Report on Knowledge. Manchester: Manchester University Press.

Manovich, L. (2002) ‘Metadata, Mon Amour’. http://www.manovich.net/TEXTS_07.HTM.

Manovich, L. (2005) ‘Remixing and Remixability’. http://www.manovich.net/DOCS/Remix_modular.doc.

Manovich, L. (2009a) ‘Cultural Analytics: Visualizing Cultural Patterns in the Era of “More Media”’. http://softwarestudies.com/cultural_analytics/Manovich_DOMUS.doc.

Manovich, L. (2009b) ‘How to Follow Global Digital Cultures, or Cultural Analytics for Beginners’, in Stalder, F. and Becker, K. (eds.) Deep Search: The Politics of Search Beyond Google. New Jersey: Transaction Publishers. http://softwarestudies.com/cultural_analytics/cultural_analytics_overview_final.doc http://lab.softwarestudies.com/2008/09/cultural-analytics.html.

Manovich, L. (2009d) ‘Cultural Analytics: Overview’, Software Studies Initiative. June 20. http://lab.softwarestudies.com/2008/09/cultural-analytics.html.

Manovich, L. (2009e) ‘Cultural Analytics: Why Cultural Analytics?’, Software Studies Initiative. June 20. http://lab.softwarestudies.com/2008/09/cultural-analytics.html.

Manovich, L. (2010a) ‘Cultural Analytics Lectures by Manovich in UK (London and Swansea), March 8-9, 2010’, Software Studies Initiative. March 8. http://lab.softwarestudies.com/2010/03/cultural-analytics-lecture-by-manovich.html.

Manovich, L. (2010b) ‘The Practice of Everyday (Media) Life’. Version: March 10. http://www.manovich.net/DOCS/manovich_social_media.doc

Manovich, L. (2011) ‘Trending: The Promises and the Challenges of Big Social Data’, April 28. http://www.manovich.net/DOCS/Manovich_trending_paper.pdf.

MacAskill, E. (2010) ‘US extols internet freedom – within limits’, The Guardian, 16 February.

Meeks, E. (2010) ‘The Digital Humanities as Imagined Community’, Digital Humanities Specialist. September 14. https://dhs.stanford.edu/the-digital-humanities-as/the-digital-humanities-as-imagined-community/.

Mintel (2010) ‘Energy Efficiency in the Home – UK – July 2010’, Mintel report.

Murray, S., Choi, S., Hoey, J., Kendall, C., Maskalyk, J. and Palepu, A. (2008) ‘Open science, open access and open source software at Open Medicine’, Open Medicine 2(1): e1–e3, published online 16 January. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3091592/.

Orwant, J. (2010) ‘Our Commitment to the Digital Humanities’, The Official Google Blog, July 14. http://googleblog.blogspot.com/2010/07/our-commitment-to-digital-humanities.html.

Patterson, M. (2009) ‘Article-Level Metrics at PLoS – Addition of Usage Data’, PLoS: Public Library of Science. September 16. http://blogs.plos.org/plos/2009/09/article-level-metrics-at-plos-addition-of-usage-data/.

Poynder, R. (2010) ‘Interview With Jean-Claude Bradley: The Impact of Open Notebook Science’, Information Today, September. http://www.infotoday.com/IT/sep10/Poynder.shtml.

Rogers, S. (2010) ‘How Canada became an open data and data journalism powerhouse’, Guardian Datablog, 9 November. http://www.guardian.co.uk/news/datablog/2010/nov/09/canada-open-data.

Scheinfeldt, T. (2010a) ‘Where’s the Beef?: Does Digital Humanities Have to Answer Questions?’, Found History. May 12. http://www.foundhistory.org/2010/05/12/wheres-the-beef-does-digital-humanities-have-to-answer-questions/.

Scheinfeldt, T. (2010b) response to Dan Cohen, ‘Searching for the Victorians’, Dan Cohen. October 5. http://www.dancohen.org/2010/10/04/searching-for-the-victorians/.

Shields, R. (2010) ‘Green fatigue hits campaign to reduce carbon footprint’, The Independent, 10 October, p.30. http://www.independent.co.uk/environment/climate-change/green-fatigue-hits-campaign-to-reduce-carbon-footprint-2102585.html.

Sterling, B. (2005) Shaping Things. Cambridge, MA: MIT Press.

Stolberg, S. G. (2009) ‘On First Day, Obama Quickly Sets a New Tone’, The New York Times. January 21. http://www.nytimes.com/2009/01/22/us/politics/22obama.html.

Swan, A. (2009) ‘Open Access and Open Data’, 2nd NERC Data Management Workshop, Oxford. February 17-18. http://eprints.ecs.soton.ac.uk/17424/.

Swartz, A. (2010) ‘When is Transparency Useful?’, Raw Thought, 11 February. http://www.aaronsw.com/weblog/usefultransparency.

Wark, M. (2004) A Hacker Manifesto. Cambridge, MA: Harvard University Press.

Westcott, S. (2009) ‘Global Warming: Brits Deny Humans are to Blame,’ The Express, December 7. http://www.express.co.uk/posts/view/144551/Global-warming-Brits-deny-humans-are-to-blame

The White House. (2009) ‘Memorandum for the heads of executive departments and agencies: Transparency and Open Government’. January 21. http://www.whitehouse.gov/the_press_office/TransparencyandOpenGovernment/