Electronic Futures in Scientific Communication and Outreach


Philip Campbell

Editor-in-Chief, Nature, UK

I remember complaining to a colleague, about four years ago, about the acute limitations of the World Wide Web. Imagine, I said, that instead of its tediously flat mimicking of paper documents along with the superficial enhancements of hypertext links, you could put on your Virtual Reality outfit and fly through a representational map of arbitrary dimensionality. It might be a time-tunnel of the history of the Earth incorporating all known data—ranging from ocean temperatures to atmospheric chemical compositions to the appearances of individual species in the fossil record. Or it might be the three-dimensional structure of a complex of biological molecules. At any point you could stop and steer yourself “sideways” into the hidden dimensions—for example, the references or databases on which the local representation was based. Ultimately, I speculated (encouraged by the adventurous visions of writer William Gibson and by the relatively primitive realities of neuroscience), VR goggles and gloves would be dispensed with and we’d swim through these virtual worlds via neural implants.

As it happens, others at that time were thinking similar thoughts and, rather than simply complaining, were setting about developing the content of such systems. Those visions, and other less ambitious but more immediately useful aspects of an evolving electronic literature, are the main focus of this essay. The technologies of delivery are a distinct aspect that I shall return to. Neural implants are for another day.

Much of what I’m going to speculate about or, more rashly, predict in this essay implicitly reflects the major impact of the World Wide Web on scientists. But, as I hope I shall make clear, the Web is by no means yet a mature medium of scientific communication—or any other communication, for that matter. Thankfully, there is a creeping maturation taking place. My hope is that, by the time the Web and its infrastructure are fully developed, its multi-dimensional successors will already be poised to take over.

This essay isn’t much about Nature, but perhaps I might be forgiven for using the journal as an example to set the agenda. Nature’s mission statement, crafted in 1869, reflected two tasks that its founders set themselves: to facilitate communication between researchers (or “men of science” as they were then called); and to place the results of science before the public. Today’s researchers, as well as today’s Nature, find themselves having to think about both of these agendas more than ever before. I shall start with the first, looking at inevitable changes in communication between scientists, stop off on the way for a look at the evolving Web, and finish with one particular enhancement I’m glad to be able to celebrate, in communications with the public and, especially, the developing world.

Impact of changes in science

Readers of this essay will need no reminding of the growth of high-throughput biology. This requires copious amounts of information to be stored and communicated. Furthermore, the information may be highly granular. A series of small changes in experimental conditions will lead to a corresponding library of outputs from, say, a cDNA microarray.

One change on the way is that alternative methods of storage of these outputs may be equally useful and accessible. If the results are critical to the conclusions of a scientific paper, a journal may want to host them on its own website. Or they may be stored in a community database—a microarray equivalent (yet to be established) of the sequence database Genbank. Or the results may be stored by the laboratory on its own or its host institution’s web database.

Journals, accordingly, will need to make an enhanced commitment to methods and data. The yeast geneticist Michael Eisen, for example, works in the highly data-intensive area of expression profiling. He has talked about the yeast “paper” of the future, and vividly describes the enormity of the challenge which may come to confront journals in this respect. The library of microarray outputs that I mentioned earlier requires a parallel library of precise statements of methods, in the form of a tree of linked descriptions of the systematically varied successive stages of experiments. In practice, such a multipath description is unprintable—only the Web will serve. The costs of storing that information routinely could become significant. The additional burden on referees and editors may be considerable too. But so, hopefully, will be the scientific value.
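To make the idea concrete, here is a minimal sketch, in Python, of one way such a tree of linked method descriptions might be organised. The class, its fields, the example steps and the output identifiers are my own illustrative assumptions, not any journal’s actual scheme.

```python
# Illustrative sketch only: a tree of method descriptions in which each node
# records one systematically varied step, and leaves point to the microarray
# outputs produced under that chain of conditions.
from dataclasses import dataclass, field

@dataclass
class MethodStep:
    description: str                               # e.g. "growth at 30 degrees C"
    outputs: list = field(default_factory=list)    # identifiers of result files (hypothetical)
    children: list = field(default_factory=list)   # subsequent varied steps

    def add(self, child: "MethodStep") -> "MethodStep":
        self.children.append(child)
        return child

root = MethodStep("wild-type yeast culture")
heat = root.add(MethodStep("shift to 37 degrees C for 30 min"))
heat.outputs.append("array_0001")                  # hypothetical output identifier
control = root.add(MethodStep("kept at 30 degrees C"))
control.outputs.append("array_0002")

def walk(node, path=()):
    """Print every experimental path and the outputs it produced."""
    path = path + (node.description,)
    if node.outputs:
        print(" -> ".join(path), "|", node.outputs)
    for child in node.children:
        walk(child, path)

walk(root)
```

Each path through the tree is one precise statement of methods; on paper the combinatorial branching is unprintable, but as linked web documents it can be navigated directly from the data it describes.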

Journals have always accepted a duty to host the results they publish. But there are reasons for institutions, too, to consider taking on new responsibilities to host archives of the data produced in their laboratories, and the conditions under which they were produced. First, because some of the information may not need to be formally published to be valuable to specialists elsewhere. Second, because increasing attention to the prevention of misconduct requires institutions to keep better records of their researchers’ practices at the lab bench. And thirdly, some research institutions, including MIT, the University of California and the Max Planck Society, are developing e-print servers for their laboratories’ output—not just text but data, video and images—to enhance their institutional accessibility and impact.

New community databases

An alternative way forward is for a community to co-ordinate its needs and develop a central community database. The scientific stimulation here lies in the innovative intellectual value being developed within the newer databases. Genbank and the Protein Data Bank are powerful resources, but elementary in content compared to schemes more recently hatched.

New developments include ontological databases, in which the terminology and the functions of the elements referred to are rigorously defined in order to provide a standard by which disciplines and databases can more readily interact with each other. The Gene Ontology Consortium is perhaps the best known provider of a nascent ontology, developing precise descriptors to be used in referring to genes and conveying their roles (see http://www.geneontology.org).
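To give a rough feel for what such rigorously defined descriptors involve, here is a minimal Python sketch of ontology-style term records whose relationships to one another are themselves data. The identifiers, definitions and relations below are invented for illustration; they are not actual Gene Ontology entries.

```python
# Illustrative sketch only: what a rigorously defined ontology term might carry.
# The "EX:" identifiers and their definitions are made up, not real GO records.
terms = {
    "EX:0000001": {
        "name": "carbohydrate metabolic process",
        "namespace": "biological_process",
        "definition": "Chemical reactions and pathways involving carbohydrates.",
        "is_a": [],
    },
    "EX:0000002": {
        "name": "glucose metabolic process",
        "namespace": "biological_process",
        "definition": "Chemical reactions and pathways involving glucose.",
        "is_a": ["EX:0000001"],   # the relationship to a parent term is part of the record
    },
}

def ancestors(term_id):
    """Collect every parent term reachable through 'is_a' relationships."""
    parents = terms[term_id]["is_a"]
    return set(parents) | {a for p in parents for a in ancestors(p)}

# Because the relationships are data, a gene annotated with the child term can
# be retrieved automatically by a query on the broader parent term.
print(ancestors("EX:0000002"))   # {'EX:0000001'}
```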

The Alliance for Cellular Signaling reflects a different trend, towards the generation of databases that encapsulate as much as possible of available knowledge in summary form. The AfCS seeks to capture data about signalling molecules, with a particular emphasis on B lymphocytes and cardiac myocytes. This is a major collaboration of 50 scientists, funded to the tune of $10 million per year over at least five years. The AfCS website will incorporate unpublished data from the laboratories as well as “molecule pages”—several thousand web pages, each maintained by its respective researcher, hosting a summary of all the essential aspects of a signalling molecule. I should declare an interest, as Nature is collaborating with AfCS in developing joint peer-reviewed websites.

The other innovative database I want to mention is the type that I mentioned at the outset: a multi-dimensional and representational store of the information in the literature. One vivid example could be the RiboWeb project, at Stanford University (see http://smi-web.stanford.edu/projects/helix/riboweb.html). This is intended to integrate a range of information about the entire ribosome of prokaryotes: ontological and functional descriptions, sequences, secondary and tertiary structures, software to analyse and extend the presentation and analysis of information, and intelligent search functions. One can in principle enter this informational resource by accessing and moving through representations of the ribosome’s structure. Another example of such an approach is a representation of metabolic, transport and gene-regulation networks of Escherichia coli (see http://ecocyc.org). How useful these are in practice remains to be seen, but alternative ways into the knowledge of systems that are increasingly hard to describe with mere words seem desirable.

Note that these databases are not intended to displace the role of peer-reviewed reports of original research. Rather, they provide a stimulating new way of filtering and representing the knowledge and hypotheses to be found there. My only concern is that these databases are in the early stages of their development and full of ideals whose realisation could prove monumental in achievement—but also in complexity and expense. The electronic publishing world is only too familiar with its own history of good ideas that collapsed from an underestimation of the costs or the fickleness of their originators, for all their initial good intentions.

But I am optimistic about those databases where the originators appear realistic about the scale of the challenge, have the personalities and commitment to see it through, and have attracted substantial funding from government, philanthropic organisations or industry. These represent an exciting new electronic extension to the scientific literature. Nevertheless, there needs also to be confidence that a database will last; its maintenance or even survival cannot be taken for granted.

Take, for example, the “Los Alamos” preprint server. Paul Ginsparg, its founder, was in my view lucky recently to find an institution—Cornell University—willing to commit to hosting that ever-expanding database, when the Los Alamos laboratory apparently lost its resolve to support it. Similarly lucky were the countless astronomers and high-energy physicists who now consider it as natural as the air they breathe to have the service freely available on tap.

Likewise, biologists worldwide now consider the PubMed abstracts database, paid for by the US National Library of Medicine, to be one of their inalienable rights. But when the US Department of Energy sought to develop a similar service for the physical sciences, Congress blocked it, because it was seen as government competing with the private sector. How that contradiction will be resolved is still an open question. No publisher or politician is likely to make themselves so unpopular as to lobby for the privatisation of PubMed, but it is an interesting question whether anyone proposing to launch it now would be able to persuade the current US administration and Congress to support it.

But such concerns need not distract us from the essential point. With links to and from new databases in many new locations, and new types of experiment to be reported, the scientific literature of the future is likely to become a much richer entity than at present, with much of it quite beyond the means of ink on paper to represent.

Post-publication publishing

Everyone will be familiar with the pressures on traditional business models of scientific publishing. Unfortunately, electronic-only publishing would make little difference to publishers’ costs because print and paper are only a small component of total overheads, and electronic production and distribution bring costs of their own.

One assertion sometimes made is that, in a world based on the subscription model, revenues coming in to publishers after publication are trivial. Not from my viewpoint. Indeed, the world of electronic publishing opens up new opportunities for publishers to add value after the point of initial publication.

Our estimates at Nature of the revenues received after the point of publication, as opposed to those from subscription and advertising, are commercially confidential but also woefully uncertain in their detail. In particular, we have no readily available timeline of past papers being paid for because, hitherto, there’s been no need. Crude analysis of patterns of inter-library loans served by the British Library, and analyses of accesses post-publication on the web by us and other publishers, suggest a steep drop in requests or accesses within a few weeks of publishing, followed by a long slowly declining tail extending over many years. Analysis of accesses by JSTOR, an electronic archive for many journals, suggests, unsurprisingly, that access to a digital archive significantly boosts incursions into the older literature compared to print archives.

Current sources of revenues post-publication include third party aggregators, interlibrary loans, licensing, reprint orders, individual document purchases and sponsorship of post-publication collections distributed to new audiences. In assessing the value of different kinds of revenue, one has also to consider the costs incurred in obtaining them. These particular revenues have comparatively low costs, so they contribute correspondingly more significantly to the overall financial health of the operation, whereas subscriptions are obtained with high costs (marketing, fulfilment, etc.). One cannot readily or sensibly treat post-publication revenues as a separate business.

Thus post-publication revenues—now and potentially—are not trivial for our journals. What is also clear is that the electronic environment gives opportunities to develop such revenues, which in turn could support additional publishing services and content for subscribers. Therefore even editors, let alone publishers, are reluctant to start giving away papers without very good reason.

What the community would undoubtedly love is a seamless archive of the entire scientific literature, so that papers in obscure journals are as available as any others. And given the pace of science, everyone would like the literature freely available immediately. How might this be achieved?

The costs of publishing have to be paid for somewhere. So if one scraps payments by users—subscriptions and post-publication—then one has to call on authors or third parties. Sponsorship and advertising are not sustainable in this context. Government publishing is politically unacceptable, at least after an innovative initiative is fully established. Authors, therefore, seem to be the only alternative available.

Authors, however, are unlikely to want to pay for submission, given the risks of rejection. A massive publication fee therefore would need to become the norm. That is not an impossible scenario, but the history of journals that have tried to move in that direction suggests that it is not one that can readily be developed. And, as an editor, I am personally distrustful of a scheme where readers don’t express their perceived value of a publication’s content by paying for it. For me, that has always been one crucial feedback loop that indicates to me and my colleagues that editorial judgements are providing value to the community in the way that we intend.

Evolving access

When contemplating future directions for Nature, I have two groups of readers in my head. One consists of browsers. For now, they will continue to want printed copies to read, wherever and whenever it suits them. (I cannot resist adding that the number of such browsers of Nature continues to grow year by year, total circulation now standing at about 67,000).

But with the availability of site licenses and Web access to our homepage directly or via Google and Yahoo, and access to individual papers via PubMed, electronic scavengers of Nature are now our biggest group of users. As researchers tend to say nowadays, “if it’s not easily accessible on my desktop, I won’t even cite it”. Overlapping with the browser group, scavengers are significantly greater in number, but with a much shorter attention span, busy as they are. (If today’s younger postdocs are anything like me at that stage in my career, they’ll be feeling guilty at even the thought of reading anything outside the narrow confines of their research.)


High-priced business models and totally free access are both contentious. But enhancing the means of access? There, one would think, everyone ought to be pursuing the same goal. So let’s suppose that the entire published research literature, rather than just papers’ abstracts, was accessible to search engines. This would be much more closely aligned to the real needs of researchers, who are at least as likely to seek the content of a figure caption, table or methods section as they are a title or abstract.

Everyone will be used to the Hypertext Markup Language (HTML), by which the structure of documents—titles, authors, abstracts etc.—is encoded by tags that label each component accordingly. With a newer standard for coding text, the Extensible Markup Language (XML), the tagging can be defined by the user according to the needs of software that can be applied to it. Thus it becomes possible to design documents that are much more functional—tables that become interactive, words that can be recognised as labels for specified entities such as proteins or genes. A publisher or database host could then routinely tag all documents and supply a search engine designed specifically for the common use of a particular discipline.
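As a minimal sketch of the kind of discipline-specific tagging XML makes possible, here is a Python fragment that marks a gene name as a recognisable entity. The tag and attribute names are invented for illustration and belong to no publisher’s actual schema; “YFG1” is a placeholder gene name.

```python
# Illustrative sketch: wrap a gene name in a user-defined XML tag so that
# software can treat it as a labelled entity rather than plain text.
import xml.etree.ElementTree as ET

para = ET.Element("paragraph")
para.text = "Expression of "
gene = ET.SubElement(para, "entity", attrib={"type": "gene", "id": "YFG1"})
gene.text = "YFG1"
gene.tail = " rises sharply under heat shock."

# Software aware of the <entity> tag can index the gene, link it to a database
# record, or expose it to a discipline-specific search engine.
print(ET.tostring(para, encoding="unicode"))
# <paragraph>Expression of <entity type="gene" id="YFG1">YFG1</entity> rises sharply under heat shock.</paragraph>
```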

More usefully, such tags could be applied across the papers of many publishers. Here is one area where publishers are beginning to show a little bit of vision, but where there is a huge way to go. For example, Crossref is a collaboration of publishers which has already established agreements that allow users of papers to jump from references to the cited papers of another publisher. But that provides access only to whatever the publisher has decided to make freely available—Nature provides abstracts. A new collaboration, CrossSearch, in which Nature is actively participating, is developing comprehensive search access and functionality (though not free access) across the full-text content of all the collaborating publishers. It remains to be seen how many publishers will participate.
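The plumbing beneath such reference links is the DOI, a persistent identifier that a public resolver redirects to whichever landing page the publisher of record has registered. A minimal sketch, assuming the resolver at https://doi.org/ and using this article’s own DOI as the example:

```python
# Hedged sketch, not a definitive client: resolve a DOI through the public
# https://doi.org/ resolver and report where it redirects.
import urllib.request

doi = "10.1016/S0022-2836(02)00347-9"   # this article's DOI
with urllib.request.urlopen(f"https://doi.org/{doi}") as resp:
    # urlopen follows the resolver's redirect; geturl() gives the final
    # address, i.e. the landing page registered for this DOI (often an
    # abstract page rather than free full text).
    print(resp.geturl())
```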

That, however, is only one way forward in achieving in-depth access. A central archive of publishers’ full-text content, or of the coding behind that content, can also, at the least, provide search and added functionality. That is the goal of the E-BioSci database being developed under the leadership of EMBO, with EU funding. Nature’s publishers are playing a leading role in that collaboration as another way of enhancing access to their content. They are much more cautious about, but are engaging with, the NLM’s PubMed Central. That central literature database bundles the goal of access with the more controversial goal of totally free access (after a delay). For better or worse, few publishers are prepared to go down that route, Nature’s included.

All of these are rather specialized developments for scientists which ride on top of the infrastructure of the Web. To my mind it is important that publishers appreciate the long-term self-interest in collaborating to achieve access across the literature, rather than pursue impossible goals of becoming one-stop shops. But what of the future of that infrastructure?

Networked toasters and the paperless revolution

The British habit of putting slabs of white bread into electric grilling machines at the breakfast table, combined with their national obsession with the local condition of the atmosphere, has given them a peculiar advantage in the information technology revolution. They can now link their toasters to the Web, download an image supplied by forecasters depicting that day’s weather, and have the toaster apply a mask that grills the image onto the toast. (See Nature 410, 859; 2001.)

One can only speculate as to when, as will surely happen, more conventionally portable technologies will be available that provide the quality and convenience of print in portable electronic form. Broadband wireless, local connections and the display technologies that might generate portability have yet to materialize, though most in the industry will speak confidently of such technologies’ arrival within a few years.

More immediately, there is a need for another source of inconvenience to be removed: that endless series of usernames and passwords required to match the diverse formats insisted upon by different organisations. One useful step forward here is represented by the Liberty Alliance (see http://www.projectliberty.org) and Microsoft’s “Net Passport” (see http://www.passport.com). All online businesses and suppliers that participate will recognise the unique identifier that such schemes give the user. It is, in other words, exactly like a passport, allowing you to be identified automatically anywhere on the web that is a participant. It can be linked to a credit card. Thus access and payments both become a far simpler process in the future. A quick search will show any reader that the Web is sprinkled with warnings about the dangers in such systems, but their time will surely come.

What of the web’s visionaries? For many years the Web’s inventor, physicist Tim Berners-Lee, has been working on the “semantic web” (see http://www.w3.org/2001/sw). His vision is of a web that allows machines as well as people to interact with each other. With the help of XML and post-XML tagging of documents, according to new international standards, it will become possible to apply artificial intelligence across the web in a way that will greatly enhance users’ access to complex arrays of information (Berners-Lee, T. & Hendler, J. (2001). Nature, 410, 1023). Berners-Lee envisages a much readier sharing of experimental data with trusted colleagues. He also anticipates a much readier communication between scientific disciplines, involving translation technologies. Such products, says Berners-Lee, “will allow users to create relationships that allow communication when the commonality of concept has not yet led to a commonality of terms”. Such technologies would lead to intelligent probing of the literature and experimental databases (Rzepa, H. S. & Murray-Rust, P. (2001). Learned Publishing, 14, 177–182).
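To illustrate the sort of machine-readable statements the semantic web builds on, here is a toy Python sketch of “triples” and a query over them. The vocabulary and identifiers are invented for illustration and do not come from any real W3C or publisher schema.

```python
# Toy sketch only: subject-predicate-object statements, with relationships
# between vocabularies expressed as data that software can follow.
triples = [
    ("paper:12345", "ex:reportsExpressionOf", "gene:YFG1"),
    ("gene:YFG1",   "ex:equivalentTo",        "gene:yourFavouriteGene1"),
    ("gene:YFG1",   "ex:studiedIn",           "organism:S.cerevisiae"),
]

def related(subject, predicate):
    """Follow one kind of relationship out from a subject."""
    return [o for s, p, o in triples if s == subject and p == predicate]

# Because the equivalence between two communities' terms is itself a statement,
# software can bridge vocabularies that people have not yet unified.
print(related("gene:YFG1", "ex:equivalentTo"))   # ['gene:yourFavouriteGene1']
```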

The Web and public outreach

I hope I have made clear how much more the web and its successors should offer scientists in the future. It won’t be long, I suspect, before online-only primary research papers become common, with only their summaries, including some of the key data, available in print. But what of the broader audiences of science?

In recent years, more and more learned societies and government departments have taken up the opportunities of openness and consultation that the Web provides in communication with non-scientists—whether policy makers, stakeholders or interested onlookers. The media have increasingly provided extensions of their traditional coverage with background briefings on their websites.

This represents a particular opportunity for scientists working in controversial areas of science. Journalists will respond to assistance wherever it comes from, as many lobby groups have been pleased to discover. As Swiss scientists discovered in 1998 when faced with a national referendum threatening to ban the use of GM organisms of all kinds and in all contexts, they had only themselves to call on in responding to the need to supply timely opinions and facts in the face of their opponents. In that spirit, it is entirely feasible for a group of scientists to band together to make common cause by means of a website and e-mails.

But there is also a need for a more detached but authoritative presentation of information to stakeholders and the wider public. This is particularly so for researchers and others in developing countries who lack access to traditional sources of information, in the context of debates over AIDS, or GM crops, or intellectual property rights.

In this context, I am particularly pleased to cite as an example a small organisation that has recently spun off from Nature, is independently funded by national governments and international charities, and is itself a registered charity. Its purpose is to publish a website, SciDev.Net, that provides original news and background briefings, and links to content that Nature and Science have agreed to make free for this venture, plus copies of policy and ethics reports, on these and, progressively, many other hot topics for the developing world. Launched in December 2001, it can be found at http://www.scidev.net

It is one thing to tell people what science is telling us. It is another to consult them. Whether openness on the Web will in turn stimulate the development of more consultative public structures within which science operates is unclear. The history of web debates and consultations is not an inspiring one. But there is no limit to the seriousness of engagement with science by some members of the public when their close personal interests are at stake, when appropriate information is readily accessible, and when influential channels of communication are open.

Acknowledgments

I am pleased to acknowledge my colleague Declan Butler who, not least in his stewardship of Nature’s web forum on the electronic literature, has helped me and many others think our way through these issues. The priorities and opinions presented here are my own, and are not necessarily those of the Nature Publishing Group.

Philip Campbell has been the Editor-in-Chief of Nature since December 1995. Following a first degree in aeronautical engineering, he took a Master’s degree in astrophysics and did doctoral and postdoctoral research in upper atmospheric physics. In 1979, he joined Nature, where he became Physical Sciences Editor. He left Nature in 1988 to be the founding editor of Physics World, published by the UK’s Institute of Physics, until his return to Nature as its Editor-in-Chief. He is a director of the Nature Publishing Group, having overall responsibility for the editorial quality of all Nature publications. He is a Fellow of the Royal Astronomical Society and a Fellow of the Institute of Physics, and was awarded an honorary DSc by Leicester University in 1999. He was the first person to be given the European Science Writers Award by the Euroscience Foundation, a prize inaugurated in 2001. In 1999 he was an adviser to the UK government’s Office of Science and Technology on the public consultation on the regulation of biosciences and biotechnology. Until recently he was a member of the Wellcome Trust’s Medicine in Society panel, which advises on research grants in bioethics and in the public understanding of science. His main interest outside his work is music.

(Received and Accepted 11 April 2002)
