Economic principles applied to the publication system for biomedical research reveal a publication bias analogous to the winner’s curse. Elite high-impact scholarly journals continue to raise artificial publication barriers by underusing open access, neglecting negative data, and publishing unrepresentative results of repeated samplings of the real world. Access to our communal knowledge and memory through archives is essential to the democratic process.

Read more: Young, Ioannidis & Al-Ubaydli (2008).

Currently, publicly funded, peer-reviewed academic research published in exclusive journals largely informs public policies on biomedicine, the economy, the environment, education, justice, housing, and more. These journals now make articles available online at exorbitant prices. Contributors to these journals earn tremendous academic capital crucial to professional advancement. Password protection and high costs prevent the public from accessing the most recent, relevant and accurate research. The number of publicly accessible sites is growing as search engines dig deeper into the Deep Web and the open access movement grows among some academics and scientists [2, 3].

In this concise, fact-filled article published by the Association of Research Libraries (ARL) [1] (2003-05-04), the authors described how, even five years ago, librarians were concerned by mergers in scholarly publishing, which reduced the number of players, and by rising journal subscription rates, which severely eroded the purchasing power [7] of libraries, universities, and scholars who require crucial publications for teaching, learning and research.

In February 2009, Jennifer McLennan, SPARC’s [5] Director of Communications, encouraged all supporters of public access to taxpayer-funded research (researchers, libraries, campus administrators, patient advocates, publishers, and others) to oppose H.R. 801, the “Fair Copyright in Research Works Act,” re-introduced on February 11, 2009 by the Chairman of the House Judiciary Committee, Rep. John Conyers (D-MI). This bill “would reverse the National Institutes of Health (NIH) Public Access Policy and make it impossible for other federal agencies to put similar policies into place.” The bill goes further than prohibiting open access requirements: it also prohibits government agencies from obtaining a license to publicly distribute, perform, or display such work by, for example, placing it on the Internet, and it would repeal the longstanding ‘federal purpose’ doctrine, under which all federal agencies that fund the creation of a copyrighted work reserve the ‘royalty-free, nonexclusive right to reproduce, publish, or otherwise use the work’ for any federal purpose. The NIH currently requires NIH-funded research to be deposited in open-access repositories (Doctorow 2009). H.R. 801 would benefit for-profit science publishers and increase the challenges of making the Deep Web more accessible. See also Doctorow, Cory. 2009-02-16. “Scientific publishers get a law introduced to end free publication of govt-funded research.”

In 2000 the Open Archives Initiative (OAI) [4] focused on increasing access to scientific research (Van de Sompel & Lagoze, 2000). Since then it has reached deeper into the Deep Web with its Protocol for Metadata Harvesting (OAI-PMH); a minimal sketch of a harvesting request appears below. See Cole et al. (2002).
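To make the mechanics concrete, here is a minimal sketch of an OAI-PMH harvest in Python. The protocol’s verbs (ListRecords, GetRecord, Identify, and so on) and its XML response namespaces are defined by the OAI; the repository base URL and the assumption that records carry simple Dublin Core (oai_dc) titles are hypothetical placeholders, not a reference to any particular archive.

```python
# A minimal sketch of an OAI-PMH ListRecords harvest, assuming a hypothetical
# repository endpoint. Real repositories expose the same verbs and return XML
# in the http://www.openarchives.org/OAI/2.0/ namespace.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "https://repository.example.org/oai"  # hypothetical endpoint

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records(metadata_prefix="oai_dc"):
    """Issue a ListRecords request and yield (identifier, title) pairs."""
    params = urllib.parse.urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
    })
    with urllib.request.urlopen(f"{BASE_URL}?{params}") as response:
        tree = ET.parse(response)
    # A full harvester would also follow resumptionToken paging; this sketch
    # only reads the first batch of records.
    for record in tree.iter("{http://www.openarchives.org/OAI/2.0/}record"):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        yield identifier, title


if __name__ == "__main__":
    for identifier, title in list_records():
        print(identifier, "-", title)
```

The point of the protocol is that any third-party service can issue the same simple HTTP requests against many repositories and build a searchable index over the assembled metadata, which is how OAI-PMH helps surface material that ordinary link-following crawlers miss.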

Notes

1. In early 2002, the Association of Research Libraries (ARL) Office of Scholarly Communication task force recommended that the Association promote “open access to quality information in support of learning and scholarship.” Society benefits from the open exchange of ideas. Access to information is essential in a democratic society. Public health, the economy, and public policy all depend on access to and use of information, including copyrighted works.

2. UC-Berkeley biologist Michael Eisen, Nobel Laureate Harold Varmus and Stanford biochemist Patrick Brown helped start the Public Library of Science (PLoS) in 2000, a “nonprofit organization of scientists and physicians committed to making the world’s scientific and medical literature a freely available public resource,” by encouraging scientists to insist on open-access publishing models rather than being forced to sign over their (often publicly funded) research to expensive scientific journals. Wright (2004) cited Eisen, Varmus and Brown as examples of scientists who are making some areas of the Deep Web more accessible to the public.

3. See Alex Steffen (2003 [2008-09-04]) on the open source (OS) movement.

4. The Open Archives Initiative (OAI) “develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content. The OAI Metadata Harvesting Protocol allows third-party services to gather standardized metadata from distributed repositories and conduct searches against the assembled metadata to identify and ultimately retrieve documents. While many proponents of OAI advocate open access, i.e., the free availability of works on the Internet, the fundamental technological framework and standards of the OAI are independent of both the type of content offered and the economic models surrounding that content (ARL).”

5. The Scholarly Publishing and Academic Resources Coalition (SPARC), launched in June 1998, is an international alliance of academic and research libraries working to correct imbalances in the scholarly publishing system.

6. SciDev.Net (Science and Development Network) “is a not-for-profit organisation dedicated to providing reliable and authoritative information about science and technology for the developing world. Through our website www.scidev.net we give policymakers, researchers, the media and civil society information and a platform to explore how science and technology can reduce poverty, improve health and raise standards of living around the world. We also build developing countries’ capacity for communicating science and technology through our regional networks of committed individuals and organisations, practical guidance and specialist workshops.” SciDev.Net “originated from a project set up by news staff at the journal Nature (with financial assistance from the Wellcome Trust, United Kingdom) to report on the World Conference on Science, held in Budapest in 1999. This was warmly received, leading to discussions about creating a permanent website devoted to reporting on, and analysing the role of, science and technology in development. The initiative was endorsed at a meeting held at the Academy of Sciences for the Developing World (TWAS) in Trieste, Italy, in October 2000. Immediately following the Trieste meeting, the UK Department for International Development (DFID) agreed to finance a six-month planning stage, starting in November 2000. At the end of this planning stage, sufficient funding had been raised from international aid agencies and foundations for a full-time staff and an independent office in London. The SciDev.Net website was officially launched on 3 December 2001. The website has expanded continuously since its launch. We regularly add dossiers, spotlights, ‘quick guides’ and ‘news focuses’ on specific subjects, in addition to a growing amount of regular news coverage. An enhanced and redesigned version of the website was launched in January 2008. Regional networks were launched in Sub-Saharan Africa (2002), in Latin America (2003), in South Asia (2004) and in China (2005), each bringing together individuals and organisations that share our goals and objectives. There are plans for future networks in the Middle East and North Africa, West Africa and South-East Asia. SciDev.Net held its first workshop, in collaboration with the InterAcademy Panel, on science in the media in Tobago in February 2001. Since then we have collaborated with partners to deliver numerous specialist science communication workshops for journalists and other professional communicators across the world (SciDev.Net History).”

7. “Expenditures for serials by research libraries increased 210% between 1986-2001 while the CPI increased 62%. The typical library spent 3 times as much but purchased 5% fewer titles. Book purchases declined by 9% between 1986-2001 as libraries sought to sustain journals collections. Based on 1986 purchasing levels, the typical research library has foregone purchasing 90,000 monographs over the past 15 years. In the electronic environment, the model has changed from the purchase of physical copies to the licensing of access. In general, libraries do not own copies of electronic resources and must negotiate licenses (rather than depend on copyright law) to determine access and use. Large bundles of electronic journals offered by major commercial publishers will force smaller publishers out of business. Multiple-year licenses to large bundles of content that preclude cancellations will force libraries to cancel titles from smaller publishers to cover price increases of the bundles. This diminishes competition and increases the market control of the large publishers. Lack of corrective market forces has permitted large companies to reap high profits from publishing science journals. In 2001 Reed Elsevier’s STM division’s operating profit was 34% while its legal division’s operating profit was 20%, its business division’s 15%, and education 23%. Mergers and acquisitions increase prices and eliminate competition. Research has shown that mergers exacerbate the already significant price increases of journals owned by the merging companies. While there were 13 major STM (Science, Technology and Medicine) publishers in 1998, only seven remained by the end of 2002 (ARL 2003-05-04:2).”

Webliography and Bibliography

Cole, Timothy W.; Kaczmarek, Joanne; Marty, Paul F.; Prom, Christopher J.; Sandore, Beth; Shreeves, Sarah. 2002-04-18. “Now That We’ve Found the ‘Hidden Web,’ What Can We Do With It? The Illinois Open Archives Initiative Metadata Harvesting Experience.” Museums and the Web (MW) Conference. Archives and Museums Informatics. University of Illinois at Urbana-Champaign, USA. April 18-20.

Smith, Richard. 2008-10-07. “More evidence on why we need radical reform of science publishing.”

Steffen, Alex. 2008-09-04 [2003]. “The Open Source Movement.” WorldChanging Team.

Young, N.S.; Ioannidis, J.P.A.; Al-Ubaydli, O. 2008. “Why Current Publication Practices May Distort Science.” PLoS Medicine. 5:10.

ARL. 2003-05-04. “Framing the Issue.” Association of Research Libraries (ARL).

There is information on the Web that is not (yet) indexed by the big search engines such as Google. This non-indexable part of the Web is called the Dark, Deep, Hidden or Invisible Web. Fortunately, the Invisible Web is getting easier to search, with tools beyond the standard “big three” search engines. According to a recently published PhD dissertation (Shestakov 2008:5), the query-based dynamic portion of the Web known as the deep Web remained poorly indexed by search engines even in 2008.

Shestakov refined the distinction between the Deep, Hidden and Invisible Web:

“There is a slight uncertainty in the terms defining the part of the Web that is accessible via web search interfaces to databases. In literature, one can observe the following three terms: invisible Web [97], hidden Web [46 hidden behind web search interfaces], and deep Web [25]. The first term, invisible Web, is superior to the latter two terms as it refers to all kinds of web pages which are non-indexed or badly indexed by search engines (i.e., non-indexable Web). The terms hidden Web and deep Web are generally interchangeable, and it is only a matter of preference which to choose. In this thesis we use the term deep Web and define it as web pages generated as results of queries issued via search interfaces to databases available online. In this way, the deep Web is a large part but still part of the invisible Web (Shestakov 2008:5).”
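Shestakov’s definition lends itself to a simple illustration: a link-following crawler can only fetch pages that have stable URLs it can discover, while a deep-web result page comes into existence only after a query is submitted through a search interface. The sketch below, in Python, makes that contrast concrete; the host, paths and form field are hypothetical placeholders, not real services.

```python
# A minimal sketch of the surface-web / deep-web contrast described by
# Shestakov (2008). The host, paths and form field below are hypothetical.
import urllib.parse
import urllib.request

# Surface web: a page with a static URL that a crawler can discover by
# following links and fetch with a plain GET request.
static_page = urllib.request.urlopen("https://catalogue.example.org/about").read()

# Deep web: a results page that is generated only when a query is POSTed to a
# search interface backed by a database. There is no static link for a crawler
# to follow, so the content stays unindexed.
query = urllib.parse.urlencode({"q": "Zygmunt Bauman"}).encode()
dynamic_page = urllib.request.urlopen(
    "https://catalogue.example.org/search", data=query
).read()

print(len(static_page), "bytes of static HTML")
print(len(dynamic_page), "bytes of query-generated HTML")
```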


Garcia claimed that Texas-based university professor Jill H. Ellsworth (d. 2002), an Internet consultant for Fortune 500 companies, coined the term “Invisible Web” in 1996 to refer to websites that are not registered with any search engine. Ellsworth was co-author, with her husband Matthew V. Ellsworth, of The Internet Business Book (John Wiley & Sons, Inc., 1994), Marketing on the Internet: Multimedia Strategies for the World Wide Web (John Wiley & Sons, Inc.), and Using CompuServe. She also explored education on the Internet, and contributed chapters on business and education to the massive tome The Internet Unleashed.

[S]igns of an unsuccessful or poor site are easily identified, says Jill Ellsworth. “Without picking on any particular sites, I’ll give you a couple of characteristics. It would be a site that’s possibly reasonably designed, but they didn’t bother to register it with any of the search engines. So, no one can find them! You’re hidden. I call that the invisible Web.” Ellsworth also makes reference to the “dead Web,” which no one has visited for a long time, and which hasn’t been regularly updated (Garcia 1996).

I distinguish between the Invisible Web and the Deep Internet. Much of the research promoted through social media continues to focus primarily on business models of marketability, not just findability.

In 2008 the Deep Internet continues to be at cross purposes with the motivations of socially minded authors. Too many foundational texts and articles that could be useful to robust conversations in civil society are restricted to those with access codes to the deep internet, the dark counterpart to open source and Web 2.0+. One would hope that writings by and about key thinkers concerned with ethics, economics, psychoanalysis, sociology, cultural studies . . . would be made available under the Creative Commons 3.0 license preferred by many engaged thinkers, including many academics, in 2008. Many of the services of the Deep Internet operate within a private-sector, user-pay model. Others are restricted to members of exclusive academic associations, the insular knowledge elite, which also charge obligatory membership fees. JSTOR, for example, keeps its articles behind a paywall, providing only summaries and a small section of text for free.

In a recent online search for biographical information on Zygmunt Bauman, for example, a number of results referred to Deep Internet sites such as http://sociologyonline.net. One of the first sources freely available was http://www.megaessays.com.

Bauman held that “sociologizing makes sense only in as far as it helps humanity” and that “sociology is first and foremost a moral enterprise.”

“To think sociologically can render us more sensitive and tolerant of diversity. Thus to think sociologically means to understand a little more fully the people around us in terms of their hopes and desires and their worries and concerns (Bauman & May, 2001).”

 

A pioneer in knowledge management, Professor Kim Veltman of SUMS traced a history of major collections of recorded knowledge that changed the world, some taking centuries to construct. He argued that commercial offerings, however useful and profitable as short-term solutions, lack the essential long-term vision. Digital media, full digital scanning and preservation, and electronic networks could enable future generations in every corner of the world to access, study and appreciate all the significant literary, artistic, and scientific works of mankind. He is concerned that privatization of this communal memory is already underway and, without intervention, will only increase, effectively limiting access to those who have means. We have the means to shed light on the deep Internet. Is there the will?

 
“In a world where we make tens and even hundreds of millions of titles available online, readers need digital reference rooms. [T]he good news is that publishers have made many dictionaries, encyclopaedias and other standard reference works available in electronic form. Modern libraries now typically have an online section on Electronic Reference Sources. Special licences with publishers mean that some of these works are available free of charge at libraries and universities. Companies such as XReferplus now offer access to 100 or 150 standard reference works. The less good news is that the electronic versions of these reference works are frequently so expensive that they are beyond the reach of individual scholars. Meanwhile, there has been a trend for such reference works to be owned by a few key companies. In Germany, the pioneer in this field was K. G. Saur, which publishes “nearly 2000 print, microfilm, and electronic formats.” In 1987, Saur was acquired by Reed International. In 2000, it became part of the Gale Group owned by Thomson. In the United States, Dialog, which was founded in 1967 and “provides access to over 9 terabytes or more than 6 million pages of information,” was acquired by the same Thomson Company in 2000. Meanwhile, Bowker, founded in 1872, which publishes Ulrich’s International Periodicals Directory (1932) and Books In Print (1948-), was acquired by Xerox (1967), then Reed International (1981), then by Cambridge Information Group (2001), which has recently also acquired ProQuest Information and Learning (2006). Today, works such as Books in Print are available only to institutions and are no longer available to individual subscribers. Fifty years ago, only the richest libraries could hope to achieve near comprehensive coverage of secondary literature. Today, practically no library can hope to be comprehensive and most collections are retreating. For instance, Göttingen, which had over 70,000 serials in the 1970s, now covers 30,000 serials. The California Digital Library has 21,000 electronic journals, which is impressive until we recall that Ulrich’s Periodicals Index lists 250,000 journals and serials. Meanwhile, at the University of California San Francisco, we find another modern catalogue that looks objective until we look closely and discover that of the 20 headings nine are traditional subjects and the remainder are branches of medicine (Appendix 3) … Ever since Gutenberg went bankrupt from the first printing, it has been obvious that publishers need to be attentive to survival. For a very few companies this is not a problem. For instance, in 2004, Reed Elsevier listed an operating profit of £1126 million and profit attributable of £675 million. Somewhat disturbing is a trend whereby the world of long-term recorded knowledge is increasingly being framed in the terms of short-term business propositions, as if the whole of the public sphere was open to business exploitation … (Veltman 2007:12).”

Webliography and Bibliography on the Deep Internet

Bergman, Michael K. 2001. “The Deep Web: Surfacing Hidden Value.” Taking License: Recognizing a Need to Change. Journal of Electronic Publishing. 7:1. Ann Arbor, Michigan: Scholarly Publishing Office, University of Michigan University Library. August.

Ellsworth, Jill H.; Ellsworth, Matthew V. 1994. The Internet Business Book. John Wiley & Sons, Inc.

Ellsworth, Jill H.; Ellsworth, Matthew V. 1997. The Internet Business Book. John Wiley & Sons, Inc.

Ellsworth, Jill H.; Ellsworth, Matthew V. 1995. Marketing on the Internet: Multimedia Strategies for the World Wide Web. John Wiley & Sons, Inc.

Ellsworth, Jill H.; Ellsworth, Matthew V. 1996. Marketing on the Internet: Multimedia Strategies for the World Wide Web. 2nd Edition. John Wiley & Sons, Inc.

Ellsworth, Jill H.; Ellsworth, Matthew V. Using CompuServe. John Wiley & Sons, Inc.

Ellsworth, Jill H. Chapters? The Internet Unleashed.

Garcia, Frank. 1996. “Business and Marketing on the Internet.” Masthead. 9:1. January. Alternate URL at web.archive.org.

Shestakov, Denis. 2008-05. “Search Interfaces on the Web: Querying and Characterizing.” PhD dissertation. Turku Centre for Computer Science, Finland.

Veltman, Kim H. 2007. “Framework for Long-term Digital Preservation from Political and Scientific Viewpoints.” Digitale Langzeitarchivierung. Strategien und Praxis europäischer Kooperation, anlässlich der EU-Ratspräsidentschaft Deutschlands, 20-21 April 2007. Frankfurt: Deutsche Nationalbibliothek.

See also: Timeline: Deep Web (work in progress).