Timeline of selected events related to the Deep Web (work in progress)

1980 Tim Berners-Lee “developed his first hypertext system, ‘Enquire,’ for his own use (although unaware of the existence of the term HyperText). With a background in text processing, real-time software and communications, Tim decided that high energy physics needed a networked hypertext system and CERN was an ideal site for the development of wide-area hypertext ideas (CERN).”

1989 Tim Berners-Lee started the WorldWideWeb project at CERN.

1992-09 Arthur Secret at CERN created the first web gateway to a relational database system, RDB (Shestakov 2008-05).

1994 Dr. Jill Ellsworth “first coined the phrase ‘invisible Web’ to refer to information content that was ‘invisible’ to conventional search engines” (Bergman 2001, citing Garcia 1996).

1996 Frank Garcia (1996) claimed that Texas-based university professor Jill H. Ellsworth (d. 2002), an Internet consultant for Fortune 500 companies, coined the term “Invisible Web” in 1996 to refer to websites that are not registered with any search engine. “Ellsworth is co-author with her husband, Matthew V. Ellsworth, of The Internet Business Book (John Wiley & Sons, Inc., 1994), Marketing on the Internet: Multimedia Strategies for the World Wide Web (John Wiley & Sons, Inc.), and Using CompuServe. She has also explored education on the Internet, and contributed chapters on business and education to the massive tome, The Internet Unleashed.”

[S]igns of an unsuccessful or poor site are easily identified, says Jill Ellsworth. “Without picking on any particular sites, I’ll give you a couple of characteristics. It would be a site that’s possibly reasonably designed, but they didn’t bother to register it with any of the search engines. So, no one can find them! You’re hidden. I call that the invisible Web.” Ellsworth also makes reference to the “dead Web,” which no one has visited for a long time, and which hasn’t been regularly updated (Garcia 1996).

1996-12-12 “The first commercial Deep Web tool (although they referred to it as the ‘Invisible Web’) was @1, announced December 12th, 1996 in partnership with large content providers. According to a December 12th, 1996 press release, @1 started with 5.7 terabytes of content which was estimated to be 30 times the size of the nascent World Wide Web (‘America Online to Place AT1 from PLS in Internet Search Area: New AT1 Service Allows AOL Members to Search The Invisible Web’).” See Choi (2008-01-07).

1996-12-12 “Personal Library Software, Inc. (PLS), the leading supplier of search and retrieval software to the online publishing industry, ushered in the next generation of Internet search engines with the introduction of a new Internet based service, AT1 which combines the best of PLS’s search, agent and database extraction technology to offer publishers and users something they have never had before: the ability to search for content residing in “hidden” databases — those large collections of documents managed by publishers not viewable by Web spiders. AT1 also allows users to create intelligent agents to search newsgroups and websites with E-Mail notification of results (Press release).”

1997 Michael Lesk wrote an unpublished paper entitled “How much information is there in the world?”, in which he estimated that in 1997 the Library of Congress contained between 20 terabytes and 3 petabytes of information. See Choi (2008).

1999-02 Lawrence and Giles (1999) claimed that the publicly indexable World Wide Web (PIW) contained about 800 million pages; the search engine with the largest index, Northern Light, indexed roughly 16% of the publicly indexable World Wide Web; the combined index of 11 large search engines covered (very) roughly 42% of the publicly indexable World Wide Web.

2000-03 c. 43,000–96,000 Deep Web sites existed (Bergman 2001).

2000-07-26 BrightPlanet released a study documenting the Deep Web (a massive storehouse of databases and information that was invisible to search engines in 2000), claiming that the Deep Web was 500 times larger than the indexed Web accessible by most search engines. BrightPlanet researchers also released their direct-query search technology, LexiBot™, which automatically identifies, retrieves, qualifies, and classifies content from Deep Web sites. They listed c. 20,000 searchable Deep Web sites. Because direct-query technology can access searchable databases that most search engines cannot, the Invisible Web is not really invisible, just harder to reach. See “BrightPlanet Unveils the ‘Deep’ Web: 500 Times Larger than the Existing Web.”

2001 BrightPlanet

“quantified the size and relevancy of the deep Web in a study based on data collected between March 13 and 30, 2000. Our key findings include: Public information on the deep Web is currently 400 to 550 times larger than the commonly defined World Wide Web; The deep Web contains 7,500 terabytes of information compared to nineteen terabytes of information in the surface Web; The deep Web contains nearly 550 billion individual documents compared to the one billion of the surface Web; More than 200,000 deep Web sites presently exist; Sixty of the largest deep-Web sites collectively contain about 750 terabytes of information — sufficient by themselves to exceed the size of the surface Web forty times; On average, deep Web sites receive fifty per cent greater monthly traffic than surface sites and are more highly linked to than surface sites; however, the typical (median) deep Web site is not well known to the Internet-searching public; The deep Web is the largest growing category of new information on the Internet; Deep Web sites tend to be narrower, with deeper content, than conventional surface sites; Total quality content of the deep Web is 1,000 to 2,000 times greater than that of the surface Web; Deep Web content is highly relevant to every information need, market, and domain; More than half of the deep Web content resides in topic-specific databases; A full ninety-five per cent of the deep Web is publicly accessible information — not subject to fees or subscriptions (Bergman 2001).”

2001 AlltheWeb, a public search engine, was launched. (AlltheWeb is now owned by Yahoo.com.) It was a redesign of Fast (1999-05 to 2001). Fast Search & Transfer is now a Microsoft subsidiary.

2000 Shestakov (2008) cites Bergman (2001) as the source for the claim that the term deep Web was coined in 2000. Bergman distinguished the Surface Web from the Deep Web using the metaphor of surface versus deep-water fishing or trawling. The term Deep Web is now generally preferred over Invisible Web.

2000 UC-Berkeley biologist Michael Eisen, Nobel Laureate Harold Varmus and Stanford biochemist Patrick Brown helped start the Public Library of Science (PLoS), a “nonprofit organization of scientists and physicians committed to making the world’s scientific and medical literature a freely available public resource,” by encouraging scientists to insist on open-access publishing models rather than being forced to sign over their (often publicly funded) research to expensive scientific journals. Wright (2004) cited Eisen, Varmus and Brown as examples of scientists who are making some areas of the Deep Web more accessible to the public.

2001 Raghavan and Garcia-Molina (2001) “presented an architectural model for a hidden-Web crawler that used key terms provided by users or collected from the query interfaces to query a Web form and crawl the deep Web resources (Choi 2008-01-07).”

2002-02 StumbleUpon began to use human crawlers, or human-based computation techniques, to uncover data on the Deep Web. Human crawlers can find relevant links that algorithmic crawlers miss (Choi 2008-01-07).

2002-12 There were c. 130,000 Deep Web sites (He, Patel, Mitesh, Zhang and Chang 2007, Shestakov 2008).

2003-06-01 Dorner and Curtis (2003-06-01) conducted a survey (data collected from 2002-12 through 2003-04) of librarians in New Zealand to compare common user interface software products supplied by the vendors Endeavour, ExLibris, Follett, Fretwell-Downing, Innovative Interfaces, MuseGlobal, OCLC, SIRSI, WebFeat and VTLS. MuseSearch, ENCompass, MetaLib, SingleSearch and WebFeat received the highest scores in 2003 (Dorner and Curtis 2003-06-01:2). SingleSearch was noted as having an added cost advantage for libraries since it was open-access, open-source software (Dorner and Curtis 2003-06-01:2). In 2002-2003, successful common user interface software was expected to support formats and protocols beyond Z39.50, such as OpenURL, HTTP, SQL, XML, MARC, CrossRef, DOI, EAD, Dublin Core and Telnet (Dorner and Curtis 2003-06-01:8).

2004-04 There were c. 310,000 Deep Web sites (He, Patel, Mitesh, Zhang and Chang 2007, Shestakov 2008).

2004 Between 2000 and 2004 the Deep Web increased in size by 3-7 times (He, Patel, Mitesh, Zhang and Chang 2007, Shestakov 2008).

2004-03-02 Yahoo announced its Content Acquisition Program, through which participants paid for enhanced search coverage by “unlocking” the deep Web (Wright 2004).

2005 Yahoo released Yahoo! Subscriptions, which searched a few of the Deep Web’s subscription-only web sites.

2005 Ntoulas et al. (2005) “created a hidden-Web crawler that automatically generated meaningful queries to issue against search forms. Their crawler generated promising results, but the problem is far from being solved. Since a large amount of useful data and information resides in the deep Web, search engines have begun exploring alternative methods to crawl the deep Web (Choi 2008-01-07).”
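The query-generation idea can be sketched in miniature. The following Python toy is only an illustration in the spirit of Ntoulas et al. (2005), not their system: the four-document “database” and all names are invented. It issues a seed keyword against a simulated search form, then greedily picks the most frequent not-yet-issued term from the documents retrieved so far as the next query:

```python
from collections import Counter

# Toy stand-in for a hidden-Web site's database: each "document" is
# retrievable only by submitting a keyword to the site's search form.
CORPUS = {
    1: "deep web content hidden behind search forms",
    2: "search engines index the surface web",
    3: "hidden databases hold deep content",
    4: "crawlers follow links on the surface web",
}

def search_form(keyword):
    """Simulate the site's search form: return ids of matching documents."""
    return {doc_id for doc_id, text in CORPUS.items() if keyword in text.split()}

def crawl(seed_keyword, max_queries=10):
    """Greedy keyword-driven crawl: query, then reuse the most frequent
    unseen term from the retrieved documents as the next query."""
    issued, retrieved = set(), set()
    keyword = seed_keyword
    for _ in range(max_queries):
        issued.add(keyword)
        retrieved |= search_form(keyword)
        counts = Counter(w for d in retrieved for w in CORPUS[d].split()
                         if w not in issued)
        if not counts:
            break
        keyword = counts.most_common(1)[0][0]
    return retrieved

print(sorted(crawl("deep")))  # the toy crawl eventually reaches all four documents
```

The toy converges quickly because later queries reuse vocabulary harvested from earlier results, which is the core intuition behind meaningful automatic query generation.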

The crawlers of the search engine Pipl can identify, interact with, and retrieve some information from the deep Web.

Deep Web “search engines like CloserLookSearch and Northern Light create specialty engines by topic to search the deep Web. Because these engines are narrow in their data focus, they are built to access specified deep Web content by topic. These engines can search dynamic or password protected databases that are otherwise closed to search engines (Choi 2008-01-07).”

Google’s “Sitemap and mod_oai are mechanisms that allow search engines and other interested parties to discover deep Web resources on particular Web servers. Both mechanisms allow Web servers to advertise the URLs that are accessible on them, thereby allowing automatic discovery of resources that are not directly linked to the surface Web (Choi 2008-01-07).”
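As an illustration (the host and URLs below are invented), a minimal Sitemap file advertises a query-generated page that has no inbound links, so a crawler can fetch it without ever filling in the site’s search form:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- A query-generated result page with no inbound links anywhere on
       the surface Web; without the Sitemap a crawler could not find it. -->
  <url>
    <loc>http://catalog.example.org/search?title=deep+web</loc>
    <lastmod>2008-01-07</lastmod>
  </url>
</urlset>
```

A site typically points crawlers at this file with a `Sitemap:` line in its robots.txt.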

2007-06 WorldWideScience was created to provide access to the Deep Web. When it began, it linked to 12 databases from 10 countries. It is a “science portal developed and maintained by the Office of Scientific and Technical Information (OSTI), an element of the Office of Science within the U.S. Department of Energy. The WorldWideScience Alliance, a partnership consisting of participating member countries, provides the governance structure for the WorldWideScience.org portal (RWW).”

2007-07-27 “Indiana University faculty member Javed Mostafa appeared on National Public Radio’s Science Friday, and drawing on information in a published study from the University of California, Berkeley entitled ‘How much information is there?’, estimated that the deep web consists of about 91,000 terabytes. By contrast, the surface web, which is easily reached by search engines, is only about 167 terabytes. The Library of Congress contains about 11 terabytes, for comparison. Mostafa noted that these numbers were a bit dated and were just rough estimates (Choi 2008-01-07).”
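As a quick cross-check, the ratio of the two size estimates quoted above is consistent with BrightPlanet’s earlier claim that the deep Web was roughly 400 to 550 times larger than the surface Web:

```python
# Size estimates quoted above, in terabytes.
deep_web_tb = 91_000
surface_web_tb = 167

ratio = deep_web_tb / surface_web_tb
print(round(ratio))  # → 545, at the top of Bergman's 400-550x range
```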

2008-05-14 ReadWriteWeb contributor Sarah Perez listed a number of “Digital Image Resources on the Deep Web.”

2008-06 The WorldWideScience portal to the Deep Web linked to 32 national scientific databases and portals from 44 different countries (RWW).

2008 Several “Deep Web directories are under development such as OAIster by the University of Michigan, INFOMINE at the University of California at Riverside and DirectSearch by Gary Price to name a few (Choi 2008-01-07).”

2008-09-22 Infovell launched its research engine for the Deep Web. “Available initially on a subscription basis, Infovell gives users access to hard to find, in-depth, expert information spanning Life Sciences, Medicines, Patents, and other reference categories with more to be added over time.” “Infovell’s research engine will be available beginning September 22 as a premium service for individual researchers and corporations who are seeking more affordable access to expert information. The Company is offering a risk-free trial through its website http://www.infovell.com. Later this year, Infovell will be beta-releasing a free version of its research engine on a limited basis for those individuals who want to search the Deep Web but don’t have the need for some of the advanced features available in the premium version.”

2009 United States “Congressional Representative John Conyers (D-MI) re-introduced a bill (HR801) that essentially would negate the National Institutes of Health (NIH) policy concerning depositing research in Open Access (OA) repositories. The bill goes further than prohibiting open access requirements, however, as the bill also prohibits government agencies from obtaining a license to publicly distribute, perform, or display such work by, for example, placing it on the Internet, and would repeal the longstanding ‘federal purpose’ doctrine, under which all federal agencies that fund the creation of a copyrighted work reserve the ‘royalty-free, nonexclusive right to reproduce, publish, or otherwise use the work’ for any federal purpose. The National Institutes of Health require NIH-funded research to be published in open-access repositories (Doctorow 2009).” HR801 would benefit for-profit science publishers and increase the challenges of making the Deep Web more accessible. See Doctorow, Cory. 2009-02-16. “Scientific publishers get a law introduced to end free publication of govt-funded research.” Boing Boing.

Notes

“Metasearch technology, also known as federated search or broadcast search, creates a portal that could allow the library to become the one-stop shop their users and potential users find so attractive (Luther 2003-10-01).”

Joo-Won Choi’s (2008-01) useful categories of Deep Web resources include:

Dynamic content: “dynamic Web pages, which are returned in response to a submitted query or accessed only through a form (especially if open-domain input elements, e.g. text fields, are used; such fields are hard to navigate without domain knowledge).”

Unlinked content: “pages which are not linked to by other pages, which may prevent Web crawling programs from accessing the content. This content is referred to as pages without backlinks or inlinks.”

Private Web: “sites that require registration and login (password-protected resources).”

Contextual Web: “pages with content varying for different access contexts (e.g. ranges of client IP addresses or previous navigation sequence).”

Limited access content: “sites that limit access to their pages in a technical way (e.g., using the Robots Exclusion Standard, CAPTCHAs or HTTP headers), prohibiting search engines from browsing them and creating cached copies.”

Scripted content: “pages that are only accessible through links produced by JavaScript as well as content dynamically downloaded from Web servers via Macromedia Flash or AJAX solutions.”

Non-HTML/text content: “textual content encoded in multimedia (image or video) files or specific file formats not handled by search engines.” For more see Choi (2008-01).
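The “dynamic content” category above is the crux of the deep Web: a result page only exists after a query is submitted, so a traditional link-following crawler never sees its URL. A minimal Python sketch (the host, path and form field names are invented) of the URL a browser constructs when such a form is submitted:

```python
from urllib.parse import urlencode, urlunsplit

def build_query_url(host, path, params):
    """Return the GET URL a browser would request on form submission."""
    return urlunsplit(("http", host, path, urlencode(params), ""))

# A crawler that cannot guess meaningful values for the 'title' field
# will never generate this URL, so the result page stays unindexed.
url = build_query_url("catalog.example.org", "/search",
                      {"title": "deep web", "year": "2001"})
print(url)  # → http://catalog.example.org/search?title=deep+web&year=2001
```

Every distinct combination of field values yields a distinct URL, which is why the space of query-generated pages dwarfs the set of statically linked pages.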

Webliography and Bibliography

Bergman, Michael K. 2001-09-24. “The Deep Web: Surfacing Hidden Value.” White Paper.

Bergman, Michael. 2001. “The Deep Web: Surfacing Hidden Value.” Journal of Electronic Publishing. 7:1.

Choi, Joo-Won. 2008-01-07. “Deep Web.” KAIST.

Dorner, Daniel G.; Curtis, Anne Marie. 2003-06-01. “A comparative review of common user interface software products for libraries.” National Library of New Zealand. 67 pp.

Ellsworth, Jill H.; Ellsworth, Matthew V. 1994. The Internet Business Book. John Wiley & Sons, Inc.

Ellsworth, Jill H.; Ellsworth, Matthew V. 1997. The Internet Business Book. John Wiley & Sons, Inc.

Ellsworth, Jill H.; Ellsworth, Matthew V. 1995. Marketing on the Internet: Multimedia Strategies for the World Wide Web. John Wiley & Sons, Inc.

Ellsworth, Jill H.; Ellsworth, Matthew V. 1996. Marketing on the Internet: Multimedia Strategies for the World Wide Web. 2nd Edition. John Wiley & Sons, Inc.

Ellsworth, Jill H.; Ellsworth, Matthew V. Using CompuServe. John Wiley & Sons, Inc.

Ellsworth, Jill H. Chapters? The Internet Unleashed.

Garcia, Frank. 1996. “Business and Marketing on the Internet.” Masthead. 9:1. January. Alternate url @ web.archive.org

Guernsey, Lisa. 2001-01-25. “Mining the deep web with sharper shovels.” New York Times, p. G1.

Lawrence, Steve; Giles, C. Lee. 1999-07-08. “Accessibility of Information on the Web.” Nature. 400:6740:107-109. See http://www.wwwmetrics.com.

Luther, Judy. 2003-10-01. “Trumping Google? Metasearching’s Promise.” Library Journal.

PLS. 1996-12-12. “America Online to Place AT1 from PLS in Internet Search Area: New AT1 Service Allows AOL Members to Search ‘The Invisible Web’.” Press Release.

Shestakov, Dennis. 2008-05. “Search Interfaces on the Web: Querying and Characterizing.” PhD dissertation. Turku Centre for Computer Science, Finland.

Smith, Richard. 2008-10-07. “More evidence on why we need radical reform of science publishing.” PLoS.

Wright, Alex. 2004-03-09. “In search of the deep Web: The next generation of Web search engines will do more than give you a longer list of search results. They will disrupt the information economy.” Salon.

He, Bin; Patel, Mitesh; Zhang, Zhen; Chang, Kevin Chen-Chuan. 2007. “Accessing the deep Web.” Communications of the ACM. 50:5:94-101.

See also http://papergirls.wordpress.com/the-ultimate-guide-to-the-invisible-web

Joo-Won Choi’s Bibliography:

Panagiotis Ipeirotis, Luis Gravano, and Mehran Sahami. 2001. “Probe, Count, and Classify: Categorizing Hidden-Web Databases.” Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data. pp. 67-78.

Gary Price & Chris Sherman. July 2001. The Invisible Web: Uncovering Information Sources Search Engines Can’t See. CyberAge Books. ISBN 0-910965-51-X.

Michael K. Bergman. 2001-08. “The Deep Web: Surfacing Hidden Value.” The Journal of Electronic Publishing. 7:1.

Sriram Raghavan and Hector Garcia-Molina. 2001. “Crawling the Hidden Web.” In Proceedings of the 27th International Conference on Very Large Data Bases (VLDB). pp. 129-138

Nigel Hamilton (2003). “The Mechanics of a Deep Net Metasearch Engine.” 12th World Wide Web Conference poster.

Bin He and Kevin Chen-Chuan Chang. 2003. “Statistical Schema Matching across Web Query Interfaces.” In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data.

Joe Barker (Jan 2004). “Invisible Web: What it is, Why it exists, How to find it, and Its inherent ambiguity.” UC Berkeley - Teaching Library Internet Workshops.

Alex Wright (Mar 2004). “In Search of the Deep Web.” Salon.com.

Alexandros Ntoulas, Petros Zerfos, and Junghoo Cho. 2005. "Downloading Textual Hidden Web Content Through Keyword Queries." In Proceedings of the Joint Conference on Digital Libraries (JCDL). pp 100-109.

An extended version is also available.

Frank McCown, Xiaoming Liu, Michael L. Nelson, and Mohammad Zubair. 2006-03/04. “Search Engine Coverage of the OAI-PMH Corpus.” IEEE Internet Computing. 10:2:66-73.

Lesk’s bibliography:

[Bell 1994]. Alan Bell; IBM Academy Digital Library Workshop (Sept 12-13, 1994).

[Census 1995]. United States Census Bureau Statistical Abstract of the United States Government Printing Office (1995).

[Fargion 1996]. G. S. Fargion, R. Harberts, and J. G. Masek; An Emerging Technology Becomes an Opportunity for EOS. From the online file.

[Landauer 1986]. T. K. Landauer; “How much do people remember? Some estimates of the quantity of learned information in long-term memory,” Cognitive Science, 10 (4) pp. 477-493 (Oct-Dec 1986).

[Louis 1996]. Steve Louis; Cooperative High-Performance Storage in the Accelerated Strategic Computing Initiative. 5th NASA Goddard Conference on Mass Storage Systems and Technologies (Sept. 17-19, 1996). As reported by Ron Van Meter.

[Markoff 1997]. John Markoff; “When Big Brother is a Librarian,” The New York Times pp. 3, sec. 4 (March 9, 1997).

[Mauldin 1995]. Matt Mauldin, “Measuring the Web with Lycos,” Third International World-Wide Web Conference, April 1995.

[Mills 1996]. Mike Mills; “Photo Opportunity,” Washington Post pp. H01 (January 28, 1996).

[Optitek]. The Need for Holographic Storage http://www.optitek.com/hdss_competition.htm.

[Radding 1990]. Alan Radding; “Putting data in its proper place,” Computerworld pp. 61 (August 13, 1990).

[Tenopir 1997]. Carol Tenopir, and Jeff Barry; “The Data Dealers,” Library Journal pp. 28-36 (May 15, 1997).

[UNESCO 1995]. UNESCO Statistical Yearbook Bernan Press (1995).

[Wells 1938]. H. G. Wells World Brain Methuen (1938).

There’s information out there that is actually not (yet) indexed in the big search engines such as Google. The non-indexable part of the Web is called the Dark, Deep, Hidden or Invisible Web. Fortunately, the Invisible Web is getting easier to search, with tools beyond the standard “big three” search engines. According to a recently published PhD dissertation (Shestakov 2008:5), the query-based dynamic portion of the Web, known as the deep Web, remained poorly indexed by search engines even in 2008.

Shestakov refined the distinction among the terms Deep, Hidden and Invisible Web:

“There is a slight uncertainty in the terms defining the part of the Web that is accessible via web search interfaces to databases. In literature, one can observe the following three terms: invisible Web [97], hidden Web [46 hidden behind web search interfaces], and deep Web [25]. The first term, invisible Web, is superior to the latter two terms as it refers to all kinds of web pages which are non-indexed or badly indexed by search engines (i.e., non-indexable Web). The terms hidden Web and deep Web are generally interchangeable, and it is only a matter of preference which to choose. In this thesis we use the term deep Web and define it as web pages generated as results of queries issued via search interfaces to databases available online. In this way, the deep Web is a large part but still part of the invisible Web (Shestakov 2008:5).”



I distinguish between the Invisible Web and the Deep Internet. Much of the research promoted by social media continues to focus primarily on business models of marketability, not just findability.

The Deep Internet in 2008 continues to be at cross purposes with the motivations of socially minded authors. Too many foundational texts and articles that could be so useful to robust conversations in civil society are restricted to those with access codes to the deep internet, the dark place of open source and Web 2.0+. One would hope that writings about key individuals concerned with ethics, economics, psychoanalysis, sociology, cultural studies . . . would be made available under a Creative Commons license, preferred by many engaged thinkers, including many academics, in 2008. Many of the services of the Deep Internet operate within the private-sector, user-pay model. Others are restricted to members of exclusive academic associations, the insular knowledge elite, who also operate with obligatory membership fees. JSTOR, for example, keeps its references behind a paywall, providing only summaries and a small section of text for free.

In a recent on-line search for biographical information on Zygmunt Bauman, for example, a number of sites referred to Deep Internet sites such as http://sociologyonline.net. One of the first sources available is http://www.megaessays.com.

Bauman has written that “sociologizing makes sense only in as far as it helps humanity” and that “sociology is first and foremost a moral enterprise.”

“To think sociologically can render us more sensitive and tolerant of diversity. Thus to think sociologically means to understand a little more fully the people around us in terms of their hopes and desires and their worries and concerns (Bauman & May, 2001).”

 

A pioneer in knowledge management, Professor Kim Veltman of SUMS, traced a history of major projects and collections of recorded knowledge that changed the world, sometimes taking centuries to construct. He argued that commercial offerings, albeit with useful and profitable short-term solutions, lack the essential long-term vision. Digital media, full digital scanning and preservation, and electronic networks could enable future generations in every corner of the world to access, study and appreciate all the significant literary, artistic, and scientific works of mankind. He is concerned that privatization of this communal memory is already underway and, without intervention, will only increase, effectively limiting access to those who have means. We have the means to shed light on the deep Internet. Is there the will?

 
“In a world where we make tens and even hundreds of millions of titles available online, readers need digital reference rooms. [T] he good news is that publishers have made many dictionaries, encyclopaedias and other standard reference works available in electronic form. Modern libraries now typically have an online section on Electronic Reference Sources.118 Special licences with publishers mean that some of these works are available free of charge at libraries and universities. Companies such as XReferplus now offer access to 100 or 150 standard reference works.119 The less good news is that the electronic versions of these reference works are frequently so expensive that they are beyond the reach of individual scholars. Meanwhile, there has been a trend for such reference works to be owned by a few key companies. In Germany, the pioneer in this field was K. G. Saur, which publishes “nearly 2000 print, microfilm, and electronic formats.” In 1987, Saur was acquired by Reed International. In 2000, it became part of the Gale Group owned by Thomson.120 In the United States, Dialog,121 which was founded in 1967, and “provides access to over 9 terabytes or more than 6 million pages of information“, was acquired by the same Thomson Company in 2000.122 Meanwhile, Bowker123 founded in 1872, which publishes Ulrich’s International Periodicals Directory (1932); and Books In Print 124 (1948-) was acquired by Xerox (1967) then Reed International (1981), then by Cambridge Information Group (2001), which has recently also acquired ProQuest Information and Learning (2006).125 Today, works such as Books in Print, are available only to institutions and are no longer available to individual subscribers. Fifty years ago, only the richest libraries could hope to achieve near comprehensive coverage of secondary literature. Today, practically no library can hope to be comprehensive and most collections are retreating. 
For instance, Göttingen, which had over 70,000 serials in the 1970s, now covers 30,000 serials. The California Digital Library has 21,000 electronic journals, which is impressive until we recall that Ulrich’s Periodicals Index lists 250,000 journals and serials. Meanwhile, at the University of California San Francisco, we find another modern catalogue that looks objective until we look closely and discover that of the 20 headings nine are traditional subjects and the remainder are branches of medicine (Appendix 3) … Ever since Gutenberg went bankrupt from the first printing, it has been obvious that publishers need to be attentive to survival. For a very few companies this is not a problem. For instance, in 2004, Reed Elsevier126 listed an operating profit of £1126 million and profit attributable of £675 million.127 Somewhat disturbing is a trend whereby the world of longterm recorded knowledge is increasingly being framed in the terms of short-term business propositions, as if the whole of the public sphere was open to business exploitation (Veltman 2007:12).”

Webliography and Bibliography on the Deep Internet

Bergman, Michael K. 2001. “The Deep Web: Surfacing Hidden Value.” Taking License: Recognizing a Need to Change. Journal of Electronic Publishing. 7:1. Ann Arbor, Michigan: Scholarly Publishing Office, University of Michigan University Library. August.


Shestakov, Dennis. 2008-05. “Search Interfaces on the Web: Querying and Characterizing”. PhD. Dissertation. Turku Centre for Computer Science. Finland.

Veltman, Kim H. 2007. “Framework for Long-term Digital Preservation from Political and Scientific Viewpoints.” Digitale Langzeitarchivierung. Strategien und Praxis europäischer Kooperation, Deutschen Nationalbibliothek, anlässlich der EU-Ratspräsidentschaft Deutschlands, 20-21. April 2007. Frankfurt: National Bibliothek.

See also Timeline: Deep Web work in progress
