FACT: The Washington Post today has a story in the Business section (“Intelligence Agency Joins U-Md. Research Center”) about the relationship between IARPA and the University of Maryland, the location of the planned new IARPA headquarters.
ANALYSIS: UMd has a set of valuable relationships with the public- and private-sector national security community, and the IARPA startup is just the latest agency to benefit. Proximity is key, for research and bureaucracy. In Maryland’s case, IARPA Director Lisa Porter told an IEEE interviewer last month that “It’s nice not to be sitting right next to one particular agency. It’s also nice to be near a university because we’re sending a message that we want to bring in nontraditional partners: academia, industry. It sends a nice message that we’re embracing the broad community to help us solve these challenging problems.”
I lament sometimes that Charlottesville (home to my undergraduate alma mater) is a good two hours away from DC, as even that distance puts a frustrating limit on the amount of joint work that winds up being done with Virginia faculty and students.
If two hours’ driving time is a hurdle, imagine global barriers. Increasing amounts of research can be done virtually, of course. But “research about research” isn’t always easy. For example, you can search Microsoft Research publications at this page, and Microsoft’s Live Search includes some 80 million journal articles indexed from scholarly or academic sources, mixed in with the other web content. But these are usually elements of the “deep web,” content within dedicated databases not normally indexed by a Google, say. R&D databases are notoriously well protected for obvious reasons.
One excellent resource on R&D was the RAND Corporation’s “Research and Development in the United States” database, RaDiUS, a short-lived but fairly comprehensive accounting of federal R&D activities and spending in the early part of this decade. (See the fact sheet here.) RaDiUS was a valuable online asset, allowing the user to slice and dice views of the total R&D investment by all federal agencies, to compare investments in particular areas of science and technology across agencies, and to drill down within agencies. It enabled detailed analysis like this 2002 RAND review of R&D spending.
Bad News, Good News
Sadly, RaDiUS is no more. It was maintained by the Science & Technology Policy Institute, which was funded by the National Science Foundation (NSF) to provide analytical support to the White House Office of Science and Technology Policy (OSTP). The RaDiUS effort lost its funding in 2004 when the larger policy institute transferred from RAND to IDA, the Institute for Defense Analyses. Some (not all) of the functionality remains in the search capabilities on www.science.gov, maintained by the Department of Energy’s Office of Scientific and Technical Information (OSTI).
Now OSTI is taking a different approach. The office has launched a global science gateway called WorldWideScience, a federated-search portal that lets you query more than 200 million “deep-web” documents not commonly indexed by most search engines, including U.S. government databases. The federated-search technology comes from DeepWeb Technologies and Verity co-founder Abe Lederman.
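For readers who haven’t run into the term, here is a rough idea of what “federated search” means in practice: one query is sent to several independent databases at once, and the results are merged into a single ranked list. The Python sketch below is only an illustration under stated assumptions, not how WorldWideScience or DeepWeb Technologies actually implement it. The names `Hit`, `make_source`, and `federated_search` are hypothetical, the “databases” are toy in-memory dictionaries, and it assumes each source reports a relevance score that is comparable across sources (real systems have to work much harder at that).

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Hit:
    source: str   # which database returned the hit
    title: str
    score: float  # relevance reported by the source (assumed comparable)


# Hypothetical stand-in for a remote database: maps a query to a list of hits.
Source = Callable[[str], List[Hit]]


def make_source(name: str, docs: Dict[str, float]) -> Source:
    """Build a toy in-memory source; a real one would call a remote search API."""
    def search(query: str) -> List[Hit]:
        q = query.lower()
        return [Hit(name, title, score)
                for title, score in docs.items()
                if q in title.lower()]
    return search


def federated_search(query: str, sources: List[Source], limit: int = 10) -> List[Hit]:
    """Send the query to every source in parallel, then merge by relevance."""
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        per_source = list(pool.map(lambda s: s(query), sources))
    # Flatten the per-source lists and rank the combined results.
    merged = [hit for hits in per_source for hit in hits]
    merged.sort(key=lambda h: h.score, reverse=True)
    return merged[:limit]
```

Calling `federated_search("fusion", [source_a, source_b])` with two toy sources built by `make_source` returns the matching titles from both, highest score first. The point of the design is that the portal never has to crawl or copy the underlying databases; it only needs each one to answer a live query.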
Since its beta launch a year ago, WorldWideScience has grown from searching 12 databases in 10 countries to 32 scientific databases and portals in 44 countries. (See the coverage map above.) Many of these sources are subscription databases which certainly aren’t accessible through traditional search engines. You can see the specific checklist of data sources here.
There are huge gaping holes in coverage; no databases from the scientific communities in Russia, China, much of Eastern and Central Europe – yet. But note the growth.
If you’re interested, ReadWriteWeb today had a good tutorial on how to do advanced deep-web searches in WorldWideScience, including sophisticated clustering techniques.
And I also like what one sly commenter had to say in response to the ReadWriteWeb blog post: “Ah, how refreshing it is to know that there’s still a research suite out there that returns results strictly on basis of relevancy.”