Federated search, US government style

March 19, 2012

Posted by Mia

I’ve discovered the OSTI blog technology feed. In case you thought federated search was a thing of the past, here’s OSTI’s current view of it:

Science.gov, WorldWideScience.org, and the E-print Network achieve search interoperability for the covered databases and websites. What this means is that each search query on these tools searches about a hundred repositories, whether the repositories use XML; PDF; LaTeX; PostScript; or HTML. If full text is searchable by the database, the federated search covers full text, too.

It continues (emphasis mine):

We now routinely integrate just about any kind of information resource. For example, we make e-print databases like arXiv, CERN Preprints, and many more databases like PubMed Central all full-text searchable and field searchable via a single federated search query; then we integrate about 35,000 other repositories hosted by universities, professional societies, and others using discovery service technology. We make all this as convenient to the user as searching a single integrated entity. We search by metadata field, like author or date. In addition, we had to re-invent and deploy relevance ranking so that it would work in the federated environment. We have expanded the number of databases that can be integrated by a single federation to the point that the number of databases is no longer a practical limitation.
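
To make the idea concrete, here is a minimal sketch of what a federated query with central re-ranking could look like. It is not OSTI's implementation: the in-memory `SOURCES`, the record fields, and the functions `search_source`, `score`, and `federated_search` are all hypothetical names I've made up for illustration, and the relevance measure is a toy term count rather than anything a production system would use.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "repositories"; real sources would be remote services.
SOURCES = {
    "eprints": [
        {"title": "Neutrino oscillation data", "author": "Lee", "year": 2011},
        {"title": "Lattice QCD methods", "author": "Kim", "year": 2009},
    ],
    "biomed": [
        {"title": "Neutrino detectors in medical imaging", "author": "Lee", "year": 2010},
        {"title": "Protein folding survey", "author": "Diaz", "year": 2012},
    ],
}


def search_source(name, query, author=None):
    """Return records from one source whose title shares a word with the query."""
    terms = query.lower().split()
    hits = []
    for record in SOURCES[name]:
        # Optional metadata-field filter (assumed exact match on author).
        if author is not None and record["author"] != author:
            continue
        words = record["title"].lower().split()
        if any(term in words for term in terms):
            hits.append({**record, "source": name})
    return hits


def score(record, query):
    """Toy relevance: how many times the query terms occur in the title."""
    words = record["title"].lower().split()
    return sum(words.count(term) for term in query.lower().split())


def federated_search(query, author=None):
    """Send one query to every source in parallel, then re-rank the merged hits."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(search_source, name, query, author) for name in SOURCES]
        merged = [hit for future in futures for hit in future.result()]
    # Score centrally so results from different sources are comparable;
    # ties are broken by title to keep the ordering deterministic.
    return sorted(merged, key=lambda r: (-score(r, query), r["title"]))


if __name__ == "__main__":
    for hit in federated_search("neutrino detectors"):
        print(hit["source"], "|", hit["title"])
```

The point of the central `score` step is the one the quote makes about relevance ranking: each source may rank its own hits differently (or not at all), so the merged list has to be scored again on a common scale before it can be shown as if it came from a single database.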