Chapter 17. LinkOut: Linking to External Resources from Entrez Databases

Kwan K.

Summary

Linking is one of the most important capabilities that the World Wide Web offers to the scientific and research community. By providing a convenient and effective means for sharing ideas, linking helps scientists and scholars advance their research goals.

LinkOut is a powerful linking feature of the Entrez search and retrieval system (Chapter 15). It is designed to provide Entrez users with links from database records to a wide variety of relevant online resources, including full-text publications, biological databases, consumer health information, and research tools. (See Sample Links for examples of LinkOut resources.) The goal of LinkOut is to facilitate access to relevant online resources beyond the Entrez system to extend, clarify, or supplement information found in the Entrez databases. By branching out to relevant resources on the Web, LinkOut expands on the theme of Entrez as an information discovery system.

How Is LinkOut Represented in Entrez?

Any Entrez database record, e.g., a nucleotide sequence, a taxonomic record, a protein structure, or a PubMed abstract, can be linked to Web resources external to NCBI via LinkOut. The LinkOut homepage contains up-to-date documentation about LinkOut. The page that lists LinkOut resources associated with a given record can be accessed in a variety of ways (Figures 1–3). In the case of PubMed, the full-text article and other resources related to the abstract being viewed may be accessed directly through icon buttons above the abstract (Figure 2). The LinkOut display can be customized by using the LinkOut preferences in the Cubby.

Figure 1. The LinkOut display can be accessed by selecting LinkOut from a PubMed record (top panel), from other Entrez databases (middle panel), or from the Display list (lower panel).

Figure 3. Links to external resources are listed in the LinkOut Display of an Entrez record.

Figure 2. From PubMed, the links to the full text of research articles are also managed by LinkOut and can be accessed through an icon from PubMed Abstracts, highlighted here in purple, as well as from the associated list of LinkOut resources in Figure 3.

How Does LinkOut Work?

Design Overview

The URLs to LinkOut resources are all provided by the person or organization that owns or created the resource. Links can be provided in any URL syntax, and providers of links may choose to offer as much or as little access to their resource as they wish. Providers use one format to submit links to all Entrez databases.

LinkOut is itself an Entrez database that holds all the linking information to external resources. The separation of the data records (e.g., PubMed abstracts) from the external linking information (e.g., URLs to journal articles on a publisher's Web site) enables both the external link providers and NCBI to manage linking in a flexible manner. If links to external resources change, for example after a Web site redesign, the Entrez database records are not affected, and the linking information can be updated as frequently as necessary.

The LinkOut database contains information on the relationship between a link and all of the applicable unique Entrez ID numbers (UIDs). By taking advantage of the interconnectivity among Entrez nodes, the linking information is presented seamlessly and efficiently.

LinkOut DTD and XML Files

LinkOut information is submitted in XML, defined by the LinkOut Document Type Definition (DTD).

Linking information is supplied in two elements: the Provider element, which specifies information about a link provider; and the LinkSet element, which describes information about the link. Each element should be submitted to NCBI in a separate file. Identity files contain the Provider element, and Resource files contain the LinkSet element.

The Identity file is always called providerinfo.xml. It describes the identity of a provider, including an ID (ProviderId) and an abbreviated name (NameAbbr) assigned by NCBI, the provider's name, and other general information about the provider. There should be only one providerinfo.xml file for each provider (see Box 1 for an example of an Identity file).

Box 1. Example of an identity file.
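Because the content of Box 1 is not reproduced here, the following Python sketch shows roughly what an Identity file might contain. The Provider, ProviderId, and NameAbbr element names come from the text above; the Name and Url element names, the ProviderId value 7777, and the example.com address are assumptions made for illustration and should be checked against the LinkOut DTD.

```python
import xml.etree.ElementTree as ET


def build_identity_file(provider_id, name, name_abbr, url):
    """Return illustrative providerinfo.xml content as a string."""
    provider = ET.Element("Provider")
    # ProviderId and NameAbbr are assigned by NCBI, not chosen by the provider.
    ET.SubElement(provider, "ProviderId").text = str(provider_id)
    ET.SubElement(provider, "Name").text = name        # assumed element name
    ET.SubElement(provider, "NameAbbr").text = name_abbr
    ET.SubElement(provider, "Url").text = url          # assumed element name
    return ET.tostring(provider, encoding="unicode")


# Hypothetical values; 7777 and the URL are placeholders.
identity_xml = build_identity_file(
    7777, "WebDatabase Co", "WebDB", "http://www.example.com/"
)
print(identity_xml)
```

A real submission would also carry the XML declaration and the DOCTYPE line that points to the LinkOut DTD, which this sketch omits.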

The Resource file, which contains the LinkSet information, specifies a set of Entrez records with a valid Entrez query, a specific rule to build the link to an external resource, and a description of the resource using the SubjectType, Attribute, and UrlName fields. There is no standard for naming the LinkSet files, except that they must use the .xml extension. There may be any number of LinkSet files associated with a ProviderId. (See Box 2 for an example of a Resource file.)

Box 2. Example of a resource file.
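The content of Box 2 is likewise not reproduced here, so the sketch below assembles a Resource file from the three parts the text names: an Entrez query that selects records, a rule for building the link, and descriptive terms. LinkSet, ProviderId, SubjectType, Attribute, and UrlName are named in this chapter; the Link, ObjectSelector, Database, Query, ObjectUrl, Base, and Rule element names, their nesting, the journal title, the `{uid}` placeholder in the rule, and the SubjectType term are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def build_resource_file(provider_id, database, query, base_url, rule,
                        subject_type, attribute, url_name=None):
    """Return illustrative LinkSet XML as a string (assumed element layout)."""
    link_set = ET.Element("LinkSet")
    link = ET.SubElement(link_set, "Link")                # assumed wrapper
    ET.SubElement(link, "ProviderId").text = str(provider_id)

    # Which Entrez records receive the link: a database plus a valid query.
    selector = ET.SubElement(link, "ObjectSelector")      # assumed name
    ET.SubElement(selector, "Database").text = database   # assumed name
    ET.SubElement(selector, "Query").text = query         # assumed name

    # How the external URL is built, and how the resource is described.
    object_url = ET.SubElement(link, "ObjectUrl")         # assumed name
    ET.SubElement(object_url, "Base").text = base_url     # assumed name
    ET.SubElement(object_url, "Rule").text = rule         # assumed name
    if url_name:
        ET.SubElement(object_url, "UrlName").text = url_name
    ET.SubElement(object_url, "SubjectType").text = subject_type
    ET.SubElement(object_url, "Attribute").text = attribute
    return ET.tostring(link_set, encoding="unicode")


resource_xml = build_resource_file(
    provider_id=7777,                                  # hypothetical ID
    database="PubMed",
    query='"Journal of Example Studies"[ta]',          # hypothetical journal
    base_url="http://www.example.com/articles?",       # hypothetical URL
    rule="id={uid}",                                   # placeholder syntax is hypothetical
    subject_type="publishers/providers",               # illustrative term
    attribute="full-text online",
)
print(resource_xml)
```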

Terms used in SubjectType and Attribute elements are controlled so that LinkOut resources are described in a systematic manner. Resources are presented to users by SubjectType on the LinkOut display page (and within the Cubby system), which makes it easier to browse and access available resources. Attributes can be used to describe the nature of a LinkOut resource (e.g., whether the resource requires a subscription or registration to access the content). A short text string may be used in the UrlName element to describe a resource. UrlName is typically used when the allowed SubjectType and Attribute terms cannot describe the resource adequately or when multiple links are available from one provider for a single Entrez record.

XML File Processing and Indexing

All links from Entrez are generated on a daily basis so that new or modified Entrez records will have accurate LinkOut resources connected to them. Once a day, all LinkOut files are parsed according to the LinkOut DTD, and the LinkOut database is rebuilt, relating the Entrez UIDs with the link information specified in the LinkSet XML files.

A LinkOut record consists of a link and the associated information, including its URL and all descriptive terms (SubjectType, Attribute, and UrlName) pertaining to the link. The Entrez UIDs applicable to the link are indexed to associate this information to the corresponding Entrez databases. As explained in Chapter 15, LinkOut information is interconnected with all related Entrez records.
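The indexing step described above can be pictured as building a lookup from Entrez UIDs to link records. The following sketch is a simplified model, not NCBI's implementation: the `LinkRecord` class, the `rebuild_index` function, and the sample UIDs and URLs are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkRecord:
    """One link plus its descriptive terms (hypothetical model)."""
    url: str
    subject_type: str
    attribute: str
    url_name: str = ""


def rebuild_index(parsed_links):
    """Map (database, UID) to every LinkRecord that applies to that record.

    parsed_links is an iterable of (database, uids, LinkRecord) tuples, as
    might be produced after parsing the LinkSet files and resolving queries.
    """
    index = defaultdict(list)
    for database, uids, record in parsed_links:
        for uid in uids:
            index[(database, uid)].append(record)
    return dict(index)


# Hypothetical data: two providers linking to overlapping PubMed records.
full_text = LinkRecord("http://www.example.com/a", "publishers/providers",
                       "full-text online")
library = LinkRecord("http://library.example.org/b", "libraries",
                     "subscription/membership/fee required")
index = rebuild_index([
    ("PubMed", [101, 102], full_text),
    ("PubMed", [102], library),
])
print(len(index[("PubMed", 102)]))
```

Rebuilding the whole mapping from the submitted files, rather than patching it in place, mirrors the daily regeneration described above: whatever is in the files on a given day fully determines the links.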

LinkOut Filters

To facilitate search and retrieval of LinkOut resources, there are a number of filters in the LinkOut-enabled Entrez databases. These filters, although not part of the LinkOut database, use the result generated in the LinkOut indexing process.

The filters are all prefixed with lo. Filters are available for all allowable SubjectType and Attribute terms and the NameAbbr of a provider. Some examples include:

  • loprov: LinkOut Provider
  • loattr: LinkOut Attribute
  • losubj: LinkOut SubjectType
  • loall: all LinkOut resources in an Entrez database

To use these filters to retrieve a set of Entrez records with LinkOut resources, the filter term can be entered as a search. For example, in PubMed, searching

“loattrfull text online”[Filter]
will retrieve all records with LinkOut resources that have an attribute “full-text online”. The Preview/Index section in PubMed can also be used to select LinkOut filters by first selecting Filter and then typing in “lo” and selecting Index to browse through all of the filters related to LinkOut.
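As a small illustration of how such filter terms are composed, the sketch below builds search strings in the form shown above, with the prefix joined directly to the value. The helper names `linkout_filter` and `combine` are hypothetical; the code only assembles text and does not contact Entrez.

```python
from urllib.parse import quote


def linkout_filter(prefix, value=""):
    """Build a filter term such as "loattrfull text online"[Filter]."""
    return f'"{prefix}{value}"[Filter]'


def combine(*terms):
    """Join several search terms with the Entrez Boolean operator AND."""
    return " AND ".join(terms)


attr_term = linkout_filter("loattr", "full text online")
prov_term = linkout_filter("loprov", "WebDB")    # NameAbbr from the example provider
search = combine(attr_term, prov_term)

print(search)
print(quote(search))    # percent-encoded form, suitable for use in a URL
```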

Guides for LinkOut Providers

LinkOut resources should be directly relevant to specific subjects of the Entrez records to which they will be linked, thus providing further research resources for Entrez users. The information and its delivery system should be of high quality and must not, through typographic or factual errors, omissions, or other flaws or inconsistencies, mislead, hinder, or frustrate the research efforts of Entrez users. The resources should be easy to use and navigate. Resources from professional societies, government agencies, educational institutions, or individuals and organizations that have received grants from major funding organizations are preferred.

Participation in LinkOut is voluntary. Providers need to submit two types of files to describe the LinkOut resources, Identity files and Resource files (see Boxes 1 and 2). These files include the necessary information for the Entrez system to construct an appropriate URL to access specific resources.

A list of Frequently Asked Questions is available to address questions that potential LinkOut providers may have. Current lists of LinkOut providers can also be browsed.

Submission Procedures

Step 1. Initial Contact. A prospective provider can write to linkout@ncbi.nlm.nih.gov, indicating interest in creating links from Entrez records to the provider's Web-accessible online resources. Please include the name, email address, and phone number of an individual who will act as a designated contact. The email should also include a LinkOut Identity file (providerinfo.xml) based on the specifications described above.

Step 2. File Evaluation. NCBI staff will evaluate the resources before a ProviderId and NameAbbr are assigned. NCBI will also provide assistance with setting up an appropriate Resource file to describe the LinkOut resources.

Step 3. File Submission. An FTP account will be assigned to a provider for submission. Files must have been validated by the LinkOut Validation utility before uploading. Providers may transfer new versions of current files or add new Resource files at their own discretion. Providers are responsible for keeping their files current and valid. Links in Entrez databases are regenerated each day based on the files in each provider's directory; therefore, providers must delete obsolete files from their holdings directory.

Step 4. Representation in Entrez. Once a provider's LinkOut files are processed, the resources described in the files will be available in the LinkOut display of a relevant Entrez record, as described in the section How Is LinkOut Represented in Entrez? above. In PubMed, publishers of the abstract can choose to display a “button” on the Abstract and Citation displays of the PubMed record by adding the parameter “holding=NameAbbr” to the basic PubMed URL. For example, to activate the icon of WebDatabase Co, the URL would be provided as:

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?holding=WebDB
Furthermore, multiple NameAbbr parameters may be used in a URL to activate more than one icon. For example, to display icons for both WebDB and MyDB, the following URL should be provided:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?holding=WebDB,MyDB
A provider's icon can be activated if the provider is selected from the LinkOut Preferences in Cubby.
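The two URLs above follow a single pattern, which can be sketched as a small helper. The function name `holding_url` is hypothetical; the base URL and the comma-separated holding parameter are taken from the examples above.

```python
from urllib.parse import urlencode

PUBMED_BASE = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"


def holding_url(*name_abbrs):
    """Build a PubMed URL that activates the icons of the given providers."""
    if not name_abbrs:
        raise ValueError("at least one NameAbbr is required")
    # safe="," keeps the comma literal, matching the form shown in the text.
    query = urlencode({"holding": ",".join(name_abbrs)}, safe=",")
    return f"{PUBMED_BASE}?{query}"


print(holding_url("WebDB"))
print(holding_url("WebDB", "MyDB"))
```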

All access restrictions will still apply. For example, if access to a database is limited by the user IP address, access will only be allowed via computers within an approved IP range; if access is password protected, the password must still be entered.

Detailed Guides

Interested parties can consult the following guides for more details:

Auxiliary Tools

A number of tools are available to facilitate participation in LinkOut:

  • Library LinkOut Files Submission Utility: a utility developed for libraries to generate and manage their LinkOut files. Libraries simply check off their electronic journal collections from a list of journals that participate in LinkOut. With this utility, libraries can provide correct holdings information easily, and staff do not have to construct LinkOut files by hand.
  • LinkOut File Validation Utility: a utility used by providers of links to parse their LinkOut files, ensuring the accuracy of the files before submission. Besides validating the file syntax against the LinkOut DTD, this tool ensures that only allowable SubjectType and Attribute terms have been provided.
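To make the two kinds of checks concrete, here is a rough sketch of what a pre-submission check might look like. It is not the LinkOut File Validation utility: it only tests that the XML is well formed (Python's ElementTree does not validate against a DTD), and the two term sets are small illustrative stand-ins for the real controlled vocabularies.

```python
import xml.etree.ElementTree as ET

# Illustrative stand-ins; the real controlled lists are maintained by NCBI.
ALLOWED_SUBJECT_TYPES = {"publishers/providers"}
ALLOWED_ATTRIBUTES = {"full-text online"}


def check_resource_file(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    for tag, allowed in (("SubjectType", ALLOWED_SUBJECT_TYPES),
                         ("Attribute", ALLOWED_ATTRIBUTES)):
        for element in root.iter(tag):
            term = (element.text or "").strip()
            if term not in allowed:
                problems.append(f"{tag} term not allowed: {term!r}")
    return problems


good = ("<LinkSet><SubjectType>publishers/providers</SubjectType>"
        "<Attribute>full-text online</Attribute></LinkSet>")
bad = "<LinkSet><Attribute>free beer</Attribute></LinkSet>"

print(check_resource_file(good))
print(check_resource_file(bad))
```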

Additional tools are being developed to assist other groups of providers. Interested parties can subscribe to announcement lists described in Communicating with LinkOut Providers (next section) to be informed of new developments.

Communicating with LinkOut Providers

LinkOut resource providers can communicate with NCBI's LinkOut team in a number of ways. Users and providers can write to linkout@ncbi.nlm.nih.gov to ask questions about LinkOut. There are also three announcement lists where developments related to LinkOut are communicated to link providers:

  1. Linkout-news is for general announcements on LinkOut.
  2. Library-linkout is for announcements on development related to library LinkOut participants.
  3. Tax-linkout is for announcements relevant to linking to taxonomic resources on the Web.