
NIH Scientific Workforce Diversity Recruitment Search Protocol

This recruitment search protocol can be used as one tool to diversify faculty in biomedicine. At NIH, we have used it numerous times to help scientific leadership in the NIH intramural research program identify highly qualified scientists (both senior and early-career) from diverse backgrounds. (For additional tips on identifying early-career scientists from diverse backgrounds, click here.) Below, find step-by-step directions to conduct a systematic, unbiased talent search tailored to a particular discipline. Note: information retrieved online, such as Last Name, First Name, Degree, Race/Ethnicity, Focus/Interests, Email, and Phone Number, is personally identifiable information (PII). Be aware of the sensitive nature of PII when storing, sending, and uploading protocol-related information.

 

STEP 1: Generate dataset of top scientists in field of interest

STEP 2: Organize dataset by contact, professional, and demographic information

STEP 3: Vet candidates in the dataset using quantitative and qualitative measures

 

STEP 1: GENERATE DATASET OF TOP SCIENTISTS (PUBLICATION AUTHORS) IN FIELD OF INTEREST

Note: We have enlisted the help of an NIH Library informationist to accomplish STEP 1, and institutional library informationists at academic institutions have the ability to develop similar algorithms. If you are using an informationist to develop your initial list, proceed to STEP 2.

The first step of this search protocol requires identifying, in a systematic and unbiased manner, the top authors in a given field according to a range of bibliometric indicators. It is the most important step because it generates the list of approximately 100 names of scientists from diverse backgrounds that you will then narrow down to identify highly qualified candidates. To ensure broad diversity in the final set of names, and given the current lack of diversity in leadership positions, you will likely need to oversample so that the search yields sufficient representation of women and individuals from a range of racial/ethnic backgrounds. In our experience, we retrieve at least 100 names to begin.

While the general approach is the same, several online sources (all free but require registration and login) can be used to generate your initial dataset:

The list above is intended only to provide examples of available sources and is not intended as an endorsement or recommendation by NIH of a listed source.

For simplicity, we describe the search processes using two methods, WOS and iCite, that automatically calculate a range of bibliometric indicators for various publication sets or authors. If you have questions about using the other primary sources listed above, please proceed to DETAILED INSTRUCTIONS FOR NAVIGATING PRIMARY DATA SOURCES or contact SWDtoolkit@od.nih.gov. Note that iCite offers the opportunity to employ the field-normalized bibliometric tool Relative Citation Ratio or RCR.

Web of Science (WOS) Method

  1. Using Web of Science, start by entering your keywords into the search bar under Basic Search. For this example, our keywords are “computational genomics” and “cancer genomics,” and we will be searching for U.S.-funded scientists.
    Screenshot of Basic Search
  2. Use both keywords in combination to broaden the search (add another field).
    Screenshot of Basic Search with keywords
  3. Add "computational genomics" and select "Topic" in the top box. Change "AND" to "OR" and type in "cancer genomics."  Click on Search.
    Screenshot of Basic Search with options
  4. Next, click “Analyze Results.”
    Analyze Results
  5. Now rank the records (this search yielded 10,978 papers) by "Countries/Regions" (to narrow the search for U.S.-funded scientists) and select top 100 Results. Click "Analyze."
    Screenshot
  6. Use the checkboxes below and mark USA. Click "View Records." (This step facilitates the next filter, by U.S. funding agency.)
    Screenshot of View Records

    At this point, or after applying additional filters to narrow down the number of publications (and then authors), you may choose to analyze your author data using InCites (same sign-in credentials as for WOS). This enables an in-depth analysis using data visualization tools that compare authors with each other and provide many additional characteristics about authors. If not, continue below.

  7. Once search results are posted, scroll down the page and look for Funding Agencies on the left panel. Click on “more options/values...”
    Screenshot of Funding Agencies
  8. You will see a list of the funding agencies represented in your search results.
    Screenshot of Funding Agencies list
  9. To narrow your search by funding agency, click anywhere on the page and press CTRL+F (CMD+F for Mac) to search within the list. Type in “NIH” (or the desired funding agency search terms). Click the checkbox for each funding agency to be included. Click “Refine” when complete. (Note: you will need to customize this filter for your specific needs.)
  10. Note that this has narrowed your search considerably (see top left for number of publication records). Now scroll down to the Authors section and click on “more options/values...” This enables you to maximize the number of authors in your search. This step retrieves the top 100 authors in your publication set.
    screenshot of author list
    screenshot of author list 2
  11. Click "Analyze Results."
  12. Select Authors in the "Rank the record by this field" and select either 100 or 250 results. Click "Analyze." (Note: this step retrieves all publication authors on every paper.)
    screenshot of search
  13. The results will provide a list of the top 100 (or 250) authors, as below.
    authors
  14. View records and further characterize authors using your desired search engine(s). Check the box by a name to look further.
    screenshot
  15. Click View Records. This generates a list of the author’s publications, which you can analyze as desired while further characterizing the authors on your list with other search methods. Proceed to STEP 3.

For each of the top 100 (or 250) authors (expand this number if the first pass does not yield sufficient diversity), perform another search using a WOS "Author" search from the drop-down menu. Alternatively, you can create a citation report for each author and compare productivity.

There are numerous ways to analyze these data, so you will likely want to experiment with the parameters you choose based upon the characteristics of the field in which you are interested. Ultimately, when you are satisfied that your list contains enough names, you will need to identify each author's probable gender and race/ethnicity by searching their online faculty profile, using any search engine and/or the other online resources described in STEP 3.
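As a rough illustration of what ranking records by author amounts to, the sketch below counts how many records each author appears on and keeps the most frequent names. It is not how WOS implements "Analyze Results"; the record structure, the `authors` key, and the sample names are assumptions made only for this example.

```python
from collections import Counter


def rank_authors(records, top_n=100):
    """Count the records each author appears on and return the top_n authors."""
    counts = Counter()
    for record in records:
        # set() so an author listed twice on one record is counted only once
        counts.update(set(record["authors"]))
    # Sort by descending count, then alphabetically so ties are deterministic
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical records, for illustration only
records = [
    {"title": "Paper A", "authors": ["Sullivan, A", "Gonzalez, P"]},
    {"title": "Paper B", "authors": ["Gonzalez, P"]},
    {"title": "Paper C", "authors": ["Gonzalez, P", "Rhodes, R"]},
]

print(rank_authors(records, top_n=2))
```

Raising `top_n` corresponds to expanding the list from 100 to 250 names when the first pass is not diverse enough.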

 

INCITES BIBLIOMETRIC DATA VISUALIZATION TOOL (Optional)

Choose "Save to InCites" in the middle dropdown menu. (Note: your WOS login and password are the same as for InCites; you may be asked to log in to InCites if you have not already done so.)

screenshot of InCites

Check your email inbox. This step generates an email (sent within a few minutes) to your WOS/InCites login email that contains detailed information and analytics about your dataset, including multiple data visualization tools showing many types of analyses. Tip: to customize your dataset, you may wish to load more names or remove duplicate entries for the same author using “exclude from results.”

Click the link in your email to retrieve the dataset, then choose the dataset link within InCites.

screenshot of InCites
screenshot of InCites

At this point, you can parse the data by a wide range of characteristics including collaboration network, geography, research output/type, productivity over time, funding source and so on. The “Dashboard” function enables within-dataset comparisons. Once you’re satisfied with the initial dataset, proceed to STEP 2.

iCite Method

Go to iCite. On the first page, “New Analysis,” enter your topic in the “Search PubMed” field and click “Process” at the bottom. This yields a detailed report of publication analytics that can be further mined for information about authors. Under “Customization,” click “RCR” on the top right to generate a descending list of the most-cited papers by this field-normalized bibliometric method. Tip: you may choose to emphasize original research papers over other publication types (e.g., reviews) by clicking “exclude non-articles.”

screenshot of iCite

To save the data, you can export it to an Excel sheet and analyze it in various ways according to your designated criteria and search priorities. You may choose to select first and last authors, and re-analyze in iCite using the “author” field search. You may also reimport the Excel sheet into iCite for numerous additional analyses. Proceed to STEP 2.
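Once the data are exported, sorting by RCR and pulling first and last authors can be sketched as below. This is a minimal illustration, not iCite's own logic: the column names (`pmid`, `authors`, `rcr`), the semicolon separator between author names, and the sample rows are all assumptions, so adjust them to match the headers in your actual export.

```python
def top_papers_by_rcr(rows, limit=10):
    """Return up to `limit` rows sorted by RCR, highest first."""
    def rcr_value(row):
        value = row.get("rcr")
        # Rows with no RCR yet sort to the end instead of raising an error
        return float(value) if value not in (None, "") else float("-inf")

    return sorted(rows, key=rcr_value, reverse=True)[:limit]


def first_and_last_authors(row, sep=";"):
    """Return the first and last author of a paper (one name if sole author)."""
    authors = [name.strip() for name in row["authors"].split(sep) if name.strip()]
    if len(authors) <= 1:
        return authors
    return [authors[0], authors[-1]]


# Hypothetical rows with placeholder identifiers, for illustration only
rows = [
    {"pmid": "1", "authors": "Sullivan A; Gonzalez P; Rhodes R", "rcr": "2.5"},
    {"pmid": "2", "authors": "Gonzalez P", "rcr": ""},
    {"pmid": "3", "authors": "Rhodes R; Sullivan A", "rcr": "4.1"},
]

for paper in top_papers_by_rcr(rows):
    print(paper["pmid"], first_and_last_authors(paper))
```

The names collected this way can then be fed back into the iCite “author” field search described above.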

Note: The automated nature of these approaches means that the bibliometric indicators provided in this report should be regarded as estimates. Because of inconsistencies in the ways author names appear in WOS, and because the names of different authors can appear to be the same in WOS (e.g., Richard Rhodes and Robert Rhodes can both appear as “Rhodes, R” in WOS), errors in paper attribution are common. Common names may also sometimes generate papers in more than one field, requiring manual correction. Other limitations include hyphenated names. WOS attempts to correct for these problems algorithmically, but some degree of error remains in the dataset and affects the indicators calculated for these authors. In summary, although these analyses provide quantitative data that can be used to evaluate authors, the limitations of bibliometric analysis mean that they should not be used as the sole criterion for any evaluative purpose.
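To make the attribution problem concrete, the sketch below collapses full names to a "Last, F" form and flags every abbreviated form shared by more than one person. The helper names are hypothetical, the code assumes simple "First Last" input, and it is only an aid for spotting names that need manual review, not a disambiguation method.

```python
from collections import defaultdict


def abbreviated_key(full_name):
    """Collapse 'Richard Rhodes' to 'Rhodes, R' (last name plus first initial)."""
    parts = full_name.split()
    return f"{parts[-1]}, {parts[0][0]}"


def find_ambiguous(names):
    """Map each abbreviated form to the distinct full names that share it."""
    groups = defaultdict(set)
    for name in names:
        groups[abbreviated_key(name)].add(name)
    # Keep only abbreviated forms that more than one person maps to
    return {key: sorted(people) for key, people in groups.items() if len(people) > 1}


names = ["Richard Rhodes", "Robert Rhodes", "Anna Sullivan"]
print(find_ambiguous(names))
# {'Rhodes, R': ['Richard Rhodes', 'Robert Rhodes']}
```

Any name flagged this way is a candidate for the manual checks on institution and email address described later in this protocol.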

STEP 2: ORGANIZE CONTACT, PROFESSIONAL, AND DEMOGRAPHIC INFORMATION FOR THE DATASET

Once you have generated your dataset of top published authors, next collect and organize the data according to your desired parameters. Suggested indicators are listed below. Note that all the primary data sources offer tools for generating metrics such as citations per publication or others (see DETAILED INSTRUCTIONS FOR NAVIGATING PRIMARY DATA SOURCES, below). As previously noted, be sure to appropriately secure all data that contains PII. *For race/ethnicity and gender, you will likely need to perform additional research – proceed to STEP 3.

  • Current Institution
  • Last Name
  • First Name
  • Contact Information (if publicly available)
  • Gender*
  • Degree
  • Race/Ethnicity*
  • Position
  • Focus/Interests
  • Phone Number
  • Faculty Page
  • Publications
  • Citations
  • Citing Articles
  • Citations per Publication
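One way to keep these indicators consistent across candidates is a fixed set of spreadsheet columns. The sketch below writes such a sheet with Python's `csv` module; the column order and the sample row are assumptions for illustration, and it writes to an in-memory buffer so that no PII is stored anywhere by the example itself.

```python
import csv
import io

# Column names follow the suggested indicators; the order is an assumption
FIELDS = [
    "Last Name", "First Name", "Degree", "Position", "Current Institution",
    "Focus/Interests", "Contact Information", "Phone Number", "Faculty Page",
    "Gender", "Race/Ethnicity", "Publications", "Citations",
    "Citing Articles", "Citations per Publication",
]


def write_dataset(rows, handle):
    """Write candidate rows as CSV, leaving any missing fields blank."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, restval="")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Hypothetical candidate, for illustration only
buffer = io.StringIO()
write_dataset([{"Last Name": "Sullivan", "First Name": "Anna", "Degree": "PhD"}], buffer)
print(buffer.getvalue())
```

Blank cells make it easy to see which fields, typically gender and race/ethnicity, still need the additional research described in STEP 3. If you write to a real file instead of a buffer, store it in a secured location.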

 

STEP 3: VET CANDIDATES IN THE DATASET USING QUANTITATIVE AND QUALITATIVE MEASURES

The next and final step in generating a list of diverse candidates is to use additional search strategies to add details and fill gaps in the information you gathered in Steps 1 and 2. You may need to consult additional search tools, such as LinkedIn, Faculty1000, and online faculty pages.

Such information includes:

  • Funding history (e.g., NIH RePORTER)
  • Leadership (faculty web pages or professional society web pages)
  • Other professional experience, connections, service, and non-scientific publications such as blogs and other professionally relevant social media posts (e.g., LinkedIn)
  • Google Scholar
  • PubMed

 

DETAILED INSTRUCTIONS FOR NAVIGATING PRIMARY DATA SOURCES

Publicly available tools used include search engines such as Web of Science, Google, Google Scholar, PubMed, Federal RePORTER, NIH RePORTER, and LinkedIn.

Web of Science (see screenshots below for more detail):

  1. Go to www.webofscience.com.  
  2. Begin the search by selecting the “Author” option on the drop-down menu.
  3. In the search box, type in author’s name and click search.
    • To identify every available publication in each track of an author's career progression, omit middle initials (e.g., Gonzalez, Peter) in the query.
  4. Refine search parameters on the left side of the page.
    For Document Types, select “Article” only
    For Authors, select all variations of the name in “more options/values...”
    • For example, select the following options if listed: Gonzalez, P; Gonzalez, Peter; Gonzalez PM; Gonzalez, Peter M; Gonzalez, Peter Mark, etc.
    • Although “Research Areas” may be relevant, it is better to review each article individually to gain insight into the specific research led by the author.

    Figure 1. Web of Science Marked List

    screenshot of WebScience
    screenshot of WebScience

  5. Once you have refined your search results, begin reviewing each article individually to assess quality and impact (See Figure 1).
    1. Review each article abstract for relevance
      1. Is the article relevant to the field that you are searching?
    2. Verify the author’s institution and/or email address (if applicable) on each article to confirm the author’s identity.
    3. Verify the Document Type under document information as “Article”
  6. Once you have reviewed each article for relevance and exact Author identification, click Add to Marked List.
  7. Click the above Citation Network.
  8. Repeat steps 1-7 until search results are complete.
  9. Once the review is complete, click Marked List.
  10. In Marked List, scroll to the bottom of the page and click on create citation report. (See Figure 2)
  11. Once you have your citation report, collect the following bibliometric values and enter them into a spreadsheet/collection tool (see Figure 3):
    1. Results found (# of publications)
    2. Sum of times cited (Total Citations)
    3. Citing articles
    4. Average citations per publication (perform this calculation manually)
      1. Average Citations per pub = (Total Citations)/Publications
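The manual calculation in the last step is a single division. A minimal sketch follows, with a guard for authors who have no publications on the marked list; the function name and the numbers are made up for illustration.

```python
def citations_per_publication(total_citations, publications):
    """Average citations per publication = total citations / publications."""
    if publications <= 0:
        raise ValueError("publications must be a positive number")
    return total_citations / publications


# Hypothetical values: 1,250 total citations across 50 publications
print(round(citations_per_publication(1250, 50), 2))  # 25.0
```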

Figure 2. Web of Science Citation Report

screenshot of Web Science report

Google or other search engine:

  • For CVs - "Anna Sullivan + CV"
  • For resumes - "Anna Sullivan + resume"
  • To narrow search results for a specific scientist, use the following queries:
    • "Anna Sullivan + PhD"
    • For current/former education - "Anna Sullivan + My University"
    • For current/former institution and workplace - “Anna Sullivan + Career Institute”
    • For LinkedIn - "Anna Sullivan +LinkedIn"
    • "Anna Sullivan + biologist"
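When working through a long candidate list, the query patterns above can be generated rather than typed by hand. The sketch below is only a convenience for producing the strings in the same "Name + term" form; the function name and default terms are assumptions, and the queries still need to be pasted into your search engine of choice.

```python
def build_queries(name, institution=None,
                  terms=("CV", "resume", "PhD", "LinkedIn")):
    """Build search strings in the 'Name + term' form used in this protocol."""
    queries = [f"{name} + {term}" for term in terms]
    if institution:
        # Current or former institution, workplace, or university
        queries.append(f"{name} + {institution}")
    return queries


for query in build_queries("Anna Sullivan", institution="Career Institute"):
    print(query)
```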

Google Scholar:

  • Although Google Scholar has a profile search feature, not all scientists are included. Thus, Google Scholar may provide good reference material when Google search does not help narrow results based on "author."
    • Use this link as a starting point as there is no direct method of arrival: https://scholar.google.com/citations?view_op=search_authors&hl=en&mauthors=
    • The "cited by" metrics can provide insight on how influential a scientist is within his or her field.
      • Label feature – very sensitive in searches
        • In the Google Scholar search box, type in label: topic (scientific discipline of interest)
          Label: autism
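The starting link above ends in an empty `mauthors=` parameter. The sketch below fills it in with either an author name or a label query, on the assumption that `mauthors` carries whatever you would otherwise type into the profile search box; it only builds the URL and does not contact Google Scholar.

```python
from urllib.parse import urlencode

BASE_URL = "https://scholar.google.com/citations"


def scholar_profile_search_url(query):
    """Build a Google Scholar profile-search URL for a name or label query."""
    params = {"view_op": "search_authors", "hl": "en", "mauthors": query}
    # urlencode escapes spaces and the colon in 'label:topic' queries
    return f"{BASE_URL}?{urlencode(params)}"


print(scholar_profile_search_url("label:autism"))
print(scholar_profile_search_url("Anna Sullivan"))
```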

Figure 4. Google Scholar Label-Based Search

screenshot with label chosen

PubMed:

  • Use the advanced builder tool to retrieve accurate, refined results on a specific author.
    • Using the advanced builder tool, perform searches using the “titles” and “text word” filter
      • For example, if you are searching for a mental health scientist who focuses on autism, try several query variations such as:
        • autism spectrum, autism, autism spectrum disorder, etc.
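The same kind of query can be written as a single search string using PubMed field tags. The sketch below combines an author with several topic variants searched in both the Title and Text Word fields; the function name and the example author format ("Gonzalez P") are assumptions, and the resulting string is meant to be pasted into the PubMed query box.

```python
def pubmed_query(author, terms):
    """Combine an author with topic terms searched in Title and Text Word."""
    clauses = []
    for term in terms:
        # Quote each term so multi-word phrases are searched as phrases
        clauses.append(f'"{term}"[Title]')
        clauses.append(f'"{term}"[Text Word]')
    return f"{author}[Author] AND ({' OR '.join(clauses)})"


print(pubmed_query("Gonzalez P", ["autism", "autism spectrum disorder"]))
```

Adding or removing variants in `terms` is the scripted equivalent of trying many different types of queries in the advanced builder.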

Figure 5. PubMed Advanced Builder Screenshot

screenshot of PubMed

Grant History

  • Click on NIH RePORTER (or Federal RePORTER)
  • Input Last Name and First Name of person of interest
  • Submit query
  • You will receive a notice for “Searching all fiscal years (FY)”
  • Filter by FY and identify grants by each year
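If you copy the search results into a spreadsheet, the final grouping step can be sketched as follows. The record keys (`fiscal_year`, `project_number`) and the placeholder project numbers are assumptions for illustration and do not reflect the actual RePORTER export format.

```python
from collections import defaultdict


def grants_by_fiscal_year(grants):
    """Group project numbers by fiscal year, earliest year first."""
    by_year = defaultdict(list)
    for grant in grants:
        by_year[grant["fiscal_year"]].append(grant["project_number"])
    return {year: sorted(numbers) for year, numbers in sorted(by_year.items())}


# Hypothetical records with placeholder project numbers
grants = [
    {"fiscal_year": 2019, "project_number": "EXAMPLE-0002"},
    {"fiscal_year": 2018, "project_number": "EXAMPLE-0001"},
    {"fiscal_year": 2019, "project_number": "EXAMPLE-0003"},
]

for year, numbers in grants_by_fiscal_year(grants).items():
    print(year, numbers)
```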

Leadership and other Qualitative Information

  • Refer to protocol above for Google or other search engine
  • Type in scientist’s name in the search box
    • “Peter Gonzalez + PhD”
    • For current/former education - “Peter Gonzalez + My University”
    • For current/former institution and workplace - “Peter Gonzalez + Career Institute”
      • Often, this type of information can be found on Faculty Page, LinkedIn, or websites that refer to biographical information.

TIPS for PACKAGING AND PRESENTING YOUR DATA TO A SEARCH COMMITTEE

Once we have compiled all the information for a given candidate, we create a solicitation package that includes the following materials:

  • Overview of candidates
  • Solicitation list – a list that includes contact information provided to hiring chair* and human resources contact
  • Candidate summary highlighting bibliometrics, grant history, and candidate’s value proposition that includes qualitative information
  • Candidate biosketch that limits potential for bias (does not include photos and unnecessary personal information) and highlights key accomplishments and research interests
  • Candidate CVs and resumes

We then deliver the solicitation package electronically to the Hiring Chair for review. After the solicitation package is submitted, we monitor the progress of each candidate during the search/hiring process cycle and record this information on a secured spreadsheet.

*or person responsible for leading hiring decisions

Questions, comments?

For more information, please contact us at SWDToolkit@od.nih.gov

 
