De-duplicating EndNote results against a previous search

Well hello my friends! It’s been a long time since I’ve posted. Forgive me, as much has happened in the last year, including: moving across an ocean, subsequently moving across a city, starting a busy freelance information specialist business, and many mundane crises, trips, and side-projects in between.

I was tasked recently with updating a systematic review search with a new and improved search strategy completely unlike the previous one. New search terms had been added, and several irrelevant search terms had been deleted. I’d done systematic review updates before, but I’d always simply used date filters in the databases to capture results from the date of the previous search onwards.

But this time, the researchers wanted any articles that would be captured by the new search, in any date period, and wanted to ensure they weren’t screening any citations that had already been previously screened with the old search.

I’d heard other information specialists talk about using EndNote to de-duplicate against a previous search, but never tried it myself. It just seemed unduly complicated. Date filters seemed to be working perfectly fine.

It was around this same time that I found out that date filters were not perfectly fine.

One day, I went to date limit an Embase search using the advice from a LibGuide at a high-ranking university... and was horrified to find that a not insignificant number of citations that ought to have been picked up had been missed.

Cue a minor panic as I tried to figure out whether I had royally screwed up any of my previous projects.

Friends, I have learned the error of my ways, and will henceforth de-duplicate my systematic review update searches in EndNote when possible. Cross my heart, etc, etc.

As usual when picking up a new skill, I went to Twitter to see what all the experts were doing.

Here follows the method that I ended up using. I’ve documented it for my own purposes and hope that it can come in handy for others as well.

1. De-duplicate your total search results in EndNote, as normal.

You can use whatever process works best for you. I tend to use a modified version of Bramer et al, 2016 in which I progressively choose different field combinations in EndNote to test for duplicates, and manually go through the results. The field combinations suggested in the article include (in this order):

  • Author | Year | Title | Secondary Title (Journal)
  • Author | Year | Title | Pages
  • Title | Volume | Pages
  • Author | Volume | Pages
  • Year | Volume | Issue | Pages
  • Title
  • Author | Year

But if you want to get fancy about it, the article supplies a more complicated process than this.
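If it helps to see the logic outside of EndNote, here is a minimal Python sketch of the same idea: try each field combination in turn and list the records that match on every field. This is not how EndNote itself finds duplicates; the record layout, the `normalise` helper, and the sample records are all invented for illustration, and the candidates would still need checking by hand.

```python
from collections import defaultdict

# Field combinations from the list above, tried in this order.
FIELD_COMBINATIONS = [
    ("author", "year", "title", "journal"),
    ("author", "year", "title", "pages"),
    ("title", "volume", "pages"),
    ("author", "volume", "pages"),
    ("year", "volume", "issue", "pages"),
    ("title",),
    ("author", "year"),
]


def normalise(value):
    """Lower-case a value and keep only letters and digits."""
    return "".join(ch for ch in str(value).lower() if ch.isalnum())


def candidate_duplicates(records, fields):
    """Return lists of record ids that match on every field in `fields`."""
    groups = defaultdict(list)
    for record in records:
        values = [normalise(record.get(field, "")) for field in fields]
        if all(values):  # skip records with a blank field, or blanks would match each other
            groups[tuple(values)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical records: 1 and 2 are the same article with different journal names.
records = [
    {"id": 1, "author": "Smith J", "year": 2015, "title": "Caring and attachment",
     "journal": "J Nurs", "volume": 12, "issue": 3, "pages": "10-20"},
    {"id": 2, "author": "Smith J", "year": 2015, "title": "Caring and Attachment.",
     "journal": "Journal of Nursing", "volume": 12, "issue": 3, "pages": "10-20"},
    {"id": 3, "author": "Lee K", "year": 2016, "title": "Something else",
     "journal": "BMJ", "volume": 1, "issue": 1, "pages": "1-2"},
]

for fields in FIELD_COMBINATIONS:
    print(" | ".join(fields), "->", candidate_duplicates(records, fields))
```

The looser combinations further down the list catch more true duplicates but also more false positives, which is why going through the results manually still matters.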

2. Label your citations by search date.

At this point, you’ll want to load up your citations from the previously conducted search into a separate EndNote Library. Then, use one of the custom fields to mark these citations.

  1. Select one of your citations from the “All Refs” group, then press cmd + A or ctrl + A to select all in the entire library.
  2. Then, go to Tools, then Change/Move/Copy Fields.
  3. Select “Custom 1” (or another field of your choosing), then “Replace whole field with”.
  4. Choose text that is meaningful for you for remembering which citations these are. Something as simple as “OLD” may suffice (this is what I did, based on a Twitter tip).
  5. Next, do the same for your “new” search results.

Note that this is an important step if your search strategy has changed such that some results that were previously returned will not be returned in the new search. Otherwise, you will end up re-screening those articles!

3. Combine your “old” and “new” EndNote Libraries together

To combine your two libraries together, navigate to your “old” library and select all the citations by pressing ctrl + A or cmd + A.

Then click “References”, then “Copy References To”, and choose your “new” library. Easy peasy!

4. Remove duplicates

Use your EndNote Library which contains both your “old” and “new” records (both of which have previously had duplicates removed!), and remove your duplicates as you normally do, or following the process in Step 1.

But this time, there’s one big difference – every time you find a duplicate, instead of removing the duplicate record, you’ll remove BOTH records, since they represent a previously screened record that you won’t need to screen again.

5. Remove any remaining previously screened citations

This step won’t be necessary if no search terms have been removed since the original search was conducted.

In my case, the search had changed drastically since its creation by someone else, and I needed to remove any citations that were picked up by the original search, but not by the new one. Removing these records is easy if you have followed Step 2, above!

Simply create a smart group by right-hand clicking over “my groups” in EndNote. Then, set the parameters to find your old citations (e.g. “custom 1”, “is”, “OLD”). Then, navigate to your smart group and delete all the citations in this group. These have already been previously screened and weren’t retrieved by the new search.
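For anyone who likes to see the logic written down, Steps 2 to 5 can be sketched in a few lines of Python. This is only an illustration of the reasoning, not something EndNote runs: the function name, the `custom1` key, and the sample records are made up, and matching on a normalised title alone is a simplifying assumption (in practice you would use the field combinations from Step 1).

```python
from collections import defaultdict


def normalise(title):
    """Lower-case a title and keep only letters and digits."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def records_to_screen(old_records, new_records):
    """Return the new records that were not retrieved by the old search."""
    # Step 2: label every record by the search it came from.
    # Step 3: combine both sets into one "library".
    combined = [dict(r, custom1="OLD") for r in old_records]
    combined += [dict(r, custom1="NEW") for r in new_records]

    # Step 4: when a record appears in both searches, remove BOTH copies.
    groups = defaultdict(list)
    for record in combined:
        groups[normalise(record["title"])].append(record)
    remaining = []
    for group in groups.values():
        labels = {record["custom1"] for record in group}
        if labels == {"OLD", "NEW"}:
            continue  # already screened last time
        remaining.extend(group)

    # Step 5: remove leftover OLD records (found by the old search only).
    return [record for record in remaining if record["custom1"] != "OLD"]


# Hypothetical example: "B" was found by both searches, "A" only by the old one.
old = [{"title": "Article A"}, {"title": "Article B"}]
new = [{"title": "article b."}, {"title": "Article C"}]
to_screen = records_to_screen(old, new)
print([record["title"] for record in to_screen])
```

Only “Article C” is left for screening: “B” was screened last time, and “A” is no longer picked up by the new search.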

And that’s basically it! I was able to tackle this new skill that originally seemed kind of hard, and you can too!

For more information, the following papers may also be useful:

Bramer WM, Bain P. Updating search strategies for systematic reviews using EndNote. Journal of the Medical Library Association: JMLA. 2017 Jul;105(3):285.

Bramer WM, Giustini D, de Jonge GB, Holland L, Bekhuis T. De-duplication of database search results for systematic reviews in EndNote. Journal of the Medical Library Association: JMLA. 2016 Jul;104(3):240.


Finding a random set of citations in EndNote

Have you ever been asked to find a random set of citations from EndNote? This happens most often to me when researchers are testing out screening procedures, and want to ensure they are all interpreting the screening guidelines the same way. The researchers will all screen the same random set of 10-20 articles and compare results before screening the entire set.

So: what’s the best way to go about this? Sorting from a-z on any given field and selecting the top 10-20 articles isn’t likely to be truly random. For example, sorting by date will retrieve only very new or old articles. Sorting by record number is one possible way to do it, but also isn’t truly random as it will retrieve articles added to the database most or least recently.

Here’s how I take a truly random sample of citations from EndNote.

First, create an output style in EndNote

The output style will include only the citation record numbers. Don’t worry, you only have to do this once, and in the future it will all be set up for you!

  1. In EndNote, go to Edit –> Output Styles –> New Style
  2. In the resulting screen, click “templates” under the heading “bibliography”
  3. Then, put your cursor in the box below “generic”. Then, click “insert field” –> “Record Number” –> then press enter so that your cursor goes to the next line in the text box.
  4. Go to “file” –> “save as” and save it to something descriptive like “record-number-only”.

Next, export your record numbers.

  1. Back in the main EndNote screen, click the dropdown box at the top of the screen, then “select another style”, and search for your previously created Output Style.
  2. Then click “choose”. Ensure that your output style name is displaying in the dropdown box!
  3. Select “all references” to make sure all your references (that you want to create a subset from) are displayed. Then click one of the references and press ctrl + a (or cmd + a on a mac) to select all references.
  4. Right-hand click and select “copy formatted”.

Create your random subset!

  1. Open excel, and press ctrl + v (or cmd + v on a mac) to paste all your record numbers.
  2. In the cell to the right of your first record number, insert the formula =rand(). This will create a random number between 0 and 1.
  3. Hover the cursor over the bottom-right corner of the cell until it makes a cross. Then click and drag all the way down to the last row that contains a record number.
  4. Insert a row at the top and click “sort & filter” –> “filter” on the menu bar.
  5. Then, sort the second column (with the random numbers) from smallest to largest (or largest to smallest).
  6. You now have a randomly sorted list! Select and copy the top x number of cells in the first column (however large you want your sample to be).
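If you are comfortable with a little scripting, the Excel step can also be done in Python. This is just a sketch of an alternative, not part of the EndNote workflow itself: the function name and the list of record numbers are hypothetical, and the optional `seed` is only there so that the same sample can be reproduced later.

```python
import random


def random_subset(record_numbers, sample_size, seed=None):
    """Pick `sample_size` record numbers at random, without repeats."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return rng.sample(list(record_numbers), sample_size)


# Hypothetical library of 500 records numbered 1 to 500.
record_numbers = list(range(1, 501))
subset = random_subset(record_numbers, 20, seed=42)
print(subset)
```

Like sorting on the =rand() column and taking the top rows, this gives every record the same chance of ending up in the sample.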

Format your record numbers to put back into EndNote.

  1. Paste your subset of record numbers into word (paste as text, not a table!)
  2. Click “replace” on the main toolbar to bring up the find and replace box.
  3. Beside the box “find what”, write ^p (the caret symbol followed by “p”).
  4. Beside the box “replace with”, insert a semi-colon followed by one space.
  5. Then click “replace all”.
  6. You should have a string of record numbers separated by semi-colons.
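The same find-and-replace can be sketched in Python, in case Word isn’t handy. The function name and the pasted numbers below are made up for illustration.

```python
def to_endnote_list(pasted_text):
    """Turn one-record-number-per-line text into a semi-colon separated string."""
    numbers = [line.strip() for line in pasted_text.splitlines() if line.strip()]
    return "; ".join(numbers)


# Hypothetical record numbers, as pasted from the spreadsheet.
pasted = "412\n87\n233\n"
print(to_endnote_list(pasted))  # 412; 87; 233
```

One small difference from the Word method: this version doesn’t leave a stray semi-colon at the end of the string.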

Put them back into EndNote!

  1. Go back to your EndNote Library.
  2. Right-hand click in the sidebar and select “create smart group”
  3. Give it a nice title, like “random set” 😃
  4. In the first dropdown box, select “record number”, then “word begins with”, then paste in your formatted record numbers separated by semi-colons.
  5. Click “create”.
  6. All done!

I hope you found this useful. It might sound complicated, but this process really only takes a few seconds once you have gone through it a few times.

Do you have a more efficient or a different way of doing it? What kinds of formatting and database problems do you come across in your position? Feel free to send me a message or tweet at me.

Til next time,
Amanda

— POSTSCRIPT —

I was asked recently on Twitter how to isolate the remainder of citations for screening after using this method. It’s very easy!

creating a combination group in EndNote

To isolate the rest of your citations, simply make a combination group by doing the following:

  1. First, create a new group in EndNote called “All Refs”, and drag ALL your citations from the library into it by going to “All References”, selecting all by pressing ctrl/cmd + A, and dragging them into your new group.
  2. Right-hand click on “My Groups” in EndNote, then click “Create From Groups”. Name this group “remainder to screen”, or something else that makes sense to you.
  3. In the first drop-down menu, select your “All Refs” group.
  4. Then select “NOT” from the boolean operators dropdown menu.
  5. Then select the group that holds your random subset in the second dropdown menu.

Using a “gold standard” set to test your search

When developing a systematic search, it’s important to use an iterative approach, constantly tweaking and reevaluating your strategy to ensure relevant articles are captured (and hopefully, non-relevant articles are minimised).

Today, I’d like to share a trick that I frequently use when building my searches. First, develop a set of articles which are relevant to your topic. These are articles which should definitely be picked up by your search. The articles might come from researchers or your patrons, other team members on the systematic review, background scoping searches, google scholar, or any other number of places. The more variety in the set of articles, the better. These articles will comprise your “gold standard set” by which you will test your search strategy.

PART 1: Formatting your PMIDs

First, put each of these articles into your citation management system (ideally EndNote). Next, ensure that each article contains a PMID (PubMed ID) in the accession number field (or whichever one you choose). In EndNote, this can often be easily done by clicking “references”, then “find reference updates”. However, do check through all the citations for any that are missed; it may be necessary to manually find the PMID in PubMed.

After you have your gold set all tidied up in EndNote, export the set of references using a custom output style containing only the accession number field. To set this up in EndNote v7 (only required the first time you do this!):

  1. Go to Edit -> Output Styles -> New Style.
  2. In the sidebar, find the “Bibliography” heading and click the “Templates” subheading.
  3. In the box that says “Generic”, click “Insert Field”, then “Accession Number”. Save and close your output style with a descriptive name such as “PMID”.

To export the references using your new style, first make sure that your newly created output style is selected (the name should appear in the dropdown box on the top header; if not, select the dropdown box, then “select another style”). Next, press ctrl + A to select all references, then right-hand click and select “copy formatted”.

Open a word document and press ctrl + v to paste your formatted references. Your document ought to contain a list of PMIDs – one per line. From here, I use the find and replace tool to automatically format the list of PMIDs for Ovid Medline:

  1. Click “find and replace”.
  2. In the “find what” box, enter ^p (this stands for the paragraph character)
  3. In the “replace with” box, enter “_OR_” (the underscores represent spaces)
  4. Press “Replace all”.


Okay! Still with me? Your word document should be formatted most of the way. Now, I finish by adding an open parenthesis at the beginning of the document and replacing the final ” OR sequence with ).ui. The .ui at the end refers to the Ovid Medline field code for accession number (where the PMID is stored). The text of your document should now look something like this:

(“19901971” OR “22214755” OR “22214756” OR “24169943” OR “24311990” OR “18794216” OR “25491195” OR “16931779” OR “9727760” OR “22529271” OR “18757621” OR “25536072” OR “24838102” OR “25025477” OR “23460252” OR “26888209” OR “24381228” OR “25154608” OR “21889426” OR “24165853” OR “25315132” OR “26819213” OR “26936902” OR “27492817” OR “27531721” OR “27522246” OR “27067893”).ui

This process might take a little while to set up the first time, but once everything is automated through your custom output style, it will only take a few seconds in the future. I’m a big fan of front-loading my work to make things easier down the line.
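If you would rather skip the Word step, the formatting can also be sketched in Python. This is a minimal illustration rather than a polished tool: the function name is my own invention, and it assumes plain straight quotation marks around each PMID and the same .ui ending as the example above.

```python
def ovid_pmid_query(pmids):
    """Build one Ovid Medline search line from a list of PMIDs."""
    cleaned = [pmid.strip() for pmid in pmids if pmid.strip()]
    if not cleaned:
        raise ValueError("no PMIDs supplied")
    quoted = [f'"{pmid}"' for pmid in cleaned]
    return "(" + " OR ".join(quoted) + ").ui"


# The first three PMIDs from the example above.
pmids = ["19901971", "22214755", "22214756"]
query = ovid_pmid_query(pmids)
print(query)  # ("19901971" OR "22214755" OR "22214756").ui
```

Blank lines in the pasted list are skipped, so a trailing empty line from the copy and paste won’t produce an empty search term.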

PART 2: Testing your gold standard set

Now, navigate to your draft search strategy in Ovid Medline and paste the full query from part 1 into a new line below the search.

Take the line of your final search results and the line containing your gold standard set and OR them together. If the combined line returns the same number of results as your final search line, you’re in luck! All the citations in your gold standard set are picked up by your draft search. If not, take the gold standard line and NOT out your final search line to see which citations have been missed; by looking at these citations, you can strategise ways to pick up articles with similar wording or indexing.

OR together your “gold standard” set with your final search results. If the number stays the same, all your gold standard articles are contained in the search strategy.
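The reasoning behind this check is plain set logic, which a short Python sketch may make clearer. Nothing here talks to Ovid: the function name is hypothetical, and the search results are invented PMIDs standing in for whatever your draft strategy retrieves.

```python
def check_gold_standard(search_pmids, gold_pmids):
    """Return (all_captured, missed) for a gold standard set against a search."""
    search, gold = set(search_pmids), set(gold_pmids)
    combined = search | gold  # search OR gold standard
    all_captured = len(combined) == len(search)  # the number stays the same
    missed = sorted(gold - search)  # gold standard NOT search
    return all_captured, missed


# Gold standard PMIDs from the example above; the search results are hypothetical.
gold = ["19901971", "22214755", "22214756"]
search = ["22214755", "19901971", "11111111", "22222222"]
all_captured, missed = check_gold_standard(search, gold)
print(all_captured, missed)
```

ORing the two lines can only add records that the search did not already contain, so an unchanged count means nothing in the gold standard set was missing.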

I sometimes find that researchers are concerned about whether the relevant articles they have found will be captured by my search strategies, so I sometimes include this “gold standard search” in draft strategies that I send. I also annotate my process to make it more clear.

The beauty of this method is that as new relevant papers are discovered from additional sources, you can add them to the gold standard set, and continually check your strategy throughout the drafting process.

How to make screening less painful

You know that feeling when you are running searches for a patron, and want to pick out some of the most relevant papers for them, but it’s a Friday afternoon and your eyes are tired and zomg screening is the worst?

I’ve got a tip to make this process marginally less awful. This life-saving tip comes from a wonderful colleague at the College of Physicians and Surgeons of British Columbia.

First, run your search(es) and download your citations into EndNote (or another citation management program)* for screening. Generally, I only use this tip for general scoping (not for systematic review screening), so I usually end up downloading fewer than 200 citations for this process, and sometimes as few as 30.

Next, export your citations into a .rtf format with an output style that includes both the citation information and the abstract. To do this in EndNote (v7), go to the dropdown menu at the top of the page and choose “select another style…”, then search for “annotated”. Click the one with the category “generic”, then click “choose”. You will notice that the preview pane for each citation now contains the citation’s information and its abstract.

EndNote dropdown menu and preview pane

Next, export your references. First, press ctrl + a to select all the references, then click the blue arrow on the top bar. Save the file type as .rtf and select “annotated” as your output style. Save the file wherever you like, then navigate to that folder and open it. It should automatically open in Microsoft Word (or the word processing program of your choice). The file should contain all of your references, with abstracts.

exporting your references from EndNote

Now comes the fun bit! Press ctrl + H (or click “replace” in the main top bar). Under “find what”, type one of the main terms for your first concept. Then click anywhere in the “replace with” box, but instead of typing anything, click “More >>” to expand the options, then click the “format” dropdown box, then “highlight”. The word “highlight” ought to appear below the “replace with” box.

Find and Replace in Microsoft Word

Still with me? Okay. Click “replace all”. Repeat this step with other terms that might be found in the titles and abstracts of the citations (but only for your first concept!). Once you have reached relative saturation, click the highlighter icon in the main top bar, and select a different highlighter colour. Next, repeat the same process as above with your second main concept, until you have reached relative saturation.

Ta da! At this point, you ought to have a pretty colour-coded document which helps you easily see the main concepts from your search. Screening this word document will be much less straining on the eyes and take less time because the main concepts have already been identified for you.

Word document, ready for screening
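For the curious, here is a rough Python analogue of the colour-coding idea. It does not touch Word at all: it is a sketch that wraps each search term in an HTML highlight so the result can be opened in a browser. The function name, the colours, and the sample sentence are all made up, and like Word’s replace it matches terms anywhere in the text, including inside longer words.

```python
import html
import re


def highlight(text, concepts):
    """Wrap each term in an HTML <mark> tag, one colour per concept.

    `concepts` maps a colour name to the list of terms for that concept.
    """
    colour_for = {}
    for colour, terms in concepts.items():
        for term in terms:
            colour_for[term.lower()] = colour
    if not colour_for:
        return html.escape(text)

    # Longest terms first, so a longer phrase wins over a shorter term inside it.
    ordered = sorted(colour_for, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in ordered), re.IGNORECASE)

    parts, last = [], 0
    for match in pattern.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        colour = colour_for[match.group(0).lower()]
        parts.append(f'<mark style="background:{colour}">{html.escape(match.group(0))}</mark>')
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


# Two concepts, one colour each (hypothetical terms and colours).
concepts = {"yellow": ["caring"], "lightgreen": ["attachment"]}
marked = highlight("Caring & attachment", concepts)
print(marked)
```

Keeping one colour per concept is the important part; whether the colour comes from Word’s highlighter or from a tag like this is a matter of taste.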

This trick works better for some topics than others. My example above which uses the concepts of caring and attachment works pretty well. However, complex interventions or other areas with ever-changing terminology might not work as well.

Pro tip: in some cases, it is useful to send this colour coded document to your patron, and let them make decisions about what citations are relevant.

Another pro tip: instead of formatting with a highlighter, which only comes in garish colours (why, Microsoft? why??), you can also format the text in any way you want. For example, you can put the relevant terms in bold or italics, or make the text itself different colours.

That’s it for today. Have you ever done this, or something similar? Do you have any protips for screening more quickly and efficiently? Send them to me on Twitter or through the Contact Me form!

* But seriously, if you’re not using EndNote, get on that.