Phrase searching in PubMed is weirdly complicated

In this week’s episode of expert searching: hubris edition, I found that I don’t understand PubMed nearly as well as I thought. I’ll confess to primarily being an Ovid user myself. I’ve never found the PubMed interface intuitive. I dislike not seeing my search history on the same page as my results, reading the search history from bottom to top of the page, and not being able to use proximity operators.

But, as an information specialist, one sometimes has to use platforms that one does not like.

In the course of using PubMed last week, I found that my search queries were not being interpreted the way I had intended. My query in Ovid retrieved over 9000 results, but when translated (as accurately as one can translate between the two…), my search retrieved less than half the results!

The solution: When search terms are in quotation marks, PubMed ignores the truncation symbol. My strategy had relied heavily on truncated phrases, all of which were in quotation marks (to avoid PubMed’s automatic term mapping), and all of which were being interpreted as singular rather than plural terms (e.g. “patient outcome*” would search only for “patient outcome” and not “patient outcomes”).

To demonstrate:

Search  Query                                               Items found
#3      Search patient reported outcome*[Title/Abstract]    7799
#2      Search patient reported outcome[Title/Abstract]     3319
#1      Search “patient reported outcome*”[Title/Abstract]  3319

Oh bother! Why can’t databases just read my mind already?!

This bug(?) in PubMed’s system of interpreting logic brings up a few important issues for systematic searching. I spent some time this week figuring out how the system works.

PubMed’s interpretation of the unqualified search: patient reported outcomes. Ick! That’s not what I wanted at all!

PubMed’s automatic term mapping kicks in when no truncation, quotation marks, or field tags (e.g. [tiab]) are used (an unqualified search). In the case of an unqualified search, PubMed searches for terms within MeSH, authors, journals, and the phrase index. If none are found, PubMed starts searching for the individual words within a phrase and adding them to your search. To see how PubMed interprets your search query, see the “search details” box in the right-hand sidebar on the search results page. This will show if any automatic term mapping was used, and if so, how.

The main take-away from this experience for me is:

To conduct a replicable and transparent search in PubMed, always ensure that your search terms and phrases either: 1) are in quotation marks, 2) use truncation, or 3) are in the phrase index. Also, never use both quotation marks and truncation at the same time. Otherwise, you run the risk of having your beautifully constructed search destroyed by silly computer logic.
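
If you want to catch this before running a long strategy, a few lines of Python can flag the offending phrases. This is only a rough sketch: the function name find_quoted_truncation is made up for illustration, and it assumes the behaviour described above (truncation being ignored inside quotation marks) applies to the interface you are searching.

```python
import re

# A phrase wrapped in straight or curly double quotes.
QUOTED_PHRASE = re.compile(r'["“]([^"“”]*)["”]')


def find_quoted_truncation(query: str) -> list[str]:
    """Return every quoted phrase in the query that also contains an asterisk."""
    phrases = QUOTED_PHRASE.findall(query)
    return [phrase for phrase in phrases if "*" in phrase]


strategy = (
    '"quality of life"[tiab] OR outcome*[tiab] '
    'OR "patient reported outcome*"[tiab]'
)
for phrase in find_quoted_truncation(strategy):
    print(f"Check this phrase: {phrase}")
```

Unquoted truncated terms (like outcome* above) are left alone; only phrases that combine quotation marks and an asterisk are reported.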


The secret to bibliometric analysis: generating a list of PMIDs

By now, it’s probably no secret that I love crunching bibliometric data. I find that analysing my results (both during search strategy formation and after downloading final results) gives me a broader perspective and lets me see trends that I might otherwise miss.

However, analysing data can sometimes be time consuming and clunky. Data never seems to be in the format that you want it when you need it; the precise tool that you need at that moment hasn’t been invented yet or is otherwise proprietary; the right software for the job requires a programming language you haven’t yet learned, and so forth. Sometimes you want a quick and dirty answer to help develop a strategy and it doesn’t have to be tidy or perfect, but you need it now!

Here’s my quick and dirty trick for analysing your bibliometric [medline] data:

  1. Generate a list of PMIDs from your results (whether your strategy is finalised or not!)
  2. Pop into the data analysis program of your choosing…

The beauty of this trick is that you can copy-paste whatever you are working on at this very moment (provided you’re working with medline data, of course…) and get real-time feedback. No need to mess with clunky software interfaces or retype your strategy.

Generate a list of PMIDs

PubMed

If you’re using PubMed, this part is easy. Click the “Format: Summary” drop down menu just below the search bar, then select “PMID”. Et voila! The resulting page is a plain text list of PMIDs, taken from the results on the previous page.


Note that the resulting PMID list will show only the citations from the previous page, so you may want to scroll to the bottom of the screen to show the max number of citations per page (200 at the time of this writing).

Ovid MEDLINE

To extract PMIDs from Ovid:

  • select all citations (or a range if there’s a lot!)
  • click “export”
  • select “excel” under the drop-down menu “Export To:”
  • select “custom fields”
  • under “select fields” (beside the “custom fields” radio button), unselect everything except “unique identifier” (this is the field that contains the PMID in Ovid)
  • Then select “export citations”

An excel file should download with a column of PMIDs, which can then be copied/pasted.

(Thanks to Michelle Fiander for the excel tip!)
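
Whichever route you take, the pasted list often picks up a column header, blank lines, or duplicates along the way. Here is a minimal Python sketch for tidying it up before pasting it elsewhere. The function name clean_pmids is hypothetical, and treating a PMID as a run of one to eight digits is my assumption, so adjust the pattern if that stops being true.

```python
import re


def clean_pmids(pasted: str) -> list[str]:
    """Return the unique PMIDs found in pasted text, in first-seen order."""
    seen = set()
    pmids = []
    # Split on whitespace, commas, or semicolons, then keep digit-only tokens.
    for token in re.split(r"[\s,;]+", pasted.strip()):
        if re.fullmatch(r"[0-9]{1,8}", token) and token not in seen:
            seen.add(token)
            pmids.append(token)
    return pmids


# Made-up identifiers, with a header row and a duplicate, as an Excel column might paste.
pasted_column = "Unique Identifier\n27654321\n27654321\n 26543210 \n"
print(",".join(clean_pmids(pasted_column)))
```

The header text is dropped because it is not made of digits, and the comma-separated output can be pasted straight into tools that accept a list of PMIDs.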

 

Analyse your data

Once you have your list of PMIDs, you can pop them into a variety of different tools to crunch the data in different ways. For example, try pasting your list into:

  • PubReminer  – for a word count analysis of authors, journals, MeSH, title/abstract words…
  • Medline Trends – for an analysis of citations over time
  • GoPubMed – for a variety of filters (maps! bar graphs! frequency charts!)
  • Yale MeSH Analyser – for a side-by-side comparison of MeSH usage

And more! Someday I intend to write up a full list of medline data analysis tools freely available online, but that day is not today…

It’s not necessary to input a full search strategy into most bibliometric analysis programmes… simply paste in your PMIDs!

Why would a person bother to do this?

Building a search strategy is an iterative process and it requires using a lot of different tools. For example, you can use your own common sense and intuition, but other tried-and-true strategies include: backwards/forwards citation chaining, talking to experts in the field, or looking at highly cited papers/journals in the field.

Using quick data analysis strategies throughout the process of building a search strategy will help ensure that important concepts aren’t missed. They provide a more objective picture of what’s happening, what’s missing, and how you can better refine your strategy.

That’s it for this week!

PS This is my first proper blog and I must say… keeping a blog up to date is not as easy as I thought. Please do let me know if you find this content useful and I will try my utmost to keep ’em coming! You can use the site contact form or find me on Twitter at @v_woolf.

 

ONE WEIRD TRICK THAT OVID WON’T TELL YOU

No, it’s not a food that will give you a slimmer stomach or boost your manly prowess.
I’m talking, of course, about the ability to find the total number of citations in the Medline database. Why on earth would someone want to know how many citations are in a database, you ask?
  • To compare and contrast the size with other databases
  • For FUN, because you’re a nerd like me
  • Um… because?
It’s relatively straightforward to find the total number of citations in PubMed. Their documentation helpfully tells us: “To search for the total number of PubMed citations, enter all[sb] in the search box.”
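If you would rather not type all[sb] into the search box once per year, the same query can be sent to NCBI’s E-utilities esearch service. The sketch below only builds the request URL. The function name total_count_url is made up, and adding a [dp] (date of publication) limit to get per-year totals is my own assumption about what you might want.

```python
from urllib.parse import urlencode

# Address of the esearch endpoint in NCBI's E-utilities.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def total_count_url(year=None):
    """Build an esearch URL asking only for the number of matching citations."""
    # all[sb] matches every PubMed citation; the optional year narrows it down.
    term = "all[sb]" if year is None else f"all[sb] AND {year}[dp]"
    params = {"db": "pubmed", "term": term, "rettype": "count", "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


print(total_count_url())
print(total_count_url(2015))
```

Opening either URL in a browser should return a small response containing the count. Nothing here touches Ovid, so it does not solve the Ovid Medline problem below.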
However, a few days ago I was struck with an awkward problem. I needed to find the total number of citations in Ovid Medline. Why? I had conducted a straightforward scoping search for a researcher and created a basic frequency analysis of the number of citations retrieved in the search per year to show the publishing trends in the topic over time.
frequency analysis, non-normalised (raw count of citations)
The researcher asked me to normalise the data…. say what?? Do I look like a statistician?
I knew I couldn’t use the numbers from PubMed, because the two have slight differences in content. And I couldn’t translate my strategy into PubMed because it relied heavily on the adjacency operator (which is absent in PubMed).
After some frantic searching, I found out that this was not such a difficult task: all I needed to do was take the number of citations retrieved from the search in a given year, and divide this by the number of total citations published in the database that year. This would even out any potential errors in the chart from anomalies in the database as a whole.
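For the non-statisticians among us, the calculation described above fits in a few lines of Python. This is a minimal sketch: the function name normalise_counts is hypothetical and the numbers are invented purely to show the arithmetic, not taken from my actual search.

```python
def normalise_counts(search_counts, database_totals):
    """Express each year's search results as a percentage of that year's database total."""
    normalised = {}
    for year in sorted(search_counts):
        total = database_totals.get(year)
        if not total:
            # Skip years with a missing or zero total rather than divide by zero.
            continue
        normalised[year] = 100 * search_counts[year] / total
    return normalised


# Invented figures for illustration only.
search_counts = {2014: 50, 2015: 60}
database_totals = {2014: 1000, 2015: 1200}

for year, percentage in normalise_counts(search_counts, database_totals).items():
    print(year, f"{percentage:.2f}%")
```

In this made-up case the raw count rises from 50 to 60, but the normalised share stays flat at 5% because the database itself grew by the same proportion.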
The problem: I could not find an equivalent operation in Ovid Medline to PubMed’s all[sb] command. After combing through Ovid’s documentation, I finally broke down and tweeted them… and received a response within a few hours.
I know everyone’s been waiting with bated breath to find the answer: it’s docz.dz.
What does the .dz field code stand for? No idea. But anyway, it seems to do the trick, and now I have my nicely normalised graph. In the second image, below, you can see that the downtick in citations for the year 2016 has vanished, because the number of citations retrieved from the search is now shown in proportion to the total citations published that year.

 

frequency analysis, normalised (results as a percentage of total citations in database)
Happy story! The end.
PS Cheers to Ovid’s social media team! They are totally on the ball.

Search Twitter using Boolean logic

Today’s tip is one of those ideas that seems obvious when you think about it, but many seem to overlook. While information professionals know to use Boolean logic and nested parentheses in formal databases, many have not thought to apply the same logic to social media sites or specialised search engines.

Case in point: Twitter!

Sure, you can search for a specific hashtag or user, but you can also combine these things together in complex ways. Let’s look at a few examples…

Example 1: Job searching

I’ve found complex Twitter searching to be particularly useful when looking for vacant job postings (for myself and for others). Let’s say you’re looking for a position in the sustainability or environmental sector.

(sustainability OR environment OR environmental OR renewable OR clean OR energy) AND (#job OR #jobs OR #UKjobs OR recruit OR recruiting OR join OR vacant OR vacancy OR apply)

See results of the search above here. 

From here, you can further narrow down your search to local jobs by clicking “near me” from the dropdown menu or include keywords for the locations you are interested in as a separate concept.
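
If you build these searches often, it can help to keep each concept as a list of terms and let a small script assemble the query. The sketch below is one way to do that in Python. The function names or_group and build_query are made up, and the output is just a string to paste into the search box; nothing here talks to Twitter.

```python
def or_group(terms):
    """Join the terms of one concept with OR, dropping repeated terms."""
    unique_terms = list(dict.fromkeys(terms))  # keeps first-seen order
    return "(" + " OR ".join(unique_terms) + ")"


def build_query(*concepts):
    """Combine several concepts (each a list of terms) with AND."""
    return " AND ".join(or_group(terms) for terms in concepts)


sector = ["sustainability", "environment", "renewable"]
jobs = ["#job", "#jobs", "join", "vacancy", "join"]
print(build_query(sector, jobs))
```

Adding a location as a separate concept is then just a third list passed to build_query.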

Example 2: I saw that thing on that feed but now I can’t find it!

Have you ever tried to find something on Twitter, and just scrolled continuously through a user’s tweets hoping that it will miraculously surface? Yeah, me neither…

One way to find this elusive information is to use keywords in the search box along with a username. For example, maybe I remember some cool story about archival research in newspapers at Library of Congress.

@librarycongress newspapers

This search will find instances where Library of Congress has tweeted or has been mentioned in a tweet using the term newspapers:

Twitter search for “@librarycongress newspapers”

 

The above can also be nested within Boolean logic and parentheses.

Twitter, of course, wasn’t built for expert searching, so it’s far from a perfect interface. Some of the downsides include:

  • No truncation options
  • It’s difficult – if not impossible – to search systematically. Since Twitter is a proprietary platform and not necessarily transparent about the way its search interface works, it’s difficult to know exactly how it interprets your logic.
  • There’s no native ability to download results (although it can be accomplished through 3rd party programs).

Have you used Twitter for expert searching? Share your tips in the comments below, or contact me.

Yes, Virginia, it is possible to annotate your searches!

My inaugural tip for the Expert Searching blog comes, fittingly, through a chain of colleagues passed down mentor to mentee. I believe this tip originates from the irreplaceable Dean Giustini of HLWiki.
Ovid Medline recently added a feature to add search strategy annotations, but it’s clunky and annoying. To add annotations, you have to click several times, and to top it off, they aren’t even visible while constructing and executing the search. How useless is that?
Built-in annotations in Ovid Medline
However, there’s a secret nobody has told you: it’s always been possible to add annotations to your searches! Simply add square brackets to the end of any line. Any text inside the square brackets is meant to be read by people only; the computer disregards this content. These in-text annotations are a useful way to document the search process and to see what sets of concepts you are combining.
In-text annotations in Ovid Medline
Another way to use the square brackets is to add them to a line all by themselves. This helps separate parts of the search very clearly. If you’re testing out lots of different terms and combining concepts all over the place, it’s a good way to look back on your work and see what’s going on.
Line annotations in Ovid Medline
Why annotate your work?
  • Others will be able to understand your search strategy
  • You will be able to understand your search strategy!
  • It shows your thought process and rationale for making different decisions
  • It makes everyone happy because it doesn’t look like gibberish
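A side benefit of in-text annotations is that a saved strategy becomes easy to process outside Ovid, for example to pull the notes out into a plain outline. Here is a minimal Python sketch of that idea. The function name split_annotation and the example lines are made up, and it assumes the annotation is the last square-bracketed chunk on a line.

```python
import re

# Optional search text, followed by one [annotation] at the very end of the line.
ANNOTATED_LINE = re.compile(r"^(?P<search>.*?)\s*\[(?P<note>[^\[\]]*)\]\s*$")


def split_annotation(line):
    """Split a strategy line into (search text, annotation text)."""
    match = ANNOTATED_LINE.match(line)
    if match is None:
        return line.strip(), ""
    return match.group("search").strip(), match.group("note").strip()


# Hypothetical strategy lines, including an annotation-only separator line.
example_lines = ["[--- cancer concept ---]", "exp Neoplasms/ [MeSH heading]", "1 or 2"]
for line in example_lines:
    print(split_annotation(line))
```

Lines without an annotation come back unchanged with an empty note, so the same loop works on a whole strategy.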
That’s it for today; see you all next week!
Amanda

Why I started this blog

Hello, world!

After many failed attempts to start blogs in the past, I am once again crawling out of the woodwork to share my knowledge (such as it is…) with the world.

I’m an information specialist (or medical librarian, depending on your geography and political persuasion). Basically, I spend the majority of my time using medical databases, search engines, and hard to use websites. I find the unfindable! I search for the unsearchable! Sometimes searching for stuff is pretty straightforward. But a lot of the time – especially in medicine – the amount of information and the way it’s organised can be pretty overwhelming.

Over the years, I’ve found some useful ways of efficiently navigating databases, search engines, and hard to search websites. My colleagues have shared some pretty cool tricks, too. I’ve created this website as a way for expert searchers to share expertise, pick up new tips, and generally share the love.

Want to contribute? Contact me!