Biochemicals data growth sheds light on reliability

Over the last year CiteAb product manager Rebecca Sadler has been working hard to grow our popular biochemicals dataset.

At a recent internal meeting we discussed how we’ve added a huge number of new citations to this dataset, and how they reinforce the overall trends we have seen emerging over the last few years. This demonstrates that the data collected using our unbiased methods is of extremely high quality and gives really good insight into global markets even before huge numbers of citations have been analysed.

Rebecca adds: “I shared this insight with colleagues and we felt that it would be a good discussion to have here on our blog too. It’s really important to the companies that we work with that they’re getting the very best quality datasets available, so hopefully this is really reassuring to them. Today I am going to outline some of the detail of this new data compared to our biochemicals data of a year ago.”

In the last year, the technology Rebecca oversees has added nearly 100k citations and 200k products to our biochemicals dataset. On average that’s an addition of around 5000 citations a month – six or seven an hour. That takes the total to 120,000 citations for over 300,000 products from 53 companies, compared to 21,000 citations for 100,000 products from 41 companies a year ago – so we’re keeping her pretty busy!
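
For anyone who wants to check the arithmetic behind those figures, here is a minimal Python sketch. It uses only the numbers quoted above; the variable names are purely illustrative, and the hourly rate assumes a 365-day year.

```python
# Figures quoted in the post; variable names are illustrative only.
citations_then, citations_now = 21_000, 120_000
products_then, products_now = 100_000, 300_000  # "over 300,000" now, so a lower bound
companies_then, companies_now = 41, 53

citations_added = citations_now - citations_then
products_added = products_now - products_then
companies_added = companies_now - companies_then

# Convert the quoted average of about 5000 citations a month to an hourly rate,
# assuming a 365-day year.
monthly_rate = 5_000
hours_per_month = 365 * 24 / 12  # 730 hours
hourly_rate = monthly_rate / hours_per_month

print(citations_added, products_added, companies_added)  # 99000 200000 12
print(round(hourly_rate, 1))  # 6.8
```

So the "nearly 100k" citations and "200k" products follow directly from the before and after totals, and 5000 citations a month does work out at six or seven an hour.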

Rebecca says: “The fantastic thing about our collection method is that its unbiased nature means our insights are accurate even at lower numbers of citations. Our top ten suppliers in the biochemical market remain unchanged year on year despite a huge additional number of data points.”

The top three suppliers in our data, both in August 2017 and now in May 2019, remain the same: Sigma-Aldrich, followed by Millipore and Tocris Bioscience. The overall market share has changed slightly, but not enough to impact the marketplace.

Below the top three we still see the same companies fill out the remaining seven places in the top ten, but with some very small changes to order. For example, previously we had Selleck Chemicals in fifth place and Cayman Chemical in fourth – these two have now swapped positions.

Dr Andrew Chalmers, founder of CiteAb, said: “The initial biochemical data that was collected was a very good overview of the market, despite containing only a small proportion of the data that we have now – and the trends we’re seeing can only continue to get more accurate as we add more data.”

Rebecca adds that trends for individual suppliers have also remained the same – for example Selleck Chemicals had an increasing share, and this continues in our latest data. This also applies to top cited products – 64 products in the top 100 in the first data analysed in August 2017 are still in the top 100.


A change we are seeing as our number of biochemicals expands is the appearance of solvents such as DMSO, and dyes like MTT, among our top cited products. Rebecca explains: “As well as collecting citations for various chemical probes, which include inhibitors and activators, we are now collecting carbohydrates, amino acids, dyes, and solvents which will give us a broader view of the biochemicals market as a whole.”

It is this increase in the breadth of the data which gives Rebecca and the team the ability to run significantly more powerful and specific analyses, looking more closely at specific segments of the market. She says: “For example, we can look at citation trends for different suppliers of the same product. Through doing this for MG-132 we can see that Millipore continues to lose share while Selleck is gaining share for this product.”

So what is next for this dataset? Rebecca tells us that her next focus is adding structural classification information, function, target, and disease areas. She’s also keen to hear suggestions from suppliers in this market about any other information she could add to help them segment the data.

In a few weeks we’ll blog on this dataset again and we’ll give away some free data – so if this is an area you’re interested in please do check back soon.

– Rhys and the CiteAb team
