8 Biology Databases to Accelerate your Research

In this blog:

  • We share a list of free-to-use databases to help with your life science research
  • How can you use this biology data?
  • What are the database sources?

How are you using biology databases in your research? 

You are almost certainly familiar with PubMed, NCBI, Web of Science and other popular research resources. Here, we share some further open-access data sources, focusing specifically on those that help with reagent selection, literature review and exploratory research.

For these 8 biology databases, we cover details of what they include, how you could use them, and how they were curated. 

Bear in mind this is by no means an exhaustive list – please get in touch if you use a resource that you want to share with the CiteAb community!


8 Life Science Biology Databases you should know about

1. YCharOS – Antibody Characterisation Database 

If you work with research antibodies, YCharOS is a fantastic resource to use. This open science organisation characterises commercially available antibodies in widely used applications against important targets. It publishes this characterisation data on F1000 and shares reports on Zenodo.

How can I use this data?

  • When selecting antibodies for your experiment, assessing YCharOS characterisation data can give you confidence your antibody is specific and selective in a certain application, and therefore more likely to give you reproducible results.

How is the data generated? 

  • YCharOS produces this data using a methodology formalised at the Montreal Neurological Institute. They use knockout validation and test antibodies in immunoblot, immunoprecipitation and immunofluorescence applications.

We also link to their data from the CiteAb reagent search engine, to make it easy to see if further characterisation data is available for your potential product.


2. Human Protein Atlas – Protein Expression Database

The Human Protein Atlas provides a useful open-access map of protein expression across human cells, tissues and organs, with over 300k monthly users! Data is split across 8 resources, including the ‘Tissue’ resource, ‘Brain’ resource, ‘Cancer’ resource and more.

How can I use this data?

  • When mapping out your research, you could use this resource to determine localisation, expression, disease association and more for your targets. It can also be used to identify cell type-specific genes, or to explore expression patterns in particular tissues of interest.

How is this data generated? 

  • The Human Protein Atlas employs a mix of techniques to generate data, including antibody-based imaging, mass spectrometry-based proteomics, transcriptomics and systems biology. They also provide details on the antibodies used to generate the data, with antibody validation being a core step in their process.

3. Cellosaurus – Cell Line Database

Cellosaurus can be considered a ‘knowledge resource on cell lines’.  Immortalized cell lines, plant cell lines, stem cell lines and more are included, with details on recommended name, synonym, species, and accession number. 

How can I use this data?

  • You could use this data to cite cell lines, identify appropriate ones for your research, or get more data on the cell lines you are intending to use in your experiment. Furthermore, it can help flag potential cross-contaminations.

How is this data generated? 

  • Part of the Swiss Institute of Bioinformatics in Geneva, Cellosaurus draws its data from publications, researchers, product pages and more.

4. CiteAb – Research reagent database

CiteAb is a database of over 14m commercially available research use only (RUO) antibodies, proteins, biochemicals, cell lines and nucleotides. The database is searchable through a free-to-use reagent search engine, which ranks results by citations. Reagents are linked to published literature, with experimental information extracted including reactivity, application, dilution, and published images.

The CiteAb database can also be in-licensed to feed into internal tools, data projects and workflows, and to dramatically accelerate reagent selection, through the CiteAb Unlimited service.

How can I use this data?

  • You can use this database to accelerate reagent selection. It lets you quickly evaluate available products, assess the relevant literature and purchase your chosen product via links to the supplier sites. This removes the time sink of manually gathering this data from many different sources.

How is this data generated? 

  • The CiteAb reagent database is curated using proprietary AI-driven text-mining technology augmented with human review by teams of scientists. Open-access publications, vendor datasheets, and a number of closed-access publications are fed into the database. 

5. Lipid Maps – Lipidomics resource

Lipid Maps shares lipid nomenclature, tools, protocols, standards, tutorials, meetings, publications, and other resources for lipids. They host a number of databases, including a structure database and a gene and protein database.

How can I use this data?

  • If studying lipids, this is a great place to access various resources and databases. As an example, their gene/proteome database could be used to identify lipid-related genes, and the Lipid analytics standards database used in experimental planning. 

How is this data generated? 

  • This resource is funded under an MRC partnership award by Cardiff University, University of California San Diego, Babraham Institute Cambridge, Swansea University and University of Edinburgh. Data in each of the resources is curated from several sources, such as laboratories, public sources, journals and computational work.

6. HMDB – Human metabolome database

The HMDB provides a resource of small molecule metabolites found in the human body. It links chemical data, clinical data and molecular biology/biochemistry data, and currently has over 220k entries.

How can I use this data?

  • You can use the ‘MetaboCard’ entries to explore metabolites of interest. The many other databases linked (such as KEGG and PubMed) enable easy further exploration and analysis. This can be helpful in fields such as metabolomics, clinical chemistry, biomarker discovery and more.

How is this data generated?

  • The data in this resource is compiled from the literature, linked open-access databases such as PubChem, PubMed and KEGG, as well as experimental data.

7. mAb3D Atlas – 3D brain reference atlas 

With the rise of spatial biology, this is an interesting new, spatially focused resource in brain proteomics. The data pertains to protein expression, providing a database of validated antibodies for the adult mouse brain.

How can I use this data?

  • If you study protein expression in the brain, the antibody database is worth browsing. The expected IHC results serve as helpful reference material, as do the shared protocols.

How is this data generated? 

  • The team have screened over 300 mAbs for IHC using a set protocol they detail on their site: https://mab3d-atlas.com/wp-content/uploads/2021/02/mAb3D_DataProduction_v1.0_20201109.pdf

8. ChEMBL – bioactive molecules with drug-like properties

This database contains chemical, bioactivity and genomic data, with over 2.5 million compounds listed that are ‘bioactive drug-like small molecules’.

How can I use this data?

  • This data could be particularly useful in early stages of drug discovery for identifying relevant molecules or similar molecules. In addition, it contains 2-D structures, calculated properties and abstracted bioactivities for assessment and analysis.
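
One common way to make "similar molecules" concrete is the Tanimoto coefficient, which scores two molecules by the share of structural features they have in common. The sketch below is purely illustrative: the feature sets are hand-made, hypothetical stand-ins for real fingerprints (not ChEMBL data), and the function names are our own. In practice, a cheminformatics toolkit would generate the fingerprints for you.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity: shared features / all features."""
    if not a and not b:
        return 0.0  # two empty fingerprints: treat as no similarity
    return len(a & b) / len(a | b)


def rank_by_similarity(query: set, library: dict, threshold: float = 0.5) -> list:
    """Return (name, score) pairs scoring at or above threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy "fingerprints": each molecule is a set of structural features.
library = {
    "compound_A": {"aromatic_ring", "carboxylic_acid", "ester"},
    "compound_B": {"aromatic_ring", "carboxylic_acid", "hydroxyl"},
    "compound_C": {"amine", "alkyl_chain"},
}
query = {"aromatic_ring", "carboxylic_acid", "ester", "methyl"}

hits = rank_by_similarity(query, library, threshold=0.4)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

Raising the threshold narrows the results to close analogues, while lowering it casts a wider net for exploratory work.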

How is this data generated?

  • Data in ChEMBL is manually extracted from the literature and updated several times a year. A number of journals are used – with 7 core examples listed on their site.

Wrap-up

These open-science resources provide useful starting points for research, and we hope you enjoy checking them out and using them in your work!

If you’d like to share any more biology databases with the CiteAb community, get in touch with us here.

You can also sign up for a free CiteAb account here, to explore our research reagent database.

  • Skye and the CiteAb team