BEST: a web application for comprehensive biomarker exploration on large-scale data in solid tumors

Liu, Zaoqu; Liu, Long; Weng, Siyuan; Xu, Hui; Xing, Zhe; Ren, Yuqing; Ge, Xiaoyong; Wang, Libo; Guo, Chunguang; Li, Lifeng; Cheng, Quan; Luo, Peng; Zhang, Jian; Han, Xinwei

doi:10.1186/s40537-023-00844-y

Research
Open access
Published: 01 November 2023

Journal of Big Data volume 10, Article number: 165 (2023)

Abstract

Data mining from RNA-seq or microarray data has become an essential part of cancer biomarker exploration. Several existing web servers are valuable and widely used, but they lack meta-analysis across multiple datasets. Most contain tumor samples only from the TCGA database, with a single cohort per cancer type, so their results rest mainly on one cohort and are therefore fragile. Consistent performance across multiple independent cohorts is the foundation of a reliable biomarker. Moreover, existing tools lack deeper exploration of a biomarker's underlying mechanisms, tumor microenvironment, and drug indications. We therefore introduce BEST (Biomarker Exploration for Solid Tumors), a web application for comprehensive biomarker exploration on large-scale data in solid tumors. To ensure that genes are comparable across sequencing technologies and that clinical traits are legible, we re-annotated the transcriptome data and unified the nomenclature of clinical traits. BEST delivers fast and customizable functions, including clinical association, survival analysis, enrichment analysis, cell infiltration, immunomodulator, immunotherapy, candidate agents, and genomic alteration. Together, our web server provides multiple cleaned-up independent datasets and diverse analysis functionalities, helping to unlock the value of current data resources. It is freely available at https://rookieutopia.com/.

Introduction

Biomarker identification is an important goal of cancer research for clinicians and biologists. Routine research tasks include finding biomarkers that distinguish tumor from normal tissue, identify treatment-resistant patients, and predict prognosis and recurrence. Recently, immunotherapies, represented by immune checkpoint inhibitors, have opened a new era in cancer treatment and significantly improved the clinical outcomes of cancer patients [1]. However, only a small fraction of patients benefit substantially from immunotherapies [2]. Biomarkers that effectively predict immunotherapeutic efficacy are therefore crucial for preventing under- or over-treatment.

With the advancement of bioinformatics techniques, researchers increasingly explore cancer biomarkers using RNA-seq or microarray data [3, 4], and data mining has become an essential part of cancer research. However, such work can be difficult for clinicians and biologists without programming skills. Several open-access web servers now let users analyze and visualize gene expression directly online, such as GEPIA [5], Xena [6], ExpressionAtlas [7], and HPA [8]. Although these web applications are valuable and widely used, obtaining high-confidence results for a specific tumor is difficult because their data derive mainly from the TCGA database. Consistent performance across multiple independent datasets is the foundation of a reliable biomarker. In addition, these tools lack deeper exploration of a biomarker's underlying mechanisms, tumor microenvironment, and drug indications.

To address these unmet needs, we developed Biomarker Exploration for Solid Tumors (BEST), a web-based application for comprehensive biomarker exploration on large-scale data in solid tumors that delivers fast and customizable functionalities to complement existing tools.

Methods

Data collection

BEST is committed to identifying robust tumor biomarkers through large-scale data. We therefore retrieved as many cancer datasets as possible that had both expression data and important clinical information (e.g., survival and therapy). Eligible datasets were drawn mainly from five databases: The Cancer Genome Atlas Program (TCGA, https://portal.gdc.cancer.gov), Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/), International Cancer Genome Consortium (ICGC, https://dcc.icgc.org), Chinese Glioma Genome Atlas (CGGA, http://www.cgga.org.cn/), and ArrayExpress (https://www.ebi.ac.uk/arrayexpress/). In total, we included more than 50,000 samples from 64 datasets for 27 cancer types.

Data re-annotation and pre-processing

Raw expression data were extracted for subsequent processing (Fig. 1). Where the original probe sequences were available, data were re-annotated against the GRCh38 patch 13 reference sequences from GENCODE (https://www.gencodegenes.org/). For RNA-seq data, raw read counts were converted to transcripts per million (TPM) and then log2-transformed. Raw microarray data from Affymetrix®, Illumina®, and Agilent® were processed using the affy [9], lumi [10], and limma [11] packages, respectively. For microarray data from other platforms, the normalized matrix files were downloaded directly. Gene expression was then transformed into z-scores across patients within each dataset. To make analysis results easier to interpret and present, we cleaned and unified the clinical traits. Taking KRAS mutation as an example, GSE39084 [12] named the trait ‘kras.gene.mutation.status’ and labeled mutation ‘M’ and wild type ‘WT’, whereas GSE143985 [13] named it ‘kras_mutation’ and labeled mutation ‘Y’ and wild type ‘N’. We uniformly named the trait ‘KRAS’ and labeled mutation ‘Mut’ and wild type ‘WT’.
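The pre-processing steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the BEST pipeline itself: the gene lengths in kilobases, the pseudocount of 1 before the log2 transform, the use of the population standard deviation, and all function names are assumptions that the text does not specify. Only the KRAS label mappings are taken from the example above.

```python
import math
from statistics import mean, pstdev


def counts_to_tpm(counts, lengths_kb):
    """Convert one sample's raw counts to TPM (gene lengths in kb are an assumed input)."""
    rates = [c / l for c, l in zip(counts, lengths_kb)]  # reads per kilobase
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


def log2_tpm(tpm, pseudocount=1.0):
    # The pseudocount is an assumption; it keeps zero TPM values finite.
    return [math.log2(v + pseudocount) for v in tpm]


def zscore_across_patients(values):
    """Standardize one gene's expression across all patients of one dataset."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Per-dataset label mappings from the KRAS example in the text.
KRAS_LABELS = {
    "GSE39084": {"M": "Mut", "WT": "WT"},
    "GSE143985": {"Y": "Mut", "N": "WT"},
}


def harmonize_kras(dataset_id, raw_labels):
    """Map dataset-specific KRAS labels to the unified 'Mut'/'WT' labels (None if unknown)."""
    mapping = KRAS_LABELS[dataset_id]
    return [mapping.get(label) for label in raw_labels]


tpm = counts_to_tpm([10, 20, 30], [1.0, 2.0, 3.0])
print(log2_tpm(tpm))
print(harmonize_kras("GSE143985", ["Y", "N", "N"]))
```

Keeping the label mappings in one table per dataset makes it easy to audit how each source cohort was recoded.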

Data calculation and storage

BEST analyses involve a large amount of computation, so we completed the time-consuming calculations in advance and stored the results in R data files. Users can call these results directly, which significantly reduces waiting time and background computing load. Take colorectal cancer (CRC) as an example, for which we collected a total of 47 datasets. Drug assessment is an analysis module of BEST that requires fitting a ridge regression model for each drug, based on drug responses and expression data of cancer cell lines from the Genomics of Drug Sensitivity in Cancer_v1 (GDSC_v1), Genomics of Drug Sensitivity in Cancer_v2 (GDSC_v2), The Cancer Therapeutics Response Portal (CTRP), and Profiling Relative Inhibition Simultaneously in Mixtures (PRISM) databases, and then predicting the sensitivity to each drug for the CRC samples in all collected datasets. Without pre-calculation, users might have to wait more than 3 days for these results. The pre-calculated content is displayed in Fig. 1.
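To illustrate why this step is worth pre-computing, the following is a minimal ridge regression sketch in plain Python: a model is fitted on cell-line expression and drug response, then applied to tumor samples. It is not the code BEST runs. The closed-form solution without an intercept, the fixed penalty `lam`, the toy data, and the helper names are all assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ridge(X, y, lam):
    """Weights minimizing ||y - Xw||^2 + lam * ||w||^2 (no intercept, an assumption)."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    for i in range(p):
        xtx[i][i] += lam  # the ridge penalty is added to the diagonal
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(X, w):
    return [sum(xi * wi for xi, wi in zip(row, w)) for row in X]


# Toy data: 4 cell lines x 2 genes, and a made-up drug response.
cell_line_expr = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
drug_response = [2.0, -1.0, 1.0, 3.0]
weights = fit_ridge(cell_line_expr, drug_response, lam=1.0)

# Apply the cell-line model to (hypothetical) bulk tumor samples.
tumor_expr = [[0.5, 0.5], [1.5, 0.2]]
print(predict(tumor_expr, weights))
```

Repeating such a fit for every drug in four screening databases, and predicting for every sample in every cohort, is what makes on-demand computation impractical.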

Implementations

BEST is entirely free to use. It is built with Shiny, with HTML5, CSS, and JavaScript libraries for the client-side user interface. The Shiny app (version 1.7.2) mainly executes data processing and analysis. The functions of BEST are divided into eight tabs (Fig. 1): Clinical association, Survival analysis, Enrichment analysis, Cell infiltration, Immunomodulator, Immunotherapy, Candidate agents, and Genomic alteration. Analysis results include images and tables: images can be downloaded in portable document format (PDF) or portable network graphics (PNG) format, and tables in comma-separated value (CSV) format.

Results

Quick start

BEST offers a simple interactive interface. Users first select a cancer type and then choose the input category: single gene or gene list (Fig. 1). In the single-gene module, users enter a gene symbol or an Ensembl ID in the ‘Enter gene name’ field to explore a gene of interest. In the gene-list module, users input a list of genes and pick a method for calculating the gene set score of each sample. The embedded methods are gene set variation analysis (GSVA) [14], single sample gene set enrichment analysis (ssGSEA) [15], z-score [16], pathway-level analysis of gene expression (PLAGE) [17], and the mean value. Users can customize the name of the gene set score.
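The two simplest scoring options, the mean value and the z-score method, can be sketched as follows. This is an illustration under assumptions, not BEST's implementation: the expression layout (gene symbol mapped to one value per sample), the function names, and the choice of population standard deviation are hypothetical. The z-score variant follows the combined z-score idea of summing per-gene z-scores and dividing by the square root of the number of genes.

```python
import math
from statistics import mean, pstdev


def standardize(values):
    """Z-score one gene across samples; constant genes map to zeros."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


def gene_set_scores(expression, gene_set, method="zscore"):
    """Score each sample for a gene set.

    expression: dict mapping gene symbol -> list of values, one per sample.
    Genes missing from the data are skipped (an assumption).
    """
    genes = [g for g in gene_set if g in expression]
    if not genes:
        raise ValueError("no gene from the gene set is present in the data")
    n_samples = len(expression[genes[0]])
    if method == "mean":
        rows = [expression[g] for g in genes]
        return [mean(row[s] for row in rows) for s in range(n_samples)]
    if method == "zscore":
        rows = [standardize(expression[g]) for g in genes]
        # Combined z-score: sum of per-gene z-scores divided by sqrt(k).
        scale = math.sqrt(len(genes))
        return [sum(row[s] for row in rows) / scale for s in range(n_samples)]
    raise ValueError(f"unsupported method: {method}")


expr = {"COL1A2": [1.0, 2.0, 3.0], "COL1A1": [2.0, 4.0, 6.0]}
print(gene_set_scores(expr, ["COL1A2", "COL1A1", "NOT_MEASURED"], method="mean"))
print(gene_set_scores(expr, ["COL1A2", "COL1A1"], method="zscore"))
```

GSVA, ssGSEA, and PLAGE are rank-based or decomposition-based and are not reproduced here.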

Clinical association

In this module, users can explore the associations between the expression or score of the input variable and general characteristics (e.g., age, gender, alcohol use, and smoking), histological characteristics (e.g., tissue type, tumor site, and stage), molecular characteristics (e.g., TP53 mutation and microsatellite instability), and treatment responses (e.g., chemotherapy and bevacizumab responses) (Fig. 2A). The choice between parametric and nonparametric statistical tests for group comparisons depends on the distribution of the input variable [18]. For example, users can easily explore the differential expression of the input variable between tumor and normal tissues or find associations between the input variable and smoking or alcohol use. Our datasets also include abundant treatment responses, which might contribute to developing promising biomarkers in clinical settings. Importantly, analysis results are displayed across multiple independent cohorts wherever possible, which indicates how stable a variable of interest is. For instance, Fig. 2B illustrates that CRC tumors possess significantly higher expression of COL1A2 than normal tissues in most CRC datasets with tissue type information.

Survival analysis

BEST performs survival analysis based on gene expression or gene set score. This module allows users to explore the prognostic significance for overall survival (OS), disease-free survival (DFS), relapse-free survival (RFS), progression-free survival (PFS), and disease-specific survival (DSS) (Fig. 2C). BEST generates Kaplan–Meier curves with the log-rank test, and forest plots showing hazard ratios and 95% confidence intervals from Cox proportional hazards regression, for the various survival outcomes across multiple independent datasets (Fig. 2C, D). Because Kaplan–Meier analysis requires categorical variables, we provide five cutoff options for users to choose from: ‘median’, ‘mean’, ‘quantile’, ‘optimal’, and ‘custom’. For example, when investigating the gene COL1A2 in the survival analysis of CRC, users can obtain Kaplan–Meier curves with a specific cutoff approach and a Cox forest plot for five survival outcomes across all CRC datasets with survival information.
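As a rough illustration of the ‘median’ cutoff followed by a Kaplan–Meier estimate, consider the sketch below. It is a minimal stand-alone example, not the code behind BEST: the function names, the rule that values equal to the median fall into the low group, and the toy data are assumptions, and the log-rank test and Cox regression are not implemented.

```python
from statistics import median


def split_by_median(expression):
    """Label each sample 'high' or 'low'; ties with the median go to 'low' (an assumption)."""
    cutoff = median(expression)
    return ["high" if v > cutoff else "low" for v in expression]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times: follow-up time per patient; events: 1 = event observed, 0 = censored.
    Returns a list of (time, survival probability) at each time with an event.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored patients both leave the risk set
    return curve


expression = [5.1, 2.3, 7.8, 1.2, 6.4, 3.3]
times = [12, 30, 8, 40, 15, 25]
events = [1, 0, 1, 0, 1, 1]
groups = split_by_median(expression)
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    print(label, kaplan_meier([times[i] for i in idx], [events[i] for i in idx]))
```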

Enrichment analysis

BEST provides two enrichment frameworks: over-representation analysis (ORA) [19] and gene set enrichment analysis (GSEA) [20]. For ORA, users select the top genes (a user-defined number) most associated with the input variable; for GSEA, BEST uses a list of all genes ranked by their correlation with the input variable (Fig. 3A). The Pearson correlation is calculated between each gene and the input variable, and the final correlation coefficient for each gene is the average of its correlations across all datasets of the specific cancer. If users input a gene list, a gene set score is first calculated with one of the five provided methods: gene set variation analysis (GSVA), single sample gene set enrichment analysis (ssGSEA), z-score, pathway-level analysis of gene expression (PLAGE), or the mean value. ORA results are presented as GO and KEGG bar charts (Fig. 3B, C). The ‘Detected Genes’ are the top genes most related to the input variable that are also present in the GO or KEGG gene sets. The ‘Enriched Genes’ are the top genes that fall within a specific biological pathway. GSEA results are shown as GSEA-Plot (Fig. 3D) and Ridge-Plot (Fig. 3E) images. The GO, KEGG, and Hallmark gene sets for GSEA were obtained from the Molecular Signatures Database (MSigDB). As with ORA, the input variable can be a single gene or a gene list. A specific GO, KEGG, or Hallmark term can be shown as a GSEA-Plot, and a series of terms can be displayed as a Ridge-Plot.
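The ranking and ORA steps can be sketched as follows. This is a simplified illustration, not BEST's code: restricting the ranking to genes shared by all datasets, treating constant genes as having zero correlation, and using a one-sided hypergeometric test for over-representation are assumptions, as are all names and the toy data.

```python
from math import comb, sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant (an assumption)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_by_mean_correlation(datasets, target):
    """Rank genes by their Pearson correlation with `target`, averaged over datasets.

    datasets: list of dicts mapping gene symbol -> expression values.
    Only genes present in every dataset are ranked (an assumption).
    """
    shared = set.intersection(*(set(d) for d in datasets)) - {target}
    mean_r = {
        g: sum(pearson(d[g], d[target]) for d in datasets) / len(datasets)
        for g in shared
    }
    return sorted(mean_r.items(), key=lambda item: item[1], reverse=True)


def ora_p_value(n_universe, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric probability of seeing at least n_overlap genes."""
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_universe, n_selected)


cohorts = [
    {"TARGET": [1, 2, 3], "GENE_A": [2, 4, 6], "GENE_B": [3, 2, 1]},
    {"TARGET": [1, 2, 3, 4], "GENE_A": [1, 2, 3, 5], "GENE_B": [4, 3, 2, 1]},
]
print(rank_by_mean_correlation(cohorts, "TARGET"))
# 200 top genes out of 20000, pathway of 100 genes, 10 of them among the top genes.
print(ora_p_value(20000, 100, 200, 10))
```

The full ranked list would feed GSEA, while only the head of the list is passed to the over-representation test.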

Cell infiltration and immunomodulator

BEST offers eight prevalent algorithms to estimate immune cell infiltration in the tumor microenvironment (TME) (Fig. 4A): CIBERSORT [21], CIBERSORT ABS [21], EPIC [22], ESTIMATE [23], MCP-counter [24], quanTIseq [25], TIMER [26], and xCell [27]. To spare users time-consuming calculations and save computing resources, these eight algorithms were run in advance across all datasets, and the results are stored in the website backend. Additionally, BEST provides five immunomodulator categories: antigen presentation, immunoinhibitors, immunostimulators, chemokines, and receptors (Fig. 4A). Users can generate heatmaps and correlation scatter plots from these two analysis modules. The heatmaps illustrate the correlations of the input variable with each immune cell or immunomodulator across all cohorts (Fig. 4B, C), and the correlation scatter plots show the correlation between the input variable and one immune cell or immunomodulator in a specific dataset (Fig. 4D, E).

Immunotherapy

To further investigate the clinical significance of the input variable in immunotherapies, we retrieved 19 immunotherapeutic cohorts with expression data and immunotherapy information (e.g., CAR-T, anti-PD-1, and anti-CTLA4) (Fig. 5A). Based on gene expression or gene set score in these datasets, users can perform differential expression analysis (DEA) between response and non-response groups (Fig. 5B), receiver operating characteristic (ROC) analysis to evaluate how well the input variable predicts immunotherapeutic efficacy (Fig. 5C), and survival analysis to assess the impact of the input variable on survival (OS and PFS) in cohorts treated with immunotherapy (Fig. 5D).
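The area under the ROC curve summarizes how well the input variable separates responders from non-responders. A minimal sketch follows, assuming responders are coded 1 and non-responders 0; the function name and toy values are hypothetical. It uses the pairwise-comparison definition of the AUC rather than tracing the curve.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a responder scores higher than a non-responder.

    labels: 1 = responder, 0 = non-responder (coding is an assumption).
    Ties between a responder and a non-responder count as one half.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both response groups must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


expression = [2.1, 3.5, 1.0, 0.4, 2.8, 0.9]
response = [1, 1, 0, 0, 1, 0]
print(roc_auc(expression, response))
```

An AUC near 0.5 means the variable carries no predictive information, and a value below 0.5 suggests that low rather than high levels are associated with response.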

Candidate agents

In this analysis tab, BEST performs drug assessment in bulk samples based on drug responses and expression data of cancer cell lines from the GDSC_v1, GDSC_v2, CTRP, and PRISM databases (Fig. 6A). Given the inherent differences between bulk samples and cell line cultures, we introduced a correlation-of-correlations framework [28] to retain genes with analogous co-expression patterns in bulk samples and cell lines. As previously reported [29], the model used for predicting drug response was the ridge regression algorithm implemented in the oncoPredict package [30]. The predictive model was trained on the transcriptional expression profiles and drug response data of cancer cell lines, and its predictive accuracy, evaluated by the default 10-fold cross-validation, was satisfactory, thus allowing clinical drug response to be estimated from the expression data of bulk samples alone (Fig. 6A). Modeling and prediction have been completed in advance, and the drug assessments of all tumor samples based on the four databases are stored in the website backend. BEST calculates the correlations between all drugs and the input variable in all cohorts. According to the correlation rank of each drug across all datasets, we applied robust rank aggregation (RRA) [31] to determine the importance of each drug in relation to the input variable (Fig. 6B). Users can select the top drugs (a user-defined number) to display a heatmap that illustrates the correlations of the input variable with each drug across all cohorts. Higher-ranked drugs indicate that high levels of the input variable predict drug resistance, and vice versa. For example, high expression of COL1A2 might suggest Afatinib resistance and Dasatinib sensitivity based on the GDSC_v2 database (Fig. 6C). Users can also select a drug database, a tumor dataset, and a specific drug to generate a correlation scatter plot (Fig. 6D).
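The idea behind robust rank aggregation can be sketched compactly. The example below computes a rho score for one drug from its normalized ranks across cohorts: under the null hypothesis the ranks are uniform on (0, 1], so the k-th smallest rank follows a beta distribution whose tail can be written as a binomial sum. This is an illustrative re-implementation with hypothetical names and toy ranks, not the code used by BEST or by the RRA authors' software.

```python
from math import comb


def beta_order_cdf(x, k, n):
    """P(k-th smallest of n uniform(0, 1) draws <= x), written as a binomial tail."""
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))


def rho_score(normalized_ranks):
    """Rho score for one drug.

    normalized_ranks: the drug's rank in each cohort divided by the number of
    ranked drugs, so each value lies in (0, 1]. Smaller scores mean the drug
    is ranked near the top more consistently than expected by chance.
    """
    r = sorted(normalized_ranks)
    n = len(r)
    best = min(beta_order_cdf(r[k - 1], k, n) for k in range(1, n + 1))
    # Multiplying by n corrects for taking the minimum over n order statistics.
    return min(best * n, 1.0)


# Hypothetical normalized ranks of two drugs across three cohorts.
consistent_drug = [0.01, 0.01, 0.01]
inconsistent_drug = [0.01, 0.50, 0.90]
print(rho_score(consistent_drug))
print(rho_score(inconsistent_drug))
```

A drug that ranks near the top in every cohort receives a much smaller score than one that ranks highly in a single cohort only, which is what makes the aggregated ranking robust to outlying datasets.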

Genomic alteration

In this module, BEST uses mutation and copy number variation data from the TCGA database that were pre-processed with maftools [32] and GISTIC2.0 [33], respectively. Users can obtain a heatmap showing genomic alterations as the input variable increases. The right panel of the heatmap also displays the proportion of genomic alterations and the statistical differences between the high and low groups. For example, Fig. 7 illustrates the genomic landscape of the TCGA-CRC dataset as COL1A2 expression rises. We found that the loss of chromosome segment 1p13.2 was more frequent in the high expression group.

Discussion

As an interactive web tool, BEST aims to explore the clinical significance and biological functions of cancer biomarkers through large-scale data. Data richness is therefore the foundation of BEST. Through data collection, re-annotation, pre-processing, pre-calculation, and storage, we provide a tidy and uniform pan-cancer database that allows users to call and interpret data quickly. BEST offers prevalent analysis modules that enable researchers without programming skills to conduct various bioinformatics analyses. Compared with other available tools [5,6,7,8, 34,35,36], BEST has more datasets and more diverse analysis options, and it complements them well (Table 1).

Table 1 Comparison of BEST with other tools

In the BEST web application, users can identify cancer biomarkers associated with critical clinical traits (e.g., stage and grade), prognosis, and immunotherapy. Moreover, the underlying mechanisms of these biomarkers can be further explored using the enrichment, cell infiltration, and immunomodulator analysis modules. Users can also apply the candidate agents tab to investigate which drugs a specific cancer is likely to resist, and which it is likely to respond to, when the level of a biomarker is high.

Taken together, BEST provides a curated database and innovative analytical pipelines to explore cancer biomarkers at high resolution. It is an easy-to-use and time-saving web tool that allows users, especially clinicians and biologists without a background in bioinformatics data mining, to comprehensively and systematically explore the clinical significance and biological function of cancer biomarkers. With constant user feedback and further improvement, BEST promises to become an integral part of routine data analysis for researchers.

Data availability

BEST is available at https://rookieutopia.com/.

Abbreviations

BEST: Biomarker Exploration for Solid Tumors
CRC: Colorectal cancer
PDF: Portable document format
PNG: Portable network graphics
CSV: Comma-separated value
TME: Tumor microenvironment
OS: Overall survival
DFS: Disease-free survival
RFS: Relapse-free survival
PFS: Progression-free survival
DSS: Disease-specific survival
GDSC: Genomics of Drug Sensitivity in Cancer
CTRP: The Cancer Therapeutics Response Portal
PRISM: Profiling Relative Inhibition Simultaneously in Mixtures

References

1. Hamilton PT, Anholt BR, Nelson BH. Tumour immunotherapy: lessons from predator-prey theory. Nat Rev Immunol. 2022. https://doi.org/10.1038/s41577-022-00719-y.
2. Vesely MD, Zhang T, Chen L. Resistance mechanisms to anti-PD cancer immunotherapy. Annu Rev Immunol. 2022;40:45–74.
3. Liu Z, Liu L, Weng S, Guo C, Dang Q, Xu H, et al. Machine learning-based integration develops an immune-derived lncRNA signature for improving outcomes in colorectal cancer. Nat Commun. 2022;13(1):816.
4. Liu Z, Guo C, Dang Q, Wang L, Liu L, Weng S, et al. Integrative analysis from multi-center studies identities a consensus machine learning-derived lncRNA signature for stage II/III colorectal cancer. EBioMedicine. 2022;75:103750.
5. Tang Z, Li C, Kang B, Gao G, Li C, Zhang Z. GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses. Nucleic Acids Res. 2017;45(W1):W98–102.
6. Goldman MJ, Craft B, Hastie M, Repecka K, McDade F, Kamath A, et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol. 2020;38(6):675–8.
7. Petryszak R, Keays M, Tang YA, Fonseca NA, Barrera E, Burdett T, et al. Expression Atlas update–an integrated database of gene and protein expression in humans, animals and plants. Nucleic Acids Res. 2016;44(D1):D746-752.
8. Uhlen M, Fagerberg L, Hallstrom BM, Lindskog C, Oksvold P, Mardinoglu A, et al. Proteomics. Tissue-based map of the human proteome. Science. 2015;347(6220):1260419.
9. Gautier L, Cope L, Bolstad BM, Irizarry RA. Affy—analysis of Affymetrix GeneChip data at the probe level. Bioinformatics. 2004;20(3):307–15.
10. Du P, Kibbe WA, Lin SM. Lumi: a pipeline for processing Illumina microarray. Bioinformatics. 2008;24(13):1547–8.
11. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47.
12. Kirzin S, Marisa L, Guimbaud R, De Reynies A, Legrain M, Laurent-Puig P, et al. Sporadic early-onset colorectal cancer is a specific sub-type of cancer: a morphological, molecular and genetics study. PLoS ONE. 2014;9(8):e103159.
13. Shinto E, Yoshida Y, Kajiwara Y, Okamoto K, Mochizuki S, Yamadera M, et al. Clinical significance of a gene signature generated from tumor budding grade in colon cancer. Ann Surg Oncol. 2020;27(10):4044–54.
14. Hanzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics. 2013;14:7.
15. Barbie DA, Tamayo P, Boehm JS, Kim SY, Moody SE, Dunn IF, et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature. 2009;462(7269):108–12.
16. Lee E, Chuang HY, Kim JW, Ideker T, Lee D. Inferring pathway activity toward precise disease classification. PLoS Comput Biol. 2008;4(11):e1000217.
17. Tomfohr J, Lu J, Kepler TB. Pathway level analysis of gene expression using singular value decomposition. BMC Bioinform. 2005;6:225.
18. le Cessie S, Goeman JJ, Dekkers OM. Who is afraid of non-normal data? Choosing between parametric and non-parametric tests. Eur J Endocrinol. 2020;182(2):E1–E3.
19. Tokar T, Pastrello C, Jurisica I. GSOAP: a tool for visualization of gene set over-representation analysis. Bioinformatics. 2020;36(9):2923–5.
20. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005;102(43):15545–50.
21. Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, et al. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods. 2015;12(5):453–7.
22. Racle J, de Jonge K, Baumgaertner P, Speiser DE, Gfeller D. Simultaneous enumeration of cancer and immune cell types from bulk tumor gene expression data. Elife. 2017;6: 6.
23. Yoshihara K, Shahmoradgoli M, Martinez E, Vegesna R, Kim H, Torres-Garcia W, et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat Commun. 2013;4:2612.
24. Becht E, Giraldo NA, Lacroix L, Buttard B, Elarouci N, Petitprez F, et al. Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression. Genome Biol. 2016;17(1):218.
25. Finotello F, Mayer C, Plattner C, Laschober G, Rieder D, Hackl H, et al. Molecular and pharmacological modulators of the tumor immune contexture revealed by deconvolution of RNA-seq data. Genome Med. 2019;11(1):34.
26. Li B, Severson E, Pignon JC, Zhao H, Li T, Novak J, et al. Comprehensive analyses of tumor immunity: implications for cancer immunotherapy. Genome Biol. 2016;17(1):174.
27. Aran D, Hu Z, Butte AJ. xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 2017;18(1):220.
28. Guinney J, Dienstmann R, Wang X, de Reynies A, Schlicker A, Soneson C, et al. The consensus molecular subtypes of colorectal cancer. Nat Med. 2015;21(11):1350–6.
29. Yang C, Chen J, Li Y, Huang X, Liu Z, Wang J, et al. Exploring subclass-specific therapeutic agents for hepatocellular carcinoma by informatics-guided drug screen. Brief Bioinform. 2021. https://doi.org/10.1093/bib/bbaa295.
30. Maeser D, Gruener RF, Huang RS. oncoPredict: an R package for predicting in vivo or cancer patient drug response and biomarkers from cell line screening data. Brief Bioinform. 2021. https://doi.org/10.1093/bib/bbab260.
31. Kolde R, Laur S, Adler P, Vilo J. Robust rank aggregation for gene list integration and meta-analysis. Bioinformatics. 2012;28(4):573–80.
32. Mayakonda A, Lin DC, Assenov Y, Plass C, Koeffler HP. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res. 2018;28(11):1747–56.
33. Mermel CH, Schumacher SE, Hill B, Meyerson ML, Beroukhim R, Getz G. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 2011;12(4):R41.
34. Chandrashekar DS, Bashel B, Balasubramanya SAH, Creighton CJ, Ponce-Rodriguez I, Chakravarthi B, et al. UALCAN: a portal for facilitating tumor subgroup gene expression and survival analyses. Neoplasia. 2017;19(8):649–58.
35. Mizuno H, Kitada K, Nakai K, Sarai A. PrognoScan: a new database for meta-analysis of the prognostic value of genes. BMC Med Genom. 2009;2:18.
36. Goswami CP, Nakshatri H. PROGgeneV2: enhancements on the existing database. BMC Cancer. 2014;14:970.

Acknowledgements

Not applicable.

Funding

This study was supported by The Collaborative Innovation Major Project of Zhengzhou (Grant No. 20XTZX08017), The National Natural Science Foundation of China (Grant No. 82002433), and Science and Technology Project of Henan Provincial Department of Education (Grant No. 21A320036).

Author information

Zaoqu Liu and Long Liu contributed equally to this work.

Authors and Affiliations

Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
Zaoqu Liu, Siyuan Weng, Hui Xu, Xiaoyong Ge & Xinwei Han
Interventional Treatment and Clinical Research Center of Henan Province, Zhengzhou, Henan, China
Xinwei Han
Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
Long Liu & Libo Wang
Department of Neurosurgery, The Fifth Affiliated Hospital of Zhengzhou University, Henan, China
Zhe Xing
Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
Yuqing Ren
Department of Endovascular Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
Chunguang Guo
Cancer Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450052, Henan, China
Lifeng Li
Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China
Quan Cheng
Department of Oncology, Zhujiang Hospital, Southern Medical University, Guangzhou, China
Peng Luo & Jian Zhang
State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
Zaoqu Liu
State Key Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, Department of Pathophysiology, Peking Union Medical College, Beijing, China
Zaoqu Liu

Contributions

ZQL contributed to study design, data analysis, and paper writing. XWH contributed to project oversight and paper revision. LL, SYW, HX, ZX, XYG, LBW, and CGG collected samples and generated data. LBW and LL performed and interpreted trail assays. YQR, LFL, QC, PL, and JZ contributed to paper revision.

Corresponding authors

Correspondence to Zaoqu Liu or Xinwei Han.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

We have obtained consent to publish this paper from all participants of this study.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Liu, Z., Liu, L., Weng, S. et al. BEST: a web application for comprehensive biomarker exploration on large-scale data in solid tumors. J Big Data 10, 165 (2023). https://doi.org/10.1186/s40537-023-00844-y

Received: 07 July 2022
Accepted: 16 October 2023
Published: 01 November 2023
DOI: https://doi.org/10.1186/s40537-023-00844-y
