
ARTICLES

PUBLISHED ONLINE: 9 OCTOBER 2017 | DOI: 10.1038/NMAT4997

Circulating tumour DNA methylation markers for diagnosis and prognosis of hepatocellular carcinoma
Rui-hua Xu1*†, Wei Wei1,2†, Michal Krawczyk2†, Wenqiu Wang2†, Huiyan Luo1,2†, Ken Flagg2,
Shaohua Yi2, William Shi2, Qingli Quan3, Kang Li3, Lianghong Zheng4, Heng Zhang5,
Bennett A. Caughey2, Qi Zhao1, Jiayi Hou2, Runze Zhang2, Yanxin Xu3, Huimin Cai3,4, Gen Li3,4,
Rui Hou4, Zheng Zhong2, Danni Lin2, Xin Fu2, Jie Zhu2, Yaou Duan2, Meixing Yu3, Binwu Ying6,
Wengeng Zhang3, Juan Wang7, Edward Zhang2, Charlotte Zhang2, Oulan Li2, Rongping Guo1,
Hannah Carter2, Jian-kang Zhu5, Xiaoke Hao7 and Kang Zhang2,3,8*

An effective blood-based method for the diagnosis and prognosis of hepatocellular carcinoma (HCC) has not yet been
developed. Circulating tumour DNA (ctDNA) carrying cancer-specific genetic and epigenetic aberrations may enable a
noninvasive ‘liquid biopsy’ for diagnosis and monitoring of cancer. Here, we identified an HCC-specific methylation marker
panel by comparing HCC tissue and normal blood leukocytes and showed that methylation profiles of HCC tumour DNA and
matched plasma ctDNA are highly correlated. Using cfDNA samples from a large cohort of 1,098 HCC patients and 835 normal
controls, we constructed a diagnostic prediction model that showed high diagnostic specificity and sensitivity (P < 0.001) and
was highly correlated with tumour burden, treatment response, and stage. Additionally, we constructed a prognostic prediction
model that effectively predicted prognosis and survival (P < 0.001). Together, these findings demonstrate in a large clinical
cohort the utility of ctDNA methylation markers in the diagnosis, surveillance, and prognosis of HCC.

Hepatocellular carcinoma (HCC) is a leading cause of cancer deaths worldwide1. As with many cancers, HCC found at an early stage carries a much-improved prognosis compared to advanced stage disease2, in part due to the relative efficacy of local treatments compared with systemic therapy. Thus, early detection has significant potential for reducing the mortality of HCC. Unfortunately, there has been little success in developing effective blood-based methods to screen for HCC. Alpha-fetoprotein (AFP) is the only currently available blood test for detection and surveillance of HCC; however, its clinical utility is limited by low sensitivity3.

Circulating tumour DNA (ctDNA) consists of extracellular nucleic acid fragments shed into plasma via tumour cell necrosis, apoptosis, and active release of DNA4. Recent research demonstrates that ctDNA has the potential to revolutionize screening, diagnosis, and treatment of cancer by enabling a noninvasive ‘liquid biopsy’, that is, a blood test that enables molecular testing of solid malignancies5,6. Compared to tissue biopsy, cell-free DNA (cfDNA) sequencing has some obvious advantages. First, the collection of peripheral blood to obtain cfDNA is minimally invasive compared with tumour biopsy, regardless of site. Second, blood can be taken at any time during therapy, allowing for real-time and dynamic monitoring of molecular changes in tumours rather than depending on the challenges of invasive biopsy or even imaging. Furthermore, monitoring of cfDNA may detect tumour that is not apparent or is indeterminate on imaging (for example, residual tumour post-resection). Finally, ctDNA may represent the entire molecular picture of a patient’s malignancy, while a tumour biopsy may be affected by intra-tumour heterogeneity.

DNA methylation is an epigenetic regulator of gene expression that usually results in gene silencing7. Increased methylation of tumour suppressor genes is an early event in many tumours, suggesting that altered DNA methylation patterns could be one of the first detectable neoplastic changes associated with tumorigenesis8–10. ctDNA-bearing cancer-specific methylation patterns have been investigated as feasible biomarkers in cancers11; however, currently there are few validated methylation markers available, such as SEPT9 in colorectal cancer12. DNA methylation profiling offers several advantages over somatic mutation analysis for cancer detection, including higher clinical sensitivity and dynamic range, many methylation target regions in diseases, and multiple altered CpG sites within each targeted genomic region. Further, each methylation marker is present in both cancer tissue and cfDNA, whereas only a fraction of mutations present in cancer tissue may be detected in cfDNA13.

Obtaining reliable and quantitative measurements of methylation values in a minimum amount of cfDNA remains challenging;

1 State Key Laboratory of Oncology in South China, Collaborative Innovation Center of Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China. 2 Moores Cancer Center and Institute for Genomic Medicine, University of California, San Diego, La Jolla, California 92093, USA. 3 Molecular Medicine Research Center, West China Hospital, Sichuan University, Chengdu 610041, China. 4 Guangzhou Youze Biological Pharmaceutical Technology Company Ltd., Guangzhou 510005, China. 5 Shanghai Center for Plant Stress Biology, Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, Shanghai 210602, China. 6 Department of Clinical Laboratory Medicine, West China Hospital, Sichuan University, Chengdu 610041, China. 7 Department of Clinical Laboratory Medicine, Xijing Hospital, the Fourth Military Medical University, Xi’an, Shanxi 710032, China. 8 Veterans Administration Healthcare System, San Diego, California 92093, USA. † These authors contributed equally to this work.
*e-mail: [email protected]; [email protected]

NATURE MATERIALS | VOL 16 | NOVEMBER 2017 | www.nature.com/naturematerials 1155


© 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.

more sensitive assays need to be developed. It is hypothesized that adjacent CpG sites in the same DNA strand may be modified by a methyltransferase or demethylase together14. These adjacent stretches of CpG methylation, which we refer to as methylation correlated blocks (MCBs), are similar in concept to haplotype blocks of adjacent single nucleotide polymorphisms (SNPs) in DNA sequence variations and have the potential to enhance the accuracy of methylation allele calling.

In this study, to evaluate the potential of ctDNA methylation markers in diagnosis and prognosis of HCC, we compared differential methylation profiles of HCC tissues and blood leukocytes in normal individuals by analysing 485,000 CpG markers, and identified a methylation marker panel enriched in HCC. After validation of this panel in matched HCC tumour DNA and plasma cfDNA within the same patients, we employed multiple statistical methods to develop diagnostic and prognostic prediction models with selected methylation markers. We further compared the efficacy of methylation marker-based models and currently available approaches, such as AFP and TNM staging classification, in the diagnosis and prognosis of HCC in 1,098 HCC and 835 normal samples. These results show that ctDNA methylation analysis may provide reliable biomarkers in the diagnosis, surveillance, and prognosis of HCC.

Patient and sample characteristics
Clinical characteristics and molecular profiling data, including methylation data, for comparison between HCC and blood lymphocytes were assembled from sources including 377 HCC tumour samples from The Cancer Genome Atlas (TCGA) and 754 blood leukocyte samples of healthy control individuals from a data set used in our previous methylation study on ageing (GSE40279)15. To study ctDNA in HCC, plasma samples were obtained from Chinese patients with HCC and randomly selected healthy controls undergoing routine health care maintenance, resulting in a training cohort of 715 HCC patients and 560 normal healthy controls and a validation cohort of 383 HCC patients and 275 healthy controls. All participants provided written informed consent. Clinical characteristics of all patients and controls are listed in Supplementary Table 1.

Methylation markers for differentiating HCC and blood
We hypothesized that CpG markers with a maximal difference in methylation between HCC and blood leukocytes in normal individuals would be most likely to demonstrate detectable methylation differences in the cfDNA of HCC patients when compared to that of normal controls. We used the ‘moderated t-statistics’ method with Empirical Bayes for shrinking the variance16, and the Benjamini–Hochberg procedure17 to control the false discovery rate (FDR) at a significance level of 0.05 to identify the top 1,000 markers with the most significantly different rates of methylation (that is, those with the lowest p values) between HCC and normal blood. Unsupervised hierarchical clustering of these top 1,000 markers was able to distinguish between HCC and blood leukocytes in normal individuals (Supplementary Fig. 1). We designed molecular-inversion (padlock) probes corresponding to these 1,000 markers and tested them in 28 pairs of HCC tissue DNA and matched plasma ctDNA from the same patient. The methylation profiles in HCC tumour DNA and matched plasma ctDNA were consistent (Supplementary Fig. 2a,b). A total of 401 markers with a good experimental amplification profile and dynamic methylation range were selected for further analysis.

Methylation block structure for allele-calling accuracy
We employed the well-established concept of genetic linkage disequilibrium (LD block) to study the degree of co-methylation among different DNA strands18,19, with the underlying assumption that DNA sites in close proximity are more likely to be co-methylated than distant sites. We used paired-end Illumina sequencing reads to identify each individual methylation block (mBlock). We applied a Pearson correlation method to quantify co-methylation or mBlock20. We compiled all common mBlocks of a region by calculating different mBlock fractions (see Methods). We then partitioned the genome into blocks of tightly co-methylated CpG sites we termed methylation correlated blocks (MCBs), using an r² cutoff of 0.5. We then surveyed MCBs in cfDNA of 500 normal samples and found that MCBs are highly consistent. We next determined methylation levels within an MCB in the cfDNA from 500 HCC samples. We found a highly consistent methylation pattern in MCBs when comparing normal versus HCC cfDNA samples, which significantly enhanced allele-calling accuracy (Supplementary Fig. 3). This technique was employed in all subsequent sequencing analysis.

cfDNA diagnostic prediction for HCC
The methylation values of the 401 selected markers that showed good methylation ranges in cfDNA samples were analysed by Random Forest and Least Absolute Shrinkage and Selection Operator (LASSO) methods to further reduce the number of markers by modelling them in 715 HCC ctDNA and 560 normal cfDNA samples (Fig. 1, see Methods). We obtained 24 markers using the Random Forest analysis. We also obtained 30 markers using a LASSO analysis in which we required selected markers to appear over 450 times out of a total of 500 repetitions. There were ten overlapping markers between these two methods (Table 1). Using a logistic regression method, we constructed a diagnostic prediction model with these ten markers. Applying the model yielded a sensitivity of 85.7% and specificity of 94.3% for HCC in the training data set of 715 HCC and 560 normal samples (Fig. 2a) and a sensitivity of 83.3% and specificity of 90.5% in the validation data set of 383 HCC and 275 normal samples (Fig. 2b). We also demonstrated that this model could differentiate HCC from normal controls both in the training data set (AUC = 0.966) and the validation data set (AUC = 0.944) (Fig. 2c,d). Unsupervised hierarchical clustering of these ten markers was able to distinguish HCC from normal controls with high specificity and sensitivity (Fig. 2e,f and Supplementary Fig. 4).

We next assessed a combined diagnostic score (cd-score) of the model for differentiating between liver diseases (hepatitis B virus/hepatitis C virus (HBV/HCV) infection, and fatty liver) and HCC, since these liver diseases are known major risk factors for HCC. We found that the cd-score could differentiate HCC patients from those with liver diseases or healthy controls (Fig. 3a). These results were consistent and comparable with those predicted by AFP levels (Supplementary Fig. 5a).

Methylation markers predicted clinical outcomes
We next studied the utility of the cd-score in assessing treatment response, the presence of residual tumour following treatment, and staging of HCC. Clinical and demographic characteristics, such as age, gender, race, and American Joint Committee on Cancer (AJCC) stage, were included in the analysis. The cd-scores of patients with detectable residual tumour following treatment (n = 828) were significantly higher than those with no detectable tumour (n = 270), and both were significantly greater than normal controls (n = 835) (p < 0.0001, Fig. 3b). Similarly, cd-scores were significantly higher in patients before treatment (n = 109) or with progression (n = 381) compared to those with treatment response (n = 248) (p < 0.0001, Fig. 3c). In addition, cd-scores were significantly lower in patients with complete tumour resection after surgery (n = 170) compared with those before surgery (n = 109), yet were higher in patients with recurrence (n = 155) (p < 0.0001, Fig. 3d). Furthermore, there is good correlation between the cd-scores and tumour stage. Patients with early stage disease (I, II) had substantially lower cd-scores compared to those with advanced stage disease (III, IV) (p < 0.05, Fig. 3e). Collectively, these results suggest that the cd-score (that is, the amount of ctDNA in plasma) correlates well with tumour burden

[Figure 1 flowchart. Marker discovery: HCC (TCGA)/normal blood (GSE) methylation data; moderated t-statistics to select top 1,000 markers; padlock probe design; targeted bisulfite sequencing in 28 paired HCC tissue DNA and plasma cfDNA samples; 401 usable markers; targeted bisulfite sequencing in HCC patient/normal blood samples, n = 1,933 (1,098 HCC/835 normal). Diagnosis analysis: training data set n = 1,275 (715 HCC/560 normal); marker selection by LASSO (30 markers) and random forest (24 markers); diagnosis predictive model from the 10 overlapping markers; validation data set n = 658 (383 HCC/275 normal). Prognosis analysis: training data set n = 680 HCC; marker selection by univariate Cox and LASSO-Cox; prognosis predictive model with 8 markers; validation data set n = 369 HCC.]

Figure 1 | Workflow chart of data generation and analysis. Whole genome methylation data on HCC and normal lymphocytes were used to identify 401 candidate markers. Left panel: diagnostic marker selection: LASSO and random-forest analyses were applied to a training cohort of 715 HCC and 560 normal patients to identify a final selection of ten markers. These ten markers were applied to a validation cohort of 383 HCC and 275 normal patients. Right panel: prognostic marker selection: univariate Cox and LASSO-Cox were applied to a training cohort of 680 HCC patients with survival data to identify a final selection of eight markers. These eight markers were applied to a validation cohort of 369 HCC patients with survival data.
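The overlap step in the left panel can be illustrated with a short sketch. The main text describes keeping LASSO markers that appeared more than 450 times in 500 repetitions and intersecting them with the random-forest markers. The code below assumes each repetition has already produced a set of selected marker IDs; the function names, toy marker IDs and toy runs are hypothetical, and no LASSO or random-forest model is fitted here.

```python
from collections import Counter


def stable_markers(selections, min_count):
    """Return markers selected in more than `min_count` of the repetitions."""
    counts = Counter(marker for run in selections for marker in set(run))
    return {marker for marker, n in counts.items() if n > min_count}


def overlapping_markers(lasso_runs, forest_markers, min_count=450):
    """Intersect the stable LASSO markers with the random-forest marker set."""
    return stable_markers(lasso_runs, min_count) & set(forest_markers)


# Toy example with hypothetical marker IDs: 5 repetitions, threshold "more than 3".
toy_runs = [
    {"m1", "m2", "m3"},
    {"m1", "m2"},
    {"m1", "m3"},
    {"m1", "m2", "m4"},
    {"m1", "m2"},
]
toy_forest = {"m1", "m4", "m5"}

selected = overlapping_markers(toy_runs, toy_forest, min_count=3)
print(sorted(selected))
```

In the toy data only `m1` is both stable across repetitions and present in the second set, so it is the only marker retained.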

and may have utility in predicting tumour response and surveillance for recurrence.

Utility of ctDNA diagnostic prediction and AFP
Currently, the only blood biomarker for risk assessment and surveillance of HCC is serum AFP levels. However, its low sensitivity makes it inadequate to detect all patients that will develop HCC and severely limits its clinical utility. In fact, many cirrhotic patients develop HCC without any increase in AFP levels. Strikingly, 40% of patients in our HCC study cohort have a normal serum AFP (<25 ng ml⁻¹).

In biopsy-proven HCC patients, the cd-score demonstrated higher sensitivity and specificity than AFP for HCC diagnosis (AUC 0.969 versus 0.816, Fig. 3f). In patients with treatment response, tumour recurrence, or progression, the cd-score showed more significant changes relative to testing at initial diagnosis than AFP did (Supplementary Fig. 5b,c). In patients with serial samples, those with a positive treatment response had a concomitant significant decrease in cd-score compared to that prior to treatment, and there was an even further decrease in patients after surgery. By contrast, our patients with progressive or recurrent disease all had an increase in cd-score (Supplementary Fig. 6). By comparison, AFP was less sensitive for assessing treatment efficacy in individual patients (Supplementary Fig. 7). In addition, while cd-score correlated well with tumour stage (Supplementary Fig. 5d), particularly among patients with stage I, II and III, there was no significant


Table 1 | Characteristics of ten methylation markers and their coefficients in HCC diagnosis.

Markers       Ref Gene     Coefficients   SE      z value   p value
(Intercept)                15.595         2.395   6.513     <0.001
cg10428836    BMPR1A       11.543         0.885   −13.040   <0.001
cg26668608    PSD          4.557          0.889   5.129     <0.001
cg25754195    ARHGAP25     2.519          0.722   3.487     <0.001
cg05205842    KLF3         −3.612         0.954   −3.785    <0.001
cg11606215    PLAC8        6.865          1.095   6.271     <0.001
cg24067911    ATXN1        −5.439         0.868   −6.265    <0.001
cg18196829    Chr 6:170    −9.078         1.355   −6.698    <0.001
cg23211949    Chr 6:3      −5.209         1.081   −4.819    <0.001
cg17213048    ATAD2        6.660          1.422   4.683     <0.001
cg25459300    Chr 8:20     1.994          1.029   1.938     0.053

SE: standard errors of coefficients; z value: Wald z-statistic value.
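As a reading aid for the z value and p value columns, the Wald statistic is the coefficient divided by its standard error, and the two-sided p value follows from the standard normal distribution. The sketch below recomputes both for three rows of Table 1 using the printed (rounded) coefficients and standard errors, so results agree with the table only up to rounding. The function names are illustrative.

```python
import math


def wald_z(coef, se):
    """Wald z-statistic: coefficient divided by its standard error."""
    return coef / se


def two_sided_p(z):
    """Two-sided p value P(|Z| > |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))


# (coefficient, SE) as printed in Table 1 for three of the markers.
rows = {
    "cg26668608": (4.557, 0.889),
    "cg11606215": (6.865, 1.095),
    "cg25459300": (1.994, 1.029),
}

for marker, (coef, se) in rows.items():
    z = wald_z(coef, se)
    print(marker, round(z, 2), round(two_sided_p(z), 3))
```

For cg25459300 this gives a z value close to 1.94 and a p value just above 0.05, which is why that marker is the only one in the table not listed as p < 0.001.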

[Figure 2, panels a,b: confusion tables.]

a, Training data set    Real HCC   Real normal   Totals
Predict HCC             613        32
Predict normal          102        528
Totals                  715        560           1,275
Correct                 613        528           1,141
Sensitivity (%) 85.7; specificity (%) 94.3

b, Validation data set  Real HCC   Real normal   Totals
Predict HCC             319        26
Predict normal          64         249
Totals                  383        275           658
Correct                 319        249           568
Sensitivity (%) 83.3; specificity (%) 90.5

[Panels c,d: ROC curves, true positive rate against false positive rate. c, training data set (715 HCC/560 normal): AUC = 0.966, 95% CI 0.958–0.975. d, validation data set (383 HCC/275 normal): AUC = 0.944, 95% CI 0.928–0.961. Panels e,f: heat maps (methylation scale 0.0–1.0) with real status and predicted status tracks for the training (e) and validation (f) data sets; marker rows in order: cg26668608, cg11606215, cg24067911, cg05205842, cg18196829, cg23211949, cg17213048, cg25754195, cg25459300, cg10428836.]

Figure 2 | cfDNA methylation analysis of HCC diagnosis. a,b, Confusion tables of binary results of the diagnostic prediction model in the training (a) and validation data sets (b). c,d, ROC of the diagnostic prediction model with methylation markers in the training (c) and validation data sets (d). e,f, Unsupervised hierarchical clustering of ten methylation markers selected for use in the diagnostic prediction model in the training (e) and validation data sets (f).
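The sensitivity and specificity reported in panels a and b follow directly from the confusion-table counts. A minimal sketch of that arithmetic, using the counts shown in Figure 2 (the function and variable names are illustrative):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Counts from Figure 2a,b: HCC predicted as HCC (tp) or normal (fn),
# normal predicted as normal (tn) or HCC (fp).
training = dict(tp=613, fn=102, tn=528, fp=32)
validation = dict(tp=319, fn=64, tn=249, fp=26)

for name, counts in (("training", training), ("validation", validation)):
    sens, spec = sensitivity_specificity(**counts)
    print(f"{name}: sensitivity {sens:.1%}, specificity {spec:.1%}")
# training: sensitivity 85.7%, specificity 94.3%
# validation: sensitivity 83.3%, specificity 90.5%
```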

difference in AFP values in patients with different stages, except between patients with stage III and IV (Supplementary Fig. 5e), indicating an advantage of cd-score over AFP in differentiation of early stage HCC.

ctDNA prognostic prediction for HCC
We then investigated the potential of using methylation markers in ctDNA for prediction of prognosis in HCC in combination with clinical and demographic characteristics including age, gender, race, and AJCC stage. We randomly split the 1,049 HCC patients with complete survival information into training and validation data sets with an allocation of 2:1. We implemented UniCox and LASSO-Cox methods to reduce the dimensionality and constructed a Cox model to predict prognosis with an 8-marker panel (Table 2). We generated Kaplan–Meier curves in training and validation data sets using a combined prognosis score (cp-score) with these markers. The high-risk group (cp-score > −0.24) had 341 observations with 53 events in the training data set and 197 observations with 26 events in the validation data set; the low-risk group (cp-score ≤ −0.24) had 339 observations with 7 events in the training data set and 172 observations with 9 events in the validation data set. Median survival was significantly different in both the training set (p < 0.0001) and the validation set (p = 0.0014) by log-rank test (Fig. 4a,b).

Multivariable analysis showed that the cp-score was significantly correlated with risk of death both in the training

[Figure 3, panels a–e: cd-score (y axis, −10 to 20) by group. a: healthy controls, liver disease, HCC. b: normal controls, HCC with no tumour load, HCC with tumour load. c: normal controls, HCC before treatment, HCC with response, HCC with progression. d: normal controls, HCC before surgery, HCC after surgery, HCC with recurrence. e: normal controls, HCC of stage I, stage II, stage III and stage IV. Panel f: ROC curves, true positive rate against false positive rate; AUC of cd-score: 0.969 (0.926–0.977); AUC of AFP: 0.816 (0.792–0.839).]

Figure 3 | cfDNA methylation analysis and tumour burden, treatment response, and staging. a, The combined diagnosis score (cd-score) in healthy controls, individuals with liver diseases (HBV/HCV infection, and fatty liver) and HCC patients. b, cd-score in normal controls and HCC patients with and without detectable tumour burden. c, cd-score in normal controls, HCC patients before treatment, with treatment response, and with progression. d, cd-score in normal controls and HCC patients before surgery, after surgery, and with recurrence. e, cd-score in normal controls and HCC patients from stage I–IV. f, The ROC of cd-score and AFP for HCC diagnosis in whole HCC cohort.
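Panel f compares the two tests by area under the ROC curve. As a reminder of what that number measures, the AUC equals the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the scores are made up for illustration and are not study data, and the function name is hypothetical.

```python
def auc(case_scores, control_scores):
    """Rank-based AUC: P(case > control) + 0.5 * P(case == control)."""
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical scores for illustration only.
cases = [2.1, 0.4, 3.3, 1.8]
controls = [-1.0, 0.4, 0.9, -2.2]

print(auc(cases, controls))
```

The pairwise loop is quadratic in sample size, which is fine for a sketch; a sort-based rank formula would be the usual choice for cohorts of the size described here.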

Table 2 | Characteristics of eight methylation markers and their coefficients in HCC prognosis prediction.

Markers      Ref Gene     Coefficients   HR        CI                  SE       z value   p value
cg23461741   SH3PXD2A     −1.264         0.282     0.024–3.340         1.2604   −1.003    0.316
cg06482904   C11orf9      −0.247         0.781     0.067–9.100         1.2530   −0.197    0.844
cg25574765   PPFIA1       1.026          2.790     0.488–15.900        0.8894   1.153     0.249
cg07459019   Chr 17:78    −8.156         0.000     0.000–0.012         1.9112   −4.267    <0.001
cg20490031   SERPINB5     6.082          438.000   13.200–14,600.000   1.7885   3.400     0.001
cg01643250   NOTCH3       −5.368         0.005     0.000–0.140         1.7357   −3.093    0.002
cg11397370   GRHL2        1.497          4.470     1.030–19.400        0.7506   1.994     0.046
cg11825899   TMEM8B       2.094          8.120     0.957–68.900        1.0909   1.920     0.055

HR: Hazard Ratio; CI: 95.0% confidence interval; SE: standard errors of coefficients; z value: Wald z-statistic value.
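The HR and CI columns are related to the coefficient and SE columns in the usual way for a Cox model: the hazard ratio is the exponential of the coefficient, and a 95% interval is obtained by exponentiating the coefficient plus or minus 1.96 standard errors. The sketch below applies this to one row of Table 2 using the printed (rounded) values, so agreement with the table is only approximate. The function name and the 1.96 critical value are assumptions of this sketch.

```python
import math


def hazard_ratio(coef, se, z_crit=1.96):
    """Return (HR, lower, upper) from a Cox coefficient and its standard error."""
    hr = math.exp(coef)
    lower = math.exp(coef - z_crit * se)
    upper = math.exp(coef + z_crit * se)
    return hr, lower, upper


# cg11397370 (GRHL2): coefficient 1.497, SE 0.7506 as printed in Table 2.
hr, lower, upper = hazard_ratio(1.497, 0.7506)
print(f"HR {hr:.2f}, 95% CI {lower:.2f} to {upper:.1f}")
```

The same relationship explains the very wide interval for cg20490031: a large coefficient with a standard error near 1.8 spans several orders of magnitude once exponentiated.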

and validation data sets and that the cp-score was an independent risk factor of survival (hazard ratio [HR]: 2.405; 95% confidence interval [CI]: 1.904–3.038; p < 0.001 in the training set; HR: 1.548, CI: 1.246–1.924; p < 0.001 in the validation set, Supplementary Table 2). Interestingly, AFP was no longer significant as a risk factor when cp-score and other clinical characteristics were taken into account (Supplementary Table 2).

As expected, TNM stage predicted the prognosis of patients in our training and validation data sets (Supplementary Fig. 8a,b). However, the combination of cp-score and TNM staging significantly improved our ability to predict prognosis in both the training (AUC 0.7935, Fig. 4c) and validation data sets (AUC 0.7588, Fig. 4d). Kaplan–Meier curves also showed that patients separated by both cp-score and staging have significantly different prognosis (p < 0.0001, Fig. 4e). These results demonstrate that ctDNA methylation analysis may contribute to risk stratification and prediction of prognosis in patients with HCC. However, this application merits further investigation in an HCC population with longer clinical follow-up than we had access to for our study.

The finding that tumours shed nucleic acids (DNA and RNA) into the blood and can be used as a surrogate source of tumour DNA has opened an exciting new avenue in cancer diagnosis and prognosis21,22. Despite substantial variability in the somatic mutations of individual tumours (with some notable exceptions), methylation patterns turn out to be remarkably consistent. Methylation patterns detected in cfDNA therefore have the potential to be more reliable discriminatory tools for the detection and diagnosis of malignancy.

In this study, we first determined differentially methylated CpG sites between HCC tumour samples and blood leukocytes in normal individuals for an HCC-specific panel. We then constructed a diagnostic prediction model using a 10-methylation marker panel (cd-score) for use in cfDNA; the cd-score effectively discriminated patients with HCC from individuals with HBV/HCV infection and fatty liver as well as healthy controls. Given that patients with these liver diseases are the target screening population under current guidelines, it is essential that a serum test reliably distinguish these disease states from HCC. In our study, the sensitivity of the cd-score for HCC is comparable to liver ultrasound23, the current standard for HCC screening, markedly superior to AFP, and may represent a more cost-effective and less resource-intensive approach. Prospective clinical evaluation is warranted to compare or potentially combine ultrasound screening with cd-score. Furthermore, the cd-score


[Figure 4, panels a,b: Kaplan–Meier curves of survival probability (%) against overall survival (days, 0–600) for low-risk and high-risk groups. a: log-rank test p < 0.0001, hazard ratio = 0.15 (0.09–0.25). b: log-rank test p = 0.0014, hazard ratio = 0.32 (0.16–0.61). Panels c,d: ROC curves (axes labelled true negative rate and false negative rate in the figure). Reported AUCs: cp-score + stage 0.7935 (c) and 0.7588 (d); stage 0.651 (c) and 0.693 (d); cp-score 0.7533 and 0.675. Panel e: Kaplan–Meier curves for low risk + stage I/II, low risk + stage III/IV, high risk + stage I/II and high risk + stage III/IV; log-rank test p < 0.0001; hazard ratios 0.16 (0.09–0.29), 1.06 (0.58–1.96) and 0.22 (0.12–0.40).]

Figure 4 | cfDNA methylation analysis for prognostic prediction of HCC survival. a,b, Overall survival curves of HCC patients with low or high risk, according to the combined prognosis score (cp-score) in the training (a) and validation data sets (b). c,d, The ROC for the cp-score, stage, and cp-score combined with stage in the training (c) and validation data sets (d). e, Survival curves of HCC patients with combinations of cp-score risk and stage in the whole HCC cohort.
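For readers less familiar with the survival curves in panels a, b and e, the Kaplan–Meier estimate multiplies, at each observed death time, the fraction of at-risk patients who survive that time; censored patients leave the risk set without lowering the curve. The sketch below is a plain implementation on made-up follow-up data, not study data, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up (days) and event indicators for six patients.
toy_times = [100, 200, 200, 350, 500, 600]
toy_events = [1, 1, 0, 1, 0, 0]

for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 3))
```

In the toy data the curve steps down at days 100, 200 and 350; the patient censored at day 200 still counts as at risk for the death on that day, which is the usual convention.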

of our model showed high correlation with HCC tumour burden, treatment response, and stage, and is superior to the performance of AFP in our cohort. The cd-score may therefore be particularly useful for assessment of treatment response and surveillance for recurrence.

Additionally, we constructed a prognostic prediction model with an independent 8-marker panel and generated a combined prognosis score system (cp-score). The cp-score, which effectively distinguished HCC patients with significantly different prognosis, was validated as an independent prognostic risk factor in a multivariable analysis in our cohort and was again superior to AFP. This type of analysis may assist in the identification of patients for whom more or less aggressive treatment and surveillance is warranted. However, our study was limited by a relatively short clinical follow-up period. Further study is warranted with longer clinical surveillance, in particular to fully assess whether this score can meaningfully contribute to clinical decision making for patients.

By sequencing of bisulfite-converted cfDNA, we identified many previously unknown CpG markers differentially methylated in cancer versus normal plasma. Specifically, we employed a direct sequencing approach that captured the methylation status of adjacent CpG markers and found that the methylation of many adjacent markers is highly correlated with the initially targeted CpG, forming an MCB. A similar concept has been proposed before in which multiple adjacent CpG sites share a similar methylation pattern14,24–27. This information allowed us to identify additional markers and improve the accuracy of sequencing for determining significant methylation differences.
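The idea of grouping correlated adjacent CpG sites into blocks can be sketched in a few lines. The code below computes the squared Pearson correlation between neighbouring CpG sites across samples and starts a new block whenever r² falls below a cutoff (0.5 is the cutoff quoted earlier in the text). It is a simplified illustration, not the authors' pipeline: the adjacent-pair rule, the function names and the toy methylation values are all assumptions of this sketch.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def partition_blocks(sites, r2_cutoff=0.5):
    """Group consecutive CpG sites into blocks of correlated neighbours.

    sites[i] holds one methylation value per sample for CpG site i,
    with sites ordered by genomic position.
    """
    blocks = [[0]]
    for i in range(1, len(sites)):
        r2 = pearson_r(sites[i - 1], sites[i]) ** 2
        if r2 >= r2_cutoff:
            blocks[-1].append(i)   # extend the current block
        else:
            blocks.append([i])     # correlation too weak: start a new block
    return blocks


# Hypothetical methylation fractions: 4 CpG sites measured in 4 samples.
toy_sites = [
    [0.10, 0.20, 0.80, 0.90],
    [0.15, 0.25, 0.85, 0.80],
    [0.50, 0.10, 0.40, 0.20],
    [0.55, 0.05, 0.45, 0.25],
]

print(partition_blocks(toy_sites))
```

With the toy values, sites 0 and 1 vary together, as do sites 2 and 3, while sites 1 and 2 are nearly uncorrelated, so the sketch reports two blocks.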

Oncologists currently evaluate treatment response of HCC by imaging and AFP. Even with the modified Response Evaluation Criteria in Solid Tumours (mRECIST; ref. 28), there are often difficult cases in which data are inconsistent and determining response and prognosis of patients is challenging. AFP is a useful serum marker in many patients, but is limited by its poor sensitivity and has proven to be a less than ideal surrogate for monitoring treatment response of HCC (ref. 29), as demonstrated by others and consistent with our study. In contrast, our results showed that methylation markers of ctDNA have high sensitivity and specificity that correlate with tumour burden, stage, treatment response, and prognosis of HCC patients. Furthermore, because cfDNA has a relatively short half-life (about 2 h; ref. 30), the treatment plan can be adjusted relatively rapidly based on cfDNA measurements.

Some recent studies have reported that monitoring the somatic alterations in ctDNA can provide the earliest measure of treatment response in some solid cancers, including lung, colorectal and breast cancer (refs 31–34). In contrast to these approaches, our methylation markers do not first require identification of somatic mutations in an individual patient. Furthermore, based on targeted sequencing of specific markers, our method can avoid the high cost of deep sequencing, which may allow more routine and cost-effective application. Alternatively, it is intriguing to imagine the identification of a broad ‘pan-cancer’ methylation panel for use in cfDNA, possibly in synergy with somatic mutation analysis, that would allow pan screening for malignancy. Collectively, our study demonstrates the utility of cfDNA methylation analysis in the diagnosis, treatment evaluation, and prognosis of HCC, and represents a proof of concept for its use in solid malignancies broadly beyond HCC.

Methods
Methods, including statements of data availability and any associated accession codes and references, are available in the online version of this paper.

Received 26 July 2016; accepted 30 August 2017; published online 9 October 2017

References
1. Torre, L. A. et al. Global cancer statistics, 2012. CA Cancer J. Clin. 65, 87–108 (2015).
2. Bruix, J. & Sherman, M. AASLD Practice Guideline: Management of hepatocellular carcinoma. Hepatology 42, 1208–1236 (2005).
3. Johnson, P. Role of alpha-fetoprotein in the diagnosis and management of hepatocellular carcinoma. J. Gastroenterol. Hepatol. 14, S32–S36 (1999).
4. Stroun, M. et al. The origin and mechanism of circulating DNA. Ann. NY Acad. Sci. 906, 161–168 (2000).
5. Bettegowda, C. et al. Detection of circulating tumor DNA in early- and late-stage human malignancies. Sci. Transl. Med. 6, 224ra224 (2014).
6. Newman, A. M. et al. An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage. Nat. Med. 20, 548–554 (2014).
7. Esteller, M. Epigenetics in cancer. N. Engl. J. Med. 358, 1148–1159 (2008).
8. Baylin, S. B. & Jones, P. A. Epigenetic determinants of cancer. Cold Spring Harb. Perspect. Biol. 8, a019505 (2016).
9. Irizarry, R. A. et al. The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores. Nat. Genet. 41, 178–186 (2009).
10. Baylin, S. B. & Jones, P. A. A decade of exploring the cancer epigenome—biological and translational implications. Nat. Rev. Cancer 11, 726–734 (2011).
11. Board, R. E. et al. DNA methylation in circulating tumour DNA as a biomarker for cancer. Biomarker Insights 2, 307–319 (2008).
12. Warren, J. D. et al. Septin 9 methylated DNA is a sensitive and specific blood test for colorectal cancer. BMC Med. 9, 133 (2011).
13. Pishvaian, M. J. et al. A pilot study evaluating concordance between blood-based and patient-matched tumor molecular testing within pancreatic cancer patients participating in the Know Your Tumor (KYT) initiative. Oncotarget 7, 13225 (2016).
14. Lehmann-Werman, R. et al. Identification of tissue-specific cell death using methylation patterns of circulating DNA. Proc. Natl Acad. Sci. USA 113, 201519286 (2016).
15. Hannum, G. et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol. Cell 49, 359–367 (2013).
16. Smyth, G. K. Bioinformatics and Computational Biology Solutions Using R and Bioconductor 397–420 (Springer, 2005).
17. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. B 57, 289–300 (1995).
18. Reich, D. E. et al. Linkage disequilibrium in the human genome. Nature 411, 199–204 (2001).
19. Ardlie, K. G., Kruglyak, L. & Seielstad, M. Patterns of linkage disequilibrium in the human genome. Nat. Rev. Genet. 3, 299–309 (2002).
20. Hao, X. et al. DNA methylation markers for diagnosis and prognosis of common cancers. Proc. Natl Acad. Sci. USA 114, 7414–7419 (2017).
21. Aravanis, A. M., Lee, M. & Klausner, R. D. Next-generation sequencing of circulating tumor DNA for early cancer detection. Cell 168, 571–574 (2017).
22. Snyder, M. W., Kircher, M., Hill, A. J., Daza, R. M. & Shendure, J. Cell-free DNA comprises an in vivo nucleosome footprint that informs its tissues-of-origin. Cell 164, 57–68 (2016).
23. Singal, A. et al. Meta-analysis: surveillance with ultrasound for early-stage hepatocellular carcinoma in patients with cirrhosis. Aliment Pharmacol. Ther. 30, 37–47 (2009).
24. Butcher, L. M. et al. Non-CG DNA methylation is a biomarker for assessing endodermal differentiation capacity in pluripotent stem cells. Nat. Commun. 7, 10458 (2016).
25. Libertini, E. et al. Information recovery from low coverage whole-genome bisulfite sequencing. Nat. Commun. 7, 11306 (2016).
26. Burger, L., Gaidatzis, D., Schubeler, D. & Stadler, M. B. Identification of active regulatory regions from DNA methylation data. Nucleic Acids Res. 41, e155 (2013).
27. Guo, S. et al. Identification of methylation haplotype blocks aids in deconvolution of heterogeneous tissue samples and tumor tissue-of-origin mapping from plasma DNA. Nat. Genet. 49, 635–642 (2017).
28. Lencioni, R. & Llovet, J. M. Modified RECIST (mRECIST) assessment for hepatocellular carcinoma. Semin. Liver Dis. 30, 52–60 (2010).
29. Raoul, J. L. et al. Using modified RECIST and alpha-fetoprotein levels to assess treatment benefit in hepatocellular carcinoma. Liver Cancer 3, 439–450 (2014).
30. Diehl, F. et al. Circulating mutant DNA to assess tumor dynamics. Nat. Med. 14, 985–990 (2008).
31. Mok, T. et al. Detection and dynamic changes of EGFR mutations from circulating tumor DNA as a predictor of survival outcomes in NSCLC patients treated with first-line intercalated erlotinib and chemotherapy. Clin. Cancer Res. 21, 3196–3203 (2015).
32. Diaz, L. A. Jr. et al. The molecular evolution of acquired resistance to targeted EGFR blockade in colorectal cancers. Nature 486, 537–540 (2012).
33. Misale, S. et al. Emergence of KRAS mutations and acquired resistance to anti-EGFR therapy in colorectal cancer. Nature 486, 532–536 (2012).
34. Dawson, S. J. et al. Analysis of circulating tumor DNA to monitor metastatic breast cancer. N. Engl. J. Med. 368, 1199–1209 (2013).

Acknowledgements
The results published here are in part based upon data generated by the TCGA Research Network: http://cancergenome.nih.gov. We thank staff at Kang Zhang and Ruihua Xu laboratories for technical assistance. This study was funded by Richard Annesser Fund, Michael Martin Fund, Dick and Carol Hertzberg Fund, SYSUCC, Xijing Hospital, and West China Hospital.

Author contributions
W.Wei, M.K., W.Wang, H.L., K.F., W.S., S.Y., L.Z., H.Z., R.Z., Y.X., K.L., H.Cai, G.L., L.Z., R.-h.X., Z.Z., D.L., E.Z. and C.Z. performed the experiments; M.K., W.Wang, H.L., K.F., B.A.C., Q.Q., Q.Z., L.Z., R.-h.X., J.Z., X.F., J.-k.Z., Y.D., H.Carter, M.Y., W.Z., R.G. and X.H. collected and analysed the data. K.Z. and R.-h.X. conceived the project, designed the experiments, and wrote the manuscript. All authors discussed the results and reviewed the manuscript.

Additional information
Supplementary information is available in the online version of the paper. Reprints and permissions information is available online at www.nature.com/reprints. Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Correspondence and requests for materials should be addressed to R.-h.X. or K.Z.

Competing financial interests
The authors declare no competing financial interests.


Methods
Patient data. Tissue DNA methylation data were obtained from The Cancer Genome Atlas (TCGA). Complete clinical, molecular, and histopathological data sets are available at the TCGA website: https://tcga-data.nci.nih.gov/docs/publications/tcga. Individual institutions that contributed samples coordinated the consent process and obtained informed written consent from each patient in accordance with their respective institutional review boards.
A second independent Chinese cohort consisted of HCC patients at the Sun Yat-sen University Cancer Center in Guangzhou, Xijing Hospital in Xi’an and the West China Hospital in Chengdu, China. Those who presented with HCC from stage I–IV were selected and enrolled in this study. Patient characteristics and tumour features are summarized in Supplementary Table 1. The TNM staging classification for HCC is according to the 7th edition of the AJCC cancer staging manual (ref. 35). The TNM Staging System is one of the most commonly used tumour staging systems. This system was developed and is maintained by the American Joint Committee on Cancer (AJCC) and adopted by the Union for International Cancer Control (UICC). The TNM classification system was developed as a tool for oncologists to stage different types of cancer based on certain standard criteria. The TNM Staging System is based on the extent of the tumour (T), the extent of spread to the lymph nodes (N), and the presence of metastasis (M). This project was approved by the Institutional Review Boards (IRBs) of Sun Yat-sen University Cancer Center, Xijing Hospital, and West China Hospital. Informed consent was obtained from all patients. Tumour and normal tissues were obtained as clinically indicated for patient care and were retained for this study. Human blood samples were collected by venipuncture, and plasma samples were obtained by taking the supernatant after centrifugation and stored at −80 °C before cfDNA extraction.

Cell-free DNA extraction from plasma samples. We used a minimal plasma volume of 1.5 ml per sample throughout our study, chosen by investigating the minimal volume of plasma that gives consistent cfDNA recovery and reliable sequencing coverage, defined as more than 20 reads for a target cg marker. EliteHealth cfDNA extraction Kit (EliteHealth, Guangzhou Youze, China) was used for cell-free DNA extraction. More detailed information is described in the Supplementary Information.

Bisulfite conversion of genomic DNA. 10–15 ng of cfDNA was converted to bis-DNA using EZ DNA Methylation-Lightning Kit (Zymo Research) according to the manufacturer’s protocol. The efficiency of bisulfite conversion was >99.8%, as verified by deep sequencing of bis-DNA and analysing the ratio of C to T conversion of CH (non-CG target-captured) dinucleotides.

Determination of DNA methylation levels by deep sequencing of bis-DNA target-captured with molecular-inversion (padlock) probes. CpG markers whose methylation levels significantly differed in any of the comparisons between any cancer tissue and any normal tissue in the TCGA data set were used to design padlock probes for capture and sequencing of cfDNA. Padlock capture of bis-DNA was based on published methods with modifications (refs 36–38). We used a two-step approach wherein the first step is to identify optimal cg markers with the largest methylation beta value difference between HCC tissue and normal blood leukocytes; the second step is to validate these top cg markers using cfDNA from the plasma samples of HCC patients and normal controls. Because of a relatively modest total size of captured regions/cg markers, this approach offers much lower cost of sequencing than any current methods, including whole methylome-wide sequencing, therefore enabling us to evaluate a large number of samples. Furthermore, our direct targeted sequencing approach offers digital readout, and requires much less starting cfDNA material (10–15 ng) than more traditional methods based on hybridization on a chip (for example, Infinium, Illumina) or target-enrichment by hybridization (for example, SureSelect, Agilent). This approach is also less sensitive to unequal amplification as it utilizes unique molecular identifiers (UMIs).

Probe design and synthesis. Padlock probes were designed using the ppDesigner software (ref. 37). The average length of the captured region was 100 bp, with the CpG marker located in the central portion of the captured region. Linker sequence between arms contained binding sequences for amplification primers separated by a variable stretch of Cs to produce probes of equal length. We incorporated a 6-bp UMI sequence in probe design to allow for the identification of unique individual molecular capture events and accurate scoring of DNA methylation levels. Padlock probe sequence information on the final ten diagnostic markers and eight prognostic markers is listed in Supplementary Table 4.
Probes were synthesized as separate oligonucleotides using standard commercial synthesis methods (ITD). For capture experiments, probes were mixed, in vitro phosphorylated with T4 PNK (NEB) according to manufacturer’s recommendations, and purified using P-30 Micro Bio-Spin columns (Bio-Rad).

Sequencing data analysis. Mapping of sequencing reads was done using the software tool bisReadMapper with some modifications (ref. 37). First, UMIs were extracted from each sequencing read and appended to read headers within FASTQ files using a custom script. Reads were on-the-fly converted as if all C were non-methylated and mapped to in-silico converted DNA strands of the human genome, also as if all C were non-methylated, using Bowtie2 (ref. 39). Original reads were merged and filtered for a single UMI; that is, duplicate reads carrying the same UMI were discarded, leaving a single unique read. Methylation frequencies were calculated for all CpG dinucleotides contained within the regions captured by padlock probes by dividing the number of unique reads carrying a C at the interrogated position by the total number of reads covering that position.

Identification of methylation correlated blocks (MCBs). Pearson correlation coefficients between methylation frequencies of each pair of CpG markers separated by no more than 200 bp were calculated separately across 50 cfDNA samples from each of the two diagnostic categories, that is, normal healthy blood and HCC. A value of Pearson’s r < 0.5 was used to identify transition spots (boundaries) between any two adjacent markers indicating uncorrelated methylation. Markers not separated by a boundary were combined into MCBs. This procedure identified a total of ∼1,550 MCBs in each diagnostic category within our padlock data, combining between 2 and 22 CpG positions in each block. Methylation frequencies for entire MCBs were calculated by summing up the numbers of Cs at all interrogated CpG positions within an MCB and dividing by the total number of C+Ts at those positions.

Data availability. Raw beta value data for ten diagnostic markers are listed in Supplementary Table 5 (Pages 15–81); raw beta value data for eight prognostic markers are listed in Supplementary Table 6 (Pages 82–118). Key raw data were also verified and uploaded onto the Research Data Deposit public platform (www.researchdata.org.cn) with an approval number RDDB2017000132.

References
35. Edge, S. B. & Compton, C. C. The American Joint Committee on Cancer: the 7th edition of the AJCC cancer staging manual and the future of TNM. Ann. Surg. Oncol. 17, 1471–1474 (2010).
36. Porreca, G. J. et al. Multiplex amplification of large sets of human exons. Nat. Methods 4, 931–936 (2007).
37. Diep, D. et al. Library-free methylation sequencing with bisulfite padlock probes. Nat. Methods 9, 270–272 (2012).
38. Deng, J. et al. Targeted bisulfite sequencing reveals changes in DNA methylation associated with nuclear reprogramming. Nat. Biotechnol. 27, 353–360 (2009).
39. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
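
The UMI filtering and per-CpG methylation frequency calculation described under "Sequencing data analysis" can be illustrated with a minimal Python sketch. This is not the authors' custom script or bisReadMapper. The function name `methylation_frequencies`, the `(umi, cpg_position, base)` input tuples, and the choice to keep the first read seen for each UMI are assumptions made purely for illustration. The sketch also assumes that, after bisulfite conversion, a methylated cytosine reads as C and an unmethylated one as T.

```python
from collections import defaultdict


def methylation_frequencies(reads):
    """Return {cpg_position: methylation frequency} from UMI-tagged calls.

    reads: iterable of (umi, cpg_position, base) tuples (hypothetical
    input format), where base is the call at the interrogated CpG:
    'C' = methylated, 'T' = unmethylated after bisulfite conversion.
    """
    seen = set()                          # (position, UMI) pairs already counted
    counts = defaultdict(lambda: [0, 0])  # position -> [C reads, total reads]
    for umi, pos, base in reads:
        key = (pos, umi)
        if key in seen:
            continue                      # duplicate capture event: discard
        seen.add(key)                     # assumption: the first read per UMI is kept
        if base not in ("C", "T"):
            continue                      # ignore uninformative base calls
        counts[pos][1] += 1
        if base == "C":
            counts[pos][0] += 1
    # frequency = unique reads carrying C / unique reads covering the position
    return {pos: c / total for pos, (c, total) in counts.items()}


# Hypothetical calls: the second read repeats a UMI and is dropped.
example_reads = [
    ("AACGTT", "chr1:100", "C"),
    ("AACGTT", "chr1:100", "C"),
    ("GGTTAA", "chr1:100", "T"),
    ("CCATGG", "chr1:100", "C"),
    ("AACGTT", "chr1:250", "T"),
]
example_freqs = methylation_frequencies(example_reads)
```

Deduplicating on the pair (position, UMI) rather than on the UMI alone reflects the fact that a 6-bp UMI offers only 4,096 distinct sequences, so the same UMI can legitimately recur at different capture targets.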
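
The MCB procedure described under "Identification of methylation correlated blocks (MCBs)" can likewise be sketched in a few lines of Python. This is one possible reading of the text, not the authors' implementation. The names `pearson_r`, `find_mcbs` and `mcb_frequency` are hypothetical. The sketch makes three assumptions: a gap of more than 200 bp between adjacent markers is treated as a boundary, a marker with constant methylation across samples is treated as uncorrelated, and single-marker blocks are dropped because the text reports 2 to 22 CpG positions per block.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant marker counts as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def find_mcbs(positions, freqs, max_gap=200, r_min=0.5):
    """Group adjacent CpG markers into blocks of correlated methylation.

    positions: sorted coordinates of markers on one chromosome.
    freqs: freqs[i] lists marker i's methylation frequency per sample.
    Returns lists of marker indices, one list per block.
    """
    blocks, current = [], [0]
    for i in range(1, len(positions)):
        close = positions[i] - positions[i - 1] <= max_gap
        if close and pearson_r(freqs[i - 1], freqs[i]) >= r_min:
            current.append(i)       # no boundary: extend the current block
        else:
            blocks.append(current)  # r < r_min or gap too large: boundary
            current = [i]
    blocks.append(current)
    return [b for b in blocks if len(b) >= 2]  # assumption: drop singletons


def mcb_frequency(c_counts, t_counts, block):
    """Block-level frequency: sum of Cs divided by sum of Cs and Ts."""
    c = sum(c_counts[i] for i in block)
    t = sum(t_counts[i] for i in block)
    return c / (c + t)
```

In the study the correlations were computed separately within each diagnostic category (50 normal and 50 HCC cfDNA samples), so under this sketch `find_mcbs` would be called once per category with that category's frequency lists.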
