Systematic autoantigen analysis identifies a distinct ... · Scleroderma is a chronic autoimmune...

Systematic autoantigen analysis identifies a distinct subtype of scleroderma with coincident cancer

George J. Xu a,b,c,d,e,1, Ami A. Shah f,1, Mamie Z. Li c,d,e, Qikai Xu c,d,e, Antony Rosen f, Livia Casciola-Rosen f,2,3, and Stephen J. Elledge c,d,e,2,3

a Program in Biophysics, Harvard University, Cambridge, MA 02115; b Division of Health Sciences and Technology, Harvard–Massachusetts Institute of Technology, Cambridge, MA 02139; c Division of Genetics, Department of Medicine, Brigham and Women’s Hospital, Boston, MA 02115; d Howard Hughes Medical Institute, Brigham and Women’s Hospital, Boston, MA 02115; e Department of Genetics, Program in Virology, Harvard University Medical School, Boston, MA 02115; and f Division of Rheumatology, Johns Hopkins University School of Medicine, Baltimore, MD 21224

Contributed by Stephen J. Elledge, October 4, 2016 (sent for review August 10, 2016; reviewed by Joshua LaBaer and Hidde L. Ploegh)

Scleroderma is a chronic autoimmune rheumatic disease associated with widespread tissue fibrosis and vasculopathy. Approximately two-thirds of all patients with scleroderma present with three dominant autoantibody subsets. Here, we used a pair of complementary high-throughput methods for antibody epitope discovery to examine patients with scleroderma with or without known autoantibody specificities. We identified a specificity for the minor spliceosome complex containing RNA Binding Region (RNP1, RNA recognition motif) Containing 3 (RNPC3) that is found in patients with scleroderma without known specificities and is absent in unrelated autoimmune diseases. We found strong evidence for both intra- and intermolecular epitope spreading in patients with RNA polymerase III (POLR3) and the minor spliceosome specificities. Our results demonstrate the utility of these technologies in rapidly identifying antibodies that can serve as biomarkers of disease subsets in the evolving precision medicine era.

autoimmunity | PLATO | PhIP-Seq | systemic sclerosis

Significance

In this study, we created a barcoded whole-genome ORF mRNA display library and combined it with phage-immunoprecipitation sequencing to look for autoantibodies in sera from patients with scleroderma who also had coincident cancer without a known autoantibody biomarker. Using these two technologies, we found that 25% of these patients had autoantibodies to RNA Binding Region Containing 3 (RNPC3) and multiple other components of the minor spliceosome. There was evidence of intra- and intermolecular epitope spreading within RNPC3 and the complex. These combined technologies are highly effective for rapidly discovering autoantibodies in patient subgroups, which will be useful tools for patient stratification and discovery of pathogenic pathways.

Author contributions: G.J.X., A.A.S., Q.X., A.R., L.C.-R., and S.J.E. designed research; G.J.X., A.A.S., and M.Z.L. performed research; G.J.X., A.A.S., Q.X., A.R., and L.C.-R. analyzed data; and G.J.X., A.A.S., A.R., L.C.-R., and S.J.E. wrote the paper.

Reviewers: J.L., Arizona State University; and H.L.P., Whitehead Institute for Biomedical Research.

The authors declare no conflict of interest.

1 G.J.X. and A.A.S. contributed equally to this work.
2 L.C.-R. and S.J.E. contributed equally to this work.
3 To whom correspondence may be addressed. Email: [email protected] or [email protected].

E7526–E7534 | PNAS | Published online November 7, 2016 | www.pnas.org/cgi/doi/10.1073/pnas.1615990113

Autoimmune diseases occur when the body’s immune system attacks the body’s own tissues. Although many autoimmune disorders may operate through similar general mechanisms, they differ in the initiating event and the tissue target of the immune response responsible for the patients’ symptoms. For example, rheumatoid arthritis attacks joints, inflammatory bowel disease attacks the lining of the intestines, and multiple sclerosis attacks nerve cells. In addition, many autoimmune diseases may appear similar but have distinct underlying etiologies. Understanding and diagnosing autoimmune diseases benefit from precise knowledge of the targets of the immune system, which can serve both diagnostic and, potentially, therapeutic purposes. For these reasons, we and others have developed methods to identify targets of autoimmune disorders (1–4).

Here, we present an approach to identifying epitopes targeted by the humoral autoimmune response in scleroderma using a pair of complementary high-throughput antigen discovery technologies, phage-immunoprecipitation sequencing (PhIP-Seq) and parallel analysis of translated ORFs (PLATO). Both PhIP-Seq and PLATO techniques identify the antigen targets of serum antibodies by immunoprecipitation in the presence of a library of potential protein antigens, each of which is associated with the nucleic acid encoding it. High-throughput sequencing of the mixture of DNA molecules before and after immunoprecipitation reveals which protein antigens were enriched due to binding by antibodies in the sample.

PhIP-Seq and PLATO differ in the method of protein display and the type of DNA encoding the displayed protein. In PhIP-Seq, the DNA oligonucleotides encode 90-aa protein fragments tiling through the entire human proteome with a 45-aa overlap; these oligonucleotides are displayed on the surface of the bacteriophage as fusions to its coat protein. Although this approach offers a complete, uniform representation of the human proteome, a major limitation of this approach is that antibodies recognizing discontinuous epitopes may not be captured. This limitation is overcome in PLATO, where the DNA is a collection of full-length ORFs expressed using ribosome display. However, the PLATO library does not contain all human proteins. A final limitation of both approaches is the lack of native cotranslational/posttranslational modification of proteins.

In this work, we also describe technical improvements to both PhIP-Seq and PLATO (Materials and Methods). For PLATO, we added DNA barcoding to the library of ORFs to simplify the workflow greatly and increase the accuracy of detection. DNA barcoding enables simple sequencing library preparation through RT-PCR with one primer pair instead of the DNA shearing and polyA priming used in the original PLATO assay. In addition, the simple PCR step is more efficient and less prone to bias.

We applied this approach to study the humoral autoimmune response in scleroderma, a chronic autoimmune rheumatic disease associated with widespread tissue fibrosis and vasculopathy. Autoantibodies are found in >95% of patients who have scleroderma, with the three most common specificities being anticentromere, antitopoisomerase 1, and anti-RNA polymerase III (anti-POLR3). It has long been recognized that autoantibodies have diagnostic and prognostic utility across the spectrum of autoimmune rheumatic diseases, and may be associated with distinct clinical phenotypes. For example, it is recommended that patients with scleroderma with anticentromere antibodies be monitored for evidence of pulmonary arterial hypertension, whereas patients with scleroderma with antitopoisomerase 1 antibodies should be monitored for pulmonary fibrosis (5). Stratifying patients by antibody status highlights the need for discovery of additional specificities and novel strategies to accomplish this task.

The recent observation that scleroderma and cancer can present synchronously, particularly in patients with autoantibodies to POLR3 (6, 7), prompted studies to define whether cancer and scleroderma might be mechanistically related in this subgroup of patients. Somatic mutation and loss of heterozygosity (LOH) within the gene (POLR3A) encoding the 155-kDa large subunit of RNA polymerase III occur frequently in cancers from patients with scleroderma with anti-POLR3 autoantibodies (8). Furthermore, these patients had CD4 T cells that preferentially recognize the mutated epitopes in the POLR3A subunit, suggesting that their immune response may initially be directed against the mutation-associated neoepitope, thereby enabling a B-cell response to the wild-type antigen. These data strongly suggest a model of cancer-induced autoimmunity in which mutated autoantigens serve as immunogens that may initiate disease similar to what has been suggested with other paraneoplastic neurological disorders in which tumors are thought to drive autoimmunity (9, 10). In those studies, two additional observations were made. First, ∼85% of patients with scleroderma and anti-POLR3 autoantibodies do not have a known cancer diagnosis. This observation raised the question of whether the immune response recognizes different components of the POLR3 complex in anti-POLR3 antibody–positive patients with cancer compared with those patients without cancer. Second, patients with scleroderma and cancer who did not have antibodies against any of the three major antigen specificities in scleroderma defined to date (centromere, topoisomerase-1, and POLR3; subsequently referred to as “anti-CTP antibody–negative”) frequently have a short cancer-scleroderma interval.
Improved understanding of the mechanisms at the cancer–autoimmunity interface will come from defining whether additional targets of autoantibodies exist in anti-POLR3–positive patients without cancer, and whether CTP antibody-negative patients with a short cancer-scleroderma interval have as yet unknown immune responses.

In this study, we demonstrate the particular utility of PhIP-Seq for detecting autoantigens that are targeted during intramolecular epitope spreading; that is, we show that the assay is able to detect a pattern of autoantibody specificities that is consistent with epitope spreading over the course of disease. Furthermore, using these complementary methods, we report the identification of antibodies directed against four different components of the minor spliceosome complex [RNA Binding Region (RNP1, RNA recognition motif) Containing 3 (RNPC3), small nuclear ribonucleoprotein U11/U12 subunit 25 (SNRNP25), small nuclear ribonucleoprotein U11/U12 subunit 35 (SNRNP35), and Programmed Cell Death 7 (PDCD7)] in sera from patients with CTP-negative cancer-associated scleroderma. In addition, we report on extensive, previously undetected epitope spreading among additional components of the POLR3 protein complex. These studies reveal a subset of patients with scleroderma with a reproducible, robust autoimmune signature that indicates a distinct subtype of the disease with an underlying cause that may differ from other forms of the disease. Furthermore, we suspect that this extensive epitope spreading may play a role in immune control of cancers in general (9).
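The PhIP-Seq library design described in the Introduction (90-aa fragments tiling each protein with a 45-aa overlap) can be illustrated with a short sketch. This is a minimal illustration, not the authors' library-design code: the function name `tile_protein`, the toy sequence, and the handling of the final, shorter fragment are all assumptions made for the example.

```python
def tile_protein(seq, length=90, overlap=45):
    """Split a protein sequence into overlapping peptide tiles.

    Returns (start, peptide) pairs with 0-based start positions. The last
    tile may be shorter than `length`; how a real library treats the
    C terminus is not stated in the text, so this behavior is an assumption.
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    step = length - overlap
    last_start = max(len(seq) - overlap, 1)
    return [(start, seq[start:start + length])
            for start in range(0, last_start, step)]


# Toy 200-residue sequence (hypothetical, for illustration only).
protein = "ACDEFGHIKLMNPQRSTVWY" * 10
tiles = tile_protein(protein)
for start, peptide in tiles:
    print(start, len(peptide))
```

With a 45-aa step and 90-aa tiles, any contiguous stretch of up to 45 residues lies wholly within at least one tile, which is why linear epitopes are well covered while discontinuous epitopes that depend on distant parts of the folded protein may be missed.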

Results

PhIP-Seq Detects Autoantibodies Directed Against Different Subunits of the POLR3 Complex. We used PhIP-Seq to assay serum samples from 32 patients with scleroderma with POLR3 antibodies [comprising 18 POLR3-positive patients with coincident cancer (median cancer-scleroderma interval of −1 mo) and 14 POLR3-positive patients without cancer] and 16 patients with scleroderma without CTP autoantibodies, all with a short cancer-scleroderma interval (median interval of 1.95 y). The presence of POLR3 antibodies was defined using a commercially available ELISA (Inova Diagnostics).

Using PhIP-Seq, we detected autoantibodies against at least one subunit of POLR3 in 24 of 32 (75%) samples that had known anti-POLR3 autoantibodies and in none of 16 samples that were previously determined to be negative for POLR3A antibodies (i.e., in the CTP-negative group; Fig. 1A). Interestingly, although all 32 serum samples in the anti-POLR3–positive group had anti-POLR3A antibodies as assayed by (i) ELISA and (ii) immunoprecipitation of POLR3A generated by in vitro transcription and translation (IVTT) from cDNA encoding full-length POLR3A, PhIP-Seq identified autoantibodies against this subunit in only three of 32 (9.4%) samples (Fig. 1A), suggesting that the patient response to POLR3A primarily consists of conformational, as opposed to linear, epitopes.

PhIP-Seq also identified autoantibodies against other components of the POLR3 complex, specifically POLR3F in 16 of 32 (50%) serum samples and POLR3H in seven of 32 (22%) serum samples, that had not been previously detected in these scleroderma samples. Subsequent immunoprecipitations performed using [35S]methionine-labeled proteins generated by IVTT showed anti-POLR3F in 18 of 32 (56%) and anti-POLR3H in 29 of 32 (91%) serum samples, confirming the presence of these autoantibodies in most anti-POLR3A–positive patients (Fig. 1). These results demonstrate likely epitope spreading of the autoantibody response within the POLR3 complex (11). Antibodies against POLR3F were identified in 15 patients by both PhIP-Seq and immunoprecipitation. In addition, one patient was positive by PhIP-Seq only, and three patients were positive by immunoprecipitation only (87.5% agreement; kappa = 0.75, P < 0.0001). In contrast, although seven patients were positive for anti-POLR3H by both PhIP-Seq and immunoprecipitation, an additional 22 anti-POLR3H–positive patients were identified by immunoprecipitation only (29.0% agreement; kappa = 0.04, P = 0.215).

In this initial group, we examined whether anti-POLR3 autoantibody specificities differed by cancer status. For these analyses, we considered patients as anti-POLR3F–positive and anti-POLR3H–positive if they were positive by PhIP-Seq, by immunoprecipitation, or by either assay method. The prevalence of autoantibodies to these two subunits did not differ by cancer status. Similarly, there was no difference in the frequency of patients who were double-positive, double-negative, or positive for only one subunit alone by cancer status. There were also no differences in age at scleroderma onset, gender, race, cutaneous subtype, first or maximum modified Rodnan skin score, renal crisis, myopathy, or restrictive lung disease by epitope status. These data suggest that the breadth of the POLR3 immune response is not the major determinant for the emergence of cancer, indicating that other mechanisms (immune or cancer-related) should be considered.
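The agreement statistics quoted above for POLR3F can be reproduced from the stated counts. The sketch below assumes a 2×2 table of PhIP-Seq versus IVTT immunoprecipitation calls; the count of 13 double-negative samples is inferred as 32 − 15 − 1 − 3 rather than stated in the text, and the function name `cohen_kappa` is illustrative.

```python
def cohen_kappa(both_pos, only_a, only_b, both_neg):
    """Return (observed agreement, Cohen's kappa) for two binary assays."""
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + only_a) / n   # fraction positive by assay A
    b_pos = (both_pos + only_b) / n   # fraction positive by assay B
    # Agreement expected by chance if the two assays were independent.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return observed, (observed - expected) / (1 - expected)


# POLR3F: 15 positive by both assays, 1 by PhIP-Seq only,
# 3 by immunoprecipitation only, 13 by neither (inferred count).
agreement, kappa = cohen_kappa(both_pos=15, only_a=1, only_b=3, both_neg=13)
print(f"agreement = {agreement:.1%}, kappa = {kappa:.2f}")
# prints "agreement = 87.5%, kappa = 0.75"
```

Kappa corrects raw agreement for the agreement expected by chance, which is why two assays can agree on some samples and still have a kappa near zero when one assay calls nearly every sample positive.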

PhIP-Seq Identifies Autoantibody Specificities. We next sought to identify novel autoantibody specificities in the set of samples from anti-CTP antibody–negative patients. To accomplish this goal, we identified autoantibody specificities that were observed more frequently in the 16 serum samples without known autoantibody specificities compared with the 32 samples with autoantibodies against POLR3. Of the 20 putative autoantigens detected in the PhIP-Seq screen with the lowest q values (q = 0.054 and lower; Table 1), we focused on two for the following reasons. First, because PhIP-Seq is performed using protein fragments rather than full-length proteins, we are able to discern the presence of multiple antibodies that recognize distinct epitopes on the same protein. Using this feature as the selection criterion, we prioritized RNPC3 for validation and follow-up analysis. Second, knowing that PhIP-Seq can identify antibodies against different components of a complex (as discussed in POLR3 findings above), we sought such components in the list of candidates (Table 1); this approach highlighted PDCD7 and RNPC3, which are both components of the minor spliceosome complex.
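Table 1 ranks candidates by a false discovery rate derived from Fisher's exact P values. As a rough illustration of the underlying 2×2 comparison only (the permutation-based q-value calculation is not reproduced here), the sketch below computes a one-sided Fisher's exact P value from the hypergeometric distribution. The choice of a one-sided test and the function name `fisher_exact_one_sided` are assumptions, so the result need not match the values the authors used.

```python
from math import comb


def fisher_exact_one_sided(a, b, c, d):
    """One-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Returns the probability, under the hypergeometric null, of observing at
    least `a` positives in the first group given the table margins.
    """
    group1 = a + b        # size of the first group
    positives = a + c     # total positives across both groups
    total = a + b + c + d
    upper = min(group1, positives)
    tail = sum(
        comb(group1, k) * comb(total - group1, positives - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, positives)


# RNPC3 row of Table 1: 4 of 16 anti-CTP antibody-negative sera positive
# versus 0 of 32 sera with POLR3 autoantibodies.
p_rnpc3 = fisher_exact_one_sided(4, 12, 0, 32)
print(f"one-sided P = {p_rnpc3:.4f}")
```

Only the Python standard library is used, so exact integer arithmetic is kept until the final division.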


Using PhIP-Seq, we detected autoantibodies against RNPC3 in four of 16 (25%) of the anti-CTP antibody–negative sera and in none of 32 samples with POLR3 antibodies (Fig. 2A and Table 1). This specificity was not detected in any of our previous PhIP-Seq screens performed on 123 serum samples, consisting of 44 from healthy donors, 35 from patients with IgG4-related autoimmune disease, and 44 from patients with dermatomyositis. Thus, these autoantibodies appear to be a specific marker of scleroderma. In addition, we detected autoantibodies against the same four consecutive protein fragments in all four samples, providing strong evidence of epitope spreading within the protein (Fig. 2A). Of note, this region is not repetitive, and the four fragments do not share any stretch of four or more amino acids. Thus, the observed pattern of autoantibodies is indicative of intramolecular epitope spreading, suggesting there is a true autoimmune response.

RNPC3 is a member of the minor spliceosome complex that participates in removal of U12-type introns from pre-mRNA (12, 13). To confirm that these autoantibodies could be detected by standard methods, we performed immunoprecipitations on RNPC3 expressed by IVTT on the entire group of 48 samples. The results were fully consistent with the PhIP-Seq data: There were four anti-RNPC3–positive serum samples, and these samples were the same as those anti-RNPC3–positive serum samples detected using PhIP-Seq (Fig. 2A).

Because PDCD7 was identified by PhIP-Seq, and is a component of the minor spliceosome complex, we proceeded to validate it using a similar IVTT immunoprecipitation approach. Of note, PhIP-Seq identified PDCD7 antibodies in three of the four serum samples with autoantibodies against RNPC3 (FW-1089, FW-0446, and FW-1864; Fig. 2 A and B); the antibodies recognized the same PDCD7 peptide in all three, as well as a consecutive peptide in two of the three. By immunoprecipitation using [35S]methionine-labeled PDCD7 as a target, we confirmed that these antibodies were found in four of 16 (25%) of the anti-CTP antibody–negative serum samples and in none of 32 samples with POLR3 antibodies. Thus, the PhIP-Seq and IVTT immunoprecipitation data for PDCD7 antibodies were consistent, with the exception of serum FW-1782, in which PhIP-Seq did not detect these antibodies. As was the case with POLR3 antibodies, our data with RNPC3 and PDCD7 antibodies demonstrate that in addition to intramolecular epitope spreading within the RNPC3 protein, intermolecular epitope spreading within the minor spliceosome complex commonly occurs.

To validate our RNPC3 results, we performed IVTT immunoprecipitation on an independent set of 123 anti-CTP–negative samples and identified 14 additional anti-RNPC3–positive samples. We performed another PhIP-Seq screen on these samples and the four original anti-RNPC3–positive samples (18 total samples) to determine if any additional autoantibody specificities are associated with this subclass. To identify unique autoantibodies, we compared the results with 123 control samples, including samples from healthy donors and patients with other autoimmune diseases, such as IgG4-related disease and dermatomyositis (Table 2). In addition to RNPC3, we frequently detected autoantibodies against the protein product of SNRNP48, which, like RNPC3 and PDCD7, is a component of the minor spliceosome complex. Finally, to determine whether anti-RNPC3 autoantibodies are specific to scleroderma, we also performed IVTT immunoprecipitation and did not detect anti-RNPC3 autoantibodies in any of 25 healthy controls or 102 patients with dermatomyositis.

Fig. 1. Detection of autoantibodies to POLR3 subunits in sera from patients with scleroderma. (A) Chart representing data obtained by PhIP-Seq. Each column represents a patient serum sample screened using the PhIP-Seq assay. Horizontal bars above the chart indicate whether the patients have autoantibodies to POLR3 or are anti-CTP antibody–negative (defined as negative for anticentromere, antitopoisomerase I, and anti-POLR3 antibodies). Each row is a gene that encodes a subunit of the POLR3 complex. Filled and unfilled cells denote the presence or absence, respectively, of autoantibodies to specific protein fragments for each serum sample tested by PhIP-Seq. (B) Representative immunoprecipitation data obtained using eight of the scleroderma serum samples in A and sera from healthy donors. The [35S]methionine-labeled POLR3A, POLR3F, and POLR3H proteins, generated by IVTT, were immunoprecipitated with the indicated patient and control sera, and the immunoprecipitates were visualized by autoradiography. (C) Chart comparing the POLR3 autoantibody status of each of the serum samples tested in B, as assessed by PhIP-Seq or IVTT immunoprecipitation.

PLATO-BC Identifies Autoantibodies That Confirm and Extend the PhIP-Seq Findings. It is clear from the immunoprecipitation analysis of IVTT POLR3 subunits that IVTT immunoprecipitations are more sensitive than PhIP-Seq at detecting antibody targets (Fig. 1C). Thus, for the second screen, we examined the four samples we had originally identified with autoantibodies against RNPC3 using PLATO, a genome-scale IVTT ribosome-display method that can detect antibodies that recognize conformational epitopes. For this purpose, we developed a version of PLATO we call PLATO-BC (Fig. 3). In the original method, ribosome-displayed RNA purified with antibodies had to be fragmented and an RNA-Seq library prepared. This step was laborious and expensive, and it prevented high-throughput analysis. To overcome this limitation, we introduced a barcode onto each ORF. Because the ORFs lack a STOP codon, which results in a trapped RNA–ribosome complex after translation, we generated a collection of random barcodes that lacked a STOP codon and created a library in which these barcodes are linked to ORFs in-frame (more details are provided in Materials and Methods). PLATO-BC greatly simplifies the analysis.

Using this PLATO-BC library (Fig. 3), we discovered that in three of the four samples, RNPC3 was the most or second most strongly enriched ORF, confirming the PhIP-Seq results. In addition to RNPC3, we found that SNRNP25 and SNRNP35, which are also components of the minor spliceosome complex, were significantly enriched in the same three samples (Table 3). As before, we confirmed that SNRNP25 and SNRNP35 are recognized by autoantibodies by performing immunoprecipitations with proteins generated by IVTT (Fig. 2C). IVTT immunoprecipitations showed that sera either recognized all four specificities (RNPC3, PDCD7, SNRNP25, and SNRNP35) or were negative for all. Interestingly, SNRNP25 and SNRNP35 were not detected in the PhIP-Seq screens, perhaps because PLATO is more sensitive for detecting antibodies against discontinuous epitopes and these two proteins are shorter, with fewer opportunities for linear epitopes (Table 4).
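The barcode constraint described above, random barcodes that contain no STOP codon so that an in-frame fusion to the ORF still traps the RNA–ribosome complex, can be sketched with simple rejection sampling. The barcode length (six codons), the assumption that the barcode is read in frame 0, and all function names are hypothetical; the actual construction is described in Materials and Methods and may differ.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_in_frame_stop(seq, frame=0):
    """Return True if any codon read in the given frame is a STOP codon."""
    return any(seq[i:i + 3] in STOP_CODONS
               for i in range(frame, len(seq) - 2, 3))


def random_barcode(n_codons=6, rng=random):
    """Draw random DNA barcodes until one has no in-frame STOP codon."""
    while True:
        barcode = "".join(rng.choice("ACGT") for _ in range(3 * n_codons))
        if not has_in_frame_stop(barcode):
            return barcode


# Seeded generator so the illustration is repeatable.
rng = random.Random(0)
barcodes = [random_barcode(rng=rng) for _ in range(5)]
for barcode in barcodes:
    print(barcode, has_in_frame_stop(barcode))
```

Only 3 of the 64 codons are STOP codons, so most random draws pass the filter and rejection sampling wastes few candidates at this barcode length.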

Discussion

Scleroderma provides a specific example of the general challenges facing autoantigen identification and the utility of novel methods. Multiple gaps exist in the understanding of the immune response in many autoimmune diseases in general and in cancer-associated scleroderma in particular. Understanding whether there are novel immune responses associated with this CTP-negative group is important, because it may identify additional autoantibody signatures predictive of a higher risk of cancer at the clinical onset of scleroderma. However, the paucity of appropriate methods has limited studies to date. Although several of the less common scleroderma autoantibody specificities can be assayed using commercially available tools, there are still antibodies that are not routinely available. Also, it is likely that additional specificities beyond known or suspected antigens remain to be discovered in scleroderma (particularly components of multiprotein complexes). The availability of new sophisticated, high-throughput antigen discovery methods creates opportunities to define and understand the complexity of the autoimmune response in scleroderma and other autoimmune disorders.

To address these issues, we have developed and used two high-throughput epitope identification methods to identify immunological characteristics of different subclasses of scleroderma (POLR3-positive scleroderma with cancer, POLR3-positive scleroderma without cancer, and patients with cancer-associated scleroderma who are anti-CTP–negative). Our methods included a

Table 1. Ranked list of the top candidate autoantigens enriched in patients with scleroderma and no known autoantibody specificities

                Anti-CTP Ab−                  Pol auto-Ab+
Gene            No. positive   No. negative   No. positive   No. negative   q
RNPC3*          4              12             0              32             0.007
DAZ3            3              13             0              32             0.020
LOC100286987    3              13             0              32             0.022
DMD             3              13             0              32             0.023
PCNXL2          3              13             0              32             0.024
ENAH            3              13             0              32             0.025
LOC100134365    3              13             0              32             0.025
PDCD7*          3              13             0              32             0.027
ZNF512B         3              13             0              32             0.028
KANK2           3              13             0              32             0.028
LOC100132427    3              13             0              32             0.028
AGRN            5              11             2              30             0.029
LOC100292810    5              11             2              30             0.031
TRIM21          3              13             0              32             0.031
UTRN            3              13             0              32             0.031
TNKS2           7              9              5              27             0.034
TCF20           3              13             0              32             0.040
SFXN3           9              7              8              24             0.043
LOC100290519    5              11             3              29             0.048
USP11           12             4              15             17             0.054

Each row is a gene. The columns “No. positive” and “No. negative” under “Anti-CTP Ab−” list the number of samples that are anti-CTP antibody (Ab)–negative that have or do not have autoantibodies against that row’s gene product, respectively. The same columns under “Pol auto-Ab+” list the number of samples with POLR3 autoantibodies that have or do not have autoantibodies against that row’s gene product, respectively. Finally, the column “q” in the table lists the false discovery rate based on a permutation analysis of Fisher’s exact P values for enrichment of each gene in the set of samples from patients with scleroderma and no known autoantibody specificities.
*Genes whose protein products are components of the minor spliceosome complex.
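The per-gene enrichment test behind Table 1 can be illustrated with a short sketch. The code below computes a one-sided Fisher's exact P value from the four counts in a table row, using only the Python standard library. The function name and the choice of a one-sided test are assumptions made for illustration; the text does not state which tail was used, and the permutation step that converts P values into the reported q values is not reproduced here.

```python
from math import comb

def fisher_exact_greater(pos_a, neg_a, pos_b, neg_b):
    """One-sided Fisher's exact P value that group A is enriched for positives.

    Hypothetical helper: group A is the anti-CTP Ab-negative set,
    group B is the POLR3 autoantibody-positive set.
    """
    n_a = pos_a + neg_a                 # size of group A
    total = n_a + pos_b + neg_b         # all samples
    total_pos = pos_a + pos_b           # all positive samples
    upper = min(n_a, total_pos)
    # Hypergeometric tail: probability of seeing pos_a or more positives in A.
    tail = sum(
        comb(total_pos, k) * comb(total - total_pos, n_a - k)
        for k in range(pos_a, upper + 1)
    )
    return tail / comb(total, n_a)

# RNPC3 row of Table 1: 4 positive / 12 negative vs. 0 positive / 32 negative.
p_rnpc3 = fisher_exact_greater(4, 12, 0, 32)
print(f"RNPC3 one-sided Fisher P = {p_rnpc3:.4f}")
```

The raw P value from such a test is not the q value listed in the table; q additionally accounts for testing many genes at once.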

Xu et al. PNAS | Published online November 7, 2016 | E7529


Fig. 2. Components of the minor spliceosome are targeted by scleroderma autoantibodies. (A and B) Evidence for intramolecular epitope spreading in RNPC3 and PDCD7. Each chart column represents a patient with scleroderma who was screened using the PhIP-Seq assay. Labeled bars above the charts indicate whether the patient has detectable autoantibodies to RNPC3 or is anti-CTP antibody–negative (defined as negative for anticentromere, antitopoisomerase I, and anti-POLR3 antibodies). Each chart row represents one of the 90-aa protein fragments tiled through the entire gene (1 is the N-terminal fragment) for RNPC3 (A) or PDCD7 (B). The color of each cell represents the −log10 (P value) for enrichment of that row’s protein fragment in that column’s sample. Greater confidence in the detection of an autoantibody is indicated by a darker color [larger −log10 (P value)], as labeled on the scale in the color bar to the right of the chart. (C) Immunoprecipitations performed with the 16 anti-CTP antibody–negative scleroderma serum samples (defined as negative for anticentromere, antitopoisomerase I, and anti-POLR3 antibodies). In vitro transcription/translated (IVTT) [35S]methionine-labeled RNPC3, PDCD7, SNRNP25, and SNRNP35 were immunoprecipitated with the indicated patient sera, and the immunoprecipitates were visualized by autoradiography.

E7530 | www.pnas.org/cgi/doi/10.1073/pnas.1615990113 Xu et al.


significantly improved version of PLATO, PLATO-BC, which greatly simplifies the original assay, which was not compatible with high-throughput applications. Using this combined approach, we made three major observations. First, PhIP-Seq demonstrated itself to be an attractive method with which to identify the POLR3 complex as a target in cancer-associated scleroderma. Although the method only identified antibodies to POLR3A in a minority of ELISA-positive patients, it was able to identify antibodies against POLR3F in 50% of POLR3-positive patients and POLR3H in 22% of POLR3-positive patients. When tested by immunoprecipitation against the individual POLR3 components, antibodies to POLR3F and POLR3H were found in 56% and 91% of POLR3A ELISA-positive patients, respectively. PhIP-Seq is therefore an excellent method to screen for targets of the immune responses to multimolecular complexes. Because antibodies to POLR3F and POLR3A co-occur in most patients with POLR3 antibodies, PhIP-Seq also identifies POLR3H as a well-performing surrogate antigen to detect POLR3-positive scleroderma. Interestingly, we did not detect a difference in the POLR3 antibody profiles between patients with and without cancer, suggesting that they share a similar immune response independent of the presence of cancer. It is important to note that although cases of scleroderma exist that have POLR3 antibodies and do not develop cancer, this finding does not mean that a cancer did not initiate the autoimmune response in these circumstances and then regress, possibly because of the immune response itself.

Second, PhIP-Seq and the subsequent validation studies highlighted striking intermolecular spreading among subunits of the POLR3 complex. The presence or extent of intermolecular spreading did not correlate with the presence or absence of cancer. Intermolecular spreading among components of multimolecular complexes is a frequent feature of autoimmunity, and may result when immunity is induced to any one of the components of the complex. Because the studies in scleroderma cancer to date in POLR3-positive patients have demonstrated mutations or LOH at the POLR3A locus in only six of eight cancers studied, these studies highlight the importance of considering other components of the POLR3 complex as the sites of possible mutations that could induce autoimmunity. The search for mutations in additional components of POLR3 in relevant cancers is an ongoing effort.

The third observation demonstrated that a subset of the patients with scleroderma and cancer (short interval) without CTP antibodies (25%) had autoantibodies against the minor spliceosome complex. In the case of RNPC3, our results indicate striking intramolecular epitope spreading within its encoded protein, RNA-binding protein 40, with a minimum of two and maximum of four adjacent epitopes recognized, given that adjacent 90-aa peptides overlap by 45 aa and could share an epitope. It is possible that this region of RNPC3 exhibits particularly antigenic properties. Although not repetitive, the region contains many charged amino acids that are known to be enriched in B-cell epitopes (12.5% P, 12% E, 8.5% K, and 7.1% D). It is possible that the antigenic features of this region of RNPC3 elicit an immunodominant antibody response across individuals to generate what is referred to as a public epitope.

Additional analysis with both PhIP-Seq and PLATO demonstrated extensive intermolecular epitope spreading within the minor spliceosome complex. These findings were subsequently verified using IVTT immunoprecipitation assays. The fact that patients targeting the minor spliceosome all recognized each of the components highlights the striking convergence of the immune response onto this multimolecular splicing machine. Understanding whether there is a specific phenotype associated with this immune response against the minor spliceosome is a high priority. The mechanistic role of anti-RNPC3 autoantibodies, if any, in the pathogenesis of scleroderma is unclear, as is true for most antinuclear autoantibodies in rheumatic diseases. However, further investigation of a connection with coincident cancer may be warranted, given the evidence that anti-POLR3A autoantibodies in patients with scleroderma may arise from POLR3A mutant tumors that elicit a specific CD4+ T-cell response that ultimately triggers cross-reactive autoantibodies. As noted previously, it will be important to sequence the genes encoding all components of the minor spliceosome targeted by the immune response in cancers from patients with scleroderma, because the results of this sequencing may provide additional mechanistic evidence of tumor neoantigen-driven autoimmunity in this disease.

Our results demonstrate the benefit of combining the two complementary approaches, PhIP-Seq and PLATO-BC, for identifying autoantibody specificities in large, heterogeneous populations. These methods have complementary strengths and weaknesses. Although PhIP-Seq has the advantages of being high-throughput, identifying individual epitopes, detecting intraprotein epitope spreading, and possessing a complete representation of the human proteome, it does not identify all patients with a particular specificity (e.g., POLR3A), possibly due to its reduced

Table 2. Ranked list of the top candidate autoantigens enriched in 18 patients with scleroderma and autoantibodies against RNPC3

                RNPC3 auto-Ab+                Controls
Gene            No. positive   No. negative   No. positive   No. negative   q
RNPC3*          15             3              1              138            0.000
SNRNP48*        4              14             1              138            0.000
DDX19B          3              15             0              139            0.000
OR52N1          8              10             19             120            0.002
LOC100287934    3              15             1              138            0.006
RNF145          12             6              49             90             0.008
TCF3            2              16             0              139            0.010
ELK4            2              16             0              139            0.011
FLII            3              15             2              137            0.013
ACTR5           2              16             0              139            0.014
LOC727900       2              16             0              139            0.015
KATNB1          7              11             19             120            0.019
REPS2           10             8              40             99             0.021
TLE2            4              14             7              132            0.026
DDX19A          2              16             1              138            0.026
C7              3              15             3              136            0.026
LOC730268       4              14             7              132            0.027
TIAL1           2              16             1              138            0.031
C1ORF95         2              16             1              138            0.033
GPR50           2              16             1              138            0.037
TRRAP           2              16             1              138            0.037
OR10A5          2              16             1              138            0.039
ZC3HAV1         2              16             1              138            0.040
GPRC5D          3              15             4              135            0.040
ADORA3          2              16             1              138            0.040
ANKAR           3              15             5              134            0.046
SRCIN1          2              16             2              137            0.049

Each row is a gene. The columns “No. positive” and “No. negative” under “RNPC3 auto-Ab+” list the number of samples with RNPC3 autoantibodies (as detected by IVTT immunoprecipitation) that have or do not have autoantibodies against that row’s gene product as determined by PhIP-Seq analysis, respectively. The same columns under “Controls” list the number of nonscleroderma control samples, including samples from healthy donors and patients with other autoimmune diseases, that have or do not have autoantibodies against that row’s gene product, respectively. Finally, the column “q” in the table lists the false discovery rate based on a permutation analysis of Fisher’s exact P values for enrichment of each gene in the set of samples from patients with scleroderma and no known autoantibody specificities.
*Genes whose protein products are components of the minor spliceosome complex.


ability to detect a portion of antibodies that recognize discontinuous epitopes. Although this technique is therefore not comprehensive for defining all autoantibody targets in a particular individual, it is especially useful for identifying the autoantibody response in populations of individuals with similar phenotypes. PhIP-Seq was highly effective at demonstrating that the POLR3 complex (POLR3A, POLR3F, and POLR3H) is a major target of autoantibodies, and rapidly allowed specific assays to be established. PhIP-Seq was highly effective at identifying RNPC3 as an autoantigen in cancer-associated scleroderma, identifying all patients subsequently shown to be positive by other assays. PLATO was needed to identify two other minor spliceosome components, SNRNP35 and SNRNP25, but did not identify antibodies to POLR3A because POLR3A was missing from the ORF library, a current weakness of PLATO that can be remedied. Although, in principle, PLATO may be more sensitive than PhIP-Seq due to its ability to detect antibodies to conformational epitopes, it failed to detect one minor spliceosome component detected by PhIP-Seq, SNRNP48 (Table 4). Why SNRNP48 was not detected by PLATO is not yet clear.

Using multiple orthogonal approaches to antigen discovery is likely to fuel important advances in defining new antigenic targets, the first step to elucidate the mechanisms underlying their selection in cancer and autoimmunity. The central finding of this study, that epitope spreading is a common feature of autoimmune responses, suggests that these genome-wide antigen discovery methods can be significantly improved by merging them computationally with proteomic approaches aimed at discovering protein complexes (14, 15). Further studies using large, well-defined patient cohorts are warranted to understand the full

Fig. 3. Schematic of the barcoded PLATO-BC assay. (A) ORF library is cloned into a ribosome display vector that contains 30 bp of random nucleotides, previously selected to lack a STOP codon, which serve as a molecular barcode for each ORF. (B) DNA is in vitro-transcribed and translated (ivTT) to create the mRNA–ribosome–polypeptide ternary complex for ribosome display panning. (C) Sample containing patient antibodies is added to the solution. (D) Immunoprecipitation (IP) with magnetic beads coated with Protein A and G separates antibody-bound ribosome complexes. (E) Specifically bound ribosome complexes are disrupted, eluting the mRNA, which is then reverse-transcribed to create a template for PCR. (F) Finally, PCR with primers flanking the barcode region generates DNA that is ready to be prepared for high-throughput sequencing. Further details are provided in Materials and Methods.

Table 3. Top candidate autoantigens detected by PLATO in the four patients with scleroderma who have autoantibodies against RNPC3

             FW-1089                FW-0446                FW-1782                FW-1864
Gene         Fold change  Rank      Fold change  Rank      Fold change  Rank      Fold change  Rank
RNPC3*       22.6         (1)       10.5         (1)       8.0          (2)       1.5          (>1,000)
SNRNP25*     9.2          (2)       3.5          (8)       9.6          (1)       0.9          (>1,000)
ARHGAP27     6.4          (3)       1.3          (>1,000)  1.1          (>1,000)  1.1          (>1,000)
SNRNP35*     5.4          (4)       3.7          (5)       4.5          (4)       1.4          (>1,000)
SEPT9        4.5          (5)       0.9          (>1,000)  1.1          (>1,000)  0.8          (>1,000)
TMEM175      4.3          (6)       1.6          (656)     2.8          (14)      1.6          (>1,000)
SH3D19       4.3          (7)       0.9          (>1,000)  1.1          (>1,000)  1.4          (>1,000)
SPATA24      4.3          (8)       0.7          (>1,000)  0.8          (>1,000)  0.7          (>1,000)
SCG3         4.1          (9)       0.9          (>1,000)  1.5          (945)     0.3          (>1,000)
ANXA7        4.0          (10)      1.1          (>1,000)  0.8          (>1,000)  2.1          (>1,000)

Each row shows data corresponding to the candidate autoantigen gene listed in the first column. The columns labeled “Fold change” show the fold change in relative abundance for each gene after immunoprecipitation with patient serum compared with beads alone. The columns labeled “Rank” show where the gene is ranked when sorted by descending fold change for each sample. Each pair of Fold change and Rank columns was obtained from the sample identified in the column label. The data are sorted in descending order based on fold change in sample “FW-1089.”
*Genes whose protein products are components of the minor spliceosome complex.


clinical utility of these new antibodies. Such strategies will be important in the evolving precision medicine era, where specific targets of the immune response in patients with autoimmunity or cancer may predict outcome or direct intervention.

Materials and Methods

PhIP-Seq Assay and Analysis. PhIP-Seq assays were performed as previously described (2). Briefly, 96-well plates were blocked overnight at 4 °C with 3% (wt/vol) BSA/phosphate-buffered saline with Tween 20 (PBST). The 3% (wt/vol) BSA was used to ensure minimal loss of material from nonspecific adhesion. Sera (∼2 μg of IgG) were subsequently added to each well, together with 1 mL of the bacteriophage library diluted to 2 × 10^5-fold representation (2 × 10^10 plaque-forming units for a library of 10^5 clones) in phage extraction buffer [20 mM Tris·HCl (pH 8.0), 100 mM NaCl, 6 mM MgSO4] and incubated overnight at 4 °C. Two technical replicates were performed for each sample. Protein A and G Dynabeads (Invitrogen) were then added to each well (4 h, 4 °C). The magnetic beads (containing immunoprecipitates) were washed three times with PhIP-Seq wash buffer [50 mM Tris·HCl (pH 7.5), 150 mM NaCl, 0.1% Nonidet P-40] and resuspended in 40 μL of water, and the phage was lysed by heating at 95 °C for 10 min. Samples were also made using phage from the library before immunoprecipitation (“input”) and after immunoprecipitation with beads alone.

DNA for multiplexed Illumina sequencing was prepared using a slightly modified version of a previously published protocol (16). Two rounds of PCR amplification were performed on the lysed phage material using hot start Q5 polymerase (New England Biolabs). The first round of PCR used the primers IS7_HsORF5_2 (ACACTCTTTCCCTACACGACTCCAGTCAGGTGTGATGCTC) and IS8_HsORF3_2 (GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCCGAGCTTATCGTCGTCATCC). The second round of PCR used 1 μL of the first-round product and the primer IS4_HsORF5_2 (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACTCCAGT) and a different unique indexing primer for each sample to be multiplexed for sequencing (CAAGCAGAAGACGGCATACGAGATxxxxxxxGTGACTGGAGTTCAGACGTGT, where “xxxxxxx” denotes a unique 7-nt indexing sequence). After determining the DNA concentration of each sample by quantitative PCR (qPCR), equimolar amounts of all samples were pooled for gel extraction, followed by sequencing by the Harvard Medical School Biopolymers Facility using a 50-bp read cycle on an Illumina HiSeq 2000 or 2500. Up to 96 samples were pooled for sequencing on each lane and gave ∼100–200 million reads per lane (1–2 million reads per sample).

The initial informatics and statistical analyses were performed using a slightly modified version of the previously published technique (2, 4). Sequencing reads were mapped to the original library sequences using Bowtie, and the frequency of each clone in the input and each sample “output” was counted (17). Because the majority of clones are not enriched, the observed distribution of output counts was used as a null distribution. A zero-inflated generalized Poisson distribution fitted our output counts well. This null distribution was used to calculate a P value for the likelihood of enrichment for each clone. The probability mass function for the zero-inflated generalized Poisson distribution is

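\[
P(Y = y) =
\begin{cases}
\pi + (1 - \pi)\left[\dfrac{\theta(\theta + \lambda y)^{y-1} e^{-\theta - \lambda y}}{y!}\right], & y = 0 \\[2ex]
(1 - \pi)\left[\dfrac{\theta(\theta + \lambda y)^{y-1} e^{-\theta - \lambda y}}{y!}\right], & y > 0.
\end{cases}
\]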

Maximum likelihood estimation was used to regress the parameters π, θ, and λ to fit the distribution of counts after immunoprecipitation for all clones present at a particular frequency count in the input. This procedure was repeated for all of the observed input counts; θ and λ are fit well by linear regression and π by an exponential regression as a function of input count. Finally, for each clone, input count and the regression results were used to determine the null distribution based on the zero-inflated generalized Poisson model, which we used to calculate the −log10 (P value) of obtaining the observed count.
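The null model described above can be sketched in a few lines of Python. The code evaluates the zero-inflated generalized Poisson probability mass function in its standard form and converts an observed output count into a −log10 (P value). The function names are hypothetical, the parameter values in the usage lines are made up for illustration, and treating the P value as the upper tail P(Y ≥ observed) is an assumption; the regression that supplies π, θ, and λ for each input count is not shown.

```python
import math

def zigp_pmf(y, pi, theta, lam):
    """P(Y = y) for a zero-inflated generalized Poisson distribution.

    Assumes theta > 0 and 0 <= lam < 1; pi is the extra probability mass at zero.
    """
    # Generalized Poisson term, evaluated in log space to avoid overflow.
    log_gp = (math.log(theta)
              + (y - 1) * math.log(theta + lam * y)
              - theta - lam * y
              - math.lgamma(y + 1))
    gp = math.exp(log_gp)
    if y == 0:
        return pi + (1.0 - pi) * gp
    return (1.0 - pi) * gp

def neg_log10_p(observed, pi, theta, lam):
    """-log10 of the upper-tail probability P(Y >= observed) (assumed tail)."""
    lower = sum(zigp_pmf(y, pi, theta, lam) for y in range(observed))
    # Clamp to avoid log10(0) when the tail underflows double precision.
    p = max(1.0 - lower, 1e-300)
    return -math.log10(p)

# Illustrative parameters only; real values come from the fitted regressions.
score = neg_log10_p(observed=12, pi=0.3, theta=2.0, lam=0.2)
print(f"-log10(P) = {score:.2f}")
```

Subtracting the lower tail from 1 loses precision for very small P values, so a production implementation would more likely sum the upper tail directly.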

To call hits, the threshold for reproducibility between technical replicates was determined, based on a previously published method (16). Briefly, scatter plots were made of the log10 of the −log10 (P values), and a sliding window was used of width 0.005 from 0 to 2 across the axis of one replicate. For all of the clones that fell within each window, the median and median absolute deviation of the log10 of the −log10 (P values) in the other replicate were calculated and plotted against the window location. The first window in which the median was greater than the median absolute deviation was called the threshold for reproducibility. The distribution of the threshold −log10 (P value) was centered around a mean of ∼2.3, so peptides were designated hits if the −log10 (P value) was at least 2.3 in both replicates.
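The threshold search described above can be sketched as follows. The code slides a window of width 0.005 along the log10 of the −log10 (P values) of one replicate and returns the first window in which the median of the other replicate exceeds its median absolute deviation. The function name, the step size (taken here to equal the window width), and the handling of clones with nonpositive −log10 (P values) are assumptions for illustration and are not specified in the text.

```python
import math
from statistics import median

def reproducibility_threshold(rep1, rep2, width=0.005, start=0.0, stop=2.0):
    """Return the reproducibility threshold in -log10(P) units, or None.

    rep1, rep2: per-clone -log10(P) values from two technical replicates,
    in the same clone order.
    """
    # Work on log10(-log10 P); clones with nonpositive scores are dropped (assumption).
    pairs = [(math.log10(a), math.log10(b))
             for a, b in zip(rep1, rep2) if a > 0 and b > 0]
    n_windows = int(round((stop - start) / width))
    for i in range(n_windows):
        lo = start + i * width          # window step assumed equal to its width
        hi = lo + width
        ys = [y for x, y in pairs if lo <= x < hi]
        if not ys:
            continue
        med = median(ys)
        mad = median(abs(y - med) for y in ys)
        if med > mad:
            # Convert the window position back to -log10(P) units.
            return 10 ** lo
    return None
```

In this sketch, a clone would then be called a hit when its −log10 (P value) meets the chosen threshold in both replicates, mirroring the 2.3 cutoff used in the study.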

PLATO-BC: Library Construction, Assay, and Analysis. The PLATO-BC library is a barcoded version of one previously described (3). The vector pRDDEST described in that paper was modified by cloning a stretch of 30 random nucleotides (N’s) as barcodes following the attB2 site and in-frame with the downstream TolA sequence. These barcodes were previously selected to contain only sequences without a stop codon by cloning them into a chloramphenicol acetyl transferase derivative of pUC18 as 5′ fusions into a modified β-lactamase gene that contained restriction sites after the ATG to allow insertion of the random oligonucleotides. These plasmids were transformed into sensitive bacteria, and selected for ampicillin resistance to select for ORFs (18). This procedure was performed twice to ensure all clones were in-frame before transferring them into the pRDDEST vector. The human ORFeome v5.1 collection was cloned into this barcoded vector using Gateway Cloning (18, 19). The resulting DNA was electroporated into DH10B cells (Invitrogen), and plasmid DNA was maxi-prepared and stored at −20 °C until use.

The PLATO assays were performed as previously described (3). Briefly, the plasmid DNA was PCR-amplified using the T7B (5′-ATACGAAATTAATACGACTCACTATAGGGAGACCACAACGG-3′) and TolAK (5′-CCGCACACCAGTAAGGTGTGCGGTTTCAGTTGCCGCTTTCTTTCT-3′) primers, and then in vitro-transcribed using the RiboMAX Large-Scale RNA Production system-T7 kit (Promega). The RNA was purified using MEGAclear (Ambion), and 15 μg was used for a 100-μL in vitro translation reaction using the RTS 100 E. coli HY Kit according to the manufacturer’s protocols (5 Prime). A total of 12.5 μL of the in vitro translation reaction is diluted in 85.5 μL of selection buffer [2.5 mg/mL heparin, 1% (wt/vol) BSA, and 83.3 μg/mL yeast tRNA in 50 mM Tris acetate and 150 mM NaCl (pH 7.5), diethylpyrocarbonate (DEPC)-treated] and added to 40 μL of a 1:1 mix of Protein A and Protein G Dynabeads (Life Technologies) that has bound 2 μg of IgG from serum overnight at 4 °C in PBST containing 1% (wt/vol) BSA, and is subsequently blocked in selection buffer for 1 h at room temperature. After 4 h of incubation with the in vitro translation product at 4 °C, the beads are washed six times with 500 μL of RD wash buffer [50 mM Tris acetate and 150 mM NaCl (pH 7.5), DEPC treated] and the remaining bound RNA is eluted with 50 μL of EB20 [50 mM Tris acetate, 150 mM NaCl, and 20 mM EDTA (pH 7.5)] at 37 °C for 10 min. The eluted RNA is purified using the MEGAclear kit according to the manufacturer’s protocols (Ambion) and reverse-transcribed with the TolART primer (5′-CGCTGCTTCTTCCGCAGCTTTAGC-3′) using the SuperScript III kit according to the manufacturer’s protocols (Life Technologies). The barcode region of the cDNA is then PCR-amplified using the primers Adap-BCfor (5′-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACAAGTCACGTCCACAGTCGT-3′) and P5-BCrev (5′-AATGATACGGCGACCACCGAACTACGGTGCGGCGAATATAC-3′). The second round of PCR used 1 μL of the first-round product, the primer P5-BCrev, and a different unique indexing primer for each sample to be multiplexed for sequencing (CAAGCAGAAGACGGCATACGAGATxxxxxxxGTGACTGGAGTTCAGACGTGT, where “xxxxxxx” denotes a unique 7-nt indexing sequence). After the

Table 4. Summary of the discovery of autoantibodies against multiple components of the minor spliceosome complex

Protein (gene)     Autoantibodies detected by PhIP-Seq   Autoantibodies detected by PLATO
Sm proteins*       -                                     -
SF3b complex†      -                                     -
20K (ZMAT5)        -                                     -
25K (SNRNP25)      -                                     ✓
31K (ZCRB1)        -                                     -
35K (SNRNP35)      -                                     ✓
48K (SNRNP48)      ✓                                     -
59K (PDCD7)        ✓                                     N/A
65K (RNPC3)        ✓                                     ✓
Urp (ZRSR2)        -                                     -
hPrp43 (DHX15)     -                                     -
Y Box-1 (YBX1)     -                                     N/A

Each row represents one of the protein components of the minor spliceosome complex as indicated by the label in the first column (gene names in parentheses). Check marks in next two columns indicate whether autoantibodies to that protein were identified by PhIP-Seq, PLATO, or both. Genes absent in the PLATO library are marked as not available (N/A).
*Sm proteins (named for a patient with SLE) B/B′, D1, D2, D3, E, F, and G.
†Multisubunit complex.


second round of PCR, we determined the DNA concentration of each sample by qPCR and pooled equimolar amounts of all samples for gel extraction. Following gel extraction, the pooled DNA was sequenced by the Harvard Medical School Biopolymers Facility using a 50-bp read cycle on an Illumina HiSeq 2000 or 2500.

We first mapped the sequencing reads to the original library sequences using Bowtie and counted the frequency of each clone in the input and each sample output. We then used these sequencing read counts to calculate “fractional abundance” estimates for each clone, using the number of reads for that peptide divided by the total number of reads for that sample. The ratio of the fractional abundance in the output over the input is used as an estimate for fold-change enrichment.
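The fold-change estimate described above amounts to a ratio of two normalized read counts, which the following sketch illustrates. The function names and the toy read counts are hypothetical; clones absent from the input are skipped here because their ratio is undefined, which is an assumption rather than a documented choice.

```python
def fractional_abundance(counts):
    """Map each clone to its share of the total reads in one sample."""
    total = sum(counts.values())
    return {clone: n / total for clone, n in counts.items()}

def fold_change(output_counts, input_counts):
    """Fold-change enrichment: output fraction divided by input fraction."""
    out_frac = fractional_abundance(output_counts)
    in_frac = fractional_abundance(input_counts)
    # Clones with zero input reads are skipped (assumption).
    return {clone: out_frac.get(clone, 0.0) / frac
            for clone, frac in in_frac.items() if frac > 0}

def rank_by_fold_change(fc):
    """Rank clones by descending fold change (1 = most enriched)."""
    ordered = sorted(fc, key=fc.get, reverse=True)
    return {clone: i + 1 for i, clone in enumerate(ordered)}

# Toy read counts for three hypothetical clones.
input_counts = {"ORF_A": 100, "ORF_B": 100, "ORF_C": 200}
output_counts = {"ORF_A": 300, "ORF_B": 50, "ORF_C": 50}
fc = fold_change(output_counts, input_counts)
ranks = rank_by_fold_change(fc)
```

Sorting by descending fold change gives ranks of the kind reported alongside the fold changes in Table 3.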

Study Population. Patients with scleroderma and an available serum sample were identified through the Institutional Review Board-approved Johns Hopkins Scleroderma Center database. All participants in the database who donated serum samples for the study provided written informed consent. All patients had scleroderma defined by 2013 American College of Rheumatology classification criteria or by having at least three of five calcinosis, Raynaud’s, esophageal dysmotility, sclerodactyly, and telangiectasia syndrome features (20). Demographic data, symptom onset dates, cutaneous subtype, organ-specific severity scores, and cancer diagnoses (dates, site, histology, and therapy) are captured in all patients at the first visit and longitudinally at 6-mo intervals for relevant parameters. The date of scleroderma onset was defined by the date of the first scleroderma symptom, and identified as either a Raynaud’s or non-Raynaud’s scleroderma symptom. The date of cancer diagnosis was obtained from pathology reports or medical record review when available, and was otherwise defined by patient report. The cancer-scleroderma interval was calculated as the difference between these two dates. Three well-defined groups of patients with scleroderma were selected for study: (i) patients with cancer, a short cancer-scleroderma interval (defined as ≤5 y), and no known scleroderma autoantibody [anti-CTP antibody–negative group (i.e., negative for centromere, topoisomerase 1, and POLR3 autoantibodies), n = 16]; (ii) patients with POLR3 autoantibodies, cancer, and a short cancer-scleroderma interval (n = 18); and (iii) patients with POLR3 autoantibodies who did not have cancer after at least 5 y of follow-up matched on age at scleroderma onset and gender to the POLR3-positive cancer group (n = 14).

In addition, secondary use of blood samples from patients with dermatomyositis (n = 44), patients with IgG4-related disease (n = 35), and healthy donors (n = 44) for the purposes of this work were exempted by the Brigham and Women’s Hospital Institutional Review Board (Protocol 2013P001337). In addition, the following sera from the Johns Hopkins University site were also included: 102 dermatomyositis patients (seen at the Johns Hopkins University Myositis Center) and 25 healthy controls. All serum samples were obtained after informed consent under IRB-approved protocols.

Autoantibody Assays. The status of autoantibodies against topoisomerase 1, POLR3, and centromere was obtained for each patient by clinical chart review and/or by ELISAs using commercially available kits (Inova Diagnostics). Antibodies against RNPC3, SNRNP25, SNRNP35, SNRNP48, PDCD7, POLR3A, POLR3F, and POLR3H were assayed by immunoprecipitation. Briefly, cDNAs encoding the relevant human proteins were purchased (Origene) and used to generate [35S]methionine-labeled proteins by IVTT using a commercially available kit (Promega). The radiolabeled products were then used to test for antibodies in patient sera by immunoprecipitation as described (21). Immunoprecipitates were electrophoresed on SDS/polyacrylamide gels and visualized by fluorography.

Statistical Analyses. Agreement in autoantibody results obtained by PhIP-Seq and immunoprecipitation was assessed by the kappa statistic. The frequency of anti-POLR3 specificities was compared by cancer status using the Fisher’s exact test. Differences in demographic and clinical characteristics were compared by autoantibody status using the Student’s t test, ANOVA, Kruskal–Wallis test, and Fisher’s exact test where appropriate. All analyses were performed using Stata, version 13 (StataCorp). Two-sided P values <0.05 were considered statistically significant.
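As an illustration of the agreement measure named above, the following sketch computes Cohen's kappa for two sets of binary calls, such as PhIP-Seq and immunoprecipitation results on the same sera. It is a plain Python stand-in, not the Stata routine used in the study; the function name and the example calls are hypothetical.

```python
def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length lists of binary (True/False) calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(calls_a)
    a = [bool(x) for x in calls_a]
    b = [bool(x) for x in calls_b]
    # Observed agreement: fraction of samples on which the two assays agree.
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each assay's positive rate.
    pa = sum(a) / n
    pb = sum(b) / n
    p_exp = pa * pb + (1 - pa) * (1 - pb)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical calls for four sera by two assays.
phip_seq = [True, False, False, False]
immunoprecip = [True, True, False, False]
kappa = cohens_kappa(phip_seq, immunoprecip)
```

A kappa of 1 indicates perfect agreement, and a value near 0 indicates agreement no better than expected by chance.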

ACKNOWLEDGMENTS. We thank members of the S.J.E. and L.C.-R. laboratories for helpful comments on the manuscript and Tomasz Kula for advice on the PhIP-Seq experiments. We also thank B. Vogelstein, K. Kinzler, and N. Papadopoulos (Johns Hopkins University) for numerous helpful discussions. This work was supported by NIH Grants K23 AR061439 (to A.A.S.) and DE-12354-15A1 (to A.R. and L.C.-R.), the Ira T. Fine Discovery Fund, the Donald B. and Dorothy L. Stabler Foundation, and the Scleroderma Research Foundation. The Johns Hopkins Rheumatic Disease Research Core Center is supported by NIH P30 Grant AR-053503. S.J.E. is an Investigator with the Howard Hughes Medical Institute.

1. Rosenberg JM, Utz PJ (2015) Protein microarrays: A new tool for the study of autoantibodies in immunodeficiency. Front Immunol 6:138.

2. Larman HB, et al. (2011) Autoantigen discovery with a synthetic human peptidome. Nat Biotechnol 29(6):535–541.

3. Zhu J, et al. (2013) Protein interaction discovery using parallel analysis of translated ORFs (PLATO). Nat Biotechnol 31(4):331–334.

4. Xu GJ, et al. (2015) Viral immunology. Comprehensive serological profiling of human populations using a synthetic human virome. Science 348(6239):aaa0698.

5. Koenig M, Dieudé M, Senécal JL (2008) Predictive value of antinuclear autoantibodies: The lessons of the systemic sclerosis autoantibodies. Autoimmun Rev 7(8):588–593.

6. Shah AA, Rosen A, Hummers L, Wigley F, Casciola-Rosen L (2010) Close temporal relationship between onset of cancer and scleroderma in patients with RNA polymerase I/III antibodies. Arthritis Rheum 62(9):2787–2795.

7. Shah AA, et al. (2015) Examination of autoantibody status and clinical features associated with cancer risk and cancer-associated scleroderma. Arthritis Rheumatol 67(4):1053–1061.

8. Joseph CG, et al. (2014) Association of the autoimmune disease scleroderma with an immunologic response to cancer. Science 343(6167):152–157.

9. Shah AA, Casciola-Rosen L, Rosen A (2015) Review: Cancer-induced autoimmunity in the rheumatic diseases. Arthritis Rheumatol 67(2):317–326.

10. Schubert RD, Wilson MR (2015) A tale of two approaches: How metagenomics and proteomics are shaping the future of encephalitis diagnostics. Curr Opin Neurol 28(3):283–287.

11. Mamula MJ (1998) Epitope spreading: The role of self peptides and autoantigen processing by B lymphocytes. Immunol Rev 164(1):231–239.

12. Benecke H, Lührmann R, Will CL (2005) The U11/U12 snRNP 65K protein acts as a molecular bridge, binding the U12 snRNA and U11-59K protein. EMBO J 24(17):3057–3069.

13. Netter C, Weber G, Benecke H, Wahl MC (2009) Functional stabilization of an RNA recognition motif by a noncanonical N-terminal expansion. RNA 15(7):1305–1313.

14. Huttlin EL, et al. (2015) The BioPlex network: A systematic exploration of the human interactome. Cell 162(2):425–440.

15. Havugimana PC, et al. (2012) A census of human soluble protein complexes. Cell 150(5):1068–1081.

16. Larman HB, et al. (2013) PhIP-Seq characterization of autoantibodies from patients with multiple sclerosis, type 1 diabetes and rheumatoid arthritis. J Autoimmun 43:1–9.

17. Langmead B, Trapnell C, Pop M, Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10(3):R25.

18. D’Angelo S, et al. (2011) Filtering “genic” open reading frames from genomic DNA samples for advanced annotation. BMC Genomics 12(1, Suppl 1):S5.

19. Larman HB, Liang AC, Elledge SJ, Zhu J (2014) Discovery of protein interactions using parallel analysis of translated ORFs (PLATO). Nat Protoc 9(1):90–103.

20. van den Hoogen F, et al. (2013) 2013 classification criteria for systemic sclerosis: an American College of Rheumatology/European League against Rheumatism collaborative initiative. Arthritis Rheum 65(11):2737–2747.

21. Fiorentino D, Chung L, Zwerner J, Rosen A, Casciola-Rosen L (2011) The mucocutaneous and systemic phenotype of dermatomyositis patients with antibodies to MDA5 (CADM-140): A retrospective study. J Am Acad Dermatol 65(1):25–34.
