Genetic characterization of Stargardt clinical phenotype in South Indian patients using Sanger and targeted sequencing

Background Stargardt disease 1 (STGD1; MIM 248200) is a monogenic, autosomal recessive disease caused by mutations in ABCA4. This gene has a major role in hydrolyzing N-retinylidene-phosphatidylethanolamine to all-trans-retinal and phosphatidylethanolamine. The purpose of this study is to identify the frequency of putative disease-causing mutations associated with Stargardt disease in a South Indian population. Methods A total of 28 patients with a clinically diagnosed Stargardt-like phenotype were recruited from South India. Ophthalmic examination of all patients was carried out by a retina specialist, and patients were classified by fundus imaging stage and ERG group. Genetic analysis of ABCA4 was performed for all patients using Sanger sequencing and clinical exome sequencing. Results This study identified disease-causing mutations in ABCA4 in 75% (21/28) of patients; 7% (2/28) exhibited benign variants and 18% (5/28) were negative for a disease-causing mutation. Conclusion This is the first study describing the genetic association of ABCA4 disease-causing mutations in South Indian patients with Stargardt disease 1 (STGD1). Our findings highlight the presence of two novel missense mutations as well as an indel, a single-base-pair deletion and a splice variant in ABCA4. However, given the genetic heterogeneity of ABCA4 mutants, a larger sample size is required to establish a true correlation with clinical phenotype.


Background
Stargardt disease (STGD) is a monogenic form of juvenile macular degeneration, first described by Karl Stargardt in 1909 [1,2]. The estimated global prevalence is 1 in 8,000-10,000. It is characterized by early central vision loss and progressive degeneration of the macula associated with loss of photoreceptors, leading to irreversible vision loss [3,4]. Another characteristic clinical feature is the presence of distinct yellow-white flecks around the macula and mid-periphery of the retina [5]. Symptoms typically develop in the first or second decade of life. Genes associated with degenerative macular dystrophies are highly expressed in photoreceptor cells, where they play crucial roles in phototransduction, the visual cycle, photoreceptor structure and small molecule transport [6]. STGD1 is one of the most common autosomal recessive inherited retinal disorders and is caused by mutations in the ATP Binding Cassette Subfamily A Member 4 (ABCA4) gene, whereas mutations in the elongation of very-long-chain fatty acids 4 (ELOVL4) and prominin 1 (PROM1) genes are responsible for the STGD3 and STGD4 phenotypes, respectively [7,8].
The ABCA4 gene, located on chromosome 1p22.1, contains 50 exons and codes for a membrane-bound glycoprotein that is ubiquitous and localized to the rim of the rod and cone outer segment disc membranes [9]. In addition, it is actively involved in the transport of retinoid substrate from photoreceptor to RPE [10]. Mutation in ABCA4 affects this retinoid transport activity, which in turn impairs the clearance of all-trans-N-ret-PE from the rod disc membrane. Consequently, the waste product, all-trans-N-ret-PE, reacts with all-trans-retinal to form dihydropyridinium compounds, which undergo auto-oxidation and thereby generate the phosphatidyl-pyridinium bisretinoid A2PE in photoreceptors. So far, more than 1000 mutations have been reported in ABCA4 across different cohorts, leading to STGD1 and other retinal disorders such as autosomal recessive cone-rod dystrophy, age-related macular degeneration and retinitis pigmentosa [11]. To our knowledge, only one study has reported the clinical and genetic correlation of STGD1 disease, in five families of Indian origin [12].
The current study used a combined approach of conventional Sanger sequencing and targeted exome sequencing (TES) to determine the frequency of putative disease-causing variants associated with Stargardt disease in a South Indian population.

Study samples and clinical assessment
We recruited 28 patients with a clinically diagnosed Stargardt disease-like phenotype from two centres of Aravind Eye Hospital (Madurai and Pondicherry, India) during two periods, 1998-2007 and 2018-2019. All study participants are of South Indian origin (Tamil Nadu, Pondicherry, Kerala, Andhra Pradesh and Karnataka). The ophthalmic features were examined in both eyes by a retina specialist. The assessment recorded the patient's age and disease onset and included best corrected visual acuity (BCVA; Snellen acuity chart), slit lamp biomicroscopy, color fundus photography (TRC-50IA Retinal Fundus Camera; Topcon, Inc., Tokyo, Japan), spectral-domain optical coherence tomography (SD-OCT), autofluorescence (AF) imaging using Spectralis with viewing module version 5.1.2.0, and clinical full-field electroretinography (ERG) using the Diagnosys Color Dome (Diagnosys LLC, Lowell, MA), performed according to the standards of the International Society for Clinical Electrophysiology of Vision.
Written informed consent was obtained from all probands, or from parents/legal guardians in the case of minors, after the genetic study had been explained. A complete clinical history and family pedigree were collected from each proband. This study was approved by the Institutional Ethics Review Board, Aravind Eye Hospital, Madurai, Tamil Nadu, India, and adhered to the tenets of the Declaration of Helsinki.

Mutation screening
Two methods were adopted to identify the frequency of ABCA4 mutations in STGD1 patients. Sanger sequencing was performed for 24 samples and the remaining 4 cases were analyzed by a clinical exome sequencing method.

Polymerase chain reaction (PCR) for ABCA4
Five milliliters of peripheral blood were collected from each study subject in an EDTA vacutainer. Genomic DNA was extracted using a modified salting-out precipitation method [13]. Primers were designed for all fifty exons of ABCA4 (NG_009073.1), including the respective exon-intron boundaries, using Primer3 and Primer-BLAST software. Fifty nanograms of genomic DNA template was used for each ABCA4 exon-specific amplification with 1 unit of Taq DNA polymerase (Sigma), 50 μM dNTPs (Sigma), 5 pmol/μl of forward/reverse primers and standard 1X PCR buffer (Sigma). Gradient PCR was used to optimize the annealing temperature (54-66°C) of the primers for all 50 exons of ABCA4. The PCR amplicons were purified using Exonuclease I-Shrimp alkaline phosphatase reagent (Exo-SAP; Affymetrix, Santa Clara, CA, USA). The samples were then sequenced using BigDye Terminator ready reaction mix on the ABI-3500 genetic analyzer (Applied Biosystems, Foster City, CA).

Sanger sequencing
Direct sequencing was performed by the dideoxynucleotide chain termination method to detect potential variants associated with disease. Sequencing results were viewed in FinchTV and compared with the cDNA sequence of ABCA4 (NM_000350.3). The zygosity status of the variants across the exons (homozygous, heterozygous and compound heterozygous) was also determined from the chromatograms.
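As an illustration of how zygosity can be read from a chromatogram, the following minimal sketch classifies a single position from its four per-base peak intensities. It is not the procedure used in this study, where chromatograms were inspected visually; the function name `call_zygosity`, the input format and the 30% secondary-peak threshold `min_ratio` are assumptions made for the example.

```python
def call_zygosity(peaks, ref_base, min_ratio=0.3):
    """Classify one chromatogram position from per-base peak intensities.

    peaks     : dict mapping 'A', 'C', 'G', 'T' to signal intensity
    ref_base  : base expected from the reference cDNA at this position
    min_ratio : hypothetical threshold; a base is called if its peak
                reaches this fraction of the tallest peak
    """
    top = max(peaks.values())
    if top <= 0:
        raise ValueError("no signal at this position")
    # Bases with a peak high enough to be considered real signal.
    called = sorted(b for b, h in peaks.items() if h >= min_ratio * top)
    if len(called) == 1:
        # A single clean peak: either the reference base or a homozygous change.
        return "reference" if called[0] == ref_base else "homozygous variant"
    if len(called) == 2 and ref_base in called:
        # Two overlapping peaks, one of them the reference base.
        return "heterozygous variant"
    return "ambiguous"


# Toy intensities, not data from the study.
print(call_zygosity({"A": 500, "C": 10, "G": 480, "T": 5}, ref_base="G"))
# heterozygous variant
```

A compound heterozygous status cannot be assigned from one position: it requires two different heterozygous variants in the same gene, ideally confirmed to lie on opposite alleles by segregation analysis.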

Targeted exome sequencing (TES)
Targeted exome sequencing was performed for 4 study participants. The Cev3 clinical-exome panel was used for library preparation and probe capture. Using the Illumina HiSeq X Ten platform, 6800 clinically relevant genes were captured with the preconstructed library to generate 150 bp paired-end reads at 100X sequencing depth. Post-sequencing data processing and variant filtration were performed using in-house UNIX scripts [19]. The quality of the raw data in the FASTQ files was checked and the bad reads were removed using FastQC (version 0.11.5). Reads were aligned to the hg19 reference genome from the UCSC genome browser using the BWA-MEM aligner (version 0.7.12-r1039) (23). PCR-duplicate reads were removed from the aligned reads using Picard MarkDuplicates (version 2.18.24). Single nucleotide polymorphisms (SNPs), point mutations and short indels were then called using the HaplotypeCaller module in GATK (version 4.0). These variants were finally annotated using ANNOVAR [20] to predict whether each mutation was silent, missense or nonsense.
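The processing steps above can be sketched as a sequence of command lines. The snippet below only builds the commands as strings and executes nothing; it is a minimal sketch, not the in-house scripts used in the study. File names, the thread count, the read-group values, the `picard.jar` path, the `humandb/` ANNOVAR database directory and the use of samtools for sorting and indexing are all assumptions introduced for the example.

```python
import shlex


def build_pipeline(sample, ref_fasta, fastq_r1, fastq_r2, threads=4,
                   picard_jar="picard.jar", annovar_db="humandb/"):
    """Return the shell commands for one sample as strings (nothing is run)."""
    # GATK requires read-group tags; these values are placeholders.
    read_group = shlex.quote(f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA")
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    vcf = f"{sample}.vcf"
    return [
        # 1. Quality report on the raw reads.
        f"fastqc {fastq_r1} {fastq_r2}",
        # 2. Alignment to the reference, piped into a coordinate sort
        #    (samtools is an assumption; the source does not name a sorter).
        f"bwa mem -t {threads} -R {read_group} {ref_fasta} {fastq_r1} {fastq_r2}"
        f" | samtools sort -o {sorted_bam} -",
        # 3. Duplicate removal (legacy Picard KEY=value syntax).
        f"java -jar {picard_jar} MarkDuplicates I={sorted_bam} O={dedup_bam}"
        f" M={sample}.dup_metrics.txt REMOVE_DUPLICATES=true",
        # 4. BAM index, needed by the variant caller.
        f"samtools index {dedup_bam}",
        # 5. Variant calling (the reference also needs .fai and .dict files).
        f"gatk HaplotypeCaller -R {ref_fasta} -I {dedup_bam} -O {vcf}",
        # 6. Gene-based annotation against hg19.
        f"table_annovar.pl {vcf} {annovar_db} -buildver hg19 -out {sample}"
        f" -protocol refGene -operation g -vcfinput",
    ]


for cmd in build_pipeline("S1", "hg19.fa", "S1_R1.fastq.gz", "S1_R2.fastq.gz"):
    print(cmd)
```

Keeping the commands as data rather than running them directly makes it easy to log the exact tool invocations per sample before handing them to a shell or workflow manager.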

Variants prioritization
The variants in the ANNOVAR output file were prioritized by applying a stringent filter of minor allele frequency (MAF) less than or equal to 0.1% in 1000 Genomes, ESP, ExAC and gnomAD. Only non-synonymous coding or splice-site variants with a conservation score (GERP) greater than 2.5 and a CADD score greater than 10 were selected. To predict deleteriousness, the variants were further analyzed using in silico tools such as PolyPhen-2, SIFT, MutationTaster, FATHMM and LRT. Finally, the filtered variants were ranked by their association with Stargardt disease using VarElect software [21].
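The thresholds described above translate directly into a filter function. The sketch below assumes each annotated variant has already been parsed into a dictionary; the field names (`maf_1000g`, `func_class`, `gerp`, `cadd` and so on) are hypothetical and do not correspond to actual ANNOVAR column names. Treating a missing population frequency as "rare" is also an assumption of this example.

```python
POPULATION_FIELDS = ("maf_1000g", "maf_esp", "maf_exac", "maf_gnomad")
KEPT_CLASSES = {"nonsynonymous", "splicing"}


def passes_filter(variant, max_maf=0.001, min_gerp=2.5, min_cadd=10.0):
    """Apply the MAF <= 0.1%, GERP > 2.5 and CADD > 10 criteria."""
    mafs = [variant.get(field) for field in POPULATION_FIELDS]
    # A variant absent from a database (None) is treated as rare there.
    rare = all(m is None or m <= max_maf for m in mafs)
    return (
        rare
        and variant["func_class"] in KEPT_CLASSES
        and variant["gerp"] > min_gerp
        and variant["cadd"] > min_cadd
    )


# Toy records, not variants from the study.
variants = [
    {"id": "v1", "func_class": "nonsynonymous", "gerp": 5.1, "cadd": 27.0,
     "maf_1000g": None, "maf_esp": None, "maf_exac": 0.0002, "maf_gnomad": 0.0001},
    {"id": "v2", "func_class": "synonymous", "gerp": 4.0, "cadd": 12.0,
     "maf_1000g": None, "maf_esp": None, "maf_exac": None, "maf_gnomad": None},
    {"id": "v3", "func_class": "splicing", "gerp": 3.2, "cadd": 15.0,
     "maf_1000g": 0.02, "maf_esp": None, "maf_exac": 0.015, "maf_gnomad": 0.018},
]
print([v["id"] for v in variants if passes_filter(v)])  # ['v1']
```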

Conservation analysis and effect of missense mutations in protein stability
Multiple sequence alignment was carried out using the Clustal Omega online tool. The structure of the ABCA4 domain was predicted using the I-TASSER online software (http://zhanglab.ccmb.med.umich.edu/I-TASSER/). The effect of each mutation on the predicted structure was evaluated using the mutation cutoff scanning matrix (mCSM), site-directed mutator (SDM) and DUET servers, which calculate the stability difference score between the wild-type and mutant protein [22].
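Conservation of a residue across species can be quantified from the alignment as the fraction of sequences that share the reference residue in the corresponding column. The following is a minimal sketch under the assumption that the alignment has been loaded as a dictionary of equal-length gapped strings; the function name `column_identity` and the toy sequences are hypothetical and are not ABCA4 sequences.

```python
def column_identity(aligned, ref_name, ref_pos):
    """Fraction of sequences matching the reference residue at ref_pos.

    aligned  : dict mapping species name to aligned sequence ('-' for gaps)
    ref_name : key of the reference sequence (e.g. the human protein)
    ref_pos  : 1-based residue position in the ungapped reference
    """
    if len({len(seq) for seq in aligned.values()}) != 1:
        raise ValueError("aligned sequences must have equal length")
    ref = aligned[ref_name]
    seen = 0  # number of reference residues passed so far
    for col, residue in enumerate(ref):
        if residue == "-":
            continue
        seen += 1
        if seen == ref_pos:
            column = [seq[col] for seq in aligned.values()]
            return sum(r == residue for r in column) / len(column)
    raise IndexError("ref_pos is beyond the end of the reference sequence")


# Toy alignment for illustration only.
toy = {"human": "MK-CLI", "mouse": "MKACLI", "fish": "MR-CLV"}
print(column_identity(toy, "human", 3))  # 1.0
```

A value of 1.0 corresponds to a residue that is identical in every aligned species, which is the sense in which a position is described as fully conserved.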

Disease-causing mutations identified by Sanger sequencing and TES
In the present study, 28 patients with a clinical Stargardt disease-like phenotype were recruited. All affected probands presented with complaints of defective vision or central vision loss in both eyes. A full ophthalmic evaluation was carried out in only 11 of these patients, who were taken forward for phenotype classification (Table 1) and segregation analysis (Additional file 1: Table S2). The disease progression of STGD1 was categorized by our clinicians based on fundus imaging (Fishman's classification) [23] and ERG grouping [24] (Fig. 1). Of the total 11 probands, 27% were diagnosed with stage-1 disease, 36% were categorized as According to full-field ERG, 27% of probands belonged to group-1 as well as group-2 and 45% were categorized as group-3. SD-OCT findings included RPE thinning, IS-OS loss/disruption, outer retinal thinning and macular atrophy. These phenotypes were commonly observed in all probands. Case ID 22 showed a bull's eye maculopathy-like fundus, but the OCT findings were similar to those of the other probands. The study adopted two methods. First, 24 samples were screened by Sanger sequencing (Fig. 2a-b); targeted exome sequencing was then carried out to look for disease-associated variants in other STGD-related genes such as ELOVL4, CNGB3 and PROM1. TES revealed disease-causing mutations only in ABCA4 (Fig. 2c-d), whereas non-pathogenic variants were observed in the clinically relevant STGD genes ELOVL4, CNGB3 and PROM1 (Additional file 1: Table S1). These results narrowed our search exclusively to ABCA4 in the affected STGD patients.

Modeling of ABCA4-ECD1 domain and prediction of protein stability for novel missense variants
Multiple sequence alignment was performed for the two novel missense variants across six different species. Both residues (p.C519F; p.I73F) were found to be 100% conserved (Fig. 3a). The structure of the ABCA4 exo-cytoplasmic domain (ECD-1; position 43-646) was then predicted using the I-TASSER tool. Modeling templates were retrieved from LOMETS (LOcal MEta-Threading-Server), and the Protein Data Bank (PDB) model 5XJY was chosen as the template for predicting protein stability. Protein stability was assessed from the amino acid change in the conserved region of the ECD-1 domain. The server results (mCSM, SDM and DUET) indicated that the missense mutations destabilize the ECD-1 region, as reflected by negative values of the change in Gibbs free energy [22] (Table 3). Wild-type and mutant residues were viewed using PyMOL version 2.3 (Fig. 3b).

Discussion
The present study identified ABCA4 mutations in a South Indian population with a clinical phenotype of STGD1 disease using a combination of Sanger sequencing and clinical exome sequencing. The rate of homozygous variants identified in the population using the above-mentioned methods was 75% (21/28). Owing to the small sample size and the allelic heterogeneity of ABCA4 mutants, it was not possible to establish a correlation between the genetic data and the clinical phenotypic features of STGD1-affected patients. Notably, the sequence analysis revealed missense, nonsense and compound heterozygous mutations involved in the pathogenesis of STGD1. This study further contributes to understanding the spectrum of ABCA4 mutations in South Indian patients with STGD1 disease.
Sanger sequencing, a cost-effective approach, was adopted for precise molecular diagnosis. However, despite its accuracy, seven cases remained inconclusive. Two of the seven patients showed benign variants, rs3112831 [35] (Case ID: 1) and rs142673376 (Case ID: 16), and the remaining five patients (Case IDs: 3, 7, 12, 15, 23) were negative for a disease-causing mutation in ABCA4. The unsolved cases and the cases harboring benign variants may be explained by the following factors: (i) clinical overlap may reflect distinct genetics, so that other STGD candidate genes (e.g., ELOVL4, PROM1, CNGB3) play a role in disease progression; (ii) mutations in deep intronic regions of ABCA4 could be a cause of the typical STGD phenotype.
Previous studies reported a common hypomorphic allele of the ABCA4 gene explaining the missing heritability in autosomal recessive disorders [36,37]. In our cases, a hypomorphic allele, rs1801581 (c.G2828A, p.R943Q), was identified in 25% (7/28) of STGD1 subjects; it is reported to have a global minor allele frequency (GMAF) of 0.01538 in the healthy population. An in vitro assay demonstrated the pathogenicity of the variant (p.R943Q), which had a minimal effect on nucleotidase activity and on nucleotide binding affinity [38]. This variant could be pathogenic only in trans allele condition to moderate the disease severity in STGD1 cases (IDs: 5 & 14), who possessed a disease-causing   [39] was associated with heterozygous mutation in case ID: 6; which might be responsible for the late onset of phenotype expression in STGD1. Two missense mutations (p.C519F; p.I73F), observed in case ID: 10 and case ID: 25, were not previously reported in the population databases. Multiple sequence alignment of the human (Homo sapiens) ABCA4 protein with the ABCA4 protein region of other species revealed that the cysteine and isoleucine residues are highly conserved across species, suggesting that the mutated region may play a role in the structural stability of the ABCA4 protein. The ABCA4 protein consists of two transmembrane domains (TMD) and two nucleotide binding domains (NBD) arranged in non-identical tandem halves (TMD1-NBD1-TMD2-NBD2), which are separated by exo-cytoplasmic domains (ECDs) [10]. Both novel mutations occurred in the large exo-cytoplasmic domain 1 (ECD-1), which is involved in the substrate translocation process through its highly mobile hinge domains [40].
Several reports showed that the frequency of the common disease-causing variant (c.5882G > A; p.G1961E) was high in different ethnic cohorts, including Somali [41], Italian [42] and Indian populations [12,34]. Patients carrying this variant (homozygous or compound heterozygous) were clinically classified as having a moderately severe or late-onset disease phenotype [33]. However, in vitro studies revealed severe dysfunction due to this missense variant [11]. In the current study, fundus imaging of the patients carrying this variant (Case IDs: 19, 25), who had early-onset disease, revealed stage III and IV disease. Further, ERG indicated cone-rod dysfunction. Similarly, case ID: 13, who harbored the homozygous p.G1961E variant, had vision problems (BCVA 20/200 in BE) from 26 years of age (clinical images not available).
This study described two missense mutations, p.G396C and p.A967V, for the first time in association with STGD1 in a South Indian population. In addition, two more disease-causing variants (p.Y665Ter, p.T1277M) were observed, consistent with previous reports in an Indian population [31,33].

Conclusions
In conclusion, the clinical and genetic analysis of 28 unrelated patients of South Indian origin with an STGD-like phenotype revealed diverse variants in ABCA4. However, the identified allelic heterogeneity was inconsistent with an earlier report [12], and it hinders the establishment of a genotype-phenotype correlation. Sanger sequencing is considered a gold standard method for identifying the variants underlying monogenic Mendelian disorders. Hence, this method was used to determine the disease-causing variants in the candidate gene ABCA4, which is associated with STGD1. To widen our knowledge, a high-throughput sequencing approach, targeted exome sequencing, was adopted to understand the genetic heterogeneity underlying our STGD1 phenotype. Owing to the small number of samples and the lack of clinical data, we were not able to explore the distinct genetics of the STGD phenotype.
The prevalence of STGD remains to be investigated in the Indian population. In addition, the frequency of ABCA4 mutations is poorly understood in our cohort. Therefore, this preliminary study contributes to knowledge of the allelic diversity and mutation rate of ABCA4 in a South Indian population.
Additional file 1: Table S1. List of non-pathogenic variants identified in STGD patients (ID: 25,26,27,28) by Targeted exome sequencing. Table S2. Segregation analysis of 11 unrelated probands. Segregation analysis was performed for parents of 11 unrelated probands; ß Consanguinity in parents; * Consanguinity in previous generation; # Non consanguinity in parents; † Genetic analysis was performed for affected sibling.