
Application of generative adversarial networks (GAN) for ophthalmology image domains: a survey

Abstract

Background

Recent advances in deep learning techniques have led to improved diagnostic abilities in ophthalmology. A generative adversarial network (GAN), which consists of two competing types of deep neural networks, including a generator and a discriminator, has demonstrated remarkable performance in image synthesis and image-to-image translation. The adoption of GAN for medical imaging is increasing for image generation and translation, but it is not familiar to researchers in the field of ophthalmology. In this work, we present a literature review on the application of GAN in ophthalmology image domains to discuss important contributions and to identify potential future research directions.

Methods

We performed a survey on studies using GAN published before June 2021 only, and we introduced various applications of GAN in ophthalmology image domains. The search identified 48 peer-reviewed papers in the final review. The type of GAN used in the analysis, task, imaging domain, and the outcome were collected to verify the usefulness of the GAN.

Results

In ophthalmology image domains, GAN can perform segmentation, data augmentation, denoising, domain transfer, super-resolution, post-intervention prediction, and feature extraction. GAN techniques have established an extension of datasets and modalities in ophthalmology. GAN has several limitations, such as mode collapse, spatial deformities, unintended changes, and the generation of high-frequency noises and artifacts of checkerboard patterns.

Conclusions

The use of GAN has benefited the various tasks in ophthalmology image domains. Based on our observations, the adoption of GAN in ophthalmology is still in a very early stage of clinical validation compared with deep learning classification techniques because several problems need to be overcome for practical use. However, the proper selection of the GAN technique and statistical modeling of ocular imaging will greatly improve the performance of each image analysis. Finally, this survey would enable researchers to access the appropriate GAN technique to maximize the potential of ophthalmology datasets for deep learning research.

Background

Over the past decades, ophthalmologic images have gained a huge interest because of their importance in healthcare to prevent blindness due to ocular diseases and to decrease their socioeconomic burden worldwide [1]. In particular, the widespread availability of fundus photography and optical coherence tomography (OCT) provides an opportunity for early detection of diabetic retinopathy, age-related macular degeneration, and glaucoma [2]. However, there are still many constraints on the use of data-driven artificial intelligence (AI) models in ophthalmology. Ocular imaging data have been expanding globally [3], but there is a shortage of high-quality images and pathological data from patients to train AI models [4]. In addition, there are many diagnostic ocular imaging modalities such as color fundus photography, retinal OCT, ultra-widefield fundus photography, retinal angiography, ultrasonography, corneal tomography, and anterior segment OCT. Because imaging techniques differ from each other in terms of structural complexity and image dimensionality, the need for cross-modality image processing methods has increased to improve the disease prediction models.

In recent years, generative adversarial network (GAN) has become the technique of choice for image generation and translation in the field of medical imaging [5]. GAN, a type of deep learning model introduced by Goodfellow et al. [6], can automatically synthesize medical images by learning the mapping function from an arbitrary distribution to the observed data distribution, that is, by extracting mathematical relationships from data distributions to match input data to output data. As deep learning requires more data to build more accurate models, the medical research community requires more databases from various imaging modalities such as computed tomography (CT) and magnetic resonance imaging (MRI) [7]. Accordingly, many experiments have been performed to demonstrate the benefit of using GAN, which can generate realistic synthetic images. Previous publications in radiology have shown that the applications of GAN include data augmentation, denoising, super-resolution, domain transfer between modalities, and segmentation [8]. They demonstrated that GAN can improve deep learning models for pathological conditions and can support clinicians in diagnosis using complex medical images. While medical image processing using GAN has been actively studied in radiology [9], there have been relatively few studies using GAN in ophthalmology image domains.

A previous literature review showed that the adoption of GAN for medical imaging is increasing rapidly [10], but it is not familiar to researchers in the field of ophthalmology. Since ophthalmology imaging techniques are becoming more important for diagnosing ocular diseases, the use of GAN will gradually increase to achieve a more accurate diagnosis. In this work, we present a literature review on the application of GAN in ophthalmology image domains to discuss important contributions and identify potential future research directions. This paper attempts to guide future research on image processing in the ophthalmic domain through the proper application of GAN techniques.

Review

Overview

We detail the studies using GAN for ophthalmology image domains from the available literature and summarize how GAN was utilized in the field of ophthalmology imaging. The type of GAN used in the analysis, task, imaging domain, and outcome of the application of GAN were collected to verify its usefulness. We searched for potentially relevant literature in the PubMed, Embase, and Google Scholar databases using the following search strategy: ((generative adversarial network) OR (GAN) OR (deep generative model)) AND ((ophthalmology) OR (diabetic retinopathy) OR (age-related macular degeneration) OR (fundus photography) OR (optical coherence tomography)). The initial selection of studies was based on titles and abstracts. We included only peer-reviewed articles published before June 2021. Articles that did not contain original research using GAN were excluded. Only articles written in English were included, and studies without a specific description of the GAN model were excluded. In the case of multiple publications of the same research, we regarded them as one study. The initial search yielded 855 articles. Of these, 806 were removed because their studies or manuscripts were not related to either GAN or ophthalmology images. A further one was excluded because it was a duplicate of another study in the list or was not a research article. Finally, 48 articles were included in the final literature review. To limit bias in the search, additional searches were performed using other common ocular diseases (such as dry eye, conjunctivitis, cataract, glaucoma, retinal vein occlusion, central serous chorioretinopathy, and strabismus) and other imaging modalities found in the previous search. However, no additional research articles were obtained.

Since a comprehensive list of ophthalmology image datasets has been provided in previous studies [3, 11], we found no reason to discuss ocular image datasets in this work. Traditional data augmentation refers to an increase in the number of training examples through the rotation, flipping, cropping, translation, and scaling of existing images to improve the performance of deep learning models, and can also be used to train GAN models. Generally, most GAN techniques rely on large amounts of annotated data, although several studies have targeted data augmentation in a small amount of pathological data.
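
As a minimal sketch of the traditional augmentation operations listed above (rotation, flipping, cropping, translation, and scaling), the following dependency-free Python example applies each of them to a tiny grayscale image stored as a list of rows. The function names and the toy image are illustrative assumptions and do not reproduce the pipeline of any reviewed study; in practice an image library would normally be used.

```python
# Illustrative sketch only: images are lists of rows of pixel values.

def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def crop(img, top, left, height, width):
    """Cut out a height x width window whose corner is at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]

def translate_right(img, shift, fill=0):
    """Shift pixels right by `shift` columns (0 <= shift <= width), padding with `fill`."""
    return [[fill] * shift + row[:len(row) - shift] for row in img]

def scale_up(img, factor):
    """Nearest-neighbour up-scaling by an integer factor."""
    return [[px for px in row for _ in range(factor)]
            for row in img for _ in range(factor)]

def augment(img):
    """Return one augmented variant per operation."""
    return [
        rotate90(img),
        flip_horizontal(img),
        crop(img, 0, 0, len(img) - 1, len(img[0]) - 1),
        translate_right(img, 1),
        scale_up(img, 2),
    ]

if __name__ == "__main__":
    toy_image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    for variant in augment(toy_image):
        print(variant)
```

Each variant reuses pixel values that already exist in the source image; no new image content is synthesized.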

GAN techniques

This section provides the general concepts of GAN for analyzing ophthalmology images, especially the architectures most frequently encountered in the literature reviewed. The basic structure of a GAN is called a vanilla GAN [6]. The architecture of the vanilla GAN consists of two separate deep learning models: a generator, which synthesizes candidate samples based on the data distribution of the original dataset, and a discriminator, which tries to distinguish the synthesized candidate samples from the real samples of the original dataset. These two modules are trained simultaneously: the gradient information is back-propagated to the generator to improve its ability to synthesize realistic images and to the discriminator to improve its ability to discriminate real from fake images. After vanilla GAN was introduced, GAN was highlighted because of its ability to generate realistic synthetic images based on the original dataset. Figure 1 illustrates the structure of the vanilla GAN and the retinal images generated. In the example of retinal image synthesis, vectors of the latent space are randomly selected initially; however, after training, they obtain an appropriate functional relationship with the generated image. Randomly generated images and original real retinal images are classified by the discriminator, and this result is back-propagated and reflected in the training of both the generator and the discriminator. Finally, the desired outcome after training the GAN is that the pixel distributions of the generated retinal images approximate the distribution of the real original retinal images. According to the original paper describing GAN, the generator is like a team that produces counterfeit money and wants to use it without being detected, and the discriminator is like the police detecting counterfeit money. The competition between the two teams results in better counterfeiting [6].

After image generation using GAN was popularized, novel GAN techniques were constantly developed in the machine learning community. In this review, we found that most studies on ophthalmology images have used progressively growing GAN (PGGAN), conditional GAN, Pix2pix, and cycle-consistent GAN (CycleGAN).

Fig. 1

An illustration of a basic architecture of GAN (vanilla GAN) for retinal image synthesis. The generator transforms a noise vector \(z\) from the distribution \(p(z)\) into a synthesized retinal image \({x}_{g}\). The discriminator distinguishes the synthetic and real retinal images based on the distributions of \({x}_{g}\) and \({x}_{r}\), respectively. The generated image samples form a distribution \({p}_{g}(x)\), which is desired to be an approximation of \({p}_{r}(x)\) from real image sample, after successful training
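
For reference, the adversarial training illustrated in Fig. 1 is commonly written as the minimax objective of the original GAN formulation [6]. Here \(G\) and \(D\) denote the generator and the discriminator, and \(D(x)\) is the estimated probability that \(x\) is a real image; these two symbols are notation introduced here, while \(z\), \(p(z)\), and \({p}_{r}(x)\) follow the caption above:

\[ \min_{G}\,\max_{D}\; \mathbb{E}_{x\sim {p}_{r}(x)}\left[\log D(x)\right] + \mathbb{E}_{z\sim p(z)}\left[\log\left(1 - D(G(z))\right)\right] \]

The discriminator is trained to increase both terms, whereas the generator is trained to decrease the second term by making \(G(z)={x}_{g}\) harder to tell apart from real samples.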

Currently, vanilla GAN is not widely used because of its low-quality outputs and instability during training. However, it has been the basis of recent GAN variant techniques (Table 1). A deep convolutional GAN (DCGAN) is based on the vanilla GAN by replacing the building block with fully convolutional layers [10]. Wasserstein GAN is an improved version of the vanilla GAN that uses a metric of the distance between two probability distributions (Wasserstein distance) as a loss function [12]. PGGAN is an extension of the vanilla GAN with a progressively growing generator and discriminator to generate realistic high-resolution images. The main concept of PGGAN is to build generators and discriminators, starting from a low-resolution to a high-resolution network. The newly added layers model fine-grained details as the training progresses. As the images are generated from a random noise vector, the original PGGAN cannot generate new instances with objects in the desired condition. Additionally, StyleGAN is a variant of PGGAN that adds the style transfer function in a conditional setting to the architecture of PGGAN [13]. Style and feature changes in synthetic images can be performed using an additional mapping network for the latent space in the generator of StyleGAN.
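
To make the Wasserstein loss mentioned above more concrete, the following minimal sketch computes the critic and generator losses from lists of critic scores. It assumes that the critic (the Wasserstein counterpart of the discriminator) outputs an unbounded real-valued score per image; the function names are illustrative, and the constraint that keeps the critic approximately 1-Lipschitz (for example weight clipping or a gradient penalty) is deliberately omitted.

```python
# Illustrative sketch only: scores are plain Python floats produced by a critic.

def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

def critic_loss(real_scores, fake_scores):
    """Loss minimized by the critic: it pushes real scores up and fake scores down."""
    return mean(fake_scores) - mean(real_scores)

def generator_loss(fake_scores):
    """Loss minimized by the generator: it pushes the scores of its samples up."""
    return -mean(fake_scores)

def wasserstein_estimate(real_scores, fake_scores):
    """Score gap between real and synthetic batches, used as a distance estimate."""
    return mean(real_scores) - mean(fake_scores)

if __name__ == "__main__":
    real = [1.0, 3.0]   # hypothetical critic scores for real images
    fake = [0.0, 1.0]   # hypothetical critic scores for synthetic images
    print(critic_loss(real, fake), generator_loss(fake), wasserstein_estimate(real, fake))
```

Unlike the log-based loss of the vanilla GAN, this quantity is not a probability; it is only meaningful as a distance estimate when the critic is properly constrained.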

Table 1 The characteristics of typical GAN variant techniques and examples of general tasks in general medicine and ophthalmology fields

In many cases, synthetic images should be generated with the desired properties to adopt GAN for medical purposes. A conditional GAN is an extended architecture of vanilla GAN, where both the generator and discriminator are trained using not only the original dataset but also additional conditioning variables [14]. To achieve good image generation performance in multiple domains, researchers have modified the generators of conditional GAN in various deep learning architectures. Currently, conditional GAN includes many types of GAN models because the condition variable can be any variable including a single status variable [15], images of the same or different domains [16], masked images [17], and guided heatmap images [18]. If the conditional variable is set as an image, the training dataset should commonly contain aligned image pairs. The most widely used form of conditional GAN is Pix2pix, which contains an image-to-image translation framework [19]. Instead of using a conventional encoder-decoder as a generator, Pix2pix adopts a U-Net-like architecture with skip connections to generate synthetic images from the input images. In Pix2pix, the discriminator is used at the local image patch level to improve the performance. The GAN architecture for a super-resolution task (SRGAN) was developed by adopting a conditional GAN and perceptual loss [20]. However, conditional GAN models, including Pix2pix, have a critical disadvantage in that the shortage of paired datasets restricts their application to real problems.
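
The following sketch illustrates how a Pix2pix-style generator loss can combine a patch-level adversarial term with a pixel-wise L1 term computed against the paired target image. It is a simplified illustration under stated assumptions: images are flattened lists of pixel values, `patch_probs` stands for the per-patch "real" probabilities returned by a hypothetical patch discriminator, and the weight `lam=100.0` follows the value reported in the Pix2pix paper [19] but should be treated as a tunable parameter.

```python
import math

# Illustrative sketch only: images are flattened lists of pixel values in [0, 1].

def patch_adversarial_loss(patch_probs, eps=1e-12):
    """Generator-side cross-entropy: every patch should be judged real (target 1)."""
    return -sum(math.log(max(p, eps)) for p in patch_probs) / len(patch_probs)

def l1_loss(generated, target):
    """Mean absolute pixel difference between the output and its paired target."""
    return sum(abs(g - t) for g, t in zip(generated, target)) / len(generated)

def pix2pix_generator_loss(patch_probs, generated, target, lam=100.0):
    """Adversarial term plus lam times the L1 reconstruction term."""
    return patch_adversarial_loss(patch_probs) + lam * l1_loss(generated, target)

if __name__ == "__main__":
    probs = [0.9, 0.6, 0.8, 0.7]       # hypothetical per-patch discriminator outputs
    output = [0.2, 0.4, 0.9, 0.1]      # hypothetical generated pixels
    paired = [0.2, 0.5, 1.0, 0.1]      # pixels of the aligned target image
    print(pix2pix_generator_loss(probs, output, paired))
```

The L1 term is only defined when an aligned target exists for every input, which is why this family of models depends on paired datasets.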

Recently, CycleGAN was developed to generate images without matching paired images [21]. CycleGAN involves the simultaneous training of two generators and two discriminators. The CycleGAN adopts a cycle consistency, which is based on the idea that the output of the first generator can be used as input to the second generator, and the output of the second generator should be like the original image. This cycle consistency allows CycleGAN to learn the characteristics of the two image domains to transfer the domains without any paired dataset. The weights for the training parameters of CycleGAN modules can be tuned depending on the image domain or task. CycleGAN can perform denoising by mapping clean and noisy domains from unpaired training data [22]. Currently, variants of CycleGAN, such as StarGAN [23] and its variants [24], have been introduced to achieve high performance in a multiple domain transfer problem. The characteristics of typical GAN techniques and examples of general tasks in general medicine (especially radiology) and ophthalmology fields are summarized in Table 1. It should be noted that there are several cases where it is not classified as a specific type of GAN because the custom architectures of GAN were commonly designed for each imaging domain.
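
The cycle consistency idea can be sketched as follows. The example assumes two generator functions, here toy stand-ins that brighten or darken a flattened image, and measures how well an image is recovered after a round trip through both generators. All names are hypothetical, and the adversarial terms of the two discriminators are omitted for brevity.

```python
# Illustrative sketch only: images are flattened lists of pixel values.

def l1_distance(a, b):
    """Mean absolute difference between two images of equal size."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(gen_ab, gen_ba, batch_a, batch_b):
    """Round-trip error A -> B -> A plus round-trip error B -> A -> B."""
    forward = sum(l1_distance(gen_ba(gen_ab(a)), a) for a in batch_a) / len(batch_a)
    backward = sum(l1_distance(gen_ab(gen_ba(b)), b) for b in batch_b) / len(batch_b)
    return forward + backward

# Toy stand-ins for trained generators (hypothetical, for illustration only).
def to_domain_b(img):
    return [p + 1.0 for p in img]

def to_domain_a(img):
    return [p - 1.0 for p in img]

if __name__ == "__main__":
    unpaired_a = [[0.0, 0.5], [1.0, 0.25]]   # images from domain A
    unpaired_b = [[1.0, 1.5]]                # unrelated images from domain B
    print(cycle_consistency_loss(to_domain_b, to_domain_a, unpaired_a, unpaired_b))
```

Note that the two batches never need to contain matching images, because each image is only compared with its own reconstruction.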

Applications in ophthalmology

Here, we survey the literature on GAN for ophthalmology image domains. Applications are introduced according to the type of tasks of GAN models, including segmentation (15 studies), data augmentation (11 studies), denoising (8 studies), domain transfer (8 studies), super-resolution (4 studies, two of which overlap with denoising), post-intervention prediction (3 studies), and feature extraction (2 studies). Figure 2 shows examples of the applications of GAN. Figure 3 shows the number of studies based on the tasks of the GAN and image domains. Some studies that handled two image domains were double-counted. Most studies have focused on the generative aspect of GAN, and only two studies, f-AnoGAN [25] and AMD-GAN [26], adopted the discriminative aspect with feature extraction. The survey showed that segmentation was the most studied task for GAN in ophthalmology. Among ophthalmology imaging domains, fundus photography (24 studies) has been most frequently analyzed using GAN in the literature. GAN has also been used in various imaging domains, including retinal OCT (15 studies), retinal angiography (7 studies), ultra-widefield fundus photography (scanning laser ophthalmoscopy, 3 studies), anterior segment OCT (2 studies), periorbital facial image for orbital diseases (1 study), ocular surface image (1 study), corneal topography (1 study), meibography infrared imaging (1 study), and in vivo corneal confocal microscopy (1 study). If one study dealt with two modalities, it was reviewed and double-counted if necessary. Although conditional GAN is most frequently mentioned in the survey, it is difficult to conclude that it has been most widely used because the conditional GAN models refer to a wide variety of deep learning structures for each study. A table detailing the literature is provided for each section of the application.

Fig. 2

Examples of applications of GAN in ophthalmology image domains. a Post-intervention prediction for decompression surgery for thyroid ophthalmopathy [15] and anti-vascular endothelial growth factor (VEGF) therapy for neovascular age-related macular degeneration [66]. b Denoising in fundus photography [53] and peripapillary optical coherence tomography (OCT) [16]. c Super-resolution for optic nerve head photography [56]. d Domain transfer for fundus photography to angiography [62] and ultra-widefield to classic fundus photography (re-analysis in this work) [63]. e Data augmentation for ocular surface images [46] and anterior segment OCT [82]. f Segmentation for corneal subbasal nerves in in vivo confocal microscopy images [37]. Most images were generated according to publicly available datasets and the methods of each study (some cases are based on our own dataset)

Fig. 3

Number of studies that were reviewed in this work grouped according to tasks and image domains. a Study objectives in the application of GAN. b Ophthalmology image domains for the use of GAN. If one study deals with two issues, it was reviewed and double-counted appropriately

Segmentation

Image segmentation is a task where pixels or areas in an image are assigned a category label. Segmentation is the most frequently studied task (15 studies), with a focus on the identification of structures such as retinal vessels, retinal layers, and the optic nerve. Identifying pathological areas on ocular images can help clinicians to diagnose more accurately, and thus segmentation is an important task for developing AI models for medicine. Table 2 shows a summary of the literature review for the segmentation task using GAN.

Table 2 Summary of literature review for image segmentation task using GAN in ophthalmology imaging domains

GAN techniques are typically used to segment retinal vessels from fundus photographs. For decades, retinal vessel segmentation has been a challenging problem in the computer science community because vessels have various widths, colors, tortuosity, and branching. After conditional GAN was applied to this problem [27], many variants of the conditional GAN were proposed by modifying the architectures. In particular, Son et al. [28] improved the conditional GAN using a generator based on U-Net, similar to the Pix2pix architecture. To improve segmentation performance, some studies employed patch-based GAN [29], multi-kernel pooling layers [30], topological structure-constrained models [31], large receptive fields [32], and symmetric equilibrium generators with attention mechanisms [17]. Since most of these studies have been conducted using limited annotated vessel datasets without collaboration with ophthalmologists, validation with real patient data was not performed.

In addition, accurate cup-to-disc ratio calculation based on optic disc segmentation is an important problem for evaluating optic nerve damage and glaucoma. Damage to the optic nerve increases the cup-to-disc ratio, which is difficult to compare with the naked eye if the damage is small. In recent studies, GAN was useful for segmenting the optic disc and cup in fundus photographs using patch-based conditional GAN [33] and Wasserstein GAN [34]. The cup-to-disc ratio, a clinically used measurement to assess glaucoma progression, can be directly calculated from the optic cup and disc segmentation using conditional GAN [35]. This study showed comparable performance in assessing the cup-to-disc ratio for glaucoma screening.
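
Once the optic cup and disc have been segmented, the ratio itself is a simple geometric computation. The sketch below assumes binary masks stored as lists of rows and uses the vertical cup-to-disc ratio, defined here as the vertical extent of the cup divided by that of the disc; both the mask format and the choice of the vertical diameter are assumptions made for illustration, not the procedure of the cited studies.

```python
# Illustrative sketch only: masks are lists of rows containing 0 or 1.

def vertical_diameter(mask):
    """Number of rows spanned by the segmented region (0 if the mask is empty)."""
    rows = [index for index, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0

def vertical_cup_to_disc_ratio(cup_mask, disc_mask):
    """Vertical cup diameter divided by vertical disc diameter."""
    disc = vertical_diameter(disc_mask)
    if disc == 0:
        raise ValueError("disc mask is empty; the ratio is undefined")
    return vertical_diameter(cup_mask) / disc

def square_mask(size, first, last):
    """Toy helper: a size x size mask with a filled square from first to last."""
    return [[1 if first <= r <= last and first <= c <= last else 0
             for c in range(size)] for r in range(size)]

if __name__ == "__main__":
    disc = square_mask(6, 1, 4)   # hypothetical disc spanning 4 rows
    cup = square_mask(6, 2, 3)    # hypothetical cup spanning 2 rows
    print(vertical_cup_to_disc_ratio(cup, disc))
```

Because the ratio depends only on the mask boundaries, small segmentation errors at the cup or disc margin translate directly into errors in the ratio.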

Another application found in the literature is the segmentation of retinal layers in OCT images. Retinal layers consist of the vasculature, neurons, glia, and their connections, and each layer changes differently under pathological conditions. Pix2pix was successfully applied to segment the retinal nerve fiber layer, Bruch’s membrane, and choroid-sclera boundary in peripapillary retinal OCT images [36]. GAN was also applied to evaluate corneal pathological conditions using in vivo confocal microscopy images [37]. In this study, the segmentation of corneal subbasal nerves was achieved using a conditional GAN to detect corneal diseases. The meibomian glands can also be evaluated using a GAN that segments the gland area in meibography infrared images [38]. In this study, the conditional GAN outperformed U-Net and the mask region-based convolutional neural network (Mask R-CNN).

Data augmentation

The development of a machine-learning model requires sufficient data. Imbalanced data is a barrier to model training, and a lack of data is common in many medical problems because of barriers to access and usability [3]. Traditional data augmentation is commonly unable to extrapolate beyond the existing data, which leads to data bias and suboptimal performance of trained models. Many researchers have shown that data augmentation using GAN techniques can provide additional benefit over traditional methods [8]. Recently, GAN techniques have been widely used to synthesize realistic medical images for data augmentation. Here, 11 studies investigated data augmentation for ophthalmology imaging domains (see Table 3). Several studies using GAN have focused on fundus photography and retinal OCT image generation to augment training datasets for machine learning.

Table 3 Summary of literature review for data augmentation task using GAN in ophthalmology imaging domains

In recent years, generating realistic fundus photographs has become a challenging issue. DCGAN was initially used to generate synthetic peripapillary fundus photographs [39]. The machine learning model based on DCGAN showed better diagnostic performance for glaucoma detection than conventional deep learning models. Burlina et al. evaluated the performance of a PGGAN to generate realistic retinal images [40]. In their study, two retinal specialists could not distinguish real images from synthetic images. Zhou et al. used a conditional GAN to generate high-resolution fundus photographs based on structural and lesion mask images [17]. The multi-channel generator technique, which trains multiple GAN models for each feature such as exudates, microaneurysms, and bleeding, was used to augment fundus photographs to overcome the imbalance of data to build a diabetic retinopathy detection model [41].

The generation of synthetic retinal OCT images is another important task for data augmentation in the development of machine-learning models for automated OCT diagnosis. By integrating both normal and pathological data, GAN can generate synthetic OCT images with various pathological grades for data augmentation [42]. Zheng et al. showed that realistic retinal OCT images could be generated using PGGAN, which could improve the classification performance of deep learning models [43]. Data augmentation based on conditional GAN also improved the segmentation performance of retinal OCT images [44]. CycleGAN was applied to OCT data augmentation for rare retinal diseases in a few-shot learning system design [45]. GAN has also been used for data augmentation of anterior segment OCT images for angle-closure glaucoma, ocular surface images for conjunctival disease [46], and corneal topography images for keratoconus detection [47].

Denoising & super-resolution

Image enhancement tasks such as denoising and super-resolution are important because ophthalmology images generally suffer from limitations of the device, the skill of the examiner, variations of ocular anatomy, and transparency of the visual axis. Image quality may affect the diagnostic performance using ocular images although the device and software offer suppression of noise and artifacts. As each imaging domain has characteristic noise and artifacts, several research groups have tried to develop data-driven GAN models tailored to each domain. There are 10 studies investigating denoising or super-resolution for ophthalmology imaging domains (detailed in Table 4).

Table 4 Summary of literature review for image enhancement (denoising and super-resolution) tasks using GAN in ophthalmology imaging domains

To remove speckle noise in retinal OCT images, a conditional GAN model with Wasserstein distance and perceptual loss was proposed [48]. Huang et al. showed that both super-resolution and noise reduction can be performed simultaneously using a conditional GAN [49]. Cheong et al. built DeShadowGAN using manually masked artifact images and conditional GAN with perceptual loss and demonstrated the effectiveness of the model in removing shadow artifacts [16]. Similarly, conditional GAN has also been applied to remove speckle noise in peripapillary retinal OCT [50] and anterior segment OCT [51]. However, image denoising methods using conditional GAN require matched pairs of low- and high-quality images, and such data are typically unavailable in the medical field. Therefore, Das et al. used CycleGAN for super-resolution and noise reduction in retinal OCT images so that unpaired image datasets could be used [52].

Fundus photography has several artifacts and noise, including overall haze, edge haze, arcs, and lashes. In a super-resolution problem, paired training data can be created by artificially reducing the image resolution. In contrast, since it is difficult to collect paired clean and artifact fundus photographs, CycleGAN has been used to improve the image quality of fundus photography [53, 54]. These studies showed that CycleGAN can effectively reduce artifacts to provide clearer retinal images to clinicians. PGGAN with a conditional design was also applied to the super-resolution of fundus photography [55]. Ha et al. adopted SRGAN to generate high-resolution synthetic optic disc images with 4-times up-scaling [56].
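
Because image resolution can be reduced artificially, low- and high-resolution training pairs for super-resolution can in principle be derived from high-resolution images alone. The following minimal sketch does this by average pooling; the image format (a list of rows), the function names, and the use of average pooling are illustrative assumptions, and the default factor of 4 simply mirrors the 4-times up-scaling mentioned above.

```python
# Illustrative sketch only: images are lists of rows of pixel values.

def downsample(img, factor):
    """Average-pool a 2D image by an integer factor along both axes."""
    height, width = len(img), len(img[0])
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by the factor")
    pooled = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            block = [img[r + i][c + j] for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled

def make_training_pair(high_res, factor=4):
    """Return (low-resolution input, high-resolution target) for one image."""
    return downsample(high_res, factor), high_res

if __name__ == "__main__":
    toy = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    low, target = make_training_pair(toy, factor=2)
    print(low)
```

A model trained on such pairs learns to invert this particular degradation, which may differ from the blur and noise of real low-quality acquisitions.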

Domain transfer

Most machine learning studies have performed development and validation using data from the same domain. To build a more generalized machine-learning model, data from different domains might be fused through domain transfer, which is the transfer between different imaging modalities. The domain transfer task of GAN is the cross-modality image synthesis process by which images are generated for one modality based on another. Cross-domain modality using the domain transfer technique has shown the possibility of obtaining additional clinical information without additional examinations [57]. Eight studies using GAN mainly focused on domain transfer for ophthalmology imaging domains (shown in Table 5). Notably, several studies that used image transfer generators were categorized as data augmentation tasks because they focused on image synthesis tasks. The concept of conditional GAN has been used in most studies because the generator must have an image input channel as a conditional variable for domain transfer. Generally, GAN models without conditional inputs can generate an unlimited number of new images by adjusting the latent vector, whereas one input corresponds to one output image in a typical conditional GAN. GAN techniques allow for higher-dimensional image transformation and more realistic outputs than simple pixel-level transformation.

Table 5 Summary of literature review for domain transfer task using GAN in ophthalmology imaging domains

Initially, Costa et al. demonstrated that a conditional GAN can be used to generate realistic fundus photographs guided by masked vessel network images [58]; an autoencoder was used to synthesize new retinal vessel images apart from training the GAN. Zhao et al. also built a conditional GAN model that emphasizes the ability to learn with a small dataset [59]. Extending the conditional GAN, modified Pix2pix synthesized realistic color fundus photographs to enlarge the image dataset based on multiple inputs of the vessel and optic disc masked images [60]. Wu et al. showed that retinal autofluorescence images could be synthesized based on retinal OCT data using a conditional GAN framework [61]. In that study, en-face OCT images from volumetric OCT data were successfully transformed into synthetic autofluorescence images to detect geographic atrophy regions in the retina. Tavakkoli et al. demonstrated that realistic retinal angiography images with diabetic retinopathy were generated via conditional GAN using fundus photographs [62]. Although the model was trained using a limited angiography dataset without detailed phase information, the study showed the potential of image domain transfer for the diagnosis of diabetic retinopathy. Based on CycleGAN, ultra-widefield fundus photography can be transformed into classic color fundus photography to integrate retinal imaging domains [63]. In the opposite direction, a study converting classic fundus photography to ultra-widefield fundus photography via CycleGAN was also reported [64]. Lazaridis et al. demonstrated that time-domain OCT could be converted to spectral-domain OCT using conditional GAN with Wasserstein distance and perceptual loss, showing that the integrated dataset fused by the GAN improved the statistical power of the OCT measurements.

Post-intervention prediction

The aim of post-intervention prediction is to generate an image that explains how the anatomical appearance changes after treatment. Three studies investigated post-intervention prediction tasks for ophthalmology imaging domains (detailed in Table 6). As post-intervention results are represented as images in several medical fields, this task is useful to clinicians and patients to understand how the intervention will affect the prognosis of diseases. However, the included studies have several limitations in terms of short-term follow-up periods for prediction and unstandardized interventions [65]. In addition, attention should be paid to interpreting the results because anatomical prediction after treatment is not necessarily related to functional outcomes such as visual acuity.

Table 6 Summary of literature review for post-intervention prediction task using GAN in ophthalmology imaging domains

Yoo et al. proposed a postoperative appearance prediction model for orbital decompression surgery for thyroid ophthalmopathy using a conditional GAN [15]. Although the experiment was performed at a relatively low resolution, the results show the potential of GAN as a decision support tool for oculoplastic and cosmetic surgeries related to the orbit. Two studies demonstrated that conditional GAN models could predict OCT images after anti-vascular endothelial growth factor (anti-VEGF) injection based on pre-injection OCT images with exudative age-related macular degeneration. Liu et al. showed that the Pix2pix model could generate synthetic post-injection OCT using pre-injection images to estimate the short-term response [66]. Lee et al. designed a conditional GAN with a multi-channel input for anti-VEGF injection [67]. The model was trained using both pre- and post-injection OCT images as well as fluorescein angiography and indocyanine green angiography to predict post-injection OCT.

Feature extraction

Another task that did not belong to these categories was feature extraction, including out-of-distribution detection. This task focuses on the discriminative aspect of GAN because the studies have directly used GAN architectures to detect pathologies. Two studies have used the concept of GAN in ophthalmology image domains (Table 7). Schlegl et al. proposed an anomaly detection method using a GAN in the retinal OCT domain [25]. This GAN model estimated the latent space via inverse mapping learning from the input images and calculated anomaly scores from the feature space of normal samples. This anomaly detection architecture has been successfully extended to other areas, such as industrial anomaly detection or chest lesion detection in X-ray images [10]. Xie et al. built a modified conditional GAN model for ultra-widefield fundus photography to improve the detection of retinal diseases [26]. They used an attention encoder for feature mining in the generator and designed a multi-branch structure in the discriminator to extract image features.
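
The anomaly scoring idea described for the retinal OCT domain can be sketched as follows: an image is encoded into the latent space, reconstructed by the generator, and then compared with the original both in image space and in a feature space. The encoder, generator, and feature extractor below are toy stand-ins, and the weight `kappa` and all function names are assumptions made for illustration; this is not the implementation of [25].

```python
# Illustrative sketch only: images are flattened lists of pixel values.

def mean_squared_error(a, b):
    """Mean squared difference between two equally sized vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def anomaly_score(image, encoder, generator, features, kappa=1.0):
    """Image-space residual plus kappa times the feature-space residual."""
    reconstruction = generator(encoder(image))
    image_residual = mean_squared_error(image, reconstruction)
    feature_residual = mean_squared_error(features(image), features(reconstruction))
    return image_residual + kappa * feature_residual

# Toy stand-ins for models trained on normal samples only (hypothetical).
def toy_encoder(image):
    return [sum(image) / len(image)]        # latent code: mean intensity

def toy_generator(code, size=4):
    return [code[0]] * size                 # reconstructs a uniform image

def toy_features(image):
    return [max(image), min(image)]         # crude two-value feature vector

if __name__ == "__main__":
    normal = [0.5, 0.5, 0.5, 0.5]
    lesion = [0.0, 0.0, 0.0, 1.0]
    for sample in (normal, lesion):
        print(anomaly_score(sample, toy_encoder, toy_generator, toy_features))
```

Because the generator has only learned to reproduce normal appearance, images that it cannot reconstruct well receive a higher score.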

Table 7 Summary of literature review for feature extraction task using GAN in ophthalmology imaging domains

Other applications

Here, we address several studies that did not fit our search criteria but whose applications are noteworthy. A localization task refers to the identification of a region of interest rather than pixel-level segmentation. Zhang et al. performed retinal pathology localization in fundus photography using CycleGAN [68]. In this localization task, the pathological area was detected by subtracting the synthesized normal image from the pathological image.
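The subtraction step can be sketched as follows, assuming the synthesized normal image is already available and pixel-aligned with the input. The function names, the fixed threshold, and the 4x4 toy images are illustrative assumptions and are not taken from [68].

```python
from typing import List, Optional, Tuple

Image = List[List[float]]


def difference_map(pathological: Image, synthesized_normal: Image) -> Image:
    """Absolute pixel-wise difference between the input and its synthesized normal version."""
    return [
        [abs(p - n) for p, n in zip(row_p, row_n)]
        for row_p, row_n in zip(pathological, synthesized_normal)
    ]


def bounding_box(diff: Image, threshold: float) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of pixels whose difference exceeds the threshold."""
    hits = [
        (r, c)
        for r, row in enumerate(diff)
        for c, value in enumerate(row)
        if value > threshold
    ]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows), max(cols)


# Toy 4x4 example: a bright "lesion" occupies rows 1-2, columns 2-3.
pathological = [
    [0.2, 0.2, 0.2, 0.2],
    [0.2, 0.2, 0.9, 0.8],
    [0.2, 0.2, 0.7, 0.9],
    [0.2, 0.2, 0.2, 0.2],
]
synthesized_normal = [[0.2] * 4 for _ in range(4)]

region = bounding_box(difference_map(pathological, synthesized_normal), threshold=0.3)
```

The result is a coarse region of interest rather than a segmentation mask, which matches the distinction drawn above between localization and segmentation.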

Image registration is another task, which finds the geometric transformation that structurally aligns images. It is also important in the automated analysis of multimodal images, such as in domain transfer. For example, a dataset for conditional GAN requires image registration to produce aligned image pairs for successful training. Mahapatra et al. showed that an autoencoder based on GAN architecture provided better registration performance for fundus photography and retinal angiography images [69].

Current limitations of GAN techniques

We found that GAN has several limitations that researchers should take note of (Fig. 4). First, mode collapse, in which the generator keeps producing the same or very similar outputs, is a well-known problem of GAN [70]. To avoid this failure caused by a model stuck in a local minimum, more varied training data or additional data augmentation techniques are needed. Many GAN models were trained with no guarantee of convergence. Second, spatial deformities frequently occur when there are small training images without spatial alignment. In particular, in domain transfer using conditional GAN, obtaining paired images with structural and spatial alignment is challenging and requires additional image registration as a preprocessing step to obtain high-quality medical images [57, 63]. Third, unintended changes could occur in image-to-image translation because of the different data distributions of the structural features between the two image domains. For example, if only one domain contains many images with glaucoma in a domain transfer task, CycleGAN can produce glaucomatous changes during image synthesis. Noise can act as an unintended outlier or confounder, and in several GAN models the generator can produce unintended fake features from noise [26]. Fourth, high-frequency noise and checkerboard artifacts are often detected in images synthesized by a generator with deconvolution [53]. Novel techniques have been developed to reduce noise and artifacts [71].

Fig. 4
Examples of problems encountered using GAN techniques. a Mode collapse where the generator produces limited varieties of samples. b Spatial deformity due to small training images without spatial alignment. c Unintended changes due to the difference of data distribution between two domains. d Checkerboard artifacts in synthetic images. All of the images were generated using publicly available datasets and standard GAN methods.
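One commonly cited explanation for the checkerboard patterns produced by deconvolution (transposed convolution) is uneven overlap: when the kernel size is not divisible by the stride, some output positions receive contributions from more input positions than others. The sketch below counts these contributions for a one-dimensional transposed convolution without padding. It is an illustrative simplification under those assumptions and is not drawn from the studies cited above.

```python
from typing import List


def overlap_counts(input_length: int, kernel_size: int, stride: int) -> List[int]:
    """Count how many input positions contribute to each output position
    of a 1-D transposed convolution (no padding)."""
    output_length = (input_length - 1) * stride + kernel_size
    counts = [0] * output_length
    for i in range(input_length):
        for k in range(kernel_size):
            counts[i * stride + k] += 1
    return counts


# Kernel size 3 is not divisible by stride 2: interior counts alternate.
uneven = overlap_counts(input_length=5, kernel_size=3, stride=2)
# Kernel size 4 is divisible by stride 2: interior counts are uniform.
even = overlap_counts(input_length=5, kernel_size=4, stride=2)
```

Here `uneven` is `[1, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1]`, an alternating pattern that becomes a checkerboard in two dimensions, whereas `even` is uniform away from the borders. Choosing a kernel size divisible by the stride, or upsampling by interpolation followed by an ordinary convolution, are the usual ways to reduce this effect.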

GAN and its variants generally consist of two or more deep learning modules, for example, two generators and two discriminators in CycleGAN, and thus training GAN tends to be unstable compared to a single deep learning module [15]. A problem of vanishing gradients may also occur if the discriminator performs well, and the generator learns too slowly. Therefore, tuning the hyperparameters is sometimes important, and training can be stopped early to obtain better synthetic images. However, the occurrence of these problems depends on the amount of data and the distribution of embedded pixels and is unpredictable.
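The vanishing-gradient effect can be illustrated numerically, assuming the original minimax formulation [6] in which the generator minimizes log(1 - D(G(z))) and D is a sigmoid applied to a logit. The sketch compares the gradient of that term with respect to the logit against the gradient of the commonly used alternative, -log D(G(z)). The function names and the example logit are illustrative assumptions.

```python
import math


def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))


def saturating_grad(logit: float) -> float:
    """d/d(logit) of log(1 - sigmoid(logit)), the generator term of the minimax loss."""
    return -sigmoid(logit)


def non_saturating_grad(logit: float) -> float:
    """d/d(logit) of -log(sigmoid(logit)), the alternative generator loss."""
    return -(1.0 - sigmoid(logit))


# A strongly negative logit means the discriminator confidently rejects the sample.
confident_reject = -10.0
weak_signal = saturating_grad(confident_reject)
strong_signal = non_saturating_grad(confident_reject)
```

When the discriminator rejects generated samples with high confidence, the first gradient is close to zero, so the generator receives almost no learning signal, while the second remains close to -1. This is one reason hyperparameter choices and the balance between the two networks matter in practice.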

In our experience, most of the problems from training GAN models are solved to some extent by increasing the amount of clinical data extracted from a variety of patients if the technique is appropriately selected. Although GAN is widely used for data augmentation, several previous studies using GAN for image-to-image translation also suffered from small amounts of data, similar to other deep learning algorithms [45]. As novel algorithms are emerging to solve these problems [72], GAN will eventually be easy to use for image processing in ophthalmology.

Additionally, there is no standard metric for evaluating the performance of GAN for realistic image synthesis [73]. Evaluation currently relies on subjective judgments from researchers and clinicians relevant to ophthalmology imaging [40]. Previous studies adopted classic image similarity indices, such as mean squared error, mean absolute error, and structural similarity index [15], but these are unable to evaluate the realism of synthetic images. Several studies have shown an improvement in the diagnostic performance of machine learning after GAN-based data augmentation [43, 46], but this does not guarantee that GAN produces realistic images. This problem arises in the application of GAN not only in ophthalmology but also in all medical areas [10] and will continue to be a drawback of using GAN.
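For reference, the classic similarity indices mentioned above can be sketched as follows. The structural similarity index is computed here over a single global window, which is a simplification of the usual sliding-window definition; the constants follow the common convention of (0.01 L)^2 and (0.03 L)^2 for a data range L. The toy images are assumptions for illustration.

```python
from typing import List

Image = List[List[float]]


def _flatten(image: Image) -> List[float]:
    return [value for row in image for value in row]


def mse(a: Image, b: Image) -> float:
    """Mean squared error between two images of the same shape."""
    x, y = _flatten(a), _flatten(b)
    return sum((p - q) ** 2 for p, q in zip(x, y)) / len(x)


def mae(a: Image, b: Image) -> float:
    """Mean absolute error between two images of the same shape."""
    x, y = _flatten(a), _flatten(b)
    return sum(abs(p - q) for p, q in zip(x, y)) / len(x)


def global_ssim(a: Image, b: Image, data_range: float = 1.0) -> float:
    """Single-window SSIM over the whole image (simplified; no sliding window)."""
    x, y = _flatten(a), _flatten(b)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((p - mean_x) ** 2 for p in x) / n
    var_y = sum((q - mean_y) ** 2 for q in y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y)) / n
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    return ((2 * mean_x * mean_y + c1) * (2 * cov + c2)) / (
        (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    )


reference = [[0.1, 0.4], [0.6, 0.9]]
shifted = [[0.2, 0.5], [0.7, 1.0]]  # same structure, uniformly brighter by 0.1
```

All three indices compare a synthetic image against a specific reference image pixel by pixel. They therefore measure agreement with a target, not whether an image looks plausible to a clinician, which is why they cannot stand in for an assessment of realism.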

Discussion

We surveyed the literature relevant to GAN in ophthalmology image domains to guide future studies on image processing in ocular images. To our knowledge, this work is the first comprehensive literature review on the use of GAN techniques in ophthalmology image domains. A review of GAN in ophthalmology was recently reported, but its scope was limited to image synthesis in fundus photography and OCT [73]. GAN research has thrived in the medical field because machine learning requires large amounts of data to achieve a more accurate diagnosis. In this review, we highlighted the various uses of GAN: it can perform segmentation, data augmentation, denoising, domain transfer, super-resolution, post-intervention prediction, and feature extraction. The number of publications relevant to this field has also grown consistently as GAN techniques have become popular among researchers. We found that GAN can be applied to most ophthalmology image domains in the literature. As imaging plays a crucial role in ophthalmology, GAN-based image synthesis and image-to-image translation will be highly valuable in improving the quantitative and personalized evaluation of ocular disorders. Despite the increasing use of GAN techniques, we also found that they face challenges in adaptation to clinical settings.

A recent paper suggested that the utility of GAN in image synthesis is unclear for ophthalmology imaging [73]. However, GAN techniques have shown better performance in the fields of radiology and pathology than other generative deep learning models, such as autoencoders, fully convolutional networks (FCNs), and U-Net [74, 75]. In the anomaly detection task for retinal OCT, GAN models including AnoGAN, Pix2pix, and CycleGAN outperformed a traditional autoencoder model, which simply learns latent coding of unlabeled image data [76]. FCN and U-Net are well-established deep generative models for detection and segmentation tasks in biomedical imaging domains [77]. As these do not consider the detailed features of the output images, the GAN framework can improve the image synthesis performance of the FCN and U-Net models [78]. A previous study on retinal vessel segmentation showed that conditional GAN outperformed U-Net and other generative techniques [28]. Given this trend, GAN is expected to improve image analysis technologies in various tasks. A more accurate comparison and benchmarking of GAN techniques will be enabled by future studies and more clinical data.

The proper selection of the GAN technique and statistical modeling of ocular imaging will improve the performance of each image analysis. In this review, we found that a broad range of custom architectures from GAN variants was used for different tasks. There is no evidence of a particularly superior GAN technique. Researchers can analyze ocular images by newly defining the custom objective functions of the GAN to fit the specific task and domain. For example, Cheong et al. modified the conditional GAN model to effectively denoise OCT images using a custom loss function including content, style, total variation, and shadow losses [16]. Moreover, researchers can incorporate prior information about each imaging domain to develop GAN models for specific tasks. For example, researchers have suggested several statistical distributions of retinal structures and noise modeling in OCT images [79]. Statistical modeling using prior information improved the performance of segmentation of retinal layers and detection of diabetic retinopathy in OCT [80]. Since GANs are also mathematical models for learning the statistical relationship of distributions of training and target data [6], statistical modeling using a prior domain knowledge is expected to improve GAN performance. To adopt this concept in GAN for medical imaging, various mathematical attempts and validations are needed in future studies.
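To make the idea of a custom objective concrete, the sketch below computes one of the terms named above, a total variation penalty, and combines several named terms into a single weighted objective. The anisotropic form of total variation, the term values, and the weights are assumptions chosen for illustration; they are not the settings reported in [16].

```python
from typing import Dict, List

Image = List[List[float]]


def total_variation(image: Image) -> float:
    """Anisotropic total variation: sum of absolute differences between neighbouring pixels."""
    height, width = len(image), len(image[0])
    vertical = sum(
        abs(image[r + 1][c] - image[r][c]) for r in range(height - 1) for c in range(width)
    )
    horizontal = sum(
        abs(image[r][c + 1] - image[r][c]) for r in range(height) for c in range(width - 1)
    )
    return vertical + horizontal


def weighted_objective(terms: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine named loss terms into one scalar objective."""
    return sum(weights[name] * value for name, value in terms.items())


smooth = [[0.5, 0.5], [0.5, 0.5]]
noisy = [[0.0, 1.0], [1.0, 0.0]]

# Hypothetical term values and weights, for illustration only.
terms = {"adversarial": 0.7, "content": 0.2, "total_variation": total_variation(noisy)}
weights = {"adversarial": 1.0, "content": 10.0, "total_variation": 0.1}
objective = weighted_objective(terms, weights)
```

The total variation term is zero for the flat image and large for the noisy one, so adding it to the generator objective discourages high-frequency noise in the output. The relative weights determine how strongly each consideration shapes the generator and typically need to be tuned per task and imaging domain.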

Several studies have shown that GAN can be a good choice in overcoming data shortages and the lack of large annotated datasets in ophthalmology [81]. Burlina et al. showed that a deep learning model trained with only synthetic retinal images generated by PGGAN performed worse than one trained with real retinal images (area under the receiver operating characteristic curve of 0.9235 vs. 0.9706) [40]. However, several studies have shown that machine learning models trained with data integrating both real and GAN-based synthetic images can outperform those trained with real images alone in retinal OCT [43], anterior segment OCT [82], ocular surface images [46], and corneal topography [47]. GAN was also used for data augmentation of OCT images with rare retinal diseases in a semi-supervised learning manner [45]. Studies have shown that GAN-based data augmentation can provide a tool to solve the over-fitting problem in imbalanced datasets owing to the lack of available pathological samples. The image synthesis ability of GAN can also help protect patient privacy because synthetic images preserve the characteristics of the data while being unidentifiable. The synthetic data preserve the manifold in the feature space of the original dataset [83]. Machine learning researchers might therefore release a synthetic dataset generated by GAN instead of the real dataset to demonstrate their model when patient privacy is a concern. Additionally, the annotation of a dataset can be incomplete and inaccurate because annotation is time-consuming and laborious. According to a previous report regarding cell segmentation in microscopy images, GAN can be a solution for this weak annotation problem [84].

Studies using image-to-image translation frameworks of GAN have focused on segmentation, domain transfer, denoising, super-resolution, and post-intervention prediction. The denoising function of the GAN may be effective in decreasing the effect of adversarial attacks in ophthalmology image domains [85]. Recently developed GAN architectures, such as Pix2pix and CycleGAN, have been widely applied in medical image domains [27]. However, these techniques require spatial alignment between the two image domains to obtain high-quality results. Therefore, additional image registration is required before the GAN performs a domain transformation [57]. If the structures in the images are not aligned, the GAN may perform an image-to-image translation with deformed results in synthetic images. In this review, we found that only one study performed image registration to align the retinal structures between classic fundus photography and ultra-widefield fundus photography to improve the performance of CycleGAN [63]. We anticipate that this image alignment issue for training GAN models will be highlighted when data from various centers measured from multiple devices are collected. More research is needed, but recent studies have shown that GAN can also provide solutions to this image alignment issue [69, 86].

As the deep learning techniques associated with GAN have developed, the scope of medical image processing has expanded rapidly. For example, when GAN was first introduced, simple vessel segmentation was its most frequent application. The recent work of Tavakkoli et al. achieved a significant advancement in the retinal vessel segmentation problem because their conditional GAN model provides realistic retinal fluorescein angiography, which can be used in a clinical setting [62]. Novel image processing techniques using unpaired image datasets, such as CycleGAN, have extended the range of training image domains, so ocular images from more diverse modalities are expected to be used for developing AI systems [38, 63]. Multi-domain GAN models that handle images from various domains at once have also been developed to fuse data for more accurate diagnosis. For example, Lee et al. reported a conditional GAN using multi-domain inputs from OCT and fluorescein angiography to predict the post-treatment retinal state more accurately [67]. In the future, new GAN techniques such as StyleGAN [13], which is excellent at extracting and transforming features, and StarGAN [23], which performs multi-domain transformation, are also expected to be used in the ophthalmology imaging domains to solve clinical problems. To keep pace with this rapid technical development, future studies require multidisciplinary (clinician–engineer) collaboration and the collection of more multi-domain ocular images. Clinicians need to provide feedback to engineers to improve the technical completeness of GAN.

Most studies included in this review used training and validation datasets extracted from the same study group. We found that no clinical trials exploring the use of GAN have been conducted. Machine learning techniques, including GAN, do not guarantee performance in external datasets independent of training sets. It has not been confirmed whether data augmentation through GAN can increase the diagnostic accuracy of AI systems for ocular diseases in real clinics. A GAN can be used to bridge the domain gap between training and external data from different sources [64]. If access to reliable annotated data from multiple sources remains difficult, domain adaptation can be considered to address the generalization issue [87]. Domain adaptation via the domain transfer function of a GAN may provide a chance to use a machine learning system in different settings. For example, retinal images taken with ultra-widefield fundus photography can be analyzed by an AI system developed with FP via domain transfer using GAN [63, 64]. Although GAN has several shortcomings, its ability to adapt to domains and expand data by generating realistic images can increase generalizability and may help to increase the use of machine learning algorithms in ophthalmology image domains. However, further studies are required to determine whether the application of GAN techniques will improve the diagnostic performance of machine learning models in real-world clinical situations.

Conclusion

The findings of this work suggest that deep learning research in ophthalmology has benefited from GAN. GAN techniques have established an extension of datasets and modalities in ophthalmology. The adoption of GAN in ophthalmology is still in its early stages of clinical validation compared with deep learning classification techniques because several problems need to be overcome for practical use. However, the proper selection of a GAN technique and statistical modeling of ocular imaging will improve the performance of each image analysis. We hope that this review will fuel more studies using GAN in ophthalmology image domains. Selecting proper GAN techniques would maximize the potential of ophthalmology datasets and thereby enable more accurate algorithms for the detection of pathological ophthalmic conditions.

Availability of data and materials

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

References

  1. Wang W, Yan W, Müller A, Keel S, He M. Association of socioeconomics with prevalence of visual impairment and blindness. JAMA Ophthalmol. 2017;135(12):1295–302.

  2. Liu H, Li L, Wormstone IM, Qiao C, Zhang C, Liu P, et al. Development and validation of a deep learning system to detect glaucomatous optic neuropathy using fundus photographs. JAMA Ophthalmol. 2019;137(12):1353–60.

  3. Khan SM, Liu X, Nath S, Korot E, Faes L, Wagner SK, et al. A global review of publicly available datasets for ophthalmological imaging: barriers to access, usability, and generalisability. Lancet Digit Health. 2021;3(1):e51-66.

  4. De Fauw J, Ledsam JR, Romera-Paredes B, Nikolov S, Tomasev N, Blackwell S, et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med. 2018;24(9):1342–50.

  5. Creswell A, White T, Dumoulin V, Arulkumaran K, Sengupta B, Bharath AA. Generative adversarial networks: an overview. IEEE Signal Process Mag. 2018;35:53–65.

  6. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial nets. In: Proceedings of the 27th International Conference on Neural Information Processing Systems. Montreal, Canada; 2014. p. 2672–80.

  7. Barbedo JGA. Impact of dataset size and variety on the effectiveness of deep learning and transfer learning for plant disease classification. Comput Electron Agric. 2018;153:46–53.

  8. Sorin V, Barash Y, Konen E, Klang E. Creating artificial images for radiology applications using generative adversarial networks (GANs)-a systematic review. Acad Radiol. 2020;27(8):1175–85.

  9. Wolterink JM, Mukhopadhyay A, Leiner T, Vogl TJ, Bucher AM, Išgum I. Generative adversarial networks: a primer for radiologists. Radiographics. 2021;41(3):840–57.

  10. Yi X, Walia E, Babyn P. Generative adversarial network in medical imaging: a review. Med Image Anal. 2019;58:101552.

  11. Tsiknakis N, Theodoropoulos D, Manikis G, Ktistakis E, Boutsora O, Berto A, et al. Deep learning for diabetic retinopathy detection and classification based on fundus images: a review. Comput Biol Med. 2021;135:104599.

  12. Abdelhalim ISA, Mohamed MF, Mahdy YB. Data augmentation for skin lesion using self-attention based progressive generative adversarial network. Expert Syst Appl. 2021;165:113922.

  13. Fetty L, Bylund M, Kuess P, Heilemann G, Nyholm T, Georg D, et al. Latent space manipulation for high-resolution medical image synthesis via the StyleGAN. Z Med Phys. 2020;30(4):305–14.

  14. Mirza M, Osindero S. Conditional generative adversarial nets. arXiv:1411.1784. 2014.

  15. Yoo TK, Choi JY, Kim HK. A generative adversarial network approach to predicting postoperative appearance after orbital decompression surgery for thyroid eye disease. Comput Biol Med. 2020;118:103628.

  16. Cheong H, Devalla SK, Pham TH, Zhang L, Tun TA, Wang X, et al. DeshadowGAN: a deep learning approach to remove shadows from optical coherence tomography images. Transl Vis Sci Technol. 2020;9(2):23.

  17. Zhou Y, Wang B, He X, Cui S, Shao L. DR-GAN: conditional generative adversarial network for fine-grained lesion synthesis on diabetic retinopathy images. IEEE J Biomed Health Inform. 2020. https://doi.org/10.1109/JBHI.2020.3045475.

  18. Wang W, Li X, Xu Z, Yu W, Zhao J, Ding D, et al. Learning two-stream CNN for multi-modal age-related macular degeneration categorization. arXiv:2012.01879.

  19. Isola P, Zhu JY, Zhou T, Efros AA. Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017. p. 1125–34.

  20. Wang X, Yu K, Wu S, Gu J, Liu Y, Dong C, et al. ESRGAN: enhanced super-resolution generative adversarial networks. arXiv:1809.00219.

  21. Zhu JY, Park T, Isola P, Efros AA. Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision. 2017. p. 2223–32.

  22. Manakov I, Rohm M, Kern C, Schworm B, Kortuem K, Tresp V, et al. Noise as domain shift: denoising medical images by unpaired image translation. In: Wang Q, Milletari F, Nguyen HV, Albarqouni S, Cardoso MJ, Rieke N, et al., editors. Domain adaptation and representation transfer and medical image learning with less labels and imperfect data. Cham: Springer International Publishing; 2019. p. 3–10.

  23. Choi Y, Choi M, Kim M, Ha JW, Kim S, Choo J. StarGAN: unified generative adversarial networks for multi-domain image-to-image translation. In: 2018 IEEE/CVF conference on computer vision and pattern recognition. 2018. p. 8789–97.

  24. Lee D, Moon WJ, Ye JC. Assessing the importance of magnetic resonance contrasts using collaborative generative adversarial networks. Nat Mach Intell. 2020;2:34–42.

  25. Schlegl T, Seeböck P, Waldstein SM, Langs G, Schmidt-Erfurth U. f-AnoGAN: fast unsupervised anomaly detection with generative adversarial networks. Med Image Anal. 2019;54:30–44.

  26. Xie H, Lei H, Zeng X, He Y, Chen G, Elazab A, et al. AMD-GAN: attention encoder and multi-branch structure based generative adversarial networks for fundus disease detection from scanning laser ophthalmoscopy images. Neural Netw. 2020;132:477–90.

  27. Iqbal T, Ali H. Generative adversarial network for medical images (MI-GAN). J Med Syst. 2018;42:231.

  28. Son J, Park SJ, Jung KH. Towards accurate segmentation of retinal vessels and the optic disc in fundoscopic images with generative adversarial networks. J Digit Imaging. 2019;32(3):499–512.

  29. Rammy SA, Abbas W, Hassan NU, Raza A, Zhang W. CPGAN: conditional patch-based generative adversarial network for retinal vessel segmentation. IET Image Process. 2019;14(6):1081–90.

  30. Park KB, Choi SH, Lee JY. M-GAN: retinal blood vessel segmentation by balancing losses through stacked deep fully convolutional networks. IEEE Access. 2020;8:146308–22.

  31. Yang J, Dong X, Hu Y, Peng Q, Tao G, Ou Y, et al. Fully automatic arteriovenous segmentation in retinal images via topology-aware generative adversarial networks. Interdiscip Sci. 2020;12(3):323–34.

  32. Zhao H, Qiu X, Lu W, Huang H, Jin X. High-quality retinal vessel segmentation using generative adversarial network with a large receptive field. Int J Imaging Syst Technol. 2020;30:828–42.

  33. Wang S, Yu L, Yang X, Fu CW, Heng PA. Patch-based output space adversarial learning for joint optic disc and cup segmentation. IEEE Trans Med Imaging. 2019;38(11):2485–95.

  34. Kadambi S, Wang Z, Xing E. WGAN domain adaptation for the joint optic disc-and-cup segmentation in fundus images. Int J Comput Assist Radiol Surg. 2020;15(7):1205–13.

  35. Bian X, Luo X, Wang C, Liu W, Lin X. Optic disc and optic cup segmentation based on anatomy guided cascade network. Comput Methods Programs Biomed. 2020;197:105717.

  36. Heisler M, Bhalla M, Lo J, Mammo Z, Lee S, Ju MJ, et al. Semi-supervised deep learning based 3D analysis of the peripapillary region. Biomed Opt Express. 2020;11(7):3843–56.

  37. Yildiz E, Arslan AT, Yildiz Tas A, Acer AF, Demir S, Sahin A, et al. Generative adversarial network based automatic segmentation of corneal sub basal nerves on in vivo confocal microscopy images. Transl Vis Sci Technol. 2021;10(6):33.

  38. Khan ZK, Umar AI, Shirazi SH, Rasheed A, Qadir A, Gul S. Image based analysis of meibomian gland dysfunction using conditional generative adversarial neural network. BMJ Open Ophthalmol. 2021;6(1):e000436.

  39. Diaz-Pinto A, Colomer A, Naranjo V, Morales S, Xu Y, Frangi AF. Retinal image synthesis and semi-supervised learning for glaucoma assessment. IEEE Trans Med Imaging. 2019;38(9):2211–8.

  40. Burlina PM, Joshi N, Pacheco KD, Liu TYA, Bressler NM. Assessment of deep generative models for high-resolution synthetic retinal image generation of age-related macular degeneration. JAMA Ophthalmol. 2019;137(3):258–64.

  41. Wang S, Wang X, Hu Y, Shen Y, Yang Z, Gan M, et al. Diabetic retinopathy diagnosis using multichannel generative adversarial network with semisupervision. IEEE Trans Autom Sci Eng. 2021;18:574–85.

  42. He X, Fang L, Rabbani H, Chen X, Liu Z. Retinal optical coherence tomography image classification with label smoothing generative adversarial network. Neurocomputing. 2020;405:37–47.

  43. Zheng C, Xie X, Zhou K, Chen B, Chen J, Ye H, et al. Assessment of generative adversarial networks model for synthetic optical coherence tomography images of retinal disorders. Transl Vis Sci Technol. 2020;9(2):29.

  44. Kugelman J, Alonso-Caneiro D, Read SA, Vincent SJ, Chen FK, Collins MJ. Data augmentation for patch-based OCT chorio-retinal segmentation using generative adversarial networks. Neural Comput Appl. 2021;33:7393–408.

  45. Yoo TK, Choi JY, Kim HK. Feasibility study to improve deep learning in OCT diagnosis of rare retinal diseases with few-shot classification. Med Biol Eng Comput. 2021;59(2):401–15.

  46. Yoo TK, Choi JY, Kim HK, Ryu IH, Kim JK. Adopting low-shot deep learning for the detection of conjunctival melanoma using ocular surface images. Comput Methods Programs Biomed. 2021;205:106086.

  47. Abdelmotaal H, Abdou AA, Omar AF, El-Sebaity DM, Abdelazeem K. Pix2pix conditional generative adversarial networks for Scheimpflug camera color-coded corneal tomography image generation. Transl Vis Sci Technol. 2021;10(7):21.

  48. Halupka KJ, Antony BJ, Lee MH, Lucy KA, Rai RS, Ishikawa H, et al. Retinal optical coherence tomography image enhancement via deep learning. Biomed Opt Express. 2018;9(12):6205–21.

  49. Huang Y, Lu Z, Shao Z, Ran M, Zhou J, Fang L, et al. Simultaneous denoising and super-resolution of optical coherence tomography images based on generative adversarial network. Opt Express. 2019;27(9):12289–307.

  50. Chen Z, Zeng Z, Shen H, Zheng X, Dai P, Ouyang P. DN-GAN: denoising generative adversarial networks for speckle noise reduction in optical coherence tomography images. Biomed Signal Process Control. 2020;55:101632.

  51. Ouyang J, Mathai TS, Lathrop K, Galeotti J. Accurate tissue interface segmentation via adversarial pre-segmentation of anterior segment OCT images. Biomed Opt Express. 2019;10(10):5291–324.

  52. Das V, Dandapat S, Bora PK. Unsupervised super-resolution of OCT images using generative adversarial network for improved age-related macular degeneration diagnosis. IEEE Sens J. 2020;20:8746–56.

  53. Yoo TK, Choi JY, Kim HK. CycleGAN-based deep learning technique for artifact reduction in fundus photography. Graefes Arch Clin Exp Ophthalmol. 2020;258(8):1631–7.

  54. Luo Y, Chen K, Liu L, Liu J, Mao J, Ke G, et al. Dehaze of cataractous retinal images using an unpaired generative adversarial network. IEEE J Biomed Health Inform. 2020;24(12):3374–83.

  55. Mahapatra D, Bozorgtabar B, Garnavi R. Image super-resolution using progressive generative adversarial networks for medical image analysis. Comput Med Imaging Graph. 2019;71:30–9.

  56. Ha A, Sun S, Kim YK, Lee J, Jeoung JW, Kim HC, et al. Deep-learning-based enhanced optic-disc photography. PLoS One. 2020;15(10):e0239913.

  57. Shin Y, Yang J, Lee YH. Deep generative adversarial networks: applications in musculoskeletal imaging. Radiol Artif Intell. 2021;3(3):e200157.

  58. Costa P, Galdran A, Meyer MI, Niemeijer M, Abramoff M, Mendonca AM, et al. End-to-end adversarial retinal image synthesis. IEEE Trans Med Imaging. 2018;37(3):781–91.

  59. Zhao H, Li H, Maurer-Stroh S, Cheng L. Synthesizing retinal and neuronal images with generative adversarial nets. Med Image Anal. 2018;49:14–26.

  60. Yu Z, Xiang Q, Meng J, Kou C, Ren Q, Lu Y. Retinal image synthesis from multiple-landmarks input with generative adversarial networks. Biomed Eng Online. 2019;18(1):62.

  61. Wu M, Cai X, Chen Q, Ji Z, Niu S, Leng T, et al. Geographic atrophy segmentation in SD-OCT images using synthesized fundus autofluorescence imaging. Comput Methods Programs Biomed. 2019;182:105101.

  62. Tavakkoli A, Kamran SA, Hossain KF, Zuckerbrod SL. A novel deep learning conditional generative adversarial network for producing angiography images from retinal fundus photographs. Sci Rep. 2020;10(1):21580.

  63. Yoo TK, Ryu IH, Kim JK, Lee IS, Kim JS, Kim HK, et al. Deep learning can generate traditional retinal fundus photographs using ultra-widefield images via generative adversarial networks. Comput Methods Programs Biomed. 2020;197:105761.

  64. Ju L, Wang X, Zhao X, Bonnington P, Drummond T, Ge Z. Leveraging regular fundus images for training UWF fundus diagnosis models via adversarial learning and pseudo-labeling. IEEE Trans Med Imaging. 2021;40(10):2911–25.

  65. Liu TYA, Farsiu S, Ting DS. Generative adversarial networks to predict treatment response for neovascular age-related macular degeneration: interesting, but is it useful? Br J Ophthalmol. 2020;104(12):1629–30.

  66. Liu Y, Yang J, Zhou Y, Wang W, Zhao J, Yu W, et al. Prediction of OCT images of short-term response to anti-VEGF treatment for neovascular age-related macular degeneration using generative adversarial network. Br J Ophthalmol. 2020;104(12):1735–40.

  67. Lee H, Kim S, Kim MA, Chung H, Kim HC. Post-treatment prediction of optical coherence tomography using a conditional generative adversarial network in age-related macular degeneration. Retina. 2021;41(3):572–80.

  68. Zhang Z, Ji Z, Chen Q, Fan W, Yuan S. Joint optimization of CycleGAN and CNN classifier for detection and localization of retinal pathologies on color fundus photographs. IEEE J Biomed Health Inform. 2021. https://doi.org/10.1109/JBHI.2021.3092339.

  69. Mahapatra D, Ge Z. Training data independent image registration using generative adversarial networks and domain adaptation. Pattern Recogn. 2020;100:107109.

  70. Srivastava A, Valkov L, Russell C, Gutmann MU, Sutton C. VEEGAN: reducing mode collapse in GANs using implicit variational learning. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017. p. 3310–20. https://doi.org/10.5555/3294996.3295090.

  71. Lee OY, Shin YH, Kim JO. Multi-perspective discriminators-based generative adversarial network for image super resolution. IEEE Access. 2019;7:136496–510.

    Google Scholar 

  72. Liu MY, Huang X, Mallya A, Karras T, Aila T, Lehtinen J, et al. Few-shot unsupervised image-to-image translation. In: Proceedings of the IEEE International Conference on Computer Vision. 2019. p. 10551–60.

  73. Wang Z, Lim G, Ng WY, Keane PA, Campbell JP, Tan GSW, et al. Generative adversarial networks in ophthalmology: what are these and how can they be used? Curr Opin Ophthalmol. 2021;32(5):459–67.

    PubMed  Google Scholar 

  74. Tschuchnig ME, Oostingh GJ, Gadermayr M. Generative adversarial networks in digital pathology: a survey on trends and future potential. Patterns (N Y). 2020;1(6):100089.

    Google Scholar 

  75. Kearney V, Ziemer BP, Perry A, Wang T, Chan JW, Ma L, et al. Attention-aware discrimination for MR-to-CT image translation using cycle-consistent generative adversarial networks. Radiol Artif Intell. 2020;2(2):e190027.

    PubMed  PubMed Central  Google Scholar 

  76. Zhou K, Xiao Y, Yang J, Cheng J, Liu W, Luo W, et al. Encoding structure-texture relation with P-Net for anomaly detection in retinal images. In: Vedaldi A, Bischof H, Brox T, Frahm J-M, editors. Computer vision—ECCV 2020. Cham: Springer International Publishing; 2020. p. 360–77.

    Google Scholar 

  77. Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. In: Navab N, Hornegger J, Wells WM, Frangi AF, editors. Medical image computing and computer-assisted intervention—MICCAI 2015. Cham: Springer International Publishing; 2015. p. 234–41.

    Google Scholar 

  78. Lei B, Xia Z, Jiang F, Jiang X, Ge Z, Xu Y, et al. Skin lesion segmentation via generative adversarial networks with dual discriminators. Med Image Anal. 2020;64:101716.

    PubMed  Google Scholar 

  79. Jorjandi S, Amini Z, Plonka G, Rabbani H. Statistical modeling of retinal optical coherence tomography using the Weibull mixture model. Biomed Opt Express. 2021;12(9):5470–88.

    PubMed  PubMed Central  Google Scholar 

  80. Grzywacz NM, de Juan J, Ferrone C, Giannini D, Huang D, Koch G, et al. Statistics of optical coherence tomography data from human retina. IEEE Trans Med Imaging. 2010;29(6):1224–37.

    PubMed  PubMed Central  Google Scholar 

  81. Bellemo V, Burlina P, Yong L, Wong TY, Ting DSW. Generative adversarial networks (GANs) for retinal fundus image synthesis. In: Carneiro G, You S, editors. Computer vision—ACCV 2018 Workshops. Cham: Springer International Publishing; 2019. p. 289–302.

    Google Scholar 

  82. Zheng C, Bian F, Li L, Xie X, Liu H, Liang J, et al. Assessment of generative adversarial networks for synthetic anterior segment optical coherence tomography images in closed-angle detection. Transl Vis Sci Technol. 2021;10(4):34.

    PubMed  PubMed Central  Google Scholar 

  83. Yoon J, Drumright LN, van der Schaar M. Anonymization through data synthesis using generative adversarial networks (ADS-GAN). IEEE J Biomed Health Inform. 2020;24(8):2378–88.

    PubMed  Google Scholar 

  84. He J, Wang C, Jiang D, Li Z, Liu Y, Zhang T. CycleGAN with an improved loss function for cell detection using partly labeled images. IEEE J Biomed Health Inform. 2020;24(9):2473–80.

    PubMed  Google Scholar 

  85. Yoo TK, Choi JY. Outcomes of adversarial attacks on deep learning models for ophthalmology imaging domains. JAMA Ophthalmol. 2020;138(11):1213–5.

    PubMed  PubMed Central  Google Scholar 

  86. Mahapatra D, Antony B, Sedai S, Garnavi R. Deformable medical image registration using generative adversarial networks. In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018). 2018. p. 1449–53.

  87. Çallı E, Sogancioglu E, van Ginneken B, van Leeuwen KG, Murphy K. Deep learning for chest X-ray analysis: a survey. Med Image Anal. 2021;72:102125.

    PubMed  Google Scholar 

  88. Frid-Adar M, Diamant I, Klang E, Amitai M, Goldberger J, Greenspan H. GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification. Neurocomputing. 2018;321:321–31.

    Google Scholar 

  89. Onishi Y, Teramoto A, Tsujimoto M, Tsukamoto T, Saito K, Toyama H, et al. Automated pulmonary nodule classification in computed tomography images using a deep convolutional neural network trained by generative adversarial networks. Biomed Res Int. 2019;2019:e6051939.

    Google Scholar 

  90. Hu Z, Jiang C, Sun F, Zhang Q, Ge Y, Yang Y, et al. Artifact correction in low-dose dental CT imaging using Wasserstein generative adversarial networks. Med Phys. 2019;46(4):1686–96.

    PubMed  Google Scholar 

  91. Lazaridis G, Lorenzi M, Ourselin S, Garway-Heath D. Improving statistical power of glaucoma clinical trials using an ensemble of cyclical generative adversarial networks. Med Image Anal. 2021;68:101906.

    PubMed  Google Scholar 

  92. Kim M, Kim S, Kim M, Bae HJ, Park JW, Kim N. Realistic high-resolution lateral cephalometric radiography generated by progressive growing generative adversarial network and quality evaluations. Sci Rep. 2021;11(1):12563.

    CAS  PubMed  PubMed Central  Google Scholar 

  93. Teramoto A, Tsukamoto T, Yamada A, Kiriyama Y, Imaizumi K, Saito K, et al. Deep learning approach to classification of lung cytological images: Two-step training using actual and synthesized images by progressive growing of generative adversarial networks. PLoS One. 2020;15(3):e0229951.

    CAS  PubMed  PubMed Central  Google Scholar 

  94. Park JE, Eun D, Kim HS, Lee DH, Jang RW, Kim N. Generative adversarial network for glioblastoma ensures morphologic variations and improves diagnostic model for isocitrate dehydrogenase mutant type. Sci Rep. 2021;11(1):9912.

    CAS  PubMed  PubMed Central  Google Scholar 

  95. Zhao C, Shuai R, Ma L, Liu W, Hu D, Wu M. Dermoscopy image classification based on StyleGAN and DenseNet201. IEEE Access. 2021;9:8659–79.

    Google Scholar 

  96. Zhang X, Song H, Zhang K, Qiao J, Liu Q. Single image super-resolution with enhanced Laplacian pyramid network via conditional generative adversarial learning. Neurocomputing. 2020;398:531–8.

    Google Scholar 

  97. Wang H, Rivenson Y, Jin Y, Wei Z, Gao R, Günaydın H, et al. Deep learning enables cross-modality super-resolution in fluorescence microscopy. Nat Methods. 2019;16(1):103–10.

    CAS  PubMed  Google Scholar 

  98. Armanious K, Jiang C, Fischer M, Küstner T, Hepp T, Nikolaou K, et al. MedGAN: Medical image translation using GANs. Comput Med Imaging Graph. 2020;79:101684.

    PubMed  Google Scholar 

  99. Munawar F, Azmat S, Iqbal T, Grönlund C, Ali H. Segmentation of lungs in chest X-Ray image using generative adversarial networks. IEEE Access. 2020;8:153535–45.

    Google Scholar 

  100. Maspero M, Savenije MHF, Dinkla AM, Seevinck PR, Intven MPW, Jurgenliemk-Schulz IM, et al. Dose evaluation of fast synthetic-CT generation using a generative adversarial network for general pelvis MR-only radiotherapy. Phys Med Biol. 2018;63(18):185001.

    PubMed  Google Scholar 

  101. Moran MBH, Faria MDB, Giraldi GA, Bastos LF, Conci A. Using super-resolution generative adversarial network models and transfer learning to obtain high resolution digital periapical radiographs. Comput Biol Med. 2021;129:104139.

    PubMed  Google Scholar 

  102. Becker AS, Jendele L, Skopek O, Berger N, Ghafoor S, Marcon M, et al. Injecting and removing suspicious features in breast imaging with CycleGAN: a pilot study of automated adversarial attacks using neural networks on small images. Eur J Radiol. 2019;120:108649.

    PubMed  Google Scholar 

  103. Sandfort V, Yan K, Pickhardt PJ, Summers RM. Data augmentation using generative adversarial networks (CycleGAN) to improve generalizability in CT segmentation tasks. Sci Rep. 2019;9(1):16884.

    PubMed  PubMed Central  Google Scholar 

  104. Yoo TK, Choi JY, Jang Y, Oh E, Ryu IH. Toward automated severe pharyngitis detection with smartphone camera using deep learning networks. Comput Biol Med. 2020;125:103980.

    CAS  PubMed  PubMed Central  Google Scholar 

  105. Jafari MH, Girgis H, Van Woudenberg N, Moulson N, Luong C, Fung A, et al. Cardiac point-of-care to cart-based ultrasound translation using constrained CycleGAN. Int J Comput Assist Radiol Surg. 2020;15(5):877–86.

    PubMed  Google Scholar 

  106. Yang T, Wu T, Li L, Zhu C. SUD-GAN: deep convolution generative adversarial network combined with short connection and dense block for retinal vessel segmentation. J Digit Imaging. 2020;33(4):946–57.

    PubMed  PubMed Central  Google Scholar 

  107. Zhou Y, Chen Z, Shen H, Zheng X, Zhao R, Duan X. A refined equilibrium generative adversarial network for retinal vessel segmentation. Neurocomputing. 2021;437:118–30.

    Google Scholar 

  108. Lazaridis G, Lorenzi M, Mohamed-Noriega J, Aguilar-Munoa S, Suzuki K, Nomoto H, et al. OCT signal enhancement with deep learning. Ophthalmol Glaucoma. 2021;4(3):295–304.

    PubMed  Google Scholar 

Download references

Acknowledgements

Not applicable.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Contributions

AY conceived the idea, interpreted the literature, and drafted the manuscript. JJK critically reviewed the manuscript. IHR interpreted several studies and revised the manuscript. TKY designed the study, interpreted the literature, and reviewed the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Tae Keun Yoo.

Ethics declarations

Ethics approval and consent to participate

Not applicable. Ethical approval was not sought, as the study was based entirely on previously published data. All procedures were performed in accordance with the ethical standards of the 1964 Helsinki Declaration and its amendments.

Consent for publication

Not applicable.

Competing interests

Jin Kuk Kim and Ik Hee Ryu are executives of VISUWORKS, Inc., which is a Korean Artificial Intelligence company providing medical machine learning solutions. Jin Kuk Kim is also an executive of the Korea Intelligent Medical Industry Association. They received salaries or stocks as part of the standard compensation package. The remaining authors declare no conflict of interest.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

You, A., Kim, J.K., Ryu, I.H. et al. Application of generative adversarial networks (GAN) for ophthalmology image domains: a survey. Eye and Vis 9, 6 (2022). https://doi.org/10.1186/s40662-022-00277-3

  • DOI: https://doi.org/10.1186/s40662-022-00277-3

Keywords