Abstract
In this paper, we present a novel approach to image generation using two convolutional neural networks (CNNs) operating in a complementary manner. One CNN extracts features from a dataset of images, while the other synthesizes a random image based on the extracted features. The resulting image, distinct from every image in the dataset, retains the overall characteristics of the original set. To demonstrate the effectiveness of our approach, we employ two distinct datasets: one containing human faces and the other featuring various animal species. Our method outperforms state-of-the-art techniques in terms of accuracy, F1 score, recall, peak signal-to-noise ratio (PSNR), and mean squared error (MSE). We also provide a comprehensive review of related work on image generation and CNNs, and we analyze the advantages and disadvantages of the proposed dual-pipeline CNN method. The results indicate that our approach is a promising alternative for generating high-quality, unique images, with potential applications in various domains, including computer vision, graphics, and data augmentation. Future work will focus on improving the method's efficiency, extending its applicability to other domains, and exploring additional evaluation metrics.
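The dual-pipeline idea described above can be sketched in code: one network maps images to a feature vector, and a second network maps those features (plus noise, so the output is not a memorized sample) back to an image. This is a minimal illustrative sketch in PyTorch; the class names, layer sizes, and image resolution are assumptions for demonstration and do not reproduce the paper's actual architecture.

```python
import torch
import torch.nn as nn


class FeatureExtractor(nn.Module):
    """Pipeline 1: encode an image batch into compact feature vectors."""

    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1),   # 64x64 -> 32x32
            nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                    # global pooling
            nn.Flatten(),
            nn.Linear(64, feat_dim),
        )

    def forward(self, x):
        return self.net(x)


class Synthesizer(nn.Module):
    """Pipeline 2: decode features plus noise into a new image."""

    def __init__(self, feat_dim: int = 128, noise_dim: int = 32):
        super().__init__()
        self.fc = nn.Linear(feat_dim + noise_dim, 64 * 8 * 8)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),  # 8 -> 16
            nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1),  # 16 -> 32
            nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1),   # 32 -> 64
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, feats, noise):
        h = self.fc(torch.cat([feats, noise], dim=1)).view(-1, 64, 8, 8)
        return self.net(h)


extractor = FeatureExtractor()
synthesizer = Synthesizer()

images = torch.randn(4, 3, 64, 64)      # stand-in for a dataset batch
feats = extractor(images)               # pipeline 1: feature extraction
noise = torch.randn(4, 32)              # randomness -> image is unique
generated = synthesizer(feats, noise)   # pipeline 2: image synthesis
print(tuple(generated.shape))           # (4, 3, 64, 64)
```

In practice the two networks would be trained jointly, e.g. with a reconstruction or perceptual loss, so that generated images match the dataset's overall characteristics without duplicating any single sample.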
Availability of Data and Material
Not applicable.
Acknowledgements
Not applicable.
Funding
This work received no specific funding.
Author information
Authors and Affiliations
Contributions
The authors contributed significantly to the research and to the preparation of this paper; the first author is the main contributor.
Corresponding author
Ethics declarations
Conflicts of interest
The authors declare that they have no conflicts of interest to report regarding the present study.
Informed Consent
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Al-Obaidi, H.S.H., Kurnaz, S. Divergent CNN Architectures for Novel Image Generation: A Dual-Pipeline Approach. Wireless Pers Commun (2023). https://doi.org/10.1007/s11277-023-10758-w
Accepted:
Published:
DOI: https://doi.org/10.1007/s11277-023-10758-w