
Divergent CNN Architectures for Novel Image Generation: A Dual-Pipeline Approach

Published in: Wireless Personal Communications

Abstract

In this paper, we present a novel approach to image generation using two convolutional neural networks (CNNs) that operate in a complementary manner. One CNN extracts features from a dataset of images, while the other synthesizes a new, randomly seeded image from the extracted features. The resulting image, while distinct from every image in the dataset, retains the overall characteristics of the original set. To demonstrate the effectiveness of our approach, we employ two distinct datasets: one containing human faces and the other featuring various animal species. Our method outperforms state-of-the-art techniques in terms of accuracy, F1 score, recall, peak signal-to-noise ratio (PSNR), and mean squared error (MSE). Additionally, we provide a comprehensive review of related work on image generation and CNNs, and we thoroughly analyze the advantages and disadvantages of the proposed dual-pipeline CNN method. The results indicate that our approach is a promising alternative for generating high-quality, unique images, with potential applications in computer vision, graphics, and data augmentation. Future work will focus on improving the method's efficiency, extending it to other domains, and exploring additional evaluation metrics.
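To make the dual-pipeline idea concrete, the following is a minimal, illustrative PyTorch sketch. The paper's actual layer configurations, loss functions, and training procedure are not reproduced on this page, so every detail below (the 64×64 resolution, layer widths, feature and noise dimensions, and the FeatureExtractor/Synthesizer names) is an assumption made for illustration, not the authors' implementation. The sketch also shows the standard computation of MSE and PSNR, two of the metrics the abstract reports.

```python
# Minimal sketch of the dual-pipeline idea: one CNN extracts features,
# a second CNN decodes (features + noise) into a new image.
# All architectural choices here are illustrative assumptions.
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """Pipeline 1: a small CNN mapping a 64x64 RGB image to a feature vector."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1),    # 64x64 -> 32x32
            nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),   # 32x32 -> 16x16
            nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),  # 16x16 -> 8x8
            nn.ReLU(),
        )
        self.fc = nn.Linear(128 * 8 * 8, feat_dim)

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

class Synthesizer(nn.Module):
    """Pipeline 2: a transposed-conv CNN that decodes features plus random
    noise into a new image sharing the dataset's overall characteristics."""
    def __init__(self, feat_dim=128, noise_dim=64):
        super().__init__()
        self.fc = nn.Linear(feat_dim + noise_dim, 128 * 8 * 8)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),  # 8 -> 16
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),   # 16 -> 32
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),    # 32 -> 64
            nn.Tanh(),                                            # output in [-1, 1]
        )

    def forward(self, feats, noise):
        h = self.fc(torch.cat([feats, noise], dim=1)).view(-1, 128, 8, 8)
        return self.deconv(h)

def mse_and_psnr(x, y, max_val=1.0):
    """Standard MSE and PSNR between two image batches in [0, max_val]."""
    mse = torch.mean((x - y) ** 2)
    psnr = 10.0 * torch.log10(max_val ** 2 / mse)
    return mse.item(), psnr.item()

# Usage: extract features from a batch, mix with noise, synthesize, evaluate.
imgs = torch.rand(4, 3, 64, 64)            # stand-in for a real dataset batch
feats = FeatureExtractor()(imgs)
fake = Synthesizer()(feats, torch.randn(4, 64))
print(mse_and_psnr((fake + 1) / 2, imgs))  # rescale Tanh output to [0, 1]
```

In a real training setup the two networks would be optimized jointly against a reconstruction or adversarial objective; only the forward pass is shown here.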




Availability of Data and Material

Not applicable.


Acknowledgements

Not applicable.

Funding

This work received no specific funding.

Author information


Contributions

Both authors contributed significantly to the research and to this paper; the first author is the main contributor.

Corresponding author

Correspondence to Haitham Sabah Husin Al-Obaidi.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest to report regarding the present study.

Informed Consent

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Al-Obaidi, H.S.H., Kurnaz, S. Divergent CNN Architectures for Novel Image Generation: A Dual-Pipeline Approach. Wireless Pers Commun (2023). https://doi.org/10.1007/s11277-023-10758-w



  • DOI: https://doi.org/10.1007/s11277-023-10758-w
