Xiaodan Zhang - Internet Archive Scholar

Generative Adversarial Networks (GANs) have recently achieved significant improvement on paired/unpaired image-to-image translation, such as photo→sketch and artist painting style transfer. However, existing models can only transfer low-level information (e.g., color or texture changes), but fail to edit high-level semantic meanings (e.g., geometric structure or content) of objects. On the other hand, while some research can synthesize compelling real-world images given

... class label or caption, they cannot condition on arbitrary shapes or structures, which largely limits their application scenarios and the interpretability of model results. In this work, we focus on a more challenging semantic manipulation task, which aims to modify the semantic meaning of an object while preserving its own characteristics (e.g., viewpoints and shapes), such as cow→sheep, motor→bicycle, cat→dog. To tackle such large semantic changes, we introduce a contrasting GAN (contrast-GAN) with a novel adversarial contrasting objective. Instead of directly making the synthesized samples close to target data as previous GANs did, our adversarial contrasting objective optimizes over the distance comparisons between samples, that is, enforcing that the manipulated data be semantically closer to real data of the target category than the input data is. Equipped with the new contrasting objective, a novel mask-conditional contrast-GAN architecture is proposed to disentangle the image background from object semantic changes. Experiments on several semantic manipulation tasks on the ImageNet and MSCOCO datasets show considerable performance gains of our contrast-GAN over other conditional GANs. Quantitative results further demonstrate the superiority of our model in generating manipulated results with high visual fidelity and reasonable object semantics.

arXiv:1708.00315v1 fatcat:2ih6k4dukra4xkgflrvvhthc4a
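The distance comparison described in the contrast-GAN abstract above can be illustrated with a small sketch. This is not the paper's objective: the hinge form, the Euclidean distance, the `margin` parameter, and the toy 2-D feature vectors are all assumptions chosen only to show the idea that the manipulated sample should end up closer to real target-category data than the input was.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrasting_loss(manipulated, source, target_real, margin=0.0):
    """Hinge-style distance comparison (illustrative assumption, not the paper's loss).

    The loss is zero when the manipulated features are closer to the real
    target-category features than the source features are, by at least `margin`.
    """
    d_manipulated = euclidean(manipulated, target_real)
    d_source = euclidean(source, target_real)
    return max(0.0, d_manipulated - d_source + margin)


# Hypothetical 2-D feature vectors, made up for this example.
source = [0.0, 0.0]          # features of the input image
target_real = [1.0, 1.0]     # features of a real image from the target category
good = [0.9, 0.8]            # manipulation that moved toward the target
bad = [-0.5, -0.5]           # manipulation that moved away from the target

print(contrasting_loss(good, source, target_real))
print(contrasting_loss(bad, source, target_real))
```

The first call is not penalized because the manipulated sample is closer to the target than the input was; the second is penalized by the amount it moved away.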

In this paper, we propose a comprehensive benchmark to investigate models' logical reasoning capabilities in complex real-life scenarios. Current explanation datasets often employ synthetic data with simple reasoning structures. Therefore, they cannot express more complex reasoning processes, such as the rebuttal to a reasoning step and the degree of certainty of the evidence. To this end, we propose a comprehensive logical reasoning explanation form. Based on the multi-hop chain of reasoning,

... explanation form includes three main components: (1) The condition of rebuttal that the reasoning node can be challenged; (2) Logical formulae that uncover the internal texture of reasoning nodes; (3) Reasoning strength indicated by degrees of certainty. The fine-grained structure conforms to the real logical reasoning scenario, better fitting the human cognitive process but, simultaneously, is more challenging for the current models. We evaluate the current best models' performance on this new explanation form. The experimental results show that generating reasoning graphs remains a challenging task for current models, even with the help of giant pre-trained language models.

arXiv:2210.12487v1 fatcat:rnv2anleqjg3ha4nvk3kfpepmq

doi:10.1055/a-1194-4745 pmid:32967025 fatcat:cgy4mqnrafht3bj2dmw5punobu

Panoptic segmentation that unifies instance segmentation and semantic segmentation has recently attracted increasing attention. While most existing methods focus on designing novel architectures, we steer toward a different perspective: performing automated multi-loss adaptation (named Ada-Segment) on the fly to flexibly adjust multiple training losses over the course of training using a controller trained to capture the learning dynamics. This offers a few advantages: it bypasses manual tuning

... of the sensitive loss combination, a decisive factor for panoptic segmentation; it allows explicit modeling of the learning dynamics and reconciles the learning of multiple objectives (up to ten in our experiments); with an end-to-end architecture, it generalizes to different datasets without the need to re-tune hyperparameters or laboriously re-adjust the training process. Our Ada-Segment brings a 2.7% panoptic quality (PQ) improvement on the COCO val split over the vanilla baseline, achieving a state-of-the-art 48.5% PQ on the COCO test-dev split and 32.9% PQ on the ADE20K dataset. The extensive ablation studies reveal the ever-changing dynamics throughout the training process, necessitating the incorporation of an automated and adaptive learning strategy as presented in this paper.

arXiv:2012.03603v1 fatcat:6unlyvl76bdhjag2ccpfmfcfge
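For readers unfamiliar with the panoptic quality (PQ) metric reported in the Ada-Segment abstract above, a minimal sketch of the standard definition follows: the sum of IoU over matched segment pairs divided by TP + 0.5·FP + 0.5·FN. The function name and the example numbers are made up for illustration, and the sketch assumes the matching step (pairs with IoU above 0.5) has already been done.

```python
def panoptic_quality(matched_ious, num_fp, num_fn):
    """PQ = sum of IoU over matched pairs / (TP + 0.5 * FP + 0.5 * FN).

    `matched_ious` holds the IoU of each matched (predicted, ground-truth)
    segment pair; each such pair counts as one true positive.
    """
    tp = len(matched_ious)
    denominator = tp + 0.5 * num_fp + 0.5 * num_fn
    if denominator == 0:
        return 0.0  # no segments at all; a convention chosen for this sketch
    return sum(matched_ious) / denominator


# Hypothetical counts: three matched segments, one spurious, one missed.
pq = panoptic_quality([0.9, 0.8, 0.6], num_fp=1, num_fn=1)
print(round(pq, 3))
```

Unmatched predictions and unmatched ground-truth segments each add half a count to the denominator, so both spurious and missed segments lower the score.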

For example, Zhang et al. [17] proposed a simple learning principle, MixUp, to reduce memorization and sensitivity to adversarial examples in large deep neural networks. Berthelot et al. ...

doi:10.3390/math10152609 fatcat:f2dy4xxaovgdrhouad2zxfmznq

DOAJ Szczepanski

doi:10.3724/sp.j.1042.2021.00460 fatcat:njrxyqhw7nda3jfzvtucmkt5ti

Purpose To investigate the changes in intraocular pressure (IOP) induced by 3-diopter (3 D) accommodation in progressing myopes, stable myopes and emmetropes. Design Cross-sectional study. Participants 318 subjects including 270 myopes and 48 emmetropes. Methods 195 progressing myopes, 75 stable myopes and 48 emmetropes participated in this study. All subjects had their IOP measured using an iCare rebound tonometer while accommodative stimuli of 0 D and 3 D were presented. Main Outcome Measures

... values without accommodation and with 3 D accommodation were measured in all subjects. Baseline IOPs and IOP changes were compared within and between groups. Results There was no significant difference in IOPs between progressing myopes, stable myopes and emmetropes when no accommodation was induced (17.47±3.46, 16.62±2.98 and 16.80±3.62 respectively, p>0.05). IOP decreased slightly, but not significantly, after 3 D accommodation in all three groups (mean change -0.19±2.16, -0.03±1.68 and -0.39±2.65 respectively, p>0.05). Subgroup analysis showed that, in the progressing myopic group, the IOP of children (<18 years old) declined with accommodation while the IOP of adults (≥18 years) increased, and the difference was statistically significant (p = 0.008). However, after ...

doi:10.1371/journal.pone.0141839 pmid:26517725 pmcid:PMC4627769 fatcat:fro4wtjuuzgtleuai5qovvljfu

DOAJ

In this paper, we consider a challenging secure wireless sensing scenario where a legitimate radar station (LRS) intends to detect a target at an unknown location in the presence of an unauthorized radar station (URS). We aim to enhance the sensing performance of the LRS and meanwhile prevent the detection of the same target by the URS. Under this setup, conventional stealth-based approaches such as wrapping the target with electromagnetic wave absorbing materials are not applicable, since

... they will disable target detection not only by the URS but by the LRS as well. To tackle this challenge, we propose in this paper a new target-mounted IRS approach, where an intelligent reflecting surface (IRS) is mounted on the outer/echo surface of the target and, by tuning the IRS reflection, the strength of its reflected radar signal in any angle of departure (AoD) can be adjusted based on the signal's angle of arrival (AoA), thereby enhancing/suppressing the signal power towards the LRS/URS, respectively. To this end, we propose a practical protocol for the target-mounted IRS to estimate the LRS/URS channel and waveform parameters based on its sensed signals and control the IRS reflection for/against the LRS/URS accordingly. Specifically, we formulate new optimization problems to design the reflecting phase shifts at the IRS for maximizing the received signal power at the LRS while keeping that at the URS below a certain level, for both the cases of short-term and long-term IRS operations with different dynamic reflection capabilities. To solve these non-convex problems, we apply the penalty dual decomposition method to obtain high-quality suboptimal solutions for them efficiently. Finally, simulation results are presented that verify the effectiveness of the proposed protocol and algorithms for the target-mounted IRS to achieve secure wireless sensing, as compared with various benchmark schemes.

arXiv:2308.02676v1 fatcat:oyemkxr3yvgnjhedkvhz3v6fae
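The role of the IRS phase shifts in the abstract above can be illustrated with a heavily simplified sketch. It assumes a narrowband, single-antenna cascaded link in which element n contributes h_n · exp(jθ_n) · g_n to the received signal, and it only shows the unconstrained case of maximizing power toward one station by phase alignment. It does not model the URS constraint or the penalty dual decomposition method used in the paper; the channel values and function names are made up for the example.

```python
import cmath


def received_power(h, g, phases):
    """Power of the combined signal |sum_n h_n * exp(j*theta_n) * g_n|^2."""
    total = sum(hn * cmath.exp(1j * theta) * gn
                for hn, gn, theta in zip(h, g, phases))
    return abs(total) ** 2


def align_phases(h, g):
    """Phase shifts that cancel each cascaded term's phase so all terms add coherently."""
    return [-cmath.phase(hn * gn) for hn, gn in zip(h, g)]


# Hypothetical channel coefficients for a 3-element surface.
h = [1 + 1j, 0.5 - 0.2j, -0.3 + 0.8j]    # incoming link, per element
g = [0.7 - 0.1j, -0.4 + 0.9j, 0.2 + 0.6j]  # outgoing link, per element

untuned = received_power(h, g, [0.0] * len(h))
aligned = received_power(h, g, align_phases(h, g))
print(untuned, aligned)
```

With aligned phases the received power equals the square of the sum of the per-element magnitudes |h_n g_n|, which is the largest value achievable with unit-modulus reflection coefficients in this simplified model.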

Open Access

Interactive fashion image manipulation, which enables users to edit images with sketches and color strokes, is an interesting research problem with great application value. Existing works often treat it as a general inpainting task and do not fully leverage the semantic structural information in fashion images. Moreover, they directly utilize conventional convolution and normalization layers to restore the incomplete image, which tends to wash away the sketch and color information. In this

... , we propose a novel Fashion Editing Generative Adversarial Network (FE-GAN), which is capable of manipulating fashion images by free-form sketches and sparse color strokes. FE-GAN consists of two modules: 1) a free-form parsing network that learns to control the human parsing generation by manipulating sketch and color; 2) a parsing-aware inpainting network that renders detailed textures with semantic guidance from the human parsing map. A new attention normalization layer is further applied at multiple scales in the decoder of the inpainting network to enhance the quality of the synthesized image. Extensive experiments on high-resolution fashion image datasets demonstrate that the proposed method significantly outperforms the state-of-the-art methods on image manipulation.

arXiv:1906.00884v2 fatcat:3i2kxaal7rh65dh7kw6absgila

Multiple Versions

Hao Zhang is supported by the AFRL/DARPA project FA872105C0003. Xiaodan Liang is supported by award FA870215D0002. ...

arXiv:1711.00889v1 fatcat:tspqsn3zfjfwnacjgg5bru2bp4

Authors' Contributions Hang Song and Mingzhou Zhang contributed equally to this work. ...

doi:10.1155/2017/1247138 pmid:28321333 pmcid:PMC5339423 fatcat:juod4wfe6ve7rnd3pva57y7e3e

DOAJ

The pre-trained vision-language model, exemplified by CLIP, advances zero-shot semantic segmentation by aligning visual features with class embeddings through a transformer decoder to generate semantic masks. Despite its effectiveness, prevailing methods within this paradigm encounter challenges, including overfitting on seen classes and small fragmentation in masks. To mitigate these issues, we propose a Language-Driven Visual Consensus (LDVC) approach, fostering improved alignment of semantic

... and visual information. Specifically, we leverage class embeddings as anchors due to their discrete and abstract nature, steering vision features toward class embeddings. Moreover, to circumvent noisy alignments from the vision part due to its redundant nature, we introduce route attention into self-attention for finding visual consensus, thereby enhancing semantic consistency within the same object. Equipped with a vision-language prompting strategy, our approach significantly boosts the generalization capacity of segmentation models for unseen classes. Experimental results underscore the effectiveness of our approach, showcasing mIoU gains of 4.5 on the PASCAL VOC 2012 and 3.6 on the COCO-Stuff 164k for unseen classes compared with the state-of-the-art methods.

arXiv:2403.08426v1 fatcat:ueqdlcjg45a3bcpggq4c7tdl3i

Lou, 1 Peiteng Shi, 2 Jun Wang, 2 Xiaohan Huang, 2 and Jiang Zhang 1, * 1 School of Systems Sciences, Beijing Normal University, Beijing, China 2 Science and Technology on Information Systems Engineering ... products 5 Agriculture, hunting, forestry and fishing . . . 32 Electricity, gas and water supply 33 Hotels and restaurants 34 Construction 35 Education 36 Health and social work Liangzhu Guo, 1 Xiaodan ...

arXiv:1501.06058v1 fatcat:cu5ylvr4dbcard3rxpkcrvtsea

We thank Jiaqi Li and Qianyu Zhang for their help in this project. ...

arXiv:2008.00563v1 fatcat:kzjovysh2zb2bcuws37ev7zxbe

Zhang and co-workers constructed a monolithic robust actuator of a binary cooperative Janus, which was synthesized by interfacial polymerization of immiscible hydrophilic and hydrophobic vinyl monomer ...

doi:10.3390/polym13111753 pmid:34072009 fatcat:kbf2xoxm3rftza3crz37b7oc74

DOAJ

Generative Semantic Manipulation with Contrasting GAN [article]

Preserved Fulltext

MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure [article]

Preserved Fulltext

ERCP during the COVID-19 epidemic

Preserved Fulltext

Ada-Segment: Automated Multi-loss Adaptation for Panoptic Segmentation [article]

Preserved Fulltext

Theme-Aware Semi-Supervised Image Aesthetic Quality Assessment

Preserved Fulltext

Behavioral oscillations in attentional processing

Preserved Fulltext

Intraocular Pressure Changes during Accommodation in Progressing Myopes, Stable Myopes and Emmetropes

Preserved Fulltext

Target-Mounted Intelligent Reflecting Surface for Secure Wireless Sensing [article]

Preserved Fulltext

Fashion Editing with Adversarial Parsing Learning [article]

Preserved Fulltext

Other Versions

Structured Generative Adversarial Networks [article]

Preserved Fulltext

Correlation Analysis of Ocular Symptoms and Signs in Patients with Dry Eye

Preserved Fulltext

Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation [article]

Preserved Fulltext

Flow Distances on Open Flow Networks [article]

Preserved Fulltext

SemEval-2020 Task 5: Counterfactual Recognition [article]

Preserved Fulltext

Smart Hydrogel Bilayers Prepared by Irradiation

Preserved Fulltext