About Me

Introduction

Hi! My name is Changmeng Zheng (郑昌萌). I am currently a Research Assistant Professor in the Department of Computing at the Hong Kong Polytechnic University (PolyU).

Education

I received my Ph.D. degree in Computer Science from the Hong Kong Polytechnic University, supervised by Prof. Qing Li. From 2023 to 2024, I was a Research Scholar at the NExT++ Lab, under the supervision of Prof. Tat-Seng Chua. Prior to that, I obtained my Master's and Bachelor's degrees in Software Engineering from South China University of Technology, advised by Prof. Yi Cai.

Openings

Our research group is actively recruiting self-motivated postdocs, Ph.D. students, dual Ph.D. degree students, MPhil/MSc students, and research assistants. Visiting scholars, interns, and self-funded students are also welcome. Please refer to the Openings section below for more details, and send me an email if you are interested.

Dr. Changmeng Zheng

Research Assistant Professor

News

2025.08: We will organize an INLG 2025 workshop, "LLM Reasoning on Medicine: Challenges, Opportunities, and Future", in Vietnam on Oct 30, 2025. More information is available on the INLG 2025 website.
2024.10: Our paper "A Picture Is Worth a Graph: A Blueprint Debate Paradigm for Multimodal Reasoning" has been nominated for the Best Paper Award at ACM Multimedia 2024! Congratulations to all co-authors.
2024.07: One paper on Multimodal Reasoning is accepted by ACM Multimedia 2024.
2023.05: Two papers on Relation Extraction and Scene Graph Generation are accepted by IEEE TCSVT.
2023.05: Two papers on Multimodal Entity and Relation Extraction are accepted by ACL 2023.

Research Outline

I am generally interested in multimodal large language models, knowledge graphs and social media analytics. In particular, my research focuses on the extraction, retrieval and question-answering of text, video, and live media arising from the web and social networks. Apart from that, I have also extensively explored interdisciplinary areas and applications of artificial intelligence, including but not limited to: medical data analysis, marine trajectory prediction and desktop application visualization.

My list of publications can be found on Google Scholar.

Multimodal LLMs

My research addresses the challenges of fine-grained reasoning and hallucination in MLLMs by exploring innovative approaches through multi-agent collaborative frameworks and knowledge-augmented architectures.

Knowledge Graphs

I'm interested in empowering current deep learning systems through structured knowledge representations, facilitating robust cross-modal reasoning and enabling seamless connections across diverse modalities.

Interdisciplinary Applications

My research focuses on developing deep learning approaches to multimodal data analysis, bridging diverse disciplines to solve complex real-world problems through the integration of multiple data types and domain knowledge.

Services

I'm excited to serve the research community in various ways. I have served as a committee member for AI conferences including ACL, EMNLP, AAAI, IJCAI and ACM MM for over 5 years, and I serve as a journal reviewer for IEEE TASLP, TIP, TCSVT, TAI, TCSS and TCDS. In addition, I have led or participated in many national funding projects, such as the China National Programs for Science and Technology, the China National Natural Science Foundation and the Major Research Project in Guangzhou City, collaborating with Peng Cheng Lab, Kingsoft and TCL.

My Past Teaching Experience

Computer Vision (COMP 4423) at PolyU
Programming Fundamentals and Applications (COMP 1012) at PolyU
Database Systems (COMP 2411) at PolyU

Selected Publications

A few selected publications are listed for each research direction. See Google Scholar for a full list of publications.
Most of the algorithms we developed are available in my GitHub repository.

Multimodal Representation Learning and Alignment

I work on representation learning and alignment of multiple modalities via structured knowledge.

Blueprint Debate on Graph (ACM MM 2024)
Blueprint confines the scope of multi-agent debate in multimodal reasoning.

Translation-motivated Multimodal Representation (ACL 2023)
Cross-modal misalignment is similar to the cross-lingual divergence issue.

Multimodal Relation Extraction with Efficient Graph Alignment (ACM MM 2021)
A dual-graph alignment captures the image-text correlation.

Cross-modal Knowledge for Relation Extraction (IEEE TCSVT)
Concept graphs bridge multimodal semantic gaps.

Object-aware Multimodal Named Entity Recognition (IEEE TMM)
Fine-grained image-text alignment with adversarial representation fusion.

Pair-wise Relational Scene Graph Generation (IEEE TCSVT)
Pair-wise information counts for more in scene graph generation.


Knowledge Graph Construction and Application

I innovate in named entity recognition, relation extraction and KG-based applications including reasoning, summarization and question answering.

Boundary-aware Nested Named Entity Recognition (EMNLP 2019)
Determining entity boundaries before category classification is important for nested NER.

Controllable Summarization with Guiding Entities (COLING 2020)
A dual LSTM framework starting from extracted entities preserves key information.

Visual Object Embeddings for Multimodal Named Entity Recognition (ACM MM 2020)
Mapping fine-grained visual objects into embeddings reduces modality disparity.

Unsupervised Cross-domain Named Entity Recognition (Neural Networks)
Unsupervised entity-aware adversarial training relieves domain divergence in NER.

Dual-channel Graph Convolutional Network for VQA (ACL 2020)
A dual-channel graph captures object and syntactic relations simultaneously.

Bridge-based Cross-domain Named Entity Recognition (ACL 2023 Findings)
Contrastive learning can refine the original representations of entities in different domains.


Interdisciplinary Applications

My research encompasses diverse interdisciplinary applications of artificial intelligence, with particular emphasis on healthcare analytics, maritime intelligence and software engineering.

Deep-Learning Techniques Pneumonia Diagnosis in Pediatric Chest Radiographs (Pediatric Pulmonology)
Pulmonary-thoracic segmentations can improve pneumonia diagnosis accuracy in pediatric chest radiographs.

Diagnosing the Etiology of Pneumonia on Pediatric Chest X-rays (Pediatric Pulmonology)
Compares a deep-learning model for classifying the etiology of pneumonia on pediatric chest X-rays with human readers.

Heterogeneous Multi-Source Fusion for Ship Trajectory Complement and Prediction (DSC 2020)
Our method makes better use of AIS, GPS and ARPA radar information to predict ship trajectories precisely.

Incorporating Concept Information into Term Weighting Schemes for Topic Models (DASFAA 2020)
Two schemes, CEP and DCEP, improve topic coherence by incorporating the concept information of entities.

Cross-Modality Adversarial Network for Visual Question Answering (APWeb 2021)
Cross-modality adversarial training and modality-invariant attention bring better semantic alignment for VQA.

A Challenging Dataset for Multimodal Relation Extraction (ICME 2021)
The first multimodal relation extraction dataset, consisting of 10000+ sentences on 31 relations derived from Twitter.


Openings

Prospective students: I would appreciate it if you read the following before reaching out to me by email. To make applications easier for me to identify, please use "PhD (or Research Assistant, Visiting Student) Application" as your email subject.

PhDs

When reaching out to me, in addition to your CV, it would be best to demonstrate the following in your email.

Applicants do not need to have a degree in computer science, but they should possess good coding skills and a basic understanding of natural language processing. The research direction of the applicants does not need to align with mine, as long as they have sufficient interest in large language models.
Prospective students are encouraged to visit our laboratory in advance to gain a better understanding of how our lab operates. This will facilitate a more informed decision based on mutual selection principles.
Students who have only one publication as a (co-)first author, demonstrating their ability to develop ideas, implement, analyze, and write papers, are often considered more favorably than those who have participated in numerous publications without leading these projects.

Note:

I am aware that many students without a first-author publication record are interested in applying to our lab. We welcome these students to participate in our publication-oriented projects or lead one influential open-source project. We are looking for self-motivated students. I will guide you in advancing these projects, and strong performance will significantly enhance your chances of a successful application.

Research Assistants/Visiting Students

I welcome research assistants, visiting students, and interns at all levels. Students are required to demonstrate a strong interest and good background knowledge in large language models. While prior research experience is encouraged, it is not mandatory. All positions for research assistants, visiting students, and internships can be remote. Research assistant positions will be compensated according to the applicant's background.

Contact

Location

I'm currently located at the Mong Man Wai Building, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong.

Email

You can reach me via email.
I will try my best to respond as my schedule permits.
