Hi, I'm Bin CAO 曹斌!

Creative Designer Coder Player

In Guangzhou

I am engaged in AI4CM (AI for Computational Materials) research, focusing on crystallography and spectroscopy.

- Nice to meet you!

Bin CAO 曹斌

PhD student & Developer Coder Player

Hello there! My name is Bin CAO. I am engaged in AI4CM (AI for Computational Materials) research, focusing on crystallography and spectroscopy. My research primarily includes physics-based diffraction pattern simulation and machine learning representations in spectrum-based sequence models and crystal-based graph structures.

I am passionate about open science and strongly advocate for the unrestricted dissemination of knowledge. To support this vision, I share all code from my research (on GitHub and Hugging Face) to ensure transparency and accessibility.

  • 5+

    Years of Experience
  • 20+

    Projects Completed
  • 20+

    Papers Published
- Experience

Everything about me!

  • -2025 (July 1st - December 31st)

    Exchange Student (half a year)

    -CityU, Hong Kong

    Spectroscopy Crystal Characterization & Crystal Structure Prediction (CSP) via Generative Algorithms

    City University of Hong Kong (CityU) (香港城市大学)

    I am currently an exchange student at City University of Hong Kong (CityU) in the Department of Physics, under the supervision of Prof. Ren Yang.

    During this period, I aim to work on two main tasks: developing a crystal phase identification system based on XRD data, and generating crystal structures using generative algorithms.

  • -2023 - Present

    PhD in Advanced Materials

    -HKUST, Guangzhou

    AI-driven X-ray Structural Characterization & Crystal Generation and Property Prediction

    The Hong Kong University of Science and Technology (Guangzhou) (香港科技大学广州)

    Cao Bin is an active open-source community builder and a collaborative partner in materials science and AI.

    Currently, I am pursuing my studies at HKUST(GZ) under the supervision of Prof. Zhang Tongyi and Prof. Weng Lutao.

    Access my publications here: Google Scholar.

  • -2023 (Mar 1st - Aug 31st)

    Researcher (half a year)

    -Zhejiang Lab, Hangzhou

    Leading the development of the transfer learning framework.

    Zhejiang Lab (之江实验室)

    I worked at Zhejiang Lab for half a year in the Intelligent Materials Design Department, led by Prof. Zhang Tongyi (Chief Scientist in Materials Science).

    During this period, I mainly focused on transfer learning and built strong research connections with my peers.

    The open-source project can be accessed here: TrAdaboost

  • -2020 - 2023

    Master of Philosophy

    -SHU, Shanghai

    AI-driven X-ray Structural Characterization & Active Learning Framework: Bgolearn.

    Shanghai University (上海大学)

    I obtained a master's degree in Solid Mechanics from Shanghai University, supervised by Prof. Zhang Tongyi.

    Shanghai University provided me with an excellent academic experience. During this period, I was awarded the National Scholarship and recognized as an Outstanding Graduate of Shanghai University.

  • -2016 - 2020

    Bachelor

    -BUCT, Beijing

    Mechanical Engineering & 3D Modeling and Finite Element Analysis.

    Beijing University of Chemical Technology (北京化工大学)

    I obtained my bachelor's degree from Beijing University of Chemical Technology (BUCT).

    During my four years of study at BUCT, I made many great friends and created wonderful memories.

    BUCT has a rigorous academic atmosphere and a dynamic learning environment. I highly recommend studying here.

- Projects

My Projects

bcao686@connect.hkust-gz.edu.cn
  • 01

    XRD structure identification

    I am making efforts to promote end-to-end structure identification...

    To achieve end-to-end intelligent structure identification, we developed a novel powder XRD simulation tool (SimXRD, ICLR 2025) that generates high-fidelity simulated XRD patterns closely aligned with experimental data.

    Building on this, we also participate in the opXRD database project, striving to establish the largest experimental raw XRD database (arXiv 2503.0557).

    Furthermore, we proposed the first software-hardware integrated system for real-time structure identification, achieving state-of-the-art performance (XQueryer, Oriel, Seattle, USA). In our framework, detailed atomic sites are determined using a refinement strategy.

    For more details, refer to the document: WPEM Manual (figshare,file=51378833).
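For readers new to powder diffraction, the peak positions that any XRD simulation has to reproduce follow from Bragg's law, λ = 2d sin θ. The snippet below is a minimal illustrative sketch for a cubic lattice only. It is not SimXRD's API: the function names, the Cu Kα1 wavelength, and the 4.0 Å lattice parameter are assumptions chosen for the example.

```python
import math

CU_K_ALPHA1 = 1.5406  # assumed X-ray wavelength in angstroms (Cu K-alpha1)


def cubic_d_spacing(a, hkl):
    """Interplanar spacing d (angstroms) of plane (h, k, l) in a cubic lattice."""
    h, k, l = hkl
    return a / math.sqrt(h * h + k * k + l * l)


def two_theta_deg(d, wavelength=CU_K_ALPHA1):
    """Diffraction angle 2-theta in degrees from Bragg's law, or None if unreachable."""
    s = wavelength / (2.0 * d)
    if s > 1.0:  # no real solution: this plane cannot diffract at this wavelength
        return None
    return 2.0 * math.degrees(math.asin(s))


if __name__ == "__main__":
    a = 4.0  # hypothetical cubic lattice parameter in angstroms
    for hkl in [(1, 0, 0), (1, 1, 0), (1, 1, 1)]:
        d = cubic_d_spacing(a, hkl)
        print(hkl, round(d, 4), two_theta_deg(d))
```

A realistic simulator additionally needs peak intensities (structure factors), peak broadening, and instrument effects, none of which this sketch attempts.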

  • 02

    Crystal structure generation

    Generating crystals with a minimal element set and maximum symmetry...

    Diffraction patterns and crystal structures are closely related concepts. Therefore, my research interest lies in crystal representation.

    In this survey (https://arxiv.org/pdf/2505.16379), we provide a comprehensive overview of crystal generation, summarizing and organizing various types of materials while illustrating multiple representations of crystalline structures. We then present a detailed summary and taxonomy of current AI-driven materials generation approaches. Furthermore, we discuss commonly used evaluation metrics and highlight open-source codebases and benchmark datasets.

    One of the projects we have worked on involves embedding crystals using asymmetric units (ASUs), space groups, lattice vectors, and the minimal element set to inversely generate stable, novel crystal structures (CGWGAN, JMI, 2024), achieving good results while preserving high symmetry.

    In another project, we introduced powder XRD to provide additional insights from reciprocal space, enhancing the model's understanding of crystals (ASUGNN, J. Appl. Cryst.), which shows great potential.

    I am currently working on deriving a universal pre-trained model for crystals and hope to share more soon!

  • 03

    Bayesian Opt. - Bgolearn

    We launched a Bayesian optimization/active learning framework for the materials community...

    Bgolearn is the first active learning framework designed for the materials community (homepage: https://github.com/Bgolearn).

    Since its release, it has achieved over 80,000 downloads, gaining significant popularity in the application community (Bgolearn, Mat. & Design, 2024). Bgolearn includes nine utility functions that can be applied to both single and multi-target designs, in regression or classification tasks.

    In collaboration with Dr. Ma, we launched a user interface for ease of use (MLMD, npj Comput. Mat., 2024).
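To give a feel for what a utility function does in Bayesian optimization, here is a minimal sketch of expected improvement (EI), one widely used choice. It is written from the textbook formula and is not Bgolearn's API; the function names and the candidate values are hypothetical. Given a surrogate model's predicted mean and standard deviation for a candidate, EI scores how much that candidate is expected to beat the best value observed so far.

```python
import math


def normal_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best, minimize=True):
    """Textbook EI for a Gaussian prediction N(mu, sigma^2) against the current best."""
    improvement = (best - mu) if minimize else (mu - best)
    if sigma <= 0.0:  # no predictive uncertainty: EI is the plain improvement, floored at 0
        return max(improvement, 0.0)
    z = improvement / sigma
    return improvement * normal_cdf(z) + sigma * normal_pdf(z)


if __name__ == "__main__":
    # Hypothetical candidates: (label, predicted mean, predicted std) for a property to minimize.
    candidates = [("A", 1.2, 0.1), ("B", 1.0, 0.5), ("C", 0.9, 0.05)]
    best_so_far = 1.0
    scored = [(expected_improvement(m, s, best_so_far), name) for name, m, s in candidates]
    print(max(scored))  # candidate with the highest EI is suggested for the next experiment
```

The design trade-off is visible in the formula: the first term rewards candidates predicted to be good (exploitation), and the second rewards candidates whose prediction is uncertain (exploration).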

  • 04

    Transfer Learning

    The transfer learning project has gained increasing attention...

    TrAdaboost is an open-source project for transfer learning education.

    After leading the transfer learning framework during my work at Zhejiang Lab, I decided to open-source a teaching project to introduce the fundamental concepts of transfer learning using simple models and toy data.

    This project has been gaining more and more attention. Thank you! (Location: https://github.com/Bin-Cao/TrAdaboost).
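As a taste of the core idea, the sketch below shows one round of the instance re-weighting step in the classic TrAdaBoost formulation (Dai et al., 2007). It is an illustration written for this page, not code from the TrAdaboost repository; the function name, argument names, and the clamping of the error rate are assumptions. Misclassified source samples are down-weighted (they look unlike the target task), while misclassified target samples are up-weighted, as in AdaBoost.

```python
import math


def tradaboost_update(w_src, w_tgt, err_src, err_tgt, n_rounds):
    """One TrAdaBoost weight update.

    w_src, w_tgt: current sample weights for source and target data.
    err_src, err_tgt: per-sample errors |h(x) - y| in [0, 1] (0 = correct, 1 = wrong).
    n_rounds: total number of boosting rounds N.
    Returns the new (source, target) weights, jointly normalized to sum to 1.
    """
    n = len(w_src)
    # Fixed source factor: beta = 1 / (1 + sqrt(2 ln n / N)), so beta <= 1.
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / n_rounds))

    # Weighted error of the current learner on the target data only.
    eps = sum(w * e for w, e in zip(w_tgt, err_tgt)) / sum(w_tgt)
    eps = min(max(eps, 1e-10), 0.499)  # assumed clamp so that 0 < beta_tgt < 1
    beta_tgt = eps / (1.0 - eps)

    new_src = [w * beta_src ** e for w, e in zip(w_src, err_src)]    # shrink if wrong
    new_tgt = [w * beta_tgt ** (-e) for w, e in zip(w_tgt, err_tgt)]  # grow if wrong

    total = sum(new_src) + sum(new_tgt)
    return [w / total for w in new_src], [w / total for w in new_tgt]


if __name__ == "__main__":
    # Toy data: two source samples and three target samples, first of each misclassified.
    src, tgt = tradaboost_update([0.2, 0.2], [0.2, 0.2, 0.2], [1, 0], [1, 0, 0], n_rounds=10)
    print(src, tgt)
```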

  • 05

    Outlier identification with the TCGPR model

    A novel machine learning algorithm for outlier identification and feature selection...

    I proposed TCGPR in 2022, based on the data sensitivity reflected in kernel-based Gaussian process models. It defines a factor to evaluate data consistency for pattern recognition and outlier identification (https://github.com/Bin-Cao/TCGPR).

    This model achieved great performance in studying materials with small data sets. By characterizing the data distributions, we can often achieve better fitting results (though it may not always work).

    Following this strategy, we successfully applied the algorithm in two works: Small, 2024 (https://onlinelibrary.wiley.com/doi/10.1002/smll.202408750) and npj Comput. Mat., 2023 (https://www.nature.com/articles/s41524-023-01150-0).

XQueryer System

- Papers

Selected papers

- Collaborators

Collaborators & institutions

- Supervisors

My Supervisors

- News

My blog & news

- Let's Connect

Get in touch

I'm currently available for new collaborations, so feel free to send me a message about anything you would like to discuss. You can contact me at any time.
