After My PhD Journey: My Conflict with My PhD Advisor Qizhen Zhang

When I applied to PhD programs, I received offers from several top database groups in Germany and Switzerland, as well as an offer from Qizhen Zhang at the University of Toronto. I eventually chose Toronto for database research. After that, however, my Canadian visa was held up for a long time and has still not been approved. So in September, I started doing research remotely. I had not formally enrolled yet, but it was still, in a loose sense, the beginning of my PhD journey. I wrote the post “Before My PhD Journey” at that time.

What I never expected then was that this journey would end so quickly. In less than three months, I decided to withdraw. And now, surprisingly, I am deeply grateful that I was able to leave so quickly.

Here is what happened. My first project was at the intersection of federated learning and databases. I chose this direction mainly because Qizhen had an important grant related to federated learning, and he wanted me to work on it. I was personally only moderately interested in federated learning, but I did want to work at the ML+DB intersection, so I accepted his suggestion.

Our first topic studied the impact of noise on federated learning. The paper was mostly experimental and found that federated learning is more sensitive to noise than traditional machine learning. During much of the project, we were flailing around like headless flies: besides me, no one in the group had a strong machine learning background. So I was the one writing code, running experiments, and reasoning about the phenomena behind the results. We finished the paper in mid-October, submitted it to SIGMOD 2026, received a revision request in early January, passed the revision, and got the acceptance in late February.

In parallel, because we had observed how sensitive federated learning is to data quality and noise, I wanted to solve that issue. I also noticed prior work showing that duplicate data does not help ML training much (especially for LLMs) and instead increases training time. Since LLM training data crawled from the web contains massive duplication, deduplication is badly needed. Existing methods were mostly centralized, so I proposed a distributed, decentralized, federated, privacy-preserving deduplication algorithm, so that users’ private data would not need to be collected at a central node. I came up with this idea around mid-September, designed an efficient algorithm in about one day, and implemented it in about two days. We submitted it to WWW 2026 in early October.

After these two submissions in October, Qizhen pushed me to continue the federated learning work, mainly on training frameworks and data pipeline optimization. By then, I was already less interested in continuing with federated learning. I could still do it well, but I felt it was not the most important near-term task. Privacy is important in the long run, yes, but right now the most pressing need is to improve the intelligence and efficiency of LLM training and inference.

But Qizhen strongly wanted me to continue in federated learning. He told me: “During a PhD, what matters is deep cultivation in one direction. Getting distracted by other directions is harmful. I know your long-term goal is faculty; Toronto rejects many faculty candidates every year because their PhD research is not focused enough. You’re doing well in federated learning. I hope you can do even better and build your own brand.” I was puzzled. I had only been doing PhD-level research for one or two months. Yes, I had submitted two top-conference papers, and one later got into SIGMOD 2026, but that hardly means my direction should be fixed forever. Couldn’t I work on what interests me more? Later I realized the main reason was likely his grant needs.

Although I was uncomfortable, I kept working on federated learning. In late December, I traveled to Hong Kong with a friend to meet old friends, came back deeply moved, and wrote several posts. Then everything changed suddenly: one of my closest family members was diagnosed with a serious illness requiring long-term companionship and treatment. At that moment, I was shocked, but I immediately made the decision: I would not go abroad for the PhD. I would withdraw and stay with my family.

I informed Qizhen right away. He was also surprised, and said: “Don’t rush. You’ll soon have three top papers. Maybe I can contact the school and try to let you graduate in one year. You may not even need to spend much time in Toronto; maybe just come to handle formalities. I can’t guarantee it, but I can try.” Of course that sounded attractive: no need to go abroad, and still get a PhD in one year, which could keep my faculty dream alive. I did not expect he would go that far for me, and I felt both surprised and touched. I even told many friends about it. But my decision was already firm: no matter what, my priority had to be my family.

So what would I do after withdrawal? I could not bear seeing my family member suffer for years. I wanted to help overcome this difficult disease. That was unquestionably what I wanted most at the time. But I am not a medical researcher. What could I do? Then I thought: AI and LLMs are already very powerful. As a programmer and computer science researcher, I might feel that potential more directly than most people. I cannot personally cure disease, but if I can help push AI forward, maybe even help build a much stronger AI that can assist humans in conquering disease, then perhaps there is real hope. At that point I decided: I would start an AI company. That is why I chose AI entrepreneurship. One close friend strongly supported me and joined as a core co-founder.

Ironically, our team had already attracted some public attention. We tried to stay low-key at first, but changes in our company registration still exposed the startup. This prompted many self-media and marketing accounts to report on us, and many of those reports were likely AI-written and full of fabricated rumors, such as the claim that I graduated from another school. It was frustrating. Up to now, we have barely spoken publicly, and we have never paid major domestic marketing accounts to publish PR pieces like “genius team charging at AGI.” Even the very few updates we posted about our startup's progress were met with heavy skepticism, especially on platforms like Xiaohongshu, where almost everyone assumed we were faking things. Still, one fortunate part is that many investors have reached out and offered to invest. We declined all of them. Right now we only want to focus on building.

In January, the SIGMOD revision request arrived. Qizhen started pushing me to revise. I spent time writing all the required experimental code and designing the experiments around our hardware budget so that all runs could finish before the deadline. He remained very anxious and kept urging me. Even when I was busy with the startup and family care and often got home after 11 p.m., he still pushed me for experiments, sometimes even asking for video calls to watch me work. As for the earlier promise to help me apply for accelerated graduation, he never mentioned it again. Still, out of a sense of responsibility, I completed all the experiments on schedule. In the end, the SIGMOD 2026 result came out: the paper was accepted. The effort paid off.

Then came a dramatic turn. I logged into Slack to congratulate Qizhen, and found that in his lab group’s author list announcement, I had been moved to second author. I was speechless. On his personal website, the same paper also listed me second, with first author being an undergraduate who wrote zero lines of code and zero lines of the paper for this project. This was blatant first-author grabbing. I immediately messaged him on Slack: if you want to change author order, the minimum is to ask for my consent first. He then removed me from all lab groups, including the GitHub organization.

I complained to classmates and posted about it on WeChat Moments and Zhihu. I joked that I used to hear anti-PhD songs on Bilibili about first-author stealing, and never expected it to happen to me. Honestly, I was not even angry. At the end of the day, it is one SIGMOD paper. If I can get one in a little over a month once, I can do it again. But I found it absurd: in prior conversations he had always been approachable and had confidently promised to help me graduate in one year, and then he showed his true colors.

A while later, I heard that an undergraduate in his group saw my Zhihu post and sent him a screenshot, so Qizhen responded by email. Ironically, the previous email right before that one had said: “I am very anxious that we may not be able to submit the revision. Please contact me as soon as possible. We should meet to decide what to do.” That was when there was still over a week before the deadline and he was anxiously pushing me to deliver all the experiments on time. I felt helpless then too: the jobs were already running on GPUs; pushing harder could not change physics.

In his latest reply, Qizhen said that when we submitted, he had told me that the undergraduate and I were co-first authors, and that since the undergraduate's name comes earlier alphabetically, he had put him first. I found this laughable. I wrote 100% of the code. The undergraduate wrote 0%. I did all the theoretical analysis and wrote most of the paper. The undergraduate wrote 0 lines. On what basis were we co-first authors?

I immediately realized I had to defend ownership of my paper. So I wrote to the SIGMOD chairs right away to ask. The chair replied clearly: there is no such concept in the system that allows arbitrary co-first-author manipulation; I was the primary author, and no one could change the author order without my consent. That made everything clear: this was Qizhen’s unilateral move. Maybe it was retaliation for my withdrawal and for the pressure he felt during the revision. Looking back, the so-called accelerated graduation was probably just bait to keep me finishing the papers for him. Maybe even after finishing everything, first author still would not have been mine. I later learned that his first PhD student had also mastered out and left, and that after that student decided to leave, he was similarly removed from the lab homepage; even that student’s once-first-author VLDB paper reportedly became an “N-th author” situation after revision. So I immediately posted my first-author paper on my personal homepage and open-sourced the code on GitHub. Otherwise, who knows whether someday I would become “100th author.”

Looking back, life is full of randomness. You never know what the next period will bring. Someone who once seemed good may suddenly stab you in the back. A peaceful life may be disrupted by sudden disaster. What we can control is only our mindset and response. “A blessing in disguise” may sound cliché, but it rings true here. Leaving the PhD may look like a loss and perhaps the end of my faculty dream, yet entrepreneurship may still let me do what I most love: helping people grow. In a different way, it may also have helped me avoid a deeper pitfall. Maybe this is just life. What we can do is keep working, keep smiling at fate, never bow our heads, keep looking up at the stars, and keep our feet on the ground.

Jinming Hu
Founder and Chief Scientist

My research interests include machine learning, data mining, deep learning, computer vision, operating systems, and databases.
