
PhD Researcher in Synthetic Data Generation

Station F, Paris
Full Time
Open

About the job

Neuralk-AI is looking for a PhD researcher in synthetic data generation with strong expertise in generative models and tabular representation learning. This PhD position focuses on advancing the next generation of tabular foundation models using high-fidelity synthetic data for in-context learning.

You must have a Master’s degree and be eligible to enroll in a PhD program to apply.

You will collaborate with our Paris-based research team (4 members) and conduct your work under the supervision of a leading academic research group from a top UK university.

About Neuralk

We are a passionate team leading the way in AI innovation, committed to driving the rapid adoption of transformative AI applications in industry. Our focus is on developing an agent based on Tabular AI that allows any company to build AI applications that natively interact with its structured (tabular) databases. Specifically, we are developing a modern AI workflow platform that automatically solves your use cases with state-of-the-art performance, without custom training on your data.

We are an early-stage, AI-driven startup backed by $4M in funding. Our AI agent is powered by our proprietary Tabular Foundation Model, turning research into practical business solutions. We value clear communication and simplicity in our approaches, and we promote a mindset of constant optimization.

Join Neuralk to be part of a growing team, eager to learn and adapt, united by the belief that our technology can make a significant positive impact and contribute to transforming the AI industry.

Co-founders: Alexandre Pasquiou (CSO) & Antoine Moissenot (CEO).

Neuralk is dedicated to equal opportunity employment and fosters an environment that is open and respectful of diversity.

Mission Highlights

As a PhD Researcher in Synthetic Data Generation, you will:

•   Develop advanced generative models for realistic and diverse tabular data.

•   Work at the intersection of foundational ML research and real-world industrial AI applications.

•   Contribute directly to the performance and generalization of our in-house Tabular Foundation Model for in-context learning.

Role & Responsibilities

•   Model Design: Develop deep generative models (e.g., transformer-, diffusion-, or flow-based) for tabular synthetic data generation that capture complex real-world distributions.

•   Evaluation: Define task-aware fidelity metrics to assess the usefulness of synthetic data for pre-training.

•   Pretraining Support: Improve pretraining convergence of our tabular in-context learning (ICL) models by generating informative samples that guide learning dynamics.

•   Curriculum Learning: Create generation pipelines with controllable task complexity to enable curriculum-based ICL training.

•   Collaboration: Work closely with Neuralk engineers and external academic partners on experiment design, model evaluation, and deployment readiness.

•   Publication & Conferences: Publish your findings in top-tier machine learning venues and participate actively in the international research community.

Profile

•   Master’s degree in Computer Science, Machine Learning, or a closely related field.

•   Experience with at least one family of generative models (GANs, Flows, Diffusion, VAEs) applied to structured data.

•   Solid knowledge of machine learning, particularly model training, evaluation, and data representation.

•   Good communication skills in English.

•   Capacity to work independently while collaborating effectively with interdisciplinary teams.

•   A mindset driven by research impact and real-world applications.

Bonuses

•   Publication record in ML conferences or workshops (e.g., NeurIPS, ICLR, ICML).

•   Experience with curriculum learning, causal modeling, or representation learning for structured data.

•   Background in data-centric AI or meta-learning techniques.

•   Familiarity with frameworks such as SynthCity, SDV, or TabPFN.

Interested in the role?

Get in touch and we will get back to you shortly.

Recruitment Process