About Me


I'm a second-year PhD student in the Computer Science department at Cornell University, where I am fortunate to be working with Prof. Volodymyr Kuleshov.

Here is my full CV.


My research interests include generative modeling, optimal transport, and AI for discovery & social good.


  • Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling
    Yair Schiff, Chia-Hsiang Kao, Aaron Gokaslan, Tri Dao, Albert Gu, Volodymyr Kuleshov
    ICML 2024
    [Paper] , [Site] , [Code] , [Slides] , [Video]
  • DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems
    Yair Schiff, Zhong Yi Wan, Jeffrey B. Parker, Stephan Hoyer, Volodymyr Kuleshov, Fei Sha, Leonardo Zepeda-Núñez
    ICML 2024
    [Paper] , [Code] , [Slides] , [Video]
  • InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models
    Yingheng Wang, Yair Schiff, Aaron Gokaslan, Weishen Pan, Fei Wang, Christopher De Sa, Volodymyr Kuleshov
    ICML 2023
    [Paper] , [Video]
  • Semi-Autoregressive Energy Flows: Exploring Likelihood-Free Training of Normalizing Flows
    Phillip Si, Zeyi Chen, Subham Sekhar Sahoo, Yair Schiff, Volodymyr Kuleshov
    ICML 2023
    [Paper] , [Video]
  • Learning with Stochastic Orders
    Carles Domingo-Enrich, Yair Schiff, Youssef Mroueh
    ICLR 2023, Notable Top 25% acceptance
    [Paper] , [Code] , [Slides] , [Video]
  • Semi-Parametric Inducing Point Networks and Neural Processes
    Richa Rastogi, Yair Schiff, Alon Hacohen, Zhaozhi Li, Ian Lee, Yuntian Deng, Mert R. Sabuncu, Volodymyr Kuleshov
    ICLR 2023
    [Paper] , [Code] , [Slides] , [Video]
  • Cloud-Based Real-Time Molecular Screening Platform with MolFormer
    Brian Belgodere*, Vijil Chenthamarakshan*, Payel Das*, Pierre Dognin*, Toby Kurien*, Igor Melnyk*, Youssef Mroueh*, Inkit Padhi*, Mattia Rigotti*, Jarret Ross*, Yair Schiff*, Richard A. Young* (*equal contribution, alphabetical order)
    ECML PKDD 2022 Demo Track
  • Optimizing Functionals on the Space of Probabilities with Input Convex Neural Networks
    David Alvarez-Melis, Yair Schiff, Youssef Mroueh
    Transactions of Machine Learning Research
    OTML NeurIPS Workshop 2021, Spotlight presentation
    [Paper] , [Code] , [Slides] , [Video]
  • Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations
    Yair Schiff*, Vijil Chenthamarakshan*, Samuel Hoffman*, Karthikeyan Natesan Ramamurthy*, Payel Das* (*equal contribution)
    ICASSP 2022
  • Predicting Deep Neural Network Generalization with Perturbation Response Curves
    Yair Schiff, Brian Quanz, Payel Das, Pin-Yu Chen
    NeurIPS 2021
    [Paper] , [Slides] , [Video]
  • Tabular Transformers for Modeling Multivariate Time Series
    Inkit Padhi, Yair Schiff, Igor Melnyk, Mattia Rigotti, Youssef Mroueh, Pierre Dognin, Jarret Ross, Ravi Nair, Erik Altman
    ICASSP 2021
    [Paper] , [Code]
  • Image Captioning as an Assistive Technology: Lessons Learned from VizWiz 2020 Challenge
    Pierre Dognin*, Igor Melnyk*, Youssef Mroueh*, Inkit Padhi*, Mattia Rigotti*, Jarret Ross*, Yair Schiff*, Richard Young, Brian Belgodere (*equal contribution)
    Journal of AI Research
    [Paper] , [Slides]


  • Simple and Effective Masked Diffusion Language Models
    Subham Sekhar Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov
    AccMLBio ICML Workshop 2024, Spotlight presentation
    SPIGM ICML Workshop 2024
    [Paper] , [Site] , [Code]
  • Advancing DNA Language Models: The Genomics Long-Range Benchmark
    Evan Trop, Chia-Hsiang Kao, Mckinley Polen, Yair Schiff, Bernardo P. de Almeida, Aaron Gokaslan, Thomas Pierrot, Volodymyr Kuleshov
    LLMs4Bio AAAI Workshop 2024, MLGenX ICLR Workshop 2024
  • Gi and Pal Scores: Deep Neural Network Generalization Statistics
    Yair Schiff, Brian Quanz, Payel Das, Pin-Yu Chen
    RobustML ICLR Workshop 2021
  • Characterizing the Latent Space of Molecular Deep Generative Models
    Yair Schiff, Vijil Chenthamarakshan, Karthikeyan Natesan Ramamurthy, Payel Das
    TDA & Beyond NeurIPS Workshop 2020, Spotlight presentation
    [Paper] , [Slides] , [Video]
  • Alleviating Noisy Data in Image Captioning with Cooperative Distillation
    Pierre Dognin*, Igor Melnyk*, Youssef Mroueh*, Inkit Padhi*, Mattia Rigotti*, Jarret Ross*, Yair Schiff* (*equal contribution, alphabetical order)
    VizWiz CVPR Workshop 2020
    [Paper] , [Slides]


  • Cross-species plant genomes modeling at single nucleotide resolution using a pre-trained DNA language model
    Jingjing Zhai, Aaron Gokaslan, Yair Schiff, Ana Berthel, Zong-Yan Liu, Zachary R Miller, Armin Scheben, Michelle C Stitzer, Cinta Romay, Edward S. Buckler, Volodymyr Kuleshov
  • Auditing and Generating Synthetic Data with Controllable Trust Trade-offs
    Brian Belgodere, Pierre Dognin, Adam Ivankay, Igor Melnyk, Youssef Mroueh, Aleksandra Mojsilovic, Jiri Navratil, Apoorva Nitsure, Inkit Padhi, Mattia Rigotti, Jerret Ross, Yair Schiff, Radhika Vedpathak, Richard A. Young