
Ji Xin

ML Scientist @ TD Layer 6 AI

About Me

I am a machine learning scientist at TD Layer 6 AI. I recently completed my Ph.D. in the School of Computer Science at the University of Waterloo, Canada, advised by Professors Jimmy Lin and Yaoliang Yu, and I was also affiliated with the Vector Institute. My research interests lie broadly in deep learning, natural language processing (NLP), and information retrieval (IR). My Ph.D. thesis is on improving model efficiency for NLP and IR applications.

In 2021, I interned at Microsoft Research with Chenyan Xiong, working on zero-shot dense retrieval. Before Waterloo, I completed my bachelor's degree in the Department of Physics at Tsinghua University, China, in 2018. There, I was an undergraduate research assistant in Professor Zhiyuan Liu's group, working on information extraction, especially entity typing.

Contact

  first_name [at] layer6 [dot] ai

Publications

  1. Building an Efficiency Pipeline: Commutativity and Cumulativeness of Efficiency Operators for Transformers

    Ji Xin, Raphael Tang, Zhiying Jiang, Yaoliang Yu, and Jimmy Lin.

    arXiv preprint.

  2. Few-Shot Non-Parametric Learning with Deep Latent Variable Model

    Zhiying Jiang, Yiqin Dai, Ji Xin, Ming Li, and Jimmy Lin.

    NeurIPS 2022.

  3. Certified Error Control of Candidate Set Pruning for Two-Stage Relevance Ranking

    Minghan Li, Xinyu Zhang, Ji Xin, Hongyang Zhang, and Jimmy Lin.

    EMNLP 2022. [code]

  4. Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representations

    Ji Xin, Chenyan Xiong, Ashwin Srinivasan, Ankita Sharma, Damien Jose, and Paul N. Bennett.

    Findings of ACL 2022. [code]

  5. Temporal Early Exiting for Streaming Speech Commands Recognition

    Raphael Tang, Karun Kumar, Ji Xin, Piyush Vyas, Wenyan Li, Gefei Yang, Yajie Mao, Craig Murray, and Jimmy Lin.

    ICASSP 2022.

  6. Voice Query Auto Completion

    Raphael Tang, Karun Kumar, Kendra Chalkley, Ji Xin, Liming Zhang, Wenyan Li, Gefei Yang, Yajie Mao, Junho Shin, Geoffrey Murray, and Jimmy Lin.

    EMNLP 2021.

  7. Simple and Effective Unsupervised Redundancy Elimination to Compress Dense Vectors for Passage Retrieval

    Xueguang Ma, Minghan Li, Kai Sun, Ji Xin, and Jimmy Lin.

    EMNLP 2021.

  8. How Does BERT Rerank Passages? An Attribution Analysis with Information Bottlenecks

    Zhiying Jiang, Raphael Tang, Ji Xin, and Jimmy Lin.

    EMNLP 2021 Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP).

  9. Approach Zero and Anserini at the CLEF-2021 ARQMath Track: Applying Substructure Search and BM25 on Operator Tree Path Tokens

    Wei Zhong, Xinyu Zhang, Ji Xin, Jimmy Lin, and Richard Zanibbi.

    Conference and Labs of the Evaluation Forum (CLEF) 2021, CEUR Workshop Proceedings. [code]

  10. Serverless BM25 Search and BERT Reranking

    Mayank Anand, Jiarui Zhang, Shane Ding, Ji Xin, and Jimmy Lin.

    International Conference on Design of Experimental Search & Information REtrieval Systems (DESIRES) 2021. [code]

  11. Bag-of-Words Baselines for Semantic Code Search

    Xinyu Zhang, Ji Xin, Andrew Yates, and Jimmy Lin.

    ACL 2021 Workshop on Natural Language Processing for Programming (NLP4Prog).

  12. BERxiT: Early Exiting for BERT with Better Fine-Tuning and Extension to Regression

    Ji Xin, Raphael Tang, Yaoliang Yu, and Jimmy Lin.

    EACL 2021. [code]

  13. Early Exiting BERT for Efficient Document Ranking

    Ji Xin, Rodrigo Nogueira, Yaoliang Yu, and Jimmy Lin.

    EMNLP 2020 Workshop on Simple and Efficient Natural Language Processing (SustaiNLP). [code]

  14. Inserting Information Bottlenecks for Attribution in Transformers

    Zhiying Jiang, Raphael Tang, Ji Xin, and Jimmy Lin.

    Findings of EMNLP 2020. [code]

  15. DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference

    Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, and Jimmy Lin.

    ACL 2020. [code] [in Hugging Face Transformers] [checkpoints] [arXiv]

  16. Showing Your Work Doesn't Always Work

    Raphael Tang, Jaejun Lee, Ji Xin, Xinyu Liu, Yaoliang Yu, and Jimmy Lin.

    ACL 2020. [code] [arXiv]

  17. Exploiting Token and Path-based Representations of Code for Identifying Security-Relevant Commits

    Achyudh Ram, Ji Xin, Meiyappan Nagappan, Yaoliang Yu, Rocío Cabrera Lozoya, Antonino Sabetta, and Jimmy Lin.

    arXiv preprint.

  18. Put It Back: Entity Typing with Language Model Enhancement

    Ji Xin, Hao Zhu, Xu Han, Zhiyuan Liu, and Maosong Sun.

    EMNLP 2018. [code]

  19. Improving Neural Fine-Grained Entity Typing with Knowledge Attention

    Ji Xin, Yankai Lin, Zhiyuan Liu, and Maosong Sun.

    AAAI 2018. [code]