Defining and benchmarking open problems in single-cell analysis

Code availability

All Open Problems code is publicly available at https://www.github.com/openproblems-bio/openproblems. This code includes data loaders for all datasets used, along with metadata describing where each dataset came from. Code to reproduce the figures is publicly available at https://github.com/openproblems-bio/nbt2025-manuscript. Detailed information on all datasets is available at https://openproblems.bio/datasets. Documentation for the platform and contribution guides can be found at https://openproblems.bio/documentation.

Acknowledgements

We received continual support in many ways from Jonah Cool, Ivana Williams and Fiona Griffin from the Chan Zuckerberg Initiative for this project, without whom we would not have come this far. We would also like to thank Mohammad Lotfollahi for early discussions on Open Problems. E.V.B. would like to thank the Caltech Bioengineering Graduate program and Paul W. Sternberg for support. This work was supported by the Chan Zuckerberg Initiative Foundation (grant CZIF2022-007488, Human Cell Atlas Data Ecosystem) and the Chan Zuckerberg Initiative DAF, an advised fund of the Silicon Valley Community Foundation (grant number 2021-235155) awarded to M.D.L., D.B.B., S.G., F.J.T. and S.K. This work was co-funded by the European Union (ERC, DeepCell -101054957, to A.S. and F.J.T.). Views and opinions expressed are, however, those of the authors only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them. G.P. is supported by the Helmholtz Association under the joint research school Munich School for Data Science and by the Joachim Herz Foundation. Throughout this work, W.L. was supported by the US National Institutes of Health under Continuing Education Training Grants (T15). D.D. was supported by the European Union’s Horizon 2020 Research and Innovation Program (860329 Marie-Curie ITN “STRATEGY-CKD”). M.E.V. is supported by the US National Institutes of Health under a Ruth L. Kirschstein National Research Service Award (1F31CA257625) from the National Cancer Institute. E.D. is supported by Wellcome Sanger core funding (WT206194). This work was supported by the Research Foundation Flanders (FWO) (1SF3822N to L.D.). B.R. is supported by the Bavarian state government with funds from the Hightech Agenda Bavaria. This research received funding from the Flemish Government under the “Onderzoeksprogramma Artificiele Intelligentie (AI) Vlaanderen” programme. 
C.B.G.-B. was supported by a PhD fellowship from Fonds Wetenschappelijk Onderzoek (FWO, 11F1519N). V.K. was supported by Wellcome Sanger core funding. G.L.M. received support from Swiss National Science Foundation grant PZ00P3_193445 and Chan Zuckerberg Initiative grants number 2022-249212 and 2019-002427. D.R. was supported by the National Cancer Institute of the US National Institutes of Health (2U24CA180996).

Author information

Author notes

  1. These authors contributed equally: Malte D. Luecken, Scott Gigante, Daniel B. Burkhardt, Robrecht Cannoodt.

  2. These authors jointly supervised this work: Fabian J. Theis, Smita Krishnaswamy.

Authors and Affiliations

  1. Institute of Computational Biology, Helmholtz Munich, Neuherberg, Germany

    Malte D. Luecken, Daniel C. Strobl, Luke Zappia, Giovanni Palla, Michaela F. Mueller, Artur Szałata, Yuge Ji & Fabian J. Theis

  2. Institute of Lung Health & Immunity, Helmholtz Munich; Member of the German Center for Lung Research (DZL), Munich, Germany

    Malte D. Luecken & Michaela F. Mueller

  3. Immunai, New York, USA

    Scott Gigante & Drausin Wulsin

  4. NVIDIA, Santa Clara, CA, USA

    Daniel B. Burkhardt

  5. Data Intuitive, Lebbeke, Belgium

    Robrecht Cannoodt, Luke Zappia, Kai Waldrant & Sai Nirmayi Yasa

  6. Data Mining and Modelling for Biomedicine group, VIB Center for Inflammation Research, Ghent, Belgium

    Robrecht Cannoodt, Louise Deconinck & Yvan Saeys

  7. Department of Applied Mathematics, Computer Science, and Statistics, Ghent University, Ghent, Belgium

    Robrecht Cannoodt, Louise Deconinck & Yvan Saeys

  8. Institute of Clinical Chemistry and Pathobiochemistry, School of Medicine, Technical University of Munich, Munich, Germany

    Daniel C. Strobl

  9. TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany

    Daniel C. Strobl, Giovanni Palla & Michaela F. Mueller

  10. Division of Pulmonary and Critical Care Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA

    Nikolay S. Markov

  11. Department of Mathematics, School of Computing, Information and Technology, Technical University of Munich, Munich, Germany

    Luke Zappia, Artur Szałata & Fabian J. Theis

  12. Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA

    Wesley Lewis & Smita Krishnaswamy

  13. Faculty of Medicine and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany

    Daniel Dimitrov & Julio Saez-Rodriguez

  14. Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA

    Michael E. Vinyard

  15. Broad Institute of MIT and Harvard, Cambridge, MA, USA

    Michael E. Vinyard, Qian Qin & Zhijian Li

  16. Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA

    Michael E. Vinyard, Zhijian Li & Luca Pinello

  17. Department of Computer Science, Yale University, New Haven, CT, USA

    D. S. Magruder, Alexander Tong & Smita Krishnaswamy

  18. Genentech Inc, South San Francisco, CA, USA

    Alma Andersson & Romain Lopez

  19. Gene Technology, Royal Institute of Technology (KTH), Stockholm, Sweden

    Alma Andersson

  20. Science for Life Laboratory (SciLifeLab), Solna, Sweden

    Alma Andersson

  21. Wellcome Sanger Institute, Cambridge, UK

    Emma Dann & Vitalii Kleshchevnikov

  22. Basic Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA

    Dominik J. Otto

  23. Computational Biology Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA

    Dominik J. Otto

  24. Translational Data Science IRC, Fred Hutchinson Cancer Center, Seattle, WA, USA

    Dominik J. Otto

  25. Apple, Paris, France

    Michal Klein

  26. Data Sciences Platform, Chan Zuckerberg Biohub, San Francisco, CA, USA

    Olga Borisovna Botvinnik, Ann T. Chen & Angela Oliveira Pisco

  27. Bridge Bio Pharma, Palo Alto, CA, USA

    Olga Borisovna Botvinnik

  28. Cellarity, Inc, Somerville, MA, USA

    Andrew Benz & Benjamin DeMeo

  29. Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA

    Jonathan M. Bloom

  30. Insitro, South San Francisco, USA

    Angela Oliveira Pisco

  31. VIB Center for AI & Computational Biology (VIB.AI), Ghent, Belgium

    Yvan Saeys

  32. Cellular Genetics Programme, Wellcome Sanger Institute, Hinxton, UK

    Fabian J. Theis

  33. Department of Genetics, Yale University, New Haven, CT, USA

    Smita Krishnaswamy

  34. Institute of AI for Health, Helmholtz Munich, Neuherberg, Germany

    Bastian Rieck

  35. Department of Informatics, University of Fribourg, Fribourg, Switzerland

    Bastian Rieck

  36. Genome Biology Unit, EMBL, Heidelberg, Germany

    Constantin Ahlmann-Eltze

  37. Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, United Arab Emirates

    Eduardo da Veiga Beltrame

  38. VIB Center for Brain & Disease Research, Leuven, Belgium

    Carmen Bravo González-Blas & Swann Floc’hlay

  39. Orion Medicines, Foster City, CA, USA

    Ann T. Chen

  40. Department of Biomedical Informatics, Harvard University, Cambridge, MA, USA

    Benjamin DeMeo

  41. Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA

    Can Ergen, Adam Gayoso, Galen Xing & Chenling Xu

  42. Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA

    Stephanie Hicks

  43. Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA

    Stephanie Hicks

  44. Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, USA

    Stephanie Hicks

  45. Brain Mind Institute, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland

    Gioele La Manno

  46. Chan Zuckerberg Initiative, Redwood City, CA, USA

    Maximilian G. Lombardo

  47. Department of Genetics, Stanford University, Stanford, CA, USA

    Romain Lopez

  48. Department of Statistical Sciences, University of Padua, Padua, Italy

    Dario Righelli

  49. Department of Computer Science, Princeton University, Princeton, NJ, USA

    Hirak Sarkar

  50. Princeton Ludwig Institute, Princeton University, Princeton, NJ, USA

    Hirak Sarkar

  51. Altos Labs, San Diego, CA, USA

    Valentine Svensson

  52. Mila–Quebec AI Institute, Montreal, Quebec, Canada

    Alexander Tong

  53. Université de Montréal, Montréal, Quebec, Canada

    Alexander Tong

  54. Gladstone–UCSF Institute of Genomic Immunology, San Francisco, CA, USA

    Galen Xing

Authors

  1. Malte D. Luecken
  2. Scott Gigante
  3. Daniel B. Burkhardt
  4. Robrecht Cannoodt
  5. Daniel C. Strobl
  6. Nikolay S. Markov
  7. Luke Zappia
  8. Giovanni Palla
  9. Wesley Lewis
  10. Daniel Dimitrov
  11. Michael E. Vinyard
  12. D. S. Magruder
  13. Michaela F. Mueller
  14. Alma Andersson
  15. Emma Dann
  16. Qian Qin
  17. Dominik J. Otto
  18. Michal Klein
  19. Olga Borisovna Botvinnik
  20. Louise Deconinck
  21. Kai Waldrant
  22. Sai Nirmayi Yasa
  23. Artur Szałata
  24. Andrew Benz
  25. Zhijian Li
  26. Jonathan M. Bloom
  27. Angela Oliveira Pisco
  28. Julio Saez-Rodriguez
  29. Drausin Wulsin
  30. Luca Pinello
  31. Yvan Saeys
  32. Fabian J. Theis
  33. Smita Krishnaswamy

Consortia

Open Problems Jamboree Members

  • Bastian Rieck
  • Constantin Ahlmann-Eltze
  • Eduardo da Veiga Beltrame
  • Carmen Bravo González-Blas
  • Ann T. Chen
  • Benjamin DeMeo
  • Can Ergen
  • Swann Floc’hlay
  • Adam Gayoso
  • Stephanie Hicks
  • Yuge Ji
  • Vitalii Kleshchevnikov
  • Gioele La Manno
  • Maximilian G. Lombardo
  • Romain Lopez
  • Dario Righelli
  • Hirak Sarkar
  • Valentine Svensson
  • Alexander Tong
  • Galen Xing
  • Chenling Xu

Contributions

M.D.L., S.G., and D.B.B. conceived the idea. M.D.L., S.G., D.B.B., R.C., and O.B.B. developed the infrastructure. M.D.L., S.G., D.B.B., R.C., D.C.S., N.S.M., L.Z., G.P., W.L., D.D., M.E.V., M.F.M., A.A., E.D., Q.Q., A.S., A.B., and Z.L. formalized a benchmarking task. M.D.L., S.G., D.B.B., R.C., D.C.S., N.S.M., L.Z., G.P., W.L., D.D., M.E.V., D.S.M., M.F.M., A.A., E.D., Q.Q., D.J.O., M.K., O.B.B., K.W., S.N.Y., A.S., A.B., Z.L., C.A-E., E.d.V.B., A.T.C., B.D., C.E., V.K., H.S., V.S. and A.T. contributed to the codebase. M.D.L., S.G., R.C., D.C.S., N.S.M., L.Z., G.P., W.L., D.D., L.D. and K.W. analyzed the results. M.D.L., S.G., D.B.B., J.M.B., A.O.P., J.S.-R., D.W., L.P., Y.S., F.J.T. and S.K. provided resources and supervised the work. M.D.L., S.G., D.B.B., R.C., D.C.S., N.S.M., L.Z., G.P., W.L. and D.D. coordinated the research. M.D.L., S.G., D.B.B., F.J.T. and S.K. acquired funding for the work. M.D.L., S.G., D.B.B., R.C., D.C.S., N.S.M., L.Z., G.P., W.L., D.D., M.E.V., M.F.M., A.A., E.D., Q.Q., D.J.O., M.K., O.B.B., A.S., A.B., Z.L., B.R., J.M.B., A.O.P., C.A-E., E.d.V.B., A.B., C.B.G-B., A.T.C., B.D., C.E., S.F., A.G., S.H., Y.J., V.K., G.L.M., M.G.L., R.L., D.R., H.S., V.S., A.T., G.X. and C.X. contributed to benchmarking task definition. M.D.L., S.G., D.B.B., R.C., D.C.S., N.S.M., L.Z., G.P., W.L., D.D., M.E.V. and D.S.M. prepared the manuscript. D.C.S., N.S.M., L.Z., G.P., W.L., D.D., M.E.V., D.S.M. and M.F.M. contributed equally as second authors. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Fabian J. Theis or Smita Krishnaswamy.

Ethics declarations

Competing interests

M.D.L. consults for CatalYm GmbH, contracted for the Chan Zuckerberg Initiative and received speaker fees from Pfizer and Janssen Pharmaceuticals. S.G. has equity interest in Immunai Inc. D.B.B. is a paid employee of and has equity interest in NVIDIA. R.C. has equity interest in Data Intuitive BV. L.Z. has consulted for Lamin Labs GmbH. W.L. contracted for Protein Evolution Incorporated. From 2019 to 2022, A.A. was a consultant for 10x Genomics. From October 2023, E.D. has been a consultant for EnsoCell Therapeutics. O.B.B. is currently an employee of Bridge Bio Pharma. A.S. consults for Cellarity Inc. and Exvivo Labs Inc. A.B. is a paid employee of and has equity interest in Cellarity, Inc. J.B. has equity interest in Cellarity, Inc. J.S.-R. reports funding from GSK, Pfizer and Sanofi and fees or honoraria from Travere Therapeutics, Stadapharm, Astex, Owkin, Pfizer and Grunenthal. D.W. has equity interest in Immunai Inc. F.J.T. consults for Immunai Inc., Singularity Bio B.V., CytoReason Ltd and Cellarity, and has ownership interest in Dermagnostix GmbH and Cellarity. S.K. is a visiting professor at Meta and scientific advisor at Ascent Bio, Inc. E.d.V.B. has ownership interest in Retro Biosciences and ImYoo Inc and is employed by ImYoo Inc. A.T.C. is an employee of Orion Medicines. B.D. is a paid employee of and has equity interest in Cellarity Inc. A.G. is currently an employee of Google DeepMind. Google DeepMind has not directed any aspect of this study nor exerts any commercial rights over the results. R.L. is an employee of Genentech. V.S. has ownership interest in Altos Labs and Vesalius Therapeutics. A.T. has an ownership interest in Dreamfold.

Supplementary information

Source data

Source Data Fig. 1

Methods and metrics used per existing benchmarking repository, including dates of first and last commit.

Source Data Fig. 2

Table of metric results for the cell–cell communication task with metric explanations.

About this article

Cite this article

Luecken, M.D., Gigante, S., Burkhardt, D.B. et al. Defining and benchmarking open problems in single-cell analysis. Nat Biotechnol (2025). https://doi.org/10.1038/s41587-025-02694-w

  • DOI: https://doi.org/10.1038/s41587-025-02694-w
