Artificial intelligence in histopathology image analysis for cancer precision medicine
Author: Weitz, Philippe
Date: 2023-10-27
Location: Lecture hall Atrium, Nobels väg 12B, Karolinska Institutet, Solna
Time: 13.00
Department: Inst för medicinsk epidemiologi och biostatistik / Dept of Medical Epidemiology and Biostatistics
Abstract
In recent years, there have been rapid advancements in the field of computational
pathology. This has been enabled through the adoption of digital pathology
workflows that generate digital images of histopathological slides, the publication
of large data sets of these images and improvements in computing infrastructure.
Objectives in computational pathology can be subdivided into two categories,
first the automation of routine workflows that would otherwise be performed by
pathologists and second the addition of novel capabilities. This thesis focuses on
the development, application, and evaluation of methods in this second category,
specifically the prediction of gene expression from pathology images and the
registration of pathology images to one another.
In Study I, we developed a computationally efficient cluster-based technique to perform transcriptome-wide predictions of gene expression in prostate cancer from H&E-stained whole-slide-images (WSIs). The suggested method outperforms several baseline methods and is non-inferior to single-gene CNN predictions, while reducing the computational cost by a factor of approximately 300. We included 15,586 transcripts that encode proteins in the analysis and predicted their expression from the WSIs with different modelling approaches. In a cross-validation, 6,618 of these predictions were significantly associated with the RNA-seq expression estimates, with FDR-adjusted p-values <0.001. Upon validation of these 6,618 expression predictions in a held-out test set, the association could be confirmed for 5,419 (81.9%). Furthermore, we demonstrated that it is feasible to predict the prognostic cell-cycle progression score with a Spearman correlation to the RNA-seq score of 0.527 [0.357, 0.665].
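The screening step described for Study I (adjusting per-transcript p-values for the false discovery rate and then counting how many associations are confirmed) can be illustrated with a minimal sketch. The abstract does not state which FDR procedure was used; the Benjamini-Hochberg procedure below is an assumption, and the function names `benjamini_hochberg` and `confirmed_percentage` are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the abstract only says "FDR-adjusted"; BH is used here
    purely as a common illustrative choice.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def confirmed_percentage(n_confirmed, n_tested):
    """Percentage of tested associations that were confirmed, to one decimal."""
    return round(100.0 * n_confirmed / n_tested, 1)


# Toy p-values, not data from the thesis.
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])

# The counts reported in the abstract: 5,419 of 6,618 confirmed.
reported = confirmed_percentage(5419, 6618)
```

With the counts from the abstract, `confirmed_percentage(5419, 6618)` reproduces the reported 81.9%.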
The objective of Study II is the investigation of attention layers in the context of multiple-instance-learning for regression tasks, exemplified by a simulation study and gene expression prediction. We find that for gene expression prediction, the compared methods are not distinguishable regarding their performance, which indicates that attention mechanisms may not be superior to weakly supervised learning in this context.
Study III describes the results of the ACROBAT 2022 WSI registration challenge, which we organised in conjunction with the MICCAI 2022 conference. Participating teams were ranked on the median 90th percentile of distances between registered and annotated target landmarks. Median 90th percentiles for the eight teams that were eligible for ranking in the test set, which consists of 303 WSI pairs, ranged from 60.1 µm to 15,938.0 µm. The best-performing method therefore has a score slightly below the median 90th percentile of distances between the first and second annotators, which is 67.0 µm.
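The ranking metric described for Study III can be sketched as follows: for each WSI pair, take the 90th percentile of the distances between registered and annotated target landmarks, then take the median of these values across pairs. This is a minimal illustration, not the challenge's evaluation code. It assumes landmark coordinates are already expressed in µm and uses linear interpolation for the percentile, and the function names are hypothetical.

```python
import math
from statistics import median


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty sequence."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    pos = (len(ordered) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = math.ceil(pos)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def landmark_distances(registered, target):
    """Euclidean distance between each registered landmark and its target."""
    return [math.dist(r, t) for r, t in zip(registered, target)]


def median_90th_percentile(pairs):
    """Median over WSI pairs of the per-pair 90th percentile landmark distance."""
    per_pair = [percentile(landmark_distances(r, t), 90) for r, t in pairs]
    return median(per_pair)


# Toy landmarks (assumed to be in µm), not data from the challenge.
origin = [(0.0, 0.0)] * 11
pair_a = ([(float(i), 0.0) for i in range(11)], origin)      # distances 0..10
pair_b = ([(3.0, 4.0)] * 11, origin)                         # all distances 5
pair_c = ([(0.0, float(2 * i)) for i in range(11)], origin)  # distances 0..20
score = median_90th_percentile([pair_a, pair_b, pair_c])
```

Using a high percentile per pair penalises methods that align most landmarks well but fail badly on a few, while the median across pairs limits the influence of individual failed registrations.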
Study IV describes the data set that we published to facilitate the ACROBAT challenge. The data set is publicly available through the Swedish National Data Service (SND) and consists of 4,212 WSIs from 1,153 breast cancer patients.
Study V is an example of the application of WSI registration for computational pathology. In this study, we investigate the possibility of registering invasive cancer annotations from H&E to KI67 WSIs and subsequently training cancer detection models. To this end, we compare the performance of models optimised with registered annotations to the performance of models that were optimised with annotations generated for the KI67 WSIs. The data set consists of 272 female breast cancer cases, including an internal test set of 54 cases. We find that in this test set, the two models are not distinguishable regarding performance, while there are small differences in model calibration.
List of papers:
I. P. Weitz, Y. Wang, K. Kartasalo, L. Egevad, J. Lindberg, H. Grönberg, M. Eklund, M. Rantalainen. “Transcriptome-wide prediction of prostate cancer gene expression from histopathology images using coexpression-based convolutional neural networks”. Bioinformatics. vol. 38, no. 13, pp. 3462–3469, Jun. 2022.
II. P. Weitz, Y. Wang, J. Hartman, M. Rantalainen. “An investigation of attention mechanisms in histopathology whole-slide-image analysis for regression objectives”. In: 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). IEEE, Oct. 2021.
III. P. Weitz*, M. Valkonen*, L. Solorzano*, C. Carr, K. Kartasalo, C. Boissin, S. Koivukoski, A. Kuusela, D. Rasic, Y. Feng, S. Sinius Pouplier, A. Sharma, K. Ledesma Eriksson, S. Robertson, C. Marzahl, C. Gatenbee, A. Anderson, M. Wodzinski, A. Jurgas, N. Marini, M. Atzori, H. Müller, D. Budelmann, N. Weiss, S. Heldmann, J. Lotz, J. Wolterink, B. De Santi, A. Patil, A. Sethi, S. Kondo, S. Kasai, K. Hirasawa, M. Farrokh, N. Kumar, R. Greiner, L. Latonen, A. Laenkholm, J. Hartman, P. Ruusuvuori*, M. Rantalainen*. “The ACROBAT 2022 Challenge: Automatic Registration Of Breast Cancer Tissue”. *Equal contribution. [Manuscript]
IV. P. Weitz*, M. Valkonen*, L. Solorzano*, C. Carr, K. Kartasalo, C. Boissin, S. Koivukoski, A. Kuusela, D. Rasic, Y. Feng, S. Sinius Pouplier, A. Sharma, K. Ledesma Eriksson, L. Latonen, A. Laenkholm, J. Hartman*, P. Ruusuvuori*, M. Rantalainen*. “A Multi-Stain Breast Cancer Histological Whole-Slide-Image Data Set from Routine Diagnostics”. Scientific Data. vol. 10, no. 1, p. 562, Aug. 2023. *Equal contribution.
V. P. Weitz, V. Sartor, B. Acs, S. Robertson, D. Budelmann, J. Hartman, M. Rantalainen. “Increasing the usefulness of already existing annotations through WSI registration”. [Manuscript]
Institution: Karolinska Institutet
Supervisor: Rantalainen, Mattias
Co-supervisor: Hartman, Johan; Eklund, Martin; Grönberg, Henrik
Issue date: 2023-09-29
Rights:
Publication year: 2023
ISBN: 978-91-8017-148-9