Diogo F. Soares

LASIGE and Department of Informatics, Ciências Ulisboa

E-mail: dfsoares(at)ciencias.ulisboa.pt

Office C6.3.38 @Ciências ULisboa

Short Bio. Diogo F. Soares is an Invited Assistant Professor at the Department of Informatics, Faculty of Sciences, University of Lisbon, and a researcher at LASIGE, contributing to the Health and Biomedical Informatics and Data and Systems Intelligence research lines. He teaches courses in Programming and Data Science and holds a PhD in Informatics and an MSc in Data Science from the University of Lisbon. His research focuses on patient-centered machine learning, with involvement in European and nationally funded projects. His primary research interests include Data Mining, Machine Learning, Unsupervised Learning Algorithms, and Biomedical Informatics.


MSc or PhD Student?

Do you want to do research in machine learning to improve patients' quality of life? Reach out to me!

Check the MSc proposals for 2024/2025 here

Publications

Journal articles

  1. Soares, D. F., Henriques, R., & Madeira, S. C. (2024). Comprehensive assessment of triclustering algorithms for three-way temporal data analysis. Pattern Recognition, 110303.
  2. Amaral, D. M., Soares, D. F., Gromicho, M., de Carvalho, M., Madeira, S. C., Tomás, P., & Aidos, H. (2024). Temporal stratification of amyotrophic lateral sclerosis patients using disease progression patterns. Nature Communications, 15(1), 5717.
  3. Soares, D. F., Henriques, R., Gromicho, M., de Carvalho, M., & Madeira, S. C. (2023). Triclustering-based classification of longitudinal data for prognostic prediction: targeting relevant clinical endpoints in amyotrophic lateral sclerosis. Scientific Reports, 13(1), 6182.
  4. Tavazzi, E., Longato, E., Vettoretti, M., Aidos, H., Trescato, I., Roversi, C., Martins, A. S., Castanho, E. N., Branco, R., Soares, D. F., & others. (2023). Artificial intelligence and statistical methods for stratification and prediction of progression in amyotrophic lateral sclerosis: A systematic review. Artificial Intelligence in Medicine, 142, 102588.
  5. Soares, D. F., Henriques, R., Gromicho, M., de Carvalho, M., & Madeira, S. C. (2022). Learning prognostic models using a mixture of biclustering and triclustering: Predicting the need for non-invasive ventilation in Amyotrophic Lateral Sclerosis. Journal of Biomedical Informatics, 134, 104172.

Conference Proceedings

  1. Martins, A., Amaral, D., Castanho, E., Soares, D., Branco, R., Madeira, S., & Aidos, H. (2024). Predicting the functional rating scale and self-assessment status of ALS patients with sensor data.
  2. Branco, R., Soares, D. F., Martins, A. S., Valente, J. B., Castanho, E. N., Madeira, S. C., & Aidos, H. (2023). Investigating the Impact of Environmental Data on ALS Prognosis with Survival Analysis. CLEF (Working Notes), 1186–1198.
  3. Branco, R., Valente, J. B., Martins, A. S., Soares, D. F., Castanho, E. N., Madeira, S. C., & Aidos, H. (2023). Survival Analysis for Multiple Sclerosis: Predicting Risk of Disease Worsening. CLEF (Working Notes), 1199–1209.
  4. Nunes, S., Sousa, R. T., Serrano, F., Branco, R., Soares, D. F., Martins, A. S., Auletta, E., Castanho, E. N., Madeira, S. C., Aidos, H., & others. (2022). Explaining Artificial Intelligence Predictions of Disease Progression with Semantic Similarity. CLEF (Working Notes), 1256–1268.
  5. Branco, R., Soares, D. F., Martins, A. S., Auletta, E., Castanho, E. N., Nunes, S., Serrano, F., Sousa, R. T., Pesquita, C., Madeira, S. C., & others. (2022). Hierarchical Modelling for ALS Prognosis: Predicting the Progression Towards Critical Events. CLEF (Working Notes), 1211–1227.
  6. Soares, D., Henriques, R., Gromicho, M., Pinto, S., de Carvalho, M., & Madeira, S. C. (2021). Towards triclustering-based classification of three-way clinical data: A case study on predicting non-invasive ventilation in ALS. Practical Applications of Computational Biology & Bioinformatics, 14th International Conference (PACBB 2020) 14, 112–122.

Teaching

Programming I

2024/2025

Construction of Software Systems

2023/2024

Programming I (LTI)

2023/2024, 2022/2023

Programming Labs

2022/2023

Intelligent Systems

2021/2022

Data Mining

2020/2021

Machine Learning

2020/2021

Software

TriHSPAM: Triclustering Heterogeneous Longitudinal Clinical Data using Sequential Patterns

TriHSPAM is an algorithm for triclustering heterogeneous longitudinal clinical data using sequential pattern mining techniques. It identifies patterns across three dimensions (patients, features, and time), allowing for the discovery of meaningful clinical subgroups and progression trends. This method is particularly useful for analyzing complex temporal datasets, such as those found in personalized medicine and disease progression studies.

G-HTric: 3W Dataset Generator with Annotated Examples and Triclustering Solutions

G-HTric is a dataset generator for creating three-way (3W) data with annotated examples and corresponding triclustering solutions. It generates synthetic datasets that simulate real-world temporal data, providing researchers with ground truth to evaluate and benchmark triclustering algorithms. This tool is well suited to validating methods for the analysis of longitudinal and multi-dimensional data.
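
To make the idea of a generator with ground truth concrete, here is a minimal Python sketch of planting one near-constant tricluster in a random three-way dataset. It is not G-HTric's code or interface: the function name `generate_three_way`, its parameters, and the uniform background are all assumptions chosen for illustration.

```python
import random


def generate_three_way(n_obs, n_feat, n_ctx, tric, value=1.0, noise=0.05, seed=0):
    """Build a 3D nested list with one planted tricluster (hypothetical helper).

    tric is a triple of index sets: (observations, features, contexts).
    Returns the data and a ground-truth annotation of the planted tricluster.
    """
    rng = random.Random(seed)
    rows, cols, ctxs = tric
    # Background cells: uniform random values in [0, 1).
    data = [[[rng.random() for _ in range(n_ctx)]
             for _ in range(n_feat)]
            for _ in range(n_obs)]
    # Planted cells: a near-constant value with small additive noise.
    for i in rows:
        for j in cols:
            for k in ctxs:
                data[i][j][k] = value + rng.uniform(-noise, noise)
    truth = {
        "observations": sorted(rows),
        "features": sorted(cols),
        "contexts": sorted(ctxs),
    }
    return data, truth


# 6 observations x 4 features x 3 time points, one planted tricluster.
data, truth = generate_three_way(
    6, 4, 3, tric=({0, 1, 2}, {1, 3}, {0, 2}), value=5.0
)
```

A triclustering algorithm run on `data` can then be scored against `truth`, which is the kind of benchmarking use described above.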

ClusTric

ClusTric is a tool designed to extract comprehensive patterns from triclustering and utilize them in an agglomerative clustering algorithm to reveal distinct patient groups. By identifying multi-dimensional patterns across patients, features, and time, it enhances the discovery of clinically relevant subgroups, supporting more targeted and personalized healthcare strategies.
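
As a rough illustration of feeding triclustering patterns into agglomerative clustering, the Python sketch below groups patients by the set of patterns each one supports. It is not ClusTric's implementation: the Jaccard distance, the average linkage, the pattern identifiers (`t1`, `t2`, ...), and the patient identifiers are assumptions made for this example.

```python
def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = len(a | b)
    return 1.0 - len(a & b) / union if union else 0.0


def agglomerative(memberships, n_clusters):
    """Average-linkage agglomerative clustering over patients.

    memberships maps a patient id to the set of pattern ids
    (e.g. triclusters) that the patient belongs to.
    """
    clusters = [[p] for p in sorted(memberships)]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        dists = [jaccard_distance(memberships[p], memberships[q])
                 for p in c1 for q in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical patient-to-pattern memberships.
memberships = {
    "p1": {"t1", "t2"},
    "p2": {"t1", "t2", "t3"},
    "p3": {"t4"},
    "p4": {"t4", "t5"},
}
groups = agglomerative(memberships, n_clusters=2)
# groups == [["p1", "p2"], ["p3", "p4"]]
```

Patients who share most of their patterns end up in the same group, which is the intuition behind deriving patient groups from multi-dimensional patterns.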

BicTric: Learning prognostic models using a mixture of biclustering and triclustering

BicTric is a framework for building prognostic models by combining biclustering and triclustering techniques. It leverages both methods to identify meaningful patterns in clinical data, enabling the prediction of critical outcomes, such as disease progression and treatment needs.

Learning Predictive Models with a Triclustering-based Classifier

This framework supports building triclustering-based predictive models. It includes a new triclustering algorithm, TCtriCluster.

Blog