Eduardo N. Castanho, a PhD student at LASIGE, published the paper “G-Bic: generating synthetic benchmarks for biclustering” in BMC Bioinformatics, a highly ranked journal in the field of bioinformatics (h-index of 231, Scimago Q1), in December 2023. The paper was co-authored by LASIGE integrated researcher Sara C. Madeira, former LASIGE MSc student João P. Lobo, and Rui Henriques from INESC-ID and Instituto Superior Técnico, Universidade de Lisboa.
Biclustering is increasingly used in biomedical data analysis, recommendation tasks, and text mining, and hundreds of biclustering algorithms have been proposed. Synthetic data provides reference solutions against which the patterns found by an algorithm can be compared. However, generating synthetic datasets is challenging, since the generated data must ensure reproducibility, pattern representativeness, and resemblance to real data. The researchers propose G-Bic, a parametrizable generator for biclustering analysis that offers a robust means to assess biclustering solutions according to internal and external metrics. Beyond expanding on aspects of pattern coherence, data quality, and positioning properties, it also handles specificities related to mixed-type datasets and time-series data.
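To illustrate the general idea of a planted reference solution, here is a minimal Python sketch. It is not G-Bic's interface or method: the function names `plant_constant_bicluster` and `jaccard_cells`, their parameters, the uniform background noise, and the constant-value pattern are all assumptions chosen for illustration. The sketch plants one constant bicluster in a random matrix with a fixed seed (for reproducibility) and scores a candidate solution against the known ground truth with a Jaccard index over cells, one simple example of an external metric.

```python
import random


def plant_constant_bicluster(n_rows, n_cols, bic_rows, bic_cols, value=5.0, seed=0):
    """Build a noisy matrix with one planted constant bicluster.

    Hypothetical helper for illustration; returns (matrix, truth), where
    truth is the pair (set of row indices, set of column indices).
    """
    rng = random.Random(seed)  # fixed seed, so the dataset is reproducible
    # Background: uniform noise in [0, 1)
    matrix = [[rng.random() for _ in range(n_cols)] for _ in range(n_rows)]
    # Choose which rows and columns carry the planted pattern
    rows = rng.sample(range(n_rows), bic_rows)
    cols = rng.sample(range(n_cols), bic_cols)
    for r in rows:
        for c in cols:
            matrix[r][c] = value  # constant pattern
    return matrix, (set(rows), set(cols))


def jaccard_cells(found, truth):
    """Jaccard index between the cell sets of two biclusters."""
    found_cells = {(r, c) for r in found[0] for c in found[1]}
    truth_cells = {(r, c) for r in truth[0] for c in truth[1]}
    union = found_cells | truth_cells
    if not union:
        return 1.0  # two empty biclusters are treated as identical
    return len(found_cells & truth_cells) / len(union)


matrix, truth = plant_constant_bicluster(20, 10, bic_rows=4, bic_cols=3)
# A perfect recovery of the planted bicluster scores 1.0
print(jaccard_cells(truth, truth))
```

In practice, `found` would be the output of a biclustering algorithm run on `matrix`; a score near 1 means the algorithm recovered the planted pattern.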
The paper is available here.