ChEBI datasets are missing raw data #59

@sfluegel05

Problem

When we introduced _DynamicDataset in #39, we changed the meaning of raw files. Previously, raw_file_names_dict() and raw_file_names() returned the names of the labeled split files train.pkl, validation.pkl and test.pkl. Now, the GO class has a raw_file_names_dict() method that returns only the direct downloads (for GO, go-basic.obo and uniprot_sprot.dat). It overrides the method in _DynamicDataset, which returns data.pkl. raw_file_names() is missing completely.

Solution

  • Introduce a third file names property for data.pkl
  • Link raw_file_names to raw_file_names_dict
  • Add chebi.obo as raw file name for chebi classes
  • Use data.pkl in weighted BCE loss (for calculating weights) (@sfluegel05)
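
A rough sketch of how the first three points could fit together. This is not the actual code from the repository: the class names (DynamicDatasetSketch, ChEBISketch, GOSketch), the name processed_main_file_names_dict, the dictionary keys, and the choice of properties over methods are all assumptions made for illustration. Only the file names come from the description above.

```python
from abc import ABC, abstractmethod


class DynamicDatasetSketch(ABC):
    """Hypothetical stand-in for _DynamicDataset, not the real class."""

    @property
    @abstractmethod
    def raw_file_names_dict(self):
        """Direct downloads only, keyed by a short name."""

    @property
    def raw_file_names(self):
        # Derived from raw_file_names_dict so the two cannot drift apart.
        return list(self.raw_file_names_dict.values())

    @property
    def processed_main_file_names_dict(self):
        # Hypothetical third property for the intermediate data.pkl.
        return {"data": "data.pkl"}

    @property
    def processed_main_file_names(self):
        return list(self.processed_main_file_names_dict.values())


class ChEBISketch(DynamicDatasetSketch):
    @property
    def raw_file_names_dict(self):
        # chebi.obo registered as the raw file for ChEBI classes.
        return {"chebi": "chebi.obo"}


class GOSketch(DynamicDatasetSketch):
    @property
    def raw_file_names_dict(self):
        # Keys are illustrative; the file names are those named in the issue.
        return {"go": "go-basic.obo", "swissprot": "uniprot_sprot.dat"}


print(ChEBISketch().raw_file_names)
print(GOSketch().raw_file_names)
print(GOSketch().processed_main_file_names)
```

With this layout, a subclass only overrides raw_file_names_dict. raw_file_names follows automatically, and data.pkl stays reachable through its own property for consumers such as the weight calculation in the weighted BCE loss.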
