PeptidesDatasetExpCondition#
- class omicspylib.datasets.peptides.PeptidesDatasetExpCondition(name: str, data: DataFrame, id_col: str, experiment_cols: list, protein_id_col: str | None = None, metadata: dict | None = None)#
Bases: TabularExperimentalConditionDataset
Peptide dataset for a specific experimental condition. Includes all experiments (runs) for that case.
Normally, you don’t have to interact with this object.
PeptidesDataset wraps multiple PeptidesDatasetExpCondition objects under one group.
Constructor#
- PeptidesDatasetExpCondition.__init__(name: str, data: DataFrame, id_col: str, experiment_cols: list, protein_id_col: str | None = None, metadata: dict | None = None) None#
Initializes the object.
- Parameters:
name (str) – Name of the object.
data (pd.DataFrame) – Experiments of the specified condition as a Pandas data frame, where each column is one experiment. This table might contain unrelated columns. Only the columns named in id_col and experiment_cols will be used.
id_col (str) – Column name containing the peptide identifiers. The values in this column are expected to be unique.
experiment_cols (list) – List of the column names for the experiments you want to include in this experimental condition. All of these columns should be present in the provided data frame.
protein_id_col (str, optional) – Column name of the protein identifier column (e.g., Uniprot accession number). You might need to specify this name to be able to convert a PeptidesDataset to a ProteinsDataset. If it is not provided, the information needed for that conversion is not available.
metadata (dict) – Optional metadata.
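The following is a minimal sketch of an input table that matches the parameter descriptions above. The column names (`peptide_id`, `protein_id`, `treated_1`, `treated_2`, `comment`) and the values are made up for illustration. The two checks mirror the stated expectations (unique identifiers, all experiment columns present) and are not taken from the library's own validation code. The constructor call is left as a comment and follows the documented signature.

```python
import pandas as pd

# Hypothetical peptide table: an identifier column, a protein column,
# two experiment columns and one unrelated column that would be ignored.
data = pd.DataFrame({
    "peptide_id": ["PEP1", "PEP2", "PEP3"],
    "protein_id": ["P12345", "P12345", "Q67890"],
    "treated_1": [1200.0, 0.0, 310.5],
    "treated_2": [1150.0, 98.2, 0.0],
    "comment": ["ok", "ok", "low score"],
})

id_col = "peptide_id"
experiment_cols = ["treated_1", "treated_2"]

# Expectations stated in the parameter list: unique ids, all columns present.
ids_are_unique = data[id_col].is_unique
missing_cols = [c for c in experiment_cols if c not in data.columns]

# With omicspylib installed, the call would follow the documented signature:
# from omicspylib.datasets.peptides import PeptidesDatasetExpCondition
# condition = PeptidesDatasetExpCondition(
#     name="treated", data=data, id_col=id_col,
#     experiment_cols=experiment_cols, protein_id_col="protein_id")
```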
Properties#
- PeptidesDatasetExpCondition.experiment_names#
Get the list of experiment names.
- Returns:
A list of the experiment names from the given experimental condition.
- Return type:
list
- PeptidesDatasetExpCondition.id_col#
Column identifier for the record ids.
- Returns:
Column name with the unique identifiers as string.
- Return type:
str
- PeptidesDatasetExpCondition.metadata#
Return the dataset’s metadata. If no values exist, an empty dictionary is returned.
- Returns:
Dataset’s metadata.
- Return type:
dict
- PeptidesDatasetExpCondition.n_experiments#
Returns the number of experiments.
- Returns:
The number of experiments.
- Return type:
int
- PeptidesDatasetExpCondition.name#
Get experimental condition name (e.g. treated, untreated etc.).
- Returns:
Dataset’s name.
- Return type:
str
- PeptidesDatasetExpCondition.record_ids#
A list of unique record (peptide) ids as they are provided by the user.
- Returns:
A list of unique record ids.
- Return type:
list
Methods#
- PeptidesDatasetExpCondition.describe() dict#
Returns basic information about the dataset.
- Returns:
Dataset’s basic information including name, number of experiments, number of records, number of experimental names and number of records per experiment.
- Return type:
dict
- PeptidesDatasetExpCondition.drop(exp: str | list, omit_missing_cols: bool = True) T#
- PeptidesDatasetExpCondition.filter(exp: str | list | None = None, min_frequency: int | None = None, na_threshold: float = 0.0) PeptidesDatasetExpCondition#
Filter dataset based on a given set of properties.
- Parameters:
exp (list, str, optional) – Experiment name or list of experiment names to keep. Leave empty to keep all experiments.
min_frequency (int or None, optional) – If specified, records of the dataset will be filtered based on their within-group frequency.
na_threshold (float or None, optional) – Values below or equal to this threshold are considered missing. It is used to filter records based on the number of missing values.
- Returns:
A new instance of the dataset object, filtered based on the user’s input.
- Return type:
PeptidesDatasetExpCondition
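The interaction between na_threshold and min_frequency can be sketched in plain pandas. This sketch assumes that "within-group frequency" means the number of experiments in which a record has a value above na_threshold. That reading is an assumption, not something the text above confirms, and the table and run names are hypothetical.

```python
import pandas as pd

# Hypothetical quantitative values: rows are records, columns are experiments.
values = pd.DataFrame(
    {
        "run_1": [10.0, 0.0, 5.0],
        "run_2": [12.0, 4.0, 0.0],
        "run_3": [11.0, 3.0, 0.0],
    },
    index=["PEP1", "PEP2", "PEP3"],
)

na_threshold = 0.0
min_frequency = 2

# Assumed meaning of frequency: number of experiments per record
# with a value strictly above the missing-value threshold.
frequency = (values > na_threshold).sum(axis=1)

# Keep only the records observed in at least `min_frequency` experiments.
filtered = values[frequency >= min_frequency]
```

Here PEP3 is observed in only one run, so it is the only record dropped.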
- PeptidesDatasetExpCondition.frequency(na_threshold: float = 0.0, axis: int = 1) DataFrame#
- PeptidesDatasetExpCondition.impute(method: Literal['fixed', 'fixed row', 'row min', 'row mean', 'row median'], na_threshold: float = 0.0, value: float | Series | None = None, shift: float = 0.0, random_noise: bool = False) T#
TBD …
- Parameters:
method
value
na_threshold
shift
random_noise (bool)
- PeptidesDatasetExpCondition.log2_transform() T#
- PeptidesDatasetExpCondition.log2_backtransform() T#
- PeptidesDatasetExpCondition.min(na_threshold: float = 0.0, axis: Literal['rows', 'columns'] | None = None) float | Series#
Calculate the minimum value of that condition. By default, records with quantitative value <= 0.0 will be omitted, so that you don’t get 0.0 during min calculation.
- Parameters:
na_threshold (float) – Values below or equal to this threshold are considered missing.
axis (AxisName, optional) – You can calculate the min over rows or columns.
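The effect of na_threshold on a minimum calculation can be illustrated with plain pandas. This is a sketch of the idea only: it masks values at or below the threshold before taking the minimum. How the library maps its `rows` and `columns` options onto these results is not stated above, so the variables are named by what they summarize instead.

```python
import pandas as pd

# Hypothetical quantitative values: rows are records, columns are experiments.
values = pd.DataFrame(
    {"run_1": [10.0, 0.0, 5.0], "run_2": [12.0, 4.0, 0.0]},
    index=["PEP1", "PEP2", "PEP3"],
)

na_threshold = 0.0

# Values at or below the threshold become NaN, which min() skips by default.
observed = values.where(values > na_threshold)

min_per_record = observed.min(axis=1)      # one value per record
min_per_experiment = observed.min(axis=0)  # one value per experiment
overall_min = observed.min().min()         # single value for the condition
```

Without the masking step, every minimum that touches a missing entry would be 0.0, which is the situation the default threshold is meant to avoid.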
- PeptidesDatasetExpCondition.mean(na_threshold: float = 0.0, axis: int = 1) DataFrame#
- Parameters:
na_threshold
axis (int) – 1 for row by row and 0 for column by column.
- PeptidesDatasetExpCondition.missing_values(na_threshold: float = 0.0) Tuple[DataFrame, int, int]#
Calculate the number of missing values per experiment.
- Parameters:
na_threshold (float, optional) – Values equal to or below this threshold will be considered missing.
- Returns:
pd.DataFrame – A Pandas data frame with the number of missing values per experiment.
int – Number of missing values in total.
int – Total number of values in that condition.
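The three returned quantities can be approximated with plain pandas as follows. This is a sketch under the stated threshold rule, not the library's implementation. The column label `n_missing` and the example table are hypothetical.

```python
import pandas as pd

# Hypothetical quantitative values: rows are records, columns are experiments.
values = pd.DataFrame(
    {"run_1": [10.0, 0.0, 5.0], "run_2": [12.0, 4.0, 0.0]},
    index=["PEP1", "PEP2", "PEP3"],
)

na_threshold = 0.0

# A value counts as missing when it is equal to or below the threshold.
is_missing = values <= na_threshold

# Missing values per experiment, as a one-column data frame.
per_experiment = is_missing.sum(axis=0).to_frame(name="n_missing")

# Totals across the whole condition.
n_missing_total = int(is_missing.to_numpy().sum())
n_values_total = values.size
```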
- PeptidesDatasetExpCondition.shift(exp, value, na_threshold: float = 0.0) None#
Shift values of a given experiment by a fixed value. The specified value will be subtracted from that experiment.
- Parameters:
exp
value
na_threshold (float)
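A shift of one experiment by a fixed value can be sketched in plain pandas. The role of na_threshold is not described above. This sketch assumes that values at or below the threshold are treated as missing and left untouched, which is an assumption and not documented behaviour. The sketch also works on a copy, whereas the documented method returns None.

```python
import pandas as pd

# Hypothetical quantitative values: rows are records, columns are experiments.
values = pd.DataFrame(
    {"run_1": [10.0, 0.0, 5.0], "run_2": [12.0, 4.0, 0.0]},
    index=["PEP1", "PEP2", "PEP3"],
)

na_threshold = 0.0
shift_value = 2.0

# Assumption: only observed values (above the threshold) are shifted.
observed = values["run_1"] > na_threshold

shifted = values.copy()
# Subtract the fixed value from the selected experiment, as described above.
shifted.loc[observed, "run_1"] = shifted.loc[observed, "run_1"] - shift_value
```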
- PeptidesDatasetExpCondition.to_table() DataFrame#
Returns the individual experiments from this condition as a Pandas data frame.
- Returns:
A table with record (peptide) ids as rows and experiment quantitative values as columns.
- Return type:
pd.DataFrame