features

silk_ml.features.split_classes(X, Y, label)

Splits the requested variable of the dataset into its positive-class and negative-class values

Parameters
  • X (pd.DataFrame) – Main dataset with the variables

  • Y (pd.Series) – Target variable

  • label (str) – Name of the variable to split

Returns

The values of the variable for the positive class and for the negative class

Return type

tuple(pd.Series, pd.Series)
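
A minimal sketch of the behaviour described above, written in plain pandas so it runs without silk_ml. It is not the library's implementation: the helper name `split_classes_sketch`, the sample data, and the assumption that Y is binary with 1 as the positive class and 0 as the negative class are all illustrative.

```python
import pandas as pd


def split_classes_sketch(X, Y, label):
    """Hypothetical stand-in for split_classes (not silk_ml's code).

    Assumes Y is binary (1 = positive, 0 = negative) and shares
    its index with X.
    """
    column = X[label]            # the variable to split
    positive = column[Y == 1]    # rows where the target is positive
    negative = column[Y == 0]    # rows where the target is negative
    return positive, negative


# Illustrative data, not taken from the library
X = pd.DataFrame({"age": [23, 35, 41, 52], "income": [30, 45, 50, 80]})
Y = pd.Series([1, 0, 1, 0], name="churn")

positive, negative = split_classes_sketch(X, Y, "age")
```

Under these assumptions, `positive` holds the `age` values of the rows where Y is 1 and `negative` holds the rest, matching the documented tuple(pd.Series, pd.Series) return type.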

silk_ml.features.features_metrics(X, Y, target_name, plot=None)

Determines, for each variable, how likely it is to split the dataset correctly

Parameters
  • X (pd.DataFrame) – Main dataset with the variables

  • Y (pd.Series) – Target variable

  • target_name (str or None) – Target name for reports

  • plot ('all' or 'categorical' or 'numerical' or None) – Which variables to plot to show the difference between the classes

Returns

Table of variables and their classification tests

Return type

pd.DataFrame
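
The shape of such a report can be sketched as follows. This page does not say which classification tests the library runs, so the sketch substitutes a simple difference of class means for numerical variables only. The helper name `features_metrics_sketch`, the column names of the result, and the binary 1/0 encoding of Y are assumptions for illustration.

```python
import pandas as pd


def features_metrics_sketch(X, Y):
    """Hypothetical per-variable report (not silk_ml's code).

    Uses a difference of class means as a stand-in statistic;
    the tests actually run by features_metrics are not documented here.
    Assumes Y is binary (1 = positive, 0 = negative).
    """
    rows = []
    for name in X.select_dtypes(include="number").columns:
        positive = X[name][Y == 1]
        negative = X[name][Y == 0]
        rows.append({
            "variable": name,
            "positive_mean": positive.mean(),
            "negative_mean": negative.mean(),
            "mean_difference": positive.mean() - negative.mean(),
        })
    # One row per variable, as in the documented pd.DataFrame return type
    return pd.DataFrame(rows).set_index("variable")


# Illustrative data, not taken from the library
X = pd.DataFrame({"age": [20, 30, 40, 50], "visits": [1, 5, 1, 5]})
Y = pd.Series([1, 1, 0, 0])

report = features_metrics_sketch(X, Y)
```

In this toy data `age` differs between the classes while `visits` does not, so only `age` would be a candidate for separating them.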