feature engineer

count features

class autox.autox_competition.feature_engineer.fe_count.FeatureCount[source]

Convert categorical features into the number of occurrences.

fit(df, degree=1, target=None, df_feature_type=None, silence_cols=[], select_all=True, max_num=None)[source]

Parameters

transform(df)[source]
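
To illustrate what converting categorical features into occurrence counts means, here is a minimal pandas sketch of the general technique. It is not the FeatureCount implementation: the helper name count_encode and the COUNT_ column prefix are assumptions made for this example only.

```python
import pandas as pd


def count_encode(df, cols):
    """Return a new DataFrame with one occurrence-count column per input column.

    Hypothetical helper for illustration; not part of autox.
    """
    out = pd.DataFrame(index=df.index)
    for col in cols:
        # Number of rows in which each category value appears.
        counts = df[col].value_counts()
        # Replace every value by the count of its category.
        out[f"COUNT_{col}"] = df[col].map(counts)
    return out


train = pd.DataFrame({"city": ["a", "b", "a", "c", "a"]})
encoded = count_encode(train, ["city"])
```

In this toy frame, "a" occurs three times and "b" and "c" once each, so every row holding "a" receives the value 3 and the remaining rows receive 1.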

class autox.autox_competition.feature_engineer.fe_cross.FeatureCross(importance_type='split')[source]

A synthetic feature formed by multiplying (crossing) two features.

fit(X, y, objective, category_cols, top_k=10, used_cols=[])[source]

Parameters

X – {array-like, sparse matrix} of shape (n_samples, n_features). Training vector, where n_samples is the number of samples and n_features is the number of features.
y – array-like of shape (n_samples,). Target vector relative to X.
objective – str, either ‘binary’ or ‘regression’.
category_cols – list, column names of categorical features.
top_k – int, number of most important cross features to keep, default top_k = 10.
used_cols – list, columns used for training the model, default used_cols = [].

transform(X)[source]

Parameters: X – {array-like, sparse matrix} of shape (n_samples, n_features). Input vector, where n_samples is the number of samples and n_features is the number of features.
Returns: DataFrame containing the cross features.
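
As a rough illustration of crossing two categorical columns, the sketch below joins the two values of each row into one combined category and encodes it as an integer. This shows the general idea only and is an assumption about one common way to build such a feature, not the FeatureCross implementation; the helper name cross_two and the "__" separator are hypothetical. The step of ranking crosses by model importance and keeping the top_k is not shown.

```python
import pandas as pd


def cross_two(df, col_a, col_b):
    """Cross two categorical columns into a single integer-coded feature.

    Hypothetical helper for illustration; not part of autox.
    """
    # Join the two values of each row into one combined category label.
    combined = df[col_a].astype(str) + "__" + df[col_b].astype(str)
    # Assign an integer code to each distinct combination, in order of appearance.
    codes, _ = pd.factorize(combined)
    return pd.Series(codes, index=df.index, name=f"{col_a}__{col_b}")


X = pd.DataFrame(
    {
        "city": ["a", "a", "b", "b"],
        "device": ["ios", "web", "ios", "ios"],
    }
)
crossed = cross_two(X, "city", "device")
```

Rows that share both the same city and the same device receive the same code, so the last two rows of the toy frame map to one value while the first two stay distinct.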

class autox.autox_competition.feature_engineer.fe_cumsum.FeatureCumsum[source]: cumsum feature description

class autox.autox_competition.feature_engineer.fe_denoising_autoencoder.FeatureDenoisingAutoencoder[source]: DenoisingAutoencoder feature description

class autox.autox_competition.feature_engineer.fe_diff.FeatureDiff[source]: diff feature description