Python Keras deep NN code for tabular categorical features: how to handle values unseen in training data at prediction time
$10-30 USD
Paid on delivery
1
Use an embedding layer for the input layers: one-hot encoding for categorical values.
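How one-hot inputs relate to an embedding layer can be sketched in plain NumPy: multiplying a one-hot row by a weight matrix selects one row of that matrix, which is what an embedding lookup does. This is only an illustration. It assumes integer codes 1..K for known categories and code 0 for anything unseen (a convention chosen here, not stated in the posting), and the weights are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n_categories, emb_dim = 5, 3

# Assumed convention: row 0 is reserved for unseen values and pinned to zeros.
weights = rng.normal(size=(n_categories + 1, emb_dim))
weights[0] = 0.0

codes = np.array([1, 3, 0, 5])  # 0 marks a category not seen in training

# Dense one-hot over the K known categories; code 0 leaves its row all zeros.
one_hot = np.zeros((len(codes), n_categories))
known = codes > 0
one_hot[known, codes[known] - 1] = 1.0

via_matmul = one_hot @ weights[1:]  # one-hot row times weight matrix
via_lookup = weights[codes]         # embedding-style row lookup
```

Both results are the same array, and the unseen value maps to a zero vector. In Keras the lookup form corresponds to an `Embedding` layer fed integer codes; keeping row 0 at zero during training would need extra handling that is left out here.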
2
Provide code that ignores new categorical values in the data at prediction time.
For example, new values should be one-hot encoded as all zeros.
For example, for values used in the training samples:
abc -> 00001
cfr -> 00010
trvbn -> 00100
etc
Values not used in training:
kljghkjlh -> 00000
ygtfrd -> 00000
u7y8uu -> 00000
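The mapping above could be implemented on arrays along these lines. This is a minimal sketch, not a finished solution: the helper names `fit_vocab`, `encode` and `one_hot` are hypothetical, and the column order of the one-hot output follows the sorted vocabulary, so it will not match the illustrative bit patterns above.

```python
import numpy as np

def fit_vocab(train_col):
    """Sorted unique categories seen in training."""
    return np.unique(train_col)

def encode(col, vocab):
    """Map values to 1..K if present in vocab, else 0 (unseen)."""
    pos = np.searchsorted(vocab, col)
    pos = np.minimum(pos, len(vocab) - 1)  # keep indices in range
    known = vocab[pos] == col
    return np.where(known, pos + 1, 0)

def one_hot(codes, n_categories):
    """Dense one-hot; code 0 leaves the row all zeros."""
    out = np.zeros((len(codes), n_categories), dtype=np.uint8)
    rows = np.nonzero(codes)[0]
    out[rows, codes[rows] - 1] = 1
    return out

vocab = fit_vocab(np.array(["abc", "cfr", "trvbn"]))
new = np.array(["abc", "kljghkjlh", "trvbn", "u7y8uu"])
codes = encode(new, vocab)
encoded = one_hot(codes, len(vocab))
```

Here `codes` marks the two unseen strings with 0, and their rows in `encoded` stay all zeros. The dense `one_hot` helper is only for checking small inputs; at the data sizes described in this posting the integer codes themselves are the cheaper thing to keep in memory.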
3
Deliver a working example: data and Keras Python code.
The data table should be big: millions of rows and more than 30 features,
and each feature should have at least 300 categories.
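A synthetic table of this shape might be generated as follows. This is a sketch under assumptions: categories are represented directly as integer codes, the helper name `make_codes` is hypothetical, and the demo uses small row counts so it runs quickly.

```python
import numpy as np

def make_codes(n_rows, n_features=32, n_categories=300, seed=0):
    """Random integer category codes, one column per feature."""
    rng = np.random.default_rng(seed)
    # uint16 holds up to 65535 categories per feature at 2 bytes per cell.
    return rng.integers(0, n_categories, size=(n_rows, n_features),
                        dtype=np.uint16)

# Small demo sizes; raise n_rows to the millions for a realistic run.
train = make_codes(n_rows=10_000, n_categories=300)
# Prediction data draws from a wider range, so codes >= 300 are unseen.
new = make_codes(n_rows=1_000, n_categories=330, seed=1)
unseen_share = float((new >= 300).mean())
```

Drawing the prediction data from a wider code range than the training data gives a controlled share of unseen categories to test against.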
4
For example, use this idea:
[login to view URL]
onehot_encoder = OneHotEncoder(sparse=False, handle_unknown='ignore')
or
[login to view URL]
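The `handle_unknown='ignore'` idea can be tried on the strings from item 2 as a quick check. In this sketch the `sparse` argument is left out on purpose, so scikit-learn returns its default SciPy sparse matrix; newer scikit-learn releases rename that argument to `sparse_output`. The tiny arrays are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# One categorical column; object dtype keeps strings of any length intact.
train = np.array([["abc"], ["cfr"], ["trvbn"]], dtype=object)
new = np.array([["abc"], ["kljghkjlh"], ["u7y8uu"]], dtype=object)

encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train)

encoded = encoder.transform(new)  # SciPy sparse matrix, shape (3, 3)
dense_view = encoded.toarray()    # only for inspecting this tiny demo
```

Only the known value `abc` produces a stored entry; the two unseen strings become empty (all-zero) rows.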
BUT
1
Do not use a slow solution like this one (working with dataframes instead of arrays, not optimized for speed),
as in
[login to view URL]
2
Do not use a memory-inefficient solution,
meaning use a sparse data representation, not a dense one,
since I have big data: a big data table that takes a lot of space in memory when one-hot encoded.
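One way to keep the one-hot table sparse is to build compressed sparse row (CSR) arrays straight from integer codes. The sketch below assumes codes 1..K for known categories and 0 for unseen ones (a convention chosen here), and the helper name `to_csr_parts` is hypothetical. It uses only NumPy.

```python
import numpy as np

def to_csr_parts(codes, n_categories):
    """Build CSR arrays for a one-hot matrix from integer codes.

    codes: (n_rows, n_features) ints, 0 = unseen, 1..n_categories = known.
    Feature j owns columns [j * n_categories, (j + 1) * n_categories).
    """
    n_rows, n_features = codes.shape
    known = codes > 0
    offsets = np.arange(n_features) * n_categories
    cols = (codes.astype(np.int64) - 1) + offsets  # column of each cell
    indices = cols[known]                          # row-major order
    # Row i owns entries indptr[i]:indptr[i + 1]; unseen cells add nothing.
    indptr = np.concatenate(([0], np.cumsum(known.sum(axis=1))))
    data = np.ones(len(indices), dtype=np.uint8)
    shape = (n_rows, n_features * n_categories)
    return data, indices, indptr, shape

codes = np.array([[1, 3], [0, 2], [0, 0]])
data, indices, indptr, shape = to_csr_parts(codes, n_categories=3)
```

These arrays are in the form accepted by `scipy.sparse.csr_matrix((data, indices, indptr), shape=shape)`. For 1 million rows, 30 features and 300 categories, a dense uint8 one-hot table has 9 × 10^9 cells (about 9 GB), while the CSR form stores at most 30 entries per row.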
Project ID: #29464438
About the project
Awarded to:
I have all the skills you need and can develop the model you want. I am proficient in TensorFlow and Keras.