Transform transaction data into a one-hot encoded boolean array without using TransactionEncoder
I want to transform transaction data into a one-hot encoded boolean array without using TransactionEncoder, or write a function from scratch that does the same job as TransactionEncoder. This is the code I have so far:
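import pandas as pd

# Load the basket data; each row is one transaction, indexed by ID
df = pd.read_csv('basket.csv')
df.set_index('ID', inplace=True)

# Build one list of items per row; empty cells become the string 'nan'
transactions = []
for i in range(0, len(df)):
    transactions.append([str(df.values[i, j]) for j in range(0, len(df.columns))])
print(transactions[:5])

# Flatten all transactions into a single list of items
flattened = [item for transaction in transactions for item in transaction]
print(len(flattened))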
Output:
[['whole milk', 'eggs', 'salty snack', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], ['whole milk', 'eggs', 'white bread', 'yogurt', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], ['whole milk', 'eggs', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], ['whole milk', 'eggs', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'], ['whole milk', 'eggs', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan']]
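One possible way to build the boolean array by hand is sketched below. It is not TransactionEncoder's actual implementation, only a minimal pure-Python approximation. The function name `one_hot_encode` and its `missing` parameter are hypothetical, and the sketch assumes that the 'nan' strings in the output above are placeholders for empty cells that should not become columns.

```python
def one_hot_encode(transactions, missing=('nan',)):
    """Return (columns, rows): the sorted unique items and one boolean row per transaction."""
    # Collect the unique items, skipping placeholder strings for empty cells
    # (assumption: 'nan' marks a missing value, not a real product)
    columns = sorted({item for t in transactions for item in t if item not in missing})
    rows = []
    for t in transactions:
        present = set(t)  # set membership makes each lookup cheap
        rows.append([col in present for col in columns])
    return columns, rows


# Small sample shaped like the printed transactions above
sample = [
    ['whole milk', 'eggs', 'salty snack', 'nan'],
    ['whole milk', 'eggs', 'white bread', 'yogurt'],
    ['whole milk', 'eggs', 'nan', 'nan'],
]
columns, rows = one_hot_encode(sample)
print(columns)
for row in rows:
    print(row)
```

If a DataFrame is needed afterwards, `pd.DataFrame(rows, columns=columns)` should wrap the result, with one boolean column per item.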
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
