AttributeError: 'SingleBlockManager' object has no attribute 'log'

I am working with a large dataset of about a million rows and 1000 columns. I have already referred to this post here, so please don't mark this as a duplicate.

If sample data is required, you can use the following:

import numpy as np
import pandas as pd

m = pd.DataFrame(np.array([[1, 0],
                           [2, 3]]))

I have some continuous variables with 0 values in them.

I would like to apply a logarithmic transformation to all of those continuous variables.

However, I encounter a divide-by-zero error, so I tried the suggestions below, based on the post linked above:

df['salary'] = np.log(df['salary'], where=0<df['salary'], out=np.nan*df['salary'])  # not working: "python stopped working" crash

from numpy import ma
ma.log(df['app_reg_diff'])  # error

My questions are as follows:

a) How can I avoid the divide-by-zero error when applying the log to 1000 columns, that is, to all continuous columns at once?

b) How can I exclude the zeros from the log transformation and still get the log values for the remaining non-zero observations?



Solution 1:[1]

You can replace the zero values with a value of your choice and then apply the logarithm as usual.

import numpy as np
import pandas as pd

m = pd.DataFrame(np.array([[1,0], [2,3]]))

m[m == 0] = 1

print(np.log(m))

Here the entries that were zero come out as 0, because log(1) = 0. If you would rather get NaN for them, replace the zeros with -1 instead, since the log of a negative number is NaN.
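If you would rather not overwrite the zeros at all, one option is to mask them out before taking the log. The sketch below is only an illustration under a few assumptions: the frame `df` and its column names are made up, and "continuous" is taken to mean every numeric column as picked by `select_dtypes`, which may be too broad for your data. It shows two ways to get NaN wherever the value is not positive: `DataFrame.where`, and `np.log` with `where=`/`out=` applied to a plain NumPy array rather than a Series.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame; the column names are illustrative only.
df = pd.DataFrame({
    "salary": [1.0, 0.0, 2.0],
    "age": [0, 3, 4],
    "name": ["a", "b", "c"],
})

# Assumption: every numeric column counts as "continuous".
num_cols = df.select_dtypes(include="number").columns
values = df[num_cols]

# Option 1: turn non-positive entries into NaN, then take the log.
# log(NaN) is NaN, so nothing is ever divided by zero.
logged = np.log(values.where(values > 0))

# Option 2: work on a float NumPy array and let np.log skip the
# non-positive entries; skipped positions keep the NaN prefilled in `out`.
arr = values.to_numpy(dtype=float)
out = np.full_like(arr, np.nan)
np.log(arr, out=out, where=arr > 0)
logged_np = pd.DataFrame(out, index=df.index, columns=num_cols)

print(logged)
```

Both results leave the original `df` untouched, so you can assign them back with `df[num_cols] = logged` if that is what you want. With a million rows and 1000 columns the intermediate copies may matter for memory, so processing the columns in batches could be worth considering.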

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Shahriar