Converting a float64-type column into an int64 or string-type column in Python (short)

Address the “safe rule” issue by using the apply clause and pd.Int64Dtype()

Aug 20, 2020

Suppose we have a pandas DataFrame df_a with two columns: date (a datetime64-type column) and colf (a float64-type column). Column colf contains NaNs.
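To make the setup concrete, here is a minimal sketch of such a DataFrame. The dates and values are made up for illustration; only the column names date and colf and the name df_a come from this post.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
df_a = pd.DataFrame(
    {
        "date": pd.to_datetime(["2020-08-01", "2020-08-02", "2020-08-03"]),
        "colf": [1.5, np.nan, 3.0],
    }
)

print(df_a.dtypes)          # date is a datetime64 column, colf is float64
print(df_a["colf"].isna())  # True only for the middle row
```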

Now we want to make an integer-type column from colf. A first attempt is:

df_a['coli'] = df_a['colf'].astype(pd.Int64Dtype())

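But this can fail with an error message like TypeError: Cannot cast array from dtype('float64') to dtype('int64') according to the rule 'safe'. This happens when colf holds values with a fractional part (for example 1.5), because pandas refuses to drop the fraction silently.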
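The failure can be reproduced with a small sketch. The two Series below are hypothetical examples, not data from this post, and the exact wording of the error message may differ between pandas versions.

```python
import numpy as np
import pandas as pd

fractional = pd.Series([1.5, np.nan, 3.0])  # contains a non-integer value
whole = pd.Series([1.0, np.nan, 3.0])       # every non-null value is whole

error_message = None
try:
    fractional.astype(pd.Int64Dtype())
except TypeError as exc:
    # pandas refuses the cast rather than silently dropping the .5
    error_message = str(exc)
print(error_message)

# With only whole-number floats the direct cast goes through,
# and NaN becomes the nullable-integer missing value <NA>.
whole_as_int = whole.astype(pd.Int64Dtype())
print(whole_as_int.dtype)
```

So the direct cast is enough when every non-null value is already a whole number; the workaround is only needed when fractional values are present.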

So we now resort to the apply clause (although it can be slow), converting each non-null value with np.int64, which truncates any fractional part:

df_a['coli'] = df_a['colf'].apply(lambda w0: None if pd.isnull(w0) else np.int64(w0)).astype(pd.Int64Dtype())

The newly created column coli has the nullable integer dtype Int64:

>>> df_a.dtypes
date    datetime64[ns]
colf           float64
coli             Int64
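Here is a self-contained sketch of the conversion on made-up values, including a negative one to show that np.int64 truncates toward zero. The coli_vec column is an assumed vectorized alternative using np.trunc; it is not part of the original approach and is shown only as a possibly faster option.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, chosen to include fractions and a negative number.
df_a = pd.DataFrame({"colf": [1.5, np.nan, -2.7, 3.0]})

# The approach from this post: per-element conversion, then cast to nullable Int64.
df_a["coli"] = (
    df_a["colf"]
    .apply(lambda w0: None if pd.isnull(w0) else np.int64(w0))
    .astype(pd.Int64Dtype())
)

# Assumed alternative (not from this post): truncate in one vectorized step,
# after which every non-null value is whole and the direct cast is accepted.
df_a["coli_vec"] = np.trunc(df_a["colf"]).astype(pd.Int64Dtype())

print(df_a)
```

Both columns should hold the same truncated integers, with the missing value preserved. Whether truncation is the right rounding rule depends on the data; rounding first would be a different choice.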

Next we create a string-type column based on column colf. It is well known that astype(str) converts NaN into the string 'nan'. Suppose we try:

df_a['cols'] = df_a['colf'].astype('str')

As expected, the NaNs in colf are converted into the string 'nan' in column cols.
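A scalar-level check, independent of any particular DataFrame, illustrates why a text 'nan' is a problem: it is an ordinary three-letter string, so pandas does not treat it as missing. The variable name as_text is just for illustration.

```python
import numpy as np
import pandas as pd

as_text = str(np.nan)      # the text a NaN turns into when passed through str()
print(repr(as_text))       # 'nan'
print(pd.isnull(np.nan))   # True: a real missing value
print(pd.isnull(as_text))  # False: just a string
```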

The disadvantage of this form is that the usual null checks (isnull, dropna and so on) no longer find these records, so we need to keep NaN as NaN. So we again use the apply clause:

df_a['cols_better'] = df_a['colf'].apply(lambda flt0: np.nan if pd.isnull(flt0) else str(flt0))

As a result, column cols_better holds strings for the non-null values, and the NaNs from column colf are preserved.
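The following sketch checks this on made-up values. The helper names n_missing and non_null are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values.
df_a = pd.DataFrame({"colf": [1.5, np.nan, 3.0]})

df_a["cols_better"] = df_a["colf"].apply(
    lambda flt0: np.nan if pd.isnull(flt0) else str(flt0)
)

# Missing values are still detectable, so the usual tools keep working.
n_missing = df_a["cols_better"].isna().sum()
non_null = df_a["cols_better"].dropna().tolist()
print(n_missing)
print(non_null)
```

Here isna and dropna behave as they would on the original float column, which is the point of keeping NaN instead of the text 'nan'.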

Written by T Miyamoto
