Converting a float64 column to an Int64 or string column in Python (short)

Suppose we have a pandas DataFrame df_a with two columns: date (a datetime64 column) and colf (a float64 column), and suppose that colf contains NaNs.

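First we create an integer column from colf. A natural first attempt is:

df_a['coli'] = df_a['colf'].astype(pd.Int64Dtype())

But this can fail with an error such as TypeError: Cannot cast array from dtype('float64') to dtype('int64') according to the rule 'safe'. This typically happens when colf holds values with a fractional part, which cannot be cast to integers safely.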
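To see when the direct cast works and when it fails, here is a small sketch. The Series names whole and frac and their values are made up for illustration, and the sketch assumes a pandas version that provides the nullable Int64 dtype.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one Series of whole-number floats, one with a fraction.
whole = pd.Series([1.0, 2.0, np.nan])
frac = pd.Series([1.5, 2.0, np.nan])

# Whole-number floats (plus NaN) can be cast directly to the nullable Int64 dtype.
whole_int = whole.astype(pd.Int64Dtype())

# Fractional values cannot be cast safely, so pandas raises instead of truncating.
try:
    frac.astype(pd.Int64Dtype())
    cast_failed = False
except (TypeError, ValueError) as err:
    cast_failed = True
    print(type(err).__name__, err)
```

The exact wording of the error message depends on the pandas version.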

So we fall back on the apply method, which can be slow on large frames. Note that np.int64(w0) drops the fractional part (it truncates toward zero):

df_a['coli'] = df_a['colf'].apply(lambda w0: None if pd.isnull(w0) else np.int64(w0)).astype(pd.Int64Dtype())

Indeed, the newly created column has the nullable Int64 dtype:

>>> df_a.dtypes
date    datetime64[ns]
colf           float64
coli             Int64
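Because apply calls the lambda once per row, a vectorized variant may be faster on large frames. The following is a minimal sketch, not the method used above: it assumes that truncating toward zero with np.trunc is the desired rounding (matching np.int64(w0)), and the sample values are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame with a NaN and fractional values.
df_a = pd.DataFrame({"colf": [1.0, 2.7, np.nan, -3.2]})

# Row-by-row version from the text.
via_apply = df_a["colf"].apply(
    lambda w0: None if pd.isnull(w0) else np.int64(w0)
).astype(pd.Int64Dtype())

# Vectorized version: drop the fractional part first, then cast.
# NaN passes through np.trunc unchanged and becomes <NA> in the Int64 column.
via_trunc = np.trunc(df_a["colf"]).astype(pd.Int64Dtype())

print(via_trunc)
```

If rounding to the nearest integer is wanted instead, df_a["colf"].round() can take the place of np.trunc.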

Next we create a string column based on colf. It is well known that astype(str) converts NaN into the string 'nan'. Suppose we write

df_a['cols'] = df_a['colf'].astype('str')

As expected, the NaNs in colf become the literal string 'nan' in cols.
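A small sketch makes the effect visible. The sample values are invented. Behaviour may differ between pandas releases, so the sketch counts both outcomes rather than assuming one.

```python
import numpy as np
import pandas as pd

# Hypothetical sample column with two missing values.
df_a = pd.DataFrame({"colf": [1.5, np.nan, 2.0, np.nan]})
df_a["cols"] = df_a["colf"].astype("str")

# Count entries that became the literal text 'nan' ...
n_literal_nan = int((df_a["cols"] == "nan").sum())
# ... and entries that null checks still recognise as missing.
n_missing = int(df_a["cols"].isna().sum())

print(n_literal_nan, n_missing)
```

When the first count is 2 and the second is 0, the missing values have been turned into ordinary text.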

This form has the disadvantage that the usual null checks (such as isna or dropna) no longer find these records, so we need to keep NaN as NaN. So we again use the apply method:

df_a['cols_better'] = df_a['colf'].apply(lambda flt0: np.nan if pd.isnull(flt0) else str(flt0))

Now cols_better holds strings for the non-null rows, and the NaNs from colf are preserved as NaN.
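As with the integer case, apply can be replaced by a vectorized expression. This is a sketch of one possible alternative rather than the method above: it stringifies everything with astype(str) and then uses Series.where to blank out the rows that were null in colf. The column name cols_where and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample column with two missing values.
df_a = pd.DataFrame({"colf": [1.5, np.nan, 2.0, np.nan]})

# apply-based version from the text.
df_a["cols_better"] = df_a["colf"].apply(
    lambda flt0: np.nan if pd.isnull(flt0) else str(flt0)
)

# Vectorized sketch: convert everything, then keep a value only where colf was not null.
df_a["cols_where"] = df_a["colf"].astype(str).where(df_a["colf"].notna())

# Null checks find the missing rows in both columns.
print(df_a["cols_better"].isna().sum(), df_a["cols_where"].isna().sum())
```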
