Cleaning up data using Python

Hello guys,
I need some help. I am switching careers to data engineering, and I am stuck on a project I am currently working on.

I have the following data stored in a variable called damages. Some of the entries are missing (see below).

damages = ['Damages not recorded', '100M', 'Damages not recorded', '40M', '27.9M', '5M', 'Damages not recorded', '306M', '2M', '65.8M', '326M', '60.3M', '208M', '1.42B', '25.4M', 'Damages not recorded', '1.54B', '1.24B', '7.1B']

I have been asked to write a function that returns a new list of updated damages, where the recorded data is converted to float values and the missing data is retained as "Damages not recorded".

I was given a hint:
The function iterates through the damages list and uses string parsing and the following conversion dictionary to convert the data.
conversion = {"M": 1000000,
              "B": 1000000000}
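
One way to read the hint is sketched below. This is only a rough sketch under some assumptions: the function name convert_damages is made up for illustration, and it assumes every recorded value ends in a single letter that appears in the conversion dictionary. Anything else is passed through unchanged.

```python
# Conversion dictionary from the hint.
conversion = {"M": 1000000, "B": 1000000000}


def convert_damages(damages, conversion=conversion):
    """Return a new list with recorded damages converted to floats.

    convert_damages is a hypothetical name chosen for this sketch.
    Entries whose last character is not a key in `conversion`
    (such as 'Damages not recorded') are kept as they are.
    """
    updated = []
    for entry in damages:
        suffix = entry[-1:]  # last character, or '' for an empty string
        if suffix in conversion:
            # Parse the numeric part and scale it by the suffix multiplier.
            updated.append(float(entry[:-1]) * conversion[suffix])
        else:
            # Missing data is retained unchanged.
            updated.append(entry)
    return updated


sample = ['Damages not recorded', '100M', '27.9M', '1.42B']
print(convert_damages(sample))
```

The original list is left untouched because the function builds and returns a new one, which matches the wording of the task.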

Can someone please give me some clues as to where I can begin?




Were you able to resolve this?

Hi roxan-dersus,
I am new here and would like to help, but I need more details.
Do you want the sum of all the values as a float?
(e.g. 100 * 1e6 + 40 * 1e6 + 27.9 * 1e6 + … → "167900000.0" up to this point)
How should the "Damages not recorded" entries be handled? Should I only count them, or ignore them?
Are there any values in K (kilo = 1000)?
Is the input a list or a txt file? (I would prefer a txt file; the example is close to CSV.)
Should the output be a list or a txt file? (I would prefer a txt file.)
Are there any restrictions on file size and character types?
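
To make a couple of these questions concrete, here is a small sketch that sums the recorded values and counts the missing ones. It rests on assumptions the original post does not confirm: the name summarize_damages is hypothetical, the "K" entry is added only to illustrate the kilo question, and the input is taken to be a plain Python list.

```python
# "M" and "B" come from the original hint; "K" is an assumption
# added here only to illustrate the kilo question.
conversion = {"K": 1000, "M": 1000000, "B": 1000000000}


def summarize_damages(damages):
    """Return (total, missing): the sum of recorded damages as a float
    and the number of entries that could not be converted.

    summarize_damages is a hypothetical name chosen for this sketch.
    """
    total = 0.0
    missing = 0
    for entry in damages:
        suffix = entry[-1:]
        if suffix in conversion:
            total += float(entry[:-1]) * conversion[suffix]
        else:
            # Entries such as 'Damages not recorded' are counted, not summed.
            missing += 1
    return total, missing


first_five = ['Damages not recorded', '100M', 'Damages not recorded', '40M', '27.9M']
total, missing = summarize_damages(first_five)
print(total, missing)
```

For the first five entries of the list this should give a total close to the 167900000.0 mentioned above, with two entries counted as missing.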

Thanks, Andy

Hi again,
Just for my own practice, I have worked out a solution to your question. Let me know if you are still interested in seeing it.