Python Beginner

Hi All,

I am new to programming and Python. I have a data analysis assignment where I am analysing property price data over a number of years and regions. I have a list of regions and a list of the associated prices. The following is a shortened example:

region = ["London", "Glasgow", "Liverpool", "London", "Liverpool", "Manchester", "Cardiff", "Liverpool", "London", "Glasgow", "Liverpool"]

price = [550000,350000,402000,750000,600000,230000,987000,236000,578000,643000,867000]

I have the basic stats (Mean, Median, Max, Min, Std dev) on the price list as a whole.

I would now like to count and display the number of properties sold by region and then provide the basic stats by region.

Any ideas, suggestions, or code examples of how to approach this?

I am not allowed to use pandas, numpy, etc.


I can think of a couple of approaches.

Since you’re trying to calculate stats by region, I’d start by building a list of unique regions. I’ll let you google the different ways you might do that.

Then, for each region in that list, iterate through the two lists together, pick out the prices that belong to that region, and calculate and output the desired statistics.
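To make that first approach concrete, here is a minimal sketch using the sample data from the question. The names unique_regions and prices_for are just ones I made up for illustration, and it assumes the two lists are the same length and line up index for index.

```python
region = ["London", "Glasgow", "Liverpool", "London", "Liverpool", "Manchester",
          "Cardiff", "Liverpool", "London", "Glasgow", "Liverpool"]
price = [550000, 350000, 402000, 750000, 600000, 230000,
         987000, 236000, 578000, 643000, 867000]

# Build the unique regions, keeping the order in which they first appear.
unique_regions = []
for r in region:
    if r not in unique_regions:
        unique_regions.append(r)


def prices_for(name):
    """Return every price whose matching entry in `region` equals `name`."""
    # zip() walks both lists in step, pairing each region with its price.
    return [p for r, p in zip(region, price) if r == name]


for name in unique_regions:
    prices = prices_for(name)
    count = len(prices)
    mean = sum(prices) / count
    print(name, "sold:", count, "mean:", mean,
          "min:", min(prices), "max:", max(prices))
```

The key piece is zip(region, price), which is what lets you pull the price data for one region out of the price list.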

The second approach involves creating a dictionary, where the key is a region. The value for each dictionary entry is then the stats you want to calculate per region. Like the previous approach, you have to iterate through your two lists and calculate the stats for each region. 

The “benefit” of the dictionary approach is the stats are “remembered” and can be accessed after you calculate them; the first approach would output the regional stats as calculated.
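A rough sketch of the dictionary approach, again with the sample data from the question. The helper names (median, std_dev, prices_by_region, stats_by_region) are my own, and I have written the stats by hand in case the standard-library statistics module is also off limits for the assignment. Note that std_dev here is the population standard deviation (dividing by n); swap in n - 1 if your course wants the sample version.

```python
region = ["London", "Glasgow", "Liverpool", "London", "Liverpool", "Manchester",
          "Cardiff", "Liverpool", "London", "Glasgow", "Liverpool"]
price = [550000, 350000, 402000, 750000, 600000, 230000,
         987000, 236000, 578000, 643000, 867000]


def median(values):
    """Middle value of the sorted data (average of the two middles if even)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def std_dev(values):
    """Population standard deviation (divides by n, not n - 1)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return variance ** 0.5


# Step 1: group the prices under their region name.
prices_by_region = {}
for r, p in zip(region, price):
    prices_by_region.setdefault(r, []).append(p)

# Step 2: compute the stats once per region and remember them.
stats_by_region = {}
for name, prices in prices_by_region.items():
    stats_by_region[name] = {
        "count": len(prices),
        "mean": sum(prices) / len(prices),
        "median": median(prices),
        "min": min(prices),
        "max": max(prices),
        "std_dev": std_dev(prices),
    }

for name, stats in stats_by_region.items():
    print(name, stats)
```

Because the results live in stats_by_region, you can look them up later, for example stats_by_region["London"]["median"], without recalculating anything.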

Many thanks for that. I have created the unique list of regions now, but I am still struggling to pull the price data from the price list for each region. Can you give me some code hints or pointers to do this based on the data examples above? I really appreciate your help.

Kinda having trouble understanding what you mean here… but it sounds to me like you need to look into object oriented programming.

class Region:
    def __init__(self, _price, _median, _mean):
        self.price = _price
        self.median = _median
        self.mean = _mean
London = Region(30000, 15000, 25000)
Glasgow = Region(20000, 10000, 25000)
# etc...

This is a very crude sketch, but after that you can access each instance’s values directly (for example, London.mean) and print them.

Classes are good for grouping data together so it’s not loose and scattered around, removing a lot of the guesswork that would come with comparing index values. You could also use a dictionary as a (key, value) store, but each key maps to a single value, so holding several stats per region means making that value a list, tuple, or nested dictionary.

If you’re not familiar with classes, I’ll explain what’s going on here. We’re declaring a “type” called Region, which contains the values price, median, and mean. We use def __init__() as a constructor. The constructor is the function that’s called to create an instance of the class, which is why I can say Region(4000, 500, 3000) etc. to initialize the values of that instance. With London and Glasgow we’re declaring instances of the type Region to initialize the values and store them in a way that’s easier for us to manage.
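For what it’s worth, here is one way the class idea could be tied back to the two lists in the question. Treat it as a sketch only: the add_sale, count, and mean methods and the regions dictionary are names I invented for illustration, and instead of passing precomputed stats into the constructor, each Region collects its own prices and works the numbers out on request.

```python
region = ["London", "Glasgow", "Liverpool", "London", "Liverpool", "Manchester",
          "Cardiff", "Liverpool", "London", "Glasgow", "Liverpool"]
price = [550000, 350000, 402000, 750000, 600000, 230000,
         987000, 236000, 578000, 643000, 867000]


class Region:
    """Holds the sale prices recorded for one region."""

    def __init__(self, name):
        # The constructor runs once for each Region(...) call.
        self.name = name
        self.prices = []

    def add_sale(self, sale_price):
        self.prices.append(sale_price)

    def count(self):
        return len(self.prices)

    def mean(self):
        return sum(self.prices) / len(self.prices)


# One Region instance per unique name, looked up through a dictionary.
regions = {}
for name, p in zip(region, price):
    if name not in regions:
        regions[name] = Region(name)
    regions[name].add_sale(p)

for r in regions.values():
    print(r.name, "sold:", r.count(), "mean:", r.mean())
```

Median, min, max, and standard deviation could be added as further methods in the same style.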

I use Python a little bit, but it’s not my primary language so I’m not much of a “Pythonic” programmer. If my explanation was hard to understand, try reading