How to find the average over an area around a center for a given radius

I have an Excel file that contains the latitude and longitude values of the center of a Tropical Cyclone (TC). The contents of the file are given below:

19.8  69.4
20    69
20.4  68.2
20.5  67.2
20.5  65.7
20.3  65
20.2  64.2
20.2  63.7
20.2  62.9
20.2  62.3
20.2  61.5
20.1  61
20.1 60.3
20    59.5
19.9  58.9
19.8  58.3

Also, I have an NC (NetCDF) file of air temperature, air_temp.nc (the link to the data is given). What I intend to do is average the variable in the NC data over an area of radius 2.5° around the storm center, i.e. for each lat-long value I need to find the average over an area of radius 2.5°. I know how to find a simple average using NumPy mean for an individual lat-long, but I am confused about how to average over an area for a given radius.

Topic numpy matlab pandas python

Category Data Science


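Define a function to calculate the distance between two latitude/longitude points. I found a PHP implementation of the formula below and converted it to Python. The result is the great-circle separation expressed in degrees of arc, so it can be compared directly with the 2.5° radius.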

import math

def getDistanceBetweenPoints(latitude1, longitude1, latitude2, longitude2):
    # Spherical law of cosines; all inputs are in degrees
    theta = math.radians(longitude1 - longitude2)
    lat1 = math.radians(latitude1)
    lat2 = math.radians(latitude2)
    distance = math.sin(lat1) * math.sin(lat2) + math.cos(lat1) * math.cos(lat2) * math.cos(theta)

    # Clamp to [-1, 1] so rounding error cannot take acos out of its domain
    distance = math.acos(max(-1.0, min(1.0, distance)))

    # Return the angular distance in degrees of arc
    return math.degrees(distance)
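A quick sanity check helps confirm that the formula returns an angle in degrees of arc rather than a length. The sketch below re-declares the same formula under a hypothetical name, `angular_distance_deg`, using only the standard library. It also adds a hypothetical helper, `degrees_to_km`, which assumes a spherical Earth with a mean radius of 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)

def angular_distance_deg(lat1, lon1, lat2, lon2):
    """Great-circle separation in degrees (spherical law of cosines)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon1 - lon2)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlon))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))

def degrees_to_km(angle_deg):
    """Convert an arc angle in degrees to kilometres along the surface."""
    return math.radians(angle_deg) * EARTH_RADIUS_KM

# Two points on the same meridian, 2.5 degrees of latitude apart
d = angular_distance_deg(19.8, 69.4, 22.3, 69.4)
print(round(d, 6))                 # angular separation in degrees
print(round(degrees_to_km(d), 1))  # the same separation in km
```

Because the 2.5° radius in the question is itself an angle, the filter can stay in degrees; the km conversion is only needed if a physical length is wanted.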

Import the TC data into a pandas DataFrame and apply the function below to it. That function calls getDistanceBetweenPoints, defined above, for each latitude and longitude.

import pandas as pd

# create a dataframe from the TC file
# (assumes a CSV with 'latitude' and 'longitude' column headers)
df = pd.read_csv(tc_csv_file_path)

# take a copy of the subset of columns we want
df_lat_long = df[['latitude', 'longitude']].copy()

# add a new column which will hold the distance from the centre point
df_lat_long['distance'] = 0.0

Define a function that will be invoked row-wise on the DataFrame

def getDistanceBetweenPoints_df(row):
    # Place latitude/longitude for a centre point here
    # I have filled in 0.0; replace it with the actual values
    centre_lat = 0.0
    centre_long = 0.0
    # Arguments are (lat1, long1, lat2, long2); the result is in degrees
    return getDistanceBetweenPoints(centre_lat, centre_long, row['latitude'], row['longitude'])

Now invoke this function on the DataFrame. It is called row-wise, and the returned values populate the distance column with the distance between the centre latitude/longitude and each row's latitude/longitude

# Invoke the function on each row and store the result
df_lat_long['distance'] = df_lat_long.apply(getDistanceBetweenPoints_df, axis='columns')

# View the dataframe for values only where distance is less than or equal to 2.5°
df_lat_long[df_lat_long['distance'] <= 2.5]

So, these are the latitudes and longitudes whose distance from the given centre is less than or equal to 2.5°. Do whatever you want to do with these.


I think your problem is not so much calculating the average, as it is to create a subset within a radius. Once you have that subset, calculating the average is trivial.

You can find an example of subsetting on radius here: https://stackoverflow.com/questions/59060532/calculate-coordinates-inside-radius-at-each-time-point

Note however that that example works with a 2d circle, which isn't exactly what you are looking for because Earth is a globe. In order to adjust for it, you'll need the great circle distance (or Haversine distance). Why? Look here: https://en.wikipedia.org/wiki/Haversine_formula

You'll find a python implementation of that concept here: https://stackoverflow.com/questions/52889566/calculate-euclidean-distance-for-latitude-and-longitude-pandas-dataframe-pytho
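To make the great-circle idea concrete, here is a minimal NumPy sketch of the haversine formula. The function name `haversine_deg` is hypothetical, and it returns the separation as an angle in degrees so that it can be compared directly with the 2.5° radius. The comparison with the flat distance illustrates why a 2D circle is only an approximation away from the equator.

```python
import numpy as np

def haversine_deg(lat1, lon1, lat2, lon2):
    """Great-circle separation in degrees; accepts scalars or arrays."""
    phi1, phi2 = np.radians(lat1), np.radians(lat2)
    dphi = phi2 - phi1
    dlam = np.radians(lon2) - np.radians(lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(phi1) * np.cos(phi2) * np.sin(dlam / 2) ** 2
    # Clip guards against rounding slightly outside [0, 1]
    return np.degrees(2 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0))))

# A point 2.5 degrees of longitude east of a centre at 20N, 65E
flat = np.hypot(0.0, 2.5)                        # planar distance in degrees
great_circle = haversine_deg(20.0, 65.0, 20.0, 67.5)
print(flat, great_circle)  # the great-circle value is smaller than 2.5
```

At 20°N a step of 2.5° in longitude covers less ground than 2.5° in latitude, so a flat circle would exclude grid points to the east and west that are actually within the radius.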

1. Load the data into a dataframe

import pandas as pd
import xarray as xr

data = xr.open_dataset('file')
df = data.to_dataframe()
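Once the grid is loaded, the remaining steps could look like the sketch below: build a boolean mask of the grid points within 2.5° of a storm centre, then average the field over that mask. It uses plain NumPy with a synthetic grid so that it runs on its own. The names `mean_within_radius`, `lats`, `lons` and `field` are hypothetical; in practice the arrays would come from the dataset's own coordinates and temperature variable, whose names are not shown here. The field is assumed to be 2-D (a single time step), and the mean is unweighted, although weighting by the cosine of latitude is a common refinement on a regular lat-lon grid.

```python
import numpy as np

def angular_distance_deg(lat1, lon1, lat2, lon2):
    """Great-circle separation in degrees (haversine form), array-friendly."""
    phi1, phi2 = np.radians(lat1), np.radians(lat2)
    dlam = np.radians(lon2) - np.radians(lon1)
    a = np.sin((phi2 - phi1) / 2) ** 2 + np.cos(phi1) * np.cos(phi2) * np.sin(dlam / 2) ** 2
    return np.degrees(2 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0))))

def mean_within_radius(field, lats, lons, centre_lat, centre_lon, radius_deg=2.5):
    """Unweighted mean of a 2-D (lat, lon) field within radius_deg of a centre."""
    lon2d, lat2d = np.meshgrid(lons, lats)  # both shaped (nlat, nlon)
    dist = angular_distance_deg(centre_lat, centre_lon, lat2d, lon2d)
    mask = dist <= radius_deg
    # Return NaN when no grid point falls inside the circle
    return np.nanmean(field[mask]) if mask.any() else np.nan

# Synthetic 0.5-degree grid and constant field, for illustration only
lats = np.arange(10.0, 30.5, 0.5)
lons = np.arange(55.0, 75.5, 0.5)
field = np.full((lats.size, lons.size), 300.0)

# First three storm centres from the question, as (lat, lon)
centres = [(19.8, 69.4), (20.0, 69.0), (20.4, 68.2)]
means = [mean_within_radius(field, lats, lons, lat, lon) for lat, lon in centres]
print(means)
```

Looping over the storm centres gives one area average per track point, which is what the question asks for; matching each centre to the right time step of the NetCDF data is left out of this sketch.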
