NCO netcdf4 operations - ncwa (Averaging) - linux
I am having trouble combining three files so they can be averaged, and I am not sure how to start. I have three files:
nday1.06.nc, nday1.07.nc, nday.08.nc
each containing the variable SST:
<class 'netCDF4._netCDF4.Variable'>
float32 SST(time, nlat, nlon)
long_name: Surface Potential Temperature
units: degC
coordinates: TLONG TLAT time
grid_loc: 2110
cell_methods: time: mean time: mean time: mean
_FillValue: 9.96921e+36
missing_value: 9.96921e+36
unlimited dimensions: time
current shape = (1, 2400, 3600)
I just need to average the SST variable across the three files and write an output file containing the average.
You need ncra, not ncwa:
http://nco.sourceforge.net/nco.html#ncra
ncra nday1.06.nc nday1.07.nc nday.08.nc out.nc
Similarly, you could use cdo, but you first need to merge the files:
cdo mergetime nday1.06.nc nday1.07.nc nday.08.nc mergedfile.nc
and then average:
cdo timmean mergedfile.nc out.nc
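If you would rather stay in Python, here is a minimal xarray sketch of the same averaging. It assumes the three files share the same grid and can be concatenated along time; the output name out.nc is just an example:

import xarray as xr

# open the three daily-mean files as a single dataset along the time dimension
ds = xr.open_mfdataset(['nday1.06.nc', 'nday1.07.nc', 'nday.08.nc'], combine='by_coords')

# average SST over the three time steps and write the result out
sst_mean = ds['SST'].mean(dim='time', keep_attrs=True)
sst_mean.to_dataset(name='SST').to_netcdf('out.nc')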
Related
python3: Split time series by diurnal periods
I have the following dataset:
01/05/2020,00,26.3,27.5,26.3,80,81,73,22.5,22.7,22.0,993.7,993.7,993.0,0.0,178,1.2,-3.53,0.0
01/05/2020,01,26.1,26.8,26.1,79,80,75,22.2,22.4,21.9,994.4,994.4,993.7,1.1,22,2.0,-3.54,0.0
01/05/2020,02,25.4,26.1,25.4,80,81,79,21.6,22.3,21.6,994.7,994.7,994.4,0.1,335,2.3,-3.54,0.0
01/05/2020,03,23.3,25.4,23.3,90,90,80,21.6,21.8,21.5,994.7,994.8,994.6,0.9,263,1.5,-3.54,0.0
01/05/2020,04,22.9,24.2,22.9,89,90,86,21.0,22.1,21.0,994.2,994.7,994.2,0.3,268,2.0,-3.54,0.0
01/05/2020,05,22.8,23.1,22.8,90,91,89,21.0,21.4,20.9,993.6,994.2,993.6,0.7,264,1.5,-3.54,0.0
01/05/2020,06,22.2,22.8,22.2,92,92,90,20.9,21.2,20.8,993.6,993.6,993.4,0.8,272,1.6,-3.54,0.0
01/05/2020,07,22.6,22.6,22.0,91,93,91,21.0,21.2,20.7,993.4,993.6,993.4,0.4,284,2.3,-3.49,0.0
01/05/2020,08,21.6,22.6,21.5,92,92,90,20.2,20.9,20.1,993.8,993.8,993.4,0.4,197,2.1,-3.54,0.0
01/05/2020,09,22.0,22.1,21.5,92,93,92,20.7,20.8,20.2,994.3,994.3,993.7,0.0,125,2.1,-3.53,0.0
01/05/2020,10,22.7,22.7,21.9,91,92,91,21.2,21.2,20.5,995.0,995.0,994.3,0.0,354,0.0,70.99,0.0
01/05/2020,11,25.0,25.0,22.7,83,91,82,21.8,22.1,21.1,995.5,995.5,995.0,0.8,262,1.5,744.8,0.0
01/05/2020,12,27.9,28.1,24.9,72,83,70,22.3,22.8,21.6,996.1,996.1,995.5,0.7,228,1.9,1392.,0.0
01/05/2020,13,30.4,30.4,27.7,58,72,55,21.1,22.6,20.4,995.9,996.2,995.9,1.6,134,3.7,1910.,0.0
01/05/2020,14,31.7,32.3,30.1,50,58,48,20.2,21.3,19.7,995.8,996.1,995.8,3.0,114,5.4,2577.,0.0
01/05/2020,15,32.9,33.2,31.8,44,50,43,19.1,20.5,18.6,994.9,995.8,994.9,0.0,128,5.6,2853.,0.0
01/05/2020,16,33.2,34.4,32.0,46,48,41,20.0,20.0,18.2,994.0,994.9,994.0,0.0,125,4.3,2700.,0.0
01/05/2020,17,33.1,34.5,32.7,44,46,39,19.2,19.9,18.5,993.4,994.1,993.4,0.0,170,1.6,2806.,0.0
01/05/2020,18,33.6,34.2,32.6,41,47,40,18.5,20.0,18.3,992.6,993.4,992.6,0.0,149,0.0,2319.,0.0
01/05/2020,19,33.5,34.7,32.1,43,49,39,19.2,20.4,18.3,992.3,992.6,992.3,0.3,168,4.1,1907.,0.0
01/05/2020,20,32.1,33.9,32.1,49,51,41,20.2,20.7,18.5,992.4,992.4,992.3,0.1,192,3.7,1203.,0.0
01/05/2020,21,29.9,32.2,29.9,62,62,49,21.8,21.9,20.2,992.3,992.4,992.2,0.0,188,2.9,408.0,0.0
01/05/2020,22,28.5,29.9,28.4,67,67,62,21.8,22.0,21.7,992.5,992.5,992.3,0.4,181,2.3,6.817,0.0
01/05/2020,23,27.8,28.5,27.8,71,71,66,22.1,22.1,21.5,993.1,993.1,992.5,0.0,225,1.6,-3.39,0.0
02/05/2020,00,27.4,28.2,27.3,75,75,68,22.5,22.5,21.7,993.7,993.7,993.1,0.5,139,1.5,-3.54,0.0
02/05/2020,01,27.3,27.7,27.3,72,75,72,21.9,22.6,21.9,994.3,994.3,993.7,0.0,126,1.1,-3.54,0.0
02/05/2020,02,25.4,27.3,25.2,85,85,72,22.6,22.8,21.9,994.4,994.5,994.3,0.1,256,2.6,-3.54,0.0
02/05/2020,03,25.5,25.6,25.3,84,85,82,22.5,22.7,22.1,994.3,994.4,994.2,0.0,329,0.7,-3.54,0.0
02/05/2020,04,24.5,25.5,24.5,86,86,82,22.0,22.5,21.9,993.9,994.3,993.9,0.0,290,1.2,-3.54,0.0
02/05/2020,05,24.0,24.5,23.5,87,88,86,21.6,22.1,21.3,993.6,993.9,993.6,0.7,285,1.3,-3.54,0.0
02/05/2020,06,23.7,24.1,23.7,87,87,85,21.3,21.6,21.3,993.1,993.6,993.1,0.1,305,1.1,-3.51,0.0
02/05/2020,07,22.7,24.1,22.5,91,91,86,21.0,21.7,20.7,993.1,993.3,993.1,0.6,220,1.1,-3.54,0.0
02/05/2020,08,22.9,22.9,22.6,92,92,91,21.5,21.5,21.0,993.2,993.2,987.6,0.0,239,1.5,-3.53,0.0
02/05/2020,09,22.9,23.0,22.8,93,93,92,21.7,21.7,21.4,993.6,993.6,993.2,0.0,289,0.4,-3.53,0.0
02/05/2020,10,23.5,23.5,22.8,92,93,92,22.1,22.1,21.6,994.3,994.3,993.6,0.0,256,0.0,91.75,0.0
02/05/2020,11,26.1,26.2,23.5,80,92,80,22.4,23.1,22.2,995.0,995.0,994.3,1.1,141,1.9,789.0,0.0
02/05/2020,12,28.7,28.7,26.1,69,80,68,22.4,22.7,22.1,995.5,995.5,995.0,0.0,116,2.2,1468.,0.0
02/05/2020,13,31.4,31.4,28.6,56,69,56,21.6,22.9,21.0,995.5,995.7,995.4,0.0,65,0.0,1762.,0.0
02/05/2020,14,32.1,32.4,30.6,48,58,47,19.8,22.0,19.3,995.0,995.6,990.6,0.0,105,0.0,2657.,0.0
02/05/2020,15,34.0,34.2,31.7,43,48,42,19.6,20.1,18.6,993.9,995.0,993.9,3.0,71,6.0,2846.,0.0
02/05/2020,16,34.7,34.7,32.3,38,48,38,18.4,20.3,18.3,992.7,993.9,992.7,1.4,63,6.3,2959.,0.0
02/05/2020,17,34.0,34.7,32.7,42,46,38,19.2,20.0,18.4,991.7,992.7,991.7,2.2,103,4.8,2493.,0.0
02/05/2020,18,34.3,34.7,33.6,41,42,38,19.1,19.4,18.0,991.2,991.7,991.2,2.0,141,4.8,2593.,0.0
02/05/2020,19,33.5,34.5,32.5,42,47,39,18.7,20.0,18.4,990.7,991.4,989.9,1.8,132,4.2,1317.,0.0
02/05/2020,20,32.5,34.2,32.5,47,48,40,19.7,20.3,18.7,990.5,990.7,989.8,1.3,191,4.2,1250.,0.0
02/05/2020,21,30.5,32.5,30.5,59,59,47,21.5,21.6,20.0,979.8,990.5,979.5,0.1,157,2.9,345.5,0.0
02/05/2020,22,28.6,30.5,28.6,67,67,59,21.9,21.9,21.5,978.9,980.1,978.7,0.6,166,2.2,1.122,0.0
02/05/2020,23,27.2,28.7,27.2,74,74,66,22.1,22.2,21.6,978.9,979.3,978.6,0.0,246,1.7,-3.54,0.0
03/05/2020,00,26.5,27.2,26.0,77,80,74,22.2,22.5,22.0,979.0,979.1,978.7,0.0,179,1.4,-3.54,0.0
03/05/2020,01,26.0,26.6,26.0,80,80,77,22.4,22.5,22.1,979.1,992.4,978.7,0.0,276,0.6,-3.54,0.0
03/05/2020,02,26.0,26.5,26.0,79,81,75,22.1,22.5,21.7,978.8,979.1,978.5,0.0,290,0.6,-3.53,0.0
03/05/2020,03,25.3,26.0,25.3,83,83,79,22.2,22.4,21.8,978.6,989.4,978.5,0.5,303,1.0,-3.54,0.0
03/05/2020,04,25.3,25.6,24.6,81,85,81,21.9,22.5,21.7,978.1,992.7,977.9,0.7,288,1.5,-3.00,0.0
03/05/2020,05,23.7,25.3,23.7,88,88,81,21.5,21.9,21.5,977.6,991.8,977.3,1.2,256,1.8,-3.54,0.0
03/05/2020,06,23.3,23.7,23.3,91,91,88,21.7,21.7,21.5,976.9,977.6,976.7,0.4,245,1.8,-3.54,0.0
03/05/2020,07,23.0,23.6,23.0,91,91,89,21.4,21.9,21.3,976.7,977.0,976.4,0.9,257,1.9,-3.54,0.0
03/05/2020,08,23.4,23.4,22.9,90,92,90,21.7,21.7,21.3,976.8,976.9,976.5,0.4,294,1.6,-3.52,0.0
03/05/2020,09,23.0,23.5,23.0,88,90,87,21.0,21.6,20.9,992.1,992.1,976.7,0.8,263,1.6,-3.54,0.0
03/05/2020,10,23.2,23.2,22.5,91,92,88,21.6,21.6,20.8,993.0,993.0,992.2,0.1,226,1.5,29.03,0.0
03/05/2020,11,26.0,26.1,23.2,77,91,76,21.6,22.1,21.5,993.8,993.8,982.1,0.0,120,0.9,458.1,0.0
03/05/2020,12,26.6,27.0,25.5,76,80,76,22.1,22.5,21.4,982.7,994.3,982.6,0.3,121,2.3,765.3,0.0
03/05/2020,13,28.5,28.7,26.6,66,77,65,21.5,23.1,21.2,982.5,994.2,982.4,1.4,130,3.2,1219.,0.0
03/05/2020,14,31.1,31.1,28.5,55,66,53,21.0,21.8,19.9,982.3,982.7,982.1,1.2,129,3.7,1743.,0.0
03/05/2020,15,31.6,31.8,30.7,50,55,49,19.8,20.8,19.2,992.9,993.5,982.2,1.1,119,5.1,1958.,0.0
03/05/2020,16,32.7,32.8,31.1,46,52,46,19.6,20.7,19.2,991.9,992.9,991.9,0.8,122,4.4,1953.,0.0
03/05/2020,17,32.3,33.3,32.0,44,49,42,18.6,20.2,18.2,990.7,991.9,979.0,2.6,133,5.9,2463.,0.0
03/05/2020,18,33.1,33.3,31.9,44,50,44,19.3,20.8,18.9,989.9,990.7,989.9,1.1,170,5.4,2033.,0.0
03/05/2020,19,32.4,33.2,32.2,47,47,44,19.7,20.0,18.7,989.5,989.9,989.5,2.4,152,5.2,1581.,0.0
03/05/2020,20,31.2,32.5,31.2,53,53,46,20.6,20.7,19.4,989.5,989.7,989.5,1.7,159,4.6,968.6,0.0
03/05/2020,21,29.7,32.0,29.7,62,62,51,21.8,21.8,20.5,989.7,989.7,989.4,0.8,154,4.0,414.2,0.0
03/05/2020,22,28.3,29.7,28.3,69,69,62,22.1,22.1,21.7,989.9,989.9,989.7,0.3,174,2.0,6.459,0.0
03/05/2020,23,26.9,28.5,26.9,75,75,67,22.1,22.5,21.7,990.5,990.5,989.8,0.2,183,1.0,-3.54,0.0
The second column is the time (hour). I want to separate the dataset into morning (06-11), afternoon (12-17), evening (18-23) and night (00-05). How can I do it?
You can use pd.cut. With these bin edges the intervals are (-1, 5], (5, 11], (11, 17] and (17, 24], i.e. hours 00-05, 06-11, 12-17 and 18-23, so the labels need to come in that order:
bins = [-1, 5, 11, 17, 24]
labels = ['night', 'morning', 'afternoon', 'evening']
df['day_part'] = pd.cut(df['hour'], bins=bins, labels=labels)
I added column names, including Hour for the second column, and read the data with read_csv, which drops the leading zeroes so that the Hour column is a plain int. To split the rows (i.e. add a column marking the diurnal period), use:
df['period'] = pd.cut(df.Hour, bins=[0, 6, 12, 18, 24], right=False, labels=['night', 'morning', 'afternoon', 'evening'])
Then you can, for example, use groupby to process the groups. Because of the right=False parameter the bins are closed on the left side, so the bin limits are more natural (no need for -1 as an hour), and every bin limit except the last is simply the starting hour of its period - quite natural notation.
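For completeness, here is a small end-to-end sketch of that approach. The file name data.csv and the placeholder column names v0..v16 are assumptions; only the Hour column matters for the split:

import pandas as pd

# illustrative column names; only Date and Hour are needed for the split
cols = ['Date', 'Hour'] + [f'v{i}' for i in range(17)]
df = pd.read_csv('data.csv', header=None, names=cols)

# left-closed bins: [0,6) night, [6,12) morning, [12,18) afternoon, [18,24) evening
df['period'] = pd.cut(df['Hour'], bins=[0, 6, 12, 18, 24], right=False,
                      labels=['night', 'morning', 'afternoon', 'evening'])

# e.g. the mean of the first data column per diurnal period
print(df.groupby('period')['v0'].mean())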
How to create a time array in python for seasonal data
I am working with paleoclimate data (536-550 CE) in NetCDF format, which I imported with xarray. The time format is a bit strange:

import xarray as xr
ds_tas_01 = xr.open_dataset('ue536a01_temp2_seasmean.nc')
ds_tas_01['time']

<xarray.DataArray 'time' (time: 61)>
array([15360215.25, 15360430.75, 15360731.75, 15361031.75, 15370131.75,
       15370430.75, 15370731.75, 15371031.75, 15380131.75, 15380430.75,
       15380731.75, 15381031.75, 15390131.75, 15390430.75, 15390731.75,
       15391031.75, 15400131.75, 15400430.75, 15400731.75, 15401031.75,
       15410131.75, 15410430.75, 15410731.75, 15411031.75, 15420131.75,
       15420430.75, 15420731.75, 15421031.75, 15430131.75, 15430430.75,
       15430731.75, 15431031.75, 15440131.75, 15440430.75, 15440731.75,
       15441031.75, 15450131.75, 15450430.75, 15450731.75, 15451031.75,
       15460131.75, 15460430.75, 15460731.75, 15461031.75, 15470131.75,
       15470430.75, 15470731.75, 15471031.75, 15480131.75, 15480430.75,
       15480731.75, 15481031.75, 15490131.75, 15490430.75, 15490731.75,
       15491031.75, 15500131.75, 15500430.75, 15500731.75, 15501031.75,
       15501231.75])
Coordinates:
  * time     (time) float64 1.536e+07 1.536e+07 1.536e+07 ... 1.55e+07 1.55e+07
Attributes:
    standard_name: time
    bounds: time_bnds
    units: day as %Y%m%d.%f
    calendar: proleptic_gregorian
    axis: T

So I want to make my own time array that I can use to plot the climate data. For monthly data I used:

import numpy as np
time = np.arange('0536-01-31', '0551-01-31', dtype='datetime64[M]')

which gives me an array with the years and months between those two dates. Now I have grouped my data by season using cdo seasmean ('djf', 'mam', 'jja', 'son') and got 61 values instead of 180. Is there a way to regroup the 'time' array to seasonal values, or to create a new time array that corresponds to the seasonal data?
I made it work by setting the step size in np.arange:
time = np.arange('0536-01-31', '0551-01-31', step=3, dtype='datetime64[M]')
This gives one time stamp every three months, so essentially one per 'season'.
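Another option is to decode the file's own time axis instead of building one by hand. A rough sketch, reusing ds_tas_01 from the question and assuming the values really follow the 'day as %Y%m%d.%f' convention shown above (the fractional day is ignored here):

import numpy as np

raw = ds_tas_01['time'].values                 # e.g. 15360430.75
years = (raw // 10000).astype(int)
months = ((raw // 100) % 100).astype(int)
days = (raw % 100).astype(int)
time = np.array([np.datetime64(f'{int(y):04d}-{int(m):02d}-{int(d):02d}')
                 for y, m, d in zip(years, months, days)])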
Resampling Time Series Data (Pandas Python 3)
I am trying to convert data at daily frequency to weekly frequency.

In:
weeklyaapl = pd.DataFrame()
weeklyaapl['Open'] = aapl.Open.resample('W').iloc[0]
# here I am trying to take the first value of aapl.Open
# that falls within the week

Out:
ValueError: .resample() is now a deferred operation
use .resample(...).mean() instead of .resample(...)

I want the true open (the first open that prints for the week, i.e. the open of the first day in that week). The error instead suggests taking the mean of the daily open values for a given week with .mean(), which is not the information I need. I can't seem to interpret the error, and the documentation isn't helping either.
I think you need:
aapl.resample('W').first()
Output:
             Open   High    Low  Close     Volume
Date
2010-01-10  30.49  30.64  30.34  30.57  123432050
2010-01-17  30.40  30.43  29.78  30.02  115557365
2010-01-24  29.76  30.74  29.61  30.72  182501620
2010-01-31  28.93  29.24  28.60  29.01  266424802
2010-02-07  27.48  28.00  27.33  27.82  187468421
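If you want the other columns summarised the way OHLC data usually is (first open, highest high, lowest low, last close, summed volume) rather than the first row of every column, a sketch along these lines should work, assuming aapl is the daily DataFrame with a DatetimeIndex and the column names shown above:

weekly = aapl.resample('W').agg({
    'Open': 'first',    # first open traded in the week
    'High': 'max',      # weekly high
    'Low': 'min',       # weekly low
    'Close': 'last',    # last close of the week
    'Volume': 'sum',    # total weekly volume
})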
Writing to a NetCDF3 file using module netcdf4 in python
I'm having an issue writing to a netCDF3 file using the netCDF4 functions. I tried using the createVariable function, but it gives me this error:

NetCDF: Attempting netcdf-4 operation on netcdf-3 file

The code is:

nc = Dataset(root.fileName,'a',format="NETCDF4")
Hycom_U = nc.createVariable('/variables/Hycom_U','float',('time','lat','lon',))
Hycom_V = nc.createVariable('/variables/Hycom_V','f4',('time','lat','lon',))

nc =
root group (NETCDF3_CLASSIC data model, file format NETCDF3):
    netcdf_library_version: 4.1.3
    format_version: HFRNet_1.0.0
    product_version: HFRNet_1.1.05
    Conventions: CF-1.0
    title: Near-Real Time Surface Ocean Velocity, Hawaii, 2 km Resolution
    institution: Scripps Institution of Oceanography
    source: Surface Ocean HF-Radar
    history: 22-Feb-2017 00:55:46: NetCDF file created 22-Feb-2017 00:55:46: Filtered U and V by GDOP < 1.25 ; FMRC Best Dataset
    references: Terrill, E. et al., 2006. Data Management and Real-time Distribution in the HF-Radar National Network. Proceedings of the MTS/IEEE Oceans 2006 Conference, Boston MA, September 2006.
    creator_name: Mark Otero
    creator_email: motero#ucsd.edu
    creator_url: http://cordc.ucsd.edu/projects/mapping/
    summary: Surface ocean velocities estimated from HF-Radar are representative of the upper 0.3 - 2.5 meters of the ocean. The main objective of near-real time processing is to produce the best product from available data at the time of processing. Radial velocity measurements are obtained from individual radar sites through the U.S. HF-Radar Network. Hourly radial data are processed by unweighted least-squares on a 2 km resolution grid of Hawaii to produce near real-time surface current maps.
    geospatial_lat_min: 20.487279892
    geospatial_lat_max: 21.5720806122
    geospatial_lon_min: -158.903594971
    geospatial_lon_max: -157.490005493
    grid_resolution: 2km
    grid_projection: equidistant cylindrical
    regional_description: Unites States, Hawaiian Islands
    cdm_data_type: GRID
    featureType: GRID
    location: Proto fmrc:HFRADAR,_US_Hawaii,_2km_Resolution,_Hourly_RTV
    History: Translated to CF-1.0 Conventions by Netcdf-Java CDM (NetcdfCFWriter) Original Dataset = fmrc:HFRADAR,_US_Hawaii,_2km_Resolution,_Hourly_RTV; Translation Date = Thu Feb 23 13:35:32 GMT 2017
    dimensions(sizes): time(25), lat(61), lon(77)
    variables(dimensions): float32 u(time,lat,lon), float64 time_run(time), float64 time(time), float32 lat(lat), float32 lon(lon), float32 v(time,lat,lon)
    groups:

What are the netCDF3 operations I can use to add data to the file? I found out that I can add data manually by doing nc.variables["Hycom_U"] = U2, which directly adds the data, but nothing else. Is there a better way to do this?
I believe the issue is that you're claiming the file to be netCDF4 format:
nc = Dataset(root.fileName,'a',format="NETCDF4")
but you really want to indicate that it's netCDF3:
nc = Dataset(root.fileName,'a',format="NETCDF3_CLASSIC")
Additional documentation can be found here.
I figured it out! I simply couldn't use a path as a variable name:
Hycom_U = nc.createVariable('Hycom_U','float',('time','lat','lon',))
This properly created the variable for me.
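To answer the broader question of adding data to a classic-format file, here is a minimal sketch with netCDF4-python. The file name is a placeholder, and U2 stands in for the real current data with the (time, lat, lon) shape from the question:

import numpy as np
from netCDF4 import Dataset

U2 = np.zeros((25, 61, 77), dtype='f4')   # placeholder; use the real current data here

nc = Dataset('hfradar_hawaii.nc', 'a')    # append mode; the format is read from the existing file
# classic (netCDF-3) files have no groups, so use a plain variable name
u = nc.createVariable('Hycom_U', 'f4', ('time', 'lat', 'lon'))
u[:] = U2                                 # write the values into the new variable
nc.close()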
How to determine a formula for execution time given quantitative data, Excel, trendlines, Monte Carlo simulation
Can I get your help on some maths and possibly Excel? I have benchmarked my app, increasing the number of iterations and the number of obligors and recording the time taken in seconds, with the following results:

        200         400         600         800         1000        1200        1400        1600        1800        2000
20000   15.627681   30.0968663  44.7592684  60.9037558  75.8267358  90.3718977  105.8749983 121.0030672 135.9191249 150.3331682
40000   31.7202111  62.3603882  97.2085204  128.8111731 156.2443206 186.6374271 218.324317  249.2699288 279.6008184 310.9970803
60000   47.0708635  92.4599437  138.874287  186.0576007 231.2181381 280.541207  322.9836878 371.3076757 413.4058622 459.6208335
80000   60.7346238  120.3216303 180.471169  241.668982  300.4283548 376.9639188 417.5231669 482.6288981 554.9740194 598.0394434
100000  76.7535915  150.7479245 227.5125656 304.3908046 382.5900043 451.6034296 526.0730786 609.0358776 679.0268121 779.6887277
120000  90.4174626  179.5511355 269.4099593 360.2934453 448.4387573 537.1406039 626.7325734 727.6132992 807.4767327 898.307638

How can I now come up with a function for T (time taken in seconds) as an expression of the number of obligors O and the number of iterations I?

Thanks
I'm not quite sure of the data involved, due to the question's construction/presentation, so I'll assume you're looking for y = f(x). If you load the data into Excel, you can use the SLOPE and INTERCEPT functions on the data ranges to derive an expression of the form y = mx + c, i.e. a linear function. If you want a quadratic or cubic, you can use LINEST with a column of the time data squared/cubed etc. to get the quadratic/cubic parameters, and thus derive an appropriate higher-order function.
I spoke to one of the quants here: the function is of the form T = K*N*O, where T is time, K is some constant, N is iterations and O is obligors. Rearrange for K = T/(N*O), plug in the sample data, take the average over all sample points, and use the standard deviation for the error. I did this for my data and got T = 3.81524E-06 * N * O (with 1.9% error), which is a pretty good approximation.
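For anyone who wants to check that fit outside Excel, here is a small sketch in Python using a few points from the table above; the tuples are (row label, column label, seconds), and since the model is a simple product it does not matter which label is N and which is O:

# a few sample points taken from the table: (row label, column label, seconds)
samples = [
    (20000, 200, 15.627681),
    (60000, 1000, 231.2181381),
    (120000, 2000, 898.307638),
]
ks = [t / (a * b) for a, b, t in samples]
k = sum(ks) / len(ks)
print(k)   # roughly 3.8e-06, in line with the fit above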
Create a chart in Excel, add a trendline, and choose to display the equation on the chart.
To clarify: you have the tabular data below, which you want to fit to some function f(O, I) = t?

        200         400         600         800         1000        1200        1400        1600        1800        2000
20000   15.627681   30.0968663  44.7592684  60.9037558  75.8267358  90.3718977  105.8749983 121.0030672 135.9191249 150.3331682
40000   31.7202111  62.3603882  97.2085204  128.8111731 156.2443206 186.6374271 218.324317  249.2699288 279.6008184 310.9970803
60000   47.0708635  92.4599437  138.874287  186.0576007 231.2181381 280.541207  322.9836878 371.3076757 413.4058622 459.6208335
80000   60.7346238  120.3216303 180.471169  241.668982  300.4283548 376.9639188 417.5231669 482.6288981 554.9740194 598.0394434
100000  76.7535915  150.7479245 227.5125656 304.3908046 382.5900043 451.6034296 526.0730786 609.0358776 679.0268121 779.6887277
120000  90.4174626  179.5511355 269.4099593 360.2934453 448.4387573 537.1406039 626.7325734 727.6132992 807.4767327 898.307638

A rough guess is that t looks linear in both O and I, so f would be of the form t = aO + bI + c. Plug in a few (O, I, t) points and see what a, b and c should be.
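A quick way to test that additive guess is an ordinary least-squares fit in numpy, as sketched below. It assumes the columns of the table are O and the rows are I; if the residual turns out large, the multiplicative form T = K*N*O from the answer above is the better model:

import numpy as np

# a few (O, I, t) points read off the table
pts = np.array([
    [200,   20000,  15.627681],
    [2000,  20000,  150.3331682],
    [200,   120000, 90.4174626],
    [2000,  120000, 898.307638],
    [1000,  60000,  231.2181381],
])
# design matrix for t = a*O + b*I + c
A = np.column_stack([pts[:, 0], pts[:, 1], np.ones(len(pts))])
(a, b, c), residual, *_ = np.linalg.lstsq(A, pts[:, 2], rcond=None)
print(a, b, c, residual)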