Daily anomalies using climate data operator (CDO) in Cygwin (Windows 10) - cygwin

I'm a complete beginner with Cygwin and CDO, both of which are installed on Windows 10. I'm working with 3 variables from the ERA5-Land hourly dataset: 2m temperature, total precipitation and runoff. Some facts about these variables:
All three variables are in netCDF format.
2m temperature: hourly values, in Kelvin.
Total precipitation and runoff: hourly values, expressed as depth in metres.
I want to obtain daily anomalies of 2017 relative to a 30-year period (1981-2010). This post gave me a general idea of what to do, but I'm not quite sure how to replicate it. Intuitively, I think these would be the steps:
Convert units according to each var (e.g. K to C for 2m temperature, metres to mm for total precipitation)
Convert data from hourly to daily values
Obtain mean values for 2017 data and 1981-2010 data
Subtract: 30-year mean values minus 2017 mean values
Download the file containing 2017 anomalies
I'm not sure about the order of these procedures.
What would the commands look like in the Cygwin terminal?

Before you start, I would strongly recommend abandoning Cygwin and installing the Linux subsystem under Windows (i.e. not a parallel boot). A quick search will show you that it is very easy to install Ubuntu directly within Windows itself; that way you can open a Linux terminal and easily install anything you want with sudo apt install, e.g.
sudo apt install cdo
Once you have done that, to answer some of your questions:
Convert units according to each var (e.g. K to C for 2m temperature, metres to mm for total precipitation)
e.g. to convert temperature:
cdo subc,273.15 in.nc out.nc
similar for rain using mulc [recall that this doesn't change the "units" metadata; you need NCO for that]
Convert data from hourly to daily values
for instantaneous fields like temperature:
cdo daymean in.nc tempdaymean.nc
for accumulated flux fields (like rain):
cdo daysum -shifttime,-1hour in.nc raindaysum.nc
Obtain mean values for 2017 data and 1981-2010 data.
cdo selyear,2017 -yearmean in.nc year2017_mean.nc
Subtract: 30-year mean values minus 2017 mean value
Erm, usually you want to do this the other way round, no? 2017 minus the long-term mean, so you can see whether it is warmer or cooler:
cdo sub year2017_mean.nc -timmean alldata_daymean.nc year2017_anom.nc
Download the file containing 2017 anomalies
I don't understand this question; haven't you already downloaded the hourly data from the CDS platform? It only makes sense if you are using the CDS toolbox, which doesn't seem to be the case. Anyway, if the downloading step is not clear, you can take a look at my video on this topic on my YouTube channel here: https://www.youtube.com/watch?v=AXG97K6NYD8&t=469s


Performing T-Test on Time Series

My boss asked me to perform a T-Test to test the significance for a certain metric we use called conversion rate.
I have collected 18 months' worth of data for this metric, dating April 1, 2017 - September 30, 2018.
He initially told me to collect 12-14 months of data and run a t-test to look for significance of the metric. (A higher conversion rate is better!)
I'm not really sure how to go about it. Do I split the data into two 9-month samples, i.e. Sample 1: April 2017 - December 2017, Sample 2: January 2018 - September 2018, and run a two-sample t-test? Or would it make sense to compare all of the data against a mean like 0?
Is there a better approach to this? The bottom line is he wants to see that the conversion rate has significantly increased over time.
Thanks,
- Keith
My advice is to dump the t-test and look only at the magnitude of the change in the conversion rate. After all, the conversion rate is what's important to your business. By the way, looking at the magnitude of something practically relevant is called "effect size analysis"; a web search for that should turn up a lot of resources. To get started, just make a plot of the available data -- is conversion rate going up or going down or what?
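To make that concrete, here is a minimal effect-size sketch in plain Python; the monthly conversion rates are invented for illustration, so every number here is an assumption:

```python
# Monthly conversion rates (fractions), Apr 2017 - Sep 2018 -- invented example data.
rates = [0.041, 0.043, 0.040, 0.044, 0.045, 0.046, 0.044, 0.047, 0.048,
         0.049, 0.050, 0.051, 0.052, 0.050, 0.053, 0.054, 0.055, 0.056]

first_half = rates[:9]    # Apr 2017 - Dec 2017
second_half = rates[9:]   # Jan 2018 - Sep 2018

mean_first = sum(first_half) / len(first_half)
mean_second = sum(second_half) / len(second_half)

# The "effect size" here is simply the absolute and relative change in the mean.
absolute_change = mean_second - mean_first
relative_change = absolute_change / mean_first

print(f"first half mean:  {mean_first:.4f}")
print(f"second half mean: {mean_second:.4f}")
print(f"change: {absolute_change:.4f} ({relative_change:.1%})")
```

Whatever the t-test says, it is this magnitude (and a plot of the 18 points) that tells you whether the increase matters to the business.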
Further questions should be directed to stats.stackexchange.com instead of SO. Good luck and have fun.

Getting data from Excel into a Shelve (Python3)

I have come across a Python module for reading Apple serial numbers and giving back the product type. Please see Mac Model Shelf.
It's very powerful and has a vast catalogue of product types, with some entries going well back into the 90's. Slightly overkill for my purposes, and the results it gives back tend to be a little vague (for new Macs at least): no processor type, speed, etc.
I have decided to re-write it slightly with my own database of serial numbers and corresponding machine types.
E.g "W88010010P2" will give back "Black Macbook 2008 2GHz - PG 1001", for the sake of the example. The 'PG' stands for Product Group, the reference code I use to find identical macs in my Filemaker database.
import shelve

databaseOfMacs = shelve.open("macmodelshelfNEW")

inputSerial = "W88010010P2"
modelCodeIsolatedFromSerial = ""

# extracting the model code: 3 digits for older macs, 4 for newer
# (newer macs have 12-digit serials, older ones just 11)
if len(inputSerial) == 12:
    modelCodeIsolatedFromSerial = inputSerial[-4:]
elif len(inputSerial) == 11:
    modelCodeIsolatedFromSerial = inputSerial[-3:]

# setting a key-value pair, for the sake of example
databaseOfMacs['0P2'] = "Black Macbook 2008 2GHz - PG 1001"

model = databaseOfMacs[modelCodeIsolatedFromSerial]
print(model)
This produces the output:
Black Macbook 2008 2GHz - PG 1001
Process finished with exit code 0
Adding key-value pairs inside the script is not practical. I have started to build up an Excel (xlsx) file of the key-value pairs of model codes and their actual descriptions. Just two columns.
A      B
0P2    Black Macbook 2008 2GHz - PG 1001
G8WP   MacBook Pro 2.5GHz i7 (Retina, 15-inch, Mid 2015) - PG 786
I have searched on SO but cannot find a clean way of getting this data into the shelve file. The suggested solutions involve importing the values into a dictionary and then, after the fact, importing the dictionary into the shelve. I am getting stuck on just the first part: errant '0's are getting into the dictionary (following python creating dictionary from excel).
If this could be condensed into a single step it would change everything!
Thanks for any help which is given.
UPDATE
I think I have figured it out myself... just the getting the excel data into a shelve...
import pandas as pd
import shelve

# read the two columns (code -> description) straight into a dict
excelDict = pd.read_excel('serials_and_models.xlsx', header=None, index_col=0, squeeze=True).to_dict()

excelMacDataBaseShelve = shelve.open("excelMacDataBaseShelve")
excelMacDataBaseShelve.update(excelDict)

# to verify all is well
for key in excelMacDataBaseShelve:
    print(key, excelMacDataBaseShelve[key])
The wonderful thing about shelves is that I can just update the excel file as I go and when I need to retrieve some data via the python script it will always be up to date.
If anybody can point out something I've done wrong or could perhaps improve, please leave a comment!!

How can I create Date Object from Date and Time Strings in Lua NodeMCU?

I am playing around with NodeMCU on an ESP8266. I have a date string and a time string from a web request, like this:
15.07.16 (German format DD.MM.YY)
19:50 (24-hour format)
These date-times usually lie a little in the future. I want to get the number of minutes from the current time to the time in the strings above.
I guess I have to create a time object from the strings and then compare it to the current time. But how can I do that with Lua?
Unfortunately there is no os library on NodeMCU (or I might have missed how to enable it).
Calculating the difference manually would be a huge pain which I would like to avoid. Does anyone know a way to compute that with available or external libraries?
Thanks for any support!
There's a pending PR for rtctime that does the exact opposite, Unix epoch to UTC calendar.
If you convert your strings to a Unix epoch X you could do
-- delta in minutes
local delta = (X - rtctime.get()) / 60
You can either calculate X yourself, which is far from trivial due to leap years, leap seconds and other date/time oddities, or you can send a request to http://www.convert-unix-time.com/api?date=15.07.2016%2019:50&timezone=Vienna&format=german and extract the timestamp from it.
First you get the numbers from the strings using Lua's string library:
https://www.lua.org/pil/20.html
https://www.lua.org/manual/5.3/manual.html#6.4
Then you do the time calculations using Lua's os library:
https://www.lua.org/pil/22.1.html
https://www.lua.org/manual/5.3/manual.html#6.9
I won't give you more information, as you did not show any effort of your own to solve the problem.
Addon:
As you don't have the os library (I didn't know that), you can simply calculate that stuff yourself.
Get the day, month, year, hour and minute numbers from the strings using string.sub or string patterns.
Then simply calculate the time difference. You know how many days each month has, how many minutes per hour, and how many hours per day.
Determine whether the year is a leap year (if you don't know how: https://support.microsoft.com/en-us/kb/214019).
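As a sketch of that arithmetic (in Python rather than Lua, with function names of my own invention; the same logic ports directly to NodeMCU using string.sub and integer maths):

```python
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def is_leap(year):
    # Leap year: divisible by 4, except centuries unless divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def minutes_since_epoch(day, month, year, hour, minute):
    """Minutes elapsed since 2000-01-01 00:00 (an arbitrary local epoch)."""
    days = 0
    for y in range(2000, year):
        days += 366 if is_leap(y) else 365
    for m in range(1, month):
        days += DAYS_IN_MONTH[m - 1]
        if m == 2 and is_leap(year):
            days += 1
    days += day - 1
    return (days * 24 + hour) * 60 + minute

# "15.07.16" / "19:50" versus a current time of 15.07.16 18:35 -> 75 minutes.
target = minutes_since_epoch(15, 7, 2016, 19, 50)
now = minutes_since_epoch(15, 7, 2016, 18, 35)
print(target - now)  # 75
```

Both timestamps go through the same epoch, so leap seconds and time zones cancel out as long as both times are local.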

How to compare data between a database and a guide which are differently structured?

A rather complicated problem in data exchange between a database and a book form:
The organisation I work for has a MySQL database of all social-profit organisations in Brussels, Belgium. At the same time there is a booklet, created in InDesign, which was developed at a different time and with different people than the database, and consequently has a different structure.
Every year a new book is published, and because of this difference in structure the data has to be compared manually. The book changes the way it displays entries according to the needs of each chapter. It would help to have a cross-platform search-and-change tool, ideally working not with one keyword but with all the relevant data for an entry in the book.
An example of an entry in the booklet:
BESCHUTTE WERKPLAATS BOUCHOUT
Neromstraat 26 • 1861 Wolvertem • Tel 02-272 42 80 • Fax 02-269 85 03 • Gsm 0484-101 484 • E-mail info@bwbouchout.be • Website www.bwbouchout.be • Working days: 8:00 - 16:30, Friday until 14:45.
People with a physical and/or mental disability. Also psychiatric patients and people with multiple disabilities.
Capacity: 180 employment places.
A problem: the mobile phone number is written in a different format than in the database. The database says: 0484 10 14 84; the book says: 0484-101 484.
The opening times are formulated completely differently, but some of it is similar.
Are there tools which would make life easier? Tools where you would be able to find similar data something like: similar data finder for excel but then cross platform and with more possibilities? I believe most data exchange programs work very "one-way same for every entry". Is there a program which is more flexible?
For clarity: I need to compare the data, not to generate the data out of the database.
It could mean saving a lot of time, money and eyestrain. Thanks,
Erik Willekens
Erik,
The specific problem of comparing two telephone numbers which are formatted differently is relatively easy to overcome by stripping all non-numeric characters.
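A sketch of that stripping step in Python (the helper name is mine):

```python
import re

def normalize_phone(number):
    """Strip everything except digits so differently formatted numbers compare equal."""
    return re.sub(r"\D", "", number)

# The database and booklet formats from the question reduce to the same string:
print(normalize_phone("0484 10 14 84"))  # 0484101484
print(normalize_phone("0484-101 484"))   # 0484101484
print(normalize_phone("0484 10 14 84") == normalize_phone("0484-101 484"))  # True
```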
However I don't think that's really what you are trying to achieve. I believe you're attempting to compare whether the booklet data is different to the database data but disregard certain formatting.
Realistically this isn't possible without some very well-defined rules about formatting. For instance, formatting of the organisation name is probably very significant, whereas telephone number formatting is not.
Instead you should be tracking changes within the database and then manually check the booklet.
One possible solution is to store the booklet details for each record in your database alongside the correctly formatted ones. This allows you to perform a manual conversion once for the entire booklet and then each subsequent year lets you just compare the new booklet values to the old booklet values stored in the DB.
An example might make this clearer. Imagine you had this very simple record:
Org Name    Booklet Org Name    GSM             Booklet GSM
---------   ----------------    -------------   ------------
BESCHUTTE   BESCHUTTE WERKP     0484 10 14 84   0484-101 484
When you get next year's booklet, then as long as the GSM number in the new booklet still says 0484-101 484 you won't have to worry about converting it to your database format and then checking to see if it has changed.
This would not be a good approach if a large proportion of the details in the booklet changed each year.
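With the booklet values stored that way, the yearly check reduces to a dictionary diff. A minimal sketch with invented records and field names:

```python
# Last year's booklet values, as stored alongside the database (invented example).
stored = {
    "BESCHUTTE WERKPLAATS BOUCHOUT": {"booklet_gsm": "0484-101 484"},
}

# Values taken from the new booklet; the GSM number has changed.
new_booklet = {
    "BESCHUTTE WERKPLAATS BOUCHOUT": {"booklet_gsm": "0484-101 485"},
}

# Only entries whose booklet text changed need a manual check.
to_review = [
    (org, field, old[field], new_booklet[org][field])
    for org, old in stored.items()
    for field in old
    if new_booklet.get(org, {}).get(field) != old[field]
]

for org, field, old_value, new_value in to_review:
    print(f"{org}: {field} changed from {old_value!r} to {new_value!r}")
```

Entries whose booklet text is unchanged drop out entirely, which is what saves the eyestrain.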

What is the easiest way to adjust EXIF timestamps on photos from multiple cameras in Windows Vista?

Scenario: Several people go on holiday together, armed with digital cameras, and snap away. Some people remembered to adjust their camera clocks to local time, some left them at their home time, some left them at local time of the country they were born in, and some left their cameras on factory time.
The Problem: Timestamps in the EXIF metadata of photos will not be synchronised, making it difficult to aggregate all the photos into one combined collection.
The Question: Assuming that you have discovered the deltas between all of the camera clocks, what is the simplest way to correct these timestamp differences in Windows Vista?
Use exiftool: open source, written in Perl, but also available as a standalone .exe file. The author seems to have thought of everything EXIF-related, and the code is mature.
examples:
exiftool "-DateTimeOriginal+=5:10:2 10:48:0" DIR
exiftool -AllDates-=1 DIR
refs:
http://www.sno.phy.queensu.ca/~phil/exiftool/
http://www.sno.phy.queensu.ca/~phil/exiftool/#shift
Windows Live Photo Gallery Wave 3 Beta includes this feature. From the help:
If you change the date and time settings for more than one photo at the same time, each photo's time stamp is changed by the same amount, so that the time stamps of all the selected photos remain in their original chronological order.
Instructions:
Select Photos to change (you can use the search feature to limit by camera model, etc).
Right-Click and select 'Change Time Taken...'.
Select a new time and click OK.
Current download location is from LiveSide.net.
Easiest: probably a small Python script that uses something like os.walk to go through all the files below a folder, and then pyexiv2 to actually read and modify the EXIF data. A tutorial on pyexiv2 can be found here.
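The timestamp-shifting core of such a script needs only the standard library. This is a sketch with a function name of my own invention (pyexiv2, or any EXIF library, would supply the read/write half); it parses the EXIF "YYYY:MM:DD HH:MM:SS" format and applies a delta:

```python
from datetime import datetime, timedelta

EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"

def shift_exif_timestamp(value, **delta):
    """Shift an EXIF-style timestamp string by a timedelta, e.g. hours=-3."""
    stamp = datetime.strptime(value, EXIF_FORMAT)
    return (stamp + timedelta(**delta)).strftime(EXIF_FORMAT)

# Camera left on home time, three hours behind the holiday destination:
print(shift_exif_timestamp("2008:08:14 10:48:00", hours=3))
# 2008:08:14 13:48:00
```

The per-camera delta becomes one call per file inside the os.walk loop, so each clock's offset is corrected independently.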
I'd dare to advise my own software for this purpose: EXIFTimeEdit. Open-source and simple, it supports all the variants I could imagine:
Shifting date part (year/month/day/hour/minute) by any value
Setting date part to any value
Determining necessary shift value
Copying resulting timestamp to EXIF DateTime field and last modified property
