Updating a YAML file in Python - python-3.x

I am updating the following template.yaml file in Python3:
alpha:
  alpha_1:
  alpha_2:
beta:
  beta_1:
  beta_2:
    -  beta_2a:
       beta_2b:
gamma:
Using ruamel.yaml I am able to fill in the blank values correctly.
import ruamel.yaml.util
file_name = 'template.yaml'
config, ind, bsi = ruamel.yaml.util.load_yaml_guess_indent(open(file_name))
By updating each element I arrive at:
alpha:
  alpha_1: "val_alpha1"
  alpha_2: "val_alpha2"
beta:
  beta_1: "val_beta1"
  beta_2:
    -  beta_2a: "val_beta2a"
       beta_2b: "val_beta2b"
gamma: "val_gamma"
Here is the issue: I may need additional child elements in the beta_2 node, like this:
alpha:
  alpha_1: "val_alpha1"
  alpha_2: "val_alpha2"
beta:
  beta_1: "val_beta1"
  beta_2:
    -  beta_2a: "val_beta2a"
       beta_2b: "val_beta2b"
    -  beta_2c: "val_beta2c"
       beta_2d: "val_beta2d"
gamma: "val_gamma"
I do not know in advance whether I will need more branches like the ones above, and changing the template each time is not an option.
My attempts with update() or appending a dict were unsuccessful. How can I get the desired result?
My attempt:
entry = config["beta"]["beta_2"]
entry[0]["beta_2a"] = "val_beta2a"
entry[0]["beta_2b"] = "val_beta2b"
entry[0].update = {"beta_2c": "val_beta2a", "beta_2d": "val_beta2d"}
In this case, the program does not display any changes in the results, meaning that the last line with update did not work at all.

Your indent is five for the list, with a two-space offset for the indicator (-),
so there is no real need to try to analyse the indent unless some other program
changes it.
The value for beta_2 is a list; to get what you want you need to append
a dictionary to that list:
import sys
from pathlib import Path
import ruamel.yaml
from ruamel.yaml.scalarstring import DoubleQuotedScalarString as DQ
file_name = Path('template.yaml')
yaml = ruamel.yaml.YAML()
yaml.indent(sequence=5, offset=2)
config = yaml.load(file_name)
config['alpha'].update(dict(alpha_1=DQ('val_alpha1'), alpha_2=DQ('val_alpha2')))
config['beta'].update(dict(beta_1=DQ('val_beta1')))
config['gamma'] = DQ('val_gamma')
entry = config["beta"]["beta_2"]
entry[0]["beta_2a"] = DQ("val_beta2a")
entry[0]["beta_2b"] = DQ("val_beta2b")
entry.append(dict(beta_2c=DQ("val_beta2a"), beta_2d=DQ("val_beta2d")))
yaml.dump(config, sys.stdout)
which gives:
alpha:
  alpha_1: "val_alpha1"
  alpha_2: "val_alpha2"
beta:
  beta_1: "val_beta1"
  beta_2:
    -  beta_2a: "val_beta2a"
       beta_2b: "val_beta2b"
    -  beta_2c: "val_beta2a"
       beta_2d: "val_beta2d"
gamma: "val_gamma"
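One plausible reading of why the attempt in the question changed nothing: `entry[0].update = {...}` assigns to an attribute named update instead of calling the method, so no key is added. The sketch below illustrates this with plain Python. `Node` and `add_branches` are hypothetical names (not part of ruamel.yaml), and the sketch assumes the loaded mapping is a dict subclass that accepts attribute assignment, which would explain why no error was raised.

```python
class Node(dict):
    """Hypothetical stand-in for a loaded mapping: a dict subclass that,
    unlike a plain dict, accepts instance attributes."""


beta_2 = [Node(beta_2a="val_beta2a", beta_2b="val_beta2b")]

# Assignment rebinds the attribute `update` on this one object.
# No key is added and no error is raised.
beta_2[0].update = {"beta_2c": "val_beta2c", "beta_2d": "val_beta2d"}
keys_after_assignment = list(beta_2[0])


def add_branches(sequence, branches):
    """Append one new mapping per branch, however many there are."""
    for branch in branches:
        sequence.append(Node(branch))


# Appending a mapping is what creates a new "- ..." item in the list.
add_branches(beta_2, [{"beta_2c": "val_beta2c", "beta_2d": "val_beta2d"}])
print(keys_after_assignment)
print(len(beta_2))
```

Because the number of branches is just the length of the list passed in, the template itself never has to change.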

Related

Make all timestamps in a list have the same format

I have this list and would like for all of the timestamps to have the same format (... = more elements):
timestampList = [...
"8:36 - Appointment1",
"9:21 - Appointment2",
"10:01 - Appointment3",
"11:52 - Appointment4",
"12:18 - Appointment5" ...]
Is there an easy way to make sure all timestamps in the list have the same format (HH:MM)? Is there perhaps a module that makes this possible? I have tried to solve the problem but couldn't find a way of doing it. I want the list to look like this:
timestampList = [...
"08:36 - Appointment1",
"09:21 - Appointment2",
"10:01 - Appointment3",
"11:52 - Appointment4",
"12:18 - Appointment5" ...]
You can use re.sub and a lookahead regex from the beginning of the line. If we see that the timestamp starts with \d:, then prepend a "0":
>>> import re
>>> [re.sub(r"^(?=\d:)", "0", x) for x in timestampList]
['08:36 - Appointment1', '09:21 - Appointment2', '10:01 - Appointment3', '11:52 - Appointment4', '12:18 - Appointment5']
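For comparison, the same padding can be sketched without a regex by splitting each entry at the first colon and zero-filling the hour. `pad_hour` is a hypothetical helper name, and the sketch assumes every entry starts with an `H:MM` or `HH:MM` timestamp.

```python
def pad_hour(entry):
    """Left-pad the hour of an 'H:MM - text' entry to two digits."""
    hour, rest = entry.split(":", 1)  # split only at the first colon
    return f"{hour.zfill(2)}:{rest}"


timestampList = ["8:36 - Appointment1", "10:01 - Appointment3"]
padded = [pad_hour(entry) for entry in timestampList]
print(padded)  # ['08:36 - Appointment1', '10:01 - Appointment3']
```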

decimal.InvalidOperation: [<class 'decimal.InvalidOperation'>]

I thought that setting a fixed number of decimal points to all numbers of an array of Decimals, and the new arrays resulting from operations thereof, could be achieved by simply doing:
from decimal import *
getcontext().prec = 5 # 4 decimal points
v = Decimal(0.005)
print(v)
0.005000000000000000104083408558608425664715468883514404296875
However, I get spurious results that I know are the consequence of the contribution of these extra decimals to the calculations. Therefore, as a workaround, I used the round() function like this:
C_subgrid= [Decimal('33.340'), Decimal('33.345'), Decimal('33.350'), Decimal('33.355'), Decimal('33.360'), Decimal('33.365'), Decimal('33.370'), Decimal('33.375'), Decimal('33.380'), Decimal('33.385'), Decimal('33.390'), Decimal('33.395'), Decimal('33.400'), Decimal('33.405'), Decimal('33.410'), Decimal('33.415'), Decimal('33.420'), Decimal('33.425'), Decimal('33.430'), Decimal('33.435'), Decimal('33.440'), Decimal('33.445'), Decimal('33.450'), Decimal('33.455'), Decimal('33.460'), Decimal('33.465'), Decimal('33.470'), Decimal('33.475'), Decimal('33.480'), Decimal('33.485'), Decimal('33.490'), Decimal('33.495'), Decimal('33.500'), Decimal('33.505'), Decimal('33.510'), Decimal('33.515'), Decimal('33.520'), Decimal('33.525'), Decimal('33.530'), Decimal('33.535'), Decimal('33.540'), Decimal('33.545'), Decimal('33.550'), Decimal('33.555'), Decimal('33.560'), Decimal('33.565'), Decimal('33.570'), Decimal('33.575'), Decimal('33.580'), Decimal('33.585'), Decimal('33.590'), Decimal('33.595'), Decimal('33.600'), Decimal('33.605'), Decimal('33.610'), Decimal('33.615'), Decimal('33.620'), Decimal('33.625'), Decimal('33.630'), Decimal('33.635'), Decimal('33.640'), Decimal('33.645'), Decimal('33.650'), Decimal('33.655'), Decimal('33.660'), Decimal('33.665'), Decimal('33.670'), Decimal('33.675'), Decimal('33.680'), Decimal('33.685'), Decimal('33.690'), Decimal('33.695'), Decimal('33.700'), Decimal('33.705'), Decimal('33.710'), Decimal('33.715'), Decimal('33.720'), Decimal('33.725'), Decimal('33.730'), Decimal('33.735'), Decimal('33.740'), Decimal('33.745'), Decimal('33.750'), Decimal('33.755'), Decimal('33.760'), Decimal('33.765'), Decimal('33.770'), Decimal('33.775'), Decimal('33.780'), Decimal('33.785'), Decimal('33.790'), Decimal('33.795'), Decimal('33.800'), Decimal('33.805'), Decimal('33.810'), Decimal('33.815'), Decimal('33.820'), Decimal('33.825'), Decimal('33.830'), Decimal('33.835'), Decimal('33.840'), Decimal('33.845'), Decimal('33.850'), Decimal('33.855'), 
Decimal('33.860'), Decimal('33.865'), Decimal('33.870'), Decimal('33.875'), Decimal('33.880'), Decimal('33.885'), Decimal('33.890'), Decimal('33.895'), Decimal('33.900'), Decimal('33.905'), Decimal('33.910'), Decimal('33.915'), Decimal('33.920'), Decimal('33.925'), Decimal('33.930'), Decimal('33.935'), Decimal('33.940'), Decimal('33.945'), Decimal('33.950'), Decimal('33.955'), Decimal('33.960'), Decimal('33.965'), Decimal('33.970'), Decimal('33.975'), Decimal('33.980'), Decimal('33.985'), Decimal('33.990'), Decimal('33.995'), Decimal('34.000'), Decimal('34.005'), Decimal('34.010'), Decimal('34.015'), Decimal('34.020'), Decimal('34.025'), Decimal('34.030'), Decimal('34.035'), Decimal('34.040'), Decimal('34.045'), Decimal('34.050'), Decimal('34.055'), Decimal('34.060'), Decimal('34.065'), Decimal('34.070'), Decimal('34.075'), Decimal('34.080'), Decimal('34.085'), Decimal('34.090'), Decimal('34.095'), Decimal('34.100'), Decimal('34.105'), Decimal('34.110'), Decimal('34.115'), Decimal('34.120'), Decimal('34.125'), Decimal('34.130'), Decimal('34.135'), Decimal('34.140')]
C_subgrid = [round(v, 4) for v in C_subgrid]
I got the values of the C_subgrid list by printing it during execution of my code and pasting it here. I am not sure where the single quotes come from. This code snippet worked fine in Python 2.7, but when I upgraded to Python 3.7 it started raising this error:
File "/home2/thomas/Documents/4D-CHAINS_dev/lib/peak.py", line 301, in <listcomp>
C_subgrid = [round(v, 4) for v in C_subgrid] # convert all values to fixed decimal length floats!
decimal.InvalidOperation: [<class 'decimal.InvalidOperation'>]
Strangely, if I run it within ipython it works fine, only within my code it creates problems. Can anybody think of any possible reason?
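A likely explanation, offered as a sketch rather than a confirmed diagnosis of the code above: prec counts significant digits, not decimal places, and in Python 3 rounding a Decimal to n places behaves like quantize, which signals InvalidOperation when the result needs more digits than prec allows. Rounding 33.340 to four places gives 33.3400, which has six significant digits, so it fails under prec = 5 but succeeds under the default precision of 28 (which would be in effect in a fresh ipython session). `round_fixed` is a hypothetical helper used only for illustration.

```python
from decimal import Decimal, InvalidOperation, localcontext


def round_fixed(value, places, prec):
    """Round `value` to `places` decimals under a temporary context
    limited to `prec` significant digits."""
    with localcontext() as ctx:
        ctx.prec = prec
        return round(value, places)


v = Decimal('33.340')

try:
    round_fixed(v, 4, prec=5)  # 33.3400 needs 6 significant digits
    failed = False
except InvalidOperation:
    failed = True

ok = round_fixed(v, 4, prec=28)  # default precision is large enough
print(failed, ok)
```

Note also that prec applies to the results of arithmetic, not to construction, which is why Decimal(0.005) prints the full binary expansion of the float; Decimal('0.005') avoids that.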

Overwrite GPS coordinates in Image Exif using Python 3.6

I am trying to transform image geotags so that images and ground control points lie in the same coordinate system inside my software (Pix4D mapper).
The answer here says:
Exif data is standardized, and GPS data must be encoded using
geographical coordinates (minutes, seconds, etc) described above
instead of a fraction. Unless it's encoded in that format in the exif
tag, it won't stick.
Here is my code:
import os, piexif, pyproj
from PIL import Image
img = Image.open(os.path.join(dirPath,fn))
exif_dict = piexif.load(img.info['exif'])
breite = exif_dict['GPS'][piexif.GPSIFD.GPSLatitude]
lange = exif_dict['GPS'][piexif.GPSIFD.GPSLongitude]
breite = breite[0][0] / breite[0][1] + breite[1][0] / (breite[1][1] * 60) + breite[2][0] / (breite[2][1] * 3600)
lange = lange[0][0] / lange[0][1] + lange[1][0] / (lange[1][1] * 60) + lange[2][0] / (lange[2][1] * 3600)
print(breite) #48.81368778730952
print(lange) #9.954511162420633
x, y = pyproj.transform(wgs84, gk3, lange, breite) #from WGS84 to GaussKrüger zone 3
print(x) #3570178.732528623
print(y) #5408908.20172699
exif_dict['GPS'][piexif.GPSIFD.GPSLatitude] = [ ( (int)(round(y,6) * 1000000), 1000000 ), (0, 1), (0, 1) ]
exif_bytes = piexif.dump(exif_dict) #error here
img.save(os.path.join(outPath,fn), "jpeg", exif=exif_bytes)
I am getting struct.error: argument out of range in the dump method. The original GPSInfo tag looks like: {0: b'\x02\x03\x00\x00', 1: 'N', 2: ((48, 1), (48, 1), (3449322402, 70000000)), 3: 'E', 4: ((9, 1), (57, 1), (1136812930, 70000000)), 5: b'\x00', 6: (3659, 10)}
I am guessing I have to offset the values and encode them properly before writing, but have no idea what is to be done.
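A possible cause of the struct.error, stated as an assumption: the EXIF format stores each RATIONAL as two unsigned 32-bit integers, and the numerator built from the projected y value is far larger than that. The sketch below checks the range and shows one way to turn a decimal degree value into degree, minute, second rationals. `fits_exif_rational` and `decimal_to_dms_rationals` are hypothetical helpers, not part of piexif.

```python
UINT32_MAX = 2**32 - 1  # largest value an unsigned 32-bit field can hold


def fits_exif_rational(numerator, denominator):
    """True if both parts fit in unsigned 32-bit fields."""
    return 0 <= numerator <= UINT32_MAX and 0 < denominator <= UINT32_MAX


def decimal_to_dms_rationals(value, seconds_denominator=10000):
    """Convert decimal degrees to ((deg, 1), (min, 1), (sec, denom)).

    The sign is dropped; it belongs in the separate Ref tag (N/S, E/W).
    Values whose seconds round up to 60 are not normalized here.
    """
    value = abs(value)
    degrees = int(value)
    minutes_float = (value - degrees) * 60
    minutes = int(minutes_float)
    seconds = round((minutes_float - minutes) * 60 * seconds_denominator)
    return ((degrees, 1), (minutes, 1), (seconds, seconds_denominator))


# Numerator from the question: the projected y coordinate times 1e6.
too_big = int(round(5408908.20172699, 6) * 1000000)
print(fits_exif_rational(too_big, 1000000))

dms = decimal_to_dms_rationals(48.81368778730952)
print(dms)
```

A projected Gauss-Krüger coordinate in metres is also not a latitude in degrees, so even a value that fits may not be interpreted as intended by other software.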
It looks like you are already using PIL and Python 3.x. I am not sure whether you want to keep using piexif, but either way you may find it easier to convert the degrees, minutes, and seconds into decimal first. You appear to be doing that already, but putting it in a separate function may be clearer and lets you account for the direction reference.
Here's an example:
def get_decimal_from_dms(dms, ref):
    degrees = dms[0][0] / dms[0][1]
    minutes = dms[1][0] / dms[1][1] / 60.0
    seconds = dms[2][0] / dms[2][1] / 3600.0
    if ref in ['S', 'W']:
        degrees = -degrees
        minutes = -minutes
        seconds = -seconds
    return round(degrees + minutes + seconds, 5)
def get_coordinates(geotags):
    lat = get_decimal_from_dms(geotags['GPSLatitude'], geotags['GPSLatitudeRef'])
    lon = get_decimal_from_dms(geotags['GPSLongitude'], geotags['GPSLongitudeRef'])
    return (lat,lon)
The geotags argument in this example is a dictionary keyed by the GPSTAGS names instead of the numeric codes, for readability. You can find more detail and the complete example in this blog post: Getting Started with Geocoding Exif Image Metadata in Python 3
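As a usage sketch, the conversion can be applied to the GPSInfo values quoted in the question. The function is repeated in condensed form so the block runs on its own, and the name-keyed `geotags` dictionary is built by hand here as an assumption about what the tag-name mapping would produce.

```python
def get_decimal_from_dms(dms, ref):
    """Convert ((deg, d), (min, d), (sec, d)) rationals to decimal degrees."""
    degrees = dms[0][0] / dms[0][1]
    minutes = dms[1][0] / dms[1][1] / 60.0
    seconds = dms[2][0] / dms[2][1] / 3600.0
    value = degrees + minutes + seconds
    if ref in ('S', 'W'):  # southern and western hemispheres are negative
        value = -value
    return round(value, 5)


# Values copied from the GPSInfo tag in the question, keyed by name.
geotags = {
    'GPSLatitudeRef': 'N',
    'GPSLatitude': ((48, 1), (48, 1), (3449322402, 70000000)),
    'GPSLongitudeRef': 'E',
    'GPSLongitude': ((9, 1), (57, 1), (1136812930, 70000000)),
}

lat = get_decimal_from_dms(geotags['GPSLatitude'], geotags['GPSLatitudeRef'])
lon = get_decimal_from_dms(geotags['GPSLongitude'], geotags['GPSLongitudeRef'])
print(lat, lon)
```

The results agree, to five decimal places, with the 48.81368778730952 and 9.954511162420633 printed in the question.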
After much hemming and hawing I reached the pages of the py3exiv2 image metadata manipulation library. One will find exhaustive lists of the metadata tags as one reads through, but here is the list of EXIF tags just to save a few clicks.
It runs smoothly on Linux and provides many opportunities to edit image-headers. The documentation is also quite clear. I recommend this as a solution and am interested to know if it solves everyone else's problems as well.

Minimal self-compiling to .pdf Rmarkdown file

I need to compose a simple rmarkdown file, with text, code and the results of executed code included in a resulting PDF file. I would prefer the source file to be executable and self-sufficient, avoiding the need for a makefile.
This is the best I have been able to achieve, and it is far from good:
#!/usr/bin/env Rscript
library(knitr)
pandoc('hw_ch4.rmd', format='latex')
# TODO: how to NOT print the above commands to the resulting .pdf?
# TODO: how to avoid putting everything from here on in ""s?
# TODO: how to avoid mentioning the file name above?
# TODO: how to render special symbols, such as tilde, miu, sigma?
# Unicode character (U+3BC) not set up for use with LaTeX.
# See the inputenc package documentation for explanation.
# nano hw_ch4.rmd && ./hw_ch4.rmd && evince hw_ch4.pdf
"
4E1. In the model definition below, which line is the likelihood?
A: y_i is the likelihood, based on the expectation and deviation.
4M1. For the model definition below, simulate observed heights from the prior (not the posterior).
A:
```{r}
points <- 10
rnorm(points, mean=rnorm(points, 0, 10), sd=runif(points, 0, 10))
```
4M3. Translate the map model formula below into a mathematical model definition.
A:
```{r}
flist <- alist(
y tilda dnorm( mu , sigma ),
miu tilda dnorm( 0 , 10 ),
sigma tilda dunif( 0 , 10 )
)
```
"
Result:
What I eventually came to use is the following header. At first it seemed neat, but later I realized the trade-offs:
+ it is indeed easy to compile in one step
- it is code duplication
- mixing an executable script and presentation data in one file is a security risk.
Code:
#!/usr/bin/env Rscript
#<!---
library(rmarkdown)
argv <- commandArgs(trailingOnly=FALSE)
fname <- sub("--file=", "", argv[grep("--file=", argv)])
render(fname, output_format="pdf_document")
quit(status=0)
#-->
---
title:
author:
date: "compiled on: `r Sys.time()`"
---
The quit() line is supposed to guarantee that the rest of the file is treated as data. The <!--- and --> comment markers make the executable code a comment when the file is read as a document. They are, in turn, hidden from R by the leading #s.

R simplify heatmap to pdf

I want to plot a simplified heatmap that is not so difficult to edit with the scalable vector graphics program I am using (Inkscape). The original heatmap as produced below contains lots of rectangles, and I wonder if they could be merged together in the different sectors to simplify the output PDF file:
nentries=100000
ci=rainbow(nentries)
set.seed(1)
mean=10
## Generate some data (4 factors)
i = data.frame(
a=round(abs(rnorm(nentries,mean-2))),
b=round(abs(rnorm(nentries,mean-1))),
c=round(abs(rnorm(nentries,mean+1))),
d=round(abs(rnorm(nentries,mean+2)))
)
minvalue = 10
# Discretise values to 1 or 0
m0 = matrix(as.numeric(i>minvalue),nrow=nrow(i))
# Remove rows with all zeros
m = m0[rowSums(m0)>0,]
# Reorder with 1,1,1,1 on top
ms =m[order(as.vector(m %*% matrix(2^((ncol(m)-1):0),ncol=1)), decreasing=TRUE),]
rowci = rainbow(nrow(ms))
colci = rainbow(ncol(ms))
colnames(ms)=LETTERS[1:4]
limits=c(which(!duplicated(ms)),nrow(ms))
l=length(limits)
toname=round((limits[-l]+ limits[-1])/2)
freq=(limits[-1]-limits[-l])/nrow(ms)
rn=rep("", nrow(ms))
for(i in toname) rn[i]=paste(colnames(ms)[which(ms[i,]==1)],collapse="")
rn[toname]=paste(rn[toname], ": ", sprintf( "%.5f", freq ), "%")
heatmap(ms,
Rowv=NA,
labRow=rn,
keep.dendro = FALSE,
col=c("black","red"),
RowSideColors=rowci,
ColSideColors=colci,
)
dev.copy2pdf(file="/tmp/file.pdf")
Why don't you try RSvgDevice? With it you can save your image as an SVG file, which is much more convenient for Inkscape than PDF.
I use the Cairo package for producing svg. It's incredibly easy. Here is a much simpler plot than the one you have in your example:
require(Cairo)
CairoSVG(file = "tmp.svg", width = 6, height = 6)
plot(1:10)
dev.off()
Upon opening in Inkscape, you can ungroup the elements and edit as you like.
Example (point moved, swirl added):
I don't think we (the internet) are being clear enough on this one.
Let me just start off with a successful export example:
png("heatmap.png") # Ruby devs: think of this as something like opening a `File.open("asdfsd") do |f|` block
heatmap(sample_matrix, Rowv=NA, Colv=NA, col=terrain.colors(256), scale="column", margins=c(5,10))
dev.off()
The dev.off() call reminds me of the end of a Ruby block or method: the output of the last plotting call enclosed between png() and dev.off() is what gets written to the PNG file.
For example, if you ran this code:
png("heatmap4.png")
heatmap(sample_matrix, Rowv=NA, Colv=NA, col=terrain.colors(32), scale="column", margins=c(5,15))
heatmap(sample_matrix, Rowv=NA, Colv=NA, col=greenred(32), scale="column", margins=c(5,15))
dev.off()
it would write the second heatmap (the greenred color scheme, I just tested it) to heatmap4.png, just as a Ruby method returns the value of its last line by default.
