Gitlab CI: Set dynamic variables

For a GitLab CI pipeline I'm defining some variables like this:
variables:
  PROD: project_package
  STAGE: project_package_stage
  PACKAGE_PATH: /opt/project/build/package
  BUILD_PATH: /opt/project/build/package/bundle
  CONTAINER_IMAGE: registry.example.com/project/package:e2e
I would like to set those variables a bit more dynamically, as there are really only two parts: project and package. Everything else depends on those values, which means I only have to change two values to derive all the other variables.
So I would expect something like
variables:
  PROJECT: project
  PACKAGE: package
  PROD: $PROJECT_$PACKAGE
  STAGE: $PROD_stage
  PACKAGE_PATH: /opt/$PROJECT/build/$PACKAGE
  BUILD_PATH: /opt/$PROJECT/build/$PACKAGE/bundle
  CONTAINER_IMAGE: registry.example.com/$PROJECT/$PACKAGE:e2e
But it looks like this way of doing it is wrong...

I don't know where your expectation comes from, but it is trivial to check that in YAML there is no special meaning for $, _, or /, nor for : when not followed by a space. There might be in GitLab, but I strongly doubt there is in the way you expect.
To formalize your expectation: you assume that any key (from the same mapping) preceded by a $ and terminated by the end of the scalar, by _, or by / is going to be "expanded" to that key's value. The _ has to be such a terminator, otherwise $PROJECT_$PACKAGE would not expand correctly.
Now consider adding a key-value pair:
BREAKING_TEST: $PACKAGE_PATH
is this supposed to expand to:
BREAKING_TEST: /opt/project/build/package/bundle
or follow the rule you implied that _ is a terminator and just expand to:
BREAKING_TEST: project_PATH
To prevent this kind of ambiguity, programs like bash use quoting around variable names to be expanded ("$PROJECT"_PATH vs. $PROJECT_PATH), but the saner, more modern solution is to use clamping begin and end characters (e.g. { and }), with some special rule for using the clamping characters as normal text.
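To see the ambiguity in action (an illustration, not part of the original answer), Python's own string.Template uses exactly this kind of clamping with ${...}:
from string import Template

d = {'PROJECT': 'project', 'PACKAGE': 'package'}
# Unclamped: the parser greedily reads 'PROJECT_' as the name, which is
# not in the mapping, so the placeholder is left as-is:
print(Template('$PROJECT_$PACKAGE').safe_substitute(d))  # -> $PROJECT_package
# Clamped: the braces make the name boundary explicit:
print(Template('${PROJECT}_${PACKAGE}').substitute(d))   # -> project_package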
So this is not going to work the way you indicated; you are indeed doing something wrong.
It is not too hard to pre-process a YAML file, and it can be done with e.g. Python (but watch out: { has special meaning in YAML), possibly with the help of jinja2: load the variables, and then expand the original text using those variables until no more replacements can be made.
But it all starts with choosing the delimiters intelligently. Also keep in mind that although your "variables" seem to be ordered in the YAML text, there is no such guarantee once they are constructed as a dict/hash/mapping in your program.
You could e.g. use << and >>:
variables:
  PROJECT: project
  PACKAGE: package
  PROD: <<PROJECT>>_<<PACKAGE>>
  STAGE: <<PROD>>_stage
  PACKAGE_PATH: /opt/<<PROJECT>>/build/<<PACKAGE>>
  BUILD_PATH: /opt/<<PROJECT>>/build/<<PACKAGE>>/bundle
  CONTAINER_IMAGE: registry.example.com/<<PROJECT>>/<<PACKAGE>>:e2e
which, with the following program (which doesn't deal with escaping << to keep its normal meaning), generates your original, expanded YAML exactly.
import sys
from ruamel import yaml

def expand(s, d):
    max_recursion = 100
    while '<<' in s:
        res = ''
        max_recursion -= 1
        if max_recursion < 0:
            raise NotImplementedError('max recursion exceeded')
        for idx, chunk in enumerate(s.split('<<')):
            if idx == 0:
                res += chunk  # first chunk is before '<<', just append
                continue
            try:
                var, rest = chunk.split('>>', 1)
            except ValueError:
                raise NotImplementedError('delimiters have to balance "{}"'.format(chunk))
            if var not in d:
                res += '<<' + chunk
            else:
                res += d[var] + rest
        s = res
    return s

with open('template.yaml') as fp:
    yaml_str = fp.read()
variables = yaml.safe_load(yaml_str)['variables']
data = yaml.round_trip_load(expand(yaml_str, variables))
yaml.round_trip_dump(data, sys.stdout, indent=2)
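As a quick sanity check, expand() can also be fed an inline variables mapping; note that the while loop needs two passes when a value such as PROD itself still contains delimiters:
d = {'PROJECT': 'project', 'PACKAGE': 'package', 'PROD': '<<PROJECT>>_<<PACKAGE>>'}
print(expand('<<PROJECT>>_<<PACKAGE>>', d))  # -> project_package
print(expand('<<PROD>>_stage', d))           # -> project_package_stage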

Lark: parsing special characters

I'm starting with Lark and got stuck on an issue with parsing special characters.
I have expressions given by a grammar. For example, these are valid expressions: Car{_}, Apple3{3+}, Dog{a_7}, r2d2{A3*}, A{+}... More formally, they have form: name{feature} where
name: CNAME
feature: (DIGIT|LETTER|"+"|"-"|"*"|"_")+
The definition of constants can be found here.
The problem is that the special characters are not present in produced tree (see example below). I have seen this answer, but it did not help me. I tried to place ! before special characters, escaping them. I also enabled keep_all_tokens, but this is not desired because then characters { and } are also present in the tree. Any ideas how to solve this problem? Thank you.
from lark import Lark

grammar = r"""
start: object
object : name "{" feature "}" | name
feature: (DIGIT|LETTER|"+"|"-"|"*"|"_")+
name: CNAME
%import common.LETTER
%import common.DIGIT
%import common.CNAME
%import common.WS
%ignore WS
"""

parser = Lark(grammar, parser='lalr',
              lexer='standard',
              propagate_positions=False,
              maybe_placeholders=False
              )

def test():
    test_str = '''
    Apple_3{3+}
    '''
    j = parser.parse(test_str)
    print(j.pretty())

if __name__ == '__main__':
    test()
The output looks like this:
start
  object
    name    Apple_3
    feature 3
instead of
start
  object
    name    Apple_3
    feature
      3
      +
You said you tried placing ! before special characters. As I understand the question you linked, the ! has to be placed before the rule:
!feature: (DIGIT|LETTER|"+"|"-"|"*"|"_")+
This produces your expected result for me:
start
  object
    name    Apple_3
    feature
      3
      +
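For reference, this is the script from the question with only that one change applied; it now prints the expected tree:
from lark import Lark

grammar = r"""
start: object
object : name "{" feature "}" | name
!feature: (DIGIT|LETTER|"+"|"-"|"*"|"_")+
name: CNAME
%import common.LETTER
%import common.DIGIT
%import common.CNAME
%import common.WS
%ignore WS
"""

parser = Lark(grammar, parser='lalr',
              lexer='standard',
              propagate_positions=False,
              maybe_placeholders=False
              )
print(parser.parse('Apple_3{3+}').pretty())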

python use for loop to modify list of variables

I have a script using argparse to gather a list of user defined directories. On the command line they may or may not specify a trailing "/" symbol. I'd like to do something up front so that all variables have the trailing "/" so I can reliably do:
# What I want:
with open(args.a + filename, "w") as fileout:
    # do stuff
    print('whatever', file=fileout)
rather than having to include an extra "/" in the name like this:
# What I have:
with open(args.a + "/" + filename, "w") as fileout:
    # do stuff
    print('whatever', file=fileout)
I also know that dir/ect/ory and dir//ect//ory are nearly equivalent save some fringe cases which are not applicable, but putting + "/" + all over the place seems wrong/wasteful.
In attempting to make a small function to run on all relevant variables, I'm only seeing the desired outcome when I explicitly call the function on a variable, not on a list containing the elements.
def trailingSlash(x):
    if x.endswith("/"):
        return x
    else:
        return x + "/"

a = 'ok/'
b = 'notok'
c = 'alsonotok'

for _ in [a, b, c]:
    _ = trailingSlash(_)
print(a, b, c)  # gives ok/ notok alsonotok

c = trailingSlash(c)
print(c)  # gives alsonotok/
I understand why changing a list as you are iterating over it is generally bad, and I understand that in the for loop the iterator is not actually pointing to a, b, or c. I also know that if I wanted the values in a new list I could do something like [trailingSlash(x) for x in [a, b, c]], but I need to maintain the a, b, c handles. I know that I can also solve this by calling x = trailingSlash(x) on every individual variable, but it seems like there should be a better way. Any solutions I'm missing?
You can use os.path.join() to sidestep the whole issue. It behaves correctly no matter whether there are slashes at the end or not, and is platform-independent as a bonus (that is, it uses \\ instead of / when running on Windows, for example):
import os
...
os.path.join("dir/", "ect", "ory")
# "dir/ect/ory" on Unix, "dir\\ect\\ory" on Windows
In your case you'd want to do
with open(os.path.join(args.a, filename), "w") as fileout:
    ...
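If you still want the separate names a, b and c normalized up front, a minimal sketch (using the question's own trailingSlash(), not part of the original answer) is to reassign them in one statement instead of looping:
a, b, c = (trailingSlash(x) for x in (a, b, c))
print(a, b, c)  # -> ok/ notok/ alsonotok/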

How to serialize escaped strings in a list

I'm trying to write a .yml policy document for AWS. The problem is that my list of strings is being surrounded in double quotes "" when I try to escape it myself, i.e.
- "'acm:AddTagsToCertificate'".
When I do nothing, it shows as
- acm:AddTagsToCertificate.
Problem is I need the final result in the .yml to look like
- 'acm:AddTagsToCertificate'
In terms of my own troubleshooting, I've tried using double and single quotation marks. I've also tried subclassing list to override how lists are serialized, until other SO answers said that was frowned upon.
Here's the reduced code which shows my issue
import yaml

data = {'apigateway:CreateDeployment': 6}
actions = []
for key in data:
    key = "\'" + key + "\'"
    print(key)
    actions.append(key)

with open('test.yml', 'w') as output:
    yaml.dump(actions, output, default_flow_style=False)
Use default_style="'" in the dump:
import yaml

data = {'apigateway:CreateDeployment': 6}
actions = list(data.keys())

with open('test.yml', 'w') as output:
    yaml.dump(actions, output, default_flow_style=False, default_style="'")
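With the sample data above, test.yml then contains the single-quoted form the policy document needs:
- 'apigateway:CreateDeployment'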

Caching parsed document

I have a set of YAML files. I would like to cache these files so that as much work as possible is re-used.
Each of these files contains two documents. The first document contains “static” information that will always be interpreted in the same way. The second document contains “dynamic” information that must be reinterpreted every time the file is used. Specifically, it uses a tag-based macro system, and the document must be constructed anew each time the file is used. However, the file itself will not change, so the results of parsing the entire file could be cached (at a considerable resource savings).
In ruamel.yaml, is there a simple way to parse an entire file into multiple parsed documents, then run construction on each document individually? This would allow me to cache the result of constructing the first “static” document and cache the parse of the second “dynamic” document for later construction.
Example file:
---
default_argument: name
...
%YAML 1.2
%TAG ! tag:yaml-macros:yamlmacros.lib.extend,yamlmacros.lib.arguments:
---
!merge
name: !argument name
The first document contains metadata that is used (along with other data from elsewhere) in the construction of the second document.
If you don't want to process all YAML documents in a stream completely, you'll have to split up the stream by hand, which is not entirely easy to do in a generic way.
What you need to know is what a YAML stream can consist of:
zero or more documents. Subsequent documents require some sort of separation marker line. If a document is not terminated by a document end marker line, then the following document must begin with a directives end marker line.
A document end marker line is a line that starts with ... followed by space/newline and a directives end marker line is --- followed by space/newline.
The actual production rules are slightly more complicated, and "starts with" glosses over the fact that you may need to skip mid-stream byte-order marks.
If you don't have any directives, byte-order marks or document end markers (and most multi-document YAML streams that I have seen have none of those), then you can just read the multi-document YAML in as a single string (data = Path('example.yaml').read_text()), split it using l = data.split('\n---'), and process only the appropriate element of the resulting list with YAML().load(l[N]).
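A minimal sketch of that shortcut, under exactly those assumptions (the filename multi.yaml is only for illustration):
from pathlib import Path
import ruamel.yaml

data = Path('multi.yaml').read_text()       # a stream with no directives or '...' markers
parts = data.split('\n---')                 # split on directives-end marker lines
second = ruamel.yaml.YAML().load(parts[1])  # construct only the second document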
I am not sure the following properly handles all cases, but it does handle your multi-doc stream:
import sys
from pathlib import Path
import ruamel.yaml

docs = []
current = ""
state = "EOD"
for line in Path("example.yaml").open():
    if state in ["EOD", "DIR"]:
        if line.startswith("%"):
            state = "DIR"
        else:
            state = "BODY"
        current += line
        continue
    if line.startswith('...') and line[3].isspace():
        state = "EOD"
        docs.append(current)
        current = ""
        continue
    if state == "BODY" and current and line.startswith('---') and line[3].isspace():
        docs.append(current)
        current = ""
        continue
    current += line
if current:
    docs.append(current)

yaml = ruamel.yaml.YAML()
data = yaml.load(docs[1])
print(data['name'])
which gives:
name
It looks like you can indeed directly operate on the parser internals of ruamel.yaml; it just isn't documented. The following function will parse a YAML string into document nodes:
from ruamel.yaml import SafeLoader

def parse_documents(text):
    loader = SafeLoader(text)
    composer = loader.composer
    while composer.check_node():
        yield composer.get_node()
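A quick check on the example file from the question (a sketch; the results are raw composer nodes, not yet constructed Python objects):
from pathlib import Path

nodes = list(parse_documents(Path('example.yaml').read_text()))
print(len(nodes))    # 2: one root node per document
print(nodes[0].tag)  # tag of the first document's root mapping node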
From there, the documents can be individually constructed. In order to solve my problem, something like the following should work:
def process_yaml(text):
    my_constructor = get_my_custom_constructor()
    parsed_documents = list(parse_documents(text))
    metadata = my_constructor.construct_document(parsed_documents[0])
    return (metadata, parsed_documents[1])

cache = {}

def do_the_thing(file_path):
    if file_path not in cache:
        cache[file_path] = process_yaml(Path(file_path).read_text())
    metadata, document = cache[file_path]
    my_constructor = get_my_custom_constructor(metadata)
    return my_constructor.construct_document(document)
This way, all of the file IO and parsing is cached, and only the last construction step need be performed each time.

python yaml update preserving order and comments

I'm inserting a key into YAML using Python, but I would like to preserve the order of keys and the comments in the YAML:
#This Key is used for identifying Parent tests
    ParentTest:
       test:
         JOb1: myjob
         name: testjob
         arrive: yes
Now I'm using the code below to insert the new key:
params['ParentTest']['test']['new_key'] = 'new value'
yaml_output = yaml.dump(params, default_flow_style=False)
How can I preserve the exact order and the comments?
In the output below, arrive moved up, but I want to preserve the order and comments as well. The output is:
ParentTest:
  test:
    arrive: yes
    JOb1: myjob
    name: testjob
pyyaml cannot keep comments, but ruamel does.
Try this:
import ruamel.yaml

# yaml_str holds the YAML text from the question
doc = ruamel.yaml.load(yaml_str, Loader=ruamel.yaml.RoundTripLoader)
doc['ParentTest']['test']['new_key'] = 'new value'
print(ruamel.yaml.dump(doc, Dumper=ruamel.yaml.RoundTripDumper))
The order of keys will also be preserved.
Edit: Look at Anthon's answer from 2020: https://stackoverflow.com/a/59659659/93745
Although @tinita's answer works, it uses the old ruamel.yaml API, which gives you less control over the loading/dumping. Even so, you cannot preserve the inconsistent indentation of your mappings: the key ParentTest is indented four positions, the key test a further three, and the key JOb1 only two positions. You can "only" set the same indentation for all mappings (i.e. their keys) and, separate from that, the indentation of all sequences (i.e. their elements); if there is enough space, you can offset the sequence indicator (-) within the sequence element indent.
In the default, round-trip mode, ruamel.yaml preserves key order, and additionally you can preserve quotes, folded and literal scalars.
With a slightly extended YAML input as example:
import sys
import ruamel.yaml

yaml_str = """\
#This Key is used for identifying Parent tests
ParentTest:
    test:
        JOb1:
          - my
          - job
#           ^ four indent positions for the sequence elements
#         ^ two position offset for the sequence indicator '-'
        name: 'testjob' # quotes added to show working of .preserve_quotes = True
        arrive: yes
"""

yaml = ruamel.yaml.YAML()
yaml.indent(mapping=4, sequence=4, offset=2)
yaml.preserve_quotes = True
params = yaml.load(yaml_str)
params['ParentTest']['test']['new_key'] = 'new value'
params['ParentTest']['test'].yaml_add_eol_comment('some comment', key='new_key', column=40)  # column is optional
yaml.dump(params, sys.stdout)
which gives:
#This Key is used for identifying Parent tests
ParentTest:
    test:
        JOb1:
          - my
          - job
#           ^ four indent positions for the sequence elements
#         ^ two position offset for the sequence indicator '-'
        name: 'testjob' # quotes added to show working of .preserve_quotes = True
        arrive: yes
        new_key: new value              # some comment
In addition, if you want to keep the quotes, you can try this:
import ruamel.yaml
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
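A quick round-trip demonstration of the effect (input string inlined for brevity):
import sys
import ruamel.yaml

yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
data = yaml.load("name: 'testjob'\n")
yaml.dump(data, sys.stdout)  # the single quotes survive: name: 'testjob'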
