How do I get outputs for multiple modules? - terraform

I have a module I call multiple times in a single tf file. One of the things it does is create an S3 bucket. I have this defined in its output:
output "mybucket" {
value = "${aws_s3_bucket.mybucket.id}"
}
In order to view this output, though, because I'm using modules, it is scoped to the specific module, which means doing this:
terraform output -module=module1 mybucket
Which means if I just want a list of ALL the buckets created via the tf file, I have to loop over them programmatically. Sadly, wildcards do not work:
terraform output -module=* mybucket
So now, how do I do this? I could loop over all the modules and call output multiple times but I can't find a terraform command that lists all the modules currently in use.
With state list I get the module names, but in a format I have to parse:
terraform state list aws_s3_bucket.mybucket
module.module1.aws_s3_bucket.mybucket
module.module2.aws_s3_bucket.mybucket
Is there a way to query state and retrieve all the buckets that were created OR a way to view outputs of all modules?
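(For what it's worth, a rough shell sketch of that parsing approach, assuming state entries shaped like the ones above; grep/cut/while are ordinary shell tools here, not terraform features:)
terraform state list | grep 'aws_s3_bucket.mybucket' | cut -d. -f2 | while read mod; do
  terraform output -module="$mod" mybucket
done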
Edit: So it seems I can output a list from the tf file that calls the modules. The tf file collects the output of multiple module calls into a list:
module "mod1" {..do stuff, create s3 bucket, output{mybucket} ...}
module "mod2" {..do stuff, create s3 bucket, output{mybucket} ...}
module "mod3" {..do stuff, create s3 bucket, output{mybucket} ...}
output "my-buckets" {
value = ["${module.mod1.mybucket}","${module.mod2.mybucket}"]
}
The disappointing thing, though, is that I have to manually enter each module by name. The splat operator didn't work here to expand the modules. Am I using it wrong?
output "my-buckets" {
value = ["${module.*.mybucket}"]
}
The error I get is:
* output 'my-buckets': unknown module referenced: *
output 'my-buckets': undefined module referenced *

Also, following up on @ydaetskcoR's answer, the splat notation for your case would look somewhat like:
output "output-s3-buckets" {
value = "${aws_s3_bucket.mybuckets.*.id}"
description = "List of buckets created"
}
Which can be read at the module declaration level as
list_of_buckets = ["${module.s3_module.output-s3-buckets}"]
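For context, a minimal end-to-end sketch of that pattern (hypothetical module named s3_module whose buckets are created with count; Terraform 0.11-era syntax to match the rest of this thread):
# Inside the module: several buckets created with count
resource "aws_s3_bucket" "mybuckets" {
  count  = 3
  bucket = "example-bucket-${count.index}"
}

output "output-s3-buckets" {
  value       = "${aws_s3_bucket.mybuckets.*.id}"
  description = "List of buckets created"
}

# In the root configuration that declares the module:
output "list_of_buckets" {
  value = ["${module.s3_module.output-s3-buckets}"]
}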

Related

target_transform in torchvision.datasets.ImageFolder seems not to work

I am using PyTorch 1.13 with Python 3.10.
I have a problem where I import pictures from a folder structure using
data = ImageFolder(root='./faces/', loader=img_loader, transform=transform,
                   is_valid_file=is_valid_file)
In this command, labels are assigned automatically according to the subdirectory an image belongs to.
I wanted to assign different labels and use target_transform for this purpose (e.g. I wanted to use a word from the file name to assign an appropriate label).
I have used
def target_transform(id):
    print(2)
    return id * 2

data = ImageFolder(root='./faces/', loader=img_loader, transform=transform,
                   target_transform=target_transform, is_valid_file=is_valid_file)
Next,
data = ImageFolder(root='./faces/', loader=img_loader, transform=transform, target_transform=lambda id:2*id, is_valid_file=is_valid_file)
or
data = ImageFolder(root='./faces/', loader=img_loader, transform=transform,
                   target_transform=torchvision.transforms.Lambda(lambda id: 2 * id),
                   is_valid_file=is_valid_file)
But none of these affect the labels. In addition, in the first example I included the print statement to see whether the function is called, but it is not. I have searched for uses of this function, but the examples I have found do not work and the documentation is scarce in this respect. Any idea what is wrong with the code?
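One thing worth checking (a minimal sketch, and my own reading of the cause): torchvision applies target_transform lazily inside __getitem__, not at dataset construction, so nothing is printed and no label changes until an item is actually accessed:
# target_transform runs inside __getitem__, so the print only fires on access
img, label = data[0]  # should print 2 and return the doubled label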

AWS S3 object filter to NOT match Prefix in Python script

When iterating over S3 objects using Python/boto3, I see that there's a filter method. But can you apply a NOT condition?
I want to just get the top level objects, not objects in folders (they have a prefix). I am currently doing this and it works:
import re
import boto3

s3 = boto3.resource('s3')
bucket = cfg['s3']['bucket_name']
for obj in s3.Bucket(bucket).objects.all():
    if not re.match('folder_name.*', obj.key):
        ...  # process the top-level object
I see support for a filter like this:
for obj in s3.Bucket(bucket).objects.filter(Prefix=folder_name):
I'm asking is there a way to say Prefix != folder_name?
If you just want a list of the objects without a shared prefix, specify a delimiter in the filter, and boto3 will filter away the keys under shared prefixes:
s3 = boto3.resource('s3')
for obj in s3.Bucket(bucket).objects.filter(Delimiter='/'):
    print(obj.key)
You can do this with s3pathlib. It provides an object-oriented interface for S3Path, so you can easily create a simple filter function that takes an S3Path as its input argument and returns True / False to indicate whether you want to yield it. Here is an example that solves your problem:
from s3pathlib import S3Path

# use a trailing / to indicate that it is a dir
p_dir = S3Path("bucket", "root_dir/")

# define a filter
n_parts_of_root_dir = len(p_dir.parts)

def top_level_object(s3path):
    return len(s3path.parts) == (n_parts_of_root_dir + 1)

for p in p_dir.iter_objects(limit=100).filter(top_level_object):
    ...  # do whatever you want
You can also do more advanced filtering or leverage the built-in filters out of the box. For example, you can filter by attributes like S3Path.dirname, S3Path.basename, S3Path.fname, and S3Path.ext. See this document for more information.
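For instance, a small sketch combining the same callable-filter API with an attribute check (hypothetical top_level_csv helper; names reuse the example above):
def top_level_csv(s3path):
    # same shape as top_level_object above, plus an extension check
    return len(s3path.parts) == (n_parts_of_root_dir + 1) and s3path.ext == ".csv"

for p in p_dir.iter_objects(limit=100).filter(top_level_csv):
    print(p.key)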

How may I dynamically create global variables within a function based on input in Python

I'm trying to create a function that returns a dynamically-named list of columns. Usually I can manually name the list, but I now have 100+ csv files to work with.
My goal:
Function creates a list, and names it based on dataframe name
Created list is callable outside of the function
I've done my research, and this answer from an earlier post came very close to helping me.
Here is what I've adapted
def test1(dataframe):
    # Using globals() to get the dataframe name
    df_name = [x for x in globals() if globals()[x] is dataframe][0]
    # Creating a local dictionary to use with the exec function
    local_dict = {}
    # Trying to generate a name for the list, based on the input dataframe name
    name = 'col_list_' + df_name
    exec(name + "=[]", globals(), local_dict)
    # So I can call this list outside the function
    name = local_dict[name]
    for feature in dataframe.columns:
        # Append feature/column if >90% of values are missing
        if dataframe[feature].isnull().mean() >= 0.9:
            name.append(feature)
    return name
To ensure the list name changes based on the DataFrame supplied to the function, I named the list using:
name = 'col_list_' + df_name
The problem comes when I try to make this list accessible outside the function:
name = local_dict[name]
I cannot find a way to assign a dynamic list name to the local dictionary, so I am forced to always call name outside the function to return the list. I want the list to be named based on the dataframe input (e.g. col_list_df1, col_list_df2, col_list_df99).
This answer was very helpful, but it seems specific to variables.
global 'col_list_' + df_name returns a syntax error.
Any help would be greatly appreciated!
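For what it's worth, a minimal sketch of the usual workaround: assign through globals() directly instead of exec (the df_name lookup is kept from the question; whether dynamic globals are a good idea at all is a separate discussion):
def test1(dataframe):
    df_name = [x for x in globals() if globals()[x] is dataframe][0]
    col_list = [feature for feature in dataframe.columns
                if dataframe[feature].isnull().mean() >= 0.9]
    # creates e.g. a global named col_list_df1, callable outside the function
    globals()['col_list_' + df_name] = col_list
    return col_list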

loop to read multiple files

I am using ObsPy's _read_segy function to read a SEGY file using the following line of code:
line_1=_read_segy('st1.segy')
However, I have a large number of files in a folder, as follows:
st1.segy
st2.segy
st3.segy
.
.
st700.segy
I want to use a for loop to read the data, but I am new, so can anyone help me in this regard?
Currently I am using repeated lines to read the data, as follows:
line_1 = _read_segy('st1.segy')
line_2 = _read_segy('st2.segy')
The next step is to display the SEGY data using matplotlib, and again I am using the following lines of code on individual lines, which makes it way too much repeated work. Can someone help me with creating a loop to display the data and save the figures?
data = np.stack([t.data for t in line_1.traces])
vm = np.percentile(data, 99)
plt.figure(figsize=(60, 30))
plt.imshow(data.T, cmap='seismic', vmin=-vm, vmax=vm, aspect='auto')
plt.title('Line_1')
plt.savefig('Line_1.png')
plt.show()
Your kind suggestions will help me a lot as I am a beginner in python programming.
Thank you
If you want to reduce code duplication, you use something called functions. And if you want to repeatedly do something, you can use loops. So you can call a function in a loop if you want to do this for all files.
Now, for reading the files in a folder, you can use the glob package of Python. Something like below:
import glob, os

def save_fig(in_file_name, out_file_name):
    line_1 = _read_segy(in_file_name)
    data = np.stack([t.data for t in line_1.traces])
    vm = np.percentile(data, 99)
    plt.figure(figsize=(60, 30))
    plt.imshow(data.T, cmap='seismic', vmin=-vm, vmax=vm, aspect='auto')
    plt.title(out_file_name)
    plt.savefig(out_file_name)

segy_files = list(glob.glob(segy_files_path + "/*.segy"))
for index, file in enumerate(segy_files):
    save_fig(file, "Line_{}.png".format(index + 1))
I have not added the other imports here, which you know to add! segy_files_path is the folder where your files reside.
You just need to dynamically open the files in a loop. Fortunately they all follow the same naming pattern.
N = 700
for n in range(1, N + 1):  # files are named st1.segy .. st700.segy
    line_n = _read_segy(f"st{n}.segy")  # Dynamic name.
    data = np.stack([t.data for t in line_n.traces])
    vm = np.percentile(data, 99)
    plt.figure(figsize=(60, 30))
    plt.imshow(data.T, cmap="seismic", vmin=-vm, vmax=vm, aspect="auto")
    plt.title(f"Line_{n}")
    plt.savefig(f"Line_{n}.png")  # save before show, or the saved figure will be blank
    plt.show()
    plt.close()  # Needed if you don't want to keep 700 figures open.
I'll focus on addressing the file looping, as you said you're new and I'm assuming simple loops are something you'd like to learn about (the first example is sufficient for this).
If you'd like an answer to your second question, it might be worth providing some example data, the output result (graph) of your current attempt, and a description of your desired output. If you provide that reproducible example and clear description of the problem you're having it'd be easier to answer.
Create a list (or other iterable) to hold the file names to read, and another container (maybe a dict) to hold the result of your read_segy.
files = ['st1.segy', 'st2.segy']
lines = {}  # creates an empty dictionary; dictionaries consist of key: value pairs

for f in files:  # f will first be 'st1.segy', then 'st2.segy'
    lines[f] = read_segy(f)
As stated in the comment by @Guimoute, if you want to dynamically generate the file names, you can create the files list by appending integers to the base file name:
lines = {}  # creates an empty dictionary; dictionaries have key: value pairs
missing_files = []

for i in range(1, 701):
    f = f"st{i}.segy"  # gives "st1.segy" for i = 1
    try:  # in case one of the files is missing or can't be read
        lines[f] = read_segy(f)
    except:
        missing_files.append(f)  # store names of missing or unreadable files
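A quick usage sketch of the resulting dict (hypothetical key, reusing the np.stack idea from the question):
data = np.stack([t.data for t in lines["st1.segy"].traces])  # look up one line by file name
print(data.shape)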

vpc_zone_identifier should be a list

I'm not getting my head around this. When doing a terraform plan it complains the value should be a list. Fair enough. Let's break this down in steps.
The error
1 error(s) occurred:
* module.instance-layer.aws_autoscaling_group.mariadb-asg: vpc_zone_identifier: should be a list
The setup
The VPC and subnets are created with terraform in another module.
The outputs of that module give the following:
"subnets_private": {
"sensitive": false,
"type": "string",
"value": "subnet-1234aec7,subnet-1234c8a7"
},
In my main.tf I use the output of said module to feed it into a variable for my module that takes care of the auto scaling groups:
subnets_private = "${module.static-layer.subnets_private}"
This is used in the module to require the variable:
variable "subnets_private" {}
And this is the part where I configure the vpc_zone_identifier:
Attempt: split
resource "aws_autoscaling_group" "mariadb-asg" {
vpc_zone_identifier = "${split(",",var.subnets_private)}"
Attempt: list
resource "aws_autoscaling_group" "mariadb-asg" {
vpc_zone_identifier = "${list(split(",",var.subnets_private))}"
Question
The above attempt with list(split( should in theory work. Since terraform complains but doesn't print the actual value, it's quite hard to debug. Any suggestions are appreciated.
Filling in the value manually works.
When reading the documentation very carefully, it appears the split is not spitting out clean elements that can afterwards be put into a list.
They suggest wrapping brackets around the string (["xxxxxxx"]) so terraform picks it up as a list.
If my logic is correct, that means
subnet-1234aec7,subnet-1234c8a7 is output as subnet-1234aec7","subnet-1234c8a7 (note the quotes), assuming the quotes around the delimiter of the split command have nothing to do with this.
Here is the working solution
vpc_zone_identifier = ["${split(",",var.subnets_private)}"]
Alternatively, the following also helps:
vpc_zone_identifier = ["${data.aws_subnet_ids.all.ids}"]
