I'm looking to improve my org's Puppet work lifecycle by comparing differences for actual node catalogs. We came across this project which compiles catalogs for nodes and creates a diff for them, but it seems to require an online master.
I need to be able to do what this tool does, albeit without a master - I'd just like to compile a deterministic JSON or YAML blob which describes all of the resources that would be managed by Puppet for a given node and given a set of facts.
Is there a way for me to do this without an online master?
If you have Rspec-puppet set up, there is an easy way to do this. Just add a File.write statement inside one of your it blocks:
require 'spec_helper'

describe 'myclass' do
  it {
    File.write(
      'myclass.json',
      PSON.pretty_generate(catalogue)
    )
    #is_expected.to compile.with_all_deps
  }
end
I have more information in a blog post on this here.
If you can't use Rspec-puppet to do it (recommended), have a look at another blog post I wrote, Compiling a puppet catalog – on a laptop.
I was looking for this for a long time.
In the end, Vagrant in combination with Puppet brought me the solution.
The only thing you need to do is install Puppet.
First you have to set the facts; after that you can compile your catalog:
FACTER_server_role='webstack' \
... \
FACTER_hostname='hostname' \
FACTER_fqdn='hostname.fqdn' \
puppet catalog compile hostname.fqdn \
--modulepath "./modules" \
--hiera_config "./hiera.yaml" \
--environmentpath ./environments/ \
--environment production
If you want a clean JSON file, pipe the output through sed to drop the first (non-JSON) line and redirect the result to a file:
| sed -n '1!p' > $file
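Putting it together, a minimal end-to-end sketch (the facts, hostname, and output path are placeholders, not taken from the question):

# Compile the catalog masterlessly and strip the first, non-JSON line of output.
FACTER_server_role='webstack' \
FACTER_fqdn='node01.example.com' \
puppet catalog compile node01.example.com \
  --modulepath "./modules" \
  --hiera_config "./hiera.yaml" \
  --environmentpath ./environments/ \
  --environment production \
  | sed -n '1!p' > node01-catalog.json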
Is there a simple way to change the schedule of a kubernetes cronjob like kubectl change cronjob my-cronjob "10 10 * * *"? Or any other way without needing to do kubectl apply -f deployment.yml? The latter can be extremely cumbersome in a complex CI/CD setting because manually editing the deployment yaml is often not desired, especially not if the file is created from a template in the build process.
Alternatively, is there a way to start a cronjob manually? For instance, a job is scheduled to start in 22 hours, but I want to trigger it manually once now without changing the cron schedule for good (for testing or an initial run)?
You can update only the selected fields of a resource by patching it. From kubectl patch -h:
Update field(s) of a resource using strategic merge patch, a JSON merge patch, or a JSON patch.
JSON and YAML formats are accepted.
Please refer to the models in
https://htmlpreview.github.io/?https://github.com/kubernetes/kubernetes/blob/HEAD/docs/api-reference/v1/definitions.html
to find if a field is mutable.
As provided in a comment, for reference:
kubectl patch cronjob my-cronjob -p '{"spec":{"schedule": "42 11 * * *"}}'
Also, in current kubectl versions, to launch a one-time execution of a declared cronjob, you can manually create a job that adheres to the cronjob spec with:
kubectl create job my-job --from=cronjob/mycron
The more recent versions of k8s (from 1.10 on) support the following command:
$ kubectl create job my-one-time-job --from=cronjobs/my-cronjob
Source is this solved k8s github issue.
From @SmCaterpillar's answer above, kubectl patch cronjob my-cronjob -p '{"spec":{"schedule": "42 11 * * *"}}',
I was getting the error: unable to parse "'{spec:{schedule:": yaml: found unexpected end of stream.
If someone else is facing a similar issue, replace the last part of the command with:
"{\"spec\":{\"schedule\": \"42 11 * * *\"}}"
I have a friend who developed a kubectl plugin that answers exactly that!
It takes an existing cronjob and just creates a job out of it.
See https://github.com/vic3lord/cronjobjob
Look into the README for installation instructions.
And if you want to patch a k8s cronjob schedule with the Python kubernetes library, you can do it like this:
from kubernetes import client, config

# Load credentials from the local kubeconfig.
config.load_kube_config()
v1 = client.BatchV1beta1Api()

# New schedule; "@daily" is equivalent to "0 0 * * *".
body = {"spec": {"schedule": "@daily"}}
ret = v1.patch_namespaced_cron_job(
    namespace="default", name="my-cronjob", body=body
)
print(ret)
I am trying to load data from BigQuery into a Jupyter Notebook, where I will do some manipulation and plotting. The dataset is 25 million rows with 10 columns, which definitely exceeds my machine's memory capacity (16 GB).
I have read this post about using HDFStore, but the problem here is that I still need to read the data into the Jupyter Notebook to do the manipulation.
I am using Google Cloud Platform, so setting up a huge cluster in Dataproc might be an option, though that could be costly.
Has anyone run into a similar issue and found a solution?
Concerning products within Google Cloud Platform you can create a Datalab instance to run your notebooks and specify the desired machine type with the --machine-type flag (docs). You can use a high-memory machine if needed.
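For example, a hedged sketch (the instance name and machine type are placeholders):

# Creates a Datalab VM on a high-memory machine type.
datalab create my-datalab-vm --machine-type n1-highmem-8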
Of course, you can also use Dataproc as you already proposed. For easier setup you can use the predefined initialization action with the following parameter upon cluster creation:
--initialization-actions gs://dataproc-initialization-actions/datalab/datalab.sh
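A minimal sketch of creating such a cluster (the cluster name is a placeholder):

# Creates a Dataproc cluster with Datalab preinstalled via the init action.
gcloud dataproc clusters create my-dataproc-cluster \
  --initialization-actions gs://dataproc-initialization-actions/datalab/datalab.sh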
Edit
As you are using a GCE instance, you can also use a script to auto-shutdown the VM when you are not using it. You can edit ~/.bash_logout so that it checks if it's the last session and, if so, stops the VM:
if [ $(who | wc -l) == 1 ]; then
  gcloud compute instances stop $(hostname) --zone $(curl -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/zone 2>/dev/null | cut -d/ -f4) --quiet
fi
Or, if you prefer a curl approach:
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" https://www.googleapis.com/compute/v1/projects/$(gcloud config get-value project 2>/dev/null)/zones/$(curl -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/zone 2>/dev/null | cut -d/ -f4)/instances/$(hostname)/stop -d ""
Keep in mind that you might need to update Cloud SDK components to get the gcloud command to work. Either use:
gcloud components update
or
sudo apt-get update && sudo apt-get --only-upgrade install kubectl google-cloud-sdk google-cloud-sdk-datastore-emulator google-cloud-sdk-pubsub-emulator google-cloud-sdk-app-engine-go google-cloud-sdk-app-engine-java google-cloud-sdk-app-engine-python google-cloud-sdk-cbt google-cloud-sdk-bigtable-emulator google-cloud-sdk-datalab -y
You can include one of these, along with the ~/.bash_logout edit, in your startup-script.
I am running a Node.js app on Google App Engine, using the following command to deploy my code:
gcloud app deploy --stop-previous-version
My desired behavior is for all instances running previous versions to be terminated, but they always seem to stick around. Is there something I'm missing?
I realize they are not receiving traffic, but I am still paying for them and they cause some background telemetry noise. Is there a better way of running this command?
Example output of gcloud app instances list:
As you can see, I have two different versions running.
We accidentally blew through our free Google App Engine credit in less than 30 days because of an errant flexible instance that wasn't cleared by subsequent deployments. When we pinpointed it as the cause it had scaled up to four simultaneous instances that were basically idling away.
tl;dr: Use the --version flag when deploying to specify a version name. An existing instance with the same version will be replaced the next time you deploy.
That led me down the rabbit hole that is --stop-previous-version. Here's what I've found out so far:
--stop-previous-version doesn't seem to be supported anymore. It's mentioned under Flags on the gcloud app deploy reference page, but if you look at the top of the page where all the flags are listed, it's nowhere to be found.
I tried deploying with that flag set to see what would happen but it seemingly had no effect. A new version was still created, and I still had to go in and manually delete the old instance.
There's an open Github issue on the gcloud-maven-plugin repo that specifically calls this out as an issue with that plugin but the issue has been seemingly ignored.
At this point our best bet is to add --version=staging (or whatever) to gcloud app deploy. The reference docs for that flag seem to indicate that it'll replace an existing instance that shares that "version":
--version=VERSION, -v VERSION
The version of the app that will be created or replaced by this deployment. If you do not specify a version, one will be generated for you.
(emphasis mine)
Additionally, Google's own reference documentation on app.yaml (the link's for the Python docs but it's still relevant) specifically calls out the --version flag as the "preferred" way to specify a version when deploying:
The recommended approach is to remove the version element from your app.yaml file and instead, use a command-line flag to specify your version ID
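A hedged sketch of such a deploy (the version name staging is just a placeholder):

# Pinning the version so each deploy replaces the previous one instead of
# piling up new versions.
gcloud app deploy --version=staging --promote --quiet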
As far as I can tell, for Standard Environment with automatic scaling at least, it is normal for old versions to remain "serving", though they should hopefully have zero instances (even if your scaling configuration specifies a nonzero minimum). At least that's what I've seen. I think (I hope) that those old "serving" instances won't result in any charges, since billing is per instance.
I know most of the above answers are for Flexible Environment, but I thought I'd include this here for people who are wondering.
(And it would be great if someone from Google could confirm.)
I had the same problem as the OP. Using the flex environment (some of this also applies to the standard environment) with Docker (runtime: custom in app.yaml), I've finally solved this! I tried a lot of things and I'm not sure which one fixed it (or whether it was a combination), so I'll list the things I did here, the most likely solutions listed first.
SOLUTION 1) Ensure that cloud storage deletes old versions
What does cloud storage have to do with anything? (I hear you ask)
Well there's a little tooltip (Google Cloud Platform Web UI (GCP) > App Engine > Versions > Size) that when you hover over it says:
(Google App Engine) Flexible environment code is stored and billed from Google Cloud Storage ... yada yada yada
So based on this info and this answer I visited GCP > Cloud Storage > Browser and found my storage bucket AND a load of other storage buckets I didn't know existed. It turns out that some of the buckets store cached cloud functions code, some store cached docker images and some store other cached code/stuff (you can tell which is which by browsing the buckets).
So I added a deletion policy to all the buckets (except the cloud functions bucket) as follows:
Go to GCP > Cloud Storage > Browser and click the link (for the relevant bucket) in the Lifecycle Rules column > Click ADD A RULE > THEN:
For SELECT ACTION choose "Delete Object" and click continue
For SELECT OBJECT choose "Number of newer versions" and enter 1 in the input
Click CREATE
This will return you to the table view and you should now see the rule in the lifecycle rules column.
REPEAT this process for all relevant buckets (the relevant buckets were described earlier).
THEN delete the contents of the relevant buckets. WARNING: Some buckets warn you NOT to delete the bucket itself, only the contents!
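If you prefer the command line to the web UI, a hedged equivalent with gsutil (the bucket name is a placeholder):

# Lifecycle config that keeps only the most recent version of each object.
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {"action": {"type": "Delete"}, "condition": {"numNewerVersions": 1}}
  ]
}
EOF
gsutil lifecycle set lifecycle.json gs://staging.my-project.appspot.com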
Now re-deploy and your latest version should now get deployed and hopefully you will never have this problem again!
SOLUTION 2) Use deploy flags
I added these flags
gcloud app deploy --quiet --promote --stop-previous-version
This probably doesn't help since these flags seem to be the default but worth adding just in case.
Note that for the standard environment only (I heard on the grapevine) you can also use the --no-cache flag which might help but with flex, this flag caused the deployment to fail (when I tried).
SOLUTION 3)
This probably does not help at all, but I added:
COPY app.yaml .
to the Dockerfile
TIP 1)
This is probably more of a useful debugging approach than a fix.
Visit GCP > App Engine > Versions
This shows all versions of your app (1 per deployment) and it also shows which version each instance is running (instances are configured in app.yaml).
Make sure all instances are running the latest version. This should happen by default. Probably worth deleting old versions.
You can determine your version from the gcloud app deploy logs (at the start of the logs) but it seems that the versions are listed by order of deployment anyway (most recent at top).
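A hedged sketch of cleaning up old versions from the command line (the version ID is hypothetical):

# List all deployed versions, then delete ones that are no longer needed.
gcloud app versions list
gcloud app versions delete 20200101t120000 --quiet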
TIP 2)
Visit GCP > App Engine > Instances
SSH into an instance. This is just a matter of clicking a few buttons (see screenshot below). Once you have SSH'd in run:
docker exec -it gaeapp /bin/bash
Which will get you into the docker container running your code. Now you can browse around to make sure it has your latest code.
Well I think my answer is long enough now. If this helps, don't thank me, J-ES-US is the one you should thank ;) I belong to Him ^^
Google may have updated the documentation cited in @IAmKale's answer:
Note that if the version is running on an instance of an auto-scaled service, using --stop-previous-version will not work and the previous version will continue to run because auto-scaled service instances are always running.
Seems like that flag only works with manually scaled services.
This is a supplementary and optional answer in addition to my other main answer.
In addition to my other answer, I am now auto-incrementing the version on every deploy using a script.
My script contents are below.
Basically, the script auto increments version every time you deploy. I am using node.js so the script uses npm version to bump the version but this line could easily be tweaked to whatever language you use.
The script requires a clean git working directory for deployment.
The script assumes that when the version is bumped, this will result in file changes (e.g. changes to package.json version) that need pushing.
The script essentially tries to find your SSH key and if it finds it then it starts an SSH agent and uses your SSH key to git commit and git push the file changes. Else it just does a git commit without a push.
It then does a deploy using the --version flag ... --version="${deployVer}"
Thought this might help someone, especially since the top answer talks a lot about using the --version flag on a deploy.
#!/usr/bin/env bash
projectName="vehicle-damage-inspector-app-engine"

# Find SSH key
sshFile1=~/.ssh/id_ed25519
sshFile2=~/Desktop/.ssh/id_ed25519
sshFile3=~/.ssh/id_rsa
sshFile4=~/Desktop/.ssh/id_rsa
if [ -f "${sshFile1}" ]; then
  sshFile="${sshFile1}"
elif [ -f "${sshFile2}" ]; then
  sshFile="${sshFile2}"
elif [ -f "${sshFile3}" ]; then
  sshFile="${sshFile3}"
elif [ -f "${sshFile4}" ]; then
  sshFile="${sshFile4}"
fi

# If SSH key found then fire up SSH agent
if [ -n "${sshFile}" ]; then
  pub=$(cat "${sshFile}.pub")
  for i in ${pub}; do email="${i}"; done
  name="Auto Deploy ${projectName}"
  git config --global user.email "${email}"
  git config --global user.name "${name}"
  echo "Git SSH key = ${sshFile}"
  echo "Git email = ${email}"
  echo "Git name = ${name}"
  eval "$(ssh-agent -s)"
  ssh-add "${sshFile}" &>/dev/null
  sshKeyAdded=true
fi

# Bump version and git commit (and git push if SSH key added) and deploy
if [ -z "$(git status --porcelain)" ]; then
  echo "Working directory clean"
  echo "Bumping patch version"
  ver=$(npm version patch --no-git-tag-version)
  git add -A
  git commit -m "${projectName} version ${ver}"
  if [ -n "${sshKeyAdded}" ]; then
    echo ">>>>> Bumped patch version to ${ver} with git commit and git push"
    git push
  else
    echo ">>>>> Bumped patch version to ${ver} with git commit only, please git push manually"
  fi
  deployVer="${ver//"."/"-"}"
  gcloud app deploy --quiet --promote --stop-previous-version --version="${deployVer}"
else
  echo "Working directory unclean, please commit changes"
fi
For Node.js users: if you call the script deploy.sh, you should add:
"deploy": "sh deploy.sh"
to your package.json scripts and deploy with npm run deploy.
I wrote a simple module to install a package (BioPerl) on a Ubuntu VM. The whole init.pp file is here:
https://gist.github.com/anonymous/17b4c31bf7309aff14dfdcd378e44f40
The problem is it doesn't work, and it gives me no feedback to let me know why it doesn't work. There are 3 simple steps in the module. I checked and it didn't do any of them. Here are the first two:
Step 1: Download an archive and save it to /usr/local/lib
exec { 'bioperl-download':
  command => "sudo /usr/bin/wget --no-check-certificate -O ${archive_path} ${package_uri}",
  require => Package['wget'],
}
Step 2: Extract the archive
exec { 'bioperl-extract':
  command => "sudo /usr/bin/tar zxvf ${archive_path} --directory ${install_path}; sudo rm ${archive_path}",
  require => Exec['bioperl-download'],
}
Pretty simple. But I have no idea where the problem is because I can't see what it's doing. The provisioner is set to verbose mode, and here are the output lines for my module:
==> default: Notice: /Stage[main]/Bioperl/Exec[bioperl-download]/returns: executed successfully
==> default: Notice: /Stage[main]/Bioperl/Exec[bioperl-extract]/returns: executed successfully
==> default: Notice: /Stage[main]/Bioperl/Exec[bioperl-path]/returns: executed successfully
So all I know is that it executed these three steps successfully. It doesn't tell me anything about whether the steps did their job properly or not. I know that it didn't download the archive to the /usr/local/lib directory, and that it didn't add an environment variable file to /usr/profile.d. Maybe the variables containing the directories are wrong. Maybe the variable containing the archive's download URI is wrong. How can I find these things out?
UPDATE:
It turns out the module does work. But to improve the module (since I want to upload it to forge.puppetlabs.com), I tried implementing the changes suggested by Matt. Here's the new code:
file { 'bioperl-download':
  path   => "${archive_path}",
  source => "http://cpan.metacpan.org/authors/id/C/CJ/CJFIELDS/${archive_name}",
  ensure => "present",
}

exec { 'bioperl-extract':
  command => "sudo /bin/tar zxvf ${archive_name}",
  cwd     => "${bioperl_target_dir}",
  require => File['bioperl-download'],
}
A problem: It gives me an error telling me that the source cannot be http://. I see in the docs that they do indeed allow http:// files as the source for the file resource. Maybe I'm using an older version of puppet?
I want to try out the puppet-archive module, but I'm not sure how I can set it as a required dependency. By that, I mean how I can make sure it's installed first. Do I need to get my module to download the module from GitHub and save it to the modules directory? Or is there a way to let Puppet install it automatically? I added it as a dependency to the metadata.json file, but that doesn't install it. I know I can just get my module to download the package, but I was wondering what the best practice for this is.
The initial problem you describe is acceptance testing. Verifying that the Puppet resources and code you wrote actually resulted in the desired end state is normally accomplished with Serverspec: http://serverspec.org/. For example, you can write a Puppet module to deploy an application, but you only know that Puppet did what you told it to, not that the application actually deployed successfully. Note that Serverspec is also what people generally use to solve this problem for Ansible and Chef.
You can write a Serverspec test similar to the following to help test your module's end state:
describe file('/usr/local/lib/bioperl.tar.gz') do
  it { expect(subject).to be_file }
end

describe file('/usr/profile.d/env_file') do
  it { expect(subject).to be_file }
  its(:content) { is_expected.to match(/env stuff/) }
end
However, your problem also seems to deal with debugging why your acceptance tests failed. For that, you need unit testing. This is normally solved with RSpec-Puppet: http://rspec-puppet.com/. I would show you how to write some tests for your situation, but I don't think you should be writing your Puppet module the way that you did, so it would render the unit tests irrelevant.
Instead, consider using a file resource with the source attribute and a HTTP URI to grab the tarball instead of an exec with wget: https://docs.puppet.com/puppet/latest/type.html#file-attribute-source. Also, you might want to consider using the Puppet archive module to assist you: https://forge.puppet.com/puppet/archive.
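For instance, a hedged sketch using the archive resource from that module (the BioPerl version, URL, and paths are assumptions, not taken from your module):

# Downloads and extracts the tarball in one declarative resource instead of
# chained execs; 'creates' makes the extraction idempotent.
archive { '/tmp/BioPerl-1.7.1.tar.gz':
  ensure       => present,
  source       => 'http://cpan.metacpan.org/authors/id/C/CJ/CJFIELDS/BioPerl-1.7.1.tar.gz',
  extract      => true,
  extract_path => '/usr/local/lib',
  creates      => '/usr/local/lib/BioPerl-1.7.1',
  cleanup      => true,
}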
If you have questions on how to use these tools to provide unit and acceptance testing, or have questions on how to refactor your module, then don't hesitate to write followup questions on StackOverflow and we can help you.
We are working on integrating GitLab (enterprise edition) in our tooling, but one thing that is still on our wishlist is to create a merge request in GitLab via a command line (or batchfile or similar, for that matter). We would like to integrate this in our tooling. Searching here and on the web lead me to believe that this is not possible with native GitLab, but that we need additional tooling for that.
Am I correct? And what kind of tooling would I want to use for this?
As of GitLab 11.10, if you're using git 2.10 or newer, you can automatically create a merge request from the command line like this:
git push -o merge_request.create
More information can be found in the docs.
It's not natively supported, but it's not hard to throw together. The GitLab API has support for opening MRs: https://github.com/gitlabhq/gitlabhq/blob/master/doc/api/merge_requests.md#create-mr
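For example, a hedged sketch with curl against the v4 REST API (the token, host, project ID, and branch names are placeholders):

# Creates an MR from 'feature' into 'master' on project 123.
curl --request POST \
  --header "PRIVATE-TOKEN: <your_access_token>" \
  --data "source_branch=feature" \
  --data "target_branch=master" \
  --data "title=My merge request" \
  "https://gitlab.example.com/api/v4/projects/123/merge_requests"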
You can use the following utility.
Disclosure: I developed it.
https://github.com/vishwanatharondekar/gitlab-cli
You can create a merge request using this.
Some of the features it has are:
Base branch is optional. If the base branch is not provided, the current branch is used as the base branch.
Target branch is optional. If the target branch is not provided, the default branch of the repo in GitLab will be used.
The created pull request page will be opened automatically after successful creation.
If a title is not supplied with the -m option, it will be taken from an in-place editor that opens; the first line is taken as the title.
In the editor, the third line onwards is taken as the description.
A comma-separated list of labels can be provided with its option.
Supports CI.
Repository-specific configs can be given.
A squash option is available.
A remove-source-branch option is available.
If you push your branch before running this command (git push -o merge_request.create), it will not work. Git will respond with Everything up-to-date and the merge request will not be created (GitLab 12.3).
When I removed my branch from the server (do not remove your local branch!) it then worked for me in this form:
git push --set-upstream origin your-branch-name -o merge_request.create
In addition to @AhmadSherif's answer, you can use merge_request.target=<branch_name> to declare the target branch.
sample usage:
git push -o merge_request.create -o merge_request.target=develop origin feature
Simply this:
According to the GitLab docs, you can define an alias for this command for simpler usage:
git config --global alias.mwps "push -o merge_request.create -o merge_request.target=master -o merge_request.merge_when_pipeline_succeeds"
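With the alias in place, pushing and opening an MR that merges when the pipeline succeeds becomes (the branch name is a placeholder):

git mwps origin my-feature-branch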
I made a shell function which opens up the GitLab MR web page with desired parameters.
Based on the directory with the git repo you are currently in, it:
Finds the correct URL to your repo.
Sets the source branch to the branch you're currently on.
As an optional first argument you can provide the target branch. Otherwise, GitLab defaults to your default branch, which is typically master.
gmr() {
  # A quick way to open a GitLab merge request URL for the current git branch
  # you're on. The optional first argument is the target branch.
  repo_path=$(git remote get-url origin --push | sed 's/^.*://g' | sed 's/.git$//g')
  current_branch=$(git rev-parse --abbrev-ref HEAD)
  if [[ -n $1 ]]; then
    target_branch="&merge_request[target_branch]=$1"
  else
    target_branch=""
  fi
  xdg-open "https://gitlab.com/$repo_path/merge_requests/new?merge_request[source_branch]=$current_branch$target_branch"
}
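Example usage (the target branch develop is just an assumption):

gmr            # open an MR page targeting the repo's default branch
gmr develop    # open an MR page targeting 'develop'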
You can set more default values in the URL, like removing the source branch after merge:
&merge_request[force_remove_source_branch]=true
Or assign it to someone:
&merge_request[assignee_ids][]=12345
Or add a reviewer:
&merge_request[reviewer_ids][]=54321
You can easily find the possible query string parameters by searching the source of the GitLab MR webpage for merge_request[.
As of now, GitLab sadly does not support this, however I recently saw it on their issue tracker. It appears one can expect a 'native tool' in the upcoming months.
GitLab tweeted out about numa08/git-gitlab some time ago, so I guess this would be worth a try.
In our build script we just pop up the browser with the correct URL and let the developer write their comments in the form and hit save to create the merge request. You get this URL with the correct parameters by creating a merge request manually and copying the URL of the form.
#!/bin/bash
set -e
set -o pipefail
BRANCH=${2}
....
git push -f origin-gitlab $BRANCH
open "https://gitlab.com/**username**/**project-name**/merge_requests/new?merge_request%5Bsource_branch%5D=$BRANCH&merge_request%5Bsource_project_id%5D=99999&merge_request%5Btarget_branch%5D=master&merge_request%5Btarget_project_id%5D=99999"
You can write a local git alias to open a Gitlab Merge Request creation page in the default browser for the currently checked-out branch.
[alias]
lab = "!start https://gitlab.com/path/to/repo/-/merge_requests/new?merge_request%5Bsource_branch%5D=\"$(git rev-parse --abbrev-ref HEAD)\""
(This is a very simple alias for Windows; I guess there are equivalent replacements for "start" on Linux, and fancier aliases that work with GitHub and Bitbucket too.)
As well as being able to immediately see and modify the details of the MR, the advantage of this over using the merge_request.create push option is that you don't need your local branch to be behind the remote for it to work.
You might additionally want to store the alias in the repo itself.
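On Linux, a hedged equivalent (the repo path is a placeholder) swaps start for xdg-open; for example, set a repo-local alias with:

# $(git rev-parse --abbrev-ref HEAD) expands when the alias runs, not when it is set.
git config alias.lab '!xdg-open "https://gitlab.com/path/to/repo/-/merge_requests/new?merge_request%5Bsource_branch%5D=$(git rev-parse --abbrev-ref HEAD)"'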
I use https://github.com/mdsb100/cli-gitlab
I am creating the MR from inside a GitLab CI Docker container based on Alpine Linux, so I include the install command in before_script (it could also be baked into your image). All commands in the following .gitlab-ci.yml file are also relevant for normal command line usage (as long as you have the cli-gitlab npm package installed).

variables:
  TARGET_BRANCH: 'live'
  GITLAB_URL: 'https://your.gitlab.net'
  GITLAB_TOKEN: $PRIVATE_TOKEN # created in user profile & added in project settings

before_script:
  - apk update && apk add nodejs && npm install cli-gitlab -g

script:
  - gitlab url $GITLAB_URL && gitlab token $GITLAB_TOKEN
  - 'echo "gitlab addMergeRequest $CI_PROJECT_ID $CI_COMMIT_REF_NAME \"$TARGET_BRANCH\" 13 `date +%Y%m%d%H%M%S`"'
  - 'gitlab addMergeRequest $CI_PROJECT_ID $CI_COMMIT_REF_NAME "$TARGET_BRANCH" 13 `date +%Y%m%d%H%M%S` 2> ./mr.json'
  - cat ./mr.json

This will echo true if the merge request already exists, and echo the JSON result of the new MR if it succeeds in creating one (also saving it to an mr.json file).
Since GitLab 15.7 (Dec. 2022), the GitLab CLI glab is officially integrated to GitLab.
Introducing the GitLab CLI
The command line is one of the most important tools in a software engineer’s toolkit and the majority of their process and work revolve around tools available there. They customize their CLI with styles and extend it through applications to ensure maximum efficiency while performing tasks. The CLI is the backbone of scripts and workflows developers depend on to complete their work.
To support more developers where they’re already working, we’ve adopted the open source project glab, which will form the foundation of GitLab’s native CLI experience.
The GitLab CLI brings GitLab together with Git and your code, with no application or tab switching required.
You can read about our adoption of glab, our partnership with 1Password, and how to contribute to the project in our blog post.
A special thank you to Clement Sam for creating glab and trusting us with its future.
That means you can create a MR with glab mr create:
glab mr create -a username -t "fix annoying bug"