Sed replace inside a json file with regex - linux

I have a huge JSON file; I will copy a part of it:
"panels": [
"targets": [
{
"alias": "First Prod",
"dimensions": {
"Function": "robot-support-dev-create-human"
}
},
{
"alias": "Second Prod",
"dimensions": {
"Function": "robot-support-prototype-dev-beta-activate-human"
}
},
{
"alias": "third Prod",
"dimensions": {
"Function": "robot-support-dev-jira-kill-human"
}
},
{
"alias": "Somehting",
"dimensions": {
"Robotalias": "default",
"RobotName": "Robot-prod-prototype",
"Operation": "Fight"
}
}
]
]
I want to perform a regex replacement on Function each time it contains robot-support-dev, changing it to robot-support-prod-...
sed -i ' s/"robot-support-([a-zA-Z0-9_]+)-dev-([^"]+)"/robot-support-\\1-prod-\\2/g;'
This is what I did, but maybe there is something wrong with my regex.

You can't reliably match JSON with a regex. The "right way(TM)" would be to first extract Function using a JSON-aware tool like jq, then modify it with sed, then insert it back using a JSON-aware tool.
Sed by default uses basic regex, so you need to change ( ) + into \( \) \{1,\} (the \+ form is a GNU extension), or with GNU sed just use sed -E to get extended regex. Also, \\1 would be interpreted as two characters, \ followed by 1; you want \1 with a single backslash. But anyway, your regex is simply invalid and does not match what you want (I guess). Also, the " characters are missing on the right side in the replacement string, so your command would just remove them. Just substitute what you need; try:
sed 's/"robot-support-dev-/"robot-support-prod-/g;'

As #KamilCuk pointed out, there are some issues with the regex you are currently using. I think that if the occurrences you give as an example are the only possibilities, it would work if you match these groups:
sed -i 's/"robot-support\(-\|-.*-\)dev-\(.*\)"/"robot-support\1prod-\2"/g'

As pointed out elsewhere on this page, using sed for this type of problem is, at best, fraught with danger. Since the question has been tagged with jq, it should be pointed out that jq is an excellent match for this type of problem. In particular, a trivial solution can be obtained using the sub filter with two or three arguments; here the three-argument form sub/3 is used: sub("FROM"; "TO"; "g").
You might also wish to use walk so that you don't have to be concerned about where exactly the "Function" keys occur, e.g.
walk( if type == "object" and (.Function|type) == "string"
then .Function |= sub( "robot-support-(?<a>([^-]+-)?)dev-"; "robot-support-\(.a)prod-"; "g")
else . end)
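To apply this to the whole file, wrap it in a jq call (a sketch; jq 1.6 or later is assumed, where walk is a builtin, and the file name dashboard.json is hypothetical since jq does not edit in place):
jq 'walk( if type == "object" and (.Function|type) == "string"
          then .Function |= sub( "robot-support-(?<a>([^-]+-)?)dev-"; "robot-support-\(.a)prod-"; "g")
          else . end)' dashboard.json > dashboard.new.json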

Related

How do I grep and replace string in bash

I have a file which contains my JSON:
{
"type": "xyz",
"my_version": "1.0.1.66~22hgde",
}
I want to edit the value for the key my_version and every time replace the value after the third dot with another number, which is stored in a variable, so it becomes something like 1.0.1.32~22hgde. I am using sed to replace it:
sed -i "s/\"my_version\": \"1.0.1.66~22hgde\"/\"my_version\": \"1.0.1.$VAR~22hgde\"/g" test.json
This works, but the issue is that the my_version string doesn't remain constant; it can change and become something like 1.0.2.66 or 2.0.1.66. So how do I handle such cases in bash?
how do I handle such cases?
You write a regular expression that matches any possible combination of characters that can appear there. You can learn regex the fun way with regex crosswords online.
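For example, a pattern that matches any three-part prefix rather than the hard-coded 1.0.1. might look like this (a sketch, assuming GNU sed and that $VAR holds the new number, as in your command):
sed -i -E "s/(\"my_version\": \"[0-9]+\.[0-9]+\.[0-9]+\.)[0-9]+/\1$VAR/" test.json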
Do not edit JSON files with sed - sed is for lines. Consider using JSON-aware tools - like jq, which will handle any possible case.
A jq answer: file.json contains
{
"type": "xyz",
"my_version": "1.0.1.66~22hgde",
"object": "can't end with a comma"
}
then, replacing the last octet before the tilde:
VAR=32
jq --arg octet "$VAR" '.my_version |= sub("[0-9]+(?=~)"; $octet)' file.json
outputs
{
"type": "xyz",
"my_version": "1.0.1.32~22hgde",
"object": "can't end with a comma"
}
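Note that jq does not edit files in place, so to persist the change you typically write to a temporary file and move it back (a sketch; the temp-file name is arbitrary):
jq --arg octet "$VAR" '.my_version |= sub("[0-9]+(?=~)"; $octet)' file.json > file.json.tmp && mv file.json.tmp file.json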

Use match from file in string

I have a list of user IDs in a file and I'm trying to generate some SQL from it.
cat users.json | awk '/UID/{print "INSERT IGNORE INTO potential_problem_users VALUES ("$2");" }'
But it's not doing what I expected it to do:
);SERT IGNORE INTO potential_problem_users VALUES ("1"
);SERT IGNORE INTO potential_problem_users VALUES ("2"
The query I am trying for is:
INSERT IGNORE INTO potential_problem_users VALUES ([userID]);
Example JSON, if it helps:
{
"results": [
{
"UID": "abc"
},
{
"UID": "124"
}
],
"objectsCount": 5,
"totalCount": 10966,
"statusCode": 200,
"errorCode": 0,
"statusReason": "OK"
}
Am I using the right tool? If I am, what am I doing wrong?
Your JSON input has a couple of issues which make it impossible to parse using a JSON parser like jq, which on OS X you can install with brew install jq.
The error-free JSON from your question would be:
{
"results": [
{
"UID": "abc"
},
{
"UID": "124"
}
],
"objectsCount": 5,
"totalCount": 10966,
"statusCode": 200,
"errorCode": 0,
"statusReason": "OK"
}
To parse this JSON and produce the two query statements containing the UID values, just do:
jq --raw-output '"INSERT IGNORE INTO potential_problem_users VALUES (" + (.results[] | .UID) + ")"' users.json
which would produce the output:
INSERT IGNORE INTO potential_problem_users VALUES (abc)
INSERT IGNORE INTO potential_problem_users VALUES (124)
The addition operator + in jq allows you to concatenate strings to form the final string.
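Equivalently, jq string interpolation can build the same line without explicit concatenation (a sketch; the trailing semicolon is added to match the query shown in the question):
jq -r '.results[] | "INSERT IGNORE INTO potential_problem_users VALUES (\(.UID));"' users.json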
awk -F: '/UID/ {print "INSERT IGNORE INTO potential_problem_users VALUES ("gensub(" ","","g",$2)");"}' users.json
Use gensub to take the spaces out of the second :-delimited field. Note also that you don't need to cat the file into awk.
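If you also want the value without the surrounding quotes, splitting on " instead of : works too (a sketch, not part of the original answer):
awk -F'"' '/UID/ {print "INSERT IGNORE INTO potential_problem_users VALUES ("$4");"}' users.json
INSERT IGNORE INTO potential_problem_users VALUES (abc);
INSERT IGNORE INTO potential_problem_users VALUES (124);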
You have several JSON parsers available to you on OS X that will also work on Linux. You can use Perl, Python or Ruby among others.
Here is a demo in Ruby:
Given:
$ cat json
{
"results": [
{
"UID": "abc"
},
{
"UID": "124"
}
],
"objectsCount": 5,
"totalCount": 10966,
"statusCode": 200,
"errorCode": 0,
"statusReason": "OK"
}
You can parse that file and print the lines of interest this way:
$ ruby -0777 -lane 'require "json"
d=JSON.parse($_)
d["results"].each {|e| puts "INSERT IGNORE INTO potential_problem_users VALUES (#{e["UID"]})" }' json
INSERT IGNORE INTO potential_problem_users VALUES (abc)
INSERT IGNORE INTO potential_problem_users VALUES (124)
Your users.json contains carriage returns (aka \rs, aka control-Ms). Run dos2unix or similar on it first, then re-run your awk script (or any other script) and let us know if you still have a problem.
Almost every time you find characters you expect at the end of a line showing up at the start instead, the problem is control-Ms.
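To confirm and strip the carriage returns (a sketch; cat -v displays them as ^M at the ends of lines):
cat -v users.json | head                     # lines ending in ^M confirm the problem
tr -d '\r' < users.json > users.unix.json    # or simply: dos2unix users.json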

Replace "[ to [ with sed

I have a file:
"tags": "['PNP']"
Clearly "[ is wrong, it must ot be "tags" : ['PNP']
So I wanna to replace with sed:
sed -i "1,$ s/"[/[/g" file.json
However, it told me that there was no match.
How can I do it?
You can do
sed 's/"\[/\[/; s/\]"/\]/' file.json
The bracket [ is a special character in basic regular expressions, so it has to be escaped in the match pattern; the escapes in the replacement part are harmless but not strictly required.
On input:
"tags": "['PNP']"
This outputs:
"tags": ['PNP']

Can someone explain what this code is doing? What are all the special characters in sed doing?

jsonval () {
temp=`echo $haystack | sed 's/\\\\\//\//g' | sed 's/[{}]//g' | awk -v k="text" ' {n=split($0,a,","); for (i=1; i<=n; i++) print a[i]}' | sed 's/\"\:\"/\|/g' | sed 's/[\,]/ /g' | sed ' s/\"//g' | grep -w $needle`
echo ${temp##*|}
}
dev_key='xxxxxxxxxxxx'
zip_code='48446'
city='Lapeer'
state='MI'
red=$(tput setaf 1)
textreset=$(tput sgr0)
haystack=$(curl -Ls -X GET http://api.wunderground.com/api/$dev_key/conditions/q/$state/$city.json)
needle='temperature_string'
temperature=$(jsonval $needle $haystack)
needle='weather'
current_condition=$(jsonval $needle $haystack)
echo -e '\n' $red $current_condition 'and' $temperature $textreset '\n'
This code is supposed to parse JSON weather data and print it to the terminal, using a developer key to call for the information.
This is the full code. Can someone explain what sed is doing? I know it is supposed to act as a substitute method, but why are there so many slashes and special characters?
Also, what is echo ${temp##*|} doing? All these special characters are making it hard for me to understand this code.
It seems that this command tries to parse JSON by hand. That is far from a good idea, since there are some nice items in the toolbox. One of them is jq. It's good at formatting JSON output or retrieving items from a complicated data source. Example:
file.json
{
"items": [
{
"tags": [
"bash",
"vim",
"zsh"
],
"owner": {
"reputation": 178,
"user_id": 22734,
"user_type": "registered",
"profile_image": "https://www.gravatar.com/avatar/25ee9a1b9f5a16feb1432882a9ef2f06?s=128&d=identicon&r=PG",
"display_name": "Brad Parks",
"link": "http://unix.stackexchange.com/users/22734/brad-parks"
},
"is_answered": false,
"view_count": 2,
"answer_count": 0,
"score": 0,
"last_activity_date": 1417919326,
"creation_date": 1417919326,
"question_id": 171907,
"link": "http://unix.stackexchange.com/questions/171907/use-netrw-or-nerdtree-in-zsh-bash-to-select-a-file-by-browsing",
"title": "Use Netrw or Nerdtree in Zsh/Bash to select a file BY BROWSING?"
}
]
}
Output from searching the owner sub-hash:
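For example, a filter that pulls out the owner sub-hash could look like this (a sketch; the exact filter is my assumption):
$ jq '.items[].owner' file.json
{
  "reputation": 178,
  "user_id": 22734,
  "user_type": "registered",
  "profile_image": "https://www.gravatar.com/avatar/25ee9a1b9f5a16feb1432882a9ef2f06?s=128&d=identicon&r=PG",
  "display_name": "Brad Parks",
  "link": "http://unix.stackexchange.com/users/22734/brad-parks"
}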
Don't reinvent the wheel badly ;)

Struggling with awk

I have a curl command which returns this kind of JSON-formatted text:
[{"id": "nUsrLast//device control", "name": "nUsrLast", "access": "readonly", "value": "0", "visibility": "visible", "type": "integer"}]
I would like to get the value of the field value.
Can someone give me a simple awk or grep command to do so?
Here is an awk solution:
awk -v RS="," -F\" '/value/ {print $4}' file
0
How does it work?
Setting RS to , breaks the input into records like this:
awk -v RS="," '{$1=$1}1' file
[{"id": "nUsrLast//device control"
"name": "nUsrLast"
"access": "readonly"
"value": "0"
"visibility": "visible"
"type": "integer"}]
Then /value/ {print $4} prints the fourth field of the matching record, using " as the field separator.
You could use grep with the -oP options:
$ echo '[{"id": "nUsrLast//device control", "name": "nUsrLast", "access": "readonly", "value": "0", "visibility": "visible", "type": "integer"}]' | grep -oP '(?<=\"value\": \")[^"]*'
0
From grep --help,
-P, --perl-regexp PATTERN is a Perl regular expression
-o, --only-matching show only the part of a line matching PATTERN
Pattern Explanation:
(?<=\"value\": \") Lookbehind is used to set or place the matching marker. In our case, regex engine places the matching marker just after to the string "value": ".
[^"]* Now it matches any character except " zero or more times. When a " is detected then the regex engine would stop it's matching operation.
This solution isn't grep or awk but chances are pretty good your system has perl on it, and this is the best solution thus far:
echo '<your_json>' | perl -e '<STDIN> =~ /\"value\"\s*:\s*\"((\\"|[^"])*)\"/; print $1;'
It handles the possibility of a failed request by ensuring there is a trailing " character. It also handles backslash-escaped " symbols in the string and whitespace between "value" and the colon character.
It does not handle JSON broken across multiple lines, but then none of the other solutions do, either.
\"value\"\s*:\s*\" Ensures that we're dealing with the correct field, then
(([^"]|\\")*) Captures the associated valid JSON string
\" Makes sure the string is properly terminated
Frankly, you're better off using a real JSON parser, though.
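For instance, jq makes this a one-liner (a sketch, reading the single-element array from the question; file is a hypothetical file name holding that JSON):
jq -r '.[0].value' file
0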

Resources