I'm trying to curl a web page, do some processing on it, and finally print the result in JSON format (which will actually be used as MongoDB input).
The input (which is read through curl) is:
Input:
brendan google engineer
stones microsoft chief_engineer
david facebook tester
For the processing, I assign values to the variables ($name, $employer, $designation).
My final command, which converts to JSON, is:
echo "[{\"Name\":\"$name\",\"Employer\":\"$employer\",\"Designation\":\"$designation\"}]"
The current output is,
[{"Name":"brendan","Employer":"google","Designation":"engineer"}]
[{"Name":"stones","Employer":"microsoft","Designation":"chief_engineer"}]
[{"Name":"david","Employer":"facebook","Designation":"tester"}]
But I want the output on a single line, with the objects separated by commas and with square brackets only at the start and end (not on every line).
Expected output:
[{"Name":"brendan","Employer":"google","Designation":"engineer"},{"Name":"stones","Employer":"microsoft","Designation":"chief_engineer"},{"Name":"david","Employer":"facebook","Designation":"tester"}]
Any suggestions?
Conventional text-processing tools can't do this right for the general case. There are a bunch of corner cases to JSON -- nonprintable and high-Unicode characters (and quotes) need to be escaped, for instance. Use a tool that's actually built for the job, such as jq:
jq -n -R '
[
inputs |
split(" ") |
{ "Name": .[0], "Employer": .[1], "Designation": .[2] }
]' <<EOF
brendan google engineer
stones microsoft chief_engineer
david facebook tester
EOF
...emits as output:
[
{
"Name": "brendan",
"Employer": "google",
"Designation": "engineer"
},
{
"Name": "stones",
"Employer": "microsoft",
"Designation": "chief_engineer"
},
{
"Name": "david",
"Employer": "facebook",
"Designation": "tester"
}
]
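To get the single-line form the question asks for, jq's -c (compact output) option can be added to the same filter. A minimal sketch, assuming jq 1.5 or later and feeding the sample rows from a here-document instead of curl; the variable name result is only for illustration:

```shell
# Same filter as in the answer, plus -c for compact one-line output.
# "result" is an illustrative variable name, not part of the original answer.
result=$(jq -c -n -R '
  [ inputs
    | split(" ")
    | { "Name": .[0], "Employer": .[1], "Designation": .[2] } ]' <<'EOF'
brendan google engineer
stones microsoft chief_engineer
david facebook tester
EOF
)
printf '%s\n' "$result"
```

This prints the whole array on one line, with the brackets only at the start and end.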
Something like this?
sep='['
curl "...whatever..." |
while read -r name employer designation; do
printf '%s{"Name": "%s", "Employer": "%s", "Designation": "%s"}' "$sep" "$name" "$employer" "$designation"
sep=', '
done
printf ']\n'
I do agree that this is brittle and error-prone; if you can use a JSON-aware tool like jq, by all means do that instead.
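If the fields are already in shell variables, one option is to let jq build each object so that the quoting is handled for you. A sketch, assuming jq is installed; the sample name with an embedded double quote is made up to show the escaping, and obj is just an illustrative variable name:

```shell
name='bren"dan'        # made-up value containing a double quote
employer='google'
designation='engineer'

# --arg passes each shell value in as a jq string; jq escapes it on output.
obj=$(jq -c -n --arg n "$name" --arg e "$employer" --arg d "$designation" \
  '{ "Name": $n, "Employer": $e, "Designation": $d }')
printf '%s\n' "$obj"
# {"Name":"bren\"dan","Employer":"google","Designation":"engineer"}
```

The printf loop would have emitted the quote unescaped and produced invalid JSON for the same input.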
If you have access to jq 1.5 or later, then you can use inputs and may wish to consider using splits(" +") in case the tokens might be separated by more than one space:
jq -n -R '
[inputs
| [splits(" +")]
| { "Name": .[0], "Employer": .[1], "Designation": .[2] }]'
If you do not have ready access to jq 1.5 or later, then please note that the following will work with jq 1.4:
jq -R -s '
[split("\n")[]
| select(length>0)
| split(" ")
| { "Name": .[0], "Employer": .[1], "Designation": .[2] }]'
Related
I have a file which contains my json
{
"type": "xyz",
"my_version": "1.0.1.66~22hgde",
}
I want to edit the value for the key my_version and, every time, replace the number after the third dot with another number that is stored in a variable, so that it becomes something like 1.0.1.32~22hgde. I am using sed to replace it:
sed -i "s/\"my_version\": \"1.0.1.66~22hgde\"/\"my_version\": \"1.0.1.$VAR~22hgde\"/g" test.json
This works, but the issue is that the my_version string doesn't remain constant; it can change to something like 1.0.2.66 or 2.0.1.66. So how do I handle such a case in bash?
how do I handle such case?
You write a regular expression that matches any possible combination of characters that can appear there. You can learn regex in a fun way with online regex crosswords.
Do not edit JSON files with sed; sed is line-oriented. Consider using JSON-aware tools like jq, which will handle any possible case.
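If sed is all that is available, the regular expression could look something like the sketch below. It assumes the version always has four dot-separated numeric parts and that the new number contains only digits; the helper name set_build_number and the temporary file are hypothetical, used only to make the example self-contained:

```shell
# Hypothetical helper: print file $2 with the number after the third dot
# of my_version replaced by $1 (assumed to contain digits only).
set_build_number() {
    sed -E 's/("my_version": *"[0-9]+\.[0-9]+\.[0-9]+\.)[0-9]+/\1'"$1"'/' "$2"
}

VAR=32
tmp=$(mktemp)
printf '%s\n' '{' '"type": "xyz",' '"my_version": "2.0.1.66~22hgde"' '}' > "$tmp"

updated=$(set_build_number "$VAR" "$tmp")
printf '%s\n' "$updated"
rm -f "$tmp"
```

The first three numbers and the text after the fourth are left as they are, so the same command works for 1.0.2.66 or 2.0.1.66.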
A jq answer: file.json contains
{
"type": "xyz",
"my_version": "1.0.1.66~22hgde",
"object": "can't end with a comma"
}
then, replacing the last octet before the tilde:
VAR=32
jq --arg octet "$VAR" '.my_version |= sub("[0-9]+(?=~)"; $octet)' file.json
outputs
{
"type": "xyz",
"my_version": "1.0.1.32~22hgde",
"object": "can't end with a comma"
}
I want to process data in the following format with jq:
{
"data": [
{
"valueX": 11111,
"valueY": 11111,
},
{
"valueX": 2222,
"valueY": 2222,
}
...,
{
"valueX": 2222,
"valueY": 2222,
}
],
"meaningless_data": "x"
}
I want to go through the data in the "data" section, which has 100 sets of data. I wrote the following: I saved all the content into ${input} and wanted to print out each valueX. The echo part is printed, but I still see a lot of parse error: Invalid numeric literal at EOF messages. How can I fix this?
for row in $(echo "${input}" | jq -r '.[] | @base64'); do
_jq() {
echo ${row} | base64 --decode | jq -r ${1}
}
for i in {0..100}; do
echo "Printing valueX: "$(_jq '.['"${i}"'].valueX')" . "
done
done
To fix the pseudo-JSON, you could use a tool such as https://hjson.org/. See the jq FAQ for further details and other options: https://github.com/stedolan/jq/wiki/FAQ
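Once the input is valid JSON, the shell loop and the base64 round trip are not needed, because jq can walk the array itself. The parse errors in the question most likely come from .[] iterating over every top-level value, so the string "x" is also decoded and handed to jq. A sketch, assuming a cleaned-up two-element version of the data; the variable names are illustrative:

```shell
# Cleaned-up sample: no trailing commas, no "..." placeholder.
input='{"data":[{"valueX":11111,"valueY":11111},{"valueX":2222,"valueY":2222}],"meaningless_data":"x"}'

# Select only the "data" array and format one line per element.
lines=$(printf '%s\n' "$input" | jq -r '.data[] | "Printing valueX: \(.valueX)"')
printf '%s\n' "$lines"
# Printing valueX: 11111
# Printing valueX: 2222
```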
jsonval () {
temp=`echo $haystack | sed 's/\\\\\//\//g' | sed 's/[{}]//g' | awk -v k="text" ' {n=split($0,a,","); for (i=1; i<=n; i++) print a[i]}' | sed 's/\"\:\"/\|/g' | sed 's/[\,]/ /g' | sed ' s/\"//g' | grep -w $needle`
echo ${temp##*|}
}
dev_key='xxxxxxxxxxxx'
zip_code='48446'
city='Lapeer'
state='MI'
red=$(tput setaf 1)
textreset=$(tput sgr0)
haystack=$(curl -Ls -X GET http://api.wunderground.com/api/$dev_key/conditions/q/$state/$city.json)
needle='temperature_string'
temperature=$(jsonval $needle $haystack)
needle='weather'
current_condition=$(jsonval $needle $haystack)
echo -e '\n' $red $current_condition 'and' $temperature $textreset '\n'
This code is supposed to parse JSON weather data and print it to the terminal, using a developer key to request the information.
This is the full code. Can someone explain what sed is doing? I know it is supposed to act as a substitution method, but why are there so many slashes and special characters?
Also, what is echo ${temp##*|} doing? All these special characters make it hard for me to understand this code.
It seems that this command tries to parse JSON. That is far from a good idea, since there are better tools in the toolbox. One of them is jq, which is good at formatting JSON output and at retrieving items from a complicated data source. Example:
file.json
{
"items": [
{
"tags": [
"bash",
"vim",
"zsh"
],
"owner": {
"reputation": 178,
"user_id": 22734,
"user_type": "registered",
"profile_image": "https://www.gravatar.com/avatar/25ee9a1b9f5a16feb1432882a9ef2f06?s=128&d=identicon&r=PG",
"display_name": "Brad Parks",
"link": "http://unix.stackexchange.com/users/22734/brad-parks"
},
"is_answered": false,
"view_count": 2,
"answer_count": 0,
"score": 0,
"last_activity_date": 1417919326,
"creation_date": 1417919326,
"question_id": 171907,
"link": "http://unix.stackexchange.com/questions/171907/use-netrw-or-nerdtree-in-zsh-bash-to-select-a-file-by-browsing",
"title": "Use Netrw or Nerdtree in Zsh/Bash to select a file BY BROWSING?"
}
]
}
Output from searching the owner's sub-hash:
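One plausible form of such a query is sketched here. The filter is an assumption about what was intended, and a trimmed copy of the document is written to a temporary file so the example is self-contained:

```shell
# Trimmed copy of the JSON above, written to a temporary file for the demo.
json_file=$(mktemp)
cat > "$json_file" <<'EOF'
{"items":[{"tags":["bash","vim","zsh"],
           "owner":{"reputation":178,"user_id":22734,"display_name":"Brad Parks"},
           "question_id":171907}]}
EOF

# Print the owner object of the first item, then a single field from it.
jq '.items[0].owner' "$json_file"
owner_name=$(jq -r '.items[0].owner.display_name' "$json_file")
printf '%s\n' "$owner_name"    # Brad Parks
rm -f "$json_file"
```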
Don't reinvent the wheel badly ;)
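On the two specific questions: the sed calls roughly strip the JSON punctuation (unescape \/, drop the braces, turn ":" into |, replace commas with spaces, remove the quotes) so that each pair ends up as key|value, and ${temp##*|} is a shell parameter expansion that removes the longest prefix matching *|, leaving only the value. A small sketch with a made-up sample string:

```shell
temp='temperature_string|54.5 F (12.5 C)'   # made-up sample of what grep leaves behind

# ## removes the longest matching prefix; the pattern *| is "anything up to a |".
value=${temp##*|}
printf '%s\n' "$value"    # 54.5 F (12.5 C)

# For comparison, %% removes the longest matching suffix, leaving the key.
key=${temp%%|*}
printf '%s\n' "$key"      # temperature_string
```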
I have a curl command which returns this kind of JSON-formatted text:
[{"id": "nUsrLast//device control", "name": "nUsrLast", "access": "readonly", "value": "0", "visibility": "visible", "type": "integer"}]
I would like to get the value of the field value.
Can someone give me a simple awk or grep command to do so?
Here is an awk solution:
awk -v RS="," -F\" '/value/ {print $4}' file
0
How does it work?
Setting RS to , splits the input into one record per comma-separated piece, like this:
awk -v RS="," '{$1=$1}1' file
[{"id": "nUsrLast//device control"
"name": "nUsrLast"
"access": "readonly"
"value": "0"
"visibility": "visible"
"type": "integer"}]
Then /value/ {print $4} selects the record containing value and prints its fourth field, with " as the field separator.
You could use grep with the -oP options,
$ echo '[{"id": "nUsrLast//device control", "name": "nUsrLast", "access": "readonly", "value": "0", "visibility": "visible", "type": "integer"}]' | grep -oP '(?<=\"value\": \")[^"]*'
0
From grep --help,
-P, --perl-regexp PATTERN is a Perl regular expression
-o, --only-matching show only the part of a line matching PATTERN
Pattern Explanation:
(?<=\"value\": \") A lookbehind sets the position where the match starts. In our case, the regex engine places the start of the match just after the string "value": ".
[^"]* Now it matches any character except " zero or more times. When a " is found, the regex engine stops matching.
This solution isn't grep or awk, but chances are pretty good that your system has perl on it, and this is the best solution thus far:
echo <your_json> | perl -e '<STDIN> =~ /\"value\"\s*:\s*\"((?:[^"\\]|\\.)*)\"/; print $1;'
It handles the possibility of a failed request by ensuring there is a trailing " character. It also handles backslash-escaped " symbols in the string and whitespace between "value" and the colon character.
It does not handle JSON broken across multiple lines, but then none of the other solutions do, either.
\"value\"\s*:\s*\" Ensures that we're dealing with the correct field, then
((?:[^"\\]|\\.)*) Captures the associated valid JSON string: any character other than " or \, or a backslash followed by any character
\" Makes sure the string is properly terminated
Frankly, you're better off using a real JSON parser, though.
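With jq as the parser, for instance, the lookup becomes a one-line filter. A sketch, assuming jq is installed and using the sample response from the question in place of the curl call; the variable names are illustrative:

```shell
json='[{"id": "nUsrLast//device control", "name": "nUsrLast", "access": "readonly", "value": "0", "visibility": "visible", "type": "integer"}]'

# .[0] is the first array element; -r prints the string without JSON quotes.
value=$(printf '%s\n' "$json" | jq -r '.[0].value')
printf '%s\n' "$value"    # 0
```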
I have this script:
#!/bin/sh
local name="url"
local line='{ "name": "url", "value": "http:\/\/www.example.com\/dir1\/page1" }'
local name2="protocol"
local line2='{ "name": "protocol", "value": "ftp" }'
sed -i "/\<$name2\>/s/.*/$line2/" /mydir/myfile
sed -i "/\<$name\>/s/.*/$line/" /mydir/myfile
myfile contains:
{ "name": "url", "value": "http:\/\/www.example.com\/dir2\/page2" }
{ "name": "url2", "value": "http:\/\/www.example.net/page" }
{ "name": "protocol", "value": "http" }
My sed command fails because of the / symbols in the value field. How can I fix this error?
Since your values have / all over the place, it's better to use an alternate regex delimiter; sed supports this.
sed -i "/\<$name2\>/s|.*|$line2|" /mydir/myfile
sed -i "/\<$name\>/s|.*|$line|" /mydir/myfile
Use a different separator like this:
sed -i "/\<$name2\>/s%.*%$line2%" /mydir/myfile
For example, this is a test I ran just now:
printf "eins x eins\nzwei x zwei\nabc x abc\ndef x def\n" | sed "/abc/s%x%y%"
It prints:
eins x eins
zwei x zwei
abc y abc
def x def
As you can see, just the line containing abc gets modified by the s command.
Instead of the % sign you can use any character which does not appear in the value you want to replace. Keep that in mind. You also can use _ or ⌴ or ; or a letter, a number, whatever you like. It just mustn't appear in the value.