Bombardier - Read from file

Is there a way to have the benchmarking tool Bombardier read a portion of the input from a file and substitute it into the request body?
I want to test x number of IDs, with the rest of the request body staying the same. As an example:
bombardier -c 10 -d 200s -m POST -b '{"id": <fromfile>, "somefield": "yes"}' http://localhost:8080
How can the placeholder <fromfile> above be replaced by the actual file contents?
Thanks.
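One possible workaround, as a rough sketch rather than a built-in Bombardier feature: generate the body per ID in a shell loop and run a short Bombardier pass for each one. The file name ids.txt, the per-ID duration, and numeric IDs are assumptions here.
# ids.txt is assumed to contain one numeric ID per line
while IFS= read -r id; do
  bombardier -c 10 -d 10s -m POST \
    -H 'Content-Type: application/json' \
    -b "{\"id\": $id, \"somefield\": \"yes\"}" \
    http://localhost:8080
done < ids.txt
If the IDs are strings, add escaped quotes around $id in the body.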

Related

Formatting window values with jq

I would like to take the JSON output of window values from a program. Its output currently is:
[
3067,
584
]
[
764,
487
]
But I need it to be formatted like: 3067,584 764x487. How would I go about doing this using jq or other commands?
I'm not very experienced with jq and json formatting in general, so I'm not really sure where to start. I have tried looking this up but am still not really sure how to do it.
A solution that does not use --slurp would be to call input for every other array:
jq -r 'join(",")+" "+(input|join("x"))'
3067,584 764x487
If your input is a stream of JSON arrays, you could use jq's -s/--slurp command-line option. Join the first array with a comma and the second array with "x", then join both strings with a space:
$ jq -sr '[(.[0]|join(",")), (.[1]|join("x"))] | join(" ")' <<JSON
[
3067,
584
]
[
764,
487
]
JSON
3067,584 764x487
Alternatively, simply use string interpolation:
jq -sr '"\(.[0][0]),\(.[0][1]) \(.[1][0])x\(.[1][1])"'

How do I prepend a variable to an open file stream when using split to create CSVs?

I have a bash script that takes a large CSV and splits it into smaller CSVs, based on this blog post: https://medium.com/swlh/automatic-s3-file-splitter-620d04b6e81c. It works well because it is fast and never downloads the CSVs, which is great for a Lambda. However, the split CSVs do not have headers; only the originating CSV does. This is a problem for me since Apache PySpark cannot read a set of files where one has a header row and the others do not.
I want to add a header row to each CSV written.
What the code does
INFILE
"s3://test-bucket/test.csv"
OUTFILES - split into 300K-line files
"s3://dest-test-bucket/test.00.csv"
"s3://dest-test-bucket/test.01.csv"
"s3://dest-test-bucket/test.02.csv"
"s3://dest-test-bucket/test.03.csv"
Original code that works
LINECOUNT=300000
INFILE=s3://"${S3_BUCKET}"/"${FILENAME}"
OUTFILE=s3://"${DEST_S3_BUCKET}"/"${FILENAME%%.*}"
FILES=($(aws s3 cp "${INFILE}" - | split -d -l ${LINECOUNT} --filter "aws s3 cp - \"${OUTFILE}_\$FILE.csv\" | echo \"\$FILE.csv\""))
This was my attempt to add a variable to the outgoing file stream, but it did not work.
LINECOUNT=300000
INFILE=s3://"${S3_BUCKET}"/"${FILENAME}"
OUTFILE=s3://"${DEST_S3_BUCKET}"/"${FILENAME%%.*}"
HEADER=$(aws s3 cp "${INFILE}" - | head -n 1)
FILES=($(aws s3 cp "${INFILE}" - | split -d -l ${LINECOUNT} --filter "echo ${HEADER}; aws s3 cp - \"${OUTFILE}_\$FILE.csv\" | echo \"\$FILE.csv\""))
Attempt 2:
LINECOUNT=300000
INFILE=s3://"${S3_BUCKET}"/"${FILENAME}"
OUTFILE=s3://"${DEST_S3_BUCKET}"/"${FILENAME%%.*}"
HEADER=$(aws s3 cp "${INFILE}" - | head -n 1)
FILES=($(aws s3 cp "${INFILE}" - | split -d -l ${LINECOUNT} --filter "{ echo -n ${HEADER}; aws s3 cp - \"${OUTFILE}_\$FILE.csv\"; } | echo \"\$FILE.csv\""))
AWS documentation states
You can use the dash parameter for file streaming to standard input (stdin) or standard output (stdout).
I don't know if this is even possible with an open file stream.
Hope this helps. I think you are only missing the cat aspect of adding the header.
This article shows one way to split a file and provide the header using the split command and its --filter argument.
Applying that snippet to the code above seems to work. Notice that the two commands inside the curly braces are echo ${HEADER} and cat. The first, echo, writes the header to stdout; the second, cat, pipes the stdin coming from aws s3 cp to stdout, which becomes the input of aws s3 cp -, creating the new file on S3. The test on $FILE skips the extra header for the first split, which already contains the original header line.
HEADER='"Name", "Team", "Position", "Height(inches)", "Weight(lbs)", "Age"'
aws s3 cp ${INFILE} - | split -d -l ${LINECOUNT} --filter "{ \[ "\$FILE" != "x00" \] && echo ${HEADER} ; cat; } | aws s3 cp - \"${OUTFILE}\${FILE}.csv\""
After running the command, I observed 3 new files and each file had the desired header.
head -n2 *.csv
==> x00.csv <==
"Name", "Team", "Position", "Height(inches)", "Weight(lbs)", "Age"
"Adam Donachie", "BAL", "Catcher", 74, 180, 22.99
==> x01.csv <==
Name, Team, Position, Height(inches), Weight(lbs), Age
"John Rheinecker", "TEX", "Starting Pitcher", 74, 230, 27.76
==> x02.csv <==
Name, Team, Position, Height(inches), Weight(lbs), Age
"Chase Utley", "PHI", "Second Baseman", 73, 183, 28.2

Filtering a live JSON stream on Linux

I have live raw JSON stream data from the Virtual Radar Server I'm using.
I use netcat to fetch the data and jq to save it on my Kali Linux machine, using the following command:
nc 127.0.0.1 30006 | jq > A7.json
But I want to filter specific content from the data stream.
I use the following command to extract the data:
cat A7.json | jq '.acList[] | select(.Call | contains("QTR"))' - to fetch the selected airline
But I realized later that the above command only works once; in other words, it does not refresh. Since the data updates every second, I have to execute the command over and over again to extract the filtered data, which generates duplicates.
Can someone help me filter the live data without executing the command over and over?
As you don't use the --stream option, I suppose your document is a regular JSON document.
To execute your command every second, you can implement a loop that sleeps for 1 second:
while true; do sleep 1; nc 127.0.0.1 30006 | jq '.acList[] | select(…)'; done
To have the output on the screen and also save to a file (like you did with A7.json), you can add a call to tee:
# saves the document as returned by `nc` but outputs result of `jq`
while true; do sleep 1; nc 127.0.0.1 30006 | tee A7.json | jq '.acList[] | …'; done
# saves the result of `jq` and outputs it
while true; do sleep 1; nc 127.0.0.1 30006 | jq '.acList[] | …' | tee A7.json; done
Can you try this?
nc localhost 30006 | tee -a A7.json |
while true; do
stdbuf -o 0 jq 'try (.acList[] | select(.Call | contains("QTR")))' 2>/dev/null
done
Assuming that no other process is competing for the port, I'd suggest trying:
nc -k -l localhost 30006 | jq --unbuffered ....
Or if you want to keep a copy of the output of the netcat command:
nc -k -l localhost 30006 | tee A7.json | jq --unbuffered ....
You might want to use tee -a A7.json instead.
Break Down
Why I did what I did
I have live raw JSON stream data from the Virtual Radar Server, which is running on my laptop alongside Kali Linux WSL in the background.
For those who don't know, Virtual Radar Server is a Mode S transmission decoder used to decode different ADS-B formats. It also rebroadcasts the data in a variety of formats, one of them being a JSON stream. I want to save selected aircraft data in JSON format on Kali Linux.
I used the following commands to save the data before:
$ nc 127.0.0.1 30001 | jq > A7.json - to save the stream
$ cat A7.json | jq '.acList[] | select(.Call | contains("QTR"))' - to fetch the selected airline
But I realized two things after using the above. One, I'm storing unwanted data, which is consuming my storage. Two, the second command only goes through the JSON file once and produces the data saved at that moment and that moment alone, so I have to execute the command over and over again to extract the filtered data, which generates duplicates.
Command that worked for me
The following command worked flawlessly for my problem.
$ nc localhost 30001 | sudo jq --unbuffered '.acList[] | select (.Icao | contains("800CB8"))' > A7.json
The following also caused me some trouble, which I explain down below.
Errors & Explanations
This error resulted from a missing field name & key in the JSON object.
$ nc localhost 30001 | sudo jq --unbuffered '.acList[] | select (.Call | contains("IAD"))' > A7.json
#OUTPUT
jq: error (at <stdin>:0): null (null) and string ("IAD") cannot have their containment checked
If you look at the JSON data below, you'll see the missing field name & key ("Call") that caused the error message above; a guard for this case is sketched after the sample data.
{
"Icao": "800CB8",
"Alt": 3950,
"GAlt": 3794,
"InHg": 29.7637787,
"AltT": 0,
"Call": "IAD766",
"Lat": 17.608658,
"Long": 83.239166,
"Mlat": false,
"Tisb": false,
"Spd": 209,
"Trak": 88.9,
"TrkH": false,
"Sqk": "",
"Vsi": -1280,
"VsiT": 0,
"SpdTyp": 0,
"CallSus": false,
"Trt": 2
}
{
"Icao": "800CB8",
"Alt": 3950,
"GAlt": 3794,
"AltT": 0,
"Lat": 17.608658,
"Long": 83.239166,
"Mlat": false,
"Spd": 209,
"Trak": 88.9,
"Vsi": -1280
}
{
"Icao": "800CB8",
"Alt": 3800,
"GAlt": 3644,
"AltT": 0,
"Lat": 17.608795,
"Long": 83.246155,
"Mlat": false,
"Spd": 209,
"Trak": 89.2,
"Vsi": -1216
}
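One way to avoid that error, as a rough sketch using the field names from the sample data above: only apply contains when .Call is actually present.
$ nc localhost 30001 | jq --unbuffered '.acList[]? | select(.Call != null and (.Call | contains("IAD")))' > A7.json
The trailing ? on .acList[] also keeps jq quiet for records that have no acList at all.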
Commands that didn't work for me.
Command #1
When I used jq with --stream together with the filter, it produced the output below (--stream with just an output filter worked without any errors). A sketch of combining --stream with the filter follows the error output.
$ nc localhost 30001 | sudo jq --stream '.acList[] | select (.Icao | contains("800"))' > A7.json
#OUTPUT
jq: error (at <stdin>:0): Cannot index array with string "acList"
jq: error (at <stdin>:0): Cannot index array with string "acList"
jq: error (at <stdin>:0): Cannot index array with string "acList"
jq: error (at <stdin>:0): Cannot index array with string "acList"
jq: error (at <stdin>:0): Cannot index array with string "acList"
jq: error (at <stdin>:0): Cannot index array with string "acList"
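For reference, a hedged sketch of how --stream could be combined with the filter: --stream emits [path, value] event pairs, so the whole objects first have to be rebuilt with fromstream before .acList can be indexed (which is what the errors above complain about).
$ nc localhost 30001 | jq -cn --stream 'fromstream(inputs) | .acList[] | select(.Icao | contains("800"))' > A7.json
For top-level objects of this size --stream mostly adds overhead, so the --unbuffered command above remains the simpler choice.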
Command #2
For some reason -k -l didn't work to listen for the data, but the other command worked perfectly. I think it didn't work because the port already exists outside of WSL.
$ nc -k -l localhost 30001
$ nc localhost 30001
Thank you to everyone who helped me solve my issue. I'm very grateful to you guys.

Bash: Loop to read N lines at a time from a CSV

I have a CSV file of 100,000 IDs
wef7efwe1fwe8
wef7efwe1fwe3
ewefwefwfwgrwergrgr
that are being transformed into a JSON object using jq:
output=$(jq -Rsn '
{"id":
[inputs
| . / "\n"
| (.[] | select(length > 0) | . / ";") as $input
| $input[0]]}
' <$FILE)
Output:
{
"id": [
"wef7efwe1fwe8",
"wef7efwe1fwe3",
....
]
}
Currently, I need to manually split the file into smaller 10,000-line files, because the API call has a limit.
I would like a way to automatically loop through the large file, using only 10,000 lines at a time as $FILE, up until the end of the list.
I would use the split command and write a little shell script around it:
#!/bin/bash
input_file=ids.txt
temp_dir=splits
api_limit=10000
# Make sure that there are no leftovers from previous runs
rm -rf "${temp_dir}"
# Create temporary folder for splitting the file
mkdir "${temp_dir}"
# Split the input file based on the api limit
split --lines "${api_limit}" "${input_file}" "${temp_dir}/"
# Iterate through splits and make an api call per split
for split in "${temp_dir}"/* ; do
jq -Rsn '
{"id":
[inputs
| . / "\n"
| (.[] | select(length > 0) | . / ";") as $input
| $input[0]]
}' "${split}" > api_payload.json
# now do something ...
# curl -d @api_payload.json http://...
rm -f api_payload.json
done
# Clean up
rm -rf "${temp_dir}"
Here's a simple and efficient solution that at its core just uses jq. It takes advantage of the -c command-line option. I've used xargs printf ... for illustration - mainly to show how easy it is to set up a shell pipeline.
< data.txt jq -Rnc '
def batch($n; stream):
def b: [limit($n; stream)]
| select(length > 0)
| (., b);
b;
{id: batch(10000; inputs | select(length>0) | (. / ";")[0])}
' | xargs printf "%s\n"
Parameterizing batch size
It might make sense to set things up so that the batch size is specified outside the jq program. This could be done in numerous ways, e.g. by invoking jq along the lines of:
jq --argjson n 10000 ....
and of course using $n instead of 10000 in the jq program.
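Putting that together, a sketch of the parameterized invocation, assuming the program above (with $n in place of 10000) has been saved in a hypothetical file batch.jq:
< data.txt jq -Rnc --argjson n 10000 -f batch.jq | xargs printf "%s\n"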
Why “def b:”?
For efficiency. jq’s TCO (tail recursion optimization) only works for arity-0 filters.
Note on -s
In the Q as originally posted, the command-line options -sn are used in conjunction with inputs. Using -s with inputs defeats the whole purpose of inputs, which is to make it possible to process input in a stream-oriented way (i.e. one line of input or one JSON entity at a time).
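To illustrate the difference on line-oriented input (a sketch reusing the data.txt file name from the command above): inputs lets jq see one line at a time, while -s loads everything into memory as a single string first.
# stream-oriented: jq reads one raw line at a time via inputs
jq -Rn '[inputs | select(length > 0)]' data.txt
# slurped: the whole file becomes one string that is then split
jq -Rs '[. / "\n" | .[] | select(length > 0)]' data.txt
Both print an array of the non-empty lines; the first reads them as a stream of inputs, the second as a single slurped string.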

How to calculate number of unique response codes in an apache server log file?

I just started a system admin job. I was given a log file which contains some response codes (actually, there are a lot of them). I need to make a list of the codes used, for example 400, 200, 304, 404. Then I need to show how many times each response code is repeated. For the last task, I did this:
less logfile | grep -c "404"
But I still need to make a list of those response codes.
This is a sample of the log file:
10.229.120.3|-|[12/Apr/2020:23:58:40 -0500]|/keepalive.html||200|10|1143|HTTP/1.1|GET|TLSv1.2|AES256-GCM-SHA384
10.229.120.4|-|[12/Apr/2020:23:58:53 -0500]|/keepalive.html||200|10|2367|HTTP/1.1|GET|TLSv1.2|AES256-GCM-SHA384
10.229.120.3|-|[12/Apr/2020:23:58:55 -0500]|/keepalive.html||200|10|1194|HTTP/1.1|GET|TLSv1.2|AES256-GCM-SHA384
10.229.120.4|-|[12/Apr/2020:23:59:08 -0500]|/keepalive.html||200|10|2212|HTTP/1.1|GET|TLSv1.2|AES256-GCM-SHA384
10.229.120.3|-|[12/Apr/2020:23:59:10 -0500]|/keepalive.html||200|10|1780|HTTP/1.1|GET|TLSv1.2|AES256-GCM-SHA384
10.229.120.4|-|[12/Apr/2020:23:59:23 -0500]|/keepalive.html||200|10|1268|HTTP/1.1|GET|TLSv1.2|AES256-GCM-SHA384
10.229.120.3|-|[12/Apr/2020:23:59:25 -0500]|/keepalive.html||200|10|1160|HTTP/1.1|GET|TLSv1.2|AES256-GCM-SHA384
10.229.120.4|-|[12/Apr/2020:23:59:38 -0500]|/keepalive.html||200|10|1206|HTTP/1.1|GET|TLSv1.2|AES256-GCM-SHA384
10.229.120.3|-|[12/Apr/2020:23:59:40 -0500]|/keepalive.html||200|10|1138|HTTP/1.1|GET|TLSv1.2|AES256-GCM-SHA384
10.229.120.4|-|[12/Apr/2020:23:59:53 -0500]|/keepalive.html||200|10|1304|HTTP/1.1|GET|TLSv1.2|AES256-GCM-SHA384
10.229.120.3|-|[12/Apr/2020:23:59:55 -0500]|/keepalive.html||200|10|2476|HTTP/1.1|GET|TLSv1.2|AES256-GCM-SHA384
With awk:
awk -F'|' '{ a[$6]++} END{ for (i in a) print i "\t" a[i] }' logfile | sort -n
This counts the occurrences of the sixth field and prints each status code with its count. The output is then sorted numerically by status code.
Use sort -nrk2 instead if you want the output sorted by count in descending order.
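As an alternative sketch without awk, assuming the status code is always the sixth |-separated field as in the sample:
cut -d'|' -f6 logfile | sort | uniq -c | sort -rn
This prints each distinct status code preceded by its count, sorted by count in descending order.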
