awk unix insert into file location directory - linux

In linux, I am trying to select a variable from a specific column and row of CSV file and then use this variable as the end of a file location hierarchy. When I type the following into a bash terminal window, it seems to work by outputting the variable in correct row and column on screen.
awk -F "," 'FNR == 2 {print $8}' /sdata/images/projects/ASD_SSD/1/ruths_data/ruth/imaging\ study/imaging\ study\ working/delete2.csv
However, when I try to do the following substitution within a script, it fails to work:
r=2
c=8
s=awk -F "," 'FNR == $r {print $c}' /sdata/images/projects/ASD_SSD/1/ruths_data/ruth/imaging\ study/imaging\ study\ working/delete2.csv
I then try to use the s output as the end of a hierarchy file location. For example, /home/ork/js/s*
I keep getting the following error, so this looks like it's not creating the s variable and then not inserting it into the actual file location.
omitting directory `/home/ork/js/'
I have spent a few hours trying to figure out what is preventing this from working and am a new user (so I am sure it is something simple, sorry).
I hope I was clear enough, please let me know if this requires further clarification.

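This is a common question here. The single quotes protect $r and $c from the shell, so they are never expanded and awk receives the literal text $r and $c instead of 2 and 8. Command substitution is also needed to assign the output to the variable s. One way to do it is to pass the shell values in as awk variables (file stands for the path to your CSV):
s=$(awk -F, 'FNR==r{print $c}' r="$r" c="$c" file)
Inside awk, $c means "field number c", so with c=8 this prints the eighth field. To use the result in a path, write "/home/ork/js/$s"; a bare s* in the path is just the letter s followed by a glob.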
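For anyone who wants to try this end to end, here is a minimal sketch. The CSV contents are made up, and -v is used instead of trailing assignments (both forms work here). Note that it is print $c, not print c: with c=8, print c would print the number 8 itself rather than the eighth field.

```shell
#!/usr/bin/env bash
# Build a throwaway CSV so the example is self-contained (contents are made up).
csv=$(mktemp)
printf '%s\n' 'a,b,c,d,e,f,g,id' '1,2,3,4,5,6,7,subj01' > "$csv"

r=2   # row to read
c=8   # column to read

# -v copies the shell values into awk variables; $c inside awk is "field number c".
s=$(awk -F, -v r="$r" -v c="$c" 'FNR == r { print $c }' "$csv")

echo "$s"                  # the extracted cell
echo "/home/ork/js/${s}"   # the cell used as the last path component

rm -f "$csv"
```

Quoting "$csv" (or the real path) also avoids having to backslash-escape the spaces in directory names.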

Related

bash: How to replace an entire line in a text file by a part of its content

I have a text file, called texto.txt in Documentos folder, with some values like the ones below:
cat ~/Documentos/texto.txt
65f8: Testado
a4a1: Testado 2
So I want to change a whole line by using a customized function which gets as parameters the new value.
The new value will always keep the first 6 characters, changing only what comes after them. Although I am testing only the first four.
Then I edited my .bashrc including my function like shown below.
muda()
{
export BUSCA="$(echo $* | cut -c 1-4)";
sed -i "/^$BUSCA/s/.*/$*/" ~/Documentos/texto.txt ;}
When I run the command below it works like a charm, but I feel it could be improved.
muda a4a1: Testado 3
Result:
cat ~/Documentos/texto.txt
65f8: Testado
a4a1: Testado 3
Is there a smarter way to do this? Maybe by getting rid of BUSCA variable?
I'd write:
muda() {
local new_line="$*"
local key=${new_line:0:4}
sed -i "s/^${key//\//\\/}.*/${new_line//\//\\/}/" ~/Documentos/texto.txt
}
Notes:
using local variables, not exported environment variables
does not call out to cut, bash can extract a substring
escaping any slashes in the variable values so the sed code is not broken.
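A quick way to check the slash escaping is to run the same logic against a scratch file. This is only a sketch: muda_file is a hypothetical variant that takes the target file as its first argument, the sample value "Testado 3/4" is made up, and sed -i without a suffix assumes GNU sed.

```shell
#!/usr/bin/env bash
# muda_file is a hypothetical variant of muda that takes the target file as $1.
muda_file() {
    local file=$1; shift
    local new_line="$*"
    local key=${new_line:0:4}
    # Escape "/" so neither value can terminate the s/// expression early.
    sed -i "s/^${key//\//\\/}.*/${new_line//\//\\/}/" "$file"
}

tmp=$(mktemp)
printf '%s\n' '65f8: Testado' 'a4a1: Testado 2' > "$tmp"

# The new value contains a slash on purpose.
muda_file "$tmp" a4a1: Testado 3/4

result=$(cat "$tmp")
echo "$result"
rm -f "$tmp"
```

Only "/" is escaped here, so a value containing "&" or a backslash would still be interpreted by sed, and regex characters such as "." in the key still act as wildcards.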

Assign new variable from each line of a text file

What I'm basically trying to do is automatically detect if there is text in a line, and if so create a new variable containing the text in said line , within a script. If there is no text in a line then the variable doesn't get created. I can do this manually by opening the file -
$ cat file.txt
sometxt
somemoretext
evenmoretext
...
then adding to my script the appropriate lines -
TXT=file.txt
VAR1=$(sed -n 1p $TXT)
VAR2=$(sed -n 2p $TXT)
...
but this is a pain since I have to count how many lines there are in total, then copy and paste each line assigning the variables and changing 'VAR1' to 'VAR2' and '1p' to '2p'. There has to be an easier way. Thanks
@JNevil thanks for pointing me in the right direction.
Here's what ended up working for me:
for var_name in $(cat links.txt); do
wget <servername.com>$var_name
done
Still don't know how to use curl, but this worked fine!
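For the original question (one variable per non-empty line), a bash array is usually easier than numbered variables. A minimal sketch, assuming bash and an illustrative input file; the names txt and lines are made up for the example:

```shell
#!/usr/bin/env bash
# Sample input with a blank line in the middle (contents are illustrative).
txt=$(mktemp)
printf '%s\n' 'sometxt' '' 'somemoretext' 'evenmoretext' > "$txt"

lines=()
while IFS= read -r line; do
    # Keep only lines that contain at least one non-whitespace character.
    [[ -n ${line//[[:space:]]/} ]] && lines+=("$line")
done < "$txt"
rm -f "$txt"

echo "count: ${#lines[@]}"
for i in "${!lines[@]}"; do
    # Array indexes start at 0, so VAR1 corresponds to lines[0].
    echo "VAR$((i + 1))=${lines[i]}"
done
```

Reading with while read also avoids the word splitting that for x in $(cat file) applies, which matters if a line contains spaces.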

Usage of AWK in Linux

please explain the line below used in shell scripts,
awk -F\| -v src=$storekey 'src==$41' $SRC_Path >> $DST_Path
Thanks!
OK, first: anything written as $variable or ${variable} is a shell variable, so storekey, SRC_Path and DST_Path would be defined earlier in your script, e.g.
storekey="1234"
you can try this on your shell (linux or command line terminal)
type:
$ storekey="foo"
$ echo $storekey
Most of your question comes down to the shell variables on the command line, which obscure what the command does; if you replace them with literal values, you can test the command and see what it is doing.
In essence awk is a stream-parsing tool: given a file of, say, 10 columns with a known delimiter such as "," or "|", you can ask awk to print specific columns, or only the lines that match a condition. That is what is happening here, but it is obscured by the shell variables.
To break down the command line: awk reads "|"-delimited input (-F\|); -v src=$storekey copies the shell variable storekey into an awk variable named src; the pattern src==$41 has no action attached, so awk prints every line whose 41st field equals src; the input is the file named by $SRC_Path, and >> appends the matching lines to the file named by $DST_Path.
If you could share more of the shell script I could provide a more in depth answer.
btw, you could also find out more information, using the commands
man awk
info awk
from your command line, however these are a bit arcane for those not so familiar with *nix variants.
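To see the filtering behaviour without the real data, the same kind of command can be tried on a tiny file. This is a sketch with made-up data: it compares field 3 instead of field 41 so the sample stays short, and src_file and matches are names invented for the example.

```shell
#!/usr/bin/env bash
# Three-field sample standing in for the real 41+ field input (made-up data).
src_file=$(mktemp)
printf '%s\n' 'a|x|1234' 'b|y|9999' 'c|z|1234' > "$src_file"

storekey=1234

# A pattern with no action prints every record for which it is true:
# here, records whose 3rd field equals the value passed in as src.
matches=$(awk -F'|' -v src="$storekey" 'src == $3' "$src_file")
echo "$matches"

rm -f "$src_file"
```

With this sample, the first and third lines are printed and the second is dropped.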

Using AWK and setting results to bash variables/arrays?

I have a file that replicates the results from show processlist command from mySQL.
The file looks like this:
*************************** 1. row ***************************
Id: 1
User: system user
Host:
db: NULL
Command: Connect
Time: 1030455
State: Waiting for master to send event
Info: NULL
*************************** 2. row ***************************
Id: 2
User: system user
Host:
db: NULL
Command: Connect
Time: 1004
State: Has read all relay log; waiting for the slave
I/O thread to update it
Info: NULL
And it keeps going on for a few more times in the same structure.
I want to use AWK to only get these parameters: Time,ID,Command and State, and store every one of these parameters into a different variable or array so that I can later use / print them in my bash shell.
The problem is, I am pretty bad with AWK: I don't know how to both separate the parameters I want from the file and also set them as a bash variable or array.
Many thanks in advance for the help!
EDIT: Here is my code so far
echo "Enter age"
read age
cat data | awk 'BEGIN{ RS="row"
FS="\n"
OFS="\n"}
{ print $2,$7}
' | awk 'BEGIN{ RS="Id"}
{if ($4 > $age){print $2}}'
The file 'data' contains blocks like I have pasted above. The code should, if the 'age' entered is smaller than the Time parameter in the data file (which is $4 in my awk code), return the ID parameter, but it returns nothing.
If I remove the if statement and print $4 instead of $2 this is my output
Enter age
1
1030455
1004
2144
2086
0
So I was thinking maybe that blank line is somehow messing up my AWK print? Is there a simple way to ignore that blank line while keeping my other data?
This is how you'd use awk to produce the values you want as a set of tab-separated fields on each line per "row" block from the input:
$ cat tst.awk
BEGIN {
RS="[*]+ [[:digit:]]+[.] row [*]+\n"
FS="\n"
OFS="\t"
}
NR>1 {
sub(/\n$/,"") # remove the trailing newline
gsub(/\n\s+/," ") # compress all multi-line fields into single lines
gsub(OFS," ") # ensure the only OFS in the output IS between fields
delete n2v
for (i=1; i<=NF; i++) {
name = gensub(/:.*/,"","",$i)
value = gensub(/^[^:]+:\s+/,"","",$i)
n2v[name] = value
}
if (n2v["Time"]+0 > age) { # force a numeric comparison
print n2v["Time"], n2v["Id"], n2v["Command"], n2v["State"]
}
}
$ awk -v age=2000 -f tst.awk file
1030455 1 Connect Waiting for master to send event
If the target age is already stored in a shell variable just init the awk variable from the shell variable of the same name:
$ age="2000"
$ awk -v age="$age" -f tst.awk file
The above uses GNU awk for multi-char RS (which you already had), gensub(), \s, and delete array.
When you say "and store every one of these parameters into a different variable or array" it could mean one of several things so I'll leave that part up to you but you might be looking for something like:
arr=( $(awk '...') )
or
awk '...' |
while IFS=$'\t' read -r Time Id Command State
do
<do something with those 4 vars>
done
but by far the most likely situation is that you don't want to use shell at all but instead just stay inside awk.
Remember - every time you write a loop in shell just to manipulate text you have the wrong approach. UNIX shell is an environment from which to call UNIX tools and the UNIX tool for general text manipulation is awk.
Until you edit your question to tell us more about your problem though, we can't guess what the right solution is from this point on.
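If you do end up wanting the values in the shell, here is a minimal sketch of the read loop fed with tab-separated rows. The rows variable stands in for the awk output; the second row is made up for illustration, and ids is a hypothetical array name. It assumes bash, since $'\t' and arrays are not POSIX sh.

```shell
#!/usr/bin/env bash
# Stand-in for the awk output: Time, Id, Command, State separated by tabs.
# printf reuses the format for each group of four arguments.
rows=$(printf '%s\t%s\t%s\t%s\n' \
    1030455 1 Connect 'Waiting for master to send event' \
    2144 3 Query 'Sending data')

ids=()
# $'\t' is a real tab character; "\t" would be a backslash and the letter t.
while IFS=$'\t' read -r Time Id Command State; do
    echo "Id=$Id Time=$Time Command=$Command State=$State"
    ids+=("$Id")
done <<< "$rows"

echo "collected ids: ${ids[*]}"
```

Feeding the loop through a redirection rather than a pipe keeps it in the current shell, so ids is still populated after the loop ends.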
At the first level you have your shell, which you use to run any other child process. It is impossible to modify the parent's environment from within a child process. When you run your bash script file (which has the +x permission) it is spawned as a new (child) process. It can set its own environment, but when it exits you are back in the original (parent) environment.
You can set some variables in bash and export them to its environment. They will be inherited by its children. However, it can't be done in the opposite direction (a parent can't inherit from its child).
If you wish to execute some commands from the script file in the current bash's context you can source the script file. source ./your_script.sh or . ./your_script.sh will do that for you.
If you need to run awk to filter some data for you and keep results in the bash you can do:
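read foo < <(awk ...)
This works because read is a shell builtin rather than an external process, so it sets foo in the current shell (check type read, help, help read, man bash to verify it yourself). Note that piping into it, as in awk ... | read foo, does not work in bash by default, because each part of a pipeline runs in a subshell and foo is lost when the pipeline ends.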
or:
foo=`awk ....`
There are many other constructions you can use. Whatever bash script you write, please check your code against the Bash Pitfalls web page.
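The difference between these constructions is easy to see in a short experiment. A minimal sketch, assuming bash with default options (lastpipe off); the variable names piped, captured and from_procsub are made up for the example:

```shell
#!/usr/bin/env bash
# Each part of a pipeline runs in a subshell by default, so a variable
# set by read at the end of a pipe is not visible afterwards.
unset piped
printf 'a b c\n' | awk '{ print $2 }' | read -r piped
echo "after pipe: '${piped:-}'"

# Command substitution captures the output in the current shell.
captured=$(printf 'a b c\n' | awk '{ print $2 }')
echo "captured: '$captured'"

# Process substitution lets read itself run in the current shell.
read -r from_procsub < <(printf 'a b c\n' | awk '{ print $2 }')
echo "procsub: '$from_procsub'"
```

Command substitution is the portable choice; process substitution is bash-specific but handy when read needs to split the output into several variables.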

unix - get substring of a file name

If I have a folder called myfiles/ which has a bunch of python files in it, in a shell script like the following:
for k in myfiles/*.py
do
// code here?
done
How do I print for each k a string that's just --name-of-file--.py ?
If I do
echo $k
as is, it prints myfiles/--name-of-file--.py
I'm very new to shell scripting, but it seems like the cut function attempts to cut the contents of the file and not just the file name (and I don't really know how to use cut).
To be clear, I'd like to know how to get rid of the folder name when printing.
basename "$k"
Or if you want to avoid spawning so many processes, this is more efficient:
echo "${k##*/}"
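Side by side, the two approaches look like this. It is a small sketch with an illustrative path; the stem line goes one step beyond the question and is only there to show the matching suffix operator.

```shell
#!/usr/bin/env bash
k='myfiles/name-of-file.py'      # illustrative path

with_basename=$(basename "$k")   # external process
with_expansion=${k##*/}          # pure shell: strip the longest prefix ending in "/"
stem=${with_expansion%.py}       # optionally drop the extension too

echo "$with_basename"
echo "$with_expansion"
echo "$stem"
```

${k##*/} removes the longest match of */ from the front of the value, which is why it also works for deeper paths such as a/b/c.py.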
