/usr/bin/cut: Argument list too long in bash script

I loaded a CSV file into a variable and then tried to cut out some columns, which results in the error /usr/bin/cut: Argument list too long. Here is what I did:
if [ $# -ne 1 ]; then
echo "you need to add the file name as argument"
fi
echo "input file name $1"
input_file=$(<$1)
#cut the required columns.
cut_input_file=$(cut -f 1,2,3,5,8,9,10 -d \| $input_file)
echo $(head $cut_input_file)
What am I missing?

The error occurs because $input_file holds the entire content of the file, so the unquoted expansion hands every word of the file to cut as a separate argument.
You need to run cut on the file, not on the file content, so use:
cut -f 1,2,3,5,8,9,10 -d '|' "$1"
To run cut against file content use:
cut -f 1,2,3,5,8,9,10 -d '|' <<< "$input_file"
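Putting the two variants together, a minimal sketch of the corrected script could look like the following. The function names cut_columns and cut_columns_from_string are hypothetical; the field list and delimiter are the ones from the question. The second function pipes the variable into cut, which is a portable alternative to the bash-only here-string.

```shell
# cut_columns FILE: print the selected '|'-separated columns of FILE.
# (cut_columns is a hypothetical helper name, not part of the question.)
cut_columns() {
    if [ "$#" -ne 1 ]; then
        echo "you need to add the file name as argument" >&2
        return 1
    fi
    # Pass the file NAME to cut; cut opens and reads the file itself.
    cut -d '|' -f 1,2,3,5,8,9,10 "$1"
}

# cut_columns_from_string DATA: same columns, but for data already held in
# a variable. The data travels on standard input, never on the command line.
cut_columns_from_string() {
    printf '%s\n' "$1" | cut -d '|' -f 1,2,3,5,8,9,10
}
```

A call such as cut_columns "$1" | head then replaces the last two commands of the original script.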

Related

Linux: append all filenames in path to text file

I want to add the filenames of all files of a certain type (*.cub) in the path to a text file in the same path. This file will become the batch (.submit) file that I can run overnight. I also need to adapt the name a bit.
I do not really know how to describe it better, so I'll give an example:
Let's say I have three files: 001.cub, 002.cub & 003.cub
Then the final text file must be:
[program] -i 001.cub -o 001.vdb
[program] -i 002.cub -o 002.vdb
[program] -i 003.cub -o 003.vdb
It seems a fairly easy operation, but I simply can't get it right.
Also, it really has to become a .submit (or at least some text) file. I cannot run the program immediately.
I hope someone can help!

A simple for loop will do the job:
for i in *.cub
do
b=$(basename "$i" .cub)
echo "program -i \"$b.cub\" -o \"$b.vdb\""
done >output.txt

1. Create an empty sh file.
2. Loop through the *.cub files.
3. Store the sequence (the part of the name before the first dot).
4. echo the required string and append it to the sh file from step 1.
echo -n "" > 'Run.sh'
for filename in *.cub
do
sequence=$(echo "$filename" | cut -d "." -f1)
echo "Program -i $filename -o $sequence.vdb" >> Run.sh
done
Alternatively, redirect the output of the whole loop into the file:
for filename in *.cub
do
sequence=$(echo "$filename" | cut -d "." -f1)
echo "Program -i $filename -o $sequence.vdb"
done > Run.sh
To keep everything before the extension in the variable:
for filename in *.cub
do
sequence=$(echo "$filename" | rev | cut -d "." -f2- | rev)
echo "Program -i $filename -o $sequence.vdb"
done > Run.sh
To extract only the digits from the filename and use them:
for filename in *.cub
do
sequence=$(echo "$filename" | sed 's/[^0-9]*//g')
echo "Program -i $filename -o $sequence.vdb"
done > Run.sh

This oneliner will do what you want:
ls *.cub | sort | awk '{split($1,x,"."); print "[program] -i "$1" -o "x[1]".vdb "}' > something.sh
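The answers above can also be written without any external command, as a sketch: the shell's own suffix removal ${f%.cub} strips the extension, and a glob replaces ls. The function name write_submit_file is hypothetical, and program is the placeholder used in the question.

```shell
# write_submit_file OUT: write one 'program -i X.cub -o X.vdb' line per
# *.cub file in the current directory to OUT.
# (write_submit_file is a hypothetical name; "program" is a placeholder.)
write_submit_file() {
    for f in *.cub; do
        [ -e "$f" ] || continue      # the glob matched nothing
        # ${f%.cub} is the file name with the trailing ".cub" removed.
        printf 'program -i "%s" -o "%s.vdb"\n' "$f" "${f%.cub}"
    done > "$1"
}
```

Calling write_submit_file batch.submit in the directory holding the .cub files produces the batch file. Quoting the names in the output keeps the lines valid even if a file name contains spaces.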

grep lines containing specific string (a line can be written on max 3 lines)

I need to collect all the logging calls made in my project.
I'm using this command to do that:
grep -rnw $1 -e "Logger.[view]*;$" >> log.txt
This returns all lines containing Logger.[one of these characters] in the project directory "$1".
However, some log statements are split across 2 or 3 lines (IDE formatting), and in that case I get only the first line.
How can I get the complete text of such a log statement, given that a log statement always ends with ");"?
example of such line :
Logger.v(xxxxxxxxxxxxx
xxxxxxxxxxxxxxxx);
Here is my script:
#!/bin/bash
echo "Hello Logger!"
# get project path
echo "project directory is $1"
# get all project logs and store them into temporary file tmp.txt for processing
grep -rnw $1 -e "Logger.[view]" >> tmp.txt
echo "tmp.txt created successfully"
# remove package name from previous result and store result into log.txt
sed -r 's/.{52}//' tmp.txt >> log.txt
echo "log.txt created successfully"
The grep command returns file_path/file_name:line_number:line.
I found this command, which returns the complete statement even when it is written on 2 or 3 lines, but without the file path, file name and line number:
sed -n '/Logger.[viewd]/{:start /;/!{N;b start};/Logger.[viewd]/p}' Main.java
Is there a way to combine those two results?
example :
/home/xxx/xxx/xxx/Main.java:97:Logger.i(xxxxxxxxxxxxx);
/home/xxx/xxx/xxx/Main.java:106:Logger.d(yyyyyyyyyyyy
yyyyyyyyyyyyyyyyyyyy);

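I think this is a line-break problem. Try replacing grep -rnw $1 -e "Logger.[view]" >> tmp.txt with the following lines:
for i in "$1"/*
do
[ -f "$i" ] || continue
tr '\n' ' ' < "$i" | grep -nw -e "Logger.[view]" >> tmp.txt
done
Here, tr '\n' ' ' replaces each line break with a space. The -r option is dropped because grep now reads standard input. Note that each file becomes one long line, so a match prints the whole file, and the file name and original line numbers are lost.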

I found a solution for my problem and here is my code:
# get all project logs and store them into log.txt for processing
for i in $(find -name "*.java")
do
echo >> log.txt
echo "**************** file $i ********************************" >> log.txt
echo >> log.txt
grep -rnw 'Logger.[viewd]' "$i" | while read -r line ; do
# remove breaklines from first line to avoid having bad results
line="$(echo $line | sed $'s/\r//')"
# if first line ends with ");" print it to log file
if [[ ${line: -2} == ");" ]]; then
echo $line >> log.txt
# else get next line also
else
# get second line number
line_number="$(echo "$line" | cut -d : -f1)"
next_line_number=$((line_number+1))
# get second line
next_line=$(sed "${next_line_number}q;d" $i | sed -e 's/^[ \t]*//')
# concatenate first line & second line
line="$line $next_line"
# print resulting line to log file
echo $line >> log.txt
fi
done
done
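As an alternative to fetching the next line by number, the joining can be done in a single awk pass that also keeps the file name and starting line number, as the question asked. This is a sketch under the question's assumption that every log statement ends with ");"; the function name find_logger_calls is hypothetical.

```shell
# find_logger_calls DIR: print "file:line:statement" for every Logger call
# in the *.java files under DIR, joining continuation lines until the
# statement ends with ");". (find_logger_calls is a hypothetical name.)
find_logger_calls() {
    find "$1" -type f -name '*.java' -exec awk '
        FNR == 1 { buf = "" }              # reset at the start of each file
        { sub(/\r$/, "") }                 # drop a trailing carriage return
        # Start of a statement: remember the line and where it began.
        buf == "" && /Logger\.[viewd]\(/ {
            buf = $0; sub(/^[ \t]+/, "", buf); start = FNR
        }
        # Continuation line: append it without its leading indentation.
        buf != "" && FNR > start {
            t = $0; sub(/^[ \t]+/, "", t); buf = buf " " t
        }
        # End of the statement: emit it in grep -n style and reset.
        buf != "" && /\);[ \t]*$/ {
            print FILENAME ":" start ":" buf; buf = ""
        }
    ' {} +
}
```

Running find_logger_calls "$1" >> log.txt would then write lines in the desired /path/Main.java:106:Logger.d(...); form, with statements of any number of lines joined into one.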

concatenate the result of echo and a command output

I have the following code:
names=$(ls *$1*.txt)
head -q -n 1 $names | cut -d "_" -f 2
where the first line finds and stores all names matching the command line input into a variable called names, and the second grabs the first line in each file (element of the variable names) and outputs the second part of the line based on the "_" delim.
This is all good, however I would like to prepend the filename (stored as lines in the variable names) to the output of cut. I have tried:
names=$(ls *$1*.txt)
head -q -n 1 $names | echo -n "$names" cut -d "_" -f 2
however this only prints out the filenames
I have tried
names=$(ls *$1*.txt)
head -q -n 1 $names | echo -n "$names"; cut -d "_" -f 2
and again I only print out the filenames.
The desired output is:
$
filename1.txt <second character>
where there is a single whitespace between the filename and the result of cut.
Thank you.

Best approach, using awk
You can do this all in one invocation of awk:
awk -F_ 'NR==1{print FILENAME, $2; exit}' *"$1"*.txt
On the first line of the first file, this prints the filename and the value of the second column, then exits.
Pure bash solution
I would always recommend against parsing ls; use a glob in a loop instead.
You can also avoid awk for reading the first line of the file by using bash built-in functionality:
for i in *"$1"*.txt; do
IFS=_ read -ra arr <"$i"
echo "$i ${arr[1]}"
break
done
Here we read the first line of the file into an array, splitting it into pieces on the _.
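Both snippets above stop after the first matching file. If the goal is one output line per matching file, as the head -q in the question suggests, a loop without the break could look like this sketch. The function name print_first_fields is hypothetical.

```shell
# print_first_fields PATTERN: for every file matching *PATTERN*.txt, print
# the file name and the second "_"-separated field of its first line.
# (print_first_fields is a hypothetical name.)
print_first_fields() {
    for f in *"$1"*.txt; do
        [ -f "$f" ] || continue      # skip when the glob matched nothing
        # read consumes only the first line; IFS=_ splits it on "_" and the
        # third variable soaks up everything after the second field.
        IFS=_ read -r first second rest < "$f"
        printf '%s %s\n' "$f" "$second"
    done
}
```

The awk equivalent is awk -F_ 'FNR == 1 { print FILENAME, $2 }' *"$1"*.txt, where FNR restarts at 1 for each input file.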

Maybe something like that will satisfy your need BUT THIS IS BAD CODING (see comments):
#!/bin/bash
names=$(ls *$1*.txt)
for f in $names
do
pattern=`head -q -n 1 $f | cut -d "_" -f 2`
echo "$f $pattern"
done

If I didn't misunderstand your goal, this also works.
I've always done it this way, I just found out that this is a deprecated way to do it.
#!/bin/bash
names=$(ls *"$1"*.txt)
for e in $names;
do echo $e `echo "$e" | cut -c2-2`;
done

Bash substring from position not printing

I am using the following format ${string:start:length} to extract the file name from wget's .listing file, line by line.
The format for the file is something I think we are all familiar with:
04-30-13 01:41AM 7033614 some_archive.zip
04-29-13 08:13PM <DIR> DIRECTORY NAME 1
04-29-13 05:41PM <DIR> DIRECTORY NAME 2
All file names start at pos:40, so setting :start to 39, with no :length should (and does) return the file name for each line:
#!/bin/bash
cat .listing | while read line; do
file="${line:39}"
echo $file
done
Correctly returns:
some_archive.zip
DIRECTORY NAME 1
DIRECTORY NAME 2
However, if I get any more creative, it breaks:
#!/bin/bash
cat .listing | while read line; do
file="${line:39}"
dir=$(echo $line | egrep -o '<DIR>' | head -n1)
if [ $dir ]; then
echo "the file $file is a $dir"
fi
done
Returns:
$ ./test.sh
is a <DIR>ECTORY NAME 1
is a <DIR>ECTORY NAME 2
What gives? I lose "the file " and the rest of the test looks like it prints on top of "the file DIRECTORY NAME 1" from pos:0.
It's weird, what's it on account of?

The answer, as I am learning more and more as I progress with Linux, is non-printing control characters.
Adding a pipe to egrep for only printing characters solved the problem:
#!/bin/bash
cat .listing | while read line; do
file=$(echo ${line:39} | egrep -o '[[:print:]]+' | head -n1)
dir=$(echo $line | egrep -o '<DIR>' | head -n1)
if [ $dir ]; then
echo "the file $file is a $dir"
fi
done
Correctly returns:
$ ./test.sh
the file DIRECTORY NAME 1 is a <DIR>
the file DIRECTORY NAME 2 is a <DIR>
I wish there were a better way to visualize these control characters, but what the above does is take the string segment, pull out the first run of printable characters, and assign it to the variable.
I assume there is a control character at the end of each line that returns the cursor to the beginning of the line, causing the rest of the echo output to be printed there and overwrite the previous characters.
Odd.
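On visualizing such characters: od -c prints every byte of its input and shows control characters as escapes, so a carriage return appears as \r. A small sketch, with the hypothetical helper name show_bytes:

```shell
# show_bytes FILE: dump the first line of FILE byte by byte so that control
# characters such as a carriage return (\r) become visible.
# (show_bytes is a hypothetical name.)
show_bytes() {
    head -n 1 "$1" | od -c
}
```

Running show_bytes .listing should reveal a \r just before the final \n if the file has DOS line endings. With GNU or BSD cat, cat -v .listing is a quicker check that prints each carriage return as ^M.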

You can remove the \r (carriage return) characters from the whole file by piping it through tr before the while loop:
#!/bin/bash
cat .listing | tr -d '\015' | while read line; do
file="${line:39}"
dir=$(echo $line | egrep -o '<DIR>' | head -n1)
if [ $dir ]; then
echo "the file $file is a $dir"
fi
done
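The same result can be had from a single awk process instead of a shell loop with egrep and head per line. This is a swapped-in technique, not the method used above, and it assumes as in the question that names start at offset 39, which is character 40 in awk's 1-based substr. The function name list_dirs is hypothetical.

```shell
# list_dirs FILE: print "the file NAME is a <DIR>" for each directory entry
# in a wget-style listing. Assumes names start at column 40 (offset 39).
# (list_dirs is a hypothetical name.)
list_dirs() {
    awk '
        { sub(/\r$/, "") }                 # strip a trailing carriage return
        /<DIR>/ { print "the file " substr($0, 40) " is a <DIR>" }
    ' "$1"
}
```

Called as list_dirs .listing, it avoids both the unquoted echo $line, which collapses runs of spaces, and the per-line subprocesses.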

Shell Script: Read the first line of properties file separately from the rest of the lines

I am writing a shell script that reads a properties file and performs some operations.
That is, it reads from the first line of the properties file.
Now I want to add a switch to this script: if it is ENABLED, the script executes and performs the regular operation.
If it is DISABLED, the program exits normally.
I want to put this switch in the same properties file (i.e. the first line of the file will now be either ENABLED or DISABLED).
Currently I'm using:
cat init_token.properties | while read line
Before this, I want to read the value of the switch separately; then, if it is ENABLED, the while read line loop should start from the second line of the properties file.
In a nutshell, I want to separate the reading of the first line from the rest.
Format of init_token.properties:
ENABLED
abc.dat IP 120.210.60.1
xyz.dat PORT 8200
pqr.dat IP 420.24012.4
Script:
#!/bin/ksh
dos2unix init_token.properties &
# PATH for DAT files
DAT_FILE_PATH='.'
cat init_token.properties | while read line
do
# PARAMETER EXAMPLE - <FILENAME> <ATTRIBUTE> <VALUE>
# read FILENAME
FILENAME=`echo "$line" | awk -F " " '{print $1}'`
# read ATTRIBUTE
ATTRIBUTE=`echo "$line" | awk -F " " '{print $2}'`
# read VALUE
VALUE=`echo "$line" | awk -F " " '{print $3}'`
# setting fully qualified filepath name & temporary file
FULLPATH=$DAT_FILE_PATH"/"$FILENAME
TEMP_FILE=tempfile
old='$('$FILENAME'_'$ATTRIBUTE')'
# replace $(<FILEANME>_<ATTRIBUTE>) with VALUE if file exists
if [ -e $FULLPATH ]
then
sed 's/'$old'/'$VALUE'/g' $FULLPATH > $TEMP_FILE && mv $TEMP_FILE $FULLPATH
else
echo 'File '$FULLPATH' does not exists while replacing token '$old
fi
done
exit

Something like this, perhaps?
let CNTR=0
cat init_token.properties | while read line
do
let CNTR=CNTR+1
if [ "$CNTR" -eq 1 ]; then
# is first line
:
else
# is not first line
:
fi
# PARAMETER EXAMPLE - <FILENAME> <ATTRIBUTE> <VALUE>
# read FILENAME
FILENAME=`echo "$line" | awk -F " " '{print $1}'`
# ... (rest of the loop body as in the question)
done

You could try something like this at the top of your script:
CHECK=$(head -n 1 prop.file)
if [ "$CHECK" = "DISABLED" ]; then
exit 0
fi

First you can split the line with read, so you don't need to use echo | awk:
cat init_token.properties | while read filename attribute value
do
Next are the checks for ENABLED/DISABLED/other:
case "$filename" in
ENABLED) ;;
DISABLED) exit ;;
*)
# It's another line, do processing
...
;;
esac
done
Another point: don't put dos2unix ... in the background. It may run longer than your script does. Just call it without &:
dos2unix init_token.properties
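Another way to separate the first line from the rest, as a sketch: redirect the file into a command group, so that one read consumes the switch and the while loop continues from the second line of the same stream. The function name process_properties is hypothetical, the echo stands in for the real token replacement, and the file is assumed to have been converted by dos2unix already.

```shell
# process_properties FILE: read the ENABLED/DISABLED switch from the first
# line, then handle the remaining "<FILENAME> <ATTRIBUTE> <VALUE>" lines.
# (process_properties is a hypothetical name.)
process_properties() {
    {
        # This read consumes only line 1 of the file.
        read -r switch
        if [ "$switch" != "ENABLED" ]; then
            return 0                 # DISABLED (or anything else): stop quietly
        fi
        # The loop starts at line 2 because it shares the same input stream.
        while read -r filename attribute value; do
            [ -n "$filename" ] || continue     # skip blank lines
            # Placeholder for the sed-based replacement from the question.
            echo "file=$filename attribute=$attribute value=$value"
        done
    } < "$1"
}
```

Because the file is opened once for the whole group, no line counter or head call is needed, and the loop runs in the current shell rather than in a pipeline subshell.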
