bash insert a new line before specific pattern - string

I have a csv file with the following contents:
"user1","track1","player1" "user1","track2","player2" "user2","track1","player1""user2","track2","player2".....
I need to insert a new line before this pattern: "user.
Actually at the beginning I replaced every space with a new line with this command
awk -v RS=" " '{print}' myfile >output.csv
but then I found that at some points in my file there is no space before some users, and when I imported it into the DB the column values were swapped at those points... :|. So I was wondering if someone knows how I could insert a new line before a specific set of characters to avoid that problem.
Thanks,

With GNU sed you can try this:
sed -r 's/(.)("user)/\1\n\2/g' myfile >output.csv
With BSD/OSX sed (which doesn't support the escape sequence \n in the replacement string), use an ANSI C-quoted string:
sed -E $'s/(.)("user)/\\1\\\n\\2/g' myfile >output.csv
# Alternative, with the ANSI C-quoted string spliced in only where needed.
sed -E 's/(.)("user)/\1\'$'\n''\2/g' myfile >output.csv
With a strictly POSIX-compliant sed, use a literal newline instead of escape sequence \n:
sed 's/\(.\)\("user\)/\1\
\2/g' myfile >output.csv
Note that with your sample input some lines will have a trailing space.
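To deal with those trailing spaces as well, the POSIX form above can be followed by a second sed pass. This is only a sketch: the function name split_users is made up for illustration, and it assumes that "user never occurs inside a field value.

```shell
# Hypothetical helper: reads the flattened CSV on stdin and writes one
# record per line on stdout.
split_users() {
  # Pass 1: put a newline before every "user that is not at the start of a line.
  # Pass 2: strip the spaces left at the end of each resulting line.
  sed 's/\(.\)\("user\)/\1\
\2/g' | sed 's/ *$//'
}

# Example run on a line with one missing separator space.
printf '%s\n' '"user1","track1","player1" "user1","track2","player2""user2","track1","player1"' | split_users
```

Two sed processes are used because the second one then sees each new record as its own line, so a plain `$` anchor is enough.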

With awk
awk '{gsub(/"user/,"\n\"user"); print}' file
That will produce a leading newline (an empty first line) if a line starts with "user
If you want to get rid of that, you can do:
awk '{gsub(/"user/,"\n\"user"); sub(/^\n/,""); print}' file

Related

How to use sed to remove a special character that appears inside double quotes in a Linux file

I have a txt file delimited by commas (,), with each column enclosed in double quotes.
What I want to do is:
keep the comma as the delimiter, but remove every comma that appears inside a pair of double quotes (since each column is surrounded by double quotes).
Sample of the input and of the output file I want.
Input file:
"2022111812160156601777153","","","false","test1",**"here the , issue , that comma comma come inside the column"**
The output I want:
"2022111812160156601777153","","","false","test1",**"here the issue that comma comma come inside the column"**
What I tried:
sed -i ':a' -e 's/\("[^"]*\),\([^"]*"\)/\1~\2/;ta' test.txt
but the above sed command replaces every comma, not only the commas that appear inside a column.
Is there a way to do it?
Using sed
$ sed -Ei.bak ':a;s/((^|,)(\*+)?"[^"]*),/\1/;ta' input_file
"2022111812160156601777153","","","false","test1",**"here the issue that comma comma come inside the column"**
Any time you find yourself using more than s, g, and p (with -n) in sed you'd be better off using awk for some combination of clarity, robustness, efficiency, portability, etc.
Using any awk in any shell on every Unix box:
$ awk 'BEGIN{FS=OFS="\""} {for (i=2; i<=NF; i+=2) gsub(/,/,"",$i)} 1' file
"2022111812160156601777153","","","false","test1",**"here the issue that comma comma come inside the column"**
Just like GNU sed has -i as in your question to update the input file with the command's output, GNU awk has -i inplace, or just add > tmp && mv tmp file with any awk or any other Unix command.
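As a rough sketch of the "> tmp && mv tmp file" approach, the awk command above could be wrapped as follows. The function name strip_quoted_commas is hypothetical, and mktemp is assumed to be available (it is not part of POSIX).

```shell
# Hypothetical helper: remove commas inside double-quoted fields of the
# file named by $1, replacing the file only if awk succeeded.
strip_quoted_commas() {
  file=$1
  tmp=$(mktemp) || return 1
  # With FS set to a double quote, the even-numbered fields are the
  # contents of the quoted columns.
  awk 'BEGIN{FS=OFS="\""} {for (i=2; i<=NF; i+=2) gsub(/,/,"",$i)} 1' "$file" > "$tmp" &&
    mv "$tmp" "$file"
}
```

Usage would be strip_quoted_commas test.txt. Note that after the mv the file carries the temporary file's permissions, so restore them if that matters.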
This might work for you (GNU sed):
sed -E ':a;s/^(("[^",]*"\**,?\**)*"[^",]*),/\1/;ta' file
This iterates through each line removing any commas within paired double quoted fields.
N.B. The solution above also caters for double-quoted fields prefixed/suffixed by zero or more *'s. If this does not need to be catered for, here is a simpler solution:
sed -E ':a;s/^(("[^",]*",?)*"[^",]*),/\1/;ta' file
N.B. Escaped double quotes and commas would need a more involved regexp.

SED: Insert a string with special character

I want to INSERT a string with "'" as special character in multiple files. All the files in which I want to insert have a line after which I want to perform my INSERT.
eg:
File Before INSERT:
...
FROM LOCAL :LOAD_FILE
REJECTED DATA :REJECT_PATH
...
File After INSERT:
...
FROM LOCAL :LOAD_FILE
DELIMITER AS '|'
REJECTED DATA :REJECT_PATH
...
I've tried writing down many SED commands but they are generating errors. One of them is:
sed 'LOAD_FILE/a/ DELIMITER AS \'\|\'/g' SOURCE > DESTINATION
awk -v line='DELIMITER AS '"'|'"'' '1; /LOAD_FILE/{print line }' input
FROM LOCAL :LOAD_FILE
DELIMITER AS '|'
REJECTED DATA :REJECT_PATH
Using surrounding double quotes:
sed "/FROM LOCAL :LOAD_FILE/s//&\nDELIMITER AS '|'/" file
or single quotes (safer to avoid unwanted variable expansion):
sed '/FROM LOCAL :LOAD_FILE/s//&\nDELIMITER AS '"'|'"'/' file
This might work for you (GNU sed):
sed '/LOAD_FILE/aDELIMITER AS '\'\|\' file
This appends the line DELIMITER AS '|' after each line matching LOAD_FILE.
N.B. The sed command is in two parts: the single-quoted /LOAD_FILE/aDELIMITER AS is concatenated with the unquoted \'\|\', which the shell turns into '|'.
If you prefer:
sed 's/LOAD_FILE/&\nDELIMITER AS '\'\|\''/' file
Another way of putting it :
sed -e ':a;N;$!ba;s/LOAD_FILE\n/LOAD_FILE\nDELIMITER AS \x27|\x27\n/g'
about syntax I used :
How can I replace a newline (\n) using sed?
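For a sed without the GNU one-line a extension or \n in the replacement, the POSIX a\ form with a literal newline should also work. A minimal sketch, with insert_delimiter being a made-up name:

```shell
# Hypothetical helper: copy stdin to stdout, adding the DELIMITER line
# after every line that contains LOAD_FILE.
insert_delimiter() {
  # Inside double quotes, \\ followed by a newline reaches sed as a
  # backslash plus a real newline, which is the POSIX "a\" syntax.
  sed "/LOAD_FILE/a\\
DELIMITER AS '|'"
}

printf '%s\n' 'FROM LOCAL :LOAD_FILE' 'REJECTED DATA :REJECT_PATH' | insert_delimiter

# For several files, something along these lines (file names are placeholders):
#   for f in file1 file2; do insert_delimiter < "$f" > "$f.new" && mv "$f.new" "$f"; done
```

Using double quotes around the script means the single quotes in '|' need no escaping at all.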

Sed error "sed: unmatched '/'"

I have a file containing data with the following format:
{"parameter":"toto.tata.titi", "value":"0/2", "notif":"1"}
I make a change on the file with sed:
sed -i "/\<$param\>/s/.*/$line/" myfile
where the line variable is
{"parameter":"toto.tata.titi", "value":"0/2", "notif":"3"}
and the param variable is toto.tata.titi
The above sed command returns the error:
sed: unmatched '/'
because the line variable contains a / ==> "0/2"
How can I update my sed command to make it work even if the line variable contains a /?
Your $param or $line may contain a /, which causes a sed syntax error. Consider using another delimiter such as | or #.
Example:
sed -i "/\<$param\>/s|.*|$line|" myfile
But that may not be enough. You can also escape the slashes when the variable expands:
sed -i "/\<${param//\//\\/}\>/s|.*|$line|" myfile
Other safer characters can be used too:
d=$'\xFF' ## Or d=$(printf '\xFF') which is compatible in many shells.
sed -i "/\<${param//\//\\/}\>/s${d}.*${d}${line}${d}" myfile
Check whether your $param or $line variables contain \n (a newline); in my case, that was the cause of the error.
I had the same error when I used the command sed -i.bak 's/\r$//' in my Ansible task. I solved it by escaping the '\' in '\r', changing the command to sed -i.bak 's/\\r$//'
This will robustly only operate on lines containing the string (not regexp) toto.tata.titi:
awk -v param="$param" -v line="$line" 'index($0,param){$0 = line} 1' file
and will be unaffected by any chars in param or in line other than backslashes. If you need to be able to process backslashes as-is, the shell variables just need to be moved to the file name list and the awk variables populated from them in the BEGIN section instead of by -v assignment:
awk 'BEGIN{param=ARGV[1]; line=ARGV[2]; ARGV[1]=ARGV[2]=""} index($0,param){$0 = line} 1' "$param" "$line" file
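Another way to get the strings into awk untouched, not taken from the answers above, is to pass them through the environment and read them from ENVIRON, which does no backslash processing. A sketch, with replace_line as a hypothetical wrapper that writes the result to stdout:

```shell
# Hypothetical helper: replace_line PARAM NEWLINE FILE
# Prints FILE with every line containing the literal string PARAM
# replaced by NEWLINE.
replace_line() {
  # The assignments before awk are exported to that one command only.
  param=$1 line=$2 awk '
    index($0, ENVIRON["param"]) { $0 = ENVIRON["line"] }  # literal match, no regexp
    1                                                     # print every line
  ' "$3"
}
```

Called as replace_line "$param" "$line" myfile > tmp && mv tmp myfile, this leaves /, & and backslashes in either variable as they are.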

Replace whitespace with a comma in a text file in Linux

I need to edit a few text files (an output from sar) and convert them into CSV files.
I need to change every whitespace (maybe it's a tab between the numbers in the output) using sed or awk functions (an easy shell script in Linux).
Can anyone help me? Every command I used didn't change the file at all; I tried gsub.
tr ' ' ',' <input >output
Substitutes each space with a comma. If you need to, you can make a pass with the -s flag (squeeze repeats), which replaces each sequence of a repeated character listed in SET1 (here the blank space) with a single occurrence of that character.
Using squeeze repeats on tabs before substituting them:
tr -s '\t' <input | tr '\t' ',' >output
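Both steps can probably be combined into a single tr call, since with two sets -s squeezes the characters of the second set after translation. A small sketch (blanks_to_commas is an illustrative name):

```shell
# Hypothetical helper: turn every run of spaces and/or tabs on stdin
# into a single comma.
blanks_to_commas() {
  # Space and tab both map to a comma; -s then squeezes repeated commas.
  tr -s ' \t' ',,'
}

printf 'a  b\t\tc \td\n' | blanks_to_commas
```

One side effect to keep in mind: repeated commas that were already present in the input get squeezed too.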
Try something like:
sed -E 's/[[:space:]]+/,/g' orig.txt > modified.txt
The character class [[:space:]] will match all whitespace (spaces, tabs, etc.). If you just want to replace a single character, e.g. just space, use that only.
EDIT: Actually [[:space:]] includes carriage return, so this may not do what you want. The following will replace tabs and spaces.
sed -E 's/[[:blank:]]+/,/g' orig.txt > modified.txt
as will (with GNU sed, which understands \t)
sed -E 's/[\t ]+/,/g' orig.txt > modified.txt
In all of this, you need to be careful that the items in your file that are separated by whitespace don't contain their own whitespace that you want to keep, eg. two words.
without looking at your input file, only a guess
awk '{$1=$1}1' OFS=","
redirect to another file and rename as needed
What about something like this :
cat texte.txt | sed -e 's/\s/,/g' > texte-new.txt
(Yes, with some useless catting and piping ; could also use < to read from the file directly, I suppose -- used cat first to output the content of the file, and only after, I added sed to my command-line)
EDIT: as @ghostdog74 pointed out in a comment, there's definitely no need for the cat/pipe; you can give the name of the file to sed:
sed -e 's/\s/,/g' texte.txt > texte-new.txt
If "texte.txt" is this way :
$ cat texte.txt
this is a text
in which I want to replace
spaces by commas
You'll get a "texte-new.txt" that'll look like this :
$ cat texte-new.txt
this,is,a,text
in,which,I,want,to,replace
spaces,by,commas
I wouldn't just replace the old file with the new one (that could be done with sed -i, if I remember correctly; and as @ghostdog74 said, it can create a backup on the fly): keeping the original might be wise as a safety measure (even if it means having to rename it to something like "texte-backup.txt")
This command should work:
sed "s/\s/,/g" < infile.txt > outfile.txt
Note that you have to redirect the output to a new file. The input file is not changed in place.
sed can do this:
sed 's/[\t ]/,/g' input.file
That will send to the console,
sed -i 's/[\t ]/,/g' input.file
will edit the file in-place
Here's a Perl script which will edit the files in-place:
perl -i.bak -lpe 's/\s+/,/g' files*
Consecutive whitespace is converted to a single comma.
The original of each input file is kept with a .bak suffix
These command-line options are used:
-i.bak edit in-place and make .bak copies
-p loop around every line of the input file, automatically print the line
-l removes newlines before processing, and adds them back in afterwards
-e execute the perl code
If you want to replace an arbitrary sequence of blank characters (tab, space) with one comma, use the following:
sed -r 's/[\t ]+/,/g' input_file > output_file
or
sed -r 's/[[:blank:]]+/,/g' input_file > output_file
If some of your input lines include leading space characters which are redundant and don't need to be converted to commas, then first you need to get rid of them, and then convert the remaining blank characters to commas. For such a case, use the following:
sed -r 's/^ +//' input_file | sed -r 's/[\t ]+/,/g' > output_file
This worked for me.
sed -e 's/\s\+/,/g' input.txt >> output.csv
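Pulling the points from the answers above together, a version that avoids the GNU-only \s, \t and \+ might look like the sketch below. The name to_csv is made up, and it assumes that fields never contain blanks of their own.

```shell
# Hypothetical helper: convert blank-separated columns on stdin to CSV.
to_csv() {
  sed -e 's/^[[:blank:]]*//' \
      -e 's/[[:blank:]]*$//' \
      -e 's/[[:blank:]][[:blank:]]*/,/g'
  # 1st expression: drop leading blanks
  # 2nd expression: drop trailing blanks
  # 3rd expression: each remaining run of spaces/tabs becomes one comma
}

printf '  12:00:01\tall   0.50\t 99.50  \n' | to_csv
```

Writing the run as [[:blank:]][[:blank:]]* keeps the expression in basic regular expression syntax, so neither -E nor -r is needed.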

Surround all lines in a text file with quotes ('something')

I've got a list of directories that contain spaces.
I need to surround them with ' ' to ensure that my batch scripts will work.
How can one surround each line with a ' and a ' (quotes)?
e.g.
File1:
/home/user/some type of file with spaces
/home/user/another type of file with spaces
To
File2:
'/home/user/some type of file with spaces'
'/home/user/another type of file with spaces'
Use sed?
sed -e "s/\(.*\)/'\1'/"
Or, as commented below, if the directories might contain apostrophes (nightmare if they do) use this alternate
sed -e "s/'/'\\\\''/g;s/\(.*\)/'\1'/"
Using sed:
sed -i "s/^.*$/'&'/g" filename
You can use sed(1) to insert single quotes at the beginning and end of each line in a file as so:
sed -i~ -e "s/^/'/;s/$/'/" the_file
I prefer awk (it's faster than bash and very easy to extend):
awk -v q="'" '{print q $0 q}'
Very simple logic: you just need to print the quotes in front and behind.
while IFS= read -r line
do
printf "'%s'\n" "$line"
# do something
done < "file"
Using sd, to surround with ' the command looks like:
sd '(.*)' \''$1'\'
to surround with " the command looks like:
sd '(.*)' '"$1"'
Hopefully you got the idea.
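If the directory names may contain apostrophes, the awk variant can be extended along the lines of the sed alternative above. This is a sketch only; quote_lines is a hypothetical name.

```shell
# Hypothetical helper: wrap each line of stdin in single quotes, turning
# every embedded ' into '\'' so the result is still one shell word.
quote_lines() {
  awk -v q="'" '{
    n = split($0, parts, q)          # split the line at each apostrophe
    out = parts[1]
    for (i = 2; i <= n; i++)
      out = out q "\\" q q parts[i]  # rejoin the pieces with a quoted apostrophe
    print q out q
  }'
}

printf '%s\n' "/home/user/it's a dir with spaces" | quote_lines
```

Splitting and rejoining avoids having to reason about how gsub treats backslashes in its replacement string.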
