grep/sed copy two identical file names in a directory - linux

I am running the following sed command on Mac OS X El Capitan:
grep -rl 'efefef' . | xargs sed -i ' ' "s/efefef/cccccc/g"
The really strange thing is that whenever grep finds the expression, the command copies the file into the same directory with the SAME filename. How is that possible?!
-rw-r--r-- 1 craphunter staff 12605 16 Okt 14:40 backend_pay.de.yml
-rw-r--r-- 1 craphunter staff 12694 15 Okt 16:41 backend_pay.de.yml
Now I have two files with the same FILENAME in the SAME directory?!
Any idea? How is this even possible?!
Thanks!
craphunter

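With the BSD sed shipped on OS X, the argument after -i is the suffix appended to the name of the backup copy. You passed a single space, so the backup is named "backend_pay.de.yml " (with a trailing space), which looks identical in the ls output:
sed -i ' '
Use something more distinctive, like ~, or pass an empty string (sed -i '') if you do not want a backup at all.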
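A minimal sketch of what is going on, using a scratch directory created with mktemp (the directory and the bracket formatting are assumptions for the demonstration, not part of the original commands). The two entries differ only by a trailing space, which becomes visible once each name is wrapped in brackets:

```shell
# Scratch directory with two names that differ only by a trailing space,
# mimicking the original file and the backup written by sed -i ' '.
demo_dir=$(mktemp -d)
touch "$demo_dir/backend_pay.de.yml" "$demo_dir/backend_pay.de.yml "

# Wrap each name in brackets so the trailing space is visible.
listing=$(for f in "$demo_dir"/*; do printf '[%s]\n' "${f##*/}"; done)
printf '%s\n' "$listing"

# Delete the stray backup; the quotes keep the trailing space in the name.
rm -- "$demo_dir/backend_pay.de.yml "
remaining=$(ls "$demo_dir" | wc -l)
printf 'files left: %d\n' "$remaining"
```

The same quoting trick (a quoted name ending in a space) should remove the unwanted copies in the real directory.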


Listing directories with tab as delimiter

When we list files in Unix using the ls -l command, the output is a table with spaces as separators, for example:
(jupyter-lab) ➜ mylab ls -l
total 2
drwxr-sr-x. 2 hs0424 ragr 0 Feb 1 12:17 A bad directory
drwxr-sr-x. 2 hs0424 ragr 0 Feb 1 12:18 A very bad directory
I want to convert this to a tab-separated file (.tsv). Simply changing runs of spaces to \t, as in ls -l | sed -E 's/ +/\t/g', would not work because the filenames contain spaces. Is there a better solution?
It is hard to show the expected output with real tabs, so with \t standing in for each tab, I want something like the following:
(jupyter-lab) ➜ mylab ls -l
total 2
drwxr-sr-x.\t2\ths0424\tragr\t0\tFeb 1\t12:17\tA bad directory
drwxr-sr-x.\t2\ths0424\tragr\t0\tFeb 1\t12:18\tA very bad directory
(Edit 1)
We can assume access to GNU tools
Use GNU find -printf or stat, either of which let you provide an arbitrary format string, instead of ls.
find . -mindepth 1 -maxdepth 1 -printf '%M\t%y\t%g\t%G\t%u\t%U\t%f\t%l\n'
or
# for normal cases
stat --printf='%A\t%G\t%g\t%U\t%u\t%n\n' *
# for directories where filenames could exceed command line length limit
printf '%s\0' * | xargs -0 stat --printf='%A\t%G\t%g\t%U\t%u\t%n\n'
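As a rough sketch of how the find approach could be shaped to mirror the columns of ls -l from the question: the helper name list_tsv and the choice of columns are assumptions made for this example, and GNU find is assumed as stated in the question's edit.

```shell
# Sketch: emit permissions, link count, owner, group, size, date, time and
# name as tab-separated fields. list_tsv is a hypothetical helper name.
# Requires GNU find for -printf.
list_tsv() {
    find "${1:-.}" -mindepth 1 -maxdepth 1 \
        -printf '%M\t%n\t%u\t%g\t%s\t%Tb %Td\t%TH:%TM\t%f\n'
}

# Demo on a scratch directory whose entries contain spaces.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/A bad directory" "$demo_dir/A very bad directory"
list_tsv "$demo_dir"
```

Names that themselves contain a tab or a newline would still break the TSV layout, so this is only safe when such names cannot occur.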

Copy specific word from a file to another file using shell script

I am new to shell scripting.
My folder structure is shown below. Every folder contains one file named note.json, and I want to extract a specific field such as "user" from each note.json. I tried this for a single file and it works, but it also prints unnecessary data, and I need it as a loop that goes into every folder and does the same. Can anybody help me out?
My folder structure:
drwxr-xr-x - zeppelin hdfs 0 2020-06-01 16:20 /user/zeppelin/notebook/2FBC2M3K2
drwxr-xr-x - zeppelin hdfs 0 2020-05-20 18:01 /user/zeppelin/notebook/2FBDEKUGP
drwxr-xr-x - zeppelin hdfs 0 2020-05-26 20:32 /user/zeppelin/notebook/2FBDXNZRC
drwxr-xr-x - zeppelin hdfs 0 2020-05-26 21:00 /user/zeppelin/notebook/2FBEAGZEE
drwxr-xr-x - zeppelin hdfs 0 2020-05-25 14:18 /user/zeppelin/notebook/2FBGXSHZR
drwxr-xr-x - zeppelin hdfs 0 2020-05-20 14:31 /user/zeppelin/notebook/2FBHCNKJP
drwxr-xr-x - zeppelin hdfs 0 2020-06-02 17:34 /user/zeppelin/notebook/2FBJCZ212
I tried this for a single folder using the command below:
$ cat note.json | grep "user"
"user": "Ayan.Paul",
"data": "org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [Ayan.Paul] does not have [USE] privilege on [snt_mmedata_upload_prd]\n\tat org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:300)\n\tat org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:286)\n\tat org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:324)\n\tat org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:265)\n\tat org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291)\n\tat org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291)\n\tat org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:718)\n\tat org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:801)\n\tat org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:103)\n\tat org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:633)\n\tat org.apache.zeppelin.scheduler.Job.run(Job.java:188)\n\tat org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\n\tat java.lang.Thread.run(Thread.java:745)\nCaused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [Ayan.Paul] does not have 
[USE] privilege on [snt_mmedata_upload_prd]\n\tat org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:335)\n\tat org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:199)\n\tat org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:262)\n\tat org.apache.hive.service.cli.operation.Operation.run(Operation.java:247)\n\tat org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:541)\n\tat org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:527)\n\tat org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315)\n\tat org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:562)\n\tat org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)\n\tat org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)\n\tat org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)\n\tat org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)\n\tat org.apache.thrift.server.TServlet.doPost(TServlet.java:83)\n\tat org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:208)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:707)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\n\tat org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:534)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)\n\tat org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)\n\t... 
3 more\nCaused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAccessControlException:Permission denied: user [Ayan.Paul] does not have [USE] privilege on [snt_mmedata_upload_prd]\n\tat org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.checkPrivileges(RangerHiveAuthorizer.java:483)\n\tat org.apache.hadoop.hive.ql.Driver.doAuthorizationV2(Driver.java:1330)\n\tat org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:1094)\n\tat org.apache.hadoop.hive.ql.Driver.compile(Driver.java:705)\n\tat org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1863)\n\tat org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1810)\n\tat org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1805)\n\tat org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)\n\
As said above, if the file is structured JSON, the best and cleanest way is to use jq.
Otherwise, if this line always looks the same, you can try:
grep "\"user\":" note.json | cut -d":" -f2 | sed 's/\"//g' | sed 's/,//g' | sed 's/ //g'
where
grep "\"user\":" - selects the line you want
cut -d":" -f2 - takes the second column, using ":" as the separator
sed 's/\"//g' - removes the double quotes
sed 's/,//g' - removes commas
sed 's/ //g' - removes spaces, just in case (you don't have to use it)
If you need a loop for it, you could use something like:
folder_path='/path/to/myfolder'
new_file_path='/path/to/users.txt' # placeholder: where the results are collected
# visit note.json in every subfolder of folder_path
for file in "${folder_path}"/*/note.json
do
if [[ -f ${file} ]]
then
grep "\"user\":" "${file}" | cut -d":" -f2 | sed 's/\"//g' | sed 's/,//g' | sed 's/ //g' >> "${new_file_path}"
fi
done
If you know that the note.json file always has "user" at the beginning of a line, then you can grep for that. It also sounds like you want the value of the "user" JSON field; try using jq to parse that. Below is the "cheap and dirty" way of stripping out the extra characters. (We'll stick with a loop because you're probably doing other things for each file...)
for file in $(find . -name note.json); do
grep "^.user" "$file" | cut -c 10- | tr -d '",'
done
If you want help with using jq to parse JSON, just ask a different question showing a "note.json" file and your attempt at parsing it!
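For setups where jq is not available, the grep/cut/tr pipelines above can be collapsed into a single sed substitution and run over every note.json with find. This is only a sketch: extract_users is a hypothetical helper name, the demo notebooks and the second user name are made up, and it assumes each "user" field sits on its own line as in the grep output shown in the question.

```shell
# Sketch: print the value of every "user" field found in note.json files
# below a directory. Allows leading indentation before the field name.
extract_users() {
    find "${1:-.}" -type f -name note.json -exec sh -c '
        for f in "$@"; do
            sed -n "s/^[[:space:]]*\"user\":[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$f"
        done
    ' sh {} +
}

# Demo with made-up notebooks in a scratch directory.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/2FBC2M3K2" "$demo_dir/2FBDEKUGP"
printf '{\n  "user": "Ayan.Paul",\n  "data": "Permission denied: user [Ayan.Paul]"\n}\n' \
    > "$demo_dir/2FBC2M3K2/note.json"
printf '{\n  "user": "Some.One",\n  "data": "ok"\n}\n' \
    > "$demo_dir/2FBDEKUGP/note.json"
extract_users "$demo_dir"
```

Because the pattern is anchored on the quoted field name followed by a colon, the long "data" line that merely mentions the word user is not printed. Redirecting the function's output (for example extract_users /path/to/notebooks > users.txt) would collect the values in one file.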

bash tail the newest file in folder without variable

I have a bunch of log files in a folder. When I cd into the folder and look at the files it looks something like this.
$ ls -lhat
-rw-r--r-- 1 root root 5.3K Sep 10 12:22 some_log_c48b72e8.log
-rw-r--r-- 1 root root 5.1M Sep 10 02:51 some_log_cebb6a28.log
-rw-r--r-- 1 root root 1.1K Aug 25 14:21 some_log_edc96130.log
-rw-r--r-- 1 root root 406K Aug 25 14:18 some_log_595c9c50.log
-rw-r--r-- 1 root root 65K Aug 24 16:00 some_log_36d179b3.log
-rw-r--r-- 1 root root 87K Aug 24 13:48 some_log_b29eb255.log
-rw-r--r-- 1 root root 13M Aug 22 11:55 some_log_eae54d84.log
-rw-r--r-- 1 root root 1.8M Aug 12 12:21 some_log_1aef4137.log
I want to look at the most recent messages in the most recent log file. I can now manually copy the name of the most recent log and then perform a tail on it and that will work.
$ tail -n 100 some_log_c48b72e8.log
This does involve manual labor so instead I would like to use bash-fu to do this.
So far I have found this way to do it:
filename="$(ls -lat | sed -n 2p | tail -c 30)"; tail -n 100 $filename
It works, but I am bummed out that I need to save data into a variable to do it. Is it possible to do this in bash without saving intermediate results into a variable?
tail -n 100 "$(ls -at | head -n 1)"
You do not need ls to actually print timestamps, you just need to sort by them (ls -t). I added the -a option because it was in your original code, but note that this is not necessary unless your logfiles are "dot files", i.e. starting with a . (which they shouldn't).
Using ls this way saves you from parsing the output with sed and tail -c. (And you should not try to parse the output of ls.) Just pick the first file in the list (head -n 1), which is the newest. Putting it in quotation marks should save you from the more common "problems" like spaces in the filename. (If you have newlines or similar in your filenames, fix your filenames. :-D )
Instead of saving into a variable, you can use command substitution in-place.
A truly ls-free solution:
tail -n 100 < <(
for f in *; do
[[ $f -nt $newest ]] && newest=$f
done
cat "$newest"
)
There's no need to initialize newest, since any file will be newer than the null file named by the empty string.
It's a bit verbose, but it's guaranteed to work with any legal file name. Save it to a shell function for easier use:
tail_latest () {
# Keep helper variables local so a second call does not see a stale $newest
local dir=${1:-.} size=${2:-100} newest= f
for f in "$dir"/*; do
[[ $f -nt $newest ]] && newest=$f
done
tail -n "$size" "$newest"
}
Some examples:
# Default of 100 lines from newest file in the current directory
tail_latest
# 200 lines from the newest file in another directory
tail_latest /some/log/dir 200
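A variant sketch that only prints the name of the newest file, so that it can be combined with tail through command substitution. The helper name newest_file is an assumption for this example. Unlike the [[ ]] version, it does not rely on how -nt treats a missing file, so it also runs with the [ builtin of shells such as dash, and it skips directories and empty folders.

```shell
# Sketch: print the newest regular file in a directory without parsing ls.
# -nt is not part of POSIX test, but bash and dash both support it.
newest_file() {
    newest=
    for f in "${1:-.}"/*; do
        [ -f "$f" ] || continue          # skip directories and an unmatched glob
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Demo with explicit modification times (touch -t takes [[CC]YY]MMDDhhmm).
demo_dir=$(mktemp -d)
printf 'old entry\n' > "$demo_dir/old file.log"
printf 'latest entry\n' > "$demo_dir/new file.log"
touch -t 202001010000 "$demo_dir/old file.log"
touch -t 202101010000 "$demo_dir/new file.log"
tail -n 100 "$(newest_file "$demo_dir")"
```

The function returns a non-zero status when the directory holds no regular files, which lets a caller avoid running tail on an empty name.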
A plug for zsh: glob qualifiers let you sort the results of a glob directly, making it much easier to get the newest file.
tail -n 100 *(om[1,1])
om sorts the results by modification time (newest first). [1,1] limits the range of files matched to the first. (I think Y1 should do the same, but it kept giving me an "unknown file attribute" error.)
Without parsing ls, you'd use stat
tail -n 100 "$(stat -c "%Y %n" * | sort -nk1,1 | tail -1 | cut -d" " -f 2-)"
Will break if your filenames contain newlines.
version 2: newlines are OK
tail -n 100 "$(
stat --printf "%Y:%n\0" * |
sort -z -t: -k1,1nr |
{ IFS=: read -d '' time filename; echo "$filename"; }
)"
You can also try it this way:
ls -1t | head -n 1 | xargs tail -c 50
Explanation:
ls -1t -- list the files one per line, sorted by modification time, newest first.
head -n 1 -- take the first (newest) entry.
xargs tail -c 50 -- show the last 50 bytes of that file. (xargs splits its input on whitespace, so this fails for file names containing spaces.)

How to get the latest filename alone in a directory?

I am using
ls -ltr /homedir/mydirectory/work/ |tail -n 1|cut -d ' ' -f 10
But this is a very crude way of getting the desired result, and it is also unreliable.
The output I get on simply executing
ls -ltr /homedir/mydirectory/work/ |tail -n 1
is
-rw-r--r-- 1 user pusers 1764 Apr 1 12:06 firstfile.xml
So here I get the file name.
But if the output of the above command looks like
-rw-r--r-- 100 user pusers 1764 Apr 1 12:06 firstfile.xml
the first command fails! Understandably so, since I am cutting out the 10th space-delimited field, which no longer holds the filename.
So how can I refine it?
Why do you use the -l flag for ls if you don't need it? If filenames are all you need, make ls output just the filenames instead of trying to "parse" its non-uniform long-format output (and abusing poor text-processing utilities in the process).
LAST_MODIFIED_FILE=$(ls -tr | tail -n 1)
If you really want to achieve this using your method, then use awk instead of cut:
ls -ltr /var/log/ |tail -n 1| awk '{print $9}'
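The reason awk copes better than cut here is how the two tools treat repeated spaces. A small sketch with two made-up ls -l lines (the padding is assumed for illustration, since real ls output pads its columns):

```shell
# Two hypothetical ls -l lines that differ only in column padding.
line_a='-rw-r--r--   1 user pusers 1764 Apr  1 12:06 firstfile.xml'
line_b='-rw-r--r-- 100 user pusers 1764 Apr  1 12:06 firstfile.xml'

# cut treats every single space as a delimiter, so the field index of the
# name moves whenever the padding changes.
cut_a=$(printf '%s\n' "$line_a" | cut -d ' ' -f 12)
cut_b=$(printf '%s\n' "$line_b" | cut -d ' ' -f 12)

# awk splits on runs of whitespace, so the name is field 9 in both lines.
awk_a=$(printf '%s\n' "$line_a" | awk '{print $9}')
awk_b=$(printf '%s\n' "$line_b" | awk '{print $9}')

printf 'cut: [%s] [%s]\nawk: [%s] [%s]\n' "$cut_a" "$cut_b" "$awk_a" "$awk_b"
```

Note that awk's $9 still truncates a file name that contains spaces, which is one more reason to prefer plain ls -t without -l.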
Extending user529758's answer: to restrict the result to files whose names match a pattern, use the command below:
ls -tr Filename* | tail -n 1

One liner to rename bunch of files

I was looking for a linux command line one-liner to rename a bunch of files in one shot.
pattern1.a pattern1.b pattern1.c ...
Once the command is executed I should get
pattern2.a pattern2.b pattern2.c ...
for i in pattern1.*; do mv -- "$i" "${i/pattern1/pattern2}"; done
Before you run it, stick an echo in front of the mv to see what it would do.
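The echo-first habit can be built into a small function. The following is a sketch under assumptions: rename_prefix and the DRY_RUN variable are hypothetical names introduced here, and the prefix is swapped with the POSIX ${f#prefix} expansion instead of the bash-only ${i/old/new} form used above.

```shell
# Sketch: rename files in the current directory whose names start with one
# prefix so that they start with another. With DRY_RUN=1 the mv commands
# are only printed.
rename_prefix() {
    old=$1 new=$2
    for f in "$old"*; do
        [ -e "$f" ] || continue            # the glob matched nothing
        target=$new${f#"$old"}             # swap the leading prefix only
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf 'mv -- %s %s\n' "$f" "$target"
        else
            mv -- "$f" "$target"
        fi
    done
}

# Demo in a scratch directory: preview first, then rename for real.
demo_dir=$(mktemp -d)
touch "$demo_dir/pattern1.a" "$demo_dir/pattern1.b" "$demo_dir/pattern1.c"
dry_output=$(cd "$demo_dir" && DRY_RUN=1 rename_prefix pattern1 pattern2)
printf '%s\n' "$dry_output"
(cd "$demo_dir" && rename_prefix pattern1 pattern2)
ls "$demo_dir"
```

Running the preview inside a subshell keeps both the cd and the DRY_RUN setting from leaking into the calling shell.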
If you happen to be using Linux, you may also have a perl script at /usr/bin/rename (sometimes installed as "prename") which can rename files based on more complex patterns than shell globbing permits.
The /usr/bin/rename on one of my systems is documented here. It could be used like this:
rename "s/pattern1/pattern2/" pattern1.*
A number of other Linux environments seem to have a different rename that might be used like this:
rename pattern1 pattern2 pattern1.*
Check man rename on your system for details.
Plenty of ways to skin this cat. If you'd prefer your pattern to be a regex rather than a fileglob, and you'd like to do the change recursively you could use something like this:
find . -print | sed -ne '/^\.\/pattern1\(\..*\)/s//mv "&" "pattern2\1"/p'
As Kerrek suggested with his answer, this one first shows you what it would do. Pipe the output through a shell (i.e. add | sh to the end) once you're comfortable with the commands.
This works for me:
[ghoti@pc ~]$ ls -l foo.*
-rw-r--r-- 1 ghoti wheel 0 Mar 26 13:59 foo.php
-rw-r--r-- 1 ghoti wheel 0 Mar 26 13:59 foo.txt
[ghoti@pc ~]$ find . -print | sed -ne '/^\.\/foo\(\..*\)/s//mv "&" "bar\1"/p'
mv "./foo.txt" "bar.txt"
mv "./foo.php" "bar.php"
[ghoti@pc ~]$ find . -print | sed -ne '/^\.\/foo\(\..*\)/s//mv "&" "bar\1"/p' | sh
[ghoti@pc ~]$ ls -l foo.* bar.*
ls: foo.*: No such file or directory
-rw-r--r-- 1 ghoti wheel 0 Mar 26 13:59 bar.php
-rw-r--r-- 1 ghoti wheel 0 Mar 26 13:59 bar.txt
[ghoti@pc ~]$
