head on svn log --search doesn't always stop

Consider this:
svn log -r HEAD:1 --search $pattern | head -4
Sometimes this command finds the required number of lines (e.g. 4) and stops. But sometimes it just keeps searching (i.e. appears to hang) even after having found them.
I don't know what this depends on (whether it keeps searching or stops). I would like to know the reason, and how to modify the command so that it always stops right after finding the required number of lines (I don't want svn log to search the entire history, as that might take forever).

Plain svn log keeps walking the revision history from HEAD down to 0 for anything relevant to your query unless you kill the process (assuming you don't use the --limit switch or restrict the log to some subtree like /branches/myfeature). When you pipe it into head, head exits as soon as it has printed its 4 lines, but svn log is only terminated when it next tries to write a matching entry to the now-closed pipe (SIGPIPE); if no further match turns up for a long time, the command appears to hang while svn keeps searching. Adjust your script to kill the process once it has shown the required number of log messages.
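A minimal sketch of that approach (the output path, line count and polling interval are just examples, and it assumes svn log flushes each entry as it finds it): run svn log in the background, poll its output file, and kill it once enough lines have appeared.
svn log -r HEAD:1 --search "$pattern" > /tmp/svn_search_output &
svn_pid=$!
while kill -0 "$svn_pid" 2>/dev/null; do
    # stop as soon as 4 lines of output are available
    if [ "$(wc -l < /tmp/svn_search_output)" -ge 4 ]; then
        kill "$svn_pid"
        break
    fi
    sleep 1
done
head -4 /tmp/svn_search_output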

Related

How does tail -f work internally

If I run this command:
tail -f file.txt
I will see changes in real time.
My question is: how does it work? Is there a way for a process to be notified each time a file is changed?
Or is it a loop, like the watch command uses?
Thanks
I too had this question, as I wanted to implement a similar feature in Java.
It looks like tail is periodically checking the size and modified date of the file against values it recorded when it last outputted the file's contents. When it detects a change, it outputs the data at the end of the file (starting where it last left off). It then updates the values, and repeats the process.
https://github.com/coreutils/coreutils/blob/master/src/tail.c#L1235
There seems to be code later on to handle if the file gets truncated.
With --follow (-f), tail defaults to following the file descriptor, which means that even if a tail'ed file is renamed, tail will continue to track its end. This default behavior is not desirable when you really want to track the actual name of the file, not the file descriptor (e.g., log rotation). Use --follow=name in that case. That causes tail to track the named file in a way that accommodates renaming, removal and creation.
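To make the polling description above concrete, here is a rough shell sketch of the same mechanism (not how GNU tail is actually implemented; the file name, one-second interval and GNU stat option are assumptions): remember how many bytes have already been printed and, whenever the file grows, print only the new part.
file=file.txt
offset=$(stat -c %s "$file")            # bytes already shown (GNU stat)
while sleep 1; do
    size=$(stat -c %s "$file")
    if [ "$size" -gt "$offset" ]; then
        tail -c +"$((offset + 1))" "$file"   # print everything after the old offset
        offset=$size
    fi
done
# (handling of truncation and renames, as in the real tail, is omitted)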

Monitor Linux processes for a long period of time and save the output to a text or CSV file

I am running a stability test that involves several important processes. I want to be able to monitor those processes individually (CPU, memory, IO, etc.). I know I can use the top command, but that only shows live metrics, not an overall or average value that I could turn into a graph to see how things changed over time. How can I do that?
You can still use top, printing the output of a single instance to a file, then using grep to isolate the processes you want to see, and then using awk to select the fields you want.
Something like
top -n 1 -b > /tmp/log_top_running ; grep <process_name> /tmp/log_top_running | awk '{print $10}' >> <report_file>
will extract the process running time and append it to the report file. -b is batch mode, which avoids escape characters in the output; -n 1 terminates top after the first refresh.
This is the most basic thing you can do - you can probably do something smarter by passing to top the flags to only print the stuff you want to see.
To have it execute regularly you can write this command in a script and use the watch command, setting an interval in seconds with the -n option. After you have your file you can plot it.
Hope it helps.
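As a variation on the same idea (a sketch using ps rather than top; the process name, five-second interval and field choice are just examples), a small loop can append one CSV row per sample, which is then easy to plot:
#!/bin/bash
process_name="myprocess"                    # hypothetical process to monitor
echo "timestamp,cpu_percent,mem_percent" > monitor.csv
while true; do
    # one line per matching process; '=' after the field name suppresses the header
    # (note: ps reports %cpu averaged over the process lifetime, not instantaneous load)
    ps -C "$process_name" -o %cpu=,%mem= | while read -r cpu mem; do
        echo "$(date +%s),$cpu,$mem" >> monitor.csv
    done
    sleep 5
done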

How can I run two bash scripts simultaneously without repeating the same work?

I'm trying to write a script that automatically runs a data analysis program. The data analysis takes a file, analyzes it, and puts all the outputs into a folder. The program can be run in two terminals simultaneously (each analyzing a different subject file).
I wrote a script that can do all the inputs automatically. However, I can only get one instance of my script to run automatically; if I run the script in two terminals simultaneously, it will analyze the same subject twice (which is useless).
Currently, my script looks like:
for name in `ls [file_directory]`
do
[Data analysis commands]
done
If you run this in two terminals, each will start from the top of the directory containing all the data files. This is a problem, so I tried to add checks for duplicates, but they weren't very effective.
I tried a name comparison with an if statement (it didn't work because all the output files except one had unique names, so it would check the first output folder at the top of the directory and say the name was different, even though an output folder further down had the same name). It looked something like:
for name in `ls <file_directory>`
do
    for output in `ls <output directory>`
    do
        if [ "$name" == "$output" ]
        then
            echo "This file has already been analyzed."
        else
            <Data analysis commands>
        fi
    done
done
I thought this was the right method, but apparently not. I would need to check all the names before a decision is made (rather than one by one, which is what this does).
Then I tried moving completed data files with the mv command (this didn't work because "name" in the for statement stored all the file names, so it went down the list regardless of what was in the folder at present). I remember reading something about how shell scripts do not do things in "real time", so it makes sense that this didn't work.
My thought was to look for some sort of modification to that if statement so that it does all the name checks before a decision is made (how?).
Also, are there any other commands I might be missing that I could try?
One pattern I often use is the split command.
ls <file_directory> > file_list
split -d -l 10 file_list file_list_part
This will create files like file_list_part00 to file_list_partnn
You can then feed these file names to your script.
for file_part in file_list_part*
do
    for file_name in `cat "$file_part" | tr '\n' ' '`
    do
        data_analysis_command "$file_name"
    done
done
Never use "ls" in a "for" (http://mywiki.wooledge.org/ParsingLs)
I think you should use a fifo (see mkfifo)
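A different sketch of the same coordination idea (the directory paths and data_analysis_command below are placeholders): let each instance atomically claim a file before analyzing it, so the exact same loop can be run in two terminals at once without duplicating work.
data_dir=/path/to/data
claim_dir=/path/to/claims
mkdir -p "$claim_dir"

for f in "$data_dir"/*
do
    name=$(basename "$f")
    # mkdir is atomic: only one instance can create the claim, the other skips the file
    if mkdir "$claim_dir/$name" 2>/dev/null
    then
        data_analysis_command "$f"
    fi
done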
As a follow-on from the comments, you can install GNU Parallel with homebrew:
brew install parallel
Then your command becomes:
parallel analyse ::: *.dat
and it will process all your files in parallel using as many CPU cores as you have in your Mac. You can also add in:
parallel --dry-run analyse ::: *.dat
to get it to show you the commands it would run without actually running anything.
You can also add --eta (Estimated Time of Arrival) to get an estimate of when the jobs will be done, and -j 8 if you want to run, say, 8 jobs at a time. Of course, if you specifically want the 2 jobs at a time you asked for, use -j 2.
You can also have GNU Parallel simply distribute jobs and data to any other machines you may have available via ssh access.
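For the remote distribution mentioned above, a hedged example (user@otherhost is a placeholder, and analyse writing a .out file next to each input is an assumption): run the jobs on the local machine plus a remote ssh host, transferring each input file and fetching the result back.
parallel -S :,user@otherhost --trc {}.out analyse ::: *.dat
Here -S lists the machines to use (: means the local machine) and --trc transfers each input, returns the named result file and cleans up afterwards.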

How to turn command output into a live status update?

I am curious if there's a tool or an easy way to continually execute a certain command at given intervals and reprint its output in the same place on the screen.
The example that prompted me to think about it is 'dropbox-cli status'. Manually executed:
$> dropbox-cli status
Syncing (6,762 files remaining)
Indexing 3,481 files...
$> dropbox-cli status
Syncing (5,162 files remaining)
Indexing 2,681 files...
I am looking for:
$> tracker --interval=1s "dropbox-cli status"
Syncing (6,743 files remaining)
Indexing 3,483 files
The imaginary command 'tracker' would block, and the two output lines would be reprinted in place every second rather than producing an ever-growing appended log.
You can use watch
watch -n1 dropbox-cli status
Specify the interval in seconds with the -n parameter.
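If watch is not available, the same effect can be approximated with a small loop (a sketch, assuming a terminal where clear works):
while true; do
    clear
    dropbox-cli status
    sleep 1
done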

How to limit svn log search results?

I know I can search svn logs using svn log --search $pattern. However, I'd like to limit the results to a certain number. The only thing I was able to find is the -l option, but it limits the number of log entries the search is run over, whereas I'd like the search to run over the entire log history and only limit the number of results themselves.
There is no built-in way to tell svn log --search to limit the number of results. As you already know, you can limit the number of revisions it checks, but that is not what you are looking for. I guess you could write a script that checks the output of svn log and cuts it down to the required amount. Don't forget that you could use the --xml switch.
If you desperately need such a feature for some reason, drop a line to the users@ Apache Subversion mailing list. Describe how and why you need this enhancement and maybe it will be filed as a feature request. :)
Linux style:
svn log --search $pattern | head -n 10
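Note that head -n 10 limits lines rather than log entries. A sketch that limits the output to a number of whole entries instead (n=10 is just an example; it assumes a POSIX awk and relies on svn log separating entries with a line of 72 dashes):
svn log --search "$pattern" | awk -v n=10 '/^-{72}$/ && ++c > n { exit } { print }'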

Resources