search multiple strings in multiple files via command-line - search

I have a txt file that contains about 500 values, one per line. I need to check whether any of those 500 values appear in any of 6 csv files, each containing 100k lines. I can search for one value in those 6 csv files using
for /f "delims==" %%f in ('dir /s /b "P:\*.txt"') do FIND /N "[SEARCHSTRING]" "%~1%%f" >> "C:\found.txt"
but how do I do multiple searches automatically via command-line or batch file (CaSe SenSiTIve)?

@ECHO OFF
SETLOCAL
SET "sourcedir=c:\sourcedir"
SET "destdir=C:\destdir"
for /f "delims=" %%a in ('dir /s /b "%sourcedir%\*.csv"') do (
FINDSTR /N /g:"yourtextfilecontaining500linestomatch.txt" "%%~fa") > "%destdir%\%%~nafound.txt"
GOTO :EOF
What you are asking is rather unclear. I used c:\sourcedir as the location of the .csv files and c:\destdir as the location for the reports. FINDSTR is case-sensitive unless /I is given, and /g: reads the search strings from the named file.
Replacing the redirection > "%destdir%\%%~nafound.txt" with the one from your original (with the double >) would accumulate the lines into a single file, if that's what you want. As it stands, a new file is created for each .csv, named after the .csv with found.txt appended.
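If Python happens to be available, the same idea (load every search string once, then scan each .csv a single time) can be sketched as below. This is only an illustration, not a replacement for the FINDSTR answer: the function names are made up for this example, and it assumes the files are UTF-8 text and that each value should be matched as a literal, case-sensitive substring.

```python
from pathlib import Path


def load_patterns(pattern_file):
    """Return the non-empty lines of pattern_file as literal search strings."""
    text = Path(pattern_file).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line]


def search_files(patterns, csv_dir):
    """Yield (file name, line number, line) for each line containing any pattern.

    The `in` operator compares characters exactly, so matching is case-sensitive.
    """
    for path in sorted(Path(csv_dir).rglob("*.csv")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                line = line.rstrip("\n")
                if any(pattern in line for pattern in patterns):
                    yield path.name, number, line
```

Usage would be along the lines of looping over search_files(load_patterns("values.txt"), "c:/sourcedir") and writing each tuple to a report file; both names here are placeholders.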

An easy way is to use a batch script. You can loop through each of the files one by one. If you want to do them all at once you need to thread your program.
for /L %%A in (1,1,6) do (
Your code goes here
)
That batch script will loop six times. I am not really sure how you specify the file but if you loop through each file it will work.
So put your current batch script where I said "Your code goes here"
for /f "delims==" %%f in ('dir /s /b "P:\*.txt"') do FIND /N "[SEARCHSTRING]" "%~1%%f" >> "C:\found.txt"
But you need to edit it to point to the file that you want to search. If your files are named 1.txt, 2.txt, 3.txt and so on, all you need to do is use the current loop value (%%A) as the file name.

I've been using variants of this shell function for years:
ematch () {
for f in $(find . -type f | grep -v '~' | grep -v \.svn\/) ; do
egrep "$1" "$f" /dev/null 2> /dev/null
done
}
-> ematch "(string1|string2|string3)"
Feel free to adapt to your needs and post your mods here.

Related

Bat file to list files, using semicolon delimiter?

So far I have this:
@ECHO OFF
dir *.txt /c /b /on > content.txt
Which gives output:
file1.txt
file2.txt
file3.txt
But I need it like this, separated with semicolon on each line:
file1.txt;
file2.txt;
file3.txt;
I assume I probably need to write for loop and add string ";" somewhere, but I don't know where or how to do this. Or is there a way to just set a specific delimiter?
Edit:
My use case changed: I thought it would be better if files in subfolders were listed as well, but "\" should be replaced with a space " ".
Example output:
file1.txt;
file2.txt;
subfolder1 file1.txt;
subfolder2 file1.txt;
Note that I do not want the full parent path, only subfolders.
Quick single line batch-file answer:
@(For /F Tokens^=*^ Delims^=^ EOL^= %%G In ('Dir "*.txt" /A:-D /B /O:N 2^>NUL') Do @Echo %%G;) 1>"content.log"
…and in cmd:
(For /F Tokens^=*^ Delims^=^ EOL^= %G In ('Dir "*.txt" /A:-D /B /O:N 2^>NUL') Do @Echo %G;) 1>"content.log"
I have decided to output to a .log file, so that the listing doesn't include itself.
Please use the built-in help to learn how each command works.
When you read the help information, please be aware that a 'simple' for loop will not pick up all files; it ignores hidden files, for instance. Also, despite any first impressions you may have from testing, the order in which it returns files depends upon both the file system and its type. The dir command is the most efficient way of ensuring the sort order.
[EDIT /]
Here is a batch-file solution, (as that's what you posted as an answer), for your New and completely different question.
@(For /F Tokens^=*^ Delims^=^ EOL^= %%G In ('Dir "*.txt" /A:-D /B /O:N /S 2^>NUL') Do @(Set "FileName=%%~dpG" & SetLocal EnableDelayedExpansion & Set "Filename=!FileName:~,-1!" & For %%H In ("!FileName:%__CD__%=!") Do @EndLocal & Echo %%~H %%~nxG;)) 1>"content.log"
In future, when you have existing answers to your asked question, do not change that question when not only the main command is different, but the intended result format too.
A simple for loop will do:
@(for %%i in (*.txt) do @echo %%i;)>"content.txt"
That will, however, also list content.txt inside itself, so you can exclude it.
@(for %%i in (*.txt) do @if /i not "%%~i" == "content.txt" @echo %%i;)>"content.txt"
If the plan is to iterate through subdirectories as well, use for /R.
I found a partial solution that lists subfolders without the parent path. It is not perfect (the for loop is not accurate, as mentioned in the comments above, and I am not happy with the pushd command either), but it works. I left a space in place of \ because that's how I need it.
#echo off
setlocal enabledelayedexpansion
pushd "c:\users\documents\"
(
for /r %%a in (*.txt) do (
set x=%%a
set x=!x:%cd%=!;
echo !x:\= !
)
)>filelist.log
popd
Code explanation:
First I remove the parent path with set x=!x:%cd%=!; (the exclamation marks take the place of %, see the enabledelayedexpansion help), which also appends the trailing semicolon, and then replace the backslashes with spaces when echoing.
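For comparison, the same transformation (path relative to the starting folder, separators turned into spaces, semicolon appended) can be sketched in Python. This is an assumption-laden illustration rather than a batch solution: list_text_files is a hypothetical helper name, and the sort order is simply Python's path ordering, which may differ from what dir /O:N produces.

```python
from pathlib import Path


def list_text_files(root):
    """Return one entry per .txt file under root: the path relative to root,
    with path separators replaced by spaces and a trailing semicolon."""
    root = Path(root)
    entries = []
    for path in sorted(root.rglob("*.txt")):
        if path.is_file():
            relative = path.relative_to(root)
            # relative.parts splits the path into its folder and file components.
            entries.append(" ".join(relative.parts) + ";")
    return entries
```

Writing "\n".join(list_text_files(".")) to a .log file rather than a .txt file keeps the listing from including itself, as noted in the answer above.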

Findstr /g with token or delim

Suppose we have 2 files
First.txt
123
456
And Second.txt
789;123
123;def
482;xaq
What I need is to find the lines in the second file whose first column (token 1, delim ;) contains an entry from the first file.
This is what i need:
Output.txt
123;def
Of course,
findstr /g:first.txt second.txt
will output both lines:
789;123
123;def
Any idea how I can mix findstr and for /f to get the needed output?
Thank you!
If all of the elements in the first column are of the same length, then the simple answer would be
findstr /b /g:first.txt second.txt
Note however that if first.txt contains a line 12 then this would match 123;abc and 129;pqr in the second file.
You can take advantage of the super-limited regex capabilities of findstr and compare each line of first.txt to only the very beginning of each line of second.txt.
@echo off
for /F %%A in (first.txt) do findstr /R /C:"^%%A;" second.txt
The /R flag means that the search string should be treated as a regular expression. The ^ in the search string means that %%A comes at the very beginning of the line. The ; is a literal semicolon that will prevent the 123 line from picking up 1234;abcd in second.txt.
Without executing a separate findstr for each value and to avoid the problem with partial matches at the start of the line, you can try with
@echo off
setlocal enableextensions disabledelayedexpansion
( cmd /q /c"(for /f "delims=" %%a in (first.txt) do echo(%%a;)"
) | findstr /g:/ /l /b second.txt
This reads first.txt and echoes each line followed by the delimiter. findstr picks up that output because /g:/ makes it read the search strings from standard input; they are treated as literals (/l) and matched only at the start of each line (/b) of second.txt.
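The underlying rule (keep a line only when its whole first field equals one of the keys) can also be written out directly. The following Python sketch is just an illustration of that rule, with a made-up function name; it assumes the delimiter never appears inside the first field.

```python
def filter_by_first_column(keys, lines, delimiter=";"):
    """Return the lines whose first delimited field is exactly one of keys."""
    wanted = set(keys)
    # split(delimiter, 1)[0] is everything before the first delimiter.
    return [line for line in lines if line.split(delimiter, 1)[0] in wanted]


first = ["123", "456"]
second = ["789;123", "123;def", "482;xaq"]
print(filter_by_first_column(first, second))  # prints ['123;def']
```

Because whole fields are compared, a key of 12 does not match 123;abc, which is the partial-match problem the delimiter trick works around.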
This is the general form for CSV. Note that in a batch file %A becomes %%A.
for /f "delims=," %A in (csv.txt) do findstr /c:"%A" file2.txt
Here's the output
C:\Users\User>for /f "delims=," %A in (csv.txt) do findstr /c:"%A" csv1.txt
C:\Users\User>findstr /c:"55" csv1.txt
55,61,hi there, Good
C:\Users\User>findstr /c:"60" csv1.txt
54,60,hi there, Bad
C:\Users\User>findstr /c:"Bad" csv1.txt
54,63,hi there, Bad
54,60,hi there, Bad
C:\Users\User>findstr /c:"55" csv1.txt
55,61,hi there, Good
Contents of two files.
55,60
60,60
Bad,60
55,60
and
55,61,hi there, Good
54,62,hi there, Good
54,63,hi there, Bad
54,60,hi there, Bad

Batch search multiple strings simultaneously

I have this large database of equipment:
Equipment500
Equipment501
..........
Equipment998
Equipment999
As well as an even larger database with details about equipment:
Equipment1:details....
Equipment2:details....
..................
Equipment9998:details....
Equipment9999:details....
What I need is to select only the details for the equipment I need:
for /f "tokens=* delims= " %%a in (%cd%\equipment.db) do (
findstr /i /c:"%%a" details.db > Output\%%a
)
The output will be, of course, a folder with files:
In Equipment500 it will be Equipment500:details....
In Equipment501 it will be Equipment501:details....
..................
In Equipment998 it will be Equipment998:details....
In Equipment999 it will be Equipment999:details....
The problem is that it takes a lot of time.
I need this multithreaded so that it runs more instances of findstr (preferably all 500) at the same time and the processing finishes almost instantly.
Any idea is appreciated. Thank you!
@echo off
echo building input files (this needs some time):
del *.db
for /l %%i in (500,1,999) do @echo Equipment%%i>>equipment.db
for /l %%i in (1,1,9999) do @echo Equipment%%i:Detailswhatever>>details.db
echo %time% start adapting
REM adapt equipment.db:
(for /f "delims=" %%i in (equipment.db) do echo %%i:)>equip.db
REM find all strings:
echo %time% start searching
findstr /g:equip.db details.db >output.txt
echo %time% done
NOTE: "Equipment.db" has to be adapted, because searching for "Equipment2" would also find "Equipment20", "Equipment21", ... "Equipment200", ...
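The prefix problem described in the note disappears if the name before the colon is compared as a whole. As a rough Python sketch of that single-pass lookup (group_details is a hypothetical name, and unlike the findstr /i in the question this comparison is case-sensitive):

```python
from collections import defaultdict


def group_details(equipment_names, detail_lines):
    """Map each wanted equipment name to its detail lines in one pass."""
    wanted = set(equipment_names)
    grouped = defaultdict(list)
    for line in detail_lines:
        # partition(":") splits at the first colon; sep is empty if none exists.
        name, sep, _details = line.partition(":")
        if sep and name in wanted:
            grouped[name].append(line)
    return dict(grouped)


details = ["Equipment2:a", "Equipment20:b", "Equipment200:c"]
print(group_details(["Equipment2"], details))  # prints {'Equipment2': ['Equipment2:a']}
```

Each key of the returned dictionary could then be written to its own file to reproduce the one-file-per-equipment layout from the question.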
Since you only provide vague information about your file structure, I'd suggest
@echo off
for /f "tokens=1*delims=:" %%a in (details.db) do >>%%a.dat echo %%b
which assumes each entry in details.db is of the form
equipment1234:details

I need to delete a string in multiple lines from a file using batch script

Working file: Projects/fentbase/common/javasrc/validators/PasswordSetupVal.java
Working file: Projects/fentbase/channeladministration/spec/ui/dev/ppdl/AccessSchemeMaintenancePreview.dppdl
Working file: Projects/fentbase/common/javasrc/validators/EmailValidator.java
Working file: Projects/fentbase/common/javasrc/validators/MailIdVal.java
I have this in a file. I need to take out "Working file: " from every line. Please let me know how to do that.
#echo off
setlocal enabledelayedexpansion
for /f "delims=" %%a in (%1) do (
set line=%%a
echo !line:Working file: =!
)
Usage:
script.cmd file.txt > new_file.txt
move /y new_file.txt file.txt
Another way to skin the cat:
for /f "tokens=2,*" %%i in ('type "file.txt"') do @echo(%%j
Here is one way:
type "file.txt" | repl "Working file: " "" L >"newfile.txt"
This uses a helper batch file called repl.bat (by dbenham) - download from: https://www.dropbox.com/s/qidqwztmetbvklt/repl.bat
Place repl.bat in the same folder as the batch file or in a folder that is on the path.
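If neither delayed expansion nor a helper script appeals, the same prefix removal is a one-liner in most scripting languages. A minimal Python sketch, assuming the prefix always sits at the very start of the line (strip_prefix is a name invented for this example):

```python
def strip_prefix(lines, prefix="Working file: "):
    """Return lines with a leading prefix removed; other lines are unchanged."""
    return [line[len(prefix):] if line.startswith(prefix) else line
            for line in lines]


sample = ["Working file: Projects/fentbase/common/javasrc/validators/MailIdVal.java"]
print(strip_prefix(sample))
# prints ['Projects/fentbase/common/javasrc/validators/MailIdVal.java']
```

Unlike the !line:...=! substitution, this only touches the start of each line, so a later occurrence of the same text inside a path is left alone.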

Search by date using command line

Is there any way to search for files in a directory based on date? I want to find all files with created date greater than a specific date, is it possible to do it with dir command?
Just discovered the forfiles command.
forfiles /s /m *.log /d -7 /c "cmd /c echo @path"
This lists all the log files last modified seven or more days ago, in all subdirectories, though it does not appear to look at the creation date. It also supports specifying an exact date.
See forfiles /? for more info.
An easier way for me is something like
dir /s *.xlsx | find "07/14/2016"
dir by itself cannot filter by date, but you can parse the output of dir using the for command. If dir prints the date in YMD format in your country, then you only need to compare it with the given date. If the order of the date parts is different, then you have to use another for command to parse the date and change it to YMD. This will display a list of files modified after 5th February.
@Echo Off
for /f "usebackq tokens=1,4 skip=5" %%A in (`dir /-c`) do (
if %%A GTR 2012-02-05 echo %%A %%B
)
if does a standard string comparison, so you can get an additional line at the end if the summary line passes the comparison. To avoid that, you can use if %%A GTR 2012-02-05 if exist %%B echo %%A %%B
EDIT:
There is an even better approach, which avoids parsing the dir output and also allows searching by time, not only by date:
@Echo Off
for /r %%A in (*) do (
if "%%~tA" GTR "2012-02-05 00:00" echo %%~tA %%A
)
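The %%~tA comparison above is a string comparison, so it relies on the regional date format sorting correctly. Where Python is an option, real timestamps can be compared instead. The sketch below is one way to do that under a few assumptions: files_modified_after is a hypothetical name, it uses the last-modified time (not the creation time the question asks about), and the cutoff is interpreted as local time.

```python
from datetime import datetime
from pathlib import Path


def files_modified_after(root, cutoff):
    """Yield (modified time, path) for files under root changed after cutoff."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # st_mtime is seconds since the epoch; convert to a local datetime.
            modified = datetime.fromtimestamp(path.stat().st_mtime)
            if modified > cutoff:
                yield modified, path
```

A call such as files_modified_after(".", datetime(2012, 2, 5)) mirrors the cutoff used in the batch example.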
Well, you can't as far as I know, but this sort of thing will work. It is still fairly useless unless you have a short date range ;)
for %a in (01 02 03 04 05) do dir | find "02/%a/2012"
This is easy to do with PowerShell. I know that your question was about cmd, but PowerShell is included in Windows 7 and later. It can also be installed on XP and Vista.
Use the Get-ChildItem command (aliased as dir) to get all files. Pipe the output to the Where-Object command (aliased as ?) to return files where the date is greater than (-gt) a specific date.
For PowerShell 2.0 (default on Windows 7), you must use a scriptblock:
dir -file | ? {$_.LastWriteTimeUtc -gt ([datetime]"2013-05-01")}
For PowerShell 3.0 (default on Windows 8) and up you can use this simpler syntax instead:
dir -file | ? LastWriteTimeUtc -gt ([datetime]"2013-05-01")
The dir -file command returns a collection of System.IO.FileInfo objects. This file object has many properties that you can use for complex filtering (file size, creation date, etc.). See MSDN for documentation.
