I have a .txt file with some data in it, where "BARREL-5", "BODY-3", etc. are the features being measured, the decimal after the "V" is the measured value, and the date (without the exact time) identifies which sample the measurement belongs to, like so:
4491 316 SS,BARREL-5,V,1.393,5/7/2015 7:47:05 AM,0,,,,13,...
4491 316 SS,BARREL-5,V,1.3865,2/17/2016 11:26:12 AM,0,,,,13,...
4491 316 SS,BODY-3,V,1.256,5/6/2015 6:45:42 PM,0,,,,13,...
4491 316 SS,BODY-3,V,1.2565,5/7/2015 7:46:16 AM,0,,,,13,...
4491 316 SS,BODY-3,V,1.246,2/17/2016 11:24:18 AM,0,,,,13,...
This data is exported from a (really obsolete) program, in .txt format only, every time we take sample measurements for a batch of parts. The data needs to be viewable in Excel to quickly determine whether the batch of parts is in tolerance. The problem with simply changing the file extension and opening the file is that a full line of data looks like this:
4491 316 SS,BARREL-5,V,1.393,5/7/2015 7:47:05 AM,0,,,,13,Blow Pattern=1-1-1,Die Set=FN3,Forge=Erie,Heat #=E150058,Job #=I2928,Lube Type=Hydraforge,Operator=Paul & ,Revision=C,Run Temperature=2250,Shift=2nd,Shim bottom Die=X,Shim Top Die=X,Shim Trimmer=X,C
This comes out really messy in Excel, and the number of columns sometimes grows by one (an extra column appears after the 0 in 0,,,, when that 0 is a 1 instead).
My desired output, in txt format, would look something like this:
Project Number: 4491,,,, Material: 316 SS
,5/7/2015,2/17/2016,5/6/2015
BARREL-5,1.393,1.3865,NA
BODY-3,1.256,1.2565,1.246
Right now I loop through all the lines, extract all the dates, remove any duplicates (down to a list of each unique date), and then turn those into a string in the csv file as such ,date1,date2,date3,etc.
I then loop through the data file again and put the variable name down on a new line, checking first to see if it is repeated and if so, not echoing that variable name. I am using the following for loop to do this (filePath is a file path and project number combined, with the file extension or name-addition to be added on as needed):
for /f "tokens=1 delims=," %%a IN (%filePath%.txt) DO (
set varname=%%a%
find "!varname!" %filePath%Excel.csv
if !errorlevel!==1 (echo !varname! >> %filePath%Excel.csv)
)
My current output essentially looks something like this:
Project Number: 4491,,,, Material: 316 SS
,5/7/2015,2/17/2016,5/6/2015
BARREL-5,
BODY-3,
As you can see, I need to put the actual measured values in their proper locations. Is there any easy way to do this in batch? The code must be easily run by anyone, on any computer (Win 7 and XP). The data files are unlikely to change anytime soon, so the program doesn't need to be very robust. I am also limited by the fact that I tried using Powershell but discovered that I am not allowed to run any scripts on any of the computers...
The approach I hesitate to dig into would be something along the lines of creating a list of the line numbers for each line which switches to a new measurement name, then running another loop through the file and printing all the values on the same line, breaking them up based on the line count. The lines of variables would then be concatenated to the proper lines in the csv file.
Thanks for any help. I have edited this to include more relevant examples and details.
Your description is confusing and you did not show the desired output, so there is no way to write a complete solution to your problem. However, your partial requirements can be met with a Batch file in a very simple way:
@echo off
setlocal EnableDelayedExpansion
set "dates="
for /F "tokens=1-3 delims=," %%a in (test.txt) do (
rem Get a list of unique dates:
set "dates=!dates:%%c,=!%%c,"
rem Take the values of the variables
set "var[%%a]=!var[%%a]!,%%b"
)
rem Show the results
echo Dates: %dates:~0,-1%
echo/
echo Variables:
for /F "tokens=2* delims=[]=" %%a in ('set var[') do echo %%a%%b
Using this data as input file:
varname1,valueA,date1
varname1,valueB,date2
varname1,valueC,date3
varname2,valueD,date1
varname2,valueE,date2
varname2,valueF,date3
... this is the output:
Dates: date1,date2,date3
Variables:
varname1,valueA,valueB,valueC
varname2,valueD,valueE,valueF
EDIT: Code modified to fulfill the new specifications
@echo off
setlocal EnableDelayedExpansion
set "max=0"
set "dates=,"
for /F "tokens=1-4,6,7 delims=, " %%a in (test.txt) do (
rem Get header data
set "project=%%a" & set "material=%%b %%c"
rem Get a list of unique dates
if "!dates:%%f=!" equ "!dates!" set "dates=!dates!%%f,"
rem Take the values of the variables
set "var[%%d]=!var[%%d]!,%%e"
rem Get data for variable equalization
set "data=%%d"
for /F %%D in ("!data:-=_!") do (
set /A "len[%%D]+=1"
if !len[%%D]! gtr !max! set "max=!len[%%D]!"
)
)
rem Equalize variables
set /A max-=1
for /F "tokens=2,3 delims=[]=" %%i in ('set len[') do (
set "data=%%i"
for /F %%D in ("!data:_=-!") do for /L %%I in (%%j,1,%max%) do (
set "var[%%D]=!var[%%D]!,NA"
)
)
rem Show the results
(
echo Project Number: %project%,,,, Material: %material%
echo %dates:~0,-1%
for /F "tokens=2* delims=[]=" %%a in ('set var[') do echo %%a%%b
) > output.txt
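For the sample data, the output generated by this program is exactly the one specified in the question. Values are appended to each row in file order, and shorter rows are padded with NA at the end.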
An easy way to solve this with 2 command lines:
:: Extract all dates and keep the unique ones
msr -p your-source.txt -t "^.*?,(\d+/\d+/\d+)\s+(\d+:\d+:\d+).*" -o "$1" -PAC | nin nul -uPAC | msr -S -t "(\S+)\s+" -o ",$1" -PAC >> result.csv
:: Extract column2 like "BARREL-5" -> Auto classify -> Extract values like "1.393" -> Add "NA" if lack columns
for /f "tokens=*" %%a in ('nin source.txt nul "^[^,]+,([^,]+)" -u -PAC') do @msr -p source.txt -t ".*?,%%a,V,(\d+\.\d+),.*" -o "$1" -PAC | msr -S -t "\s+(\S+)" -o ",$1" -PAC | msr -t "^\d+\.?\d*,\d+\.?\d*$" -o "$0,NA" -aPAC| msr -t ".+" -o "%%a,$0" -PAC >> result.csv
But I don't know how your first line is produced: Project Number: 4491,,,, Material: 316 SS
The above uses two single-exe tools (no dependencies): msr.exe (Match/Search/Replace) and nin.exe (Not-In-latter: get difference/intersection), found in the tools directory of my open project https://github.com/qualiu/msr.
Use msr-Win32.exe and nin-Win32.exe if you're on a 32-bit Windows.
I have a string that contains substrings separated by a set of characters such as ,, ., -, ; and probably others. I need to split that string into as many substrings as there are. I also need to understand how splitting with that for loop works, so I can add other characters when needed.
Example:
set string=aaa,bbb.ccc-ddd;eee
for /f "tokens=1* delims=-.," %%a in ("%string%") do ( echo %%a, %%b )
I need to get to the result of 5 substrings in this example: aaa, bbb, ccc, ddd, eee. In that for loop I'd like to work with each of these substrings, like in a normal programming language. But this is killing me.
I get aaa, bbb.ccc-ddd;eee instead. Only 2 substrings. What am I doing wrong? Also, there might be 1000 substrings.
FOR /F splits a line into multiple tokens (up to 31), but each token has to be listed explicitly.
for /f "tokens=1-31 delims=-.,;" %%a in ("%string%") do ( echo %%a, %%b, %%c %%d, %%e, %%f, ... )
But that's useless in your case, with up to 1000 substrings.
However, you can make FOR /F treat the string as multiple lines, so that tokenization starts again for each line.
(set \n=^
%=EMPTY=%
)
setlocal EnableDelayedExpansion
FOR %%L in ("!\n!") do (
set "multiLine=!string:-=%%~L!"
set "multiLine=!multiLine:;=%%~L!"
set "multiLine=!multiLine:,=%%~L!"
set "multiLine=!multiLine:.=%%~L!"
)
for /f "tokens=*" %%a in ("!multiLine!") do (
echo # %%a
)
First, it replaces each of your delimiter characters (- , ; .) with a newline (\n).
FOR /F works on a per-line basis, so each iteration of the loop gets one substring in %%a.
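As an alternative sketch (not the newline technique above): a plain FOR without /F already treats space, comma and semicolon as item separators, so it can be enough to turn the remaining delimiters into spaces. This assumes the substrings contain no spaces and no wildcard characters (* or ?), which a plain FOR would try to match against file names; the variable name list is just for illustration.

```batch
@echo off
setlocal
set "string=aaa,bbb.ccc-ddd;eee"
rem Comma and semicolon already separate items in a plain FOR list,
rem so only the other delimiters need to be turned into spaces.
set "list=%string:.= %"
set "list=%list:-= %"
rem Each item of the list is handed to the loop body in turn.
for %%a in (%list%) do echo # %%a
```

To support another delimiter, add one more set "list=%list:X= %" line, with X being that character.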
I have a situation where I'm trying to keep a static list of related items in a string and parse them out as sets in a bat file.
SET RootPath=C:\Users\woodh\test\
SET FromPath=StuffFrom\
SET ToPath=StuffTo\
SET CTLNames='text1.txt,red_text1:text2.txt,white_text2:text3.txt,blue_text3:'
With CTLNames containing pairs of entries to be parsed and consumed in the job.
I did the following
:Step20
rem -----------------------------------------------------------------
rem loop thru all files in the control list processing each pair at a time
rem -----------------------------------------------------------------
FOR /F "delims=:" %%f IN (%CTLNames%) DO (
IF NOT "%%f" == "" (
CALL:BreakEntry "%%f"
)
)
:Finish
rem ----------------------------------------------------------------
rem -- Finish
rem ----------------------------------------------------------------
goto end
:BreakEntry
rem -----------------------------------------------------------------
rem loop thru all files in the control list processing each entry one at a time
rem -----------------------------------------------------------------
Set EntryLine=%~1
IF NOT "%EntryLine%" == "" (
ECHO %EntryLine%
FOR /F "tokens=1,2 delims=," %%a IN ("%EntryLine%") DO (
ECHO %%a
ECHO %%b
CALL:MoveThisFile %%a, %%b
)
)
goto:eof
But it's only processing the first pair of names and not continuing through the rest of the list.
Your question is confusing. You didn't explain the purpose of your code or the expected output, so we can only guess. My guess is that you have a series of pairs separated by colons, and that the two values in each pair are separated by a comma. If so, the problem with your code is that the for /F command does not iterate over several values when just one string is processed: the string is split according to the "tokens" and "delims" options and the body is executed just once. You need a different method to process every substring in the string.
This is the way I would do it:
@echo off
setlocal
SET "CTLNames=text1.txt,red_text1:text2.txt,white_text2:text3.txt,blue_text3:"
for %%f in ("%CTLNames::=" "%") do (
for /F "tokens=1,2 delims=," %%a in (%%f) do (
echo %%a
echo %%b
echo CALL :MoveThisFile %%a, %%b
)
)
I suggest you remove the @echo off line and run the program, so you can review exactly what is executed.
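To see what the outer for iterates over, the substitution can also be echoed on its own. A minimal sketch, using the same CTLNames value as the code above:

```batch
@echo off
setlocal
set "CTLNames=text1.txt,red_text1:text2.txt,white_text2:text3.txt,blue_text3:"
rem Replacing every colon with quote-space-quote turns the single string
rem into a list of quoted items, one per pair.
echo "%CTLNames::=" "%"
```

This prints the three quoted pairs followed by an empty "" item, which comes from the trailing colon; the inner for /F does nothing with that empty string.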
The reason it doesn't work as expected (it only prints the 1st pair) is that for /f works on lines; CTLNames consists of a single line, so only a single iteration is performed.
The confusing part is that it still printed the 1st pair...that is because it actually did the split (on the 1st :) but by default for only cares about the 1st token (before the delim) and drops the rest. You can convince yourself by changing the line to:
FOR /F "tokens=* delims=:" %%f IN (%CTLNames%) DO (
you'll see that the value of %%f (because we instructed it to take all the tokens into account) is the whole line.
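A small sketch of the same behaviour with a literal string (the sample string a,1:b,2:c,3 is made up for illustration): the loop body runs once, and how much of the line reaches it depends only on the tokens option.

```batch
@echo off
rem Default tokens=1: only the part before the first colon is kept.
for /F "delims=:" %%f in ("a,1:b,2:c,3") do echo first=%%f
rem Asking for three tokens yields %%f, %%g and %%h, still in one iteration.
for /F "tokens=1,2,3 delims=:" %%f in ("a,1:b,2:c,3") do echo [%%f] [%%g] [%%h]
```

The first line prints first=a,1 and the second prints [a,1] [b,2] [c,3].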
The reason why I asked if the COLON(:) is mandatory as a separator between pairs, is because you can also iterate over a non numeric list - no /f flag, but here you can't specify the delimiter so you must use a regular one: SPACE( ), COMMA(,), SEMICOLON(;), TAB, and maybe others (anyway COLON is not one of them) - so this loop:
for %%f in (text1.txt:red_text1 text2.txt:white_text2 text3.txt:blue_text3;) do (
echo %%f
)
- note that I used 3 separators: TAB, SPACE and SEMICOLON in the for loop (not sure how visible it is) -
would yield:
text1.txt:red_text1
text2.txt:white_text2
text3.txt:blue_text3
Or you could use regular separators everywhere and give up on the pair concept altogether, but I don't know if this is what you want.
I wasn't able to solve the problem using COLON as a separator from a single for loop, but I was able to find a way. Here's your script (slightly modified):
@ECHO OFF
rem ECHO %CTLNames%
CALL :Step20 "%CTLNames%"
GOTO :eof
:Step20
rem -----------------------------------------------------------------
rem loop thru all files in the control list processing each pair at a time
rem -----------------------------------------------------------------
IF "" == "%~1" GOTO :eof
FOR /F "tokens=1* delims=:" %%f IN ("%~1") DO (
rem echo f: %%f
CALL :BreakEntry "%%f"
CALL :Step20 "%%g"
)
GOTO :eof
:BreakEntry
rem -----------------------------------------------------------------
rem loop thru all files in the control list processing each entry one at a time
rem -----------------------------------------------------------------
Set EntryLine=%~1
ECHO %EntryLine%
FOR /F "tokens=1,2 delims=," %%a IN ("%EntryLine%") DO (
ECHO %%a
ECHO %%b
rem CALL :MoveThisFile %%a, %%b
)
GOTO :eof
The main thing is (besides other small changes) that Step20 is a recursive function (label), and it uses the for loop to split the line, it processes the 1st token, then it calls itself on the remaining tokens (until there are no more left).
Note: the single quotes surrounding CTLNames should be removed.
I am trying to get a value from a string that has a left boundary of test/time (ms)= and right boundary of , test/status=0.
For example, if I have an input string that looks like:
input="test/ing=123, hello/world=321, test/time (ms)=100, test/status=0"
In Perl, I know I can do something like:
input=~/"test/time (ms)="(.*)", test/status=0"/;
$time=$1;
$time will hold the value that I want to get.
Unfortunately, I can only write the code in Windows Batch or VBScript. Does anyone know how batch can perform the same action as the one in Perl?
Pure batch:
for /f "delims==," %%A in ("%input:*test/time (ms)=%) do echo %%A
The search and replace within the IN clause looks for the first occurrence of test/time (ms) and replaces everything from the beginning of the original string to the end of the search string with nothing. The FOR /F then parses out the 100 using delimiters of = and ,.
The presence of enclosing quotes within the value of %input% causes the IN() clause to look weird with no visible end quote.
It looks better with delayed expansion:
setlocal enableDelayedExpansion
for /f "delims==," %%A in ("!input:*test/time (ms)=!") do echo %%A
I prefer to keep enclosing quotes out of my variable values, and explicitly add them to my code as needed. This makes the normal expansion version look more natural (delayed expansion version remains same):
set "input=test/ing=123, hello/world=321, test/time (ms)=100, test/status=0"
for /f "delims==," %%A in ("%input:*test/time (ms)=%") do echo %%A
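The two steps can also be written out separately, which makes the intermediate value visible. A minimal sketch, assuming the quote-free definition of input; the variable names rest and ms are only for illustration:

```batch
@echo off
setlocal
set "input=test/ing=123, hello/world=321, test/time (ms)=100, test/status=0"
rem Step 1: drop everything up to and including the search string.
set "rest=%input:*test/time (ms)=%"
echo [%rest%]
rem Step 2: the first token between = and , is the wanted value.
for /f "delims==," %%A in ("%rest%") do set "ms=%%A"
echo %ms%
```

The first echo shows [=100, test/status=0] and the second shows 100.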
Batch with help of JScript
If you have my hybrid JScript/batch REPL.BAT utility, then you can use regex to be very specific in your parsing:
call repl ".*test/time \(ms\)=(.*?),.*" $1 sa input
To get the value in a variable:
set "val="
for /f "delims=" %%A in ('repl ".*test/time \(ms\)=(.*?),.*" $1 sa input') do set "val=%%A"
Note that CALL is not needed within IN() clause. It also is not needed when using pipes.
Batch file:
SET input="test/ing=123, hello/world=321, test/time (ms)=100, test/status=0"
FOR %%i IN (%input:, =" "%) DO FOR /F "TOKENS=1,* DELIMS==" %%j IN (%%i) DO IF "%%j" == "test/time (ms)" ECHO %%k
EDIT: explanation
%input:, =" "% returns "test/ing=123" "hello/world=321" "test/time (ms)=100" "test/status=0"
Outer FOR will assign %%i to each string from previous result.
Inner FOR will assign characters left of = to %%j, and right ones to %%k.
Then it just compares %%j with the desired key and shows the value on a match.
VBScript/RegExp:
>> input="test/ing=123, hello/world=321, test/time (ms)=100, test/status=0"
>> set r = New RegExp
>> r.Pattern = "\(ms\)=(\d+),"
>> WScript.Echo r.Execute(input)(0).Submatches(0)
>>
100
I'm working with very large FIX message log files. Each message represents a set of tags separated by SOH characters.
Unlike MQ messages, individual FIX tags (and overall messages) do not feature fixed length or position. Log may include messages of different types (with a different number & sequence of tags).
Sample (of one of many types of messages):
07:00:32 -SEND:8=FIX.4.0(SOH)9=55(SOH)35=0(SOH)34=2(SOH)43=N(SOH)52=20120719-11:00:32(SOH)49=ABC(SOH)56=XYZ(SOH)10=075
So the only certain things are as follows: (1) tag number with equal sign uniquely identifies the tag, (2) tags are delimited by SOH characters.
For specific tags (just a few of them at a time, not all of them), I need to get a list of their distinct values - something like this:
49=ABC 49=DEF 49=GHI...
Format of the output doesn't really matter.
I would greatly appreciate any suggestions and recommendations.
Kind regards,
Victor O.
Option 1
The batch script below has decent performance. It has the following limitations
It ignores case when checking for duplicates.
It may not properly preserve all values that contain = in the value
EDIT - My original code did not support = in the value at all. I lessened that limitation by adding an extra SOH character in the variable name, and changed the delims used to parse the value. Now the values can contain = as long as unique values are differentiated before the =. If the values differentiate after the = then only one value will be preserved.
Be sure to fix the definition of the SOH variable near the top.
The name of the log file is passed as the 1st parameter, and the list of requested tags is passed as the 2nd parameter (enclosed in quotes).
@echo off
setlocal disableDelayedExpansion
:: Fix the definition of SOH before running this script
set "SOH=<SOH>"
set LF=^


:: The above 2 blank lines are necessary to define LF, do not remove.
:: Make sure there are no existing tag_ variables
for /f "delims==" %%A in ('2^>nul set tag_') do set "%%A="
:: Read each line and replace SOH with LF to allow iteration and parsing
:: of each tag/value pair. If the tag matches one of the target tags, then
:: define a tag variable where the tag and value are incorporated in the name.
:: The value assigned to the variable does not matter. Any given variable
:: can only have one value, so duplicates are removed.
for /f "usebackq delims=" %%A in (%1) do (
set "ln=%%A"
setlocal enableDelayedExpansion
for %%L in ("!LF!") do set "ln=!ln:%SOH%=%%~L!"
for /f "eol== tokens=1* delims==" %%B in ("!ln!") do (
if "!!"=="" endlocal
if "%%C" neq "" for %%D in (%~2) do if "%%B"=="%%D" set "tag_%%B%SOH%%%C%SOH%=1"
)
)
:: Iterate the defined tag_nn variables, parsing out the tag values. Write the
:: values to the appropriate tag file.
del tag_*.txt 2>nul
for %%A in (%~2) do (
>"tag_%%A.txt" (
for /f "tokens=2 delims=%SOH%" %%B in ('set tag_%%A') do echo %%B
)
)
:: Print out the results to the screen
for %%F in (tag_*.txt) do (
echo(
echo %%F:
type "%%F"
)
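The duplicate removal in this script relies on the fact that a variable name can exist only once. A stripped-down sketch of just that idea, with a made-up seen_ prefix and made-up values in place of the tag_ variables:

```batch
@echo off
setlocal
rem Clear any leftover seen_ variables.
for /f "delims==" %%A in ('2^>nul set seen_') do set "%%A="
rem Put each value into a variable NAME; a repeated value simply
rem redefines the same variable, so duplicates collapse.
for %%V in (ABC DEF ABC GHI DEF) do set "seen_%%V=1"
rem List the variables and cut the value back out of the name.
for /f "tokens=2 delims=_=" %%A in ('set seen_') do echo %%A
```

This prints ABC, DEF and GHI once each. Because variable names are not case sensitive, abc and ABC would count as the same value, which is where the case limitation mentioned above comes from.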
Option 2
This script has almost no limitations, but it is significantly slower. The only limitation I can see is that it will not allow a value to start with = (the leading = will be discarded).
I create a temporary "search.txt" file to be used with the FINDSTR /G: option. I use a file instead of a command line search string because of FINDSTR limitations. Command line search strings cannot match many characters > decimal 128. Also the escape rules for literal backslashes are inconsistent on the command line. See What are the undocumented features and limitations of the Windows FINDSTR command? for more info.
The SOH definition must be fixed again, and the 1st and 2nd arguments are the same as with the 1st script.
@echo off
setlocal disableDelayedExpansion
:: Fix the definition of SOH before running this script
set "SOH=<SOH>"
set lf=^


:: The above 2 blank lines are necessary to define LF, do not remove.
:: Read each line and replace SOH with LF to allow iteration and parsing
:: of each tag/value pair. If the tag matches one of the target tags, then
:: check if the value already exists in the tag file. If it doesn't exist
:: then append it to the tag file.
del tag_*.txt 2>nul
for /f "usebackq delims=" %%A in (%1) do (
set "ln=%%A"
setlocal enableDelayedExpansion
for %%L in ("!LF!") do set "ln=!ln:%SOH%=%%~L!"
for /f "eol== tokens=1* delims==" %%B in ("!ln!") do (
if "!!"=="" endlocal
set "search=%%C"
if defined search (
setlocal enableDelayedExpansion
>search.txt (echo !search:\=\\!)
endlocal
for %%D in (%~2) do if "%%B"=="%%D" (
findstr /xlg:search.txt "tag_%%B.txt" || >>"tag_%%B.txt" echo %%C
) >nul 2>nul
)
)
)
del search.txt 2>nul
:: Print out the results to the screen
for %%F in (tag_*.txt) do (
echo(
echo %%F:
type %%F
)
Try this batch file. Add the log file name as parameter. e.g.:
LISTTAG.BAT SOH.LOG
It will show all tag id and its value that is unique. e.g.:
9=387
12=abc
34=asb73
9=123
12=xyz
Files named tagNNlist.txt (where NN is the tag id number) will be made for finding unique tag id and values, but are left intact as reports when the batch ends.
The {SOH} text shown in the code below stands for the actual SOH character (ASCII 0x01), so after you copy and paste the code, replace it with a real SOH character. I had to substitute it because the server strips that character. Use WordPad to generate the SOH character by typing 0001 and then pressing ALT+X. Then copy and paste that character into Notepad with the batch file code.
One thing to note is that the code only processes each line starting at column 16, so the 07:00:32 -SEND: prefix in your example line is ignored. I'm assuming that all lines start with that fixed-length text.
Changes:
Changed generated tag list file into separate files by tag IDs. e.g.: tag12list.txt, tag52list.txt, etc.
Removed tag id numbers in generated tag list file. e.g.: 12=abc become abc.
LISTTAG.BAT:
@echo off
setlocal enabledelayedexpansion
if "%~1" == "" (
echo No source file specified.
goto :eof
)
if not exist "%~1" (
echo Source file not found.
goto :eof
)
echo Warning! All "tagNNlist.txt" file in current
echo directory will be deleted and overwritten.
echo Note: The "NN" is tag id number 0-99. e.g.: "tag99list.txt"
pause
echo.
for /l %%a in (0,1,99) do if exist tag%%alist.txt del tag%%alist.txt
for /f "usebackq delims=" %%a in ("%~1") do (
rem *****below two lines strip the first 15 characters (up to "-SEND:")
set x=%%a
set x=!x:~15,99!
rem *****9 tags per line
for /f "tokens=1,2,3,4,5,6,7,8,9 delims={SOH}" %%b in ("!x!") do (
call :dotag "%%b" %*
call :dotag "%%c"
call :dotag "%%d"
call :dotag "%%e"
call :dotag "%%f"
call :dotag "%%g"
call :dotag "%%h"
call :dotag "%%i"
call :dotag "%%j"
)
)
echo.
echo Done.
goto :eof
rem dotag "{id=value}"
:dotag
for /f "tokens=1,2 delims==" %%p in (%1) do (
set z=0
if exist tag%%plist.txt (
call :chktag %%p "%%q"
) else (
rem>tag%%plist.txt
)
if !z! == 0 (
echo %%q>>tag%%plist.txt
echo %~1
)
)
goto :eof
rem chktag {id} "{value}"
:chktag
for /f "delims=" %%y in (tag%1list.txt) do (
if /i "%%y" == %2 (
set z=1
goto :eof
)
)
goto :eof