I'm generating a CSV from an XLS file with VBA; after that I filter the CSV with a batch file. My filter looks like this:
for %%a in (*.csv) do (
for /f "usebackq tokens=1-10 delims=, eol=^" %%1 in ("%%a") do (
if %%4 EQU Req_Category ECHO %%1,%%2,%%3,%%4,%%5,%%6,%%7,%%8,%%9 >> "%%a"_JIRA.csv
if %%4 EQU Requirement ECHO %%1,%%2,%%3,%%4,%%5,%%6,%%7,%%8,%%9 >> "%%a"_JIRA.csv
)
)
This works fine if the CSV file has no empty lines.
On rare occasions the XLS -> CSV conversion generates empty lines or CRs in the CSV.
SW_Fn-289,4.1.1.1,Controling Hardware PCB,Heading,,,,,IgnoreTesting,
SW_Fn-291,4.1.1.1.0-1,"
Date : 07.03.1777
The SystemDesignSpecification is stored in SVN path
http://sblablablabla.xlsm
",Requirement,Lab1 (B-Sample),,Released,Accepted,IgnoreTesting,
SW_Fn-4281,4.1.1.1.0-2,"
Date : 123.123.123
Path : https://apath.com
",Requirement,R1,,New,New,IgnoreTesting,
SW_Fn-166,4.2,Compliance Requirements,Heading,,,,,IgnoreTesting,
SW_Fn-286,4.2.1,Resource Usage,Heading,,,,,IgnoreTesting,
Every line in the CSV should start with an ID: SW_Fn-Example.
Does anyone have an idea how I can bring the info onto one line with a batch function?
I need to get the file to look like this (before filtering):
SW_Fn-289,4.1.1.1,Controling Hardware PCB,Heading,,,,,IgnoreTesting,
SW_Fn-291,4.1.1.1.0-1,"Date : 07.03.1777 TheSystemDesignSpecificationisstored in SVN path http://sblablablabla.xlsm",Requirement,Lab1 (B-Sample),,Released,Accepted,IgnoreTesting,
SW_Fn-4281,4.1.1.1.0-2," Date : 123.123.123 Path : https://apath.com",Requirement,R1,,New,New,IgnoreTesting,
SW_Fn-166,4.2,Compliance Requirements,Heading,,,,,IgnoreTesting,
SW_Fn-286,4.2.1,Resource Usage,Heading,,,,,IgnoreTesting,
There shouldn't be a line that does not start with SW_Fn-blabla. If a line starts with something else, then it should be part of the previous line that has an SW_Fn-blabla.
Then my filter will work to produce this:
SW_Fn-291,4.1.1.1.0-1,"Date : 07.03.1777 TheSystemDesignSpecificationisstored in SVN path http://sblablablabla.xlsm",Requirement,Lab1 (B-Sample),,Released,Accepted,IgnoreTesting,
SW_Fn-4281,4.1.1.1.0-2," Date : 123.123.123 Path : https://apath.com",Requirement,R1,,New,New,IgnoreTesting,
Thanks in advance
Try this:
@echo off
for %%a in (*.csv) do (
for /f "delims=" %%b in (%%a) do (
for /f "tokens=4 delims=," %%c in ("%%b") do (
if "%%c"=="Requirement" echo %%b >>%%~na_JIRA%%~xa
if "%%c"=="Req_Category" echo %%b >>%%~na_JIRA%%~xa
)
)
)
Read and handle each line as a whole to overcome the consecutive-delimiter issue mentioned by Magoo: use another for to check token 4, but don't bother to disassemble and reassemble the complete line.
Aak! Don't use numerics for the metavariable (%%1) - it's highly unreliable. Use an alphabetic character.
Batch treats a string of delimiters as a single delimiter and you have nominated commas and spaces as delimiters, so
SW_Fn-166,4.2,Compliance Requirements,Heading,,,,,IgnoreTesting,
would appear as
SW_Fn-166,4.2,Compliance,Requirements,Heading,IgnoreTesting,,,,
You haven't shown what you expect as output. Do you only want the lines that begin SW_Fn-, or do you want all lines that don't start with SW_Fn- appended to the last line that did?
@ECHO Off
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q36475816.csv"
SET "outfile=%destdir%\outfile.txt"
SET "line="
(
FOR /f "usebackqdelims=" %%a IN ("%filename1%") DO (
SET "newpart=%%a"
IF DEFINED line CALL :test
IF DEFINED line CALL SET "line=%%line%% %%a"
IF NOT DEFINED line SET "line=%%a"
)
IF DEFINED line CALL ECHO %%line%%
)>"%outfile%"
GOTO :EOF
:: Test new data " Accumulate data into line or output & start a new line
:test
SET "newpart=%newpart:"=x%"
IF NOT "%newpart:~0,6%"=="SW_Fn-" goto :eof
echo(%line%
SET "line="
GOTO :eof
You would need to change the settings of sourcedir and destdir to suit your circumstances.
I used a file named q36475816.csv containing your data for my testing.
Produces the file defined as %outfile%
Note that your posted data contains unbalanced quotes in the Fn-4281 item. It's always better to use actual data rather than "somewhere close".
Read each line. If we've already accumulated part of a line, check whether the first few characters are the target. If they are, output the line as constructed and clear line.
If line is clear after this operation, set it to the line read (which must start with the target); otherwise accumulate the line.
In the :test procedure, remove quotes before testing so that they don't break the syntax. Obviously, if the first few characters contain a quote, the line doesn't fit the target, so the test will correctly detect "no fit".
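For comparison, here is the same accumulate-and-flush idea as a minimal Python sketch (not batch, and not the code above). The function name and the `prefix` parameter are made up for illustration, and it assumes that every real record starts with SW_Fn- and that no continuation line does.

```python
def join_continuation_lines(lines, prefix="SW_Fn-"):
    """Merge every line that does not start with `prefix` into the previous record."""
    records = []
    for raw in lines:
        text = raw.rstrip("\r\n")
        if not text.strip():
            continue  # skip empty lines, as FOR /F does
        if text.startswith(prefix) or not records:
            records.append(text)          # a new record begins here
        else:
            records[-1] += " " + text     # continuation of the previous record
    return records


sample = [
    "SW_Fn-289,4.1.1.1,Controling Hardware PCB,Heading,,,,,IgnoreTesting,",
    'SW_Fn-4281,4.1.1.1.0-2,"',
    "Date : 123.123.123",
    "Path : https://apath.com",
    '",Requirement,R1,,New,New,IgnoreTesting,',
    "SW_Fn-166,4.2,Compliance Requirements,Heading,,,,,IgnoreTesting,",
]
for record in join_continuation_lines(sample):
    print(record)
```

Like the batch version, this joins the pieces with a single space, so a space ends up just inside each quote of the merged field.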
Your file is actually valid CSV format. Quoted CSV fields may contain any of the following:
comma
quote literal, escaped as ""
newline (either LF or CRLF)
You don't have commas or quotes within your fields, but you do have newlines that are giving your code serious problems.
But that is only one potential problem. Another issue is FOR /F treats consecutive delimiters as a single delimiter, so if any of your desired keeper lines have any empty fields, then your output will be completely wrong.
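To make the consecutive-delimiter problem concrete, here is a rough Python emulation of how FOR /F splits a line. It is only a sketch of the behaviour described above, not cmd's real parser, and `for_f_tokens` is a made-up name.

```python
import re


def for_f_tokens(line, delims=","):
    """Rough emulation of FOR /F tokenising: a run of delimiters counts as one."""
    parts = re.split("[" + re.escape(delims) + "]+", line)
    return [p for p in parts if p]  # leading/trailing delimiters yield no token


line = "SW_Fn-166,4.2,Compliance Requirements,Heading,,,,,IgnoreTesting,"

fields = line.split(",")      # a plain split keeps the empty fields
tokens = for_f_tokens(line)   # FOR /F style drops them

print(len(fields), fields[8])   # IgnoreTesting is the 9th field
print(len(tokens), tokens[4])   # but FOR /F sees it as the 5th token
```

The empty fields 5 to 8 vanish, so every token after the gap shifts to the left.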
Batch is inherently far from ideal for any kind of text processing, but for CSV it is especially bad for all but the simplest problems. If you really want to use batch, you could use ParseCSV.bat to properly parse your CSV and read it using FOR /F in a reliable manner. But there are better options.
PowerShell has an Import-Csv cmdlet. I'm not sure of its capabilities, but if it supports newlines within fields, then you could develop a really slick solution with that.
Another option is my JREPL.BAT regular expression text processor. The following code looks nasty, but it will very efficiently produce your desired output in one step:
jrepl "((?:[\s\S]*?,){3}(?:(Req_Category,|Requirement,)|.*?,)(?:.*?,){4}.*?),[^,\n]*\n?" "$2?$1.replace(/\r\n/g,' ')+'\r\n':''" /m /j /f input.csv /o output.csv
You would need to use CALL JREPL if you put the command within another batch script.
My JREPL solution relies on the fact that none of your input fields contain quoted commas. If they did contain quoted commas, then a JREPL solution would be even more complicated.
This solution works by using the /M multiline option so that I can match across line-breaks.
The search matches each 10-field record (your 10th field always seems to be empty), regardless of line breaks. $1 contains the first 9 fields (without the trailing comma). $2 contains the 4th field if and only if it matches "Req_Category" or "Requirement". The replacement JavaScript expression tests whether $2 is defined. If it is, the whole match is replaced with $1 after all newlines have been replaced by spaces, followed by a newline. If $2 is not defined, the whole match is replaced with an empty string. Simple in concept, but kind of nasty to develop ;-)
A slight simplification allows you to preserve the original fields containing newlines, and still do the filtering you desire:
jrepl "((?:[\s\S]*?,){3}(?:(Req_Category,|Requirement,)|.*?,)(?:.*?,){4}.*?),[^,\n]*\n?" "$2?$1+'\r\n':''" /m /j /f input.csv /o output.csv
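Since quoted CSV fields may legally contain newlines, any CSV-aware parser sidesteps the whole problem. As an illustration only (this is Python's standard csv module, not batch or JREPL), here is a minimal sketch. `filter_requirements` and `KEEP` are names made up for the example, and it assumes the 4th field holds the category, as in the sample data.

```python
import csv
import io

KEEP = {"Req_Category", "Requirement"}


def filter_requirements(csv_text):
    """Keep rows whose 4th field is in KEEP; collapse newlines inside fields."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if len(row) > 3 and row[3] in KEEP:
            # split()/join turns any run of whitespace, newlines included,
            # into one space and trims both ends of the field
            writer.writerow([" ".join(field.split()) for field in row])
    return out.getvalue()


text = (
    "SW_Fn-289,4.1.1.1,Controling Hardware PCB,Heading,,,,,IgnoreTesting,\n"
    'SW_Fn-4281,4.1.1.1.0-2,"\nDate : 123.123.123\nPath : https://apath.com\n"'
    ",Requirement,R1,,New,New,IgnoreTesting,\n"
)
print(filter_requirements(text), end="")
```

When reading from a real file, open it with newline="" so the csv module sees the embedded line breaks untouched. The writer only quotes a field when it has to, so the collapsed field comes out without quotes here.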
For something so simple that can easily be done with find and replace in Notepad, I can't see why it is so hard to do on the command line. It is just one step in a larger procedure that I would like to get down to a single run. The output from the first part lists the local path (c:/ etc.) to every website on the web server, one per line. Every line has the same 59 characters before the part that matters.
To make this a usable link, I then need to put a different string, with the correct http:// etc., in the same position as the old one, in front of the rest of the line, to make it a working hyperlink.
The final step needs to convert any single backslashes that are left to "/". Normally there is only one.
All of this can be done in Notepad++ using find and replace, but it takes 3 runs to achieve the end result. The original text file is nothing special: no skipped lines, and every line is identical in layout.
The same 59 characters need to be chopped off (it could even be done by number rather than by comparing the text: just shorten each line by 59 characters if that is easier). The replacement text string is always exactly the same and just gets prepended to each line. And for the final touch, every \ is replaced with a / to make it fully web-compatible; there is only one occurrence on each line.
I have seen many find and replace batch files that seem to be overkill for such a simple task.
Take each line, count fifty-nine characters forward, chop off the 59 and add in the replacement text in its place.
Then change the only backslash in the line to a forward slash and it's done.
Does anyone know a simpler, easier way to do this?
This uses a helper batch file called repl.bat - download from: https://www.dropbox.com/s/qidqwztmetbvklt/repl.bat
Place repl.bat in the same folder as the batch file or in a folder that is on the path.
Just change http://www.domain.com/ to what you need to prefix the lines with.
type "file.txt" | repl "\\" "/" | repl "^.{59}" "http://www.domain.com/" >"newfile.txt"
The two \\ are intentional as it is a regular expression.
@ECHO OFF
SETLOCAL
(
FOR /f "delims=" %%a IN (Q21495128.txt) DO (
SET "line=%%a"
CALL SET "line=%%line:\=/%%"
CALL SET "line=replacement text%%line:~59%%"
FOR /f "tokens=1*delims==" %%x IN ('set line') DO ECHO %%y
)
)>newfile.txt
GOTO :EOF
This worked for me, where Q21495128.txt was my test source file.
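If a scripting language is an option, the three steps from the question (cut 59 characters, put the new prefix in front, flip the backslash) fit in one expression. This is a minimal Python sketch, not batch. `to_link` and `convert_file` are hypothetical names, and the 59 x characters in the demo merely stand in for the common local path.

```python
def to_link(line, new_prefix, cut=59):
    """Drop the first `cut` characters, prepend `new_prefix`, flip backslashes."""
    return new_prefix + line[cut:].replace("\\", "/")


def convert_file(src_path, dst_path, new_prefix, cut=59):
    """Apply to_link to every line of src_path and write the result to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(to_link(line.rstrip("\r\n"), new_prefix, cut) + "\n")


# In-memory demo: 59 filler characters stand in for the common local path.
demo = "x" * 59 + "somesite\\index.html"
print(to_link(demo, "http://www.domain.com/"))
```

Cutting by position means the first 59 characters are never compared with anything, which matches the "just shorten by 59 characters" variant of the request.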
I have to process 300+ HTML files, extract a string from each one and place it in a separate text file for import downstream. Upside: the string format is identical in each file and is +/- two lines from the same position as well.
I thought maybe using Python, but then I thought Perl might be a better way since this kind of plays in its backyard.
Sadly, I have no access to UNIX/Linux or I'd just grep it...
This is such an odd client request that I'm a bit goggle-eyed ATM.
So: what is the best way to extract a target string from a BATCH of files?
WR!
If you give us more details (i.e. path and name of the files, the string you want to extract, etc) perhaps I may write a Windows Batch .BAT file to achieve this task...
EDIT
To write a Batch file that runs successfully I need a few more details, so I made some assumptions. You may help me fix the details. This is my method:
Look for a line that contains ">Text link<". I suppose there is just one; this may be fixed.
Read the next line. I assumed that each td is on its own line; this may be fixed.
In this line, remove the text from the beginning of the line up to the value string.
Replace quotes with $ (the next step cannot process quotes).
Get the text between the $ characters; this is the result.
The for /F skip... command may read the wrong line if thefile.htm contains empty lines; this may be fixed.
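@echo off
setlocal DisableDelayedExpansion
rem /c: makes findstr search for the whole text; without it the space splits it into two search strings
findstr /n /c:">Text link<" thefile.htm > linefound.tmp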
for /F "delims=:" %%a in (linefound.tmp) do set lineNo=%%a
for /F "skip=%lineNo% delims=" %%a in (thefile.htm) do (
set "theLine=%%a"
goto continue
)
:continue
setlocal EnableDelayedExpansion
set theLine=!theLine:*value=!
set theLine=!theLine:"=$!
for /F "tokens=2 delims=$" %%a in ("!theLine!") do set URL=%%a
echo Result: %URL%
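Since the question mentions Python as an option, here is the same method as a minimal Python sketch. It makes the same assumptions as the list above: the marker line contains ">Text link<" and the following line carries the target inside value="...". The function names and the tab-separated output format are made up for illustration.

```python
import re
from pathlib import Path


def extract_value(html_text, marker=">Text link<"):
    """Return the quoted text after 'value' on the line following the marker."""
    lines = html_text.splitlines()
    for i, line in enumerate(lines[:-1]):
        if marker in line:
            match = re.search(r'value\s*=\s*"([^"]*)"', lines[i + 1])
            if match:
                return match.group(1)
    return None  # marker not found, or no value="..." on the next line


def extract_from_folder(folder, out_path, pattern="*.htm"):
    """Write 'filename<TAB>value' for every matching file in `folder`."""
    with open(out_path, "w", encoding="utf-8") as out:
        for path in sorted(Path(folder).glob(pattern)):
            value = extract_value(path.read_text(encoding="utf-8", errors="replace"))
            if value is not None:
                out.write(f"{path.name}\t{value}\n")


# Hypothetical sample; the real file layout is not known.
sample = '<tr>\n<td>Text link</td>\n<td><input value="http://example.com/a"></td>\n</tr>'
print(extract_value(sample))
```

Unlike the for /F skip approach, indexing into the list of lines is not thrown off by empty lines in the file.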
EDIT no. 2
You are confusing me. Did the first code work or not? The second example you posted in the comments does not seem to be related to the first one (is the data within the second <td> or after [url=http://?). Is it the same problem or a different one? Please don't assume I know about the HTML file format (I don't). I DO know about Batch files, but I can't guess what to do if I don't have complete details...
The following Batch file shows everything between square brackets that comes ON THE SAME LINE as the [url=http:// string, in the file given in the first parameter:
@echo off
for /F "tokens=2 delims=[]" %%a in ('findstr /n /c:"[url=http://" %1') do echo %%a
As you're already familiar with Grep, why not use a Windows port, such as the Grep in GnuWin32?
Another great way to get a ton of *nix functionality in Windows is Cygwin http://www.cygwin.com
I'm writing a simple .bat file and I've run into some weird behavior. There are a couple of places where I have to do a simple if/else, but the code inside the blocks doesn't seem to be working correctly.
Here's a simple case that demonstrates the error:
@echo off
set MODE=FOOBAR
if "%~1"=="" (
set MODE=all
echo mode: %MODE%
) else (
set MODE=%~1
echo mode: %MODE%
)
echo mode: %MODE%
The output I'm getting is:
C:\>test.bat test
mode: FOOBAR
mode: test
Why is the echo inside the code block not getting the new value of the variable? In the actual code I'm writing I need to build a few variables and reference them within the scope of the if/else. I could switch this to use labels and gotos instead of an if/else, but that doesn't seem nearly as clean.
What causes this behavior? Is there some kind of limit on variables within code blocks?
You are running into cmd's parse-time variable expansion. %MODE% is expanded only once, when the whole if/else block is read, before any of the set commands inside it have run. You can see this if you omit the @echo off line.
From the set /? documentation:
Finally, support for delayed environment variable expansion has
been added. This support is always
disabled by default, but may be
enabled/disabled via the /V command
line switch to CMD.EXE. See CMD /?
Delayed environment variable expansion is useful for getting around
the limitations of the current
expansion which happens when a line of
text is read, not when it is executed.
The following example demonstrates the
problem with immediate variable
expansion:
set VAR=before
if "%VAR%" == "before" (
set VAR=after
if "%VAR%" == "after" @echo If you see this, it worked
)
would never display the message, since
the %VAR% in BOTH IF statements is
substituted when the first IF
statement is read, since it logically
includes the body of the IF, which is
a compound statement. So the IF
inside the compound statement is
really comparing "before" with "after"
which will never be equal. Similarly,
the following example will not work as
expected:
set LIST=
for %i in (*) do set LIST=%LIST% %i
echo %LIST%
in that it will NOT build up a list of
files in the current directory, but
instead will just set the LIST
variable to the last file found.
Again, this is because the %LIST% is
expanded just once when the FOR
statement is read, and at that time
the LIST variable is empty. So the
actual FOR loop we are executing is:
for %i in (*) do set LIST= %i
which just keeps setting LIST to the
last file found.
Delayed environment variable expansion
allows you to use a different
character (the exclamation mark) to
expand environment variables at
execution time. If delayed variable
expansion is enabled, the above
examples could be written as follows
to work as intended:
set VAR=before
if "%VAR%" == "before" (
set VAR=after
if "!VAR!" == "after" @echo If you see this, it worked
)
set LIST=
for %i in (*) do set LIST=!LIST! %i
echo %LIST%
setlocal EnableDelayedExpansion
will enable delayed expansion inside the batch file, equivalent to starting CMD.EXE with /V:ON.
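The difference between the two expansion times can be shown with a toy model in Python. This is only an illustration of the rule quoted above (the whole block has its %VAR% references substituted before any command in it runs); it is not how cmd is implemented, and all names in it are made up.

```python
import re


def expand(text, env, mark):
    """Replace mark+NAME+mark (e.g. %MODE% or !MODE!) with the value from env."""
    pattern = re.escape(mark) + r"(\w+)" + re.escape(mark)
    return re.sub(pattern, lambda m: env.get(m.group(1), ""), text)


def run_block(lines, env, delayed=False):
    """Toy model of one parenthesized block containing only set and echo."""
    # Parse time: every %VAR% in the block is substituted up front.
    parsed = [expand(line, env, "%") for line in lines]
    output = []
    for line in parsed:
        if delayed:
            line = expand(line, env, "!")  # execution time: !VAR! sees new values
        if line.startswith("set "):
            name, _, value = line[4:].partition("=")
            env[name] = value
        elif line.startswith("echo "):
            output.append(line[5:])
    return output


print(run_block(["set MODE=all", "echo mode: %MODE%"], {"MODE": "FOOBAR"}))
print(run_block(["set MODE=all", "echo mode: !MODE!"], {"MODE": "FOOBAR"}, delayed=True))
```

The first call reproduces the stale FOOBAR from the question; the second shows the value that was just set.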
In addition to what has already been answered here.
Long answer article:
https://rsdn.org/article/winshell/NTCommandProcessor.xml
The Google-translated variant is pretty broken, but you can at least read the raw text:
https://translate.google.com/?sl=ru&tl=en&text=https%3A%2F%2Frsdn.org%2F%3Farticle%2Fwinshell%2FNTCommandProcessor.xml&op=translate
Start reading from the Conditional block section.
Short answer:
Inside a parenthesized block (...), %-variables are expanded when the outermost enclosing block is parsed, not when each command runs. You have to leave the outermost block before a %-variable shows the new value:
@echo off
set "MODE="
(
(
set MODE=all
echo MODE=%MODE%
)
echo MODE=%MODE%
)
echo MODE=%MODE%
Or use the call prefix to reevaluate it in place:
@echo off
set "MODE="
(
(
set MODE=all
call echo MODE=%%MODE%%
)
)
Looks like the read and write use different scoping rules.
If you eliminate this line
set MODE=FOOBAR
it will work as expected. So you'll probably need to have a complex series of if/elses to get the variables populated as you'd like.