How can I convert a list of space separated files to comma separated files, in nmake? - nmake

Let's say I have a makefile (using nmake) with a SOURCES macro like this:
SOURCES = C:\folder\file1.cpp C:\folder\file2.cpp C:\folder\file3.cpp
I have a tool that needs to input these files as items in a comma separated list, like this:
C:\folder\file1.cpp,C:\folder\file2.cpp,C:\folder\file3.cpp
Is there a way, in nmake, to convert SOURCES into a comma separated list?
Edit:
Will this work (note the space after the colon)?
COMMALIST=$(SOURCES: =,)

It turns out my suggested approach works. The one caveat is that spacing in the input matters, because every space is replaced, including a leading one. For instance, with the question's SOURCES example, COMMALIST would be:
,C:\folder\file1.cpp,C:\folder\file2.cpp,C:\folder\file3.cpp
After removing the leading space, I get the output I want.
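The leading comma follows directly from a plain space-to-comma substitution. The Python sketch below only mimics that substitution for illustration; it is not nmake, the function name is made up, and it assumes the macro value really does begin with a space, as the answer describes.

```python
def space_to_comma(value: str) -> str:
    """Mimic $(SOURCES: =,): replace every space with a comma."""
    return value.replace(" ", ",")


sources = r"C:\folder\file1.cpp C:\folder\file2.cpp C:\folder\file3.cpp"

# A value that starts with a space produces a leading comma.
with_leading_space = space_to_comma(" " + sources)

# Without the leading space the list comes out clean.
without_leading_space = space_to_comma(sources)

print(with_leading_space)
print(without_leading_space)
```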

Related

use rename command to batch rename the files

I am trying to keep only the numbers in the square brackets and the file extensions,
so that the files below:
【004】ssd水电费.txt
【006】佛山市,地方cd2.txt
【022】风sf.pdf
become:
004.txt
006.txt
022.pdf
or just like
4.txt
6.txt
22.pdf
I know the rename 's/old-exp/new-exp/' command and a little bit of regex, but I could not find a regex that matches what I expected.
I tried rename 's/[\u4e00-\u9eff]+//' * to remove the Chinese characters, but it did not work.
You want to use something like the following:
rename 'tr/A-Za-z0-9.//cd; s/^(\d+).*(\.[a-z]+)$/$1$2/' *
(You'll want to use -n first to test that it does what you want.)
That removes all characters from the file name other than A-Za-z0-9. and then pulls out only the extension and the digits at the beginning.
The Unicode match doesn't work because rename uses byte strings, not Unicode strings, since not all Unix paths are guaranteed to be valid Unicode. Therefore, unless you have to, it's easier to simply filter out the byte values that you don't want rather than convert them to Unicode.
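For comparison, here is a minimal Python sketch of the same renaming rule. It is not the rename command from the answer: Python 3 strings are Unicode, so the full-width brackets can be matched directly. The function name new_name and the strip_zeros flag are invented for this illustration.

```python
import re
from typing import Optional

# 【 and 】 are the full-width brackets used in the question's file names.
NAME_RE = re.compile(r"^【(\d+)】.*(\.[^.]+)$")


def new_name(old_name: str, strip_zeros: bool = False) -> Optional[str]:
    """Return '<digits><extension>' or None if the name does not match."""
    match = NAME_RE.match(old_name)
    if match is None:
        return None
    number, extension = match.groups()
    if strip_zeros:
        number = str(int(number))  # "004" -> "4"
    return number + extension


for name in ["【004】ssd水电费.txt", "【006】佛山市,地方cd2.txt", "【022】风sf.pdf"]:
    print(name, "->", new_name(name))
```

The function only computes the new name, which makes a dry run easy; the actual rename could then be done with pathlib's Path.rename once the printed mapping looks right.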

How to remove \r\n line breaks in a text file that are within quotes and not the end of the row

I have a large set of files that contain line breaks within a column that are all wrapped in quotes, but U-SQL cannot process the files because it is seeing the \r\n as the end of the row despite being wrapped in quotes.
Is there an easy way to fix these files other than opening each file up individually in something like notepad++? It seems there should be a way to ignore line breaks if they are contained within quotes.
Example is something like this:
1,200,400,"123 street","123 street,\r\nNew York, NY\r\nUnited States",\N,\N,200\r\n
Notepad++ works fine for finding and replacing values manually, but I'm trying to find a batch way to do this because I have multiple files (50+ per source table) and hundreds of thousands of records in each that I need to fix.
According to U-SQL GitHub issue 84, "USQL and embedded newline characters", you can either build a custom extractor or try the escapeCharacter parameter of the built-in extractor:
USING Extractors.Csv(quoting : true, escapeCharacter : '\\') // quoting is true by default, but it does not hurt to repeat.
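If the files have to be cleaned before U-SQL sees them, a batch pre-processing pass is another option. The sketch below is one possible approach in Python, not something taken from the linked issue: it tracks whether the current character is inside double quotes and replaces line breaks only there. It assumes quotes inside a field are escaped by doubling them; the function name and the choice of a space as the replacement are illustrative.

```python
def flatten_quoted_newlines(text: str, replacement: str = " ") -> str:
    """Replace \\r\\n, \\r, or \\n with `replacement` when inside double quotes."""
    out = []
    in_quotes = False
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == '"':
            # A doubled quote ("") toggles twice, so the state stays correct.
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes and ch == "\r" and text[i + 1 : i + 2] == "\n":
            out.append(replacement)
            i += 1  # skip the "\n" of the "\r\n" pair
        elif in_quotes and ch in "\r\n":
            out.append(replacement)
        else:
            out.append(ch)
        i += 1
    return "".join(out)


row = '1,200,400,"123 street","123 street,\r\nNew York, NY\r\nUnited States",\\N,\\N,200\r\n'
cleaned = flatten_quoted_newlines(row)
print(repr(cleaned))
```

When applying this to real files, open them with newline="" so that Python does not translate the \r\n sequences before the function sees them.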

Notepad++ Search and Replace with Multiple Lines, Lookahead, Wildcard Issues?

I have a tricky problem. I need to make a minor change to a large number of xml files (500+). The change involves switching a value from 'false' to 'true.' The line that needs to change looks like this:
<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>
And it needs to become:
<VoltageIsMeasuredLineLine>true</VoltageIsMeasuredLineLine>
Unfortunately there are numerous instances of this set of tags in each file, so we can't do a simple find and replace. The thing that makes this set of tags unique is that they come some lines after:
<CID>STATIONNAME.BUS.STATIONNAME.DKV</CID>
However, each file has a different station name, so I had used wildcards to filter them out.
<CID>.*.BUS.*.DKV</CID>
So the code looks like this:
<CID>STATIONNAME.BUS.STATIONNAME.DKV</CID>
<tag>Some Number of Other lines</tag>
<tag>Some Number of Other lines</tag>
<tag>Some Number of Other lines</tag>
<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>
And other sections in the code look like:
<CID>STATIONNAME.COLR.STATIONNAME.FCLR</CID>
<tag>Some Number of Other lines</tag>
<tag>Some Number of Other lines</tag>
<tag>Some Number of Other lines</tag>
<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>
So I'm using the CID .BUS .DKV line as a starting point. Basically I need to change the first occurrence of the VoltageIsMeasuredLineLine line that comes AFTER the CID .BUS .DKV line. But there are a lot of other lines in between (none of which are consistent from file to file) that I don't care about and that are messing up my search.
It was suggested that I try a lookahead, but it did not work. This is the code I was told to try:
(?!<CID>.*.BUS.*.DKV</CID>(.*?)<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>
However, that expression also returns the sections without .BUS and .DKV, which are the really important factors in determining this section's uniqueness. How can I modify this lookahead so that it only returns sections that have .BUS and .DKV in the CID part?
Another idea I had was to select everything in between the CID and Voltage parts, keep the selections in groups, and then print the first two groups as-is, and replace the third. Like this:
(<CID>.*.BUS.*.DKV</CID>)(.*)(<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>)
And replace with
\1\2<VoltageIsMeasuredLineLine>true</VoltageIsMeasuredLineLine>
But something is still wrong with the CID part. I'm sure these wildcards are part of the problem but I've hit a wall. Any help appreciated!
Try the following in Notepad++ (version >= 6.0) using Replace with the Regular expression search mode.
Enable the option ". matches newline" and
set Find what to:
(<CID>[A-Za-z\.]*BUS[A-Za-z\.]*</CID>.*?<VoltageIsMeasuredLineLine>)false
and in Replace with:
\1true
The assumption is that every STATIONNAME.BUS.STATIONNAME.DKV has one corresponding VoltageIsMeasuredLineLine (as I read from your question)
The trick is to use a non-greedy (lazy) match: .*? stops at the first VoltageIsMeasuredLineLine that follows the matching CID line.
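The same expression can be tried outside Notepad++. The Python sketch below is only an illustration of the lazy match, with re.DOTALL standing in for the ". matches newline" option; the sample text and the function name are made up, and station names are assumed to contain only letters and dots.

```python
import re

# re.DOTALL lets "." match newlines, like Notepad++'s ". matches newline".
PATTERN = re.compile(
    r"(<CID>[A-Za-z.]*BUS[A-Za-z.]*</CID>.*?<VoltageIsMeasuredLineLine>)false",
    re.DOTALL,
)


def fix_bus_sections(xml_text: str) -> str:
    """Set the first VoltageIsMeasuredLineLine after each BUS CID to true."""
    return PATTERN.sub(r"\1true", xml_text)


sample = (
    "<CID>STATIONNAME.COLR.STATIONNAME.FCLR</CID>\n"
    "<tag>Some Number of Other lines</tag>\n"
    "<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>\n"
    "<CID>STATIONNAME.BUS.STATIONNAME.DKV</CID>\n"
    "<tag>Some Number of Other lines</tag>\n"
    "<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>\n"
)

fixed = fix_bus_sections(sample)
print(fixed)
```

Only the section that follows the BUS CID is changed; the COLR section keeps its false value because its CID line cannot match the character class followed by BUS.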

How to split comma-separated words?

How can I split my words onto separate lines? I have a lot of them, currently separated by commas.
My file contains the words on a single line, for example:
Viktor, Vajt, Adios, Test, Line, Word1, Word2, etc...
The output file should look like:
Viktor
Vajt
Adios
Test
...
If you are using Notepad++, this can easily be done.
If you, for some reason, want to stick with the doc format (to keep formatting, etc.), you could use LibreOffice (http://de.libreoffice.org/) to do the replacement.
I agree installing LibreOffice just for this replacement would be overkill though.
Not sure what language you are in but you could use an explode/split function which would create an array of values split at ','. Then you could loop through the array and append the new line special character "\n". You would wind up with something like:
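$fileContentsAsString = file_get_contents('input.txt'); // read file into a string ('input.txt' is a placeholder name)
$valuesArray = explode(',', $fileContentsAsString); // split at every comma
$outputString = '';
foreach ($valuesArray as $item) {
    $outputString .= trim($item) . "\n"; // trim the space that follows each comma
}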
For quick text editing I use an online tool (http://regexptool.org/). You can also do it step by step.

filetype for numbers separated by whitespace?

I am thinking of using a filetype of numbers (as text) separated by whitespace in one of my apps. Any kind of whitespace (space, line break, whatever) will do. I am curious if this is a standard file type? If so, what is it called?
Most programs seem to implement it as txt, and for a simple enough format, I think that would be fine.
If your intention is to allow access to the file from Excel or the like, I would recommend using tab characters (\t or 0x09) to separate fields, with newlines separating records. Name it either .tsv or .txt and you have Tab Separated Values.
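Either layout is easy to handle in code. As a rough Python sketch (the function names are invented for this example), splitting on any whitespace reads the free-form format from the question, and joining with tabs and newlines writes the tab-separated variant:

```python
from typing import List


def parse_numbers(text: str) -> List[float]:
    """Parse numbers separated by any whitespace (spaces, tabs, newlines)."""
    return [float(token) for token in text.split()]


def to_tsv(rows: List[List[float]]) -> str:
    """Join fields with tabs (0x09) and records with newlines."""
    return "\n".join("\t".join(str(v) for v in row) for row in rows) + "\n"


numbers = parse_numbers("1 2.5\n3\t4e2 \n")
tsv_text = to_tsv([numbers[:2], numbers[2:]])
print(numbers)
print(repr(tsv_text))
```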
