Can gulp collect lines grep:ed into single output file? - node.js

I'm trying to filter out lines from all .js source files and collect them into a separate file. (Specifically, I'm trying to grep all calls to a string-translation function and post-process them.)
I think I have the different parts figured out but can't make them fit together:
For each file, process it.
Write each file's grepped lines to output.
Append the result to a file.
I've tried calling through.push(&lt;output per file&gt;) from the plugin, but the following step expects a file, not a string.
From there, I expect I could do something like gulp-concat or a stream merge on the results and pipe it on to gulp.dest, but there's a bit missing here.

I figured out a way: simply replace the Vinyl file's contents with the lines to output, and push that through with through.push.
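Outside of gulp, the underlying collect-and-append flow (scan each file, keep the matching lines, write them all to one output file) can be sketched in plain Python. This is an illustration of the idea, not gulp code, and the `tr("...")` pattern is a made-up example of a translation-function call:

```python
import glob
import re

# Hypothetical pattern: calls to a translation function named tr("...")
PATTERN = re.compile(r'\btr\(\s*"[^"]*"\s*\)')

def collect_matches(src_glob, out_path):
    """Scan every file matching src_glob and write matching lines to out_path."""
    with open(out_path, 'w', encoding='utf-8') as out:
        for path in sorted(glob.glob(src_glob)):
            with open(path, encoding='utf-8') as src:
                for line in src:
                    if PATTERN.search(line):
                        out.write(line)
```

In gulp terms, each iteration of the outer loop corresponds to one Vinyl file passing through the plugin, and the single open output handle plays the role of the concatenated result.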

Related

python - handle strings in a file with some Japanese in it

I have a .c file that I want to open with python 3 to update a specific number on a specific line.
It seems like the most common way to do this would be to read the file in, write each line to a temporary file, when I get to the line I want, modify it, then write it to the temp file and keep going. Once I'm done, write the contents of the temp file back to the original file.
The problem I have is that the comments in the file contain Japanese characters. I know I can still read it in by passing errors='ignore' to open(), which lets me read the lines, but it strips the Japanese characters completely, and I need to preserve those.
I haven't been able to find a way to do this. Is there any way to read in a file that's partly in Japanese and partly in English?
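The usual fix is to open the file with the encoding it was actually saved in instead of discarding undecodable bytes with errors='ignore'. A minimal sketch, assuming the file is in one of the encodings commonly used for Japanese text (the candidate list is a guess; adjust it to your files):

```python
def read_preserving_japanese(path, encodings=('utf-8', 'cp932', 'euc_jp')):
    """Try candidate encodings in order; return (text, encoding) on success.

    cp932 is the Windows variant of Shift_JIS, a frequent choice for
    Japanese comments in C source files.
    """
    for enc in encodings:
        try:
            with open(path, encoding=enc) as f:
                return f.read(), enc
        except UnicodeDecodeError:
            continue
    raise ValueError('none of the candidate encodings fit: %r' % (encodings,))
```

Once you know the encoding, pass the same encoding= argument when writing the temp file back, and the Japanese characters survive the round trip.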

Opening a gzipped file: characters following three pipes ("|||") are not visible

My input file is a gzipped file containing genomic information. I'm trying to parse the content on a line-by-line basis and have run into a strange problem.
Any given line looks something like this:
AC=26;AF=0.00519169;AN=5008;NS=2504;DP=17308;EAS_AF=0;AMR_AF=0.0072;AFR_AF=0.0015;EUR_AF=0.0109;SAS_AF=0.0082;AA=A|||;VT=SNP
However, when I print out what is being read in...
import gzip
with gzip.open('myfile.gz', 'rt') as f:
    for line in f:
        print(line)
The line looks like this:
AC=26;AF=0.00519169;AN=5008;NS=2504;DP=17308;EAS_AF=0;AMR_AF=0.0072;AFR_AF=0.0015;EUR_AF=0.0109;SAS_AF=0.0082;AA=A|||
Whatever information comes after the "|||" has been truncated.
Moreover, I can't even search the lines for strings that follow the "|||" (e.g. "VT=SNP" in line always returns False), and line.strip("|||") doesn't help either.
Any advice on what is causing this or what I need to look at?
Thank you for any help
EDIT: OK, it looks like there was something wrong with the gzip file. I uncompressed it and the script ran fine. Then I recompressed it and the script again ran fine (using gzip.open). Is there any straightforward way to compare the two compressed files (i.e., the one that doesn't get read properly vs. the one that works) so that I might get a hint at the root cause?
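One way to compare the two files is by their decompressed contents rather than their raw bytes, since two gzip files with identical payloads can still differ byte-for-byte (the header stores a timestamp, and the compression level may vary). A small sketch using only the standard library:

```python
import gzip

def same_uncompressed(path_a, path_b, chunk=1 << 16):
    """Return True if two .gz files decompress to identical bytes."""
    with gzip.open(path_a, 'rb') as a, gzip.open(path_b, 'rb') as b:
        while True:
            ca, cb = a.read(chunk), b.read(chunk)
            if ca != cb:
                return False
            if not ca:  # both streams exhausted at the same point
                return True
```

If this reports the payloads as equal while the raw files differ, the problem lay in the container (e.g. a truncated or corrupted member), not in your parsing code.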

Checking for EOF using shell script

I have a project that involves extracting data from a database into a text file, and then ingesting it into Hadoop. So I want to create a shell script that NiFi can run automatically to check whether a text file has been extracted and ingest it, but I need to make sure that all the data has been extracted before ingesting it. That means I would need to check that the text file has an EOF; how do I do that?
I don't have any code as of yet; I have very little experience writing shell scripts.
While creating the file, use a different name. Rename it to the expected name once the extraction is done. Then, the other process can start its work once the file exists.
EOF is not something that actually gets put in the text file - in fact, there isn't really any EOF value. EOF or end-of-file is a condition that occurs when you try to consume input from a source that has none to give.
There is no general marker you can look for in your text files that will tell you whether they are complete. You'll need to make your script indicate when a given chunk of data has been extracted in some other way. There are many possibilities; you could change the name of the file as choroba suggested, or you could create a lock file and remove it once the data extraction is done, or you could have your extraction program write a distinctive sequence of bytes to the file at the end, or so on.
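The rename-on-completion idea above can be sketched like this (shown in Python on the assumption that you control, or can wrap, the extraction step; the '.part' suffix is an arbitrary convention):

```python
import os

def extract_atomically(rows, final_path):
    """Write to a temporary name first, then rename into place.

    os.rename is atomic on POSIX filesystems within one volume, so a
    watcher polling for final_path never sees a half-written file.
    """
    tmp_path = final_path + '.part'  # hypothetical naming convention
    with open(tmp_path, 'w', encoding='utf-8') as f:
        for row in rows:
            f.write(row + '\n')
    os.rename(tmp_path, final_path)  # only now does final_path appear
```

The ingesting side (NiFi, or a shell script it runs) then only needs to test whether the final filename exists; existence itself is the completion signal.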

Linux terminal script to create boilerplate files in current working directory with one varying word?

I have to create two boilerplate files, both of which always have the same content, with the EXCEPTION of a single word. I'm thinking of creating a command or something that I can run in the Linux terminal (Ubuntu), along with an argument that represents the one word which can vary in the files created. Perhaps a batch file will accomplish this, but I don't know what it will look like.
I will be able to run this command every time I create these boilerplate files, instead of pasting the boilerplate and changing the one word in the file that has to be changed.
These file paths relative to my current working directory are:
registration.php
etc/module.xml
A simple Python script that reads in the file as string and replaces the occurrence would probably be the quickest. Something like:
with open('somefile.txt', 'r+') as inputFile:
    txt = inputFile.read().replace('someword', 'replacementword')
    inputFile.seek(0)
    inputFile.write(txt)
    inputFile.truncate()  # drop leftover bytes if the replacement is shorter
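To get the "one command with one argument" workflow from the question, the replacement can be driven from the command line and generate both files at once. A sketch under stated assumptions: the template strings and the `{word}` placeholder below are hypothetical stand-ins for your real boilerplate, and the two paths come from the question:

```python
#!/usr/bin/env python3
"""Usage: python make_boilerplate.py SomeWord"""
import os
import sys

# Hypothetical templates; '{word}' marks the one spot that varies.
TEMPLATES = {
    'registration.php': '<?php /* registration for {word} */\n',
    'etc/module.xml': '<module name="{word}"/>\n',
}

def write_boilerplate(word, root='.'):
    """Create each boilerplate file under root with the word substituted."""
    for rel_path, template in TEMPLATES.items():
        path = os.path.join(root, rel_path)
        os.makedirs(os.path.dirname(path) or '.', exist_ok=True)
        with open(path, 'w', encoding='utf-8') as f:
            f.write(template.format(word=word))

if __name__ == '__main__':
    write_boilerplate(sys.argv[1])
```

Saved somewhere on your PATH and made executable, this runs from the working directory as `make_boilerplate.py MyModule`, which matches the "command plus one argument" idea.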

Piping SVG file into ImageMagick

Sorry if this belongs on serverfault
I'm wondering what the proper way is to use an SVG(xml) string as standard input
for a "convert msvg:- jpeg:- 2>&1" command (using linux)
Currently I'm just saving a temp file to use as input, but the data originates from an API in my case, so feeding the string directly to the command would obviously be most efficient.
I appreciate everyone's help. Thanks!
This should work:
convert - output.jpg
Example:
convert logo: logo.svg
cat logo.svg | convert - logo.jpg
Explanation:
The example's first line creates an SVG file and writes it to disk. This is only a preparatory step so that we can run the second line.
The second line is a pipeline of two commands: cat streams the bytes of the file to stdout (standard output).
The first line served only as preparation for the next command in the pipeline, so that this next command has something to read in.
This next command is convert.
The - character is a way to tell convert to read its input data not from disk, but from stdin (standard input).
So convert reads its input data from its stdin and writes its JPEG output to the file logo.jpg.
So my first command/line is similar to your step described as 'currently I'm just saving a temp file to use as input'.
My second command/line does not use your API (I don't have access to it, do I?), but it demonstrates a different method to 'feeding a string directly to the command'.
So the most important lesson is this: wherever convert would usually read input from a file and you would write the file's name on the command line, you can replace the filename with - to tell convert it should read from stdin. (But you need to make sure that something convert can actually digest is offered on its standard input...)
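When the SVG string lives in a program (e.g. fetched from the API) rather than in a shell, the same stdin trick works through a subprocess. A minimal sketch, assuming ImageMagick's convert is on PATH; the question's msvg:- coder should also work in place of svg:-:

```python
import subprocess

# 'svg:-' / 'jpeg:-' tell convert to read SVG from stdin and write
# JPEG to stdout, mirroring the shell pipeline above.
CMD = ['convert', 'svg:-', 'jpeg:-']

def svg_string_to_jpeg(svg_text):
    """Feed an in-memory SVG string to ImageMagick; return JPEG bytes."""
    result = subprocess.run(CMD, input=svg_text.encode('utf-8'),
                            stdout=subprocess.PIPE, check=True)
    return result.stdout
```

No temp file is involved: the string goes straight to convert's stdin, and the JPEG bytes come back on its stdout.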
Sorry, I can't explain better than this...
