I want to compare the data in two .csv files.Have to compare the updated data between these two .csv file using nodejs.
Is ther any possibilities to do it in Nodejs.
Thanks,I am very newbie to this.
It will be easiest using one of the following modules:
https://www.npmjs.com/package/csv
https://www.npmjs.com/package/tsv
or other that you find in:
https://www.npmjs.com/browse/keyword/csv
https://www.npmjs.com/browse/keyword/tsv
(don't worry if it's CSV or TSV - just make sure that you use the correct delimiter which is comma in your case).
Related
Is it possible to compare two csv files in nifi(1.7) using groovy script? One file is at SFTP server and other is flow file. Problem is similar to that of link given below.
I have tried the approach given in link below :-
How can I compare the one line in one CSV with all lines in another CSV file?
script that reads CSV files and gets headers and filter by specific column, I have tried researching on it but nothing of quality I have managed to get.
Please any help will be deeply appreciated
There's a standard csv library included with Python.
https://docs.python.org/3/library/csv.html
It will automatically create a dictionary of arrays where the first row in the CSV determines the keys in the dict.
You can also follow pandas.read_csv for the same.
I need to read xlsx in nodejs. Xlsx contains text with accents and apostrophes and so on. Then i have to save the text in json file.
What are the best practices to perform that task?
Stage 1 - take a look at this module node-xlsx or more robust and possibly better for your needs xlsx.
Stage 2 - Writing the file to JSON - if the module can return a JSON format then great. If you use xlsx it has an option to JSON --> take a look here.
Since you may need to actually strip and/or protect special accents etc. you may need to validate the data which is returned before producing a JSON file.
As to actually writing a JSON file, there are a huge amount of NPM modules for the task.
I am using apapche spark. I want to access multiple json files from spark on date basis. How can i pick multiple files i.e. i want to provide range that files ending with 1034.json up to files ending with 1434.json. I am trying this.
DataFrame df = sql.read().json("s3://..../..../.....-.....[1034*-1434*]");
But i am getting the following error
at java.util.regex.Pattern.error(Pattern.java:1924)
at java.util.regex.Pattern.range(Pattern.java:2594)
at java.util.regex.Pattern.clazz(Pattern.java:2507)
at java.util.regex.Pattern.sequence(Pattern.java:2030)
at java.util.regex.Pattern.expr(Pattern.java:1964)
at java.util.regex.Pattern.compile(Pattern.java:1665)
at java.util.regex.Pattern.<init>(Pattern.java:1337)
at java.util.regex.Pattern.compile(Pattern.java:1022)
at org.apache.hadoop.fs.GlobPattern.set(GlobPattern.java:156)
at org.apache.hadoop.fs.GlobPattern.<init>(GlobPattern.java:42)
at org.apache.hadoop.fs.GlobFilter.init(GlobFilter.java:67)
Please specify a way out.
You can read something like this.
sqlContext.read().json("s3n://bucket/filepath/*.json")
Also, you can use wildcards in the file path.
For example:
sqlContext.read().json("s3n://*/*/*-*[1034*-1434*]")
I'm trying to receive some data in csv format, what I read is that StrongLoop only works with json data. So can I receive csv and transform to json to process the data?
Thanks.
This isn't a StrongLoop specific question. It is a general Node.js and data question. As such, I will answer in a generic fashion, but it is applicable to StrongLoop.
You will need to use a library to convert the delimited file into a JavaScript object. There are many packages on npm for reading/parsing/transforming/etc. CSV files: search npm.
The package that I have used extensively is David's CSV parser.
These libraries will allow you to parse and transform CSV into JavaScript objects (JSON).
Beware, however, that most CSV that I have dealt with does not conform to well formatted CSV. They don't properly escape quotes, quote strings with delimiters, etc.