Reading a text file and inserting data into a database - node.js

I have an exported text file that looks like this:
(screenshot of the exported text file)
So it is built up like a table and it is Unicode encoded. I don't create the export file, so please don't tell me to use CSV files.
I already have a MariaDB database in place with a table that contains the respective headers (ID, Name, ..).
My goal is to read the data from the text file and insert it correctly into the database. I am using Node.js and would like to know what steps I need to follow in order to accomplish my goal.
Can I use this instruction URL? I already tried it this way, but I think the Unicode encoding caused some problems.

You should really consider using CSV files to import/export any column-related data; a tabular structure is more or less implied by the format.
In this case you'll have to write some sort of parser that reads your file one line at a time and splits the data using multiple spaces as a delimiter, which is doable but not really worth it (a rough sketch is below).
Look into using CSVs; there are even npm modules available for working with CSV.
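If you do stay with the exported text file, a minimal line-by-line parser might look something like this (just a sketch, assuming the columns are separated by tabs or runs of two or more spaces and that the first line holds the headers; data.txt is a placeholder file name):
const fs = require('fs')
const readline = require('readline')

const rl = readline.createInterface({
  input: fs.createReadStream('data.txt', 'utf8')
})

let headers = null
rl.on('line', (line) => {
  // split on tabs or runs of two or more spaces between columns
  const cells = line.trim().split(/\t|\s{2,}/)
  if (!headers) {
    headers = cells // the first line contains the column names
    return
  }
  // build an object like { ID: '1', Name: 'test', ... }
  const row = {}
  headers.forEach((h, i) => { row[h] = cells[i] })
  console.log(row)
  // insert row into the database here
})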

So the way you would approach this is with streams. You would read each line into a buffer, parse it, and save the data into the database.
Have a look at the csv-streamify library (https://github.com/klaemo/csv-stream). It does the parsing for you, and you can configure it to use tab as the delimiter.
const csv = require('csv-streamify')
const fs = require('fs')

const parser = csv({
  delimiter: '\t',
  columns: true,
});

// emits each line as an object with keys as headers and properties as row values
parser.on('data', (line) => {
  console.log(line)
  // { ID: '1', Name: 'test', Date: '2010', State: 'US' }
  // Insert row into DB
  // ...
})

fs.createReadStream('data.txt').pipe(parser)
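For the database part, a minimal sketch using the mariadb driver could look like this (the connection settings, table name, and column names below are placeholders; adjust them to your schema):
const mariadb = require('mariadb')

const pool = mariadb.createPool({
  host: 'localhost',      // placeholder connection settings
  user: 'user',
  password: 'password',
  database: 'mydb'
})

parser.on('data', async (line) => {
  try {
    // "my_table" and the column list are placeholders for your actual schema
    await pool.query(
      'INSERT INTO my_table (ID, Name, Date, State) VALUES (?, ?, ?, ?)',
      [line.ID, line.Name, line.Date, line.State]
    )
  } catch (err) {
    console.error(err)
  }
})
If the export is UTF-16 (what Windows tools often call "Unicode"), the encoding problems you mention may go away if you pass the right encoding when reading the file, e.g. fs.createReadStream('data.txt', { encoding: 'utf16le' }), before piping into the parser.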

Related

Recognizing Date values in CSVtoJSON - node

I'm trying to convert CSVs to JSONs in Node, and am running into a problem dynamically parsing date values (in addition to the built-in checkType: true for numbers and boolean values).
Using CSVtoJSON (csvtojson), I'm able to write an explicit declaration in the colParser parameter,
const jsonArray = await csv({
  checkType: true,
  colParser: {
    'Measurement Timestamp': (item) => { return new Date(item); }
  }
}).fromFile(csvFilePath);
and this works well, but I want to be able to recognize the content on its own without me needing to explicitly write the header name.
I'm even happy to use a filter like let isTimeDate = (data) => /time|date/i.test(data) to run on the header names prior to the parsing, but I'm running into a wall trying to have the colParser cycle through that list.
I also don't want to have to run an isDate function on every single element, because that's just wasteful.
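One way around this (just a sketch, not necessarily the best option) is to read the header row first, build the colParser object from the headers that match the regex, and then run the real parse. The csvFilePath value and the simple comma split on the header line are assumptions here:
const csv = require('csvtojson')
const fs = require('fs')

const csvFilePath = 'data.csv' // placeholder path

// grab just the header row, assuming it contains no quoted commas
const [headerLine] = fs.readFileSync(csvFilePath, 'utf8').split(/\r?\n/)
const headers = headerLine.split(',')

let isTimeDate = (data) => /time|date/i.test(data)

// build a colParser entry for every header that looks like a date/time column
const colParser = {}
for (const name of headers) {
  if (isTimeDate(name)) {
    colParser[name] = (item) => new Date(item)
  }
}

const jsonArray = await csv({ checkType: true, colParser }).fromFile(csvFilePath)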

In Ruby, how would one create new CSV's conditionally from an original CSV?

I'm going to use this as sample data to simplify the problem:
(sample data screenshot: data_set_1)
I want to split the contents of this CSV according to Column A - DEPARTMENT and place them in new CSVs named after each department.
If it were done in the same workbook (so it can fit in one image) it would look like:
(screenshot: data_set_2)
My initial thought was something pretty simple like:
CSV.foreach('test_book.csv', headers: true) do |asset|
  CSV.open("/import_csv/#{asset[1]}", "a") do |row|
    row << asset
  end
end
Since that should take care of the logic for me. However, from looking into it, CSV#foreach does not accept file access rights as a second parameter, and it raises an error when I run it. Any help would be appreciated, thanks!
I don't see why you would need to pass file access rights to CSV#foreach. This method just reads the CSV. How I would do this is like so:
# Parse the entire CSV into an array.
orig_rows = CSV.parse(File.read('test_book.csv'), headers: true)
# Group the rows by department.
# This becomes { 'deptA' => [<rows>], 'deptB' => [<rows>], etc }
groups = orig_rows.group_by { |row| row[1] }
# Write each group of rows to its own file
groups.each do |dept, rows|
  CSV.open("/import_csv/#{dept}.csv", "w") do |csv|
    rows.each do |row|
      csv << row.fields
    end
  end
end
A caveat, though. This approach does load the entire CSV into memory, so if your file is very large, it wouldn't work. In that case, the "streaming" (line-by-line) approach that you show in your question would be preferable.

Node.js: readfile reads just one line?

I'm trying to parse a CSV file line by line.
However, when I output the contents of the file, it shows just one line.
Here's the code:
fs.readFile('data.csv', 'utf8', function (err, data) {
  if (err) {
    return console.log(err);
  }
  console.log(data)
  var tbl = data.split('\n');
  console.log(tbl.length);
})
First console.log outputs just one line of data while tbl.length outputs 1.
Why is it reading just one line instead of the entire file?
EDIT: Something strange is going on: if I do data.length I get 580218, which is much more than the one line I'm getting as output.
Wanted to give Jonathan the chance to answer so he could get the points.
So there are a couple of issues going on here.
It was listing just one line from the CSV instead of the whole data.
Turns out JSON.stringify(string) did the trick.
The extra lines or invalid characters may have caused it to output just one line instead of the whole file.
The array.length for the split operation returned 1. I noticed later that the entire CSV file was the [0] element of the array. Apparently it had something to do with the newlines in the string.
So I stringified the CSV and improved my split line a bit, and it worked.
Here's the modified code:
tbl = data.replace(/(\r\n|\n|\r)/gm,"|");
tbl = tbl.split("|")
console.log(tbl.length);
Voila!
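A slightly more direct variant of the same idea (shown only as a sketch) is to split on a regular expression that covers all three newline styles in one step, without the intermediate replace:
var tbl = data.split(/\r\n|\n|\r/);
console.log(tbl.length);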

Any way to figure out what language a certain file is in?

If I have an arbitrary file sent to me, using Node.js, how can I figure out what language it's in? It could be a PHP file, HTML, HTML with JavaScript inline, JavaScript, C++ and so on. Given that each of these languages is unique, but shares some syntax with other languages.
Are there any packages or concepts available to figure out what programming language a particular file is written in?
You'd need to get the extension of the file. Are you getting this file with a name that includes the extension, or just the raw file? There is no way to tell unless you either get the file name with the extension, or scan the directory it's uploaded to, grab the names of the files, and loop through them with a directory listing. Node has file system abilities, so both options work. You need the file's name with its extension saved in a variable or array to do this. Depending on how you handle it, you can build a map of file types keyed by extension; optionally you can try using the node.js mime module.
Example:
var fileExtensions = {h : "C/C++ header", php : "PHP file", jar : "Java executable"};
You can either split the string that contains the file's name using split(), or use indexOf() with substring().
Split Example:
var fileName = "hey.h"; // C/C++/OBJ-C header
var fileParts = fileName.split(".");
// result would be...
// fileParts[0] = "hey";
// fileParts[1] = "h";
Now you can loop over the extensions object to find the matching key and return the description of the file type you set in the object literal. Alternatively, you can use a 2D array with a for loop over the numeric index: check the first element to see whether it's the extension and return the second element (index 1).
indexOf Example:
var fileName = "hey.h";
var delimiter = ".";
var extension = fileName.substring(fileName.indexOf(delimiter) + 1);
Now loop through the object and compare the values.
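Putting it together, a direct lookup on the extension map is usually enough instead of a loop, since object keys can be read directly (the map below is just the example from above):
var fileExtensions = { h: "C/C++ header", php: "PHP file", jar: "Java executable" };

var fileName = "hey.h";
var extension = fileName.substring(fileName.indexOf(".") + 1);

// direct lookup instead of looping over every key
var description = fileExtensions[extension] || "Unknown file type";
console.log(description); // "C/C++ header"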

Fetching Data from Web Page to Excel

I want to fetch data & save it to Excel. Web data is the following format:
Kawal store Rate this
Wz5a Delhi - 110018 | View Map
Call: XXXXXXXXXX
Distance : Less than 5 KM
Also See : Grocery Stores
Edit this
Photos
I want to save only the bold fields in the following format:
COLUMN1 COLUMN2 COLUMN3
A single search page contains different data formats; for example, sometimes PHOTOS is not there.
Sample URL: http://www.justdial.com/Delhi/Grocery-Stores-%3Cnear%3E-Ramesh-Nagar/ct-70444/page-10
The page number can be changed to get other data in the series while keeping the rest of the URL the same.
While generating an actual Excel file might be a heavy task, generating a CSV (Comma-Separated Values) file is much easier, and CSV files are associated with all Excel-like applications on Windows, Mac, and all Linux distributions.
If you must insist on using Excel, here is a ready-made PHP utility for that.
Otherwise you can use the built-in fputcsv PHP function:
<?php
if (isset($_GET['export']) && $_GET['export'] == 'csv') {
    // $filename should be set to the desired download file name
    header('Content-Type: text/csv');
    header('Content-Disposition: attachment;filename=' . $filename);
    $fp = fopen('php://output', 'w');
    $myData = array(
        array('1234', '567', 'efed', 'ddd'), // row1
        array('123', '456', '789'),          // row2
        array('"aaa"', '"bbb"')              // row3
    );
    foreach ($myData as $row) {
        fputcsv($fp, $row);
    }
    fclose($fp);
}
?>
It's as easy as that: the code above will output a CSV file with rows and columns identical to the structure of the $myData array, so you can basically replace its contents with whatever you'd like, even a result set from a DB.
For more information, see the CSV standard.
