Fetching Data from Web Page to Excel - excel

I want to fetch data and save it to Excel. The web data is in the following format:
Kawal store Rate this
Wz5a Delhi - 110018 | View Map
Call: XXXXXXXXXX
Distance : Less than 5 KM
Also See : Grocery Stores
Edit this
Photos
I want to save only the bold fields in the following format:
COLUMN1 COLUMN2 COLUMN3
Single search page contains different data formats; for example sometimes PHOTOS is not there.
Sample URL: http://www.justdial.com/Delhi/Grocery-Stores-%3Cnear%3E-Ramesh-Nagar/ct-70444/page-10
The page number can be changed to get the other pages in the series; the rest of the URL stays the same.

While generating an actual Excel file might be a heavy task, generating a CSV (Comma-Separated Values) file is much easier, and CSV files are associated with Excel-like applications on Windows, Mac, and Linux distributions.
If you insist on using Excel, here is a ready-made PHP utility for that.
Otherwise you can use PHP's built-in fputcsv function:
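<?php
if (isset($_GET['export']) && $_GET['export'] == 'csv') {
    // Placeholder download name; the original snippet left $filename undefined.
    $filename = 'export.csv';
    header('Content-Type: text/csv');
    header('Content-Disposition: attachment; filename=' . $filename);
    // php://output streams straight into the HTTP response body.
    $fp = fopen('php://output', 'w');
    $myData = array(
        array('1234', '567', 'efed', 'ddd'), // row1
        array('123', '456', '789'),          // row2
        array('"aaa"', '"bbb"')              // row3
    );
    foreach ($myData as $row) {
        fputcsv($fp, $row); // write one array as one CSV line
    }
    fclose($fp);
}
?>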
It's as easy as that: the lines above output a CSV file with rows and columns identical to the structure of the $myData array, so you can replace its content with whatever you'd like, even a result set from a DB.
For more information, see the CSV standard.

Related

In Ruby, how would one create new CSV's conditionally from an original CSV?

I'm going to use this as sample data to simplify the problem:
data_set_1
I want to split the contents of this csv according to Column A - DEPARTMENT and place them on new csv's named after the department.
If it were done in the same workbook (so it can fit in one image) it would look like:
data_set_2
My initial thought was something pretty simple like:
CSV.foreach('test_book.csv', headers: true) do |asset|
CSV.open("/import_csv/#{asset[1]}", "a") do |row|
row << asset
end
end
Since that should take care of the logic for me. However, from looking into it, CSV#foreach does not accept file access rights as second parameter, and it gets an error when I run it. Any help would be appreciated, thanks!
I don't see why you would need to pass file access rights to CSV#foreach. This method just reads the CSV. How I would do this is like so:
require 'csv'
# Parse the entire CSV into a table of rows.
orig_rows = CSV.parse(File.read('test_book.csv'), headers: true)
# Group the rows by department.
# This becomes { 'deptA' => [<rows>], 'deptB' => [<rows>], etc }
groups = orig_rows.group_by { |row| row[1] }
# Write each group of rows to its own file
groups.each do |dept, rows|
  CSV.open("/import_csv/#{dept}.csv", "w") do |csv|
    rows.each do |row|
      # CSV::Row exposes its values through #fields
      csv << row.fields
    end
  end
end
A caveat, though. This approach loads the entire CSV into memory, so it won't work if your file is very large. In that case, the "streaming" approach (line-by-line) that you show in your question would be preferable.
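For comparison, here is a rough sketch of the streaming variant, written in Python rather than Ruby. It reads one row at a time and keeps one open output file per department, so memory use does not grow with the input. The function name `split_csv_by_column`, the `ASSET` column, and the sample rows are made up for illustration; only the `DEPARTMENT` column and the file names come from the question.

```python
import csv
import tempfile
from pathlib import Path


def split_csv_by_column(source_path, out_dir, column):
    """Stream source_path row by row into <out_dir>/<column value>.csv files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    handles = {}  # column value -> open file handle
    writers = {}  # column value -> csv.DictWriter bound to that handle
    try:
        with open(source_path, newline="") as src:
            reader = csv.DictReader(src)
            for row in reader:
                key = row[column]
                if key not in writers:
                    # First row for this value: create its file and write the header.
                    handle = open(out_dir / f"{key}.csv", "w", newline="")
                    handles[key] = handle
                    writers[key] = csv.DictWriter(handle, fieldnames=reader.fieldnames)
                    writers[key].writeheader()
                writers[key].writerow(row)
    finally:
        for handle in handles.values():
            handle.close()
    return sorted(handles)


# Demo with made-up data in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "test_book.csv"
    source.write_text("DEPARTMENT,ASSET\nIT,laptop\nHR,desk\nIT,monitor\n")
    departments = split_csv_by_column(source, Path(tmp) / "import_csv", "DEPARTMENT")
    it_lines = (Path(tmp) / "import_csv" / "IT.csv").read_text().splitlines()

print(departments)
print(it_lines)
```

The trade-off is one open file handle per distinct department, which is fine for a handful of departments but would need a cap or an append-and-close strategy for thousands.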

Reading textfile and insert data into database

I have an exported text file that looks like this:
Text file
So it is built up like a table and it is unicode encoded. I don't create the export file, so please don't tell me to use csv files.
I already have a MariaDB database in place with a table that contains the respective headers (ID, Name, ..).
My goal is to read the data from the text file and insert it correctly into the database. I am using Node.js and would like to know what steps I need to follow in order to accomplish my goal.
Can I use this instruction URL? I already tried it this way but I think the Unicode encoding caused some problems.
You should really consider using CSV files to import/export any column-related data; a tabular structure is implied by the format.
In this case, you'll have to write some sort of parser that reads your file one line at a time and then splits the data using multiple spaces as a delimiter. Overall, not really worth it.
Look into using CSVs; there are even npm modules available for working with CSV.
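If the export format really cannot change, the line-by-line parser described above is not much code. Here is a minimal sketch in Python (not Node.js), assuming columns are separated by runs of two or more spaces or by a tab, and that no cell is empty. The function name `parse_aligned_table` and the sample rows are invented for illustration.

```python
import re


def parse_aligned_table(text):
    """Parse a text table whose columns are separated by 2+ whitespace chars or a tab."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    splitter = re.compile(r"\s{2,}|\t")
    headers = splitter.split(lines[0])
    records = []
    for line in lines[1:]:
        values = splitter.split(line)
        # Pair each value with its header; a single space inside a cell is preserved.
        records.append(dict(zip(headers, values)))
    return records


# Made-up sample in the spirit of the question's table (ID, Name, ...).
sample = """\
ID    Name        Date    State
1     test        2010    US
2     John Smith  2011    DE
"""
records = parse_aligned_table(sample)
for record in records:
    print(record)
```

When reading from disk, the encoding has to be passed explicitly. If "unicode encoded" means UTF-16 here (an assumption), `open(path, encoding="utf-16")` would be the place to say so. Empty cells would shift the columns, in which case slicing by fixed character offsets is the safer route.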
So the way you would approach this is with streams. You would read each line into the buffer, parse it and save the data into the database.
Have a look at the csv-stream library (https://github.com/klaemo/csv-stream), which the code below loads under its npm name, csv-streamify. It does the parsing for you and you can configure it to use tab as the delimiter.
const csv = require('csv-streamify')
const fs = require('fs')
const parser = csv({
delimiter: '\t',
columns: true,
});
// emits each line as an object with headers as keys and row values as properties
parser.on('data', (line) => {
console.log(line)
// { ID: '1', Name: 'test', Date: '2010', State: 'US' }
// Insert row into DB
// ...
})
fs.createReadStream('data.txt').pipe(parser)

ADLA Job: Write To Different Files Based On Line Content

I have a BUNCH of fixed width text files that contain multiple transaction types with only 3 that I care about (121,122,124).
Sample File:
D103421612100188300000300000000012N000002000001000032021420170012260214201700122600000000059500000300001025798
D103421612200188300000300000000011000000000010000012053700028200004017000000010240000010000011NNYNY000001000003N0000000000 00
D1034216124001883000003000000000110000000000300000100000000000CS00000100000001200000033NN0 00000001200
So What I need to do is read line by line from these files and look for the ones that have a 121, 122, or 124 at startIndex = 9 and length = 3.
Each line needs to be parsed based on a data dictionary I have and the output needs to be grouped by transaction type into three different files.
I have a process that works but it's very inefficient, basically reading each line 3 times. The code I have is something like this:
@121 = EXTRACT
    col1 string,
    col2 string,
    col3 string //etc...
FROM inputFile
USING new MyCustomExtractor(
    new SQL.MAP<string, string> {
        {"col1","2"},
        {"col2","6"},
        {"col3","3"} //etc...
    }
);
OUTPUT @121
TO "121.csv"
USING Outputters.Csv();
And I have the same code for 122 and 124. My custom extractor takes the SQL MAP and returns the parsed line and skips all lines that don't contain the transaction type I'm looking for.
This approach also means I'm running through all the lines in a file 3 times. Obviously this isn't as efficient as it could be.
What I'm looking for is a high level concept of the most efficient way to read a line, determine if it is a transaction I care about, then output to the correct file.
Thanks in advance.
How about pulling out the transaction type early using the Substring method of the String datatype? Then you can do some work with it, filtering etc. A simple example:
// Test data
@input = SELECT *
FROM (
VALUES
( "D103421612100188300000300000000012N000002000001000032021420170012260214201700122600000000059500000300001025798" ),
( "D103421612200188300000300000000011000000000010000012053700028200004017000000010240000010000011NNYNY000001000003N0000000000 00" ),
( "D1034216124001883000003000000000110000000000300000100000000000CS00000100000001200000033NN0 00000001200" ),
( "D1034216999 0000000000000000000000000000000000000000000000000000000000000000000000000000000 00000000000" )
) AS x ( rawData );
// Pull out the transaction type
@working =
SELECT rawData.Substring(8,3) AS transactionType,
rawData
FROM @input;
// !!TODO do some other work here
@output =
SELECT *
FROM @working
WHERE transactionType IN ("121", "122", "124"); //NB Note the case-sensitive IN clause
OUTPUT @output TO "/output/output.csv"
USING Outputters.Csv();
As of today, there is no specific U-SQL function that can define the output location of a tuple on the fly.
wBob presented an approach to a potential workaround. I'd extend the solution the following way to address your need:
Read the entire file, adding a new column that helps you identify the transaction type.
Create 3 rowsets (one for each file) using a WHERE statement with the specific transaction type (121, 122, 124) on the column created in the previous step.
Output each rowset created in the previous step to their individual file.
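To make the idea concrete outside U-SQL, here is a small Python sketch of the same pattern: tag each line with its transaction type once, then partition on that tag. This is only an illustration of the concept, not U-SQL code. The function name `route_by_transaction_type` is made up, and the sample lines are truncated copies of the ones in the question.

```python
def route_by_transaction_type(lines, wanted=("121", "122", "124"), start=8, length=3):
    """Group lines by the code found at a fixed offset, reading each line once."""
    groups = {code: [] for code in wanted}
    for line in lines:
        # Zero-based offset 8 corresponds to startIndex = 9 in the question.
        code = line[start:start + length]
        if code in groups:
            groups[code].append(line)
    return groups


# Truncated versions of the sample lines, plus one type that should be ignored.
lines = [
    "D10342161210018830000",
    "D10342161220018830000",
    "D10342161240018830000",
    "D1034216999 000000000",
]
groups = route_by_transaction_type(lines)
for code, matched in groups.items():
    print(code, len(matched))
```

Each list in `groups` would then be parsed with its own data dictionary and written to its own file.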
If you have more feedback or needs, feel free to create an item (and vote for others) on our UserVoice site: https://feedback.azure.com/forums/327234-data-lake. Thanks!

Export data to excel from drupal

I need to export all my users with their submitted webform data to an Excel file. I can export the users, but I don't know how to do this with the related webforms. Please help me.
How about this:
select CONCAT(GROUP_CONCAT(CONCAT('"', sd.data, '"' )), ', "',u.uid, '","', u.name, '", "', u.mail,'"') from webform_submitted_data sd JOIN webform_submissions s ON s.sid = sd.sid JOIN users u ON u.uid = s.uid GROUP by s.sid LIMIT 1 INTO OUTFILE '/Users/nandersen/Downloads/users.csv' FIELDS TERMINATED BY '' ENCLOSED BY '' LINES TERMINATED BY '\n';
You need to group-concatenate the webform data, which is spread over multiple rows in one column, and join it with the user data, which is spread over multiple columns in one row. By concatenating the two parts separately and grouping by the submission ID, you get the data you need.
Will output something like this:
"Nate","Andersen","nate@test.com","123 Atlanta Avenue","Nederland","Texas","12345","4095496504","safe_key4","safe_key6","safe_key2","09/07/1989", "69","oknate", "nate@test.com"
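The grouping step can be tried out without a Drupal install. Below is a rough sketch using Python's built-in sqlite3 module with a heavily simplified, assumed schema (the real Drupal tables have more columns) and made-up rows. SQLite's `GROUP_CONCAT` plays the role of MySQL's here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-ins for the Drupal tables; the column sets are assumptions.
conn.executescript("""
    CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT, mail TEXT);
    CREATE TABLE webform_submissions (sid INTEGER PRIMARY KEY, uid INTEGER);
    CREATE TABLE webform_submitted_data (sid INTEGER, data TEXT);
    INSERT INTO users VALUES (69, 'oknate', 'nate@test.com');
    INSERT INTO webform_submissions VALUES (1, 69);
    INSERT INTO webform_submitted_data VALUES (1, 'Nate'), (1, 'Andersen'), (1, 'Texas');
""")

# One output row per submission: user columns plus all answers joined with '|'.
query = """
    SELECT u.uid, u.name, u.mail, GROUP_CONCAT(sd.data, '|') AS answers
    FROM webform_submitted_data sd
    JOIN webform_submissions s ON s.sid = sd.sid
    JOIN users u ON u.uid = s.uid
    GROUP BY s.sid, u.uid, u.name, u.mail
"""
rows = conn.execute(query).fetchall()
for uid, name, mail, answers in rows:
    print(uid, name, mail, answers.split("|"))
```

SQLite does not guarantee the order of the concatenated values, so anything that depends on answer order should not rely on this sketch as written.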

How can I import data from text files into Excel?

I have multiple folders. There are multiple txt files inside these folders. I need to extract data (just a single value: value --->554) from a particular type of txt file in each folder (individual_values.txt).
No 100 Value 555 level match 0.443 top level 0.443 bottom 4343
There will be many folders with the same txt file names but different values. Can all these values be copied to Excel, one below the other?
I have to extract a value from the txt file I mentioned above. It is a text file with the same name located inside different folders. All I want to do is extract this value from every such text file and paste it into Excel or a txt file, one value per row.
E.g. the above is a text file; here I have to get the value 555, and similarly the other values from the other files:
555
666
666
776
Yes.
(you might want to clarify your question )
Your question isn't very clear, I imagine you want to know how this can be done.
You probably need to write a script that traverses the folders, reads the individual files, parses them for the value you want, and generates a Comma Separated Values (CSV) file. CSV files can easily be imported to Excel.
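A minimal sketch of such a script in Python, assuming the number to extract always follows the word "Value" and that the files are all named individual_values.txt as in the question. The function names, folder names, and the output file name are made up, and the demo builds its own throwaway folders.

```python
import csv
import re
import tempfile
from pathlib import Path

VALUE_PATTERN = re.compile(r"\bValue\s+(\d+)", re.IGNORECASE)


def collect_values(root, filename="individual_values.txt"):
    """Return the number after 'Value' from every file named `filename` under root."""
    values = []
    for path in sorted(Path(root).rglob(filename)):
        match = VALUE_PATTERN.search(path.read_text())
        if match:
            values.append(match.group(1))
    return values


def write_column(values, out_path):
    """Write one value per row so Excel opens them one below the other."""
    with open(out_path, "w", newline="") as handle:
        csv.writer(handle).writerows([value] for value in values)


# Demo: two made-up folders, each holding one individual_values.txt.
with tempfile.TemporaryDirectory() as tmp:
    for folder, value in (("folder_a", 555), ("folder_b", 666)):
        target = Path(tmp) / folder
        target.mkdir()
        (target / "individual_values.txt").write_text(
            f"No 100 Value {value} level match 0.443 top level 0.443 bottom 4343\n"
        )
    values = collect_values(tmp)
    out_path = Path(tmp) / "values.csv"
    write_column(values, out_path)
    written = out_path.read_text().split()

print(values)
```

Sorting the paths keeps the output order stable between runs; drop the `sorted` call if the order does not matter.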
There are two or three basic methods you can use to get stuff into an Excel spreadsheet.
You can use OLE wrappers to manipulate Excel.
You can write the file in a binary form
You can use Excel's import methods to take delimited text in as a spreadsheet.
I chose the last way, because 1) it is the simplest, and 2) your problem, as stated, does not require anything more complex. The solution below outputs a tab-delimited text file that Excel can easily import.
In Perl:
use IO::File;
my @field_names = split m|/|, 'No/Value/level match/top level/bottom';
#' # <-- catch runaway quote
my $input = IO::File->new( '<data.txt' );
die 'Could not open data.txt for input!' unless $input;
my @data_rows;
while ( my $line = <$input> ) {
    # Capture label/number pairs straight into a hash.
    my %fields = $line =~ /(level match|top level|bottom|Value|No)\s+(\d+\S*)/g;
    push @data_rows, \%fields if exists $fields{Value};
}
$input->close();
my $tab_file = IO::File->new( '>data.tab' );
die 'Could not open data.tab for output!' unless $tab_file;
$tab_file->print( join( "\t", @field_names ), "\n" );
foreach my $data_ref ( @data_rows ) {
    # Hash slice: pull the fields out in header order.
    $tab_file->print( join( "\t", @$data_ref{@field_names} ), "\n" );
}
$tab_file->close();
NOTE: Excel's text processing is really quite neat. Try opening the text below (replacing the \t with actual tabs) -- or even copying and pasting it:
1\t2\t3\t=SUM(A1:C1)
I chose C#, because I thought it would be fun to use a recursive lambda. This will create the CSV file containing matches to the regex pattern.
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

string root_path = @"c:\Temp\test";
string match_filename = "test.txt";
Func<string,string,StringBuilder, StringBuilder> getdata = null;
getdata = (path,filename,content) => {
    // Collect the matched value from every matching file in this directory.
    Directory.GetFiles(path)
        .Where(f=>
            Path.GetFileName(f)
                .Equals(filename,StringComparison.OrdinalIgnoreCase))
        .Select(f=>File.ReadAllText(f))
        .Select(c=> Regex.Match(c, @"value[\s\t]*(\d+)",
            RegexOptions.IgnoreCase))
        .Where(m=>m.Success)
        .Select(m=>m.Groups[1].Value)
        .ToList()
        .ForEach(m=>content.AppendLine(m));
    // Recurse into subdirectories.
    Directory.GetDirectories(path)
        .ToList()
        .ForEach(d=>getdata(d,filename,content));
    return content;
};
File.WriteAllText(
    Path.Combine(root_path, "data.csv"),
    getdata(root_path, match_filename, new StringBuilder()).ToString());
No.
just making sure you have a 50/50 chance of getting the right answer
(assuming it was a question answerable by Yes and No) hehehe
File_not_found
Gotta have all three binary states for the response.
