This is the output of my program, printed with System.out.println, on a Linux server.
The problem is that I cannot see the output starting from serial number 1, because the lines below fill the entire screen.
Please tell me how I can scroll to the top.
341:allitems: tq_relation
342:allitems: tr_num
343:allitems: trader_id
344:allitems: tradetick
345:allitems: trading_state
346:allitems: treas_shrs
347:allitems: treas_stk
348:allitems: treasury_yield
349:allitems: trend
350:allitems: uask_quote
351:allitems: uask_quote_date
352:allitems: ubid_quote
353:allitems: ubid_quote_date
354:allitems: under_cusip
355:allitems: undersymbol
356:allitems: unique_symbol
357:allitems: unit_measure
358:allitems: unpriced
359:allitems: unsolicited
360:allitems: valoren
361:allitems: value_pr_shortinterest
362:allitems: value_shortinterest
363:allitems: vega
364:allitems: vl
365:allitems: vol
366:allitems: volatility12
367:allitems: vwap
368:allitems: wanted_off_bid_ind
369:allitems: wk52hi
370:allitems: wk52hidate
371:allitems: wk52hidate_t
372:allitems: wk52lo
373:allitems: wk52lodate
374:allitems: wk52lodate_t
375:allitems: wkinprog
376:allitems: xchg
377:allitems: xdate
378:allitems: xday
379:allitems: xmonth
380:allitems: xyear
381:allitems: yield
The power of Linux is that you can redirect the input and output of commands to other commands, for example by using the output of program 1 as the input of program 2. This can be achieved with a pipe:
program1|program2
More specifically, you could use a pager like less or more. With less you can scroll back and forth and search for patterns in the output. The more command displays the output page by page: pressing the space bar shows the next page, and Enter advances by one line. For your purpose, all you have to do is use
myprogram|less
or
myprogram|more
whichever you prefer. The other approach is to redirect the output to a file. You can do this with the redirection operator >, so
myprogram > log.txt
will write the output to log.txt.
There is even a third method, using the script command. When you type
script log.txt
at least in bash, this opens a subshell, and all input and output in that subshell gets logged to log.txt. You end the session with exit, and afterwards you can access the log file. In the last two cases you can view log.txt with your favorite text editor or pager.
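A minimal sketch of the redirect-to-file approach. The function myprogram is a hypothetical stand-in that prints 381 numbered lines like the listing in the question; with a Java program you would put something like java MyProgram in its place (the class name is an assumption), since System.out.println writes to standard output. The pager variants are only shown as comments because they need an interactive terminal.

```shell
#!/bin/sh
# Hypothetical stand-in for the real program: prints 381 numbered lines.
myprogram() {
    i=1
    while [ "$i" -le 381 ]; do
        echo "$i:allitems: item_$i"
        i=$((i + 1))
    done
}

# Send everything to a file instead of the terminal.
log=$(mktemp)
myprogram > "$log"

# The first lines are no longer lost off the top of the screen.
head -n 3 "$log"

# Interactively you would page through the output instead:
#   myprogram | less
#   less "$log"
```

Inside less, g jumps to the first line and G to the last.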
You can redirect the output of your program to a text file:
$ ./myprogram >output.txt
Then view the text file with any text editor you like.
You can also pipe the result to the more command:
$ ./myprogram | more
Hope everyone is fine. I am badly stuck and need your help.
I am using a for loop to collect all folder names:
for tamplate in /root/tool/nuclei-templates/*/
do
echo $tamplate
done
Output
/root/tool/nuclei-templates/brute-force/
/root/tool/nuclei-templates/cves/
/root/tool/nuclei-templates/dns/
/root/tool/nuclei-templates/files/
/root/tool/nuclei-templates/generic-detections/
/root/tool/nuclei-templates/panels/
/root/tool/nuclei-templates/payloads/
/root/tool/nuclei-templates/security-misconfiguration/
/root/tool/nuclei-templates/subdomain-takeover/
/root/tool/nuclei-templates/technologies/
/root/tool/nuclei-templates/tokens/
/root/tool/nuclei-templates/vulnerabilities/
/root/tool/nuclei-templates/workflows/
I pass this output to a tool that needs those folder paths. The tool also writes an output file, like this:
nuclei -l url.txt -t /root/tool/nuclei-templates/brute-force/ -o result.brute-force
Output
result.brute-force
But as I am using a for loop to automate the scan, I also need to generate a unique output file for each result.
I am expecting output like this:
result.brute-force
result.cves
result.dns
result.files
In general, for each template folder loaded by the for loop, it should generate an output file named after that specific template folder.
If everything works well, this should give me 13 unique results following the output pattern I mentioned.
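One way this could be done, sketched under the assumption that each result file should be named after the last path component of the template folder. The helper names result_name and scan_all and the DRY_RUN switch are made up for this illustration; the nuclei options are the ones used in the question.

```shell
#!/bin/sh
# Derive "result.<folder-name>" from a template directory path.
result_name() {
    local dir="${1%/}"        # drop the trailing slash left by the */ glob
    echo "result.${dir##*/}"  # keep only the last path component
}

# Run one scan per template folder under the given root.
# With DRY_RUN=1 the commands are only printed, so the names can be checked first.
scan_all() {
    local root="$1" template out
    for template in "$root"/*/; do
        [ -d "$template" ] || continue   # skip if the glob matched nothing
        out=$(result_name "$template")
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "nuclei -l url.txt -t $template -o $out"
        else
            nuclei -l url.txt -t "$template" -o "$out"
        fi
    done
}

# Example: DRY_RUN=1 scan_all /root/tool/nuclei-templates
```

The two parameter expansions avoid calling basename once per folder; basename "$template" would give the same folder name.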
I'm trying to figure out a good way to increase the productivity of my data entry job.
What I am looking to do is come up with a way to scrape data from a PDF and input it into Excel.
More specifically the data I am working with is from grocery store flyers. As it stands now we have to manually enter every deal in the flyer into a database. A sample of a flyer is http://weeklyspecials.safeway.com/customer_Frame.jsp?drpStoreID=1551
What I am hoping to do is have columns for products, price, and predefined options (Loyalty Cards, Coupons, Select Variety... that sort of thing).
Any help would be appreciated, and if I need to be more specific let me know.
After looking at the specific PDF linked to by the OP, I have to say that this is not quite displaying a typical table format.
It contains many images inside the "cells", but the cells are not all strictly vertically or horizontally aligned:
So this isn't even a 'nice' table, but an extremely ugly and awkward one to work with...
Having said that, I'll have to add:
Extracting even 'nice' tables from PDFs in general is extremely difficult...
Standard PDFs do not provide any hints about the semantics of what they draw on a page:
the only distinction that the syntax provides is the distinctions between vector elements (lines, fills,...), images and text.
Whether any character is part of a table or part of a line or just a lonely, single character within an otherwise empty area is not easy to recognize programmatically by parsing the PDF source code.
For a background about why the PDF file format should never, ever be thought of as suitable for hosting extractable, structured data, see this article:
Why Updating Dollars for Docs Was So Difficult (ProPublica-Website)
...but doing so with TabulaPDF works very well!
Having said the above now let me add this:
For an amazing open source family of tools for extracting tabular data from PDFs (unless they are scanned pages), one that gets better from week to week and contradicts what I said in my introductory paragraphs, check out TabulaPDF. See these links:
Introducing Tabula: Upload a PDF, get back tabular CSV data. Poof!
Tabula-Extractor: A Command Line Interface to Tabula
Tabula source code repository
Tabula API (upcoming, not ready yet)
Tabula-Extractor is written in Ruby.
In the background it makes use of PDFBox (which is written in Java) and a few other third-party libs.
To run, Tabula-Extractor requires JRuby-1.7 installed.
Installing Tabula-Extractor
I'm using the 'bleeding-edge' version of Tabula-Extractor directly from its GitHub source code repository.
Getting it to work was extremely easy, since on my system JRuby-1.7.4_0 is already present:
mkdir ~/svn-stuff
cd ~/svn-stuff
git clone https://github.com/tabulapdf/tabula-extractor.git git.tabula-extractor
Included in this Git clone will already be the required libraries, so no need to install PDFBox.
The command line tool is in the /bin/ subdirectory.
Exploring the command line options:
~/svn-stuff/git.tabula-extractor/bin/tabula -h
Tabula helps you extract tables from PDFs
Usage:
  tabula [options] <pdf_file>
where [options] are:
  --pages, -p <s>:         Comma separated list of ranges, or all. Examples:
                           --pages 1-3,5-7, --pages 3 or --pages all. Default
                           is --pages 1 (default: 1)
  --area, -a <s>:          Portion of the page to analyze
                           (top,left,bottom,right). Example: --area
                           269.875,12.75,790.5,561. Default is entire page
  --columns, -c <s>:       X coordinates of column boundaries. Example
                           --columns 10.1,20.2,30.3
  --password, -s <s>:      Password to decrypt document. Default is empty
                           (default: )
  --guess, -g:             Guess the portion of the page to analyze per page.
  --debug, -d:             Print detected table areas instead of processing.
  --format, -f <s>:        Output format (CSV,TSV,HTML,JSON) (default: CSV)
  --outfile, -o <s>:       Write output to <file> instead of STDOUT (default:
                           -)
  --spreadsheet, -r:       Force PDF to be extracted using spreadsheet-style
                           extraction (if there are ruling lines separating
                           each cell, as in a PDF of an Excel spreadsheet)
  --no-spreadsheet, -n:    Force PDF not to be extracted using
                           spreadsheet-style extraction (if there are ruling
                           lines separating each cell, as in a PDF of an Excel
                           spreadsheet)
  --silent, -i:            Suppress all stderr output.
  --use-line-returns, -u:  Use embedded line returns in cells. (Only in
                           spreadsheet mode.)
  --version, -v:           Print version and exit
  --help, -h:              Show this message
Extracting the table which the OP wants
I'm not even trying to extract this ugly table from the OP's monster PDF. I'll leave it as an exercise to those readers who are feeling adventurous enough...
Instead, I'll demo how to extract a 'nice' table. I'll take pages 651-653 from the official PDF-1.7 specification, here represented with screenshots:
I used this command:
~/svn-stuff/git.tabula-extractor/bin/tabula \
-p 651,652,653 -g -n -u -f CSV \
~/Downloads/pdfs/PDF32000_2008.pdf
After importing the generated CSV into LibreOffice Calc, the spreadsheet looks like this:
To me this looks like the perfect extraction of a table which did spread over 3 different PDF pages. (Even the newlines used within table cells made it into the spreadsheet.)
Update
Here is an asciinema screencast (which you can also download and replay locally in your Linux/MacOSX/Unix terminal with the help of the asciinema command line tool), starring tabula-extractor:
I have the R code below. It resides in a file called 'iot.R', and I am executing it on Linux.
I want to write the content of the variable 'fileinformation' to the file given by file=fileConn.
I thought that the 3rd line would solve the issue, but it does not give the required output :(
fileinformation = system(paste("file", filenames[1]))
#print(fileinformation)
cat(print(fileinformation),"\r\n","\r\n", file=fileConn)
When I run the file, I get the result below. It prints to my screen rather than writing to the file :(
> source('iot.R')
CH7Data_20130401T135010.csv: ASCII text, with CRLF line terminators
[1] 0
Update 1:
I also tried the command below, but didn't get the expected result:
cat(capture.output(fileinformation),"\r\n","\r\n", file=fileConn)
You need to set the intern argument to TRUE in your call to system. For instance:
fileinformation<-system("file cinzia_2.gif",intern=TRUE)
fileinformation
#[1] "cinzia_2.gif: GIF image data, version 89a, 640 x 640"
Of course, I tried this with a file on my own PC. With intern set to TRUE, the return value of system becomes the console output of the command instead of its exit status. Then, when you call cat, you don't need to wrap fileinformation in print; a simple cat(fileinformation,"\r\n","\r\n", file=fileConn) will suffice.
Hi, just a comment, as I don't have enough rep to comment in the normal way. But can't you use
write.table
to save the output to a file? It may be easier.
I would like to capture all the commands fired by a user in a session. This is needed for the purpose of auditing.
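I used something like the snippet below:
# Timestamp for the file name (hour before minute; the original had them swapped).
LoggedIn=$(date +"%B-%d-%Y-%H:%M")
HostName=$(hostname)
# Original login name, even after switching user with su.
UNIX_USER=$(who am i | cut -d " " -f 1)
echo " Please enter the Change Request Number for which you are logging in: "
read -r CR_NUMBER
# Quoted so that unexpected characters in the input cannot split the file name.
FileName="$HostName-$LoggedIn-$CR_NUMBER-$UNIX_USER"
script "$FileName"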
I have put this snippet in .profile file, so that as soon as the user logs in to a SU account this creates the file. The plan is to push this file to a central repository where an auditor can look into those files.
But there are a couple of problems with this.
The "script" command spools all the data from the session. For example, if a user cats a property file, the whole content of that property file is appended to the auditing file.
Unless the user runs the 'exit' command, the data is not spooled to the auditing file. If the user logs out without running exit, the auditing file will be empty.
Is there a better solution for auditing? The history file is not an option, since it does not tell me for which Change Request number (internal to my organisation) the commands were run. Is there any other way to capture only the commands that were run, but not their output?
Some of the previous discussion are here and here
I think this software exactly matches your need:
https://github.com/a2o/snoopy
I have a Perl script that creates a report based on an XML definition. Currently these definitions all exist as .xml files.
So I have the script run-report.pl, which can take a path to a definition file and create the report.
Now I want to create run-reports-from-db.pl, which will generate the report definition based on some database entries. I don't want to create temp files to pass to run-report.pl; I would just like to pass in the definition somehow.
So instead of saying:
run-report.pl -def=./path/to/def.xml
I want to be able to say:
run-report.pl --stream
And have the report definition available in <STDIN>
I am sure there is a pretty trivial way to do this?
If I understand your question correctly, all you need is one | (pipe).
./generate-xml-from-db.pl | ./run-report.pl --stream
Anything the first process in the pipeline prints to stdout will appear in the second process's stdin.
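The same idea as a minimal shell sketch. Both functions are hypothetical stand-ins for the two Perl scripts: the producer prints a definition to stdout, and the consumer reads it from stdin when given --stream, or from the file named by -def= otherwise.

```shell
#!/bin/sh
# Stand-in for generate-xml-from-db.pl: writes a definition to stdout.
generate_definition() {
    echo '<report name="demo"/>'
}

# Stand-in for run-report.pl: takes the definition from stdin or from a file.
run_report() {
    case "$1" in
        --stream) definition=$(cat) ;;               # read all of stdin
        -def=*)   definition=$(cat "${1#-def=}") ;;  # read the named file
        *)        echo "usage: run_report --stream | -def=FILE" >&2; return 1 ;;
    esac
    echo "building report from: $definition"
}

# No temp file needed: the pipe connects the producer's stdout to the consumer's stdin.
generate_definition | run_report --stream
```

Keeping both modes in the consumer means the existing -def= calls continue to work unchanged.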
As long as you read from STDIN, you have it available. Notice what happens when you take the code below, name it something like echo.pl, run it at the command line, and paste reams of text.
#!/usr/bin/perl -w
use 5.010;
use strict;
use warnings;
while ( <> ) {
say;
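}
<> is the Perl shorthand for "read from the files named on the command line, or from STDIN if there are none". If your script leaves an argument such as --stream in @ARGV, <> will try to open it as a file, so either remove the option from @ARGV first or read from <STDIN> explicitly.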
As long as the method you're using to launch the process gives you a handle on its standard input and output, you can just write the definition to that handle. You have to use the means available to you. In Java, for example, you would write to the stream returned by Process.getOutputStream(), which is connected to the child's standard input; in a batch command you pipe it; at a GUI terminal you can cut and paste.