How can I make PDF.js read all my pages and not just one? - node.js

Good evening everyone, let me explain.
I am working with Node and Express and I am running into the following problem.
My PDF currently has 3 pages, but that can vary. What I need is a way to read however many pages the PDF has; I'm using pdf.js.
So, in summary:
I need it to read all 3 pages if the PDF has 3 pages, all 4 if it has 4, and so on. I was reading the information at https://mozilla.github.io/pdf.js/examples/ but it doesn't really help much. Here's a picture of what I've done.
doc.numPages returns the number of pages, but when I pass it to getPage(), since numPages is 3 in this case, it reads only the 3rd page.

It looks like you are only calling await doc.getPage() after counting all the pages, so you only ever get the last page.
I'd imagine you need to move the getPage and getTextContent calls into the for loop and save the results in a data structure like an array until you've read the whole PDF and are ready to return it. For example:
async function getAllPages(doc) {
  let pages = [];
  // Page numbers in pdf.js are 1-based and numPages is inclusive,
  // so loop from 1 through numPages.
  for (let i = 1; i <= doc.numPages; i++) {
    let page = await doc.getPage(i);
    let pageContent = await page.getTextContent();
    pages.push(pageContent);
  }
  return pages;
}
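For context, here's a minimal sketch of how you might call this from Node with pdfjs-dist; the require path and the "sample.pdf" filename are assumptions rather than code from the question:
const fs = require("fs");
const pdfjsLib = require("pdfjs-dist/legacy/build/pdf.js"); // path depends on your pdfjs-dist version

async function main() {
  // "sample.pdf" is a placeholder; pass the file contents as binary data.
  const data = new Uint8Array(fs.readFileSync("sample.pdf"));
  const doc = await pdfjsLib.getDocument({ data }).promise;
  const pages = await getAllPages(doc);
  // Each entry is a text-content object; join its items into one string per page.
  const pageTexts = pages.map(p => p.items.map(item => item.str).join(" "));
  console.log(`Read ${doc.numPages} pages`);
  console.log(pageTexts);
}

main().catch(console.error);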
(P.S. it's much easier to help if you paste code as text instead of sharing a screenshot)

Related

How to add object to file in nodejs?

The title pretty much explains it all. I'm trying to add objects to a JSON file from Node.js and can't seem to get it working.
Each file essentially looks like this:
[{"name":name,"date":date},{"name":name,"date":date}] (in simplest terms)
I want to be able to add an object to the array that is in that file. Here is the code I came up with:
for (o in collections) {
  fs.readFile(__dirname + "/HowIsCollections/" + collections[o].mintDate, 'utf8', function(err, data) {
    const dat = JSON.parse(data)
    const existedData = []
    //console.log(existedData)
    for (i in dat) {
      existedData.push(JSON.stringify(dat[i]))
    }
    const project = JSON.stringify(collections[o])
    if (!existedData.includes(project)) {
      console.log("?")
      dat.push(project)
    }
    fs.writeFileSync(__dirname + "/HowIsCollections/" + collections[o].mintDate, JSON.stringify(dat))
    console.log("????")
  })
}
It's pretty self-explanatory. From the top, it reads the file, gets the data, and puts every object found in the file into an array.
The second half of the code stringifies each object and compares it against that array to see if the object already exists in it (existedData, the data from the file). If it doesn't, it adds it. Then at the end I just resave the file.
dat is the array from the file, and dat.push(project) is meant to add the new object to it.
I have similar setups like this in other parts of my code, which work. This one, however, does not: I get no errors, nothing, it just doesn't work. All of my console.logs show, but that's it.
I tried looking on here for solutions, but most of them were about stringifying an object in fs.writeFile, which isn't what I need here.
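For what it's worth, here is a minimal sketch of the read/compare/append pattern described above, with one difference: the new entry is pushed as a parsed object rather than a JSON string, so the file stays a plain array of objects when it is re-serialised. The addIfMissing helper and the commented usage are assumptions, not code from the question:
const fs = require("fs");
const path = require("path");

// Read the JSON array, append the new object only if an identical one is not
// already present, then write the file back out synchronously.
function addIfMissing(filePath, newItem) {
  const dat = JSON.parse(fs.readFileSync(filePath, "utf8"));
  const serialised = JSON.stringify(newItem);
  if (!dat.some(entry => JSON.stringify(entry) === serialised)) {
    dat.push(newItem); // push the object itself, not the stringified version
    fs.writeFileSync(filePath, JSON.stringify(dat));
  }
}

// Hypothetical usage with the question's collections array:
// for (const c of collections) {
//   addIfMissing(path.join(__dirname, "HowIsCollections", c.mintDate), c);
// }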

pdfkit nodejs, one element per page from page 2

I'm using pdfkit to generate a PDF invoice.
When all my content fits on one page I have no issue.
However, when it doesn't fit and needs an extra page, I get strange behaviour:
Instead of adding the remaining elements on the second page, it only adds one line and the rest of the page is blank.
Then on the 3rd page I have another element and the rest is blank, then the 4th page, 5th, etc.
Here is the code corresponding to this part:
for (let i = 0; i < data.items.length; i++) {
  const item = data.items[i];
  this.itemPositionY = this.itemPositionY + 20;
  if (item.bio) this.containBioProduct = true;
  let itemName = item.bio ? `${item.item}*` : item.item;
  this.generateTableRow(
    doc,
    this.itemPositionY,
    itemName,
    "",
    this.formatCurrency(item.itemPriceDf.toFixed(2)),
    item.quantity,
    this.formatCurrency(item.itemPriceTotalDf.toFixed(2))
  );
  this.generateHr(doc, this.itemPositionY + 15);
}
Basically I just iterate over an array of products; for each line the Y position increases by 20.
Thanks for your help.
In case someone has this issue, here is a solution:
Everywhere in the code I know that an extra page could be generated, I add this:
if (this.position > 680) {
  doc.addPage();
  this.position = 50;
}
It lets you control the generation of new pages yourself (instead of pdfkit doing it automatically, with the problems that can cause).
You just need to track the position from the initialization of "this.position".
That way, every time it goes past a given Y position (680 in my case, which is a bit less than a full page in pdfkit), you call "doc.addPage()", which creates another page, and you reset your position to the top of the new page.
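Put together with the loop from the question, the idea looks roughly like this; it is a sketch rather than the exact invoice code, and this.position stands in for itemPositionY:
for (let i = 0; i < data.items.length; i++) {
  const item = data.items[i];
  this.position += 20;
  // If the next row would run past the usable page height, start a new
  // page and reset the cursor to the top margin before drawing the row.
  if (this.position > 680) {
    doc.addPage();
    this.position = 50;
  }
  this.generateTableRow(
    doc,
    this.position,
    item.bio ? `${item.item}*` : item.item,
    "",
    this.formatCurrency(item.itemPriceDf.toFixed(2)),
    item.quantity,
    this.formatCurrency(item.itemPriceTotalDf.toFixed(2))
  );
  this.generateHr(doc, this.position + 15);
}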

Suitescript Pagination

I've been trying to create a Suitelet that runs a saved search over a collection of item records in NetSuite using SuiteScript 1.0.
Pagination is quite easy everywhere else, but I can't get my head around how to do it in NetSuite.
For instance, we have 3,000 items and I'm trying to limit the results to 100 per page.
I'm struggling to understand how to apply a start row and a max row parameter as a filter so the search returns only that slice of records.
I've seen plenty of scripts that let you exceed the 1,000-record limit, but I'm trying to throttle the amount shown on screen, and I'm at a loss as to how to do this.
Any tips greatly appreciated.
function searchItems(request, response)
{
  var start = request.getParameter('start');
  var max = request.getParameter('max');
  if (!start)
  {
    start = 1;
  }
  if (!max)
  {
    max = 100;
  }
  var filters = [];
  filters.push(new nlobjSearchFilter('category', null, 'is', currentDeptID));
  var productList = nlapiSearchRecord('item', 'customsearch_product_search', filters);
  if (productList)
  {
    response.write('stuff here for the items');
  }
}
You can approach this a couple different ways. Either way, you will definitely need to sort your search results by something meaningful and consistent, like by internal ID. Make sure you've got your results sorted either in your saved search definition or by adding a search column in your script.
You can continue building your search exactly like you are, and then just use the native slice method on the productList array. You would use your start and max parameters to work out the arguments to pass to slice, as in the sketch below.
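A rough sketch of that, reusing the start and max variables from the question; note that nlapiSearchRecord returns at most 1,000 results, so this only covers searches under that cap:
// Sketch only: take the slice of results for the requested page.
var pageStart = parseInt(start, 10) - 1;     // 'start' is 1-based, slice is 0-based
var pageEnd = pageStart + parseInt(max, 10); // exclusive end index
var pageResults = productList ? productList.slice(pageStart, pageEnd) : [];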
Another approach is to use the async API for searches. It will look similar to this:
var search = nlapiLoadSearch("item", "customsearch_product_search");
search.addFilter(new nlobjSearchFilter('category',null,'is',currentDeptID));
var productList = search.runSearch().getResults(start, end);
For more references on this approach, check out the NetSuite Help page titled "Search APIs" and the reference page for nlobjSearch.

Extracting all text from a website to build a concordance

How can I grab all the text from a website? I don't just mean Ctrl+A/C. I'd like to be able to extract all the text from a website (and all the pages associated with it) and use it to build a concordance of words from that site. Any ideas?
I was intrigued by this, so I've written the first part of a solution.
The code is written in PHP because of the convenient strip_tags function. It's also rough and procedural, but I feel it demonstrates my ideas.
<?php
$url = "http://www.stackoverflow.com";
//To use this you'll need to get a key for the Readability Parser API http://readability.com/developers/api/parser
$token = "";
//I make an HTTP GET request to the Readability API and then decode the returned JSON
$parserResponse = json_decode(file_get_contents("http://www.readability.com/api/content/v1/parser?url=$url&token=$token"));
//I'm only interested in the content string in the JSON object
$content = $parserResponse->content;
//I strip the HTML tags from the article content
$wordsOnPage = strip_tags($content);
$wordCounter = array();
$wordSplit = explode(" ", $wordsOnPage);
//I then loop through each word in the article, keeping count of how many times I've seen it
foreach ($wordSplit as $word)
{
    incrementWordCounter($word);
}
//Then I sort the array so the most frequent words are at the end
asort($wordCounter);
//And dump the array
var_dump($wordCounter);

function incrementWordCounter($word)
{
    global $wordCounter;
    if (isset($wordCounter[$word]))
    {
        $wordCounter[$word] = $wordCounter[$word] + 1;
    }
    else
    {
        $wordCounter[$word] = 1;
    }
}
?>
I needed to do this to configure PHP for the SSL that the Readability API uses.
The next step in the solution would be to search for links in the page and call this recursively, in an intelligent way, to handle the "all the pages associated" requirement.
Also, the code above just gives the raw data of a word count; you would want to process it further to make it meaningful.

Spotify developer search

I am confused about how the search function works in the Spotify API. Their example is like this:
var sp = getSpotifyApi();
var models = sp.require('$api/models');

var search = new models.Search('Rihanna');
search.localResults = models.LOCALSEARCHRESULTS.APPEND;

var searchHTML = document.getElementById('results');

search.observe(models.EVENT.CHANGE, function() {
  var results = search.tracks;
  var fragment = document.createDocumentFragment();
  for (var i = 0; i < results.length; i++) {
    var link = document.createElement('li');
    var a = document.createElement('a');
    a.href = results[i].uri;
    link.appendChild(a);
    a.innerHTML = results[i].name;
    fragment.appendChild(link);
  }
  searchHTML.appendChild(fragment);
});

search.appendNext();
So, I guess that calling appendNext() initiates the search, and the inner function is called when it has results? But the results are limited to a certain number (50 by default) out of the total. How do you get the rest? Do you call appendNext() again recursively from inside the callback? And if you do, does your list then include the original results, or are the original results replaced? Does anyone know of an example that pages through all available results?
They also mention that if the search is running, appendNext() does nothing. So how do you gracefully wait until the current search is complete before getting the next 'page'?
Their documentation is terrible, IMHO. Say you have 1,000 search results total on the server and I want to see results 900-1000. Do I have to keep calling appendNext() over and over until I get to 900?
Thanks
Bob
There is no pagination when using the search functionality built into the Spotify Apps API. You can increase the number of results so it returns more than 50 (see the Search page in the documentation), although the amount is limited (it seems to be 200 tracks at the moment).
An alternative is to perform the requests against the Web API instead; a rough sketch follows.
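Something like this, assuming the current Web API search endpoint; the access token is a placeholder you would obtain through OAuth, and the per-request limit is capped (currently 50), so deep pages take several calls:
// Sketch only: page through track results with the Web API's limit/offset parameters.
async function fetchTrackPage(query, offset, limit, accessToken) {
  const url = "https://api.spotify.com/v1/search" +
    "?q=" + encodeURIComponent(query) +
    "&type=track" +
    "&limit=" + limit +
    "&offset=" + offset;
  const response = await fetch(url, {
    headers: { Authorization: "Bearer " + accessToken }
  });
  const body = await response.json();
  return body.tracks.items; // e.g. offset = 900 to start at result 900
}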
