Scrape web with x-ray - node.js

I'm using x-ray to extract some data from a web site, but when I try to crawl to another page using the built-in functionality, it simply doesn't work.
unitPrice is the property I want to extract, but I get "undefined" all the time.
As you can see, I'm passing the href value extracted earlier for the url property.
var Xray = require('x-ray');
var x = Xray({
filters: {
cleanPrice: function (value) {
return typeof value === 'string' ? value.replace(/\r|\t|\n|€/g, "").trim() : value
},
whiteSpaces: function (value) {
return typeof value === 'string' ? value.replace(/ +/g, ' ').trim() : value
}
}
});
x('https://www.simply.es/compra-online/aceite-vinagre-y-sal.html',
'#content > ul',
[{
name: '.descripcionProducto | whiteSpaces',
categoryId: 'input[name="idCategoria"]@value',
productId: 'input[name="idProducto"]@value',
url: 'li a@href',
price: 'span | cleanPrice',
image: '.miniaturaProducto@src',
unitPrice: x('li a@href', '.precioKilo')
}])
.paginate('.link@href')
.limit(1)
// .delay(500, 1000)
// .throttle(2, 1000)
.write('results.json')

There's a pull request that fixes this. In the meantime, you can apply its change yourself; it is just one line of code. See:
https://github.com/lapwinglabs/x-ray/pull/181

Related

What is a better way of constructing a filters object from query params?

I want to know what is a better or a "best practise" way of constructing filters object from query params which are optional so the code doesn't go messy (Nestjs, Prisma with Postgres).
Here's what I've got so far:
getHomes(
@Query('city') city?: string,
@Query('minPrice') minPrice?: string,
@Query('maxPrice') maxPrice?: string,
@Query('numberOfBedrooms') numberOfBedrooms?: string,
@Query('numberOfBathrooms') numberOfBathrooms?: string,
@Query('minLandSize') minLandSize?: string,
@Query('maxLandSize') maxLandSize?: string,
@Query('propertyType') propertyType?: PropertyType,
): Promise<HomeResponseDto[]> {
const price =
minPrice || maxPrice
? {
...(minPrice && { gte: parseFloat(minPrice) }),
...(maxPrice && { lte: parseFloat(maxPrice) }),
}
: undefined;
const landSize =
minLandSize || maxLandSize
? {
...(minLandSize && { gte: parseInt(minLandSize) }),
...(maxLandSize && { lte: parseInt(maxLandSize) }),
}
: undefined;
const filters = {
...(city && { city }),
...(numberOfBedrooms && { numberOfBedrooms: parseInt(numberOfBedrooms) }),
...(numberOfBathrooms && {
numberOfBathrooms: parseInt(numberOfBathrooms),
}),
...(price && { price }),
...(landSize && { landSize }),
...(propertyType && { propertyType }),
};
I'm building a filters object that I pass as the where object to the Prisma service, so it follows Prisma's rules, such as using a gte key for "greater than or equal" and so on.
I thought about creating a custom pipe for each param, or a generic one that takes the object key as an argument. For example, if we don't pass any argument to the pipe, the value is transformed into an object keyed by the param's own name. But if we pass gte or lte as the argument, the value is transformed into an object whose key is gte or lte.
Also, I've heard about query builders, but does that mean I have to write raw SQL queries?
I'd be grateful if you provide an example, thanks!
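One possible way to tidy this up, as a minimal sketch only: move the "skip it when it is missing" logic into a couple of small helpers and build the object in one place. The helper names (compact, toNumber, range, buildHomeFilters) are hypothetical and not part of NestJS or Prisma; the sketch assumes the query params arrive as optional strings and only produces a plain object in the shape used above.

```javascript
// Hypothetical helpers; nothing here is part of NestJS or Prisma.

// Drop keys whose value is undefined, so optional params simply vanish.
function compact(obj) {
  return Object.fromEntries(
    Object.entries(obj).filter(([, value]) => value !== undefined)
  );
}

// Parse an optional numeric string; missing, empty, or non-numeric input yields undefined.
function toNumber(value, parse = parseFloat) {
  if (value === undefined || value === '') return undefined;
  const parsed = parse(value);
  return Number.isNaN(parsed) ? undefined : parsed;
}

// Build a { gte, lte } range, or undefined when neither bound is given.
function range(min, max, parse) {
  const bounds = compact({ gte: toNumber(min, parse), lte: toNumber(max, parse) });
  return Object.keys(bounds).length > 0 ? bounds : undefined;
}

const toInt = (value) => parseInt(value, 10);

// Assemble the filters object from the raw (string) query params.
function buildHomeFilters(query) {
  return compact({
    city: query.city || undefined,
    numberOfBedrooms: toNumber(query.numberOfBedrooms, toInt),
    numberOfBathrooms: toNumber(query.numberOfBathrooms, toInt),
    price: range(query.minPrice, query.maxPrice, parseFloat),
    landSize: range(query.minLandSize, query.maxLandSize, toInt),
    propertyType: query.propertyType || undefined,
  });
}

const filters = buildHomeFilters({ city: 'Toronto', minPrice: '100000', numberOfBedrooms: '3' });
console.log(filters);
```

The resulting object contains only the keys that were actually supplied, so it can be handed to the where option unchanged. The same helpers could be wrapped in a pipe or a DTO transform if that fits the project better.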

Unable to fetch the specific element value using jsonpath node js library

I'm trying to fetch the value of an element in a complex JSON file based on the value of another field.
I want the value attribute (which is 100) where the currency is 'BRL'. The index is subject to change, so I want to select it by condition rather than by position.
This is what I have tried so far:
Script:
function test()
{
var result = jsonpath.query(payload,"$..client_balance[?(@.type == 'AVAILABLE')]");
console.log(result);
}
test();
Output:
[
{
amount: { currency: 'BRL', value: '100', skip: false },
type: 'AVAILABLE'
},
{
amount: { currency: 'USD', value: '10', skip: false },
type: 'AVAILABLE'
}
]
Now I just want to fetch the value attribute (which is 100) where the currency code is 'BRL'. I tried appending [?(@.currency == 'BRL')]
to the end of the path, but it returned an empty array.
Can someone help me solve this problem?
Updated:
Tried filter function to get the specific element value.
console.log(Object.values(payload).filter(element =>{
element.currency === 'BRL';
}));
Output:
[]
console.log(Object.values(payload).filter(element =>{
return element.amount.currency === 'BRL';
}));
I think this should work
This is a bit of a complex query, but it should get you what you're looking for.
Start with what you have, which returns the result set you posted:
$..client_balance[?(@.type == 'AVAILABLE')]
Add to this another filter which looks inside the amount field at the currency field for the comparison:
$..client_balance[?(@.type == 'AVAILABLE')][?(@.amount.currency === 'BRL')]
This should give just the one element:
[
{
amount: { currency: 'BRL', value: '100', skip: false },
type: 'AVAILABLE'
}
]
From here you want the value field. It sits inside amount, next to currency, so the path only needs to go through amount:
$..client_balance[?(@.type == 'AVAILABLE')][?(@.amount.currency === 'BRL')].amount.value
This should return
[
'100'
]
Please note that we are working on a specification for JSON Path. If this library chooses to adhere to it once published, the === will need to change to a == as this is what we've decided to support.
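For comparison, the same selection can be sketched in plain JavaScript without a JSONPath library, once you have the client_balance array in hand. The payload shape around client_balance, the extra BLOCKED entry, and the helper name availableValue are assumptions made for illustration; only the two AVAILABLE entries mirror the output posted in the question.

```javascript
// Hypothetical payload: only the AVAILABLE entries mirror the output above;
// the surrounding structure is assumed.
const payload = {
  account: {
    client_balance: [
      { amount: { currency: 'BRL', value: '100', skip: false }, type: 'AVAILABLE' },
      { amount: { currency: 'USD', value: '10', skip: false }, type: 'AVAILABLE' },
      { amount: { currency: 'BRL', value: '7', skip: false }, type: 'BLOCKED' },
    ],
  },
};

// Return the value of the first balance matching both conditions, or undefined.
function availableValue(balances, currency) {
  const match = balances.find(
    (entry) => entry.type === 'AVAILABLE' && entry.amount.currency === currency
  );
  return match ? match.amount.value : undefined;
}

// Logs the BRL value, which is the string '100' in this sample data.
console.log(availableValue(payload.account.client_balance, 'BRL'));
```

Note that both conditions are checked on the same array element, and the value is read from entry.amount, which matches where value lives in the posted output.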

In SuiteScript, can you set the customform field using record.submitFields?

I have a partner record where I would like to change the form if the category field is set to a certain value. However, I can't do this with certain SuiteScript functions, because changing the form wipes out any changes that were made to the record. I'm trying to work around this with an afterSubmit function that uses record.submitFields to change the form and then redirect.toRecord to reload the page with the change. However, it's not changing the form value. Is there a way to do this with record.submitFields? Am I doing something incorrectly?
var currentRecord = scriptContext.newRecord;
var category = currentRecord.getValue('category');
if(category == '3'){
try{
record.submitFields({
type: record.Type.PARTNER,
id: currentRecord.id,
values: {
'customform': '105'
}
});
log.debug('success');
} catch (e) {
log.error({title: 'error', details: e});
}
}
redirect.toRecord({
type: 'partner',
id: currentRecord.id,
});
}
Yes, you can. Whenever you create a URL for a record, you can generally add a cf parameter that takes the form id. It's the same value you'd use if you were setting the 'customform' field. So just skip the submitFields part and do:
redirect.toRecord({
type: 'partner',
id: currentRecord.id,
parameters:{
cf:105
}
});
You can also set the custom form using the submitFields call but that only works for some types of records.
If you need to do this in beforeLoad, here is a fragment in TypeScript. The trick to avoiding an infinite loop is to check whether you already have the correct form:
export function beforeLoad(ctx){
let rec : record.Record = ctx.newRecord;
let user = runtime.getCurrentUser();
if(user.roleCenter =='EMPLOYEE'){
if(rec.getValue({fieldId:'assigned'}) != user.id){
throw new Error('You do not have access to this record');
}
}else{
log.debug({
title:'Access for '+ user.entityid,
details:user.roleCenter
});
}
if(ctx.type == ctx.UserEventType.EDIT){
var approvalForm = runtime.getCurrentScript().getParameter({name:'custscript_kotn_approval_form'});
let rec : record.Record = ctx.newRecord;
if( 3 == rec.getValue({fieldId:'custevent_kotn_approval_status'})){
if(approvalForm != rec.getValue({fieldId:'customform'}) && approvalForm != ctx.request.parameters.cf){
redirect.toRecord({
type: <string>rec.type,
id : ''+rec.id,
isEditMode:true,
parameters :{
cf:approvalForm
}
});
return;
}
}
}
}
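The loop guard in the fragment above can be isolated as a small pure function, which makes the rule easier to see: redirect only when neither the record's current form nor the cf parameter already equals the target form. This is a sketch in plain JavaScript; needsFormRedirect is a hypothetical helper name and not part of the SuiteScript API.

```javascript
// Hypothetical helper, not part of the SuiteScript API: decides whether a
// redirect to the target form is still needed.
function needsFormRedirect(targetForm, currentForm, requestedForm) {
  // Compare as strings because form ids may arrive as numbers or strings.
  const target = String(targetForm);
  return target !== String(currentForm) && target !== String(requestedForm);
}

console.log(needsFormRedirect(105, '98', undefined));  // true: wrong form, no cf yet
console.log(needsFormRedirect(105, '98', '105'));      // false: cf already requests it
console.log(needsFormRedirect(105, '105', undefined)); // false: already on the form
```

In the beforeLoad fragment, the three arguments would correspond to approvalForm, the record's customform value, and ctx.request.parameters.cf.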

Node Discord.js Axios API Indexing

I want to index through a list of items stored in a json file and call to each API and bring back data. The below code shows indexing/map working by building an API link, but how do I get the whole API call and message to be inside the indexing so each list item is called and returned by the API:
// {"342671641006047252":["MSFT","AMZN","CVNA","TEAM"]}
console.log(list);
// This is indexing through the list and building the link
const tickers = list
.map((ticker, index) => `https://financialmodelingprep.com/api/v3/quote/${ticker}?apikey=6c7ee1f3c7a666228979fa0678fa22a3`)
return message.channel.send(tickers)
// This is going to the api for list[0]
axios.get('https://financialmodelingprep.com/api/v3/quote/'+list[0]+'?apikey=6c7ee1f3c7a666228979fa0678fa22a3').then(resp => {
console.log(resp.data);
let symbol = resp.data[0].symbol;
let price = resp.data[0].price;
let changesPercentage = resp.data[0].changesPercentage;
return message.channel.send({embed: {
color: 8311585,
fields: [{
name: "Ticker",
value: `${symbol}`,
inline: "true"
},
{
name: "Price",
value: `${price}`,
inline: "true"
},
{
name: "Change %",
value: `${changesPercentage}`,
inline: "true"
},
Never mind on this one! I used forEach instead of .map and got it working!
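The poster's forEach version was not shown. As an alternative sketch, and not the poster's actual fix, the whole call can also stay inside map if the resulting promises are awaited together with Promise.all. Here getQuote is a hypothetical stand-in for the HTTP request (for example, a function wrapping the axios.get call from the question and returning resp.data[0]), and quotesToEmbeds is a hypothetical helper name.

```javascript
// getQuote is a hypothetical stand-in for the HTTP call; it is expected to
// return a promise for one quote object ({ symbol, price, changesPercentage }).
async function quotesToEmbeds(tickers, getQuote) {
  // Start every request at once and wait for all of them to finish.
  const quotes = await Promise.all(tickers.map((ticker) => getQuote(ticker)));
  // Turn each quote into an embed shaped like the one in the question.
  return quotes.map((quote) => ({
    color: 8311585,
    fields: [
      { name: 'Ticker', value: `${quote.symbol}`, inline: true },
      { name: 'Price', value: `${quote.price}`, inline: true },
      { name: 'Change %', value: `${quote.changesPercentage}`, inline: true },
    ],
  }));
}

// Usage with a stub in place of the real API:
const fakeQuote = async (ticker) => ({ symbol: ticker, price: 1, changesPercentage: 0 });
quotesToEmbeds(['MSFT', 'AMZN'], fakeQuote).then((embeds) => {
  console.log(embeds.length); // 2
});
```

Each returned embed could then be sent in its own message, in the same way the question sends one for list[0].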

Sequelize: findAll based on param if param is not null and just findAll if param is null

I'm trying to get data using Sequelize in my Express app, with an MSSQL database. Here is my code:
getInstitution: function (req, res) {
var type = req.query.type,
limit = parseInt(req.query.limit),
offset = parseInt(req.query.offset);
Students.findAll({
limit: limit,
offset: offset,
order: [['createdAt', 'DESC']],
where: {
gender: type
}
})
}
Description:
Type => (1 is Male, 2 is Female).
When type is not null, I get the students filtered by that type. But how do I get all students (male and female) when type is null or doesn't contain any value?
I think it would work if I checked first: if type is not null, call one function that finds by type; otherwise call another function that finds all.
Is there a better way to do this without those two extra functions?
Thank you
How about applying the condition only in the where part?
getInstitution: function (req, res) {
var type = req.query.type,
limit = parseInt(req.query.limit),
offset = parseInt(req.query.offset);
Students.findAll({
limit: limit,
offset: offset,
order: [['createdAt', 'DESC']],
where: type ? { gender: type } : {} // I think {} or null will work.
})
}
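The conditional can also be pulled out into a tiny helper, which keeps the findAll call flat and makes the "no filter" cases explicit. This is only a sketch: buildWhere is a hypothetical name, and it assumes type arrives as an optional query-string value.

```javascript
// Hypothetical helper: build the `where` clause from an optional gender type.
function buildWhere(type) {
  // Missing, null, or empty values mean "no filter".
  if (type === undefined || type === null || type === '') return {};
  return { gender: type };
}

console.log(buildWhere('1'));       // { gender: '1' }
console.log(buildWhere(undefined)); // {}
```

The result would then be passed as where: buildWhere(req.query.type) in the findAll options.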
