Grabbing text from webpage and storing as variable

Grabbing text from webpage and storing as variable - text

On the webpage
http://services.runescape.com/m=itemdb_rs/Armadyl_chaps/viewitem.ws?obj=19463
It lists prices for a particular item in a game, I wanted to grab the "Current guide price:" of said item, and store it as a variable so I could output it in a google spreadsheet. I only want the number, currently it is "643.8k", but I am not sure how to grab specific text like that.
Since the number is in "k" form, that means I can't graph it, It would have to be something like 643,800 to make it graphable. I have a formula for it, and my second question would be to know if it's possible to use a formula on the number pulled, then store that as the final output?
-EDIT-
This is what I have so far and it's not working not sure why.
function pullRuneScape() {
var page = UrlFetchApp.fetch("http://services.runescape.com/m=itemdb_rs/Armadyl_chaps/viewitem.ws?obj=19463").getContentText();
var number = page.match(/Current guide price:<\/th>\n(\d*)/)[1];
SpreadsheetApp.getActive().getSheetByName('RuneScape').appendRow([new Date(), number]);
}

Your regex is wrong. I tested this one successfully:
var number = page.match(/Current guide price:<\/th>\s*<td>([^<]*)<\/td>/m)[1];
What it does:
Current guide price:<\/th> find Current guide price: and closing td tag
\s*<td> allow whitespace between tags, find opening td tag
([^<]*) build a group and match everything except this char <
<\/td> match the closing td tag
/m match multiline

Use UrlFetch to get the page [1]. That'll return an HTTPResponse that you can read with GetBlob [2]. Once you have the text you can use regular expressions. In this case just search for 'Current guide price:' and then read the next row. As to remove the 'k' you can just replace with reg ex like this:
'123k'.replace(/k/g,'')
Will return just '123'.
https://developers.google.com/apps-script/reference/url-fetch/
https://developers.google.com/apps-script/reference/url-fetch/http-response

Obviously, you are not getting anything because the regexp is wrong. I'm no regexp expert but I was able to extract the number using basic string manipulation
var page = UrlFetchApp.fetch("http://services.runescape.com/m=itemdb_rs/Armadyl_chaps/viewitem.ws?obj=19463").getContentText();
var TD = "<td>";
var start = page.indexOf('Current guide price');
start = page.indexOf(TD, start);
var end = page.indexOf('</td>',start);
var number = page.substring (start + TD.length , end);
Logger.log(number);
Then, I wrote a function to convert k,m etc. to the corresponding multiplying factors.
function getMultiplyingFactor(symbol){
switch(symbol){
case 'k':
case 'K':
return 1000;
case 'm':
case 'M':
return 1000 * 1000;
case 'g':
case 'G':
return 1000 * 1000 * 1000;
default:
return 1;
}
}
Finally, tie the two together
function pullRuneScape() {
var page = UrlFetchApp.fetch("http://services.runescape.com/m=itemdb_rs/Armadyl_chaps/viewitem.ws?obj=19463").getContentText();
var TD = "<td>";
var start = page.indexOf('Current guide price');
start = page.indexOf(TD, start);
var end = page.indexOf('</td>',start);
var number = page.substring (start + TD.length , end);
Logger.log(number);
var numericPart = number.substring(0, number.length -1);
var multiplierSymbol = number.substring(number.length -1 , number.length);
var multiplier = getMultiplyingFactor(multiplierSymbol);
var fullNumber = multiplier == 1 ? number : numericPart * multiplier;
Logger.log(fullNumber);
}
Certainly, not the optimal way of doing things but it works.

Basically I parse the html page as you did (with corrected regex) and split the string into number part and multiplicator (k = 1000). Finally I return the extracted number. This function can be used in Google Docs.
function pullRuneScape() {
var pageContent = UrlFetchApp.fetch("http://services.runescape.com/m=itemdb_rs/Armadyl_chaps/viewitem.ws?obj=19463").getContentText();
var matched = pageContent.match(/Current guide price:<.th>\n<td>(\d+\.*\d*)([k]{0,1})/);
var numberAsString = matched[1];
var multiplier = "";
if (matched.length == 3) {
multiplier = matched[2];
}
number = convertNumber(numberAsString, multiplier);
return number;
}
function convertNumber(numberAsString, multiplier) {
var number = Number(numberAsString);
if (multiplier == 'k') {
number *= 1000;
}
return number;
}

Related

Get Last Column in Visible Views Index - Excel - Office-JS

I'm trying to filter the last column on a worksheet but I can't seem to get the Index of the column. To be clear, I need the index relative to the worksheet, no the range. I used VisibleView to find the Column, but there may be hidden rows, so my plan is to then load that column via getRangeByIndexes but I need the relative columnIndex to the worksheet.
I've tried a bunch of variations of the below, but I either get Object doesn't support 'getColumn' or columnIndex is undefined
Note: In the below example I've hardcoded 7 as that will be the last column relative to the VisibleView (Columns and rows are already hidden), but I'd like this to by dynamic for other functions and just returnthe "last visible column index".
var ws = context.workbook.worksheets.getActiveWorksheet()
var visible_rng = ws.getUsedRange(true).getVisibleView()
visible_rng.load(["columnCount", "columnIndex"])
await context.sync();
console.log('visible_rng.columnIndex')
console.log(visible_rng.getCell(0,7).columnIndex)
console.log(visible_rng.getColumn(7).columnIndex)

Well this method seems a bit hacky, please share if you know a better way! But, first thing I found was that getVisibleView only metions rows in the Description.
Represents the visible rows of the current range.
I decided to try getSpecialCells and was able to load the address property. I then had to use split and get the last column LETTER and convert this to the Index.
I also wanted the columnCount but this wasn't working w/ getSpecialCells so I polled that from getVisibleView and return an Object relating to Visible Views that I can build on the function later if I need more details.
Here it is:
async function Get_Visible_View_Details_Obj(context, ws) {
var visible_rng = ws.getUsedRange(true).getSpecialCells("Visible");
visible_rng.load("address")
var visible_view_rng = ws.getUsedRange(true).getVisibleView()
visible_view_rng.load("columnCount")
await context.sync();
var Filter_Col_Index = visible_rng.address
var Filter_Col_Index = Filter_Col_Index.split(",")
var Filter_Col_Index = Filter_Col_Index[Filter_Col_Index.length - 1]
var Filter_Col_Index = Filter_Col_Index.split("!")[1]
if (Filter_Col_Index.includes(":") == true) {
var Filter_Col_Index = Filter_Col_Index.split(":")[1]
}
var Filter_Col_Index = Get_Alpha_FromString(Filter_Col_Index)
var Filter_Col_Index = Get_Col_Index_From_Letters(Filter_Col_Index)
var Filter_Col_Index_Obj = {
"last_col_ws_index": Filter_Col_Index,
"columnCount": visible_view_rng.columnCount,
}
return Filter_Col_Index_Obj
}
Helper Funcs:
function Get_Alpha_FromString(str) {
return str.replace(/[^a-z]/gi, '');
}
function Get_Col_Index_From_Letters(str) {
str = str.toUpperCase();
let out = 0, len = str.length;
for (pos = 0; pos < len; pos++) {
out += (str.charCodeAt(pos) - 64) * Math.pow(26, len - pos - 1);
}
return out - 1;
}

Insert commas into string number

I have this code, and I want to put commas.
I've seen many examples, but I dont now were put the code.
This is my AS3 code:
calccn.addEventListener(MouseEvent.CLICK,result1);
function result1(e:MouseEvent)
{
var vev: Number = (Number(vev.text));
var cn1: Number = (Number(3/100));
var result1f: Number = (Number(vev*cn1));
var round;
round=result1f.toFixed(0);
v3.text = String(round);
}
Example
If the result give me 1528000,32
I want that the result is 1.528.000 or 1 528 000

I didn't know (live and learn, eh), but there's actually a dedicated class: NumberFormatter.
If you want to do the thing on a regular basis, you might want a method to call:
// Implementation.
import flash.globalization.NumberFormatter;
// The default separator is comma.
function formattedNumber(value:Number, separator:String = ","):String
{
var NF:NumberFormatter;
NF = new NumberFormatter("en_US");
// Enforce the use of the given separator.
NF.groupingSeparator = separator;
// Ignore the fraction part.
NF.fractionalDigits = 0;
return NF.formatNumber(value);
}
// Usage.
//Format the given number with spaces for separator.
trace(formattedNumber(1528000.32, " ")); // 1 528 000
//Format the given number with the default separator.
trace(formattedNumber(1528000.32)); // 1,528,000
But if you want just a simple one-timer, and don't really care if it is commas or spaces as long as they present you may just condense in into a single expression:
calccn.addEventListener(MouseEvent.CLICK, result1);
function result1(e:MouseEvent):void
{
// Declaring variables with the same names as
// other entities is a generally bad idea.
var input:Number = Number(vev.text);
var cn1:Number = 3.0 / 100.0;
// Keep in mind that I used int() here as
// a simple tool to remove the fraction part.
v3.text = (new NumberFormatter("en_US")).formatNumber(int(input * cn1));
}

Node - Test if string contain element of array replace him by random element of same array (synonym)

I want to change each word that matches the synonym list randomly by another synonym or itself (to randomly keep this keyword).
I test if a string (input) contains one element of an array (words). If it's true, I want to randomly replace this with the element of this same list.
var input = "This is an amazing text blob where this word amazing is replaced by a random word from list_of_words. Isn't this amazing!";
words_synonym = ["amazing", "formidable", "great", "smart"];
// first condition --> true if "input" contain one element of "words_synonym"
input = input.toLowerCase();
console.log(words_synonym.some(word => input.includes(word)));
after, I want to replace the "element" that validated the condition with a random element of the same array (words_synonym).
But I can't select this element. I have just true or false
var random_word = words_synonym[Math.floor(Math.random() * (words_synonym.length))]
input = input.replace(element, random_word, 0)
thanks

The way you have it right now, you're checking if any of the synonyms match any of the words (via words_synonym.some(word => input.includes(word))). In order to do what you want, you'll need both the position of the target word and the new word, neither of which you have now. To do this, you'll want to break apart your nested loops.
The code words_synonym.some(word => input.includes(word)) is equivalent to:
let has_synonym = false;
for (word of words_synonym) { // this is a loop
if (input.includes(word)) { // this is also a loop
has_synonym = true;
break;
}
}
console.log(has_synonym);
So to fix your main issue, just replace includes with indexOf.
To handle the case of replacing all of the tokens, I would suggest keeping track of the token you have replaced outside of the loop, otherwise you end up replacing each token many times which may become very expensive. To do this, just keep track of your starting position outside of the loop and increment it with the end index of the replacement word. indexOf already takes a start argument for exactly this use case!
const input = "This is an amazing text blob where this word amazing is replaced by a random word from list_of_words. Isn't this amazing!";
const words_synonym = ["amazing", "formidable", "great", "smart"];
let output = input;
let start = 0; // index of the end of the last replaced token
for (word of words_synonym) {
let index = output.indexOf(word, start);
while (index >= 0) {
const new_word = words_synonym[Math.floor(Math.random() * (words_synonym.length))];
output = output.substr(0, index) + new_word + output.substr(index + word.length, output.length);
start = index + new_word.length + 1; // increment the start
index = output.indexOf(word, start);
}
}
console.log("input: ", input);
console.log("output: ", output);

You can use method find:
words_synonym.find(word => input.includes(word))
Which returns
The value of the first element in the array that satisfies the
provided testing function. Otherwise, undefined is returned.
from docs:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/find

i have modify answer of dantiston and i have include a loop in order to change all the word match "words_synonym".
But there is a problem. The program don't check all the word of "words_synonym" but only the first with indexof.
var input = "This is an amazing text blob where this word amazing is replaced by a random word from list_of_words. Isn't this amazing!";
words_synonym = ["amazing", "formidable", "great", "smart"];
let output = input;
for (word of words_synonym) {
let index = output.indexOf(word);
if (index >= 0) {
console.log(word);
var indexes = [], i = -1;
while ((i = output.indexOf(word, i+1)) != -1){
index=output.indexOf(word, i);
var new_word = words_synonym[Math.floor(Math.random() * (words_synonym.length))];
output = output.substr(0, index) + new_word + output.substr(index + word.length, output.length);
}
}
}
console.log("input: ", input);
console.log("output: ", output);

AS3 "Advanced" string manipulation

I'm making an air dictionary and I have a(nother) problem. The main app is ready to go and works perfectly but when I tested it I noticed that it could be better. A bit of context: the language (ancient egyptian) I'm translating from does not use punctuation so a phrase canlooklikethis. Add to that the sheer complexity of the glyph system (6000+ glyphs).
Right know my app works like this :
user choose the glyphs composing his/r word.
app transforms those glyphs to alphanumerical values (A1 - D36 - X1A, etc).
the code compares the code (say : A5AD36) to a list of xml values.
if the word is found (A5AD36 = priestess of Bast), the user gets the translation. if not, s/he gets all the possible words corresponding to the two glyphs (A5A & D36).
If the user knows the string is a word, no problem. But if s/he enters a few words, s/he'll have a few more choices than hoped (exemple : query = A1A5AD36 gets A1 - A5A - D36 - A5AD36).
What I would like to do is this:
query = A1A5AD36 //word/phrase to be translated;
varArray = [A1, A5A, D36] //variables containing the value of the glyphs.
Corresponding possible words from the xml : A1, A5A, D36, A5AD36.
Possible phrases: A1 A5A D36 / A1 A5AD36 / A1A5A D36 / A1A5AD36.
Possible phrases with only legal words: A1 A5A D36 / A1 A5AD36.
I'm not I really clear but to things simple, I'd like to get all the possible phrases containing only legal words and filter out the other ones.
(example with english : TOBREAKFAST. Legal = to break fast / to breakfast. Illegal = tobreak fast.
I've managed to get all the possible words, but not the rest. Right now, when I run my app, I have an array containing A1 - A5A - D36 - A5AD36. But I'm stuck going forward.
Does anyone have an idea ? Thank you :)
function fnSearch(e: Event): void {
var val: int = sp.length; //sp is an array filled with variables containing the code for each used glyph.
for (var i: int = 0; i < val; i++) { //repeat for every glyph use.
var X: String = ""; //variable created to compare with xml dictionary
for (var i2: int = 0; i2 < val; i2++) { // if it's the first time, use the first glyph-code, else the one after last used.
if (X == "") {
X = sp[i];
} else {
X = X + sp[i2 + i];
}
xmlresult = myXML.mot.cd; //xmlresult = alphanumerical codes corresponding to words from XMLList already imported
trad = myXML.mot.td; //same with traductions.
for (var i3: int = 0; i3 < xmlresult.length(); i3++) { //check if element X is in dictionary
var codeElement: XML = xmlresult[i3]; //variable to compare with X
var tradElement: XML = trad[i3]; //variable corresponding to codeElement
if (X == codeElement.toString()) { //if codeElement[i3] is legal, add it to array of legal words.
checkArray.push(codeElement); //checkArray is an array filled with legal words.
}
}
}
}
var iT2: int = 500 //iT2 set to unreachable value for next lines.
for (var iT: int = 0; iT < checkArray.length; iT++) { //check if the word searched by user is in the results.
if (checkArray[iT] == query) {
iT2 = iT
}
}
if (iT2 != 500) { //if complete query is found, put it on top of the array so it appears on top of the results.
var oldFirst: String = checkArray[0];
checkArray[0] = checkArray[iT2];
checkArray[iT2] = oldFirst;
}
results.visible = true; //make result list visible
loadingResults.visible = false; //loading screen
fnPossibleResults(null); //update result list.
}
I end up with an array of variables containing the glyph-codes (sp) and another with all the possible legal words (checkArray). What I don't know how to do is mix those two to make legal phrases that way :
If there was only three glyphs, I could probably find a way, but user can enter 60 glyphs max.

Is it possible to do a Levenshtein distance in Excel without having to resort to Macros?

Let me explain.
I have to do some fuzzy matching for a company, so ATM I use a levenshtein distance calculator, and then calculate the percentage of similarity between the two terms. If the terms are more than 80% similar, Fuzzymatch returns "TRUE".
My problem is that I'm on an internship, and leaving soon. The people who will continue doing this do not know how to use excel with macros, and want me to implement what I did as best I can.
So my question is : however inefficient the function may be, is there ANY way to make a standard function in Excel that will calculate what I did before, without resorting to macros ?
Thanks.

If you came about this googling something like
levenshtein distance google sheets
I threw this together, with the code comment from milot-midia on this gist (https://gist.github.com/andrei-m/982927 - code under MIT license)
From Sheets in the header menu, Tools -> Script Editor
Name the project
The name of the function (not the project) will let you use the func
Paste the following code
function Levenshtein(a, b) {
if(a.length == 0) return b.length;
if(b.length == 0) return a.length;
// swap to save some memory O(min(a,b)) instead of O(a)
if(a.length > b.length) {
var tmp = a;
a = b;
b = tmp;
}
var row = [];
// init the row
for(var i = 0; i <= a.length; i++){
row[i] = i;
}
// fill in the rest
for(var i = 1; i <= b.length; i++){
var prev = i;
for(var j = 1; j <= a.length; j++){
var val;
if(b.charAt(i-1) == a.charAt(j-1)){
val = row[j-1]; // match
} else {
val = Math.min(row[j-1] + 1, // substitution
prev + 1, // insertion
row[j] + 1); // deletion
}
row[j - 1] = prev;
prev = val;
}
row[a.length] = prev;
}
return row[a.length];
}
You should be able to run it from a spreadsheet with
=Levenshtein(cell_1,cell_2)

While it can't be done in a single formula for any reasonably-sized strings, you can use formulas alone to compute the Levenshtein Distance between strings using a worksheet.
Here is an example that can handle strings up to 15 characters, it could be easily expanded for more:
https://docs.google.com/spreadsheet/ccc?key=0AkZy12yffb5YdFNybkNJaE5hTG9VYkNpdW5ZOWowSFE&usp=sharing
This isn't practical for anything other than ad-hoc comparisons, but it does do a decent job of showing how the algorithm works.

looking at the previous answers to calculating Levenshtein distance, I think it would be impossible to create it as a formula.
Take a look at the code here

Actually, I think I just found a workaround. I was adding it in the wrong part of the code...
Adding this line
} else if(b.charAt(i-1)==a.charAt(j) && b.charAt(i)==a.charAt(j-1)){
val = row[j-1]-0.33; //transposition
so it now reads
if(b.charAt(i-1) == a.charAt(j-1)){
val = row[j-1]; // match
} else if(b.charAt(i-1)==a.charAt(j) && b.charAt(i)==a.charAt(j-1)){
val = row[j-1]-0.33; //transposition
} else {
val = Math.min(row[j-1] + 1, // substitution
prev + 1, // insertion
row[j] + 1); // deletion
}
Seems to fix the problem. Now 'biulding' is 92% accurate and 'bilding' is 88%. (whereas with the original formula 'biulding' was only 75%... despite being closer to the correct spelling of building)

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Grabbing text from webpage and storing as variable - text

Related

Get Last Column in Visible Views Index - Excel - Office-JS

Insert commas into string number

Node - Test if string contain element of array replace him by random element of same array (synonym)

AS3 "Advanced" string manipulation

Is it possible to do a Levenshtein distance in Excel without having to resort to Macros?

Categories

Resources