What is the OpenAI API warning: To avoid an invalid_request_error, best_of was set to equal n. What is "best of"?

This "best_of" warning appears when using the OpenAI API on a PC running Windows 10.
The context:
I am using the OpenAI API in JupyterLab with the IR kernel, with only the rgpt3 library installed in this notebook.
The API successfully performs a test code completion. It does not matter whether a single request or multiple requests are made; both return the same warning.
The following results when using 3 queries:
[1] "Request: 1/3" To avoid an invalid_request_error, best_of was set to equal n
[1] "Request: 2/3" To avoid an invalid_request_error, best_of was set to equal n
[1] "Request: 3/3" To avoid an invalid_request_error, best_of was set to equal n
After multiple unsuccessful web searches, including a search on Stack Overflow, I found almost no information about this warning anywhere. It's probably too early in the process, because the OpenAI API is relatively new to most people.
Therefore, I decided to post both the question and the answer regarding this warning, because otherwise finding such information is very difficult and time consuming. And for those users who are boldly going where few have gone before, errors and warning messages do not inspire confidence.

What the following warning message is all about:
To avoid an invalid_request_error, best_of was set to equal n
The Best Practices guide on the OpenAI website is the source that describes what "best_of" means. This information is currently available at the following address:
https://beta.openai.com/docs/guides/production-best-practices/improving-latencies
In a nutshell, "best_of" is one of the parameters used to define what we want from the OpenAI website when using the API. Using the OpenAI API involves "tokens" - which is something akin to the metering of a user's usage and rate limits at the OpenAI website. In addition, there are also limitations for most of the models at OpenAI based on the context length - with most models having 2048 max context size.
The Best Practices guide at the OpenAI website suggests the following:
Generate fewer completions: lower the values of n and best_of when
possible where n refers to how many completions to generate for each
prompt and best_of is used to represent the result with the highest
log probability per token.
If n and best_of both equal 1 (which is the default), the number of
generated tokens will be at most, equal to max_tokens.
If n (the number of completions returned) or best_of (the number of
completions generated for consideration) are set to > 1, each request
will create multiple outputs. Here, you can consider the number of
generated tokens as [ max_tokens * max (n, best_of) ]
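The arithmetic in the quoted guidance can be sketched in a few lines. This is only an illustration of the formula max_tokens * max(n, best_of) in Python; the function name is made up for this example and is not part of the OpenAI API or of rgpt3.

```python
def max_generated_tokens(max_tokens: int, n: int = 1, best_of: int = 1) -> int:
    """Upper bound on generated tokens for one request, per the quoted guide.

    Hypothetical helper for illustration only.
    """
    return max_tokens * max(n, best_of)


# Defaults (n = best_of = 1): at most max_tokens are generated.
print(max_generated_tokens(100))                  # 100
# Three completions returned, three generated for consideration.
print(max_generated_tokens(100, n=3, best_of=3))  # 300
# Two returned, but five generated for consideration.
print(max_generated_tokens(100, n=2, best_of=5))  # 500
```

So raising either n or best_of multiplies the worst-case token usage, which is why the guide suggests keeping both low.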
The function used in the Jupyter notebook to send requests to the OpenAI website is an R wrapper that sends each request with a range of parameters, including the parameter called best_of. The best_of parameter in the function already defaults to 1 and is only changed manually. The parameter, copied from the function, follows:
best_of = 1
Therefore, it can only be presumed that the OpenAI website auto-generates the "best_of" warning for each prompt in every API request as a friendly reminder. This warning message can be programmatically ignored and removed if so desired.

How to parallelize execution of a custom function formula while keeping the Google Sheet shareable and permissionless?

I have a Google Sheet with a custom function formula that: takes in a matrix and two vectors from a spreadsheet, does some lengthy matrix-vector calculations (>30 sec, so above the quota), before outputting the result as a bunch of rows. It is single-threaded, since that's what Google Apps Script (GAS) natively is, but I want to parallelize the calculation using a multi-threading workaround, so it speeds it up drastically.
Requirements (1-3):
1. UX: It should run the calculations automatically and reactively as a custom function formula, which means the user doesn't have to start it manually by clicking a run button or similar, just as my single-threaded version currently does.
2. Parallelizable: It should ideally spawn ~30 threads/processes, so that instead of taking >30 seconds as it now does (which makes it time out due to Google's quota limit), it should take ~1 second. (I know GAS is single-threaded, but there are workarounds, referenced below).
3. Shareability: I should ideally be able to share the Sheet with other people, so they can "Make a copy" of it, and the script will still run the calculations for them:
3.1 Permissionless: Without me having to manually hand out individual permissions to users (permissionless). For instance whenever someone "Makes a copy" and "Execute the app as user accessing the web app". My rudimentary testing suggests that this is possible.
3.2 Non-intrusive: Without users of the spreadsheet having to give intrusive authorizations like "Give this spreadsheet/script/app access to your entire Google Drive or Gmail account?". Users having to give a non-intrusive authorization to a script/webapp can be acceptable, as long as requirement 3.1 is still maintained.
3.3 UX: Without forcing users to view an HTML sidebar in the spreadsheet.
I have already read this excellent related answer by @TheMaster which outlines some potential ways of solving parallelization in Google Apps Script in general. Workaround #3 google.script.run and workaround #4 UrlFetchApp.fetchAll (both using a Google Web App) look most promising. But some details are unknown to me, such as whether they can adhere to requirements 1 and 3 with their sub-requirements.
I can conceive of another potential naïve workaround, which would be to split the function up into several custom function formulas and do the parallelization (by some kind of Map/Reduce) inside the spreadsheet itself (storing intermediary results back into the spreadsheet, and having custom function formulas work on that as reducers). But that's undesired, and probably infeasible, in my case.
I'm very confident my function is parallelizable using some kind of Map/Reduce process. The function is currently optimized by doing all the calculations in-memory, without touching the spreadsheet in-between steps, before finally outputting the result to the spreadsheet. Its details are quite intricate and well over 100 lines, so I don't want to overload you with more (and potentially confusing) information which doesn't really affect the general applicability of this case. For the context of this question you may assume that my function is parallelizable (and map-reduce'able), or consider any function you already know that would be. What's interesting is what's generally possible to achieve with parallelization in Google Apps Script, while also maintaining the highest level of shareability and UX. I'll update this question with more details if needed.
Update 2020-06-19:
To be more clear, I do not rule out Google Web App workarounds entirely, as I haven't got experience with their practical limitations to know for sure if they can solve the problem within the requirements. I have updated the sub-requirements 3.1 and 3.2 to reflect this. I also added sub-req 3.3, to be clearer on the intent. I also removed req 4, since it was largely overlapping with req 1.
I also edited the question and removed the related sub-questions, so it is more focused on the single main HOWTO-question in the title. The requirements in my question should provide a clear objective standard for which answers would be considered best.
I realise the question might entail a search for the Holy Grail of Google Sheet multithreading workarounds, as @TheMaster has pointed out in private. Ideally, Google would provide one or more features to support multithreading, map-reduce, or more permissionless sharing. But until then I would really like to know what the optimal workaround is within the current constraints. I would hope this question is relevant to others as well, even considering the tight requirements.

If you publish a web app with access set to "anyone, even anonymous" and execute as "Me", then the custom function can use UrlFetchApp.fetchAll (authorization not needed) to post to that web app. This will run in parallel. This solves all three requirements.
The caveat here is: if multiple people use the sheet, and the custom function has to post to the "same" web app (the one you published to execute as you) for processing, Google will limit simultaneous executions (quota limit: 30).
To work around this, you can ask people using your sheet to publish their own web apps. They'll have to do this once at the beginning, and no authorization is needed.
If not, you'll need to host a custom server for the load, or something like google-cloud-functions might help.

I ended up using the naïve workaround that I mentioned in my post:
I can conceive of another potential naïve workaround which would be
to split the function up into several custom functions formulas and do
the parallelization (by some kind of Map/Reduce) inside the
spreadsheet itself (storing intermediary results back into the
spreadsheet, and having custom function formulas work on that as
reducers). But that's undesired, and probably unfeasible, in my case.
I initially disregarded it because it involves having an extra sheet tab with calculations, which was not ideal. But when I reflected on it after investigating alternative solutions, I found that it actually solves all the stated requirements in the most non-intrusive manner, since it doesn't require anything extra from the users the spreadsheet is shared with. It also stays 'within' Google Sheets as far as possible (no semi- or fully external Web App needed), doing the parallelization by relying on the native parallelization of concurrently executing spreadsheet cells, where results can be chained, and appear to the user like regular formulas (no extra menu item or run-this-script buttons necessary).
So I implemented MapReduce in Google Sheets using custom functions each operating on a slice of the interval I wanted to calculate. The reason I was able to do that, in my case, was that the input to my calculation was divisible into intervals that could each be calculated separately, and then joined later.**
Each parallel custom function then takes in one interval, calculates that, and outputs the results back to the sheet (I recommend outputting as rows instead of columns, since columns are capped at 18 278 columns max. See this excellent post on Google Spreadsheet limitations.) I did run into the limit of only 40,000 new rows at a time, but was able to perform some reducing on each interval, so that each one only outputs a very limited number of rows to the spreadsheet. That was the parallelization; the Map part of MapReduce. Then I had a separate custom function which did the Reduce part, namely: dynamically target*** the spreadsheet output area of the separately calculated custom functions, take in their results once available, and join them together while further reducing them (to find the best performing results), to return the final result.
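The Map and Reduce steps described above can be sketched outside of Google Sheets. The following is a minimal Python illustration of the idea, not the Apps Script implementation: a thread pool stands in for the concurrently evaluating cells, and `score` is a hypothetical placeholder for the real per-combination calculation.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_intervals(total: int, parts: int) -> list[tuple[int, int]]:
    """Split the inclusive range 1..total into up to `parts` contiguous intervals."""
    base, extra = divmod(total, parts)
    intervals, start = [], 1
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more parts than items: skip empty intervals
        intervals.append((start, start + size - 1))
        start += size
    return intervals


def score(combination: int) -> int:
    """Hypothetical stand-in for the real calculation on one combination."""
    return (combination * 37) % 101


def map_interval(interval: tuple[int, int]) -> tuple[int, int]:
    """Map step: evaluate one interval and pre-reduce it to its best result."""
    lo, hi = interval
    best = max(range(lo, hi + 1), key=score)
    return best, score(best)


def reduce_results(partials: list[tuple[int, int]]) -> tuple[int, int]:
    """Reduce step: join the per-interval results into one final result."""
    return max(partials, key=lambda pair: pair[1])


def run(n_bits: int, parts: int) -> tuple[int, int]:
    intervals = split_into_intervals(2 ** n_bits, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(map_interval, intervals))
    return reduce_results(partials)


print(split_into_intervals(256, 4))
print(run(8, 4))
```

Because each Map call already reduces its interval to a single best result, only a handful of values flow into the Reduce step, which mirrors how the custom functions keep their output to the sheet small.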
The interesting part was that I thought I would hit the quota limit of only 30 simultaneous executions in Google Sheets. But I was able to parallelize up to 64 independently and seemingly concurrently executing custom functions. It may be that Google puts these into a queue if they exceed 30 concurrent executions, and only actually processes 30 of them at any given time (please comment if you know). But anyhow, the parallelization benefit/speedup was huge, and seemingly nearly infinitely scalable. But with some caveats:
You have to define the number of parallelised custom functions up front, manually. So the parallelization doesn't infinitely auto-scale according to demand****. This is important because of the counter-intuitive result that in some cases using less parallelization actually executes faster. In my case, the result set from a very small interval could be exceedingly large, while if the interval had been larger then a lot of the results would have been ruled out underway in the algorithm in that parallelised custom function (i.e. the Map also did some reduction).
In rare cases (with huge inputs), the Reducer function will output a result before all of the parallel (Map) functions have completed (since some of them seemingly take too long). So you seemingly have a complete result set, but then a few seconds later it will re-update when the last parallel function returns its result. This is not ideal, so to be notified of this I implemented a function to tell me if the result was valid. I put it in the cell above the Reduce function (and colored the text red). B6 is the number of intervals (here 4), and the other cell references go to the cell with the custom function for each interval:
=didAnyExecutedIntervalFail($B$6,S13,AB13,AK13,AT13)
function didAnyExecutedIntervalFail(intervalsExecuted, ...intervalOutputs) {
  const errorValues = new Set(["#NULL!", "#DIV/0!", "#VALUE!", "#REF!", "#NAME?", "#NUM!", "#N/A", "#ERROR!", "#"]);
  // We go through only the outputs for intervals which were included in the parallel execution.
  for (let i = 0; i < intervalsExecuted; i++) {
    if (errorValues.has(intervalOutputs[i]))
      return "Result below is not valid (due to errors in one or more of the intervals), even though it looks like a proper result!";
  }
}
The parallel custom functions are limited by the Google quota of max 30 seconds of execution time for any custom function. So if they take too long to calculate, they still might time out (causing the issue mentioned in the previous point). The way to alleviate this timeout is to parallelise more, dividing into more intervals, so that each parallel custom function runs for less than 30 seconds.
The output of it all is limited by Google Sheets limitations, specifically the maximum of 5M cells in a spreadsheet. So you may need to perform some reduction on the size of the results calculated in each parallel custom function before returning its result to the spreadsheet, so that each stays below 40 000 rows (otherwise you'll receive the dreaded "Results too large" error). Furthermore, depending on the size of the result of each parallel custom function, it would also limit how many custom functions you could have at the same time, as they and their result cells take space in the spreadsheet. But if each of them takes, say, 50 cells in total (including a very small output), then you could still parallelize quite a lot (5M / 50 = 100 000 parallel functions) within a single sheet. But you also need some space for whatever you want to do with those results. And the 5M cells limit apparently applies to the whole spreadsheet in total, not just to one of its sheet tabs.
** For those interested: I basically wanted to calculate all combinations of a sequence of bits (by brute force), so the number of combinations was 2^n, where n was the number of bits. The initial range of combinations was from 1 to 2^n, so it could be divided into intervals of combinations. For example, if dividing into two intervals, there would be one from 1 to X and then one from X+1 to 2^n.
*** For those interested: I used a separate sheet formula to dynamically determine the range for the output of one of the intervals, based on the presence of rows with content. It was in a separate cell for each interval. For the first interval it was in cell S11 and the formula looked like this:
=ADDRESS(ROW(S13),COLUMN(S13),4)&":"&ADDRESS(COUNTA(S13:S)+ROWS(S1:S12),COLUMN(Z13),4) and it would output S13:Z15, which is the dynamically calculated output range. It only counts those rows with content (using COUNTA(S13:S)), thus avoiding a statically determined range. With a normal static range the size of the output would have to be known in advance, which it wasn't; otherwise the range would possibly either not include all of the output, or include a lot of empty rows (and you don't want the Reducer to iterate over a lot of essentially empty data structures). Then I would input that range into the Reduce function by using INDIRECT(S$11). So that's how you get the results, from one of the intervals processed by a parallelized custom function, into the main Reducer function.
**** Though you could make it auto-scale up to some pre-defined amount of parallelised custom functions. You could use some preconfigured thresholds, and divide into, say, 16 intervals in some cases, but in other cases automatically divide into 64 intervals (preconfigured, based on experience). You'd then just stop / short-circuit the custom functions which shouldn't participate, based on if the number of that parallelised custom function exceeds the number of intervals you want to divide into and process. On the first line in the parallelised custom function: if (calcIntervalNr > intervals) return;. Though you would have to set up all the parallel custom functions in advance, which can be tedious (remember you have to account for the output area of each, and are limited by the max cell limit of 5M cells in Google Sheets).

Virtual Assistant -> LUIS, QnA, Dispatcher best practice

I have some questions about "best practices" for certain issues that we are facing using LUIS and QnA Maker, in particular for the Dispatcher:
1) Is there any best practice in case we have more than 15k utterances in the Dispatcher? That looks like a limitation of LUIS apps, and it makes the scalability of the model in the long run questionable.
2) Bing Spell Check for LUIS changes names and surnames, for example. How can we avoid this? I guess that Bing Spell Check is necessary when we are talking about chatbots, since typos are always around the corner, but using it for names is dangerous.
3) Cross validation is not supported out of the box. You would have to split your data into folds with custom code (not difficult), use the command line to train and publish your model on your k-1/k folds, then send the k-fold utterances to the API one by one. Batch upload is only supported through the UI https://cognitive.uservoice.com/forums/551524-language-understanding-luis/suggestions/20082157-add-api-to-batch-test-model and is limited to a test set of 1,000 utterances. If we use the one-by-one approach, we pay $1.50 per 1k transactions https://azure.microsoft.com/de-de/pricing/details/cognitive-services/language-understanding-intelligent-services/ and this means that to get cross-validation metrics for 5 folds, for example, we could be paying about $20 for a single experiment with our current data, more if we add more data.
4) Model is a black box, which doesn't give us the ability to use custom features if needed.
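The "split your data into folds with custom code" step from point 3 could look roughly like the sketch below. It is plain Python with no LUIS calls; the function names are made up for this example, and the price is the $1.50 per 1k transactions figure quoted above. Since every utterance lands in exactly one test fold, one full cross-validation run sends each utterance to the prediction API once.

```python
import random


def k_fold_splits(utterances, k=5, seed=0):
    """Yield (train, test) pairs so that every utterance is tested exactly once."""
    items = list(utterances)
    random.Random(seed).shuffle(items)          # fixed seed for repeatable folds
    folds = [items[i::k] for i in range(k)]     # k roughly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [u for j, fold in enumerate(folds) if j != i for u in fold]
        yield train, test


def prediction_cost(n_utterances, price_per_1k=1.50):
    """Cost of sending every utterance to the prediction API once."""
    return n_utterances / 1000 * price_per_1k


data = [f"utterance {i}" for i in range(10)]
for train, test in k_fold_splits(data, k=5):
    print(len(train), len(test))

print(prediction_cost(13000))
```

Each (train, test) pair would then be trained and published on the train part and queried one by one with the test part, as described in the question.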

I will try to address your concerns in the best possible way I can as follows:
1) As per the LUIS documentation, the training set of a LUIS app is limited to 15,000 utterances. Hence, you cannot exceed the limit. In the case of Dispatch apps, if the total number of utterances exceeds 15k, then Dispatch will down-sample the utterances to keep it under 15k. There is an optional CLI parameter (--doAutoActiveLearning) for auto active learning, which will down-sample intelligently (remove non-relevant utterances).
--doAutoActiveLearning: (optional) Default to false. LUIS limit on training-set size is 15000. When a LUIS app has much more utterances for training, Dispatch's auto active learning process can intelligently down sample the utterances.
2) Bing Spell Check helps users to correct misspelled words in utterances before LUIS predicts the score and entities of the utterance. However, if you want to avoid using the Bing Spell Check API service, then you will need to add the correct and incorrect spellings yourself, which can be done in two ways:
Label example utterances that have all the different spellings so that LUIS can learn proper spelling as well as typos. This option requires more labeling effort than using a spell checker.
Create a phrase list with all variations of the word. With this solution, you do not need to label the word variations in the example utterances.
3) As per the current documentation, a maximum of 1,000 utterances is allowed per test. The data set is a JSON-formatted file containing a maximum of 1,000 labeled non-duplicate utterances. You can test up to 10 data sets in an app. If you need to test more, delete a data set and then add a new one. I would suggest reporting this as a feature request in the feedback forum.
Hope this helps.

Getting FAGLL03H report using pyrfc

This is a mixed question between SAP and the usage of the pyrfc module. I need to use the FAGLL03H transaction code (tcode) to replicate a G/L report into a database on a daily basis. Now, the thing is that FAGLL03H is not a table per se, but a G/L Account Line Item Browser (G/L View), so I need to access that Tcode and pass a series of parameters in order to get the information we need.
1. How can I use the RFC protocol to access that tcode and generate a report?
2. Is it possible to do (1) through pyrfc?
This is the code I use to consult tables:
import pyrfc
from pprint import PrettyPrinter

conn = pyrfc.Connection(ashost=...)
options = [{'TEXT': "FCURR = 'USD'"}]
pp = PrettyPrinter(indent=4)
ROWS_AT_A_TIME = 10
rowskips = 0
while True:
    print(u"----Begin of Batch---")
    result = conn.call('RFC_READ_TABLE',
                       QUERY_TABLE='TCURR',
                       OPTIONS=options,
                       ROWSKIPS=rowskips, ROWCOUNT=ROWS_AT_A_TIME)
    pp.pprint(result['DATA'])
    rowskips += ROWS_AT_A_TIME
    if len(result['DATA']) < ROWS_AT_A_TIME:
        break

1. No way
2. No
The main point you need to understand is the difference between an SAP transaction (tcode) and an SAP RFC. The difference is huge and makes it impossible to use them in a similar manner. You are trying to call the FAGLL03H report like a table via RFC_READ_TABLE, but it is not a table; it is much more, it is a transaction.
An SAP tcode is nothing more than a shortcut in SAP that points to some program, usually a GUI program, which can contain hundreds of modules, including RFC-enabled ones. Some of these modules are internal and have no RFC equivalent, so it is impossible to call them remotely. Last but not least, you would need to know how to call them (in what order) and which parameters to pass.
SAP RFC is like a container for ABAP code (but also a protocol for calling this code) which implements some functionality, either a small piece like converting characters' case or converting units of measure, or a huge one, for example posting financial documents and creating enterprise hierarchy objects like work centers, cost centers, sales organizations, etc. RFC modules can be likened to Python modules or Java methods. They are usually implemented for one single task, and are usually used not standalone but in combination with other methods.
The above-mentioned transaction is huge, is intended for the output of G/L account line items, and cannot be called via PyRFC. PyRFC is limited to calling the RFC-enabled modules that FAGLL03H consists of.
The only thing you can do here is to find equivalent function modules which return the same items as FAGLL03H. Possible candidates:
BAPI_GLX_GETDOCITEMS
FAGL_GET_OPEN_ITEMS_GL
FAGL_GET_OPEN_ITEMS_KU
FAGL_GET_OPEN_ITEMS_LI
FAGL_GET_OPEN_ITEMS
FKK_GL_LINE_ITEMS_SELECT
BAPI_AP_ACC_GETBALANCEDITEMS
BAPI_AR_ACC_GETBALANCEDITEMS
BAPI_AP_ACC_GETOPENITEMS
BAPI_AR_ACC_GETOPENITEMS
You should try each one and check whether its output is identical to that of your tcode. Only then can you use PyRFC to call them.
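Before calling any of the candidates, it can help to see which ones exist on the system and what parameters they expect. The sketch below assumes that the pyrfc connection object offers get_function_description(name) returning an object whose parameters attribute is a list of dicts with a 'name' key; treat that as an assumption to verify against your pyrfc version. The helper name is made up for this example, and the broad except is a simplification.

```python
def describe_candidates(conn, candidates):
    """Map each function module name to its parameter names, or None if lookup fails.

    `conn` is expected to be an open pyrfc connection (assumption: it provides
    get_function_description, as described in the pyrfc documentation).
    """
    found = {}
    for name in candidates:
        try:
            description = conn.get_function_description(name)
        except Exception:
            # Kept deliberately broad in this sketch: the module may be
            # missing or not remotely callable on this system.
            found[name] = None
            continue
        found[name] = [param["name"] for param in description.parameters]
    return found


# Usage, with the same kind of connection as in the question:
# conn = pyrfc.Connection(ashost=...)
# print(describe_candidates(conn, ["BAPI_GLX_GETDOCITEMS", "FAGL_GET_OPEN_ITEMS_GL"]))
```

The parameter lists only tell you how each module can be called; whether its output matches FAGLL03H still has to be checked by comparing results.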

Check this in order to get all the specific Tables:
https://www.recercat.cat/bitstream/handle/2072/5419/PFCLopezRuizAnnex3.pdf?sequence=4
You can then either build from there or create a Report (transaction SQ01) and execute through RSAQ_REMOTE_QUERY_CALL.
Your business requirements should decide your code, not the opposite.

Addressing Reliable Output in Newspaper3k

Current Behavior:
When using the news-aggregator package Newspaper3k, I am unable to produce consistent/reliable output.
System/Environment Setup:
Windows 10
Miniconda3 4.5.12
Python 3.7.1
Newspaper3k 0.2.8
Steps (Code) to Reproduce:
import newspaper
cnn_paper = newspaper.build('http://cnn.com')
print(cnn_paper.size())
Expected Behavior/Output (varies based on current links posted on cnn):
Produce a consistent number of posted links from CNN on consecutive runs.
Actual Behavior/Output
Running the code the first time produces a different number of links than code run immediately after.
1st Run Print output: 94 (as of time of posting this question)
2nd Run Print output: 0
3rd Run Print output: 18
4th Run Print output: 7
Printing the actual links will vary the same way as the above link count print. I have tried using a number of different news sources, and the same unexpected variance results. Do I need to change my User-Agent Header? Is this a detection issue? How do I produce reliable results?
Any help would be much appreciated.
Thanks.

My issue was resolved by a better understanding of the default caching, described under the heading 6.1.3 Article caching in the user documentation.
Apart from my general ignorance, my confusion came from the fact that the Read the Docs documentation listed the caching function as a TODO.
Upon better scrutiny, I discovered:
By default, newspaper caches all previously extracted articles
and eliminates any article which it has already extracted. This feature
exists to prevent duplicate articles and to increase extraction speed.
The return value of cbs_paper.size() changes from 1030 to 2 because
when we first crawled cbs we found 1030 articles. However, on our
second crawl, we eliminate all articles which have already been
crawled. This means 2 new articles have been published since our first
extraction.
You may opt out of this feature with the
memoize_articles parameter.
You may also pass in the lower
level "Config" objects as covered in the advanced section.
>>> import newspaper
>>> cbs_paper = newspaper.build('http://cbs.com', memoize_articles=False)
>>> cbs_paper.size()
1030

Node Js: Not getting the details using dataSources with datasets

I am trying to get the step count by date. When I fetch the data from Google Fit using this
API:
https://www.googleapis.com/fitness/v1/users/me/dataSources/derived:com.google.step_count.delta:com.google.android.gms:estimated_steps/datasets/1457548200000000000-1457631000000000000&token=1111111111
I get only a limited step count, not all the steps for that date. Why does this kind of problem occur when getting Google Fit data?
Can anyone suggest a better way to get all the data from Google Fit?

Using the derived:com.google.step_count.delta:com.google.android.gms:estimated_steps data source will give you varying results depending on the scenario. This is mainly caused by the sensors used, and may be why your results seem limited.
estimated_steps also takes into account activity, and estimates steps
when there are none. For instance, assume the user walked for 30
minutes, but the hardware step counter only recorded 10 steps. We
know that number is inaccurate so instead we estimate, say 3000 steps
during that time.
This was noted and discussed in this SO post.
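Separately from the choice of data source, the dataset ID in the URL from the question is a pair of nanosecond timestamps, start and end, joined by a hyphen. A small Python sketch for building such an ID for one full calendar day follows. The helper name is made up, and the UTC+05:30 offset is only an assumption, chosen because the start value in the question matches midnight on 2016-03-10 at that offset.

```python
from datetime import date, datetime, time, timedelta, timezone


def day_dataset_id(day: date, tz: timezone) -> str:
    """Build a 'startNanos-endNanos' dataset ID covering one full local day."""
    start = datetime.combine(day, time.min, tzinfo=tz)   # local midnight
    end = start + timedelta(days=1)                      # next local midnight

    def to_nanos(moment: datetime) -> int:
        return int(moment.timestamp()) * 1_000_000_000

    return f"{to_nanos(start)}-{to_nanos(end)}"


# Assumed offset for illustration: UTC+05:30.
local_tz = timezone(timedelta(hours=5, minutes=30))
print(day_dataset_id(date(2016, 3, 10), local_tz))
```

Using a helper like this makes it easy to confirm that the requested range covers exactly the day whose steps are being compared.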
