Force GPT-NEO to generate despite EOS token

I'm trying to use few-shot summarization with GPT-NEO, using '###' as a custom end-of-sequence (EOS) token.
When I generate text, I call the model with these parameters:
model.generate(inputs,
               max_new_tokens = 80,
               eos_token_id = tokenizer.eos_token_id)
The problem is that in some rare cases nothing gets generated at all, because '###' is produced immediately after the prompt.
Is there a way to force the model to ignore the end-of-sequence token if it is the first token generated, so that it never returns an empty result?
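One common way to get this behaviour is to mask the EOS logit until a minimum number of new tokens has been produced. Recent versions of Hugging Face transformers expose, as far as I know, a min_new_tokens argument to generate for this purpose (the older min_length counts prompt tokens as well), so check the documentation of your installed version. The sketch below is library-free and only illustrates the masking logic; suppress_eos and greedy_decode are hypothetical names, and the per-step logits stand in for a real model.

```python
import math

def suppress_eos(logits, eos_token_id, generated_so_far, min_new_tokens=1):
    """Return a copy of logits with EOS disabled until enough tokens exist."""
    masked = list(logits)
    if generated_so_far < min_new_tokens:
        masked[eos_token_id] = -math.inf
    return masked

def greedy_decode(step_logits, eos_token_id, min_new_tokens=1):
    """Greedy decoding over precomputed per-step logits (a stand-in for a model)."""
    output = []
    for logits in step_logits:
        masked = suppress_eos(logits, eos_token_id, len(output), min_new_tokens)
        token = max(range(len(masked)), key=masked.__getitem__)
        if token == eos_token_id:
            break
        output.append(token)
    return output

# Toy vocabulary of 3 tokens; id 2 plays the role of '###'.
# At step 0 the "model" prefers EOS, which would normally give an empty output.
steps = [[0.1, 0.2, 0.9], [0.3, 0.1, 0.8]]
tokens = greedy_decode(steps, eos_token_id=2)
empty = greedy_decode(steps, eos_token_id=2, min_new_tokens=0)
```

With the mask in place the first step falls back to the best non-EOS token, and EOS is allowed again from the second step onward.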

Related

Using TokenStreamRewriter to insert tokens after lexing but before parsing

Using ANTLR 4.9.2 for C++.
Depending on the first tokens I might need to insert some tokens before parsing. My approach (simplified)
antlr4::ANTLRInputStream antlrIs(properlyEscaped);
Lexer lexer(&antlrIs);
antlr4::CommonTokenStream tokens(&lexer);
antlr4::TokenStreamRewriter tokenStreamRewriter(&tokens);
if (!(tokens.LA(1) == Lexer::MY_SPECIAL_TOKEN))
{
tokenStreamRewriter.insertBefore(tokens.LT(1), string("begin"));
}
Parser parser(&tokens);
Parser::FileContext* fileContext = parser.file();
Stepping through with the debugger, I see that the token is actually inserted. But the new token I insert seems to be ignored by parser.file().
How can I insert tokens so that parser.file() uses them?

TokenStreamRewriter just builds up a set of instructions for how the input stream should be changed. It doesn't actually change the token stream itself.
Once you have executed all of your modification calls, you'll need to call .getText() (or .getText(String programName)) to get a String that has all of your changes incorporated. Then you can use that string as the input to your Lexer to get a token stream containing your modifications.
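The "instructions, not mutation" idea can be illustrated with a toy model. This is a Python sketch, not the ANTLR runtime: ToyRewriter, insert_before and get_text are hypothetical names that only mimic the behaviour described above.

```python
class ToyRewriter:
    """Toy stand-in for a token-stream rewriter: records edits, applies them lazily."""

    def __init__(self, tokens):
        self.tokens = tokens   # original token texts, never mutated
        self.inserts = {}      # token index -> text to emit before that token

    def insert_before(self, index, text):
        # Only an instruction is stored; self.tokens stays untouched.
        self.inserts[index] = text + self.inserts.get(index, "")

    def get_text(self):
        # The edits are applied only while rendering the output string.
        parts = []
        for i, tok in enumerate(self.tokens):
            if i in self.inserts:
                parts.append(self.inserts[i])
            parts.append(tok)
        return "".join(parts)

tokens = ["x", " ", "=", " ", "1"]
rewriter = ToyRewriter(tokens)
rewriter.insert_before(0, "begin ")
rewritten = rewriter.get_text()
```

A parser reading the original tokens list would still see the unmodified input, which matches the symptom in the question; only the rendered text contains the inserted token, so it has to be lexed again.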

How to make TAG_ALPHA_IDENTIFIER empty not to ask user for a confirmation

My wallet applet needs to perform actions such as PLAY TONE, but each one prompts the user with "Yes or No?". As far as I know, TAG_ALPHA_IDENTIFIER is responsible for that. However, if I try the code below, it still asks for user confirmation, but now with "#" text. How can I get rid of the user confirmation altogether?
Attempt 1. Failed with NullPointerException
proHdlr.appendTLV(ToolkitConstants.TAG_ALPHA_IDENTIFIER, null, (short)0, (short)0);
proHdlr.send();
Attempt 2. Prompts '##'
proHdlr.appendTLV(ToolkitConstants.TAG_ALPHA_IDENTIFIER, (byte)0, (byte)0);
proHdlr.send();
Attempt 3. Prompts '#'
proHdlr.appendTLV(ToolkitConstants.TAG_ALPHA_IDENTIFIER, (byte)0);
proHdlr.send();
Attempt 4. Prompts Default Text
byte[] ALPHA_MSG = {};
proHdlr.appendTLV(ToolkitConstants.TAG_ALPHA_IDENTIFIER, ALPHA_MSG, (short)0, (short)ALPHA_MSG.length);
proHdlr.send();
According to ETSI 102.223, "8.2 Alpha identifier" section, it should be:
Description             Length
Alpha identifier tag    1
Length (X)              Y
Alpha identifier        X
There is also a "Default text" in the documentation; however, since "5.3.7 Text attributes" requires the Alpha Identifier to be present, the default text should not matter, right?
In this document, section "6.4.5 PLAY TONE" on page 45 says:
if the alpha identifier is provided by the UICC and is a null data object (i.e. length = '00' and no value part), the terminal should not give any information to the user;
That's what I need. How should I do it in Java with ProactiveHandler? All my Google searches end up with some text/menu title for the Alpha Identifier.
How can I get rid of the user confirmation and perform the proactive action without it?

a) Try to pass no data at all, i.e. leave out the proHdlr.appendTLV(ToolkitConstants.TAG_ALPHA_IDENTIFIER line.
b) The behavior might be phone-specific or, more precisely, modem-specific. Try a MediaTek-based phone, a Qualcomm-based phone and an iPhone, and compare the results.
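For reference, the byte layout quoted from the specification in the question can be sketched as follows. This is an illustration in Python rather than Java Card code; the tag value 0x05 and the helper name encode_tlv are assumptions made for the example, and only single-byte lengths are handled.

```python
ALPHA_IDENTIFIER_TAG = 0x05  # assumed tag value, for illustration only

def encode_tlv(tag: int, value: bytes = b"") -> bytes:
    """Encode a tag byte, a one-byte length, then the value bytes."""
    if len(value) > 0x7F:
        raise ValueError("this sketch only handles single-byte lengths")
    return bytes([tag, len(value)]) + value

# Null data object: length '00' and no value part.
null_object = encode_tlv(ALPHA_IDENTIFIER_TAG)

# A single zero value byte is NOT a null object: its length is '01'.
one_zero_byte = encode_tlv(ALPHA_IDENTIFIER_TAG, b"\x00")
```

If the one- and two-byte appendTLV overloads in attempts 2 and 3 append value bytes, as their signatures suggest, they would produce a non-zero length, which could explain why a character is displayed. Only a zero-length value corresponds to the null data object described in section 6.4.5.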

In Gatling, how can I generate a random number each time a call is executed? (not using feeder)

I need to find a way to generate a random number each time the REST call is executed.
I have the following GET call:
exec(http("Random execution")
  .get("/randomApi")
  .queryParam("id", getRandomId()))
Obviously this doesn't work: the random number is generated only once, so I end up with the same number whenever this call is executed. I can't use the feeder option, as my feeder is already huge and is generated by a third party for each test.

.queryParam takes Expressions as its arguments, and since Expression is an alias for a session function, you can just do...
.queryParam("id", session => getRandomId())
You could also define a second feeder that uses a function to generate the values - no need to update your existing feeder or add another csv file. This would be useful if you had more complicated logic for getting / generating an Id
val idFeeder = Iterator.continually(Map("id" -> Random.nextInt(999999)))
//in your scenario...
.feed(idFeeder)
.exec(http("Random execution")
.get("/randomApi")
.queryParam("id", "${id}")
)
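The underlying difference is between passing a value, which is computed once when the scenario is built, and passing a function, which is called on every execution. A small Python sketch of that distinction follows; it is not Gatling code, and build_request and get_random_id are hypothetical stand-ins.

```python
import random

def get_random_id():
    return random.randint(0, 999_999)

def build_request(id_param):
    """id_param is either a fixed value or a callable evaluated per execution."""
    def execute():
        value = id_param() if callable(id_param) else id_param
        return f"/randomApi?id={value}"
    return execute

# Eager: get_random_id() runs once, while the request is being defined.
eager = build_request(get_random_id())

# Lazy: the function itself is passed and runs on every execution.
lazy = build_request(get_random_id)

eager_urls = {eager() for _ in range(20)}
lazy_urls = {lazy() for _ in range(20)}
```

The eager request always yields the same URL, while the lazy one draws a fresh id each time, which is what the session-function form of queryParam achieves.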

In the spirit of having options, another option you have is to store an object in the session that support toString, which generates whatever you need. It's a nifty trick that you can use for all kinds of things.
object RANDOM_ID {
  // Evaluated again each time the object is rendered as a string
  override def toString: String = RandomId().toString
}
...
exec( _.set( "RANDOM_ID", RANDOM_ID ) )
...
.exec(
http("Random execution")
.get("/randomApi")
.queryParam( "id", "${RANDOM_ID}" )
)
You can apply the same principle to generating random names, addresses, telephone numbers, you name it.
So, which is the better solution? The feeder, or the object in session?
Most of the time, it'll be the feeder, because you control when it is updated. The object in session yields a different value every time it is accessed, whereas with the feeder you decide when the value changes and can reference it multiple times before that.
But there may be instances where the stored object solution results in easier to read code, provided you are good with the value changing every time it is accessed. So it's good to know that it is an option.

Yii2 - how to properly use generatePasswordHash()?

I'm trying to generate a random password for a user in a Yii2 application.
I have the following code:
$rand_password = Yii::$app->security->generateRandomString(8);
$user->password = Yii::$app->security->generatePasswordHash($rand_password);
After that I save the $user model and the hashed string is also saved in the database. However, I cannot log in with the $rand_password string after that as I'm getting Invalid Password error message.
The generatePasswordHash description says that the hash is generated from the provided password and a random salt string. Indeed, I called the function with the same password string several times in a row and I got different result every time. So my question is, if that salt string is random and different every time, how can I use this function at all to verify passwords? When I try to login I call the same function with the password string provided by the user but this time the salt will be different so I'm unable to produce the same hash as before? What am I missing here?

Well, after hours of debugging and looking for resources and explanations, it turns out that the user module I'm using, https://github.com/amnah/yii2-user, automatically hashes passwords before saving them in the database. In other words, as soon as you call:
$user->password = SOMETHING;
that SOMETHING is automatically going through the generatePasswordHash() function upon save. My problem was that I was dropping it in there in my code as well so basically the password got hashed twice.
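On the question of how a randomly salted hash can be verified at all: the salt is stored inside the hash string, so verification re-derives the hash using the stored salt instead of generating a new one. In Yii2 this is what Yii::$app->security->validatePassword($password, $hash) is for. The Python sketch below illustrates the principle and the double-hashing failure; it uses PBKDF2 from the standard library rather than Yii2's own algorithm, and the "iterations$salt$digest" storage format is an assumption made for the example.

```python
import hashlib
import hmac
import os

def generate_password_hash(password, iterations=100_000):
    """Hash with a fresh random salt and embed the salt in the result."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"

def validate_password(password, stored):
    """Re-derive the digest using the salt stored alongside it."""
    iterations, salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(digest.hex(), digest_hex)

first = generate_password_hash("secret")
second = generate_password_hash("secret")

# Hashing an already hashed value, as happened in the question.
double_hashed = generate_password_hash(first)

results = (
    first != second,                             # different salts, different hashes
    validate_password("secret", first),          # still verifies
    validate_password("secret", double_hashed),  # fails: the hash was hashed again
)
```

Comparing two freshly generated hashes never works; the stored hash must be checked with the validation function.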

ANTLR4: Getting start and end index for each rule: $stop behaves strange

I need to get the start and end index of each rule. I.e., the start index is the character position of the first character of the first token belonging to the rule and the end index is the last character position of the last token belonging to the rule. With these numbers I can crop the result of a rule out of the input file precisely.
The straightforward way of doing this should be to use the $start and $stop tokens, i.e., $start.getStartIndex() and $stop.getStopIndex(). However, I have found that the $stop token is often null, even when used in the @after action.
According to The Definitive ANTLR 4 Reference, the $stop token is defined as: "The last nonhidden channel token to be matched by the rule. When referring to the current rule, this attribute is available only to the after and finally actions." This sounds as if such a token should exist (at least for any rule that matches at least one token). Thus, it is quite strange that this token is null in many cases (even for rules that have a simple token, not a subrule, as their last token). How can a stop token be null in this case?
Right now, I am using a workaround by just asking the input about its current token, moving one token back and using this token as stop token. However, this seems hacky:
@after {
    int start = $start.getStartIndex();
    int stop = _input.get(_input.index()-1).getStopIndex();
    // do something with start and stop
}
The cleaner solution (if stop was not null) should look like this:
@after {
    int start = $start.getStartIndex();
    int stop = $stop.getStopIndex();
}

The stop token is set in the finally block in the generated code, after any user-defined @finally{} action is executed. The @after{} code is executed in the try block, which also occurs before the stop token is set.
The stop property only works for qualified references. For example, you could do the following:
foo : bar {assert $bar.stop != null};
Also, note that ANTLR 4 is designed to encourage the relocation of action code from embedded actions to listener and/or visitor interfaces that operate on the parse tree after parsing is complete. When used in this manner, the stop tokens will be set for all contexts in the tree. In nearly all cases, the use of an @after or @finally block is a code smell in ANTLR 4 that you should avoid.
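The ordering described in this answer can be modelled in a few lines. This is a simplified Python sketch of the shape of a generated rule method as the answer describes it, not actual ANTLR output; Context and parse_rule are hypothetical names.

```python
class Context:
    def __init__(self, start):
        self.start = start
        self.stop = None  # filled in only when the rule method exits

def parse_rule(tokens, after_action):
    """Mimics a generated rule method: the after action runs inside try,
    and the stop token is assigned later, in finally."""
    ctx = Context(start=tokens[0])
    try:
        # ... token matching would happen here ...
        after_action(ctx)        # the rule's own stop is still unset at this point
    finally:
        ctx.stop = tokens[-1]    # assigned after all user actions have run
    return ctx

seen_in_after = []
ctx = parse_rule(["a", "b", "c"], lambda c: seen_in_after.append(c.stop))
```

The after action observes None, while code that inspects the context once the rule has returned (a parent rule, a listener or a visitor) sees the stop token set.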
