How to read predict() result in Tensorflowjs using a SavedModel - node.js

Code using tfjs-node:
const model = await tf.node.loadSavedModel(modelPath);
const data = fs.readFileSync(imgPath);
const tfimage = tf.node.decodeImage(data, 3);
const expanded = tfimage.expandDims(0);
const result = model.predict(expanded);
console.log(result);
for (const r of result) {
  console.log(r.dataSync());
}
Output:
(8) [Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor]
Float32Array(100) [48700, 48563, 48779, 48779, 49041, 48779, ...]
Float32Array(400) [0.10901492834091187, 0.18931034207344055, 0.9181075692176819, 0.8344497084617615, ...]
Float32Array(100) [61, 88, 65, 84, 67, 51, 62, 20, 59, 9, 18, ...]
Float32Array(9000) [0.009332209825515747, 0.003941178321838379, 0.0005068182945251465, 0.001926332712173462, 0.0020033419132232666, 0.000742495059967041, 0.022082984447479248, 0.0032682716846466064, 0.05071520805358887, 0.000018596649169921875, ...]
Float32Array(100) [0.6730095148086548, 0.1356855034828186, 0.12674063444137573, 0.12360832095146179, 0.10837388038635254, 0.10075071454048157, ...]
Float32Array(1) [100]
Float32Array(196416) [0.738592267036438, 0.4373246729373932, 0.738592267036438, 0.546840488910675, -0.010780575685203075, 0.00041256844997406006, 0.03478313609957695, 0.11279871314764023, -0.0504981130361557, -0.11237315833568573, 0.02907072752714157, 0.06638012826442719, 0.001794634386897087, 0.0009463857859373093, ...]
Float32Array(4419360) [0.0564018189907074, 0.016801774501800537, 0.025803595781326294, 0.011671125888824463, 0.014013528823852539, 0.008442580699920654, ...]
How do I read the predict() response for object detection? I was expecting a dictionary with num_detections, detection_boxes, detection_classes, etc. as described here.
I also tried using tf.execute(), but it throws the following error: UnhandledPromiseRejectionWarning: Error: execute() of TFSavedModel is not supported yet.
I'm using efficientdet/d0 downloaded from here.

When you download a tensor with dataSync() you only get its values. If you want the object describing each result, rather than the raw values, just console.log(result). When you expand one of the logged tensors in the console, it should look something like this:
Tensor {
  "dataId": Object {},
  "dtype": "float32",
  "id": 160213,
  "isDisposedInternal": false,
  "kept": false,
  "rankType": "2",
  "scopeId": 365032,
  "shape": Array [
    1,
    3,
  ],
  "size": 3,
  "strides": Array [
    3,
  ],
}
The output of your console.log(result) contains 8 tensors, which shows that it is correct. You are looping over each of the results, and the outputs correspond to these names:
['num_detections', 'detection_boxes', 'detection_classes', 'detection_scores', 'raw_detection_boxes', 'raw_detection_scores', 'detection_anchor_indices', 'detection_multiclass_scores']
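As a minimal sketch of how you might rebuild the dictionary you were expecting: judging by the sizes in your log (100 anchor indices, 400 = 100 × 4 box coordinates, 9000 = 100 × 90 multiclass scores, a single num_detections of 100, and the larger raw_* arrays), the tensors appear to come back in alphabetical order of those names. That ordering is an assumption, so verify it against your model's signature (for example with the saved_model_cli tool) before relying on it:
// Assumed output order for this EfficientDet-D0 SavedModel, inferred from the
// tensor sizes in the question; verify against the model signature before use.
const names = [
  'detection_anchor_indices',    // 100
  'detection_boxes',             // 100 x 4
  'detection_classes',           // 100
  'detection_multiclass_scores', // 100 x 90
  'detection_scores',            // 100
  'num_detections',              // 1
  'raw_detection_boxes',         // 49104 x 4
  'raw_detection_scores',        // 49104 x 90
];

// Rebuild the expected dictionary from the array of output tensors.
const output = {};
result.forEach((tensor, i) => { output[names[i]] = tensor; });

const numDetections = output['num_detections'].dataSync()[0];
const boxes = output['detection_boxes'].arraySync()[0];   // [ymin, xmin, ymax, xmax], normalized
const classes = output['detection_classes'].dataSync();
const scores = output['detection_scores'].dataSync();

for (let i = 0; i < numDetections; i++) {
  if (scores[i] < 0.5) continue; // keep only confident detections
  console.log(`class ${classes[i]} score ${scores[i].toFixed(2)} box ${boxes[i]}`);
}
If the order does not match, print each tensor's shape (tensor.shape) and match it to the sizes above.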

Related

Understanding the config file of paraphrase mpnet base v2?

Here is the config file of the paraphrase-mpnet transformer model. I would like to understand, with examples, the meaning of the hidden_size and num_hidden_layers parameters.
{
"_name_or_path": "old_models/paraphrase-mpnet-base-v2/0_Transformer",
"architectures": [
"MPNetModel"
],
"attention_probs_dropout_prob": 0.1,
"bos_token_id": 0,
"eos_token_id": 2,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 514,
"model_type": "mpnet",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pad_token_id": 1,
"relative_attention_num_buckets": 32,
"transformers_version": "4.7.0",
"vocab_size": 30527
}

Unrecognized configuration class <class 'transformers.models.bert.configuration_bert.BertConfig'> for this kind of AutoModel: AutoModelForSeq2SeqLM

Model type should be one of BartConfig, PLBartConfig, BigBirdPegasusConfig, M2M100Config, LEDConfig, BlenderbotSmallConfig, MT5Config, T5Config, PegasusConfig, MarianConfig, MBartConfig, BartConfig, BlenderbotConfig, FSMTConfig, XLMProphetNetConfig, ProphetNetConfig, EncoderDecoderConfig.
I am trying to load a fine-tuned Bert model for machine translation using AutoModelForSeq2SeqLM but it can't recognize the configuration class.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments, Seq2SeqTrainer
model = AutoModelForSeq2SeqLM.from_pretrained('/content/drive/MyDrive/Models/CSE498')
Config File
{
"_name_or_path": "ckiplab/albert-tiny-chinese",
"architectures": [
"BertForMaskedLM"
],
"attention_probs_dropout_prob": 0.0,
"bos_token_id": 101,
"classifier_dropout": null,
"classifier_dropout_prob": 0.1,
"down_scale_factor": 1,
"embedding_size": 128,
"eos_token_id": 102,
"gap_size": 0,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.0,
"hidden_size": 312,
"initializer_range": 0.02,
"inner_group_num": 1,
"intermediate_size": 1248,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "bert",
"net_structure_type": 0,
"num_attention_heads": 12,
"num_hidden_groups": 1,
"num_hidden_layers": 4,
"num_memory_blocks": 0,
"pad_token_id": 0,
"position_embedding_type": "absolute",
"tokenizer_class": "BertTokenizerFast",
"torch_dtype": "float32",
"transformers_version": "4.18.0",
"type_vocab_size": 2,
"use_cache": true,
"vocab_size": 30522
}
This is because BERT itself is not a seq2seq model. You can consider using a pre-trained BART instead.

HuggingFace - GPT2 Tokenizer configuration in config.json

The fine-tuned GPT-2 model is uploaded to huggingface-models for inference.
The error below is observed during inference:
Can't load tokenizer using from_pretrained, please update its configuration: Can't load tokenizer for 'bala1802/model_1_test'. Make sure that: - 'bala1802/model_1_test' is a correct model identifier listed on 'https://huggingface.co/models' - or 'bala1802/model_1_test' is the correct path to a directory containing relevant tokenizer files
Below is the configuration (config.json) file for the fine-tuned Hugging Face model:
{
"_name_or_path": "gpt2",
"activation_function": "gelu_new",
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.1,
"bos_token_id": 50256,
"embd_pdrop": 0.1,
"eos_token_id": 50256,
"gradient_checkpointing": false,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 1024,
"n_embd": 768,
"n_head": 12,
"n_inner": null,
"n_layer": 12,
"n_positions": 1024,
"resid_pdrop": 0.1,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"task_specific_params": {
"text-generation": {
"do_sample": true,
"max_length": 50
}
},
"transformers_version": "4.3.2",
"use_cache": true,
"vocab_size": 50257
}
Should I configure the GPT-2 tokenizer in the config.json file, just like the "model_type": "gpt2" entry?
Your repository does not contain the files required to create a tokenizer. It seems you have only uploaded the files for your model. Create an instance of the tokenizer you used for training the model and save the required files with save_pretrained():
from transformers import GPT2Tokenizer
t = GPT2Tokenizer.from_pretrained("gpt2")
t.save_pretrained('/SOMEFOLDER/')
Output:
('/SOMEFOLDER/tokenizer_config.json',
'/SOMEFOLDER/special_tokens_map.json',
'/SOMEFOLDER/vocab.json',
'/SOMEFOLDER/merges.txt',
'/SOMEFOLDER/added_tokens.json')

TatSu: yaml.representer.RepresenterError when dumping to YAML

I have an object model generated by TatSu after a successful parse. The model dumps to stdout in JSON format fine, but when I try to dump it to YAML I get a RepresenterError exception, and I am not sure how to solve this. The object model is generated internally by TatSu. Can anyone shed any light on how to resolve this error?
Using Python 3.7.0 with TatSu v4.4.0 and PyYAML 5.1.2.
My code:
import sys
import json
import datetime
import tatsu
from tatsu.ast import asjson
from tatsu.objectmodel import Node
from tatsu.semantics import ModelBuilderSemantics
from tatsu.exceptions import FailedParse

class ModelBase(Node):
    pass

class MyModelBuilderSemantics(ModelBuilderSemantics):
    def __init__(self, context=None, types=None):
        types = [
            t for t in globals().values()
            if type(t) is type and issubclass(t, ModelBase)
        ] + (types or [])
        super(MyModelBuilderSemantics, self).__init__(context=context, types=types)

def main():
    sys.setrecursionlimit(10000)
    grammar = open('STIL1999.ebnf.working').read()
    parser = tatsu.compile(grammar, semantics=MyModelBuilderSemantics(), asmodel=True)
    assert (parser is not None)
    try:
        start = datetime.datetime.now()
        ast = parser.parse(open(sys.argv[1]).read(), filename=sys.argv[1])
        finish = datetime.datetime.now()
        print('Total = %s' % (finish - start).total_seconds())
        print(json.dumps(asjson(ast), indent=2))
    except FailedParse as e:
        print('Parse error : %s' % e.message)
        print(e.buf.line_info(e.pos))
        return 1
    from tatsu.yaml import ast_dump
    ast_dump(ast, stream=open('foo.yaml', 'w'))
    return 0

if __name__ == '__main__':
    sys.exit(main())
The output:
Total = 0.007043
{
  "__class__": "StilSession",
  "version": {
    "ver": 1.0
  },
  "header": {
    "__class__": "Header",
    "objs": [
      {
        "k": "Title",
        "v": "foo.gz"
      },
      {
        "k": "Date",
        "v": "Mon Nov 4 02:48:48 2019"
      },
      {
        "k": "Source",
        "v": "foo.gz"
      },
      {
        "k": "History",
        "objs": [
          {
            "__class__": "Annotation",
            "ann": " This is a test "
          }
        ]
      }
    ]
  },
  "blocks": []
}
Traceback (most recent call last):
  File "./run.py", line 57, in <module>
    sys.exit(main())
  File "./run.py", line 52, in main
    ast_dump(ast, stream=open('foo.yaml', 'w'))
  File "/sw_tools/anaconda3/lib/python3.7/site-packages/tatsu/yaml.py", line 50, in ast_dump
    return dump(data, object_pairs_hook=AST, **kwargs)
  File "/sw_tools/anaconda3/lib/python3.7/site-packages/tatsu/yaml.py", line 33, in dump
    **kwds
  File "/sw_tools/anaconda3/lib/python3.7/site-packages/yaml/__init__.py", line 290, in dump
    return dump_all([data], stream, Dumper=Dumper, **kwds)
  File "/sw_tools/anaconda3/lib/python3.7/site-packages/yaml/__init__.py", line 278, in dump_all
    dumper.represent(data)
  File "/sw_tools/anaconda3/lib/python3.7/site-packages/yaml/representer.py", line 27, in represent
    node = self.represent_data(data)
  File "/sw_tools/anaconda3/lib/python3.7/site-packages/yaml/representer.py", line 58, in represent_data
    node = self.yaml_representers[None](self, data)
  File "/sw_tools/anaconda3/lib/python3.7/site-packages/yaml/representer.py", line 231, in represent_undefined
    raise RepresenterError("cannot represent an object", data)
yaml.representer.RepresenterError: ('cannot represent an object', <tatsu.synth.StilSession object at 0x7ffff68e8f98>)
This issue was resolved by the OP in TatSu pull request #146

knex postgres returns strings for numeric/decimal values

I have a table with column
table.decimal('some_column', 30,15) which on postgres is numeric(30,15)
When I run a knex.raw('select some_column from some_table') from node, the response I get in rows is like:
some_column: "5.000000000000000"
some_column: "10.000000000000000"
What really pointed me to this is that when I do something like firstValue > lastValue, I end up with a true response, which makes me think these values are returned as strings and not as numbers.
Any way to override this behavior?
For an explanation of why this happens and the possible solutions,
check this great answer: https://stackoverflow.com/a/39176670/7668448
Use of pg-types
Check my answer here:
https://stackoverflow.com/a/57210469/7668448
In summary:
All built-in types:
const typesBuiltins = {
BOOL: 16,
BYTEA: 17,
CHAR: 18,
INT8: 20,
INT2: 21,
INT4: 23,
REGPROC: 24,
TEXT: 25,
OID: 26,
TID: 27,
XID: 28,
CID: 29,
JSON: 114,
XML: 142,
PG_NODE_TREE: 194,
SMGR: 210,
PATH: 602,
POLYGON: 604,
CIDR: 650,
FLOAT4: 700,
FLOAT8: 701,
ABSTIME: 702,
RELTIME: 703,
TINTERVAL: 704,
CIRCLE: 718,
MACADDR8: 774,
MONEY: 790,
MACADDR: 829,
INET: 869,
ACLITEM: 1033,
BPCHAR: 1042,
VARCHAR: 1043,
DATE: 1082,
TIME: 1083,
TIMESTAMP: 1114,
TIMESTAMPTZ: 1184,
INTERVAL: 1186,
TIMETZ: 1266,
BIT: 1560,
VARBIT: 1562,
NUMERIC: 1700,
REFCURSOR: 1790,
REGPROCEDURE: 2202,
REGOPER: 2203,
REGOPERATOR: 2204,
REGCLASS: 2205,
REGTYPE: 2206,
UUID: 2950,
TXID_SNAPSHOT: 2970,
PG_LSN: 3220,
PG_NDISTINCT: 3361,
PG_DEPENDENCIES: 3402,
TSVECTOR: 3614,
TSQUERY: 3615,
GTSVECTOR: 3642,
REGCONFIG: 3734,
REGDICTIONARY: 3769,
JSONB: 3802,
REGNAMESPACE: 4089,
REGROLE: 4096
};
Which you can find here
https://github.com/brianc/node-pg-types/blob/master/lib/builtins.js
Use example:
const pg = require('pg');
pg.types.setTypeParser(pg.types.builtins.INT8, (value: string) => {
  return parseInt(value);
});
pg.types.setTypeParser(pg.types.builtins.FLOAT8, (value: string) => {
  return parseFloat(value);
});
pg.types.setTypeParser(pg.types.builtins.NUMERIC, (value: string) => {
  return parseFloat(value);
});
You can take a look at the pg-types module, which is used by the pg module, which is in turn used by knex, and configure parsing of your variables:
var types = require('pg').types
types.setTypeParser(<I DONT REMEMBER VAR NAME, NEED TO CHECK>, value => value === null ? null : +value)
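To tie this back to knex: here is a minimal sketch, assuming the pg driver and the NUMERIC type from the builtins list above; the connection string and table are placeholders. Note that parseFloat loses precision for values that don't fit in a JavaScript double, so consider a decimal library if you need all 15 fractional digits.
// Minimal sketch: register the NUMERIC parser before creating the knex instance.
const pg = require('pg');
pg.types.setTypeParser(pg.types.builtins.NUMERIC, value =>
  value === null ? null : parseFloat(value));

const knex = require('knex')({
  client: 'pg',
  connection: process.env.DATABASE_URL, // placeholder connection string
});

// some_column now comes back as a number instead of a string
knex.raw('select some_column from some_table')
  .then(result => console.log(typeof result.rows[0].some_column)); // 'number'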
