python3 cannot successfully compile the proto file

OS: windows 11
libprotoc 3.20.3
Python 3.10.9
I was trying to compile a proto file, and no error message was prompted after I executed the command "protoc --proto_path=C:\Work\test pm_event.proto --python_out=.", but the DESCRIPTOR._options in the generated pb2.py file is always None, and all fields have only the starting/ending position info.
I can't provide the proto files, but I can confidently say that they are correct, because my colleagues can successfully compile these same proto files with Java.
It seems something is wrong with my environment. I just made a very simple proto file:
example.proto
syntax = "proto3";

message ExampleMessage {
  int32 id = 1;
  string name = 2;
}
Then I ran the command "protoc --python_out=. example.proto" and got example_pb2.py with the following content:
# -*- coding: utf-8 -*-
# Generated by the protocol buffer compiler. DO NOT EDIT!
# source: example.proto
"""Generated protocol buffer code."""
from google.protobuf.internal import builder as _builder
from google.protobuf import descriptor as _descriptor
from google.protobuf import descriptor_pool as _descriptor_pool
from google.protobuf import symbol_database as _symbol_database
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()
DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\rexample.proto\"*\n\x0e\x45xampleMessage\x12\n\n\x02id\x18\x01 \x01(\x05\x12\x0c\n\x04name\x18\x02 \x01(\tb\x06proto3')
_builder.BuildMessageAndEnumDescriptors(DESCRIPTOR, globals())
_builder.BuildTopDescriptorsAndMessages(DESCRIPTOR, 'example_pb2', globals())
if _descriptor._USE_C_DESCRIPTORS == False:
  DESCRIPTOR._options = None
  _EXAMPLEMESSAGE._serialized_start=17
  _EXAMPLEMESSAGE._serialized_end=59
# @@protoc_insertion_point(module_scope)
So what should I do next?

The generated output looks correct.
Here's an example for another very basic proto file (a single Event message with one string field, name).
event_pb2.py:
# -*- coding: utf-8 -*-
# Generated by the protocol buffer compiler. DO NOT EDIT!
# source: event.proto
"""Generated protocol buffer code."""
from google.protobuf.internal import builder as _builder
from google.protobuf import descriptor as _descriptor
from google.protobuf import descriptor_pool as _descriptor_pool
from google.protobuf import symbol_database as _symbol_database
# @@protoc_insertion_point(imports)
_sym_db = _symbol_database.Default()
DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x0b\x65vent.proto\"\x15\n\x05\x45vent\x12\x0c\n\x04name\x18\x01 \x01(\tb\x06proto3')
_globals = globals()
_builder.BuildMessageAndEnumDescriptors(DESCRIPTOR, _globals)
_builder.BuildTopDescriptorsAndMessages(DESCRIPTOR, 'event_pb2', _globals)
if _descriptor._USE_C_DESCRIPTORS == False:
  DESCRIPTOR._options = None
  _globals['_EVENT']._serialized_start=15
  _globals['_EVENT']._serialized_end=36
# @@protoc_insertion_point(module_scope)
The Python implementation of Protocol Buffers differs from e.g. the Java and Golang implementations: protoc's Python-generated code contains only descriptors, and the runtime produces the implementation. See the Python Generated Code Guide.
The way to confirm this is to use the protoc-generated stub code; it should work.
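For example, a minimal sketch (assuming the example_pb2.py generated above is importable from the current directory) showing that the message class is fully functional despite the sparse-looking module:
import example_pb2

msg = example_pb2.ExampleMessage(id=42, name="test")
data = msg.SerializeToString()  # serialize to bytes

parsed = example_pb2.ExampleMessage()
parsed.ParseFromString(data)  # round-trip back from bytes
print(parsed.id, parsed.name)  # -> 42 test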
There's another 'trick' with protoc for Python: generating a descriptor that Visual Studio Code can use to provide IntelliSense. Add --pyi_out to protoc to get it to generate Python .pyi stub files. See the Python pyi file generation description.
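For the example above, that would be "protoc --python_out=. --pyi_out=. example.proto", which should also emit an example_pb2.pyi stub next to example_pb2.py (assuming a protoc new enough to support --pyi_out; the 3.20.3 from the question should be).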


Constant variable names in all modules/functions

This is version 2 of my question, after the comment on v1 by Omer Dagry.
What's the best way to ensure that constants are available throughout my code?
I've created this constants.ini file that I want to be able to use in any module I create to ensure standard variables for certain functions.
; dd_config.ini
[DEBUG_TYPE]
DEBUG_SUBSTR = -1
DEBUG_START = 0
DEBUG_OS = 1
DEBUG_GENERAL = 2
DEBUG_END = 3
I have re-ordered program 1 so that it populates dd_con before importing dd_debug_exception:
from configparser import ConfigParser

# read the .ini file for some of the code
dd_con = ConfigParser()
dd_con.read(
    "C:/Users/DD/dd_constants.ini"
)
# standard error reporting if debug needed
from test_debug_2 import dd_debug_exception
_SHORT = dd_con.get("DEBUG_STYLE", "DEBUG_SHORT")
print(_SHORT)
dd_debug_exception(0, "test", _SHORT)
The imported test_debug_2.py defines dd_debug_exception like this:
def dd_debug_exception(debug_type,
                       debug_str,
                       debug_style: int = dd_con.get("DEBUG_TYPE",
                                                     "DEBUG_NORMAL")):
    # Handle exceptions in a standard way
    if debug_style == _SHORT:
        print("hello")
When I try to run it I get the following error:
Traceback (most recent call last):
  File "c:\Users\DD\test_config.py", line 9, in <module>
    from test_debug_2 import dd_debug_exception
  File "C:\Users\DD\test_debug_2.py", line 3, in <module>
    debug_style: int = dd_con.get("DEBUG_TYPE",
NameError: name 'dd_con' is not defined
The import still isn't recognising dd_con, and therefore can't see my standard variables.
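One way around this (a sketch based on an assumption, not part of the original question: module-level code in test_debug_2.py can never see names defined in the importing script, regardless of import order) is to give the constants their own module and import dd_con from there wherever it's needed:
# dd_constants.py -- hypothetical shared module: every importer gets the
# same already-populated ConfigParser instead of relying on another
# module's globals.
from configparser import ConfigParser

dd_con = ConfigParser()
dd_con.read("C:/Users/DD/dd_constants.ini")
test_debug_2.py would then start with "from dd_constants import dd_con".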

decodestrings is not an attribute of base64 error in python 3.9.1

After upgrading from Python 3.8.0 to Python 3.9.1, the tremc front-end of the Transmission BitTorrent client throws a "decodestrings is not an attribute of base64" error whenever I click on a torrent entry to check the details.
My system specs:
OS: Arch linux
kernel: 5.6.11-clear-linux
The Python 3.9 changelog says:
base64.encodestring() and base64.decodestring(), aliases deprecated since Python 3.1, have been removed: use base64.encodebytes() and base64.decodebytes() instead.
So I went to the site-packages directory and searched for the string decodestring with ripgrep:
rg decodestring
paramiko/py3compat.py
39: decodebytes = base64.decodestring
Upon examining the py3compat.py file, I found this block:
PY2 = sys.version_info[0] < 3

if PY2:
    string_types = basestring  # NOQA
    text_type = unicode  # NOQA
    bytes_types = str
    bytes = str
    integer_types = (int, long)  # NOQA
    long = long  # NOQA
    input = raw_input  # NOQA
    decodebytes = base64.decodestring
    encodebytes = base64.encodestring
So decodebytes has replaced (aliased) the decodestring attribute of base64 for Python versions >= 3.
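A quick interactive check of the replacement (plain Python, nothing tremc-specific):
import base64

encoded = base64.encodebytes(b"hello")  # replaces base64.encodestring()
print(base64.decodebytes(encoded))      # replaces base64.decodestring() -> b'hello'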
This must be a recent removal, because tremc was working fine up until Python 3.8.*.
I opened the tremc script, found the erring line (line 441), and simply replaced the attribute decodestring with decodebytes. A quick fix until the next update.
PS: I checked the GitHub repository, and there's a pull request for this waiting.
If you don't want to wait for the next release and also don't want to hack it the way I did, you can build it fresh from the repository, though that wouldn't be much different from my method.

pytest fails when using io.BytesIO stream instead of PDF file

I'm running pytest to check a function that uses pdfminer to convert PDF to text. The function works when run directly with $ python function.py, and the result is what I expect. I should also point out that I'm parsing the file from a stream (io.BytesIO), and this stream is the reason my test fails.
Running pytest, the function fails with a PDFSyntaxError.
# function.py
...
import io
import requests
from pdfminer.pdfparser import PDFParser
from pdfminer.pdfdocument import PDFDocument

req = requests.get(url_pointing_to_pdf_file)
pdf = io.BytesIO(req.content)
parser = PDFParser(pdf)
document = PDFDocument(parser, password=None)  # this fails
...
pytest reaches the __init__ method in pdfdocument.py (part of the pdfminer library), which stops here:
for xref in xrefs:
    trailer = xref.get_trailer()
    ...
    if 'Root' in trailer:
        self.catalog = dict_value(trailer['Root'])
        break
else:
    raise PDFSyntaxError('No /Root object! - Is this really a PDF?')
...
And this is what pytest shows when the test fails:
tests/test_function.py:11:
----------------------------------------------------
.../function.py:157: in function
    document = PDFDocument(parser, password=None)
...
E   pdfminer.pdfparser.PDFSyntaxError: No /Root object! - Is this really a PDF?
lib/python3.6/site-packages/pdfminer/pdfdocument.py:583: PDFSyntaxError
Running the test with a PDF file stored in the same directory as function.py succeeds, so the culprit seems to be the io.BytesIO stream holding the downloaded PDF. Since I want to keep using a stream in function.py, I would like to know if there is a better way to do this.
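A minimal diagnostic sketch (an assumption, not a confirmed fix): check that the bytes the test actually downloads look like a PDF, and rewind the stream before handing it to the parser:
import io
import requests

req = requests.get(url_pointing_to_pdf_file)  # URL placeholder from the question
assert req.content[:5] == b"%PDF-", "response body is not a PDF"

pdf = io.BytesIO(req.content)
pdf.seek(0)  # make sure the parser reads from the start of the stream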

How to get the default application mapped to a file extention in windows using Python

I would like to query Windows with a file extension as a parameter (e.g. ".jpg") and get back the path of whatever app Windows has configured as the default application for this file type.
Ideally the solution would look something like this:
from stackoverflow import get_default_windows_app
default_app = get_default_windows_app(".jpg")
print(default_app)
"c:\path\to\default\application\application.exe"
I have been investigating the winreg builtin library, which holds the registry information for Windows, but I'm having trouble understanding its structure, and the documentation is quite complex.
I'm running Windows 10 and Python 3.6.
Does anyone have any ideas to help?
The registry isn't a simple, well-structured database. The Windows shell executor has some pretty complex logic to it. But for the simple cases, this should do the trick:
import shlex
import winreg
def get_default_windows_app(suffix):
    class_root = winreg.QueryValue(winreg.HKEY_CLASSES_ROOT, suffix)
    with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, r'{}\shell\open\command'.format(class_root)) as key:
        command = winreg.QueryValueEx(key, '')[0]
    return shlex.split(command)[0]
>>> get_default_windows_app('.pptx')
'C:\\Program Files\\Microsoft Office 15\\Root\\Office15\\POWERPNT.EXE'
Though some error handling should definitely be added too.
Added some improvements to the nice code by Hetzroni, in order to handle more cases:
import os
import shlex
import winreg
def get_default_windows_app(ext):
    try:  # UserChoice\ProgId lookup first
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, r'SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\FileExts\{}\UserChoice'.format(ext)) as key:
            progid = winreg.QueryValueEx(key, 'ProgId')[0]
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, r'SOFTWARE\Classes\{}\shell\open\command'.format(progid)) as key:
            path = winreg.QueryValueEx(key, '')[0]
    except:  # UserChoice\ProgId not found
        try:
            class_root = winreg.QueryValue(winreg.HKEY_CLASSES_ROOT, ext)
            if not class_root:  # No reference from ext
                class_root = ext  # Try direct lookup from ext
            with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, r'{}\shell\open\command'.format(class_root)) as key:
                path = winreg.QueryValueEx(key, '')[0]
        except:  # Ext not found
            path = None
    # Path clean up, if any
    if path:  # Path found
        path = os.path.expandvars(path)  # Expand env vars, e.g. %SystemRoot% for ext .txt
        path = shlex.split(path, posix=False)[0]  # posix=False for Windows paths
        path = path.strip('"')  # Strip quotes
    # Return
    return path
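For instance (the resolved path is illustrative; it depends on the machine's file associations):
>>> get_default_windows_app('.txt')
'C:\\WINDOWS\\system32\\NOTEPAD.EXE'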

How to convert a compiled protocol buffer back to .proto file?

I have a compiled Google protocol buffer for Python 2, and I'm attempting to port it to Python 3. Unfortunately, I cannot find the proto file I used to generate the compiled protocol buffer anywhere. How do I recover the proto file so that I can compile a new one for Python 3? I'm unaware of what proto versions were used, and all I have is the .py file meant to run on Python 2.6.
You will have to write code (in Python, for instance) to walk through the tree of your message descriptors. They should, in principle, carry the full information of your original proto file except the code comments. And the generated Python module you still have in your possession lets you serialize the file descriptor of your proto file as a FileDescriptorProto message, which could then be fed to code that expresses it as proto source.
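A minimal sketch of that first step (my_module_pb2 is a hypothetical name for your generated module; it must be importable under a compatible runtime):
from google.protobuf import descriptor_pb2
import my_module_pb2  # hypothetical: your generated module

# Every generated module carries the serialized FileDescriptorProto
# of the .proto file it came from.
fd_proto = descriptor_pb2.FileDescriptorProto()
fd_proto.ParseFromString(my_module_pb2.DESCRIPTOR.serialized_pb)

# Walk the tree: top-level messages and their fields.
for msg in fd_proto.message_type:
    print("message", msg.name)
    for field in msg.field:
        print("  field {} = {}".format(field.name, field.number))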
As a guide, you should look into the various code generators for protoc, which actually do the same thing: they read in a file descriptor as a protobuf message, analyze it, and generate code.
Here's a basic introduction to writing a protoc plugin in Python:
https://www.expobrain.net/2015/09/13/create-a-plugin-for-google-protocol-buffer/
Here's the official list of protoc plugins:
https://github.com/google/protobuf/blob/master/docs/third_party.md
And here's a protoc plugin that generates Lua code, written in Python:
https://github.com/sean-lin/protoc-gen-lua/blob/master/plugin/protoc-gen-lua
Let's have a look at the main code block
def main():
    plugin_require_bin = sys.stdin.read()
    code_gen_req = plugin_pb2.CodeGeneratorRequest()
    code_gen_req.ParseFromString(plugin_require_bin)

    env = Env()
    for proto_file in code_gen_req.proto_file:
        code_gen_file(proto_file, env,
                      proto_file.name in code_gen_req.file_to_generate)

    code_generated = plugin_pb2.CodeGeneratorResponse()
    for k in _files:
        file_desc = code_generated.file.add()
        file_desc.name = k
        file_desc.content = _files[k]

    sys.stdout.write(code_generated.SerializeToString())
The loop "for proto_file in code_gen_req.proto_file:" iterates over the file descriptors for which protoc asked the code generator plugin to generate Lua code. So now you could do something like this:
# This should get you the file descriptor for your proto file
file_descr = your_package_pb2.sometype.DESCRIPTOR.file
# Serialized version of the file descriptor
filedescr_msg = file_descr.serialized_pb
# Required by the Lua codegen
env = Env()
# Create Lua code -> modify it to create proto code instead
code_gen_file(file_descr, env, "your_package.proto")
As mentioned in the other post(s), you'll need to walk through the tree of your descriptor message and build your proto file contents.
You can find a full C++ example in the Protocol Buffers GitHub repository. Here are some C++ code snippets from the link, to give you an idea of how to implement this in Python:
// Special case map fields.
if (is_map()) {
  strings::SubstituteAndAppend(
      &field_type, "map<$0, $1>",
      message_type()->field(0)->FieldTypeNameDebugString(),
      message_type()->field(1)->FieldTypeNameDebugString());
} else {
  field_type = FieldTypeNameDebugString();
}

std::string label = StrCat(kLabelToName[this->label()], " ");
// Label is omitted for maps, oneof, and plain proto3 fields.
if (is_map() || containing_oneof() ||
    (is_optional() && !has_optional_keyword())) {
  label.clear();
}

SourceLocationCommentPrinter comment_printer(this, prefix,
                                             debug_string_options);
comment_printer.AddPreComment(contents);
strings::SubstituteAndAppend(
    contents, "$0$1$2 $3 = $4", prefix, label, field_type,
    type() == TYPE_GROUP ? message_type()->name() : name(), number());
Where the FieldTypeNameDebugString function is shown below:
// The field type string used in FieldDescriptor::DebugString()
std::string FieldDescriptor::FieldTypeNameDebugString() const {
  switch (type()) {
    case TYPE_MESSAGE:
      return "." + message_type()->full_name();
    case TYPE_ENUM:
      return "." + enum_type()->full_name();
    default:
      return kTypeToName[type()];
  }
}
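A rough Python analogue of that switch (a sketch under assumptions: field is a google.protobuf.descriptor.FieldDescriptor, and only a few scalar types are mapped here):
from google.protobuf.descriptor import FieldDescriptor

def field_type_name(field):
    # Mirrors FieldTypeNameDebugString: message/enum fields print as
    # their fully qualified type name, scalars as the proto keyword.
    if field.type == FieldDescriptor.TYPE_MESSAGE:
        return "." + field.message_type.full_name
    if field.type == FieldDescriptor.TYPE_ENUM:
        return "." + field.enum_type.full_name
    scalar_names = {
        FieldDescriptor.TYPE_INT32: "int32",
        FieldDescriptor.TYPE_INT64: "int64",
        FieldDescriptor.TYPE_BOOL: "bool",
        FieldDescriptor.TYPE_STRING: "string",
        FieldDescriptor.TYPE_BYTES: "bytes",
    }  # extend with the remaining scalar types as needed
    return scalar_names.get(field.type, "<unknown>")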
