RowType support in Presto

RowType support in Presto - presto

Question for those who knows Presto API for plugins.
I implement BigQuery plugin. BigQuery supports struct type, which could be represented as RowType class in Presto.
RowType creates RowBlockBuilder in RowType::createBlockBuilder, which has RowBlockBuilder::appendStructure method, which requires to accept only instance of AbstractSingleRowBlock class.
This means that in my implementation of Presto's RecordCursor BigQueryRecordCursor::getObject method I had to return something that is AbstractSingleRowBlock for field which has type RowType.
But AbstractSingleRowBlock has package private abstract method, which prevents me from implementing this class. The only child SingleRowBlock has package private constructor, and there are no factories or builders that could build an instance for me.
How to implement struct support in BigQueryRecordCursor::getObject?
(reminding: BigQueryRecordCursor is a child of RecordCursor).

You need to assemble the block for the row by calling beginBlockEntry, appending the values for each column via Type.writeXXX with the column's type and then closeEntry. Here's some pseudo-code.
BlockBuilder builder = type.createBlockBuilder(..);
builder = builder.beginBlockEntry();
for each column {
...
columnType.writeXXX(builder, ...);
}
builder.closeEntry();
return (Block) type.getObject(builder, 0);
However, I suggest you use the columnar APIs, instead (i.e., ConnectorPageSource and friends). Take a look at how the Elasticsearch connector implements it:
https://github.com/prestosql/presto/blob/master/presto-elasticsearch/src/main/java/io/prestosql/elasticsearch/ElasticsearchPageSourceProvider.java
https://github.com/prestosql/presto/blob/master/presto-elasticsearch/src/main/java/io/prestosql/elasticsearch/ElasticsearchPageSource.java
Here's how it handles Row types:
https://github.com/prestosql/presto/blob/master/presto-elasticsearch/src/main/java/io/prestosql/elasticsearch/decoders/RowDecoder.java
Also, I suggest you join #dev channel on the Presto Community Slack, where all the Presto developers hang out.

Related

create dynamic object in GO operator controller

I am following the below blog which explains how to create operator and import another CR into existing one.
http://heidloff.net/article/accessing-third-party-custom-resources-go-operators/
here https://github.com/nheidloff/operator-sample-go/blob/aa9fd15605a54f712e1233423236bd152940f238/operator-application/controllers/application_controller.go#L276 , spec is created with hardcoded properties.
I want to import the spark operator types in my operator.
https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/pkg/apis/sparkoperator.k8s.io/v1beta2/types.go
This spark operator is having say - 100+ types/properties. By following the above blog , i could create the Go object but it would be hardcoded. I want to create the dynamic object based on user provided values in CR YAML. e.g. - customer can provided 25 attributes , sometimes 50 for spark app. I need to have dynamic object created based on user YAML. Can anybody please help me out ?

If you set the spec type to be a json object, you can have the Spec contain arbitrary json/yaml. You don't have to have a strongly typed Spec object, your operator can then decode it and do whatever you want with it during your reconcile operation as long as its you as you can serialize and deserialize it from json. Should be able to set it to json.RawMesage I think?

What do you mean by hardcoded properties?
If I understood it correctly, you want to define an API for a resource which uses both types from an external operator and your customs. You can extend your API using the types from specific properties such as ScheduledSparkApplicationSpec from this. Here is an example API definition in Go:
type MyKindSpec struct {
// using external third party api (you need to import it)
SparkAppTemplate v1beta2.ScheduledSparkApplicationSpec `json:"sparkAppTemplate,omitempty"`
// using kubernetes core api (you need to import it)
Container v1.Container `json:"container,omitempty"`
// using custom types
MyCustomType MyCustomType `json:"myCustomType,omitempty"`
}
type MyCustomType struct {
FirstField string `json:"firstField,omitempty"`
SecondField []int `json:"secondField,omitempty"`
}

JOOQ generator: use existing enums

I am using nu.studer.jooq gradle plugin to generate pojos, tables and records for a PostgreSQL database with tables that have fields of type ENUM.
We already have the enums in the application, so I would like that the generator uses those enums instead of generating new ones.
I defined in build.gradle for the generator: udts = false, so it doesn't generate the enums, and I wrote a custom generator strategy that sets the package for the enums to be the one of the already existing enums.
I have an issue in the generated table fields, the SQLDataType.VARCHAR.asEnumDataType(mypackage.ExistingEnum) doesn't work because the mypackage.ExistingEnum does not implement org.jooq.EnumType.
public enum ExistingEnum {
VAL1, VAL2
}
Generated table record:
public class EntryTable extends TableImpl<EntryRecord> {
public final TableField<EntryRecord, ExistingEnum> MY_FIELD = createField(DSL.name("my_field"), SQLDataType.VARCHAR.asEnumDataType(mypackage.ExistingEnum.class), this, "");
}
Is there something I can do to fix this issue? Also we have a lot of enums, so writing a converter for each of them is not suitable.

The point of having custom enum types is that they are individual types, independent of whatever you encode with your database enum types. As such, the jOOQ code generator cannot make any automated assumptions related to how to map the generated types to the custom types. You'll have to implement Converter types of some sort.
If you're not relying on the jOOQ provided EnumType types, you could use the <enumConverter/> configuration, or write implementations based on org.jooq.impl.EnumConverter, which help reduce boilerplate code.
If you have some conventions or rules how to map things a bit more automatically (just because jOOQ doesn't know your convention doesn't mean you don't know it either), you could implement a programmatic code generation configuration, where you query your dictionary views (e.g. PG_CATALOG.PG_ENUM) to generate ForcedType objects. You can even use jOOQ-meta for that purpose.

Asp.net core 2.0 Conditional DI

I have an interface that has 2 implementation
IMYService
MyServiceA
MYServiceB
Here the Implementation is dependent on some data in the request header used by the construtor of the Services like
If Header has value of internalId then use MyServiceA and pass that Id to the services in constructor
while if the value is missing use MyServiceB where as this service construct does not expect id.
My Controller is defined with DI for IMyservice

In general, you need a factory. That's generally a separate class that is responsible to managing instances of a particular similar type of thing and returns the appropriate instance based on some sort of convention or condition. For example, if you've used IHttpClientFactory, you've see one in action.
Since the implementation depends on the request, you can cheat a little by using the "factory" overload of AddScoped:
services.AddScoped<IMyService>(p =>
{
// return either MyServiceA or MyServiceB
});
The p param of the lambda is actually a scoped instance of IServiceProvider, so you can do stuff like pull out IHttpContextAccessor to look at the request details.

How to get query string in case CQLStatement,QueryState and QueryOptions is given

Cassandra has org.apache.cassandra.cql3.QueryHandler interface which provide apis to handle external queries from client.
Below api which handles prepared statment:
public ResultMessage processPrepared(CQLStatement statement, QueryState state, QueryOptions options) throws RequestExecutionException, RequestValidationException;
I want to log queryString and value passed to it, in case CQLStatement,QueryState and QueryOptions is given . How can i get it?
I Believe a person who has worked on cassandra code can help me out in this.

This would be very difficult in 2.1. With newer versions where for logging they needed this they just recreate it as well as possible. You can see how in the ReadCommand implementations, theres a name() or toCQLString() used in things like slow query logging. You could backport this and the 2 implementations of appendCQLWhereClause for ability to do similar and then build one for modification statement.
in getPrepared() you can get the rawCQLStatement from the ParsedStatement.Prepared and stash it in the thread local.
You may want to alternatively consider using a custom implementation of tracing (example) or using triggers and building a mutation logger.

Do the following:
create a class that would implement the QueryHandler interface and make Cassandra aware of it
in that class you can maintain a list of the queries (add to this list when prepare method is being called) and the current query that you will get from the list when getPrepared it's called; you can get it from the list using the MD5Digest id
when processPrepared is called you can replace the ? in the query string with the values in the QueryOptions options.getValues().
HTH

Storing object in Esent persistent dictionary gives: Not supported for SetColumn Parameter error

I am trying to save an Object which implements an Interface say IInterface.
private PersistentDictionary<string, IInterface> Object = new PersistentDictionary<string, IInterface>(Environment.CurrentDirectory + #"\Object");
Since many classes implement the same interface(all of which need to cached), for a generic approach I want to store an Object of type IInterface in the dictionary.
So that anywhere I can pull out that object type cast it as IInterface and use that object's internal implementation of methods etc..
But, as soon as the Esent cache is initialized it throws this error:
Not supported for SetColumn
Parameter name: TColumn
Actual value was IInterface.
I have tried to not use XmlSerializer to do the same but is unable to deserialize an Interface type.Also, [Serializable] attribute cannot be used on top of a Interface, so I am stuck.
I have also tried to make all the implementations(classes) of the Interface as [Serializable] as a dying attempt but to no use.
Does any one know a way out ? Thanks in advance !!!

The only reason that only structs are supported (as well as some basic immutable classes such as string) is that the PersistentDictionary is meant to be a drop-in replacement for Dictionary, SortedDictionary and other similar classes.
Suppose I have the following code:
class MyClass
{
int val;
}
.
.
.
var dict = new Dictionary<int,MyClass>();
var x = new MyClass();
x.val = 1;
dict.Add(0,x);
x.val = 2;
var y = dict[0];
Console.WriteLine(y.val);
The output in this case would be 2. But if I'd used the PersistentDictionary instead of the regular one, the output would be 1. The class was created with value 1, and then changed after it was added to the dictionary. Since a class is a reference type, when we retrieve the item from the dictionary, we will also have the changed data.
Since the PersistentDictionary writes the data to disk, it cannot really handle reference types this way. Serializing it, and writing it to disk is essentially the same as treating the object as a value type (an entire copy is made).
Because it's intended to be used instead of the standard dictionaries, and the fact that it cannot handle reference types with complete transparency, the developers instead opted to support only structs, because structs are value types already.
However, if you're aware of this limitation and promise to be careful not to fall into this trap, you can allow it to serialize classes quite easily. Just download the source code and compile your own version of the EsentCollections library. The only change you need to make to it is to change this line:
if (!(type.IsValueType && type.IsSerializable))
to this:
if (!type.IsSerializable)
This will allow classes to be written to the PersistentDictionary as well, provided that it's Serializable, and its members are Serializable as well. A huge benefit is that it will also allow you to store arrays in there this way. All you have to keep in mind is that it's not a real dictionary, therefore when you write an object to it, it will store a copy of the object. Therefore, updating any of your object's members after adding them to the PersistentDictionary will not update the copy in the dictionary automatically as well, you'd need to remember to update it manually.

PersistentDictionary can only store value-structs and a very limited subset of classes (string, Uri, IPAddress). Take a look at ColumnConverter.cs, at private static bool IsSerializable(Type type) for the full restrictions. You'd be hitting the typeinfo.IsValueType() restriction.
By the way, you can also try posting questions about PersistentDictionary at http://managedesent.codeplex.com/discussions .
-martin

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

RowType support in Presto - presto

Related

create dynamic object in GO operator controller

JOOQ generator: use existing enums

Asp.net core 2.0 Conditional DI

How to get query string in case CQLStatement,QueryState and QueryOptions is given

Storing object in Esent persistent dictionary gives: Not supported for SetColumn Parameter error

Categories

Resources