I want to know how geospatial data, like a .shp file, is stored in AgensGraph.
I mean, does AgensGraph support geo data?
I know about the PostGIS extension, but I'm not trying to use PostGIS.
I think AgensGraph should have some structure, such as storage and functions, for geo data. Please let me know whether geo data is available or not.
I'm writing to share my experience dealing with geospatial data in AgensGraph.
Basically, AgensGraph objects take JSON format.
So you can store the data after converting the SHP to GeoJSON.
The conversion process is described in the links below:
Link 1
Link 2
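For illustration only, here is a rough Python sketch of that workflow (not a tested AgensGraph recipe: the file name, graph name, and vertex label are made up, and the exact Cypher parameter syntax should be checked against the AgensGraph documentation for your version). It converts the shapefile to a GeoJSON FeatureCollection with pyshp and then creates one vertex per feature over a regular PostgreSQL connection:

```python
# Rough sketch. Assumptions: pyshp and psycopg2 are installed, a graph called
# 'geo_graph' exists, and AgensGraph accepts a JSON parameter as a Cypher
# property map (verify against the docs for your version).
import json

import psycopg2      # AgensGraph speaks the PostgreSQL wire protocol
import shapefile     # pyshp

# 1. Convert the shapefile to a GeoJSON FeatureCollection (a plain dict).
geojson = shapefile.Reader("parcels.shp").__geo_interface__   # hypothetical file

# 2. Store every feature as a vertex whose properties are the GeoJSON feature.
conn = psycopg2.connect("host=localhost port=5432 dbname=agens user=agens")
cur = conn.cursor()
cur.execute("SET graph_path = geo_graph")

for feature in geojson["features"]:
    # 'GeoFeature' is just an example label.
    cur.execute("CREATE (:GeoFeature %s)", (json.dumps(feature),))

conn.commit()
conn.close()
```

Note that this only stores the geometry as JSON; without PostGIS you will not get spatial indexes or spatial functions on top of it.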
I'm working on an e-commerce store that is multi-language, and I'm using Node.js and PostgreSQL.
I'm confused about whether I should store the product images in a separate table or in a JSONB column in the database.
The same goes for the translations: the product may have a different title for every language, so which is better? Should I store them in a separate table too, or in a JSONB column?
Edit 1:
By saving images in the database I mean the path of the image on Amazon S3 or something like that, not the image itself.
Edit 2:
Let's say I have a logo, a thumbnail, and a banner for the product, and the product also has multiple images for the carousel. Which is better: storing them in a JSONB column, in multiple varchar columns, or in a separate table?
I'm not marking this question as a duplicate just because it also includes a question about how to store internationalization data.
I'm confused about whether I should store the product images in a separate table or in a JSONB column in the database.
I would avoid saving images in a database. A better approach would be using the file system or cloud storage like Amazon S3; then you could save a filesystem path or cloud URL in a VARCHAR column as a reference to the image file.
If you still want to save images in PostgreSQL, and each product only has one image, I don't see a reason to use a separate table. You can use bytea or large objects. Look here for more details.
The same goes for the translations: the product may have a different title for every language, so which is better? Should I store them in a separate table too, or in a JSONB column?
I see two possible approaches:
Storing titles in one nullable column per supported language. If you plan to add languages in the future, you will have to handle that with database migrations. Also, if you have more text data, like product descriptions, that should be supported in all languages, you will need not only the en_title, fr_title, es_title ... columns but also en_description, fr_description, es_description ... as well.
Using the JSONB (or, even better, JSON) datatype. You can store all internationalized data in a single JSON column, with more flexibility if the JSON structure changes. I would use JSON instead of JSONB because it's faster and requires less space on disk. JSONB should be used if you intend to index JSON properties and query with conditions on them, which I don't think is the case here. Here and here you can read more on JSON vs JSONB in Postgres.
I would go for option 2; a rough sketch of what that could look like is shown below.
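For illustration only (the question uses Node.js, but the SQL is the same), here is a minimal sketch of option 2 with made-up table, column, and connection names, shown with Python/psycopg2:

```python
# Minimal sketch of option 2 (all names are made up): one jsonb column for
# translations and one for the S3 image URLs, so adding a language is just a
# new key rather than a migration.
import psycopg2
from psycopg2.extras import Json

conn = psycopg2.connect("dbname=shop user=shop")   # hypothetical connection
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS products (
        id          serial PRIMARY KEY,
        price_cents integer NOT NULL,
        i18n        jsonb NOT NULL,   -- {"en": {"title": ..., "description": ...}, ...}
        images      jsonb NOT NULL    -- {"logo": url, "thumbnail": url, "carousel": [urls]}
    )
""")

i18n = {
    "en": {"title": "Red mug", "description": "A red ceramic mug"},
    "fr": {"title": "Tasse rouge", "description": "Une tasse rouge en céramique"},
}
images = {
    "logo": "https://example-bucket.s3.amazonaws.com/p1/logo.png",
    "thumbnail": "https://example-bucket.s3.amazonaws.com/p1/thumb.png",
    "carousel": [
        "https://example-bucket.s3.amazonaws.com/p1/a.png",
        "https://example-bucket.s3.amazonaws.com/p1/b.png",
    ],
}

cur.execute(
    "INSERT INTO products (price_cents, i18n, images) VALUES (%s, %s, %s) RETURNING id",
    (1299, Json(i18n), Json(images)),
)
product_id = cur.fetchone()[0]

# Fetch the French title; ->> returns text, and a missing language simply yields NULL.
cur.execute("SELECT i18n -> 'fr' ->> 'title' FROM products WHERE id = %s", (product_id,))
print(cur.fetchone()[0])

conn.commit()
conn.close()
```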
EDIT 1
As Frank Heikens pointed out in the comments, JSONB is generally advised over JSON in the Postgres documentation.
Benchmarks could help you decide whether JSON or JSONB is the better choice for you.
I've found a benchmark here. The results show better performance for JSONB than for JSON. This is surprising to me, to be honest, and if I have time I will do some more in-depth testing.
Anyway, for your situation, JSONB should be just as fine as JSON.
I wrote a script that iterates over many .CSV files and (should) send the data with Sequelize to a SQL Server database. I use the csv-parse module to read and 'cast' the data, meaning that it reduces the data to its smallest datatype.
ISSUE: In order to send the data to the DB, I need to define a Model object that defines the format and datatypes of the table. I want this to be automated and based on the 'casted' CSV.
Additional issue: I eventually want to be able to upsert the table too.
So far I have managed to:
Connect to the DB with Sequelize and verify the connection (I also ran a test query).
Read the CSVs and 'cast' the data. It recognises integers, floats and even dates. It also accepts the header that I passed to the csv module.
I am a novice with JS and I feel that I am not grasping something fundamental here. In Python and pandas this would be a simple one-liner. All help would be much appreciated!
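For reference, the pandas "one-liner" alluded to here looks roughly like this (connection string, file, and table names are made up):

```python
# Roughly the pandas equivalent mentioned above (connection details are made up):
# read_csv infers integer/float column types, and to_sql creates the table if
# it is missing and appends the rows.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine(
    "mssql+pyodbc://user:password@myserver/mydb?driver=ODBC+Driver+17+for+SQL+Server"
)
pd.read_csv("data.csv").to_sql("my_table", engine, if_exists="append", index=False)
```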
I have to extract data from XML files several hundred MB in size in a Google Cloud Function, and I was wondering if there are any best practices.
Since I am used to Node.js, I was looking at some popular libraries like fast-xml-parser, but it seems cumbersome if you only want specific data from a huge XML. I am also not sure whether there are performance issues when the XML is too big. Overall this does not feel like the best solution for parsing and extracting data from huge XMLs.
Then I was wondering if I could use BigQuery for this task, where I simply convert the XML to JSON, load it into a dataset, and then use a query to retrieve the data I want.
Another solution could be to use Python for the job, since it is good at parsing and extracting data from XML; even though I have no experience with Python, I was wondering if this path could still be the best solution.
If anything above does not make sense, or if one solution is preferable to the others, or if anyone can share any insights, I would highly appreciate it!
I suggest you check this article, in which they discuss how to load XML data into BigQuery using Python Dataflow. I think this approach may work in your situation.
Basically, what they suggest is the following (a rough sketch of these steps follows the list):
Parse the XML into a Python dictionary using the xmltodict package.
Specify a schema for the output table in BigQuery.
Use a Beam pipeline to take an XML file and use it to populate a BigQuery table.
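A very condensed sketch of those three steps (the file name, XML element names, project/dataset/table, and schema are all made up, and the article's actual pipeline is more involved):

```python
# Sketch only: parse the XML with xmltodict, flatten it into rows matching a
# declared BigQuery schema, and write them with a Beam pipeline.
import apache_beam as beam
import xmltodict

def xml_to_rows(xml_text):
    """Turn one XML document into flat dicts matching the BigQuery schema."""
    doc = xmltodict.parse(xml_text)
    # Assumed structure: <orders><order><id>..</id><total>..</total></order>...</orders>
    # Note: with a single <order>, xmltodict returns a dict instead of a list.
    for order in doc["orders"]["order"]:
        yield {"id": order["id"], "total": float(order["total"])}

with open("orders.xml") as f:          # hypothetical input file
    xml_text = f.read()

with beam.Pipeline() as pipeline:      # add DataflowRunner options to run on GCP
    (
        pipeline
        | beam.Create([xml_text])
        | beam.FlatMap(xml_to_rows)
        | beam.io.WriteToBigQuery(
            "my-project:my_dataset.orders",          # assumed destination table
            schema="id:STRING,total:FLOAT",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
        )
    )
```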
Cassandra newbie question. I'm collecting some data from a social networking site using REST calls, so I end up with the data coming back in JSON format.
The JSON is only one of the columns in my table. I'm trying to figure out what the best practice is for storing the JSON string.
At first I thought of using the map type, but the JSON contains a mix of strings, numerical types, etc., and it doesn't seem like I can declare wildcard types for the map key/value. The JSON string can be quite large, probably over 10 KB in size. I could potentially store it as a string, but that seems inefficient. I would assume this is a common task, so I'm sure there are some general guidelines for how to do it.
I know Cassandra has native support for JSON, but from what I understand, that's mostly used when the entire JSON map matches 1:1 with the database schema. That's not the case for me: the schema has a bunch of columns, and the JSON string is just a sort of "payload". Is it better to store the JSON string as a blob or as text? BTW, the Cassandra version is 2.1.5.
Any hints appreciated. Thanks in advance.
In the Cassandra storage engine there's really not a big difference between a blob and text, since Cassandra essentially stores text as blobs anyway. And yes, the "native" JSON support you speak of only applies when your data model matches your JSON model, and it's only available in Cassandra 2.2+.
I would store it as a text type, and you shouldn't have to implement anything to compress the JSON data when sending it (or to handle decompression), since Cassandra's binary protocol supports transport compression. Also make sure your table stores the data compressed with the same compression algorithm (I suggest LZ4, since it's the fastest algorithm implemented) to save on doing compression for each read request. Thus, if you configure the table to store data compressed and use transport compression, you don't have to implement either yourself.
You didn't say which client driver you're using, but here's the documentation on how to set up transport compression for the DataStax Java client driver.
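For illustration, a small sketch of both settings with the DataStax Python driver (keyspace, table, and column names are made up; the same ideas apply to the Java driver):

```python
# Sketch: JSON payload in a plain text column, LZ4 compression on the table
# (shown with the Cassandra 2.1 option syntax; 3.x uses {'class': 'LZ4Compressor'}),
# and transport compression negotiated by the driver.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"], compression=True)   # driver picks LZ4/Snappy if available
session = cluster.connect("social")                  # assumes the keyspace already exists

session.execute("""
    CREATE TABLE IF NOT EXISTS posts (
        user_id text,
        post_id timeuuid,
        payload text,
        PRIMARY KEY (user_id, post_id)
    ) WITH compression = {'sstable_compression': 'LZ4Compressor'}
""")

session.execute(
    "INSERT INTO posts (user_id, post_id, payload) VALUES (%s, now(), %s)",
    ("alice", '{"likes": 42, "tags": ["cassandra", "json"]}'),
)
```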
It depends on how you want to query your JSON. There are three possible strategies:
Store as a string
Store as a compressed blob
Store as a blob
Option 1 has the advantage of being human-readable when you query your data on the command line with cqlsh, or when you want to debug data directly live. The drawback is the size of this JSON column (10 KB).
Option 2 has the advantage of keeping the JSON payload small, because text compresses at a pretty decent ratio. The drawbacks are: a. you need to take care of compression/decompression client-side, and b. it's not directly human-readable.
Option 3 has the drawbacks of option 1 (size) and option 2 (not human-readable).
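If you go with option 2, the client-side part could look roughly like this (keyspace, table, and column names are made up, and the blob column is assumed to exist):

```python
# Sketch of option 2: compress the JSON client-side and store it in a blob column.
import json
import zlib

from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("social")   # assumed keyspace

doc = {"likes": 42, "tags": ["cassandra", "json"]}
compressed = zlib.compress(json.dumps(doc).encode("utf-8"))

# 'posts_blob' is assumed to have: user_id text, post_id timeuuid, payload blob.
session.execute(
    "INSERT INTO posts_blob (user_id, post_id, payload) VALUES (%s, now(), %s)",
    ("alice", compressed),
)

# Reading it back: decompress before json.loads (so nothing is readable in cqlsh).
row = session.execute(
    "SELECT payload FROM posts_blob WHERE user_id = %s LIMIT 1", ("alice",)
).one()
restored = json.loads(zlib.decompress(row.payload).decode("utf-8"))
print(restored["likes"])
```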
Do Azure Data Lake Analytics and U-SQL support the notion of cursors in scripts?
I have a data set that contains paths to further data sets I would like to extract, and I want to output the results to separate files.
At the moment I can't seem to find a solution for dynamically extracting and outputting data based on values inside data sets.
U-SQL currently expects that files are known at compile time. Thus, you cannot do extraction or outputting based on locations provided inside a file.
You can specify file sets in the EXTRACT statement, which are somewhat data-driven. We are currently working on adding the ability to use file sets on OUTPUT as well.
You can file feature requests at http://aka.ms/adlfeedback.
Cheers
Michael
You might be able to write a Processor to iterate over the rows in the primary dataset. However, you might not be able to access the additional datasets in the Processor.
Another workaround might be to concatenate all the additional datasets and perform a join with the primary dataset.