Does the space usage reported by MemSQL INFORMATION_SCHEMA.COLUMNAR_SEGMENTS include redundancy?

I am trying to get the space used by a columnstore table using the INFORMATION_SCHEMA.COLUMNAR_SEGMENTS table. However, I am not sure if the reported usage includes redundancy as well. The query I am using is below:
SELECT
DATABASE_NAME AS DB,
TABLE_NAME AS TABLE_NAME ,
"" AS TOTAL_MEMORY_MB ,
SUM(UNCOMPRESSED_SIZE)/(1024*1024) AS DISK_UNCOMP_MB ,
SUM(COMPRESSED_SIZE)/(1024*1024) AS DISK_COMP_MB
FROM INFORMATION_SCHEMA.COLUMNAR_SEGMENTS
WHERE TABLE_NAME = "table_name"
Could somebody please help me understand the space usage reported by this table? Does it include redundancy as well? If we don't aggregate, it gives the result for individual partitions. However, I am not sure if it includes the redundant partition as well.

Yes, COLUMNAR_SEGMENTS includes redundancy: it shows all blob files on disk, no matter whether they are on a master or a slave. You can see this by inserting one row into a table in a redundancy-2 cluster (and running OPTIMIZE TABLE so the row is converted into columnstore format) and then querying for all blobs; you will see 2 blob files on 2 different nodes.
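
If the summed sizes include every copy, as described above, a rough single-copy figure can be obtained by dividing by the redundancy level. This is only a sketch: it assumes every partition has exactly the same number of copies on disk, and the function name and sample total are made up for illustration.

```python
def logical_size_mb(total_compressed_bytes: int, redundancy_level: int) -> float:
    """Estimate the single-copy size in MB from a SUM(COMPRESSED_SIZE) total.

    Assumes every partition has exactly `redundancy_level` copies on disk.
    """
    if redundancy_level < 1:
        raise ValueError("redundancy_level must be at least 1")
    return total_compressed_bytes / redundancy_level / (1024 * 1024)


# Hypothetical total: 4 GiB reported across master and slave partitions.
print(logical_size_mb(4 * 1024**3, redundancy_level=2))
```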

Related

How do I workaround the 5GB s3 copy limit with pyspark/hive?

I am trying to run a spark sql job against an EMR cluster. My create table operation contains many columns but I'm getting an s3 error:
The specified copy source is larger than the maximum allowable size for a copy source: 5368709120
Is there a hive/spark/pyspark setting that can be set so that _temporary files do not reach that 5GB threshold to write to s3?
This is working: (only 1 column)
create table as select b.column1 from table a left outer join verysmalltable b on ...
This is not working: (many columns)
create table as select b.column1, a.* from table a left outer join verysmalltable b on ...
In both cases, select statements alone work. (see below)
Working:
select b.column1 from table a left outer join verysmalltable b on ...
select b.column1, a.* from table a left outer join verysmalltable b on ...
I'm wondering if this is memory related, but I am unsure. I would expect to run into a memory error before a copy error if memory were the problem (and I would also expect the select statement with multiple columns to fail if it were a memory issue).
Only when create table is called do I run into the s3 error. I don't have the option of not using s3 for saving tables and was wondering if there was a way around this issue. The 5GB limit seems to be a hard limit. If anyone has any information about what I can do on the hive/spark end, it would be greatly appreciated.
I'm wondering if there is a specific setting that can be included in the spark-defaults.conf file to limit the size of temporary files.
Extra information: the _temporary file is 4.5 GB after the error occurs.

In the past few months, something changed with how s3 is using the parameter
fs.s3a.multipart.threshold
This setting needs to be under 5G for queries of a certain size to work. Previously I had this setting set at a large number in order to save larger files, but apparently the behavior for this has changed.
The default value for this setting is 2GB. In the spark documentation, there are multiple different definitions based on the hadoop version being used.
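
As a rough illustration of the constraint above, the check below refuses any threshold at or above the 5368709120-byte limit from the error message and formats a line for spark-defaults.conf. It is a sketch, not a verified fix: the `spark.hadoop.` prefix is Spark's usual way of forwarding a property to the Hadoop configuration, while the helper name and the 1 GiB example value are my own assumptions.

```python
S3_MAX_COPY_SOURCE_BYTES = 5368709120  # limit quoted in the S3 error message (5 GiB)


def multipart_threshold_conf(threshold_bytes: int) -> str:
    """Return a spark-defaults.conf line, rejecting values at or above the S3 copy limit."""
    if not 0 < threshold_bytes < S3_MAX_COPY_SOURCE_BYTES:
        raise ValueError(
            f"threshold must be positive and below {S3_MAX_COPY_SOURCE_BYTES} bytes"
        )
    return f"spark.hadoop.fs.s3a.multipart.threshold {threshold_bytes}"


# Hypothetical choice: 1 GiB, comfortably under the limit.
print(multipart_threshold_conf(1024**3))
```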

In Azure Synapse, is it true that creating a temporary table is discouraged?

I had a good discussion with one of my colleagues, and he mentioned that creating a temporary table degrades performance in Azure Synapse because Synapse creates the temporary table first on the master node and then distributes it to the child nodes. Is that true? He recommended that I create a permanent table instead of a temporary table.

That’s not correct. Temp tables don’t necessarily funnel through the control node. Let’s say you are selecting from a table distributed on ProductKey and loading it into a #temp table distributed on ProductKey. The data will never leave each compute node since it’s a distribution compatible insert.
On the other hand, if you run a query that uses a ROW_NUMBER function, for example, that would have to be calculated on the control node and then the data would be sent back to the compute nodes to be stored in the distributed temp table. But that only happens in the presence of some types of functions and some types of queries. It is not the norm. If you are worried about a particular query then add the word EXPLAIN to the front of it and paste the explain plan XML into your question so we can help you interpret it.
If you load a #temp table with a SELECT INTO statement you can’t specify the table geometry so it will be a round robin distributed columnstore. Usually this isn’t ideal since it takes extra time and memory to compress a columnstore and because round robin distribution isn’t ideal unless there is no good distribution key. Usually the next query which uses the round robin distributed temp table will just reshuffle it so it’s best to properly hash distribute a temp table initially. To do this do a CTAS statement as described here.

Cassandra query table without partition key

I am trying to extract data from a table as part of a migration job.
The schema is as follows:
CREATE TABLE IF NOT EXISTS ${keyspace}.entries (
username text,
entry_type int,
entry_id text,
PRIMARY KEY ((username, entry_type), entry_id)
);
In order to query the table we need the partition keys, the first part of the primary key.
Hence, if we know the username and the entry_type, we can query the table.
In this case the username can be whatever, but the entry_type is an integer in the range 0-9.
When doing the extraction we iterate over the table 10 times for every username to make sure we try all values of entry_type.
We can no longer find any entries as we have depleted our list of usernames. But our nodetool tablestats report that there is still data left in the table, gigabytes even. Hence we assume the table is not empty.
But I cannot find a way to inspect the table to figure out which usernames remain in it. If I could inspect it, I could add the remaining usernames to our extraction job and eventually we could deplete the table. But I cannot simply query the table as such:
SELECT * FROM ${keyspace}.entries LIMIT 1
as Cassandra requires the partition keys to make meaningful queries.
What can I do to figure out what is left in our table?

As per the comment, the migration process includes a DELETE operation on the Cassandra table, but the engine delays the actual removal of the affected records from disk; this process is controlled internally with tombstones and the gc_grace_seconds attribute of the table. The reason for this delay is fully explained in this blog entry. In short, if the default value is still in place, at least 10 days (864,000 seconds) must pass after the delete is executed before the data is actually removed.
For your case, one way to proceed is:
1. Ensure that all your nodes are "Up" and "Normal" (UN).
2. Decrease the gc_grace_seconds attribute of your table. The example below sets it to 1 minute, while the default is 864,000 seconds (10 days):
ALTER TABLE ${keyspace}.entries WITH gc_grace_seconds = 60;
3. Manually compact the table:
nodetool compact ${keyspace} entries
4. Once the process has completed, nodetool tablestats should be up to date.
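
To make the timing concrete, here is a small sketch of the arithmetic: a tombstone only becomes eligible for removal once gc_grace_seconds have passed since the delete, and it is then dropped by a later compaction. The function name and the delete timestamp are hypothetical.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_GC_GRACE_SECONDS = 864_000  # 10 days, the default cited above


def tombstone_purgeable_at(
    deleted_at: datetime, gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS
) -> datetime:
    """Earliest time at which a compaction may drop a tombstone written at `deleted_at`."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)


deleted = datetime(2024, 1, 1, tzinfo=timezone.utc)  # hypothetical delete time
print(tombstone_purgeable_at(deleted))      # with the default of 10 days
print(tombstone_purgeable_at(deleted, 60))  # after lowering gc_grace_seconds to 60
```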

To answer your first question, I would like to put more light on gc_grace_seconds property.
In Cassandra, data isn’t deleted in the same way it is in RDBMSs. Cassandra is designed for high write throughput and avoids reads-before-writes. So in Cassandra, a delete is actually an update, and updates are actually inserts. A “tombstone” marker is written to indicate that the data is now (logically) deleted (also known as a soft delete). Tombstoned records must be removed to reclaim the storage space, which is done by a process called compaction. But remember that tombstones are eligible for physical deletion / garbage collection only after a specific number of seconds, known as gc_grace_seconds. This is a very good blog post that goes into more detail: https://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
So it is possible that you are looking at the table size before gc_grace_seconds has elapsed, while the deleted data is still on disk.
Coming to your second issue where you want to fetch some samples from the table without providing partition keys. You can analyze your table content using Spark. The Spark Cassandra Connector allows you to create Java applications that use Spark to analyze database data. You can follow the articles / documentation to write a quick handy spark application to analyze Cassandra data.
https://www.instaclustr.com/support/documentation/cassandra-add-ons/apache-spark/using-spark-to-sample-data-from-one-cassandra-cluster-and-write-to-another/
https://docs.datastax.com/en/dse/6.0/dse-dev/datastax_enterprise/spark/sparkJavaApi.html
I would recommend not deleting records while you do the migration. Instead, complete the migration first and then do a quick validation / verification to ensure all records were migrated successfully (you can easily do this using Spark by comparing dataframes from the old and new tables). After successful verification, truncate the old table; truncate does not create tombstones and is therefore more efficient. Note that a large number of tombstones is not good for cluster health.
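
For listing the partitions that are left without knowing their keys, one approach that does not need Spark is to walk the token ring in slices and ask only for the distinct partition key columns. The sketch below just builds the ranges and the CQL text; it assumes the default Murmur3Partitioner (tokens up to 2^63 - 1), the keyspace name is a placeholder, and actually running the statements is left to whatever client you use.

```python
MIN_TOKEN = -(2**63)   # lower bound of the Murmur3Partitioner token range
MAX_TOKEN = 2**63 - 1  # upper bound of the Murmur3Partitioner token range


def token_ranges(splits: int) -> list[tuple[int, int]]:
    """Split the full token ring into `splits` contiguous (start, end] ranges."""
    if splits < 1:
        raise ValueError("splits must be at least 1")
    width = (MAX_TOKEN - MIN_TOKEN) // splits
    bounds = [MIN_TOKEN + i * width for i in range(splits)] + [MAX_TOKEN]
    return list(zip(bounds[:-1], bounds[1:]))


def distinct_partition_query(keyspace: str, start: int, end: int) -> str:
    """Build a CQL statement listing the partition keys whose token falls in (start, end]."""
    return (
        f"SELECT DISTINCT username, entry_type FROM {keyspace}.entries "
        f"WHERE token(username, entry_type) > {start} "
        f"AND token(username, entry_type) <= {end}"
    )


# "my_keyspace" is a placeholder; 4 slices is an arbitrary choice.
for start, end in token_ranges(4):
    print(distinct_partition_query("my_keyspace", start, end))
```

Smaller slices keep each query short, which helps avoid timeouts on a table that still holds many tombstones.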

how to solve error like partition's table metadata are out of sync for table

Recently I faced a MemSQL leaf hardware error, and we ended up missing partitions and their data because we run a replication-1 MemSQL cluster.
Then we started noticing errors like:
"Java.sql.SQLException: Leaf Error (10.XXXX:3306): Partition's table metadata are out of sync for table"
despite having recreated the missing partitions.
Is there a way to approach this issue? Or will I have to drop the data in all affected tables and import it from other sources?

It sounds likely that the imported tables had mismatching table metadata, because the metadata changed at some point in between. You can try:
1. Recreate the tables. This can be done by insert-selecting the data into a newly created table (which may not be possible if no select queries work) or by reloading the data from an external source.
2. Check the table schemas on the recreated partition against the other partitions and see if you can find the mismatch; you could diff the SHOW CREATE TABLE output. Then it may be possible to manually alter the tables on the leaf partitions to correct them, or to recreate the formerly missing partition with matching schemas.
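
The schema comparison can be done with any diff tool. As a minimal sketch in Python, assuming you have already captured the SHOW CREATE TABLE text from a healthy partition and from the recreated one (the two DDL strings below are invented for illustration):

```python
import difflib


def schema_diff(reference_ddl: str, suspect_ddl: str) -> list[str]:
    """Return unified-diff lines between two SHOW CREATE TABLE outputs."""
    return list(
        difflib.unified_diff(
            reference_ddl.splitlines(),
            suspect_ddl.splitlines(),
            fromfile="healthy_partition",
            tofile="recreated_partition",
            lineterm="",
        )
    )


# Hypothetical DDL captured from two partitions of the same table.
healthy = "CREATE TABLE t (\n  id BIGINT,\n  name VARCHAR(64)\n)"
recreated = "CREATE TABLE t (\n  id BIGINT,\n  name VARCHAR(32)\n)"
print("\n".join(schema_diff(healthy, recreated)))
```

An empty result means the two definitions are textually identical.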

Performance - Table Service, SQL Azure - insert. Query speed on large amount of data

I've read many posts and articles comparing SQL Azure and Table Service, and most of them say that Table Service is more scalable than SQL Azure.
http://www.silverlight-travel.com/blog/2010/03/31/azure-table-storage-sql-azure/
http://www.intertech.com/Blog/post/Windows-Azure-Table-Storage-vs-Windows-SQL-Azure.aspx
Microsoft Azure Storage vs. Azure SQL Database
https://social.msdn.microsoft.com/Forums/en-US/windowsazure/thread/2fd79cf3-ebbb-48a2-be66-542e21c2bb4d
https://blogs.msdn.com/b/windowsazurestorage/archive/2010/05/10/windows-azure-storage-abstractions-and-their-scalability-targets.aspx
https://stackoverflow.com/questions/2711868/azure-performance
http://vermorel.com/journal/2009/9/17/table-storage-or-the-100x-cost-factor.html
Azure Tables or SQL Azure?
http://www.brentozar.com/archive/2010/01/sql-azure-frequently-asked-questions/
https://code.google.com/p/lokad-cloud/wiki/FatEntities
Sorry for http, I'm new user >_<
But http://azurescope.cloudapp.net/BenchmarkTestCases/ benchmark shows different picture.
My case. Using SQL Azure: one table with many inserts, about 172,000,000 per day (2000 per second). Can I expect good performance for inserts and selects when I have 2 million records or 9999....9 billion records in one table?
Using Table Service: one table with some number of partitions. The number of partitions can be large, very large.
Question #1: does Table Service have any limitations or best practices for creating many, many, many partitions in one table?
Question #2: in a single partition I have a large number of small entities, as in the SQL Azure example above. Can I expect good performance for inserts and selects when I have 2 million records or 9999 billion entities in one partition?
I know about sharding and partitioning solutions, but this is a cloud service; isn't the cloud powerful enough to do all that work without my coding skills?
Question #3: Can anybody show me benchmarks for querying large amounts of data in SQL Azure and Table Service?
Question #4: Maybe you could suggest a better solution for my case.

Short Answer
1. I haven't seen lots of partitions cause Azure Tables (AZT) problems, but I don't have this volume of data.
2. The more items in a partition, the slower queries in that partition.
3. Sorry no, I don't have the benchmarks.
4. See below.
Long Answer
In your case I suspect that SQL Azure is not going to work for you, simply because of the limits on the size of a SQL Azure database. If each of those rows you're inserting is 1K with indexes, you will hit the 50GB limit in about 300 days at roughly 172,000 rows per day; at the 172,000,000 rows per day quoted in the question, you would hit it within hours. It's true that Microsoft are talking about databases bigger than 50GB, but they've given no time frames on that. SQL Azure also has a throughput limit that I'm unable to find at this point (I'm pretty sure it's less than what you need though). You might be able to get around this by partitioning your data across more than one SQL Azure database.
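
The size estimate can be checked with a few lines of arithmetic. This sketch assumes 1 KB means 1024 bytes and 50 GB means 50 * 1024^3 bytes; the function name is made up. Note that the figure of about 300 days corresponds to roughly 172,000 rows per day, whereas the question quotes 172,000,000 per day.

```python
def days_until_full(
    rows_per_day: int, row_bytes: int = 1024, limit_bytes: int = 50 * 1024**3
) -> float:
    """Days until a database of `limit_bytes` fills up at a constant insert rate."""
    return limit_bytes / (rows_per_day * row_bytes)


print(round(days_until_full(172_000)))               # 305 (days)
print(round(days_until_full(172_000_000) * 24, 1))   # 7.3 (hours)
```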
The advantage SQL Azure does have though is the ability to run aggregate queries. In AZT you can't even write a select count(*) from customer without loading each customer.
AZT also has a limit of 500 transactions per second per partition, and a limit of "several thousand" per second per account.
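
Taking the 500 transactions per second per partition figure at face value, the insert rate from the question has to be spread over several partitions. A rough lower bound, assuming writes are distributed perfectly evenly (which real key choices rarely achieve):

```python
import math

PER_PARTITION_TPS = 500  # per-partition limit quoted above


def min_partitions(inserts_per_second: int, per_partition_tps: int = PER_PARTITION_TPS) -> int:
    """Smallest number of partitions that keeps each one at or under the limit."""
    return math.ceil(inserts_per_second / per_partition_tps)


print(min_partitions(2000))  # 4, for the 2000 inserts per second in the question
```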
I've found that choosing what to use for your partition key (PK) and row key (RK) depends on how you're going to query the data. If you want to access each of these items individually, simply give each row its own partition key and a constant row key. This will mean that you have lots of partitions.
For the sake of example, suppose the rows you were inserting were orders and each order belongs to a customer. If it was more common for you to list orders by customer, you would have PK = CustomerId, RK = OrderId. This would mean that to find orders for a customer you simply have to query on the partition key. To get a specific order you'd need to know the CustomerId and the OrderId. The more orders a customer had, the slower finding any particular order would be.
If you only needed to access orders by OrderId, then you would use PK = OrderId, RK = string.Empty and put the CustomerId in another property. You can still write a query that brings back all orders for a customer, but because AZT doesn't support indexes other than on PartitionKey and RowKey, a query that doesn't use a PartitionKey (and sometimes even one that does, depending on how you write it) will cause a table scan. With the number of records you're talking about, that would be very bad.
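
The difference between the two access paths can be pictured with a toy in-memory model. This is not the Azure SDK, just a dictionary keyed by partition key and row key; the class and the sample orders are invented to show why a query on a non-key property has to touch every entity.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a table indexed only by (partition key, row key)."""

    def __init__(self):
        self._partitions = defaultdict(dict)

    def insert(self, pk: str, rk: str, entity: dict) -> None:
        self._partitions[pk][rk] = entity

    def get(self, pk: str, rk: str) -> dict:
        # Point lookup: goes straight to one partition and one row.
        return self._partitions[pk][rk]

    def query_partition(self, pk: str) -> list[dict]:
        # Reads a single partition only.
        return list(self._partitions.get(pk, {}).values())

    def scan(self, predicate) -> list[dict]:
        # No index on other properties: every entity in every partition is visited.
        return [e for part in self._partitions.values() for e in part.values() if predicate(e)]


# PK = OrderId, RK = "" as in the second design; CustomerId is an ordinary property.
orders = ToyTable()
orders.insert("order-1", "", {"OrderId": "order-1", "CustomerId": "cust-A"})
orders.insert("order-2", "", {"OrderId": "order-2", "CustomerId": "cust-B"})
orders.insert("order-3", "", {"OrderId": "order-3", "CustomerId": "cust-A"})

print(orders.get("order-2", ""))                                # direct lookup
print(orders.scan(lambda e: e["CustomerId"] == "cust-A"))       # full scan
```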
In all of the scenarios I've encountered, having lots of partitions doesn't seem to worry AZT too much.
Another way you can partition your data in AZT that is not often mentioned is to put the data in different tables. For example, you might want to create one table for each day. If you want to run a query for last week, run the same query against the 7 different tables. If you're prepared to do a bit of work on the client end you can even run them in parallel.
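
A sketch of the table-per-day idea with the parallel fan-out on the client side. The table naming scheme ("Orders" plus the date) and the `run_query` callable are assumptions; `run_query` stands in for whatever function runs your query against one named table and returns its rows.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta


def daily_table_names(end_day: date, days: int, prefix: str = "Orders") -> list[str]:
    """Names of the per-day tables covering `days` days, newest first."""
    return [f"{prefix}{(end_day - timedelta(days=i)):%Y%m%d}" for i in range(days)]


def query_last_days(run_query, end_day: date, days: int = 7) -> list:
    """Run `run_query(table_name)` against each daily table in parallel and merge the rows."""
    tables = daily_table_names(end_day, days)
    with ThreadPoolExecutor(max_workers=days) as pool:
        per_table = pool.map(run_query, tables)  # results come back in table order
        return [row for rows in per_table for row in rows]


# Stub query for illustration: pretends each table returns a single row naming itself.
print(query_last_days(lambda table: [table], date(2024, 1, 7)))
```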

Azure SQL can easily ingest that much data and more. Here's a video I recorded months ago that walks through a sample (available on GitHub) showing one way you can do that.
https://www.youtube.com/watch?v=vVrqa0H_rQA
Here's the full repo:
https://github.com/Azure-Samples/streaming-at-scale/tree/master/eventhubs-streamanalytics-azuresql
