I have a .NET application on a Windows machine and a Cassandra database on a Linux (CentOS) server. The Windows machine might be with a couple of seconds in the past sometimes and when that thing happens, the deletes or updates queries does not take effect.
Does Cassandra require all servers to have the same time? Does the Cassandra driver send my query with timestamp? (I just write simple delete or update query, with not timestamp or TTL).
Update: I use the Datastax C# driver
Cassandra operates on the "last write wins" principle. The system time is therefore critically important in determining which writes succeed. Google for "cassandra time synchronization" to find results like this and this which explain why time synchronization is important and suggests a method to solve the problem utilizing an internal NTP pool. The articles specifically refer to AWS architecture, but the principals apply to any cassandra installation.
The client timestamps are used to order mutation operations relative to each other.
The DataStax C# driver uses a generator to create them, it's important for all the hosts running the client drivers (origin of the execution request) to have clocks in sync with each other.
Related
I have recently deployed PostgreSQL database to Linux server and one of the stored procedure is taking around 24 to 26 second to fetch the result. Previously i was deployed PostgreSQL database to windows server and the same stored procedure is taking around only 1 to 1.5 second.
In both cases i have tested with same database with same amount of data. and also both server have same configuration like RAM, Processor,.. etc.
While executing my stored procedure in Linux server CPU usages goes to 100%.
Execution Plan for Windows:
Execution Plan for Linux:
Let me know if you have any solution for the same.
It also might be because of JIT coming into play in the Linux server and not on windows. Check if the query execution plan on the linux server includes information about JIT.
If yes, check if that's the same in windows version. If not, than I suspect that is the case.
JIT might be adding more overhead, hence try changing the jit parameters like jit_above_cost, jit_inline_above_cost to appropriate values as per your system requirements or disable those completely by setting
jit=off
or
jit_above_cost = -1
The culprit seems to be on
billservice.posid = pos.posid
More specificly its doing a Sequence Scan on pos table. It should be doing Index scan.
Check if you have indexes on these two fileds in the database.
I have an HTTP server receiving new client connections all the time. Each time, I have to reconnect to the Cassandra cluster (each client is attached to a new process via a fork() call.)
I have two problems:
Speed: I'd like to make use of a connection as fast as possible;
Robustness: any one of the Cassandra node could be down.
I would imagine that the best mechanism will work with any cluster, not just Cassandra.
We use thrift to connect, although we may change that later. Either way, as far as network connections are concerned, we just do the regular socket(), bind(), and connect() call sequence.
Most of the code I have seen dealing with similar problems is very serial: it tries to connect to host 1, if it times out, try again with host 2, etc. until all hosts are exhausted.
I was thinking I could instead create one thread per connection attempt (with some sort of limit like 3, 4, or 5 parallel attempts--the number will depend on the size of the Cassandra cluster.) However, I am although thinking that if all connections succeed, I am probably going to waste a lot of time on the cluster side...
Is there a specific way this sort of a thing is generally resolved?
Most (if not all) of these features (failover, smart request routin, retries) are available in the DataStax drivers for Cassandra.
If possible you should migrate away from Thrift.
If you really really have to (please consider the time to develop and maintain this solution on a protocol that has been deprecated for a long while) create your own, you could take some inspiration from the DataStax drivers.
I am a novice programmer in Node JS. I have a few queries regarding process related issues like locking and race conditions in Node JS and Mongo DB.
My codes are working perfectly in local environment,but when I am moving to production and come across large number of requests,I might encounter certain issues.
How do we avoid write level race conditions for mongo slaves located in different regions? ie say one piece of data is being written locally but the true value for it is being written remotely that is delayed
Consider we have node processes located regionally would it need to hit mongo master located in another region which then routes the request to a regional slave? This considerably increases the latency of each write - how do we avoid this? Can we have direct writes to regional slaves from local processes and some kind of replication to maintain data consistency?
I use a Node REST api and use mongoose as the Mongo DB driver.Any help would be deeply appreciated .Thank you .
MongoDB's automatic failover and high availability features are provided by what's called replication. The standard MongoDB terms are "primary" for master and "secondary" for slave, so I'll use those terms to be consistent with the documentation and the user base at large. I think both of your questions are answered by one fact: in a replica set, the primary is the only member that accepts writes from clients, ever. The secondaries get the data replicated to them asynchronously a short time later. To answer the questions directly:
No writes to slaves except internal replication of writes from the primary, so no "race condition" with writes can arise.
All writes must go to the primary. The replication system will distribute to data to the secondaries asynchronously. You can read from secondaries, but it isn't a best practice despite its occasional utility. I'd suggest reading about replica set read preference and reading Asya Kamsky's blog post about scaling with replica sets before deciding to read from secondaries.
I have recently just started working with firebird DB v2.1 on a Linux Redhawk 5.4.11 system. I am trying to create a monitor script that gets kicked off via a cron job. However I am running into a few issues and I was hoping for some advice...
First off I have read through most of the documentation that come with the firebird DB and a lot of the documentation that is provided on their site. I have tried using the gstat tool which is supplied but that didn't seem to give me the kind of information I was looking for. I ran across README.monitoring_tables file which seemed to be exactly what I wanted to monitor. Yet this is where I started to hit a snag in my progress....
After running from logging into the db via isql, I run SELECT MON$PAGE_READS, MON$PAGE_WRITES FROM MON$IO_STATS; I was able to get some numbers which seemed okay. However upon running the command again it appeared the data was stale because the numbers were not updating. I waited 1 minute, 5 minutes, 15 minutes and all the data was the same during each. Once I logged off and back on to run the command again the data changed. It appears that only on a relog does the data refresh and yet I am not sure if even then the data is correct.
My question is now am I even doing this correct? Are these commands truly monitoring my db or are just monitoring the command itself? Also why does it take a relog to refresh the statistics? One thing I was worried about was inconsistency in my data. In other words my system was running yet when I would logon each time the read/writes were not linearly increasing. They would vary from 10k to 500 to 2k. Any advice or help would be appreciated!
When you query a monitoring table, a snapshot of the monitoring information is created so the contents of the monitoring tables are stable for the rest of the transaction. You need to commit and start a new transaction if you want fresh information. Firebird always uses a transaction (and isql implicitly starts a transaction if none was started explicitly).
This is also documented in doc/README.monitoring_tables (at least in the Firebird 2.5 version):
A snapshot is created the first time any of the monitoring tables is being selected from in the given transaction and it's preserved until the transaction ends, so multiple queries (e.g. master-detail ones) will always return the consistent view of the data. In other words, the monitoring tables always behave like a snapshot (aka consistency) transaction, even if the host transaction has been started with another isolation level. To refresh the snapshot, the current transaction should be finished and the monitoring tables should be queried in the new transaction context.
(emphasis mine)
Note that depending on your monitoring needs, you should also look at the trace functionality that was introduced in Firebird 2.5.
I am using oracle 11g and i have an application which is coded in Spring framework. Once i configure the database on Sun fire 4170 installed with Linux the machine's CPU utilization is around 80-100% and, however, when i shift the same database to Sun M3000 server installed with Unix OS (supposedly more powerful machine) the application performance goes down and CPU utilization remains 90-100%. I can't figure out if its the application which is making the such utilization or its the database design.
It is added that the database is not relational; things are handled by the application.
Well you certainly can find some interesting opinions on the intertubes.
Oracle does not have a true server
architecture (others have it).
Rather than performing classic server
tasks, such as multi-threading,
caching of data pages, parallel
processing (split a query across many
devices) etc. within itself, it uses
the o/s to do all that. That means for
each user process (PL/SQL connection)
there is one unix process; 1000 users
means 1000 unix processes, all
competing for the same resources.
You might note that Oracle has had
a connection pooling architecture (multi-threaded server) since version 7 (1992).
a cache for data pages (known helpfully as the buffer cache) since forever
parallel query (splitting a query across many processes) since version 7.1 (1993)
splitting queries across multiple servers since OPS (version 6) or across distributed databases (version 5)
It's also noteworthy that even if all that was said was correct rather than incorrect it doesn't actually help you in determining root cause.
Especially noteworthy, because it uses
file system files (not raw
partitions), and the "caching" is
outside, it relies heavily on (and is
very sensitive to) the file system
cache that you have set up. likewise,
Oracle needs a massive amount of
memory for these processes.
Oracle certainly can use raw partitions again dating back to the last millenium, moreover if you wish to cache within the database - using the buffer cache that PerformanceDBA has forgotten about - and bypass the filesystem cache this feature is available on all current filesystems. Oracle also supplies it's own combined filesystem/volume manager in ASM which you can use if you wish.
Oracle is also rather well instrumented (and if you have access to dtrace so is solaris) and can certainly tell you what sessions, processes etc are using the CPU, what the time the application spends in the database is consumed by (down to individual block read times if you care) and so is very susceptible to profiling. I'd recommend that you check out Thinking Clearly about Performance available at http://www.method-r.com/downloads/cat_view/38-papers-and-articles and written by one of the top Oracle Performance experts in the world. If you have access to the Oracle Diagnostics pack then checking out first of all ADDM reports and secondly AWR reports would be profitable.
Trying to avoid a flame war here.
I should probably have separated out the "how to find out" part of my response more clearly from my responses to the comments about server architecture from PerformanceDBA. I share Stephanie's suspicions about the spring framework, but without properly scoped measurement evidence there is no point in blaming any particular attribute of the environment, that would be just particular bias. Fortunately the instrumentation built into the oracle kernel allows you to trace and then profile the slow sessions to determine exactly where the issue lies. So I would do the following:
1) enable tracing for a representative session (you can use the dbms_monitor package for that).
2) also gather an execution plan for the statement(s) involved with the gather_plan_statistics hint.
3) profile the trace file by time using an appropriate profile (tkprof,orasrp,method-r profiler)
Investigate the problem statements in contribution to response time order.
If you can't carry out the above, then you can use ADDM and/or AWR if licenced as I originally suggested or statspack if not licensed for the diagnostics pack. ADDM naturally concentrates on time consumers, I suggest if you are forced down the statspack route you do the same.
The M3000 is certainly a more powerful machine, but it is more suitable for true servers. The X4170 with hyper-threads is more suited for file servers.
I'm not so certain about that. Have any data to support that claim?
An M3000 has one SPARC64 VII processor with 4 cores (tech specs) while a X4170 has 1 or 2 Intel 5500 "Nehalem-EP" processors each with 4 cores (tech specs). I know that I would expect much more from even a single processor Nehalem-EP system, than the M3000. Obviously data will vary slightly with the workload, but I know where I'd put my money.