How to enable security in a Kafka cluster without downtime

We have a Kafka cluster in production without any security. We plan to turn on security (SASL/OAUTHBEARER) on the broker side, but it looks like as soon as we do, all insecure clients will be dropped immediately.
For a smooth transition from an insecure to a secure cluster, without any downtime, we want our Kafka clients to enable security first. Once all our clients have migrated, we can turn on security at the broker level.
However, I cannot find a way for secure clients to talk to an insecure broker. Has anyone done this?
Any ideas on a smooth migration to security in production?

In Kafka 2.0, the following protocol combinations are allowed:
+------------------+-------+----------+
|                  | SSL   | Kerberos |
+------------------+-------+----------+
| PLAINTEXT        | No    | No       |
| SSL              | Yes   | No       |
| SASL_PLAINTEXT   | No    | Yes      |
| SASL_SSL         | Yes   | Yes      |
+------------------+-------+----------+
Those combinations apply to both broker-to-broker and client-to-broker connections, but the key config here is security.inter.broker.protocol: the protocol used between brokers does not have to be the same as the one used by clients. This means we can enable security in a Kafka cluster without any downtime.
Enabling Kerberos
Step 1: Disable security for broker-to-broker
security.inter.broker.protocol=PLAINTEXT
Step 2: Enable Kerberos for broker-to-client in server.properties
Step 3: Do a rolling restart
Step 4: Enable Kerberos for broker-to-broker
security.inter.broker.protocol=SASL_PLAINTEXT
Enabling SSL
Step 1: Disable security for broker-to-broker
security.inter.broker.protocol=PLAINTEXT
Step 2: Enable SSL for broker-to-client in server.properties
Step 3: Do a rolling restart
Step 4: Enable SSL for broker-to-broker
security.inter.broker.protocol=SSL
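Step 2 above amounts to adding an SSL listener and keystore settings to server.properties. A minimal sketch, assuming JKS stores (paths and passwords are placeholders):

```properties
# Keep the old PLAINTEXT listener alongside the new SSL one
listeners=PLAINTEXT://:9092,SSL://:9093
ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
ssl.truststore.location=/var/private/ssl/kafka.server.truststore.jks
ssl.truststore.password=changeit
```

During the rolling restart in Step 3, existing clients keep using the PLAINTEXT listener untouched.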

Setting the following property in server.properties will allow insecure clients to connect on port 9097 and secure clients to connect on port 9096.
listeners=SASL_PLAINTEXT://:9096,PLAINTEXT://:9097
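During the migration window, clients that have already switched over point at the secure port with SASL settings in their client properties. A sketch for the SASL/OAUTHBEARER case from the question (the host name is a placeholder, and the JAAS/OAuth specifics depend on your token provider):

```properties
bootstrap.servers=broker1:9096
security.protocol=SASL_PLAINTEXT
sasl.mechanism=OAUTHBEARER
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required;
```

Clients that have not migrated yet keep connecting to the PLAINTEXT port 9097 unchanged.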

Related

Java gRPC - TLS - how to set up mutual TLS on the client side?

I work on a software application that uses gRPC to establish a bi-directional stream between a client and a server.
I'm looking for something similar to this ticket's answer only in java: How to enable server side SSL for gRPC?
I would like to configure my application so that users can choose which TLS scenario they want to use:
Scenario 1: plaintext (no encryption)
Scenario 2: Server-side TLS
Scenario 3: Mutual TLS
For TLS setups, I am using Java in non-Android environments, so I will only be considering the statically linked OpenSSL scenario described at https://github.com/grpc/grpc-java/blob/master/SECURITY.md#openssl-statically-linked-netty-tcnative-boringssl-static
Configuring the server side seems pretty straightforward because it is documented quite well: https://github.com/grpc/grpc-java/blob/master/SECURITY.md#mutual-tls
Here would be the steps for the corresponding TLS options:
Server-side configuration for Scenario 1: Use builder.usePlaintext
Server-side configuration for Scenario 2: Add a NettyServerBuilder.sslContext built by an SslContextBuilder from GrpcSslContexts.forServer, and set the cert chain and cert key (and password if needed)
Server-side configuration for Scenario 3: Same as Scenario 2, but also set a trustManager on the sslContextBuilder, pointing at the trust cert file.
The server-side part is well documented which is excellent.
Now I want to configure a NettyChannelBuilder on the client side. The only information I can find on this is in this unit test: https://github.com/grpc/grpc-java/blob/master/interop-testing/src/test/java/io/grpc/testing/integration/TlsTest.java
Here are the configurations I think are needed, but I need confirmation on them.
Client-side configuration for Scenario 1: Use nettyChannelBuilder.usePlaintext(true). This will disable TLS on the netty channel to grpc.
Client-side configuration for Scenario 2: Set the sslContext using nettyChannelBuilder.negotiationType(NegotiationType.TLS).sslContext(GrpcSslContexts.configure(SslContextBuilder.forClient(), SslProvider.OPENSSL).build()). This will configure the channel to communicate through TLS to grpc server using the default ciphers and application protocol configs.
Client-side configuration for Scenario 3: Set up TLS for the netty channel using nettyChannelBuilder.negotiationType(NegotiationType.TLS).sslContext(GrpcSslContexts.configure(SslContextBuilder.forClient(), SslProvider.OPENSSL).trustManager(clientAuthCertFile).clientAuth(ClientAuth.OPTIONAL).build()), where clientAuthCertFile is the trust cert file and ClientAuth.OPTIONAL can also be ClientAuth.REQUIRED if you require mutual TLS.
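Trying these scenarios out locally needs certificates. A throwaway CA plus server and client certs can be generated with openssl along these lines (a sketch: file names, subjects, and validity are arbitrary placeholders, not production-grade material):

```shell
# Work in a scratch directory so nothing real gets overwritten
cd "$(mktemp -d)"
# 1. Self-signed CA
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout ca.key -out ca.crt -subj "/CN=test-ca"
# 2. Server key + CSR, signed by the CA (used in scenarios 2 and 3)
openssl req -newkey rsa:2048 -nodes \
  -keyout server.key -out server.csr -subj "/CN=localhost"
openssl x509 -req -days 1 -in server.csr \
  -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt
# 3. Client key + CSR, signed by the same CA (this is what scenario 3 adds)
openssl req -newkey rsa:2048 -nodes \
  -keyout client.key -out client.csr -subj "/CN=test-client"
openssl x509 -req -days 1 -in client.csr \
  -CA ca.crt -CAkey ca.key -CAcreateserial -out client.crt
```

server.crt/server.key would feed GrpcSslContexts.forServer, and ca.crt would be the trust cert file on both sides.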
Is there anything incorrect with my client-side configurations? Do I need any tweaks? I will add this as a PR to the security.md file after getting some blessing from the community on this post.
I added a hello world TLS PR to the grpc-java project here https://github.com/grpc/grpc-java/pull/3992
Once this PR is merged, the latest version of grpc-java will have a really nice working hello-world example. So all you have to do is clone that project from master and look at the example/README.md.

Different versions of TLS in record layer and handshake layer

There is an application that uses OpenSSL 1.0.2g. The application can receive incoming connections and initiate outbound connections, and it sends a ClientHello message when creating an outbound connection.
At that point I see that the TLS versions differ between the record layer and the handshake layer.
It's not clear why this happens. The following options are set on the global context:
SSL_CTX_set_options(ssl_list[i].ctx, SSL_OP_NO_SESSION_RESUMPTION_ON_RENEGOTIATION | SSL_OP_CIPHER_SERVER_PREFERENCE| SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3 | SSL_OP_NO_TLSv1 | SSL_OP_NO_TLSv1_1);
What could be wrong?
Thanks!

postgresql database not replicating. No errors

Background:
I am trying to set up streaming replication between two servers. Although PostgreSQL is running on both boxes without any errors, when I add or change a record on the primary, the change is not reflected on the secondary server.
I have the following set up on my primary database server:
(Note: I'm using fake IP addresses, but 10.222.22.12 represents the primary server and .21 the secondary.)
primary server - postgresql.conf
listen_addresses = '10.222.22.12'
unix_socket_directory = '/tmp/postgresql'
wal_level = hot_standby
max_wal_senders = 5 # max number of walsender processes
# (change requires restart)
wal_keep_segments = 32 # in logfile segments, 16MB each; 0 disables
primary server - pg_hba.conf
host all all 10.222.22.21/32 trust
host replication postgres 10.222.22.0/32 trust
primary server - firewall
I've checked to make sure all incoming to the fw is open and that all traffic out is allowed.
secondary server - postgresql.conf
listen_addresses = '10.222.22.21'
wal_level = hot_standby
max_wal_senders = 5 # max number of walsender processes
# (change requires restart)
wal_keep_segments = 32 # in logfile segments, 16MB each; 0 disables
hot_standby = on
secondary server - pg_hba.conf
host all all 10.222.22.12/32 trust
host all all 10.222.22.21/32 trust
host replication postgres 10.222.22.0/24 trust
secondary server - recovery.conf
standby_mode='on'
primary_conninfo = 'host=10.222.22.12 port=5432 user=postgres'
secondary server firewall
everything is open here too.
What I've tried so far
Made a change in data on the primary. Nothing replicated over.
Checked the firewall settings on both servers.
Checked the arp table on the secondary box to make sure it can communicate with the primary.
checked the postmaster.log file on both servers. They are empty.
Checked the syslog file on both servers. no errors noticed.
restarted postgresql on both servers to make sure it starts without errors.
I'm not sure what else to check. If you have any suggestions, I'd appreciate it.
EDIT 1
I've checked the pg_stat_replication table on the master and I get the following results:
psql -U postgres testdb -h 10.222.22.12 -c "select * from pg_stat_replication;"
 pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | state | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state
-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+-------+---------------+----------------+----------------+-----------------+---------------+------------
(0 rows)
And on the slave, notice the results from the following query:
testdb=# select now()-pg_last_xact_replay_timestamp();
 ?column?
----------

(1 row)
I changed the pg_hba.conf file on the primary and added the exact ip addr of my slave like so:
host all all 10.222.22.21/32 trust
host replication postgres 10.222.22.0/32 trust
#added the line below
host replication postgres 10.222.22.12/32 trust
Then I restarted postgresql and it worked.
I guess I was expecting that the line above the new line I added would work, but it doesn't: a /32 mask on 10.222.22.0 matches only the single address 10.222.22.0, not the whole subnet. I have to do more reading on subnetting.
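For reference, a sketch of what the primary's replication line could look like with a subnet-wide mask (addresses as in the question):

```
host  replication  postgres  10.222.22.0/24  trust
```

A /24 covers 10.222.22.0 through 10.222.22.255, so both .12 and .21 would match.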
On the master, listen_addresses lists the local interfaces PostgreSQL binds to (not the clients allowed to connect), so it must include an address the slave can reach, e.g.
listen_addresses = '*'
It also seems your postgres logging is not well configured.
My guess is that the slave cannot stream because it has fallen behind the master, possibly due to network latency.
My suggestion is to archive the WAL files, so that if the slave falls behind the master, it can replay WAL files from the archive.
You can also check by doing
select * from pg_stat_replication;
on the master. If it does not show any rows, it means that streaming is failing, probably because the slave has fallen behind the master.
You can also check by issuing :
select now()-pg_last_xact_replay_timestamp();
on the slave. The query shows how far the slave lags behind the master.
Usually, streaming replication lag is under 1s. If the slave falls so far behind that the master has already recycled the WAL segments it still needs (see wal_keep_segments), streaming is cut off.
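A minimal sketch of the WAL-archiving setup suggested above (the archive directory is a placeholder; it must be writable by postgres and reachable from the standby, e.g. over a shared mount or rsync):

```
# master, postgresql.conf
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/wal_archive/%f'

# standby, recovery.conf
restore_command = 'cp /var/lib/postgresql/wal_archive/%f %p'
```

With this in place, a standby that outruns the master's retained WAL can still catch up from the archive instead of breaking the stream.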

Identify what cipher strength HTTPS apache connections are using

How can I identify the cipher strength of active HTTPS connections to a Linux Red Hat Apache web server? I want to harden my web server by removing lower-strength ciphers and would like to check whether clients are even using them.
EDIT
My goal is to avoid the negative impact of removing a lower-security cipher that a client relies on. In the worst case, some non-browser (or old-browser) app is using an old, insecure cipher; when I disallow that cipher, the app could break. I'm trying to proactively identify whether any apps or browsers are using the ciphers I'm going to disable.
You can identify the protocols and ciphers clients negotiate by enabling the appropriate level of logging in mod_ssl. See the Custom Log Formats section on http://httpd.apache.org/docs/2.2/mod/mod_ssl.html, notably
CustomLog logs/ssl_request_log \ "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"
This should enable you to make a list of ciphers requested by clients and configure Apache accordingly.
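With that log format in place, a small shell pipeline can tally what clients actually negotiated. Since %t expands to a bracketed "[date zone]" pair (two whitespace-separated fields), SSL_PROTOCOL ends up as field 4 and SSL_CIPHER as field 5. Shown here against a tiny inline sample; in practice, point the awk command at your real ssl_request_log path:

```shell
# Build a tiny sample in the CustomLog format above (stand-in for the real log)
cat > /tmp/ssl_request_log.sample <<'EOF'
[10/Oct/2018:13:55:36 -0700] 1.2.3.4 TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 "GET / HTTP/1.1" 1234
[10/Oct/2018:13:55:37 -0700] 1.2.3.5 TLSv1 DES-CBC3-SHA "GET / HTTP/1.1" 999
[10/Oct/2018:13:55:38 -0700] 1.2.3.4 TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 "GET / HTTP/1.1" 1234
EOF
# $4 = SSL_PROTOCOL, $5 = SSL_CIPHER (fields 1-2 are the bracketed %t);
# count each protocol/cipher pair, most common first
awk '{ print $4, $5 }' /tmp/ssl_request_log.sample | sort | uniq -c | sort -rn
```

Any line showing a cipher you plan to disable means some client still depends on it.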
Your question and your goal aren't necessarily related. Each active connection may use a different cipher based on a combination of (a) the capabilities of the server, (b) the capabilities of the client, and (c) the cipher preferences of the server and client. Looking at any individual connection will not tell you whether your SSL configuration is optimal.
If your goal is to harden your SSL configuration, I suggest you use the SSL Server Test from SSL Labs. It grades your server configuration based on known SSL vulnerabilities and best practices.
The last time I updated my SSL configuration, I used the configuration tips from this blog post. Note that the understanding of SSL vulnerabilities is constantly changing, so I suggest you rerun the test every once in a while to ensure your configuration is the best currently known.

How can I secure memcached/beanstalkd in a hostile cloud environment?

Here's what my servers (in Amazon EC2) would look like:
Server 1                      Server 2               Server 3
 __________________________    ____________________   ______________
| Cloud Monitor Daemon     |  | Memcached daemon   | | beanstalkd   |
|                          |  | Memcached daemon   |  ______________
| "Hostile" user process   |--| Memcached daemon   |
| "Hostile" user process   |  | Memcached daemon   |
| "Hostile" user process   |  | Memcached daemon   |
| "Hostile" user process   |   ____________________
| "Hostile" user process   |
 __________________________
There's multiple user processes on one server. Each user then has their own memcached instance running on a (separate) server (with many other memcached instances). Without any sort of security (as it is by default), user process B could guess the port of the memcached instance of user A and access it. How can I secure this system so that user C could only access memcached instance C and no other (even though the memcached instances are all on the same server)? My user should not have to do anything to make use of the security (just continue connecting to the memcached port as usual), it should all happen automatically by the system.
Also, the Cloud Monitor Daemon on the server along with the "hostile" user processes needs to be able to access a remote beanstalkd server. Beanstalkd has no authentication either, so if my Monitor Daemon can access beanstalkd, so can the "hostile" user processes, and I don't want that. How can I secure this part?
I mentioned some tips on securing memcached in a blog post recently. For your case, SASL will probably help a lot.
I don't know if beanstalk ever got SASL support, but that's kind of a different app.
You could build a VPN or enable IPSEC to control access to all services on all machines at the node level, of course.
You can start beanstalkd on Server3 local IP (127.0.0.1)
and then use SSH Tunnels from the Server 1 to Server 3.
Combine it with inetd and ssh-keys to be failsafe.
I ended up going with plain old iptables. It allows me to do per-uid rules and is very easy to configure programmatically. Most importantly, the users don't need to be involved in the process; they can continue using the standard protocols without dealing with authentication, and iptables will drop any "naughty" packets that are going where they shouldn't.
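A sketch of the kind of per-uid rules this describes, using iptables' owner match (which only applies in the OUTPUT chain, i.e. on the server where the user processes run; the uid and the beanstalkd server address are hypothetical, 11300 is beanstalkd's default port):

```
# Let the monitor daemon (uid 500 here) reach beanstalkd on Server 3
iptables -A OUTPUT -p tcp -d 10.0.0.3 --dport 11300 -m owner --uid-owner 500 -j ACCEPT
# Drop everyone else's packets to beanstalkd
iptables -A OUTPUT -p tcp -d 10.0.0.3 --dport 11300 -j DROP
```

The same pattern maps each user's uid to the one memcached port that user is allowed to reach, with no client-side changes.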
A couple of weeks ago, Amazon announced Amazon VPC (Virtual Private Cloud), which we are now using to secure memcached and beanstalkd.
Works great! I seriously recommend it; one less overhead for us to deal with ourselves.
