HDFS group Permissions issue, Cluster integrated with Kerberos + AD - security

CDH cluster is integrated with Kerberos + AD.
user_A is added to groups groupX and AD_GROUP_X
user_B is added to groups groupX and AD_GROUP_X
There are two files in HDFS with different group permissions:
/user/file_a
Owner: user_A, Group: groupA
Permissions: u=rwx, g=rwx, o=---
/user/file_b
Owner: user_B, Group: AD_GROUP_X
Permissions: u=rwx, g=rwx, o=---
Scenario #1:
user_A wants to access file /user/file_b ==> Success
Scenario #2:
user_B wants to access file /user/file_a ==> Failed (expected: success)
Once AD is integrated with the cluster, does HDFS read only AD groups, or can it read both AD groups and Unix groups?

It is possible to combine multiple existing group mapping providers, so that users do not all have to be defined in a single place. For example, AD users can be resolved through the LdapGroupsMapping provider, while Unix users can use the default ShellBasedUnixGroupsMapping provider for Unix group mapping.
It can be configured as shown below.
<property>
<name>hadoop.security.group.mapping</name>
<value>org.apache.hadoop.security.CompositeGroupsMapping</value>
</property>
<property>
<name>hadoop.security.group.mapping.providers</name>
<value>unix,ad01,ad02</value>
</property>
<property>
<name>hadoop.security.group.mapping.providers.combined</name>
<value>true</value>
<description>true or false to indicate whether groups from the providers are combined or not. If true, all the providers are tried and the final result is the union of all groups found for the user. If false, the providers are tried in the configured order and the groups from the first provider that returns any are used. Default value is true.
</description>
</property>
<property>
<name>hadoop.security.group.mapping.provider.unix</name>
<value>org.apache.hadoop.security.ShellBasedUnixGroupsMapping</value>
</property>
<property>
<name>hadoop.security.group.mapping.provider.ad01</name>
<value>org.apache.hadoop.security.LdapGroupsMapping</value>
</property>
<property>
<name>hadoop.security.group.mapping.provider.ad02</name>
<value>org.apache.hadoop.security.LdapGroupsMapping</value>
</property>
<property>
<name>hadoop.security.group.mapping.provider.ad01.ldap.url</name>
<value>ldap://</value>
</property>
<property>
<name>hadoop.security.group.mapping.provider.ad02.ldap.url</name>
<value>ldap://</value>
</property>
<property>
<name>hadoop.security.group.mapping.provider.ad01.ldap.bind.user</name>
<value></value>
</property>
<property>
<name>hadoop.security.group.mapping.provider.ad02.ldap.bind.user</name>
<value></value>
</property>
<property>
<name>hadoop.security.group.mapping.provider.ad01.ldap.base</name>
<value></value>
</property>
<property>
<name>hadoop.security.group.mapping.provider.ad02.ldap.base</name>
<value></value>
</property>
Support multiple group providers - JIRA
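
To make the effect of hadoop.security.group.mapping.providers.combined concrete, here is a toy model in Python. It is only a sketch of the documented behaviour, not Hadoop code: the two dictionaries are hypothetical stand-ins for the unix and AD providers, filled in with the users and groups from the question.

```python
from typing import Dict, List

# Hypothetical stand-ins for the real providers: each maps a user to its groups.
UNIX_GROUPS: Dict[str, List[str]] = {"user_A": ["groupX"], "user_B": ["groupX"]}
AD_GROUPS: Dict[str, List[str]] = {"user_A": ["AD_GROUP_X"], "user_B": ["AD_GROUP_X"]}


def resolve_groups(user: str,
                   providers: List[Dict[str, List[str]]],
                   combined: bool = True) -> List[str]:
    """Toy model of a composite lookup.

    combined=True  -> ask every provider and merge the results.
    combined=False -> stop at the first provider that returns any group.
    """
    result: List[str] = []
    for lookup in providers:
        found = lookup.get(user, [])
        for group in found:
            if group not in result:
                result.append(group)
        if found and not combined:
            break
    return result


providers = [UNIX_GROUPS, AD_GROUPS]  # same order as "unix,ad01,ad02"
print(resolve_groups("user_B", providers, combined=True))
print(resolve_groups("user_B", providers, combined=False))
```

With combined=True, user_B ends up with both the Unix group and the AD group; with combined=False only the groups from the first provider that knows the user are returned.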

Related

NiFi LDAP Authentication: Unable To Locate Initial Admin Identity

We're trying to configure NiFi to require users to authenticate using their Active Directory credentials when accessing the UI. We've tried multiple permutations, but we always get an error message along the lines of "cannot locate Initial Admin Identity jane.doe".
We've already tried the following:
Commenting out nifi.security.identity.mapping.pattern.dn in nifi.properties and specifying the entire DN for the Initial Admin Identity
Changing the User Identity Attribute to something other than cn
Can anyone think of a reason why we might be running into this issue?
Thanks,
Cory
authorizers.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<authorizers>
<!--
The FileUserGroupProvider will provide support for managing users and groups which is backed by a file
on the local file system.
- Users File - The file where the FileUserGroupProvider will store users and groups.
- Legacy Authorized Users File - The full path to an existing authorized-users.xml that will be automatically
be used to load the users and groups into the Users File.
- Initial User Identity [unique key] - The identity of a users and systems to seed the Users File. The name of
each property must be unique, for example: "Initial User Identity A", "Initial User Identity B",
"Initial User Identity C" or "Initial User Identity 1", "Initial User Identity 2", "Initial User Identity 3"
NOTE: Any identity mapping rules specified in nifi.properties will also be applied to the user identities,
so the values should be the unmapped identities (i.e. full DN from a certificate).
-->
<userGroupProvider>
<identifier>file-user-group-provider</identifier>
<class>org.apache.nifi.authorization.FileUserGroupProvider</class>
<property name="Users File">./conf/users.xml</property>
<property name="Legacy Authorized Users File"></property>
<property name="Initial User Identity Node1">CN=xxx, OU=yyy, O=zzz, L=www, ST=vvv, C=kkk</property>
<property name="Initial User Identity Node2">CN=xxx, OU=yyy, O=zzz, L=www, ST=vvv, C=kkk</property>
</userGroupProvider>
<!--
The LdapUserGroupProvider will retrieve users and groups from an LDAP server. The users and groups
are not configurable.
'Authentication Strategy' - How the connection to the LDAP server is authenticated. Possible
values are ANONYMOUS, SIMPLE, LDAPS, or START_TLS.
'Manager DN' - The DN of the manager that is used to bind to the LDAP server to search for users.
'Manager Password' - The password of the manager that is used to bind to the LDAP server to
search for users.
'TLS - Keystore' - Path to the Keystore that is used when connecting to LDAP using LDAPS or START_TLS.
'TLS - Keystore Password' - Password for the Keystore that is used when connecting to LDAP
using LDAPS or START_TLS.
'TLS - Keystore Type' - Type of the Keystore that is used when connecting to LDAP using
LDAPS or START_TLS (i.e. JKS or PKCS12).
'TLS - Truststore' - Path to the Truststore that is used when connecting to LDAP using LDAPS or START_TLS.
'TLS - Truststore Password' - Password for the Truststore that is used when connecting to
LDAP using LDAPS or START_TLS.
'TLS - Truststore Type' - Type of the Truststore that is used when connecting to LDAP using
LDAPS or START_TLS (i.e. JKS or PKCS12).
'TLS - Client Auth' - Client authentication policy when connecting to LDAP using LDAPS or START_TLS.
Possible values are REQUIRED, WANT, NONE.
'TLS - Protocol' - Protocol to use when connecting to LDAP using LDAPS or START_TLS. (i.e. TLS,
TLSv1.1, TLSv1.2, etc).
'TLS - Shutdown Gracefully' - Specifies whether the TLS should be shut down gracefully
before the target context is closed. Defaults to false.
'Referral Strategy' - Strategy for handling referrals. Possible values are FOLLOW, IGNORE, THROW.
'Connect Timeout' - Duration of connect timeout. (i.e. 10 secs).
'Read Timeout' - Duration of read timeout. (i.e. 10 secs).
'Url' - Space-separated list of URLs of the LDAP servers (i.e. ldap://<hostname>:<port>).
'Page Size' - Sets the page size when retrieving users and groups. If not specified, no paging is performed.
'Sync Interval' - Duration of time between syncing users and groups (i.e. 30 mins). Minimum allowable value is 10 secs.
'Group Membership - Enforce Case Sensitivity' - Sets whether group membership decisions are case sensitive. When a user or group
is inferred (by not specifying or user or group search base or user identity attribute or group name attribute) case sensitivity
is enforced since the value to use for the user identity or group name would be ambiguous. Defaults to false.
'User Search Base' - Base DN for searching for users (i.e. ou=users,o=nifi). Required to search users.
'User Object Class' - Object class for identifying users (i.e. person). Required if searching users.
'User Search Scope' - Search scope for searching users (ONE_LEVEL, OBJECT, or SUBTREE). Required if searching users.
'User Search Filter' - Filter for searching for users against the 'User Search Base' (i.e. (memberof=cn=team1,ou=groups,o=nifi) ). Optional.
'User Identity Attribute' - Attribute to use to extract user identity (i.e. cn). Optional. If not set, the entire DN is used.
'User Group Name Attribute' - Attribute to use to define group membership (i.e. memberof). Optional. If not set
group membership will not be calculated through the users. Will rely on group membership being defined
through 'Group Member Attribute' if set. The value of this property is the name of the attribute in the user ldap entry that
associates them with a group. The value of that user attribute could be a dn or group name for instance. What value is expected
is configured in the 'User Group Name Attribute - Referenced Group Attribute'.
'User Group Name Attribute - Referenced Group Attribute' - If blank, the value of the attribute defined in 'User Group Name Attribute'
is expected to be the full dn of the group. If not blank, this property will define the attribute of the group ldap entry that
the value of the attribute defined in 'User Group Name Attribute' is referencing (i.e. name). Use of this property requires that
'Group Search Base' is also configured.
'Group Search Base' - Base DN for searching for groups (i.e. ou=groups,o=nifi). Required to search groups.
'Group Object Class' - Object class for identifying groups (i.e. groupOfNames). Required if searching groups.
'Group Search Scope' - Search scope for searching groups (ONE_LEVEL, OBJECT, or SUBTREE). Required if searching groups.
'Group Search Filter' - Filter for searching for groups against the 'Group Search Base'. Optional.
'Group Name Attribute' - Attribute to use to extract group name (i.e. cn). Optional. If not set, the entire DN is used.
'Group Member Attribute' - Attribute to use to define group membership (i.e. member). Optional. If not set
group membership will not be calculated through the groups. Will rely on group membership being defined
through 'User Group Name Attribute' if set. The value of this property is the name of the attribute in the group ldap entry that
associates them with a user. The value of that group attribute could be a dn or memberUid for instance. What value is expected
is configured in the 'Group Member Attribute - Referenced User Attribute'. (i.e. member: cn=User 1,ou=users,o=nifi vs. memberUid: user1)
'Group Member Attribute - Referenced User Attribute' - If blank, the value of the attribute defined in 'Group Member Attribute'
is expected to be the full dn of the user. If not blank, this property will define the attribute of the user ldap entry that
the value of the attribute defined in 'Group Member Attribute' is referencing (i.e. uid). Use of this property requires that
'User Search Base' is also configured. (i.e. member: cn=User 1,ou=users,o=nifi vs. memberUid: user1)
NOTE: Any identity mapping rules specified in nifi.properties will also be applied to the user identities.
Group names are not mapped.
-->
<userGroupProvider>
<identifier>ldap-user-group-provider</identifier>
<class>org.apache.nifi.ldap.tenants.LdapUserGroupProvider</class>
<property name="Authentication Strategy">LDAPS</property>
<property name="Manager DN">CN=xxx,OU=yyy,OU=zzz,OU=www,DC=vvv,DC=ggg,DC=hhh,DC=ppp</property>
<property name="Manager Password">xxx</property>
<property name="TLS - Keystore">/etc/localhost-keystore.jks</property>
<property name="TLS - Keystore Password">xxx</property>
<property name="TLS - Keystore Type">JKS</property>
<property name="TLS - Truststore">/etc/all-truststore.jks</property>
<property name="TLS - Truststore Password">xxx</property>
<property name="TLS - Truststore Type">JKS</property>
<property name="TLS - Client Auth"></property>
<property name="TLS - Protocol">TLS</property>
<property name="TLS - Shutdown Gracefully"></property>
<property name="Referral Strategy">FOLLOW</property>
<property name="Connect Timeout">10 secs</property>
<property name="Read Timeout">10 secs</property>
<property name="Url">ldaps://xxx:636</property>
<property name="Page Size"></property>
<property name="Sync Interval">30 mins</property>
<property name="Group Membership - Enforce Case Sensitivity">false</property>
<property name="User Search Base">OU=xxx,OU=yyy,OU=zzz,DC=www,DC=vvv,DC=ddd,DC=eee</property>
<property name="User Object Class">organizationalPerson</property>
<property name="User Search Scope">SUBTREE</property>
<property name="User Search Filter"></property>
<property name="User Identity Attribute">cn</property>
<property name="User Group Name Attribute"></property>
<property name="User Group Name Attribute - Referenced Group Attribute"></property>
<property name="Group Search Base"></property>
<property name="Group Object Class">group</property>
<property name="Group Search Scope">ONE_LEVEL</property>
<property name="Group Search Filter"></property>
<property name="Group Name Attribute"></property>
<property name="Group Member Attribute"></property>
<property name="Group Member Attribute - Referenced User Attribute"></property>
</userGroupProvider>
<!--
The ShellUserGroupProvider provides support for retrieving users and groups by way of shell commands
on systems that support `sh`. Implementations available for Linux and Mac OS, and are selected by the
provider based on the system property `os.name`.
'Refresh Delay' - duration to wait between subsequent refreshes. Default is '5 mins'.
'Exclude Groups' - regular expression used to exclude groups. Default is '', which means no groups are excluded.
'Exclude Users' - regular expression used to exclude users. Default is '', which means no users are excluded.
-->
<!-- To enable the shell-user-group-provider remove 2 lines. This is 1 of 2.
<userGroupProvider>
<identifier>shell-user-group-provider</identifier>
<class>org.apache.nifi.authorization.ShellUserGroupProvider</class>
<property name="Refresh Delay">5 mins</property>
<property name="Exclude Groups"></property>
<property name="Exclude Users"></property>
</userGroupProvider>
To enable the shell-user-group-provider remove 2 lines. This is 2 of 2. -->
<!--
The CompositeUserGroupProvider will provide support for retrieving users and groups from multiple sources.
- User Group Provider [unique key] - The identifier of user group providers to load from. The name of
each property must be unique, for example: "User Group Provider A", "User Group Provider B",
"User Group Provider C" or "User Group Provider 1", "User Group Provider 2", "User Group Provider 3"
NOTE: Any identity mapping rules specified in nifi.properties are not applied in this implementation. This behavior
would need to be applied by the base implementation.
-->
<userGroupProvider>
<identifier>composite-user-group-provider</identifier>
<class>org.apache.nifi.authorization.CompositeUserGroupProvider</class>
<property name="User Group Provider 1">ldap-user-group-provider</property>
<property name="User Group Provider 2">file-user-group-provider</property>
</userGroupProvider>
<!--
The CompositeConfigurableUserGroupProvider will provide support for retrieving users and groups from multiple sources.
Additionally, a single configurable user group provider is required. Users from the configurable user group provider
are configurable, however users loaded from one of the User Group Provider [unique key] will not be.
- Configurable User Group Provider - A configurable user group provider.
- User Group Provider [unique key] - The identifier of user group providers to load from. The name of
each property must be unique, for example: "User Group Provider A", "User Group Provider B",
"User Group Provider C" or "User Group Provider 1", "User Group Provider 2", "User Group Provider 3"
NOTE: Any identity mapping rules specified in nifi.properties are not applied in this implementation. This behavior
would need to be applied by the base implementation.
-->
<!-- To enable the composite-configurable-user-group-provider remove 2 lines. This is 1 of 2.
<userGroupProvider>
<identifier>composite-configurable-user-group-provider</identifier>
<class>org.apache.nifi.authorization.CompositeConfigurableUserGroupProvider</class>
<property name="Configurable User Group Provider">file-user-group-provider</property>
<property name="User Group Provider 1"></property>
</userGroupProvider>
To enable the composite-configurable-user-group-provider remove 2 lines. This is 2 of 2. -->
<!--
The FileAccessPolicyProvider will provide support for managing access policies which is backed by a file
on the local file system.
- User Group Provider - The identifier for an User Group Provider defined above that will be used to access
users and groups for use in the managed access policies.
- Authorizations File - The file where the FileAccessPolicyProvider will store policies.
- Initial Admin Identity - The identity of an initial admin user that will be granted access to the UI and
given the ability to create additional users, groups, and policies. The value of this property could be
a DN when using certificates or LDAP, or a Kerberos principal. This property will only be used when there
are no other policies defined. If this property is specified then a Legacy Authorized Users File can not be specified.
NOTE: Any identity mapping rules specified in nifi.properties will also be applied to the initial admin identity,
so the value should be the unmapped identity. This identity must be found in the configured User Group Provider.
- Legacy Authorized Users File - The full path to an existing authorized-users.xml that will be automatically
converted to the new authorizations model. If this property is specified then an Initial Admin Identity can
not be specified, and this property will only be used when there are no other users, groups, and policies defined.
NOTE: Any users in the legacy users file must be found in the configured User Group Provider.
- Node Identity [unique key] - The identity of a NiFi cluster node. When clustered, a property for each node
should be defined, so that every node knows about every other node. If not clustered these properties can be ignored.
The name of each property must be unique, for example for a three node cluster:
"Node Identity A", "Node Identity B", "Node Identity C" or "Node Identity 1", "Node Identity 2", "Node Identity 3"
NOTE: Any identity mapping rules specified in nifi.properties will also be applied to the node identities,
so the values should be the unmapped identities (i.e. full DN from a certificate). This identity must be found
in the configured User Group Provider.
- Node Group - The name of a group containing NiFi cluster nodes. The typical use for this is when nodes are dynamically
added/removed from the cluster.
NOTE: The group must exist before starting NiFi.
-->
<accessPolicyProvider>
<identifier>file-access-policy-provider</identifier>
<class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
<property name="User Group Provider">composite-user-group-provider</property>
<property name="Authorizations File">./conf/authorizations.xml</property>
<property name="Initial Admin Identity">jane.doe</property>
<property name="Node Identity 1">CN=xxx, OU=yyy, O=zzz, L=www, ST=vvv, C=sss</property>
<property name="Node Identity 2">CN=xxx, OU=yyy, O=zzz, L=www, ST=vvv, C=sss</property>
<property name="Legacy Authorized Users File"></property>
<property name="Node Group"></property>
</accessPolicyProvider>
<!--
The StandardManagedAuthorizer. This authorizer implementation must be configured with the
Access Policy Provider which it will use to access and manage users, groups, and policies.
These users, groups, and policies will be used to make all access decisions during authorization
requests.
- Access Policy Provider - The identifier for an Access Policy Provider defined above.
-->
<authorizer>
<identifier>managed-authorizer</identifier>
<class>org.apache.nifi.authorization.StandardManagedAuthorizer</class>
<property name="Access Policy Provider">file-access-policy-provider</property>
</authorizer>
<!--
NOTE: This Authorizer has been replaced with the more granular approach configured above with the Standard
Managed Authorizer. However, it is still available for backwards compatibility reasons.
The FileAuthorizer is NiFi's provided authorizer and has the following properties:
- Authorizations File - The file where the FileAuthorizer will store policies.
- Users File - The file where the FileAuthorizer will store users and groups.
- Initial Admin Identity - The identity of an initial admin user that will be granted access to the UI and
given the ability to create additional users, groups, and policies. The value of this property could be
a DN when using certificates or LDAP, or a Kerberos principal. This property will only be used when there
are no other users, groups, and policies defined. If this property is specified then a Legacy Authorized
Users File can not be specified.
NOTE: Any identity mapping rules specified in nifi.properties will also be applied to the initial admin identity,
so the value should be the unmapped identity.
- Legacy Authorized Users File - The full path to an existing authorized-users.xml that will be automatically
converted to the new authorizations model. If this property is specified then an Initial Admin Identity can
not be specified, and this property will only be used when there are no other users, groups, and policies defined.
- Node Identity [unique key] - The identity of a NiFi cluster node. When clustered, a property for each node
should be defined, so that every node knows about every other node. If not clustered these properties can be ignored.
The name of each property must be unique, for example for a three node cluster:
"Node Identity A", "Node Identity B", "Node Identity C" or "Node Identity 1", "Node Identity 2", "Node Identity 3"
NOTE: Any identity mapping rules specified in nifi.properties will also be applied to the node identities,
so the values should be the unmapped identities (i.e. full DN from a certificate).
-->
<!-- <authorizer>
<identifier>file-provider</identifier>
<class>org.apache.nifi.authorization.FileAuthorizer</class>
<property name="Authorizations File">./conf/authorizations.xml</property>
<property name="Users File">./conf/users.xml</property>
<property name="Initial Admin Identity"></property>
<property name="Legacy Authorized Users File"></property>
</authorizer>
-->
</authorizers>
login-identity-providers.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<loginIdentityProviders>
<!--
Identity Provider for users logging in with username/password against an LDAP server.
'Authentication Strategy' - How the connection to the LDAP server is authenticated. Possible
values are ANONYMOUS, SIMPLE, LDAPS, or START_TLS.
'Manager DN' - The DN of the manager that is used to bind to the LDAP server to search for users.
'Manager Password' - The password of the manager that is used to bind to the LDAP server to
search for users.
'TLS - Keystore' - Path to the Keystore that is used when connecting to LDAP using LDAPS or START_TLS.
'TLS - Keystore Password' - Password for the Keystore that is used when connecting to LDAP
using LDAPS or START_TLS.
'TLS - Keystore Type' - Type of the Keystore that is used when connecting to LDAP using
LDAPS or START_TLS (i.e. JKS or PKCS12).
'TLS - Truststore' - Path to the Truststore that is used when connecting to LDAP using LDAPS or START_TLS.
'TLS - Truststore Password' - Password for the Truststore that is used when connecting to
LDAP using LDAPS or START_TLS.
'TLS - Truststore Type' - Type of the Truststore that is used when connecting to LDAP using
LDAPS or START_TLS (i.e. JKS or PKCS12).
'TLS - Client Auth' - Client authentication policy when connecting to LDAP using LDAPS or START_TLS.
Possible values are REQUIRED, WANT, NONE.
'TLS - Protocol' - Protocol to use when connecting to LDAP using LDAPS or START_TLS. (i.e. TLS,
TLSv1.1, TLSv1.2, etc).
'TLS - Shutdown Gracefully' - Specifies whether the TLS should be shut down gracefully
before the target context is closed. Defaults to false.
'Referral Strategy' - Strategy for handling referrals. Possible values are FOLLOW, IGNORE, THROW.
'Connect Timeout' - Duration of connect timeout. (i.e. 10 secs).
'Read Timeout' - Duration of read timeout. (i.e. 10 secs).
'Url' - Space-separated list of URLs of the LDAP servers (i.e. ldap://<hostname>:<port>).
'User Search Base' - Base DN for searching for users (i.e. CN=Users,DC=example,DC=com).
'User Search Filter' - Filter for searching for users against the 'User Search Base'.
(i.e. sAMAccountName={0}). The user specified name is inserted into '{0}'.
'Identity Strategy' - Strategy to identify users. Possible values are USE_DN and USE_USERNAME.
The default functionality if this property is missing is USE_DN in order to retain
backward compatibility. USE_DN will use the full DN of the user entry if possible.
USE_USERNAME will use the username the user logged in with.
'Authentication Expiration' - The duration of how long the user authentication is valid
for. If the user never logs out, they will be required to log back in following
this duration.
-->
<provider>
<identifier>ldap-provider</identifier>
<class>org.apache.nifi.ldap.LdapProvider</class>
<property name="Authentication Strategy">LDAPS</property>
<property name="Manager DN">CN=xxx,OU=yyy,OU=zzz,OU=www,DC=vvv,DC=rrr,DC=ooo,DC=ppp</property>
<property name="Manager Password">xxx</property>
<property name="TLS - Keystore">/etc/localhost-keystore.jks</property>
<property name="TLS - Keystore Password">xxx</property>
<property name="TLS - Keystore Type">JKS</property>
<property name="TLS - Truststore">/etc/all-truststore.jks</property>
<property name="TLS - Truststore Password">xxx</property>
<property name="TLS - Truststore Type">JKS</property>
<property name="TLS - Client Auth"></property>
<property name="TLS - Protocol">TLS</property>
<property name="TLS - Shutdown Gracefully"></property>
<property name="Referral Strategy">FOLLOW</property>
<property name="Connect Timeout">10 secs</property>
<property name="Read Timeout">10 secs</property>
<property name="Url">ldaps://xxx:636</property>
<property name="User Search Base">OU=xxx,OU=yyy,OU=zzz,DC=www,DC=vvv,DC=uuu,DC=qqq</property>
<property name="User Search Filter">cn={0}</property>
<property name="Identity Strategy">USE_DN</property>
<property name="Authentication Expiration">12 hours</property>
</provider>
<!--
Identity Provider for users logging in with username/password against a Kerberos KDC server.
'Default Realm' - Default realm to provide when user enters incomplete user principal (i.e. NIFI.APACHE.ORG).
'Authentication Expiration' - The duration of how long the user authentication is valid for. If the user never logs out, they will be required to log back in following this duration.
-->
<!-- To enable the kerberos-provider remove 2 lines. This is 1 of 2.
<provider>
<identifier>kerberos-provider</identifier>
<class>org.apache.nifi.kerberos.KerberosProvider</class>
<property name="Default Realm">NIFI.APACHE.ORG</property>
<property name="Authentication Expiration">12 hours</property>
</provider>
To enable the kerberos-provider remove 2 lines. This is 2 of 2. -->
</loginIdentityProviders>
nifi.properties
nifi.security.keystore=/etc/localhost-keystore.jks
nifi.security.keystoreType=JKS
nifi.security.keystorePasswd=xxx
nifi.security.keyPasswd=
nifi.security.truststore=/etc/all-truststore.jks
nifi.security.truststoreType=JKS
nifi.security.truststorePasswd=xxx
nifi.security.user.authorizer=managed-authorizer
nifi.security.user.login.identity.provider=ldap-provider
nifi.security.ocsp.responder.url=
nifi.security.ocsp.responder.certificate=
...
# Identity Mapping Properties #
# These properties allow normalizing user identities such that identities coming from different identity providers
# (certificates, LDAP, Kerberos) can be treated the same internally in NiFi. The following example demonstrates normalizing
# DNs from certificates and principals from Kerberos into a common identity string:
#
# nifi.security.identity.mapping.pattern.dn=^CN=(.*?), OU=(.*?), O=(.*?), L=(.*?), ST=(.*?), C=(.*?)$
nifi.security.identity.mapping.pattern.dn=^CN=(.*?), OU=(.*?), OU=(.*?), OU=(.*?), DC=(.*?), DC=(.*?), DC=(.*?), DC=(.*?)$
nifi.security.identity.mapping.value.dn=$1
# nifi.security.identity.mapping.pattern.dn2=^CN=(.*?), OU=(.*?), O=(.*?), L=(.*?), S=(.*?), C=(.*?)$
# nifi.security.identity.mapping.value.dn2=$1
# nifi.security.identity.mapping.value.dn=$1#$2
# nifi.security.identity.mapping.transform.dn=NONE
# nifi.security.identity.mapping.pattern.kerb=^(.*?)/instance#(.*?)$
# nifi.security.identity.mapping.value.kerb=$1#$2
# nifi.security.identity.mapping.transform.kerb=UPPER
You have an "Initial Admin Identity" for jane.doe defined in the <accessPolicyProvider> but not in your file-based <userGroupProvider>. Is jane.doe an (exact) user identity you expect to be available in LDAP? If not, you should define the same user as the "Initial User Identity 1" in the file-based <userGroupProvider> alongside the node identities. If you do expect Jane Doe to be provided via LDAP, have you tried setting the IAI to CN=jane.doe, OU=<your org>, ... (the full value of the DN returned from LDAP)?
You can also look at the logs/nifi-app.log and logs/nifi-user.log files for more details, and increase the verbosity of those logs by modifying conf/logback.xml.
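
It can also help to check what the identity mapping in nifi.properties does to a given DN. Below is a rough Python sketch that applies the same regular expression as nifi.security.identity.mapping.pattern.dn with the value $1. The sample DNs are made up, and the rule that a non-matching identity is passed through unchanged is my understanding of NiFi's behaviour rather than something shown in the question.

```python
import re

# Pattern copied from nifi.properties in the question.
# Note that it expects ", " (comma followed by a space) between DN components.
PATTERN = re.compile(
    r"^CN=(.*?), OU=(.*?), OU=(.*?), OU=(.*?), "
    r"DC=(.*?), DC=(.*?), DC=(.*?), DC=(.*?)$"
)


def map_identity(identity: str) -> str:
    """Return capture group 1 (the CN) if the pattern matches.

    Assumption: an identity that does not match is left unchanged.
    """
    match = PATTERN.match(identity)
    return match.group(1) if match else identity


# Hypothetical DNs, for illustration only.
spaced = "CN=jane.doe, OU=a, OU=b, OU=c, DC=w, DC=x, DC=y, DC=z"
unspaced = "CN=jane.doe,OU=a,OU=b,OU=c,DC=w,DC=x,DC=y,DC=z"

print(map_identity(spaced))
print(map_identity(unspaced))
```

The first DN is reduced to jane.doe. The second has no spaces after the commas, so the pattern does not match and the full DN comes back. If the DNs your LDAP server returns look like the second form, the mapped identity will not be jane.doe, which would be one possible reason for the Initial Admin Identity lookup to fail.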

Unable to connect to s3 buckets from pyspark

I am trying to connect to my S3 buckets from Spark as follows:
rdd=sc.textFile("s3n://bucketname/objectname")
rdd=sc.textFile("s3a://bucketname/objectname")
I changed my core-site.xml as required for s3a or s3n, but I am getting the error below. I have tried various changes to my Hadoop core-site.xml and also get errors such as "load aws credentials from any provider in chain". (The .aws credentials file is there with the right credentials.)
ResponseStatus: Bad Request, XML Error Message: AuthorizationHeaderMalformed: The authorization header is malformed; a non-empty Access Key (AKID) must be provided in the credential
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://********.compute-1.amazonaws.com:9000</value>
</property>
<property>
<name>fs.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
<property>
<name>fs.s3a.access.key</name>
<value>ACCESSKEYID</value>
</property>
<property>
<name>fs.s3a.secret.key</name>
<value>SECRETKEY</value>
</property>
</configuration>
I added aws-sdk-s3 to my Spark jars. Please point me in the right direction.
Complete error message:
Bad Request, XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>AuthorizationHeaderMalformed</Code><Message>The authorization header is malformed; a non-empty Access Key (AKID) must be provided in the credential.</Message><RequestId>E64EEB94923F0FF7</RequestId><HostId>cmAiSUGZo7w7IgK3gJ+ubuWdlXwffEhpnpdnkoJQ2hLP8EHBXZDau0mFCKCC3eWBtfL9V1Le4Mw=</HostId></Error>
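
Since the error complains about an empty access key, one thing that could be tried is passing the S3A credentials to Spark explicitly and failing early if either value is blank. The sketch below is a suggestion under assumptions, not a confirmed fix: s3a_spark_options is a hypothetical helper, it reads the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, and it relies on Spark copying spark.hadoop.* properties into the Hadoop configuration.

```python
import os
from typing import Dict, Mapping


def s3a_spark_options(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build spark.hadoop.* options for S3A from environment-style settings.

    Hypothetical helper: raises instead of silently passing an empty key.
    """
    access = env.get("AWS_ACCESS_KEY_ID", "").strip()
    secret = env.get("AWS_SECRET_ACCESS_KEY", "").strip()
    if not access or not secret:
        raise ValueError(
            "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must both be non-empty"
        )
    return {
        "spark.hadoop.fs.s3a.access.key": access,
        "spark.hadoop.fs.s3a.secret.key": secret,
    }


# Usage with PySpark (not executed here; needs pyspark plus the S3A jars):
# from pyspark.sql import SparkSession
# builder = SparkSession.builder.appName("s3a-check")
# for key, value in s3a_spark_options().items():
#     builder = builder.config(key, value)
# spark = builder.getOrCreate()
# rdd = spark.sparkContext.textFile("s3a://bucketname/objectname")
```

If the helper raises, the credentials are not reaching the process at all, which would point at the environment rather than at core-site.xml.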

NiFi 1.5 Untrusted Proxy on cluster

I've done my best to follow: https://pierrevillard.com/2016/11/29/apache-nifi-1-1-0-secured-cluster-setup/
I'm running nifi-1.5.0 and when I go to each of the pages I see an error like: Untrusted proxy CN=nifi-{1-3}.east.companyname.com, OU=NIFI.
I'm using ldap authentication, and just accepting the "invalid" certificate.
I've used an unrelated key-server to generate the keystore/truststore/certs as per the link above.
I also have the
nifi.security.needClientAuth=true
and
nifi.cluster.protocol.is.secure=true
set in the nifi.properties files on all of my nodes.
My authorizers file includes entries for all of the nodes, like:
<property name="Node Identity 1">CN=nifi-1.east.companyname.com, OU=NIFI</property>
<property name="Node Identity 2">CN=nifi-2.east.companyname.com, OU=NIFI</property>
<property name="Node Identity 3">CN=nifi-3.east.companyname.com, OU=NIFI</property>
Thanks in advance!
I would recommend configuring your authorizer in authorizers.xml to use a CompositeConfigurableUserGroupProvider that has two user group providers:
file-user-group-provider: this will be used to store the identities (certificate DNs) of your cluster nodes
ldap-user-group-provider: for your end users, that will be proxied when the cluster is replicating requests
Configure both of these UserGroupProviders, then configure the CompositeConfigurableUserGroupProvider to use the file-user-group-provider as the "Configurable User Group Provider" and the ldap-user-group-provider as "User Group Provider 1". Here is an example:
<authorizers>
<userGroupProvider>
<identifier>file-user-group-provider</identifier>
<class>org.apache.nifi.authorization.FileUserGroupProvider</class>
<property name="Users File">./conf/users.xml</property>
<property name="Legacy Authorized Users File"></property>
<property name="Initial User Identity 1">CN=nifi-1.east.companyname.com, OU=NIFI</property>
<property name="Initial User Identity 2">CN=nifi-2.east.companyname.com, OU=NIFI</property>
<property name="Initial User Identity 3">CN=nifi-3.east.companyname.com, OU=NIFI</property>
</userGroupProvider>
<userGroupProvider>
<identifier>ldap-user-group-provider</identifier>
<class>org.apache.nifi.ldap.tenants.LdapUserGroupProvider</class>
<!-- ... configure this to match the settings in login-identity-providers.xml ... -->
</userGroupProvider>
<userGroupProvider>
<identifier>composite-configurable-user-group-provider</identifier>
<class>org.apache.nifi.authorization.CompositeConfigurableUserGroupProvider</class>
<property name="Configurable User Group Provider">file-user-group-provider</property>
<property name="User Group Provider 1">ldap-user-group-provider</property>
</userGroupProvider>
<accessPolicyProvider>
<identifier>file-access-policy-provider</identifier>
<class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
<property name="User Group Provider">composite-configurable-user-group-provider</property>
<property name="Authorizations File">./conf/authorizations.xml</property>
<property name="Initial Admin Identity"></property>
<property name="Legacy Authorized Users File"></property>
<property name="Node Identity 1">CN=nifi-1.east.companyname.com, OU=NIFI</property>
<property name="Node Identity 2">CN=nifi-2.east.companyname.com, OU=NIFI</property>
<property name="Node Identity 3">CN=nifi-3.east.companyname.com, OU=NIFI</property>
</accessPolicyProvider>
<authorizer>
<identifier>managed-authorizer</identifier>
<class>org.apache.nifi.authorization.StandardManagedAuthorizer</class>
<property name="Access Policy Provider">file-access-policy-provider</property>
</authorizer>
</authorizers>
Configure this on each node, then remove users.xml and authorizations.xml and restart NiFi on each node. (This is necessary so that users.xml and authorizations.xml are recreated with your node identities set up to act as proxies, which will not happen if those files already exist with data.) If done correctly, each node should allow the other cluster nodes to authenticate using their client certificates (from their keystore.jks), and each node will be authorized to act as a proxy. That means that when an end user is talking to one node, the interaction will be replicated to all nodes in the cluster, which is what you want.
You should be able to set nifi.security.needClientAuth=false. Certificate-based authentication will still work, it just won't be required (i.e., for the initial communication from an end-user to a node, LDAP credentials will be enough).
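For the ldap-user-group-provider block, a minimal sketch might look like the following. The property names are the ones the NiFi Admin Guide lists for LdapUserGroupProvider; every value (URL, manager DN, search base, identity attribute) is a placeholder assumption and should mirror whatever you already have in login-identity-providers.xml.

```xml
<!-- Sketch only: all values below are hypothetical placeholders. -->
<userGroupProvider>
    <identifier>ldap-user-group-provider</identifier>
    <class>org.apache.nifi.ldap.tenants.LdapUserGroupProvider</class>
    <property name="Authentication Strategy">SIMPLE</property>
    <property name="Manager DN">cn=manager,dc=example,dc=com</property>
    <property name="Manager Password">changeme</property>
    <property name="Referral Strategy">FOLLOW</property>
    <property name="Connect Timeout">10 secs</property>
    <property name="Read Timeout">10 secs</property>
    <property name="Url">ldap://ldap.example.com:389</property>
    <property name="Sync Interval">30 mins</property>
    <property name="User Search Base">ou=people,dc=example,dc=com</property>
    <property name="User Object Class">person</property>
    <property name="User Search Scope">ONE_LEVEL</property>
    <property name="User Identity Attribute">uid</property>
</userGroupProvider>
```

The identities this provider returns need to match the identity NiFi assigns at login, so check the "Identity Strategy" in login-identity-providers.xml if users authenticate but are not recognized. This also assumes nifi.security.user.authorizer in nifi.properties points at managed-authorizer.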
Hope this helps!
Reference: NiFi Admin Guide

How to configure LDAP on Spark-Thrift server on AWS EMR?

Note that we are not talking about hiveserver2, or hive-thrift server here.
If anyone has experience with this, I want to configure LDAP auth on spark-thrift server. I am using AWS EMR as my cluster.
I am able to start the server and query using it, but without any username or password. Not even sure where to specify authentication related properties. There's just very little documentation on this stuff.
Looking forward to hearing from anyone who has experience doing this.
Copy the hive-site.xml from your ~/hive/conf directory to your ~/spark/conf/ directory.
You also need to configure egress rules that allow your EMR cluster to connect to your LDAP server's IP and port.
As per the official HiveServer2 documentation:
https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2
Set the following for LDAP mode:
hive.server2.authentication.ldap.url – LDAP URL (for example, ldap://hostname.com:389).
hive.server2.authentication.ldap.baseDN – LDAP base DN. (Optional for AD.)
hive.server2.authentication.ldap.Domain – LDAP domain. (Hive 0.12.0 and later.)
See User and Group Filter Support with LDAP Atn Provider in HiveServer2 for other LDAP configuration parameters in Hive 1.3.0 and later.
hive-site.xml – changes :
<property>
<name>hive.server2.authentication</name>
<value>LDAP</value>
<description>
Expects one of [nosasl, none, ldap, kerberos, pam, custom].
Client authentication types.
NONE: no authentication check
LDAP: LDAP/AD based authentication
KERBEROS: Kerberos/GSSAPI authentication
CUSTOM: Custom authentication provider
(Use with property hive.server2.custom.authentication.class)
PAM: Pluggable authentication module
NOSASL: Raw transport
</description>
</property>
<property>
<name>hive.server2.authentication.ldap.url</name>
<value>ldaps://changemetoyour.ldap.url:5000</value>
<description>
LDAP connection URL(s),
this value could contain URLs to multiple LDAP server instances for HA,
each LDAP URL is separated by a SPACE character. URLs are used in the
order specified until a connection is successful.
</description>
</property>
<property>
<name>hive.server2.authentication.ldap.baseDN</name>
<value>dc=changeme,dc=mydomain,dc=com</value>
<description>LDAP base DN</description>
</property>
<property>
<name>hive.server2.authentication.ldap.Domain</name>
<value/>
<description/>
</property>
<property>
<name>hive.server2.authentication.ldap.groupDNPattern</name>
<value/>
<description>
COLON-separated list of patterns to use to find DNs for group entities in this directory.
Use %s where the actual group name is to be substituted for.
For example: CN=%s,CN=Groups,DC=subdomain,DC=domain,DC=com.
</description>
</property>
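For an Active Directory server, where the base DN is optional, a minimal sketch could set only the authentication mode, the URL, and the domain. The host name and domain below are placeholders, not values from the question.

```xml
<!-- Sketch only: ad.example.com and example.com are hypothetical placeholders. -->
<property>
    <name>hive.server2.authentication</name>
    <value>LDAP</value>
</property>
<property>
    <name>hive.server2.authentication.ldap.url</name>
    <value>ldaps://ad.example.com:636</value>
</property>
<property>
    <name>hive.server2.authentication.ldap.Domain</name>
    <value>example.com</value>
</property>
```

With a domain configured, a user who logs in as a bare name such as alice should, as far as I understand the Hive LDAP provider, be bound as alice@example.com, so clients only need to supply the short user name and password.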

Spring Integration & Retry: Do I need a separate retry bean for each service-activator?

I've got a spring integration pipeline and I've got a number of different service activators that I want to enable retry for.
I want to use the same retry policy (i.e. the number of retries, back-off policy, etc.). Can I just have one bean that implements the retry policy and use it for several different service activators, or does each service activator need its own retry bean? In other words, can I define a single bean, "retryWithBackoffAdviceSession", and put it in the request-handler-advice-chain of several service activators, or does each one need its own?
Here's an example of the retry policy I'm using.
<bean id="retryWithBackoffAdviceSession" class="org.springframework.integration.handler.advice.RequestHandlerRetryAdvice">
<property name="retryTemplate">
<bean class="org.springframework.retry.support.RetryTemplate">
<property name="backOffPolicy">
<bean class="org.springframework.retry.backoff.ExponentialBackOffPolicy">
<property name="initialInterval" value="2000" /> <!-- 2 seconds -->
<property name="multiplier" value="2" /> <!-- double the wait each time -->
<property name="maxInterval" value="30000"/> <!-- maximum of 30 seconds -->
</bean>
</property>
<property name="retryPolicy">
<bean class="org.springframework.retry.policy.SimpleRetryPolicy">
<property name="maxAttempts" value="3"/>
</bean>
</property>
</bean>
</property>
<property name="recoveryCallback">
<bean class="org.springframework.integration.handler.advice.ErrorMessageSendingRecoverer">
<constructor-arg ref="myErrorChannel"/>
</bean>
</property>
</bean>
As a follow-up question: if my service activator is running on an executor channel, does it somehow keep track of the retries per thread? Or is there something I need to do to ensure that there isn't cross-talk between different threads retrying different messages on the same thread-safe service activator?
You're on the right track: RequestHandlerRetryAdvice is thread-safe, so you can use the same bean from several places.
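As a minimal sketch of what that sharing might look like, the two service activators below both reference the single retryWithBackoffAdviceSession bean from the question. The channel names, the serviceA and serviceB beans, and the handle method are hypothetical, and the int: prefix is assumed to be bound to the Spring Integration namespace.

```xml
<!-- Both endpoints reuse the one advice bean from the question; every other
     name here is a placeholder for illustration. -->
<int:service-activator input-channel="channelA" ref="serviceA" method="handle">
    <int:request-handler-advice-chain>
        <ref bean="retryWithBackoffAdviceSession"/>
    </int:request-handler-advice-chain>
</int:service-activator>

<int:service-activator input-channel="channelB" ref="serviceB" method="handle">
    <int:request-handler-advice-chain>
        <ref bean="retryWithBackoffAdviceSession"/>
    </int:request-handler-advice-chain>
</int:service-activator>
```

With the stateless retry configured in the question, each message's attempts run inside a single RetryTemplate.execute call on the thread that is handling that message, so attempt counts should not leak between threads on an executor channel.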
I had to implement a retry mechanism in my project as well, and I created my own implementation:
Retry using AOP
This works like a charm.
You can just annotate your methods with the @Retry annotation, provide whatever config you want, and it's done.

Resources