How to Enable Hadoop RPC Protection

Goal 

Hadoop's Remote Procedure Call (RPC) security infrastructure is built on the Java Simple Authentication and Security Layer (SASL) APIs. SASL Quality of Protection (QOP) settings can be used to enable integrity checks and encryption for the Hadoop RPC protocols.

Java SASL provides the following QOP settings:

  • auth – Authentication only (the default). The client and server mutually authenticate during connection setup.
  • auth-int – Authentication and integrity. In addition to authentication, this setting guarantees the integrity of data exchanged between client and server.
  • auth-conf – Authentication, integrity, and confidentiality. This setting additionally guarantees that data exchanged between client and server is encrypted.
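
These levels are requested through the standard Java SASL API by setting the Sasl.QOP property when a client is created. The sketch below is illustrative only, not Hadoop's actual wiring: the mechanism, service name, and host are placeholders, and the callback handler is left empty.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

public class QopDemo {
    public static void main(String[] args) throws SaslException {
        // Request the strongest protection first; SASL negotiates down the list.
        Map<String, String> props = new HashMap<>();
        props.put(Sasl.QOP, "auth-conf,auth-int,auth");

        // Placeholder protocol/server names; the empty callback handler
        // would normally supply credentials during negotiation.
        SaslClient client = Sasl.createSaslClient(
                new String[] {"DIGEST-MD5"},
                null, "hdfs", "namenode.example.com", props,
                callbacks -> { /* supply credentials here */ });

        System.out.println("Mechanism: " + client.getMechanismName());
    }
}
```

When client and server have no QOP level in common, negotiation fails; that is the failure mode the Datameer section below describes.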

Hadoop lets cluster administrators control the quality of protection with the configuration parameter hadoop.rpc.protection in core-site.xml. The parameter is optional; if it is not set, the default QOP of auth (authentication only) is used. The valid values for this parameter are:

  • authentication = auth
  • integrity = auth-int
  • privacy = auth-conf

The default is authentication only because integrity checks and encryption carry a performance cost.
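
The mapping above can be expressed directly in code. This is our own illustrative sketch (the class and method names are not Hadoop's), falling back to auth when the parameter is absent, which matches the documented default:

```java
import java.util.Map;

public class RpcProtection {

    // hadoop.rpc.protection value -> Java SASL QOP token
    private static final Map<String, String> TO_SASL_QOP = Map.of(
            "authentication", "auth",
            "integrity", "auth-int",
            "privacy", "auth-conf");

    // Falls back to "auth" when the parameter is absent,
    // mirroring the default described above.
    public static String toSaslQop(String hadoopRpcProtection) {
        if (hadoopRpcProtection == null) {
            return "auth";
        }
        return TO_SASL_QOP.getOrDefault(
                hadoopRpcProtection.trim().toLowerCase(), "auth");
    }

    public static void main(String[] args) {
        System.out.println(toSaslQop("privacy"));  // prints "auth-conf"
    }
}
```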

Learn

Configuring the RPC Protection Level for Datameer 

If you enable Hadoop RPC protection on your cluster, Datameer must be configured accordingly. Otherwise, communication between Datameer and cluster services fails with the error:

SASL negotiation failure
javax.security.sasl.SaslException: No common protection layer between client and server

To configure Hadoop RPC protection in Datameer, add the property hadoop.rpc.protection=<QOP level configured at cluster side> to the das-common.properties file and restart the Datameer service.

Configuring QOP Levels for Hive

If you enable Hadoop RPC protection on your cluster, every service on the cluster should be configured accordingly to avoid communication issues. In HiveServer2, the QOP level is controlled by the hive.server2.thrift.sasl.qop property in the hive-site.xml file. If this property isn't specified, the default QOP level is auth.

To use the same QOP for HiveServer2, set this property to match the level configured for your cluster services and restart the Hive services. Note that Hive uses the Java SASL QOP values (auth, auth-int, auth-conf), not the Hadoop names (authentication, integrity, privacy).

It might also be required to add hadoop.rpc.protection=<QOP level configured at cluster side> as a custom property of the Datameer Hive connection in order to execute export jobs.

Configuration Example

core-site.xml

<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>

hive-site.xml

<property>
  <name>hive.server2.thrift.sasl.qop</name>
  <value>auth-conf</value>
</property>

das-common.properties

#Enable Hadoop RPC Encryption
hadoop.rpc.protection=privacy

Datameer Hive Connection custom properties (for Export jobs)

hadoop.rpc.protection=privacy
