Securely Providing ADLS Credentials

You can provide your ADLS credentials securely by storing them in a jceks keystore file with the Hadoop credential provider, rather than passing them as readable configuration on the command line.

Procedure

  1. Generate a jceks file for ADLS authorization:
    hadoop credential create dfs.adls.oauth2.client.id -provider jceks://hdfs/user/USER_NAME/adlskeyfile.jceks -value <client ID>
    hadoop credential create dfs.adls.oauth2.credential -provider jceks://hdfs/user/USER_NAME/adlskeyfile.jceks -value <client secret>
    hadoop credential create dfs.adls.oauth2.refresh.url -provider jceks://hdfs/user/USER_NAME/adlskeyfile.jceks -value <refresh URL>
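  2. Run the DistCp example using the jceks file. The -D option, shown in square brackets, is optional when the provider path is already set in core-site.xml (step 3):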
    hadoop distcp
    [-D hadoop.security.credential.provider.path=jceks://hdfs/user/USER_NAME/adlskeyfile.jceks]
    hdfs://<NameNode Hostname>:9001/user/foo/007020615
    adl://<Account Name>.azuredatalakestore.net/testDir/
  3. Configure the core-site.xml file to use the jceks file:
    <property>
      <name>hadoop.security.credential.provider.path</name>
      <value>jceks://hdfs/user/USER_NAME/adlskeyfile.jceks</value>
      <description>Path to interrogate for protected credentials.</description>
    </property>
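
The following is a minimal sketch of a variation on step 1, not part of the documented procedure. It relies on the fact that hadoop credential create prompts for the value when -value is omitted, which keeps the secret out of the shell history, and then uses hadoop credential list to confirm that the three aliases were stored. The run helper and the DRY_RUN variable are hypothetical names introduced here for illustration; USER_NAME is the same placeholder used in the steps above.

```shell
#!/usr/bin/env bash
# Sketch only: run() and DRY_RUN are illustrative helpers, not Hadoop features.

USER_NAME="${USER_NAME:-USER_NAME}"   # placeholder, as in the procedure
PROVIDER="jceks://hdfs/user/${USER_NAME}/adlskeyfile.jceks"

# In dry-run mode (the default) print the command; otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Without -value, hadoop prompts for each credential interactively.
for alias in dfs.adls.oauth2.client.id \
             dfs.adls.oauth2.credential \
             dfs.adls.oauth2.refresh.url; do
  run hadoop credential create "$alias" -provider "$PROVIDER"
done

# List the aliases held in the keystore to confirm they were created.
run hadoop credential list -provider "$PROVIDER"
```

With DRY_RUN left at its default the script only prints the commands it would run. Set DRY_RUN=0 on a host where the hadoop client is configured to execute them.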