Viewing Spark Job Results on Livy

This topic describes how to view results for submitted Spark jobs on Livy in HPE Ezmeral Runtime Enterprise.

You can view the results for submitted Spark jobs in the following ways:
  1. Using the notebook UI of an interactive notebook such as Jupyter Notebook.
  2. Accessing the Livy UI through the Livy HTTP endpoint. For example: https://xx-xxx-xxx.xx.lab:10075/ui or https://xx-xxx-xxx.xx.lab:10075
  3. Using a REST API.

    For example, perform the following steps to view the results for submitted jobs using a REST API:

    1. Create a Livy session.
      curl -k -s \
          -X POST \
          -H "Content-Type:application/json" \
          -d '{"kind": "spark"}' \
          -u "username:password" \
          https://xx-xxx-xxx.xx.lab:10075/sessions | jq
      
    2. Execute code in the newly created Livy session (session 0 in this example).
      curl -k -s \
          -X POST \
          -H "Content-Type:application/json" \
          -d '{"code": "sc.parallelize(List(1,2,3)).reduce(_*_)"}' \
          -u "username:password" \
          https://xx-xxx-xxx.xx.lab:10075/sessions/0/statements | jq
      
      When you submit a block of statements, the Livy server assigns an id to that block and returns it in the response:
      {
        "id": 0,
        "code": "sc.parallelize(List(1,2,3)).reduce(_*_)",
        "state": "waiting",
        "output": null,
        "progress": 0
      }
      
    3. You can view the output either for a particular block of statements or for all of the statements submitted in a session:

      1. Run the following command to see the state and output of a specific block of statements in a specific session. Replace <session id> with the id of the session and <id> with the id of the block of statements.
        curl -k -s \
            -u "username:password" \
            https://xx-xxx-xxx.xx.lab:10075/sessions/<session id>/statements/<id> | jq
        
        Example output for the block of statements with id 0 in session 0:
        {
          "id": 0,
          "code": "sc.parallelize(List(1,2,3)).reduce(_*_)",
          "state": "available",
          "output": {
            "status": "ok",
            "execution_count": 0,
            "data": {
              "text/plain": "res0: Int = 6\n"
            }
          },
          "progress": 1
        }
        
      2. Run the following command to see the state and output of all of the blocks of statements submitted in a specific session.
        curl -k -s \
            -u "username:password" \
            https://xx-xxx-xxx.xx.lab:10075/sessions/0/statements | jq
        
        Example output for session 0, in which two blocks of statements were submitted:
        {
          "total_statements": 2,
          "statements": [
            {
              "id": 0,
              "code": "sc.parallelize(List(1,2,3)).reduce(_*_)",
              "state": "available",
              "output": {
                "status": "ok",
                "execution_count": 0,
                "data": {
                  "text/plain": "res0: Int = 6\n"
                }
              },
              "progress": 1
            },
            {
              "id": 1,
              "code": "sc.parallelize(List(10,20,30)).reduce(_*_)",
              "state": "available",
              "output": {
                "status": "ok",
                "execution_count": 1,
                "data": {
                  "text/plain": "res2: Int = 6000\n"
                }
              },
              "progress": 1
            }
          ]
        }
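
A statement is first reported in the "waiting" state, and its output is present only once the state is "available", so a script typically has to poll the statement endpoint before reading the result. The following is a minimal sketch of such a loop, not a definitive implementation. The function names fetch_statement and wait_for_statement and the LIVY_URL and LIVY_AUTH variables are hypothetical names introduced for illustration. The sketch assumes that curl and jq are installed and that the endpoint and credentials match the placeholders used in the commands above.

```shell
# Hypothetical helpers for illustration; adjust the endpoint and credentials.
LIVY_URL="${LIVY_URL:-https://xx-xxx-xxx.xx.lab:10075}"
LIVY_AUTH="${LIVY_AUTH:-username:password}"

# Fetch one statement as JSON.
# Usage: fetch_statement <session id> <statement id>
fetch_statement() {
  curl -k -s -u "$LIVY_AUTH" "$LIVY_URL/sessions/$1/statements/$2"
}

# Poll a statement until it reaches the "available" state, then print its
# text/plain output. Returns non-zero on failure or timeout.
# Usage: wait_for_statement <session id> <statement id> [attempts] [delay in seconds]
wait_for_statement() {
  local session_id="$1"
  local statement_id="$2"
  local attempts="${3:-30}"
  local delay="${4:-2}"
  local body state status
  local i=0

  while [ "$i" -lt "$attempts" ]; do
    body="$(fetch_statement "$session_id" "$statement_id")"
    state="$(printf '%s' "$body" | jq -r '.state' 2>/dev/null)"
    case "$state" in
      available)
        status="$(printf '%s' "$body" | jq -r '.output.status')"
        if [ "$status" = "ok" ]; then
          # Print only the result text, for example "res0: Int = 6".
          printf '%s' "$body" | jq -r '.output.data["text/plain"] // empty'
          return 0
        fi
        # The code ran but did not succeed; show the raw output object.
        printf '%s' "$body" | jq '.output' >&2
        return 1
        ;;
      error|cancelled)
        echo "statement $statement_id ended in state: $state" >&2
        return 1
        ;;
    esac
    # Any other state (for example "waiting" or "running"): retry after a pause.
    sleep "$delay"
    i=$((i + 1))
  done

  echo "timed out waiting for statement $statement_id" >&2
  return 1
}
```

For example, after submitting the statement in step 2, running `wait_for_statement 0 0` would poll statement 0 in session 0 up to 30 times at 2-second intervals and print the result text once it is available. Keeping the HTTP call in its own function makes it easy to change the authentication options or to stub out the request when testing.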