Understanding the DELETE FROM Operation
In EEP 6.3.1
and later, you can use the DELETE FROM
operation with HPE Ezmeral Data Fabric Database JSON tables.
Delete All Data from a Table
To delete all data from a HPE Ezmeral Data Fabric DatabaseJSON table use the following operator:
DELETE FROM <table_name>;
Example. In this example we create a table, insert data, and delete all rows:
DROP TABLE IF EXISTS customer;
CREATE TABLE customer(doc_id STRING, first_name STRING, last_name STRING) STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler'
TBLPROPERTIES("maprdb.table.name" = "/customer","maprdb.column.id" = "doc_id");
INSERT INTO TABLE customer VALUES ("001", "Max", "Born"), ("002", "Demmy", "John"), ("003", "Robby", "Smart");
SELECT doc_id, first_name, last_name FROM customer;
doc_id | first_name | last_name |
---|---|---|
001 | Max | Born |
002 | Demmy | John |
003 | Robby | Smart |
The following query gives an empty set. Deletions are supported only for HPE Ezmeral Data Fabric Database JSON tables and transactional tables.
DELETE FROM customer;
SELECT doc_id, first_name, last_name FROM customer;
Example. In this example, we try to delete data from a non-transactional and non-Data Fabric Database JSON table:
DROP TABLE IF EXISTS simple_data;
CREATE TABLE simple_data (id INT);
INSERT INTO TABLE simple_data VALUES (1), (2), (3);
DELETE FROM simple_data;
The result is:
FAILED: SemanticException Operation is not supported. Table is nor ACID neither MapRDbJSON
Delete a Single Row from a Table
To delete a single row from a HPE Ezmeral Data Fabric Database JSON table, use the following syntax:
DELETE FROM <table_name> WHERE <id> = <value>;
Where:
<table_name>
is the HPE Ezmeral Data Fabric Database JSON
table.
<id>
is a key column of the Data Fabric
Database JSON table. It corresponds to the maprdb.column.id
property.
<value>
is the value to be deleted.
Example. In this example, we create a table, insert data, and delete a single row:
DROP TABLE IF EXISTS customer;
CREATE TABLE customer(doc_id STRING, first_name STRING, last_name STRING) STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler'
TBLPROPERTIES("maprdb.table.name" = "/customer","maprdb.column.id" = "doc_id");
INSERT INTO customer VALUES ("001", "Max", "Born"), ("002", "Demmy", "John"), ("003", "Robby", "Smart");
SELECT doc_id, first_name, last_name FROM customer;
doc_id | first_name | last_name |
---|---|---|
001 | Max | Born |
002 | Demmy | John |
003 | Robby | Smart |
The following query deletes a single row using the WHERE
clause:
DELETE FROM customer WHERE doc_id = "002";
SELECT doc_id, first_name, last_name FROM customer;
doc_id | first_name | last_name |
---|---|---|
001 | Max | Born |
003 | Robby | Smart |
Note. Deletions are supported only for key columns of HPE Ezmeral Data Fabric Database JSON tables.
Example. In this example, we try to use a column other than a key column of the Data Fabric Database JSON table in deletion.
DROP TABLE IF EXISTS customer;
CREATE TABLE customer(doc_id STRING, first_name STRING, last_name STRING) STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler'
TBLPROPERTIES("maprdb.table.name" = "/customer","maprdb.column.id" = "doc_id");
INSERT INTO customer VALUES ("001", "Max", "Born"), ("002", "Demmy", "John"), ("003", "Robby", "Smart");
SELECT doc_id, first_name, last_name FROM customer;
Column first_name
is not the key column of the table.
DELETE FROM customer WHERE first_name = "Max";
The result is:
FAILED: SemanticException Deletion over column first_name is forbidden. Use only key column of MapR Db Json table: doc_id
Delete Several Rows from a Table
To delete several rows from a table, use the following syntax:
DELETE FROM <table_name> WHERE <id> IN (<value1>, <value2>,
...);
Where:
<table_name>
is the HPE Ezmeral Data Fabric Database JSON
table.
<id>
is a key column of the MapR Database JSON table. It corresponds to
the maprdb.column.id
property.
<value1>, <value2>,
are values to be deleted.
Example. In this example, we create a table, insert data, and delete several rows:
DROP TABLE IF EXISTS customer;
CREATE TABLE customer(doc_id STRING, first_name STRING, last_name STRING) STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler'
TBLPROPERTIES("maprdb.table.name" = "/customer","maprdb.column.id" = "doc_id");
INSERT INTO TABLE customer VALUES ("001", "Max", "Born"), ("002", "Demmy", "John"), ("003", "Robby", "Smart");
SELECT doc_id, first_name, last_name FROM customer;
doc_id | first_name | last_name |
---|---|---|
001 | Max | Born |
002 | Demmy | John |
003 | Robby | Smart |
The following query deletes several rows using the WHERE ... IN
clause:
DELETE FROM customer WHERE doc_id IN ("001", "002");
SELECT doc_id, first_name, last_name FROM customer;
doc_id | first_name | last_name |
---|---|---|
003 | Robby | Smart |
Delete All Rows in a Table Except Listed Rows
To delete all rows from a table except a listed row, use the following syntax:
DELETE FROM <table_name> WHERE <id> NOT IN (<value1>, <value2>,
...);
Where:
<table_name>
is the HPE Ezmeral Data Fabric Database JSON
table.
<id>
is a key column of the MapR Database JSON table. It corresponds to
the maprdb.column.id
property.
<value1>, <value2>,
are values to be preserved.
Example. In this example, we create a table, insert data, and delete all rows except the listed rows:
DROP TABLE IF EXISTS customer;
CREATE TABLE customer(doc_id STRING, first_name STRING, last_name STRING) STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler'
TBLPROPERTIES("maprdb.table.name" = "/customer","maprdb.column.id" = "doc_id");
INSERT INTO TABLE customer VALUES ("001", "Max", "Born"), ("002", "Demmy", "John"), ("003", "Robby", "Smart");
SELECT doc_id, first_name, last_name FROM customer;
doc_id | first_name | last_name |
---|---|---|
001 | Max | Born |
002 | Demmy | John |
003 | Robby | Smart |
The following query deletes all rows except the listed rows:
DELETE FROM customer WHERE doc_id NOT IN ("003");
SELECT doc_id, first_name, last_name FROM customer;
doc_id | first_name | last_name |
---|---|---|
003 | Robby | Smart |
Limitations of the DELETE FROM Operation
The following are three limitations of the current implementation:
- The current implementation does not support arbitrary conditions in the
WHERE
clause of theDELETE
statement even if a key column is used.Example. In this example,
DELETE FROM
is used with an arbitrary condition:
The result is:DELETE FROM customer WHERE doc_id == "003" OR doc_id <> "005";
FAILED: SemanticException This condition is not supported for MapR Db Json deletions. Supported WHERE clauses are: <id> = value, <id> IN (value1, value2, ...), <id> NOT IN (value1, value2, ...)
- The current implementation does not support subqueries in the
WHERE
clause. - The current implementation does not support deletions in the
MERGE
statement.