Understanding the INSERT INTO Statement
This section describes how to use the INSERT INTO
statement to insert
or overwrite rows in nested HPE Ezmeral Data Fabric Database JSON tables, using the Hive
connector.
Single-row insert
You can use the INSERT INTO
statement to insert a single table row into a
nested HPE Ezmeral Data Fabric Database table using one of two methods.
For example, imagine that you have the following Hive HPE Ezmeral Data Fabric Database JSON table, nested_data_insert
:
CREATE TABLE nested_data_insert
(
entry STRING,
num INT,
postal_addresses MAP <STRING,
struct <USER_ID:STRING,ADDRESS:STRING,ZIP:STRING,COUNTRY:STRING>>
)
stored BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler' tblproperties
(
"maprdb.table.name" = "/nested_data_insert",
"maprdb.column.id" = "entry"
);
- You can insert the new row into your table by using a dummy table:
WITH dummy_table AS (SELECT '001' AS KEY, '1' AS num, MAP ('Adam', Named_struct ('user_id', '1', 'address', '3205 Woodlake ct', 'zip', '45040', 'country', 'Usa'), 'Wilfred', Named_struct ('user_id', '2', 'address', '777 Brockton Avenue', 'zip', '34000', 'country', 'Ita')) AS postal_addresses) INSERT INTO nested_data_insert SELECT * FROM dummy_table;
- Alternatively, you can insert the new row into your table by using a
SELECT
statement:INSERT INTO TABLE nested_data_insert SELECT '002', '2', MAP ('Bill', Named_struct ('user_id', '1', 'address', '328 Virginia Ave', 'zip', '54956', 'country', 'Bol'), 'Stiv', Named_struct ('user_id', '2', 'address', 'Schererville', 'zip', '46375', 'country', 'Efi'));
- Verify the insertion into the Hive table by using the
SELECT * FROM
syntax.SELECT * FROM nested_data_insert;
Sample output:
Table 1. Result of the insert queries 1.1 and 1.2: entry num postal_address USER_ID ADDRESS ZIP COUNTRY 001 1 Adam 1 3205 Woodlake ct 45040 Usa Wilfred 2 777 Brockton Avenue 34000 Ita 002 2 Bill 1 328 Virginia Ave 54956 Bol Stiv 2 Schererville 46375 Efi - Verify the insertion into the HPE Ezmeral Data Fabric Database JSON table data
using the
find
statement:find '/nested_data_insert' { "Adam": { "user_id": "1", "address": "3205 Woodlake ct", "zip": "45040", "country": "Usa" }, "Wilfred": { "user_id": "2", "address": "777 Brockton Avenue", "zip": "34000", "country": "Ita" } } { "Bill": { "user_id": "1", "address": "328 Virginia Ave", "zip": "54956", "country": "Bol" }, "Stiv": { "user_id": "2", "address": "Schererville", "zip": "46375", "country": "Efi" } }
Multiple-row insert
Now imagine that you want to insert three rows of data into nested_data_insert
.
- You can insert the new rows into your table by using a dummy table:
WITH dummy_table AS (SELECT '003' AS KEY, '3' AS num, MAP ('Rony', Named_struct ('user_id', '1', 'address', '4333 Backer str', 'zip', '12311', 'country', 'Hun')) AS postal_addresses UNION ALL SELECT '004' AS KEY, '4' AS num, MAP ('Ivan', Named_struct ('user_id', '1', 'address', '833 Bridle Avenue', 'zip', '95111', 'country', 'CA')) AS postal_addresses UNION ALL SELECT '005' AS KEY, '5' AS num, MAP ('Ivan', Named_struct ('user_id', '1', 'address', '664 Devon Ave', 'zip', '92021', 'country', 'Tog')) AS postal_addresses) INSERT INTO nested_data_insert SELECT * FROM dummy_table;
- Alternatively, you can insert the new rows into your table by using a
SELECT
statement:INSERT INTO TABLE nested_data_insert SELECT '006', '6', MAP ('Rony', Named_struct ('user_id', '1', 'address', '150 National City', 'zip', '91950', 'country', 'Hun')) UNION ALL SELECT '007', '7', MAP ('Tomason', Named_struct ('user_id', '1', 'address', '272 Ocean Circle' , 'zip', '92801', 'country', 'CA')) UNION ALL SELECT '008', '8', MAP ('Davin', Named_struct ('user_id', '1', 'address', '81 Augusta Ave', 'zip', '93905', 'country', 'CA'));
- Verify the insertion into the Hive table by using the
SELECT * FROM
syntax.SELECT * FROM nested_data_insert WHERE entry > '002' ;
Sample output:
Table 2. Result of the insert queries 2.1 and 2.2: entry num postal_address USER_ID ADDRESS ZIP COUNTRY 003 3 Rony 1 4333 Backer str 12311 Hun 004 4 Ivan 1 833 Bridle Avenue 95111 CA 005 5 Ivan 1 664 Devon Ave. 92021 Tog 006 6 Rony 1 150 National City 91950 Hun 007 7 Tomason 1 272 Ocean Circle 92801
CA 008 8 Davin 1 81 Augusta Ave 93905 CA - Verify the insertion into the HPE Ezmeral Data Fabric Database JSON table data
using the
find
statement:find '/nested_data_insert' { "_id": "003", "num": { "$numberInt": 3 }, "postal_addresses": { "Rony": { "address": "4333 Backer str", "country": "Hun", "user_id": "1", "zip": "12311" } } } { "_id": "004", "num": { "$numberInt": 4 }, "postal_addresses": { "Ivan": { "address": "833 Bridle Avenue", "country": "CA", "user_id": "1", "zip": "95111" } } } { "_id": "005", "num": { "$numberInt": 5 }, "postal_addresses": { "Ivan": { "address": "664 Devon Ave", "country": "Tog", "user_id": "1", "zip": "92021" } } } { "_id": "006", "num": { "$numberInt": 6 }, "postal_addresses": { "Rony": { "address": "150 National City", "country": "Hun", "user_id": "1", "zip": "91950" } } } { "_id": "007", "num": { "$numberInt": 7 }, "postal_addresses": { "Tomason": { "address": "272 Ocean Circle", "country": "CA", "user_id": "1", "zip": "92801" } } } { "_id": "008", "num": { "$numberInt": 8 }, "postal_addresses": { "Davin": { "address": "81 Augusta Ave", "country": "CA", "user_id": "1", "zip": "93905" } } }
Overwriting data
Still using sample table nested_data_insert
, you can use the INSERT
statement on a dummy table to overwrite one or more complete rows.
nested_data_insert (001)
with
new values, use the following syntax:
WITH dummy_table AS
(SELECT '001' AS KEY,
'1' AS num,
MAP ('newAdam',
Named_struct ('user_id', '1', 'address', 'newAdress', 'zip', 'newZip', 'country', 'newCountry')) AS postal_addresses)
INSERT INTO nested_data_insert
SELECT *
FROM dummy_table;
- Verify the data into the Hive table by using the
SELECT * FROM
syntax.hive> SELECT * FROM nested_data_insert WHERE entry = '001';
Sample output:
Table 3. Result of the insert query 3.1: entry num postal_address USER_ID ADDRESS ZIP COUNTRY 001 1 newAdam 1 newAddress newZip newCountry - Verify the data in the HPE Ezmeral Data Fabric Database JSON table data using
the
findbyid
statement:findbyid '/nested_data_insert' --id 001 { "_id": "001", "num": { "$numberInt": 1 }, "postal_addresses": { "newAdam": { "address": "newAdress", "country": "newCountry", "user_id": "1", "zip": "newZip" } } }
003
and
004
rows in nested_data_insert
with new values:
WITH dummy_table AS (
SELECT '003' AS KEY,
'3' AS num,
MAP ('newName1',
Named_struct ('user_id', '1', 'address', 'newAdress1', 'zip', 'newZip1', 'country', 'newCountry1')) AS postal_addresses
UNION ALL
SELECT '004' AS KEY,
'4' AS num,
MAP ('newName2',
Named_struct ('user_id', '1', 'address', 'newAdress2', 'zip', 'newZip2', 'country', 'newCountry2')) AS postal_addresses)
INSERT INTO nested_data_insert
SELECT * FROM dummy_table;
- Verify the data in the Hive table by using the
SELECT * FROM
syntax.hive> SELECT * FROM nested_data_insert WHERE entry IN ('003', '004');
Sample output:
Verify the data in the HPE Ezmeral Data Fabric Database JSON table data using theTable 4. Result of the insert query 3.3: entry num postal_address USER_ID ADDRESS ZIP COUNTRY 003 3 newName1 1 newAddress1 newZip1 newCountry1 004 4 newName2 1 newAddress2 newZip2 newCountry2 findbyid
statement:findbyid '/nested_data_insert' --id 003 { "_id": "003", "num": { "$numberInt": 3 }, "postal_addresses": { "newName1": { "address": "newAdress1", "country": "newCountry1", "user_id": "1", "zip": "newZip1" } } } findbyid '/nested_data_insert' --id 004 { "_id": "004", "num": { "$numberInt": 4 }, "postal_addresses": { "newName2": { "address": "newAdress2", "country": "newCountry2", "user_id": "1", "zip": "newZip2" } } }
WARNINGIf you exclude columns both from the SELECT statement in your INSERT statement and from the table schema, the value of this column changes to NULL.
nested_data_insert
(001)
with new values and overwrite the num
column to
NULL
:
WITH dummy_table AS
(SELECT '001' AS KEY,
MAP ('newAdam',
Named_struct ('user_id', '1', 'address', 'newAdress', 'zip', 'newZip', 'country', 'newCountry')) AS postal_addresses)
INSERT INTO nested_data_insert (entry, postal_addresses)
SELECT * FROM dummy_table;
- Verify the data in the Hive table by using the
SELECT * FROM
syntax.hive> SELECT * FROM nested_data_insert WHERE entry = '001';
Sample output:
Table 5. Result of the insert query 3.5: entry num postal_address USER_ID ADDRESS ZIP COUNTRY 001 NULL newAdam 1 newAddress newZip newCountry - Verify the data in the HPE Ezmeral Data Fabric Database JSON table
(
num
row is not present):findbyid '/nested_data_insert' --id 001 { "_id": "001", "postal_addresses": { "newAdam": { "address": "newAdress", "country": "newCountry", "user_id": "1", "zip": "newZip" } } }