Posted 05/28/2025
Optimizing bulk inserts of JSON data using a temporary table.
-- Version 1: direct insert (no temporary table)
CREATE OR REPLACE FUNCTION bulk_insert(
    p_data json
)
RETURNS INT[]
AS $$
DECLARE
    ids INT[];
BEGIN
    WITH inserted_users AS (
        INSERT INTO "user" (  -- "user" is a reserved word, so it must be quoted
            name,
            street,
            postal_code,
            city,
            created,
            country
        )
        SELECT
            (j.p_data->>'Name')::VARCHAR(150),
            (j.p_data->>'Street')::VARCHAR(100),
            (j.p_data->>'PostalCode')::VARCHAR(10),
            (j.p_data->>'City')::VARCHAR(100),
            (j.p_data->>'Created')::TIMESTAMP,
            (j.p_data->>'Country')::SMALLINT
        FROM json_array_elements(p_data) AS j(p_data)
        RETURNING id
    )
    SELECT array_agg(id) INTO ids FROM inserted_users;
    RETURN ids;
END;
$$ LANGUAGE plpgsql;
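For reference, here is how the function can be called. The payload shape follows the keys the function reads; the "user" table itself (with a generated id column) is assumed rather than shown in the post:

```sql
SELECT bulk_insert('[
  {"Name": "Jane Doe", "Street": "Main St 1", "PostalCode": "12345",
   "City": "Springfield", "Created": "2025-05-28 10:00:00", "Country": 1},
  {"Name": "John Doe", "Street": "Main St 2", "PostalCode": "12345",
   "City": "Springfield", "Created": "2025-05-28 10:05:00", "Country": 1}
]'::json);
-- Returns an INT[] containing the generated ids of the inserted rows.
```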
-- Version 2: optimized, staging the rows in a temporary table first.
-- Renamed to bulk_insert_temp so it does not overwrite the function above
-- (both would otherwise share the signature bulk_insert(json)).
CREATE OR REPLACE FUNCTION bulk_insert_temp(
    p_data json
)
RETURNS INT[]
AS $$
DECLARE
    ids INT[];
BEGIN
    -- Step 1: Create a temporary table
    CREATE TEMP TABLE temp_user (
        name        VARCHAR(150),
        street      VARCHAR(100),
        postal_code VARCHAR(10),
        city        VARCHAR(100),
        created     TIMESTAMP,
        country     SMALLINT
    );
    -- Step 2: Insert data into the temporary table
    INSERT INTO temp_user (
        name, street, postal_code, city, created, country
    )
    SELECT
        (j.p_data->>'Name')::VARCHAR(150),
        (j.p_data->>'Street')::VARCHAR(100),
        (j.p_data->>'PostalCode')::VARCHAR(10),
        (j.p_data->>'City')::VARCHAR(100),
        (j.p_data->>'Created')::TIMESTAMP,
        (j.p_data->>'Country')::SMALLINT
    FROM json_array_elements(p_data) AS j(p_data);
    -- Step 3: Bulk insert into the main table
    -- (columns are listed explicitly instead of SELECT *, so the insert does
    -- not silently break if the column order ever differs)
    WITH inserted_users AS (
        INSERT INTO "user" (  -- "user" is a reserved word, so it must be quoted
            name, street, postal_code, city, created, country
        )
        SELECT name, street, postal_code, city, created, country
        FROM temp_user
        RETURNING id
    )
    SELECT array_agg(id) INTO ids FROM inserted_users;
    -- Step 4: Drop the temporary table
    DROP TABLE temp_user;
    RETURN ids;
END;
$$ LANGUAGE plpgsql;
What Happens in the Non-Optimized Function
In the non-optimized function, rows are inserted directly into the "user" table using a WITH clause and INSERT INTO … SELECT. This works fine, but PostgreSQL has to do extra work for each row being inserted:
- every index on the table is updated
- constraints (primary key, unique, foreign key, NOT NULL) are checked
- the change is written to the write-ahead log (WAL)
This overhead happens for every row, which can slow things down when inserting a large number of rows.
What Happens with a Temporary Table
When you use a temporary table, the initial insert goes into the temporary table instead of the "user" table. Temporary tables are special because:
- they exist only for the current session
- they are not written to the write-ahead log
- as created here, they have no indexes or constraints, so inserting into them is cheap
After all the rows are inserted into the temporary table, you can bulk insert them into the "user" table in one operation. This reduces the overhead compared to paying the full per-row cost while the JSON is still being parsed.
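One side note on cleanup: the function drops the temporary table explicitly, so if it raises an error before reaching DROP TABLE, the table lingers for the rest of the session. A sketch of an alternative (not in the original function) is to let PostgreSQL clean it up automatically:

```sql
-- Variant of Step 1: with ON COMMIT DROP the table disappears when the
-- enclosing transaction ends (commit or rollback), so the explicit
-- DROP TABLE in Step 4 - and any leftover table after an error - is
-- no longer a concern.
CREATE TEMP TABLE temp_user (
    name        VARCHAR(150),
    street      VARCHAR(100),
    postal_code VARCHAR(10),
    city        VARCHAR(100),
    created     TIMESTAMP,
    country     SMALLINT
) ON COMMIT DROP;
```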
Temporary tables are especially useful when:
- you are loading large batches of rows at once
- the incoming data needs to be staged, cleaned, or validated before it touches the main table
Using a temporary table improves performance because it defers the expensive operations (index updates, constraint checks, WAL logging) until the final bulk insert into the "user" table. This reduces the overhead and speeds up the overall process.
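To compare the two approaches on your own data, you can generate a sizable JSON payload directly in SQL. This is an illustrative sketch (the key names match what the functions expect); running it with psql's \timing enabled against each version gives a rough comparison:

```sql
-- Build a 10,000-element JSON array and pass it to the function in one call.
-- Assumes the bulk_insert(json) function defined earlier in this post.
SELECT bulk_insert(
    json_agg(json_build_object(
        'Name',       'User ' || g,
        'Street',     'Street ' || g,
        'PostalCode', '12345',
        'City',       'Springfield',
        'Created',    now()::text,
        'Country',    1
    ))
)
FROM generate_series(1, 10000) AS g;
```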