How to insert (file) data into a PostgreSQL bytea column?
This question is not about bytea v. oid v. blobs v. large objects, etc.
I have a table containing a primary key `integer` field and a `bytea` field. I'd like to enter data into the `bytea` field. This can, presumably, be done by one of the PL/ languages, and I may look into doing this with `PL/Python` in the future.
As I am still testing and experimenting, I would simply like to insert data from a file (on the server) using "standard" SQL statements. I am aware that only administrators with write permission on the server would be able to insert data in the way I would like to. I'm not concerned about that at this stage, as users would not be inserting `bytea` data at present. I have searched the various Stack Exchange sites, the PostgreSQL archives and the Internet generally, but have not been able to find an answer.
Edit: This discussion from 2008 implies that what I want to do is not possible. How are `bytea` fields used, then?
Edit: This similar question from 2005 remains unanswered.
Solved: The details provided here on the `psycopg` website provided the basis for a solution I've written in Python. It may also be possible to do this with `PL/Python`. I don't know whether it is possible using "pure" SQL.
```sql
create or replace function bytea_import(p_path text, p_result out bytea)
language plpgsql as $$
declare
  l_oid oid;
begin
  select lo_import(p_path) into l_oid;
  select lo_get(l_oid) into p_result;
  perform lo_unlink(l_oid);
end;
$$;
```
`lo_get` was introduced in PostgreSQL 9.4, so for older versions you would need:
```sql
create or replace function bytea_import(p_path text, p_result out bytea)
language plpgsql as $$
declare
  l_oid oid;
  r record;
begin
  p_result := '';
  select lo_import(p_path) into l_oid;
  for r in (select data from pg_largeobject where loid = l_oid order by pageno) loop
    p_result = p_result || r.data;
  end loop;
  perform lo_unlink(l_oid);
end;
$$;
```
```sql
insert into my_table (bytea_data)
select bytea_import('/my/file.name');
```
```sql
create table test (id int, image bytea);
insert into test values (1, pg_read_file('/home/xyz')::bytea);
```
Or, a little simpler, use `pg_read_binary_file('/path/to/file')`, which returns `bytea` directly. See https://www.postgresql.org/docs/current/static/functions-admin.html
This solution isn't exactly efficient in terms of runtime, but it's trivially easy compared to building your own headers for `COPY BINARY`. Further, it doesn't require any libraries or scripting languages outside of Bash.
First, convert the file into a hex dump, which doubles the size of the file. `xxd -p` gets us pretty close, but it throws in some annoying newlines that we have to remove:

```shell
xxd -p /path/file.bin | tr -d '\n' > /path/file.hex
```
Next, import the data into PostgreSQL as a very large `text` field. This type holds up to one GB per field value, so we should be okay for most purposes:

```sql
CREATE TABLE hexdump (hex text);
COPY hexdump FROM '/path/file.hex';
```
Now that our data is a gratuitously large hex string, we can use PostgreSQL's `decode` to get it into a `bytea` column:

```sql
CREATE TABLE bindump AS SELECT decode(hex, 'hex') FROM hexdump;
```
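The round trip above can be sanity-checked outside the database. This Python sketch mimics what `xxd -p | tr -d '\n'` and `decode(..., 'hex')` do to the bytes (the sample data is arbitrary):

```python
import binascii

# Arbitrary binary data, including NUL bytes and a newline
original = b"\x00\x01\xffPDF\n\x7f"

# Step 1: hex-encode, as `xxd -p ... | tr -d '\n'` does (doubles the size)
hex_text = binascii.hexlify(original).decode("ascii")

# Step 2: decode back, as PostgreSQL's decode(hex, 'hex') does
restored = binascii.unhexlify(hex_text)

assert restored == original
assert len(hex_text) == 2 * len(original)
```

Because every byte becomes two ASCII characters, newlines and NUL bytes in the original file survive the trip unharmed; only the newlines that `xxd` adds for readability need stripping.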
SabreWolfy: No, it doesn't. The `tr -d '\n'` is operating on the output of xxd, which encodes the binary content of the input as ASCII hexadecimal characters (0-9 and a-f). xxd also happens to output line feeds at regular intervals to make the output human-readable, but in this case we want them removed. Line feeds in the *original* data will be in hex form, and will remain unaffected.
The answer with xxd is nice and, for small files, very fast. Below is an example script that I'm using.
```shell
xxd -p /home/user/myimage.png | tr -d '\n' > /tmp/image.hex

echo "
-- CREATE TABLE hexdump (hex text);
DELETE FROM hexdump;
COPY hexdump FROM '/tmp/image.hex';

-- CREATE TABLE bindump (binarydump bytea);
DELETE FROM bindump;
INSERT INTO bindump (binarydump)
  (SELECT decode(hex, 'hex') FROM hexdump LIMIT 1);

UPDATE users
  SET image = (SELECT decode(hex, 'hex') FROM hexdump LIMIT 1)
  WHERE id = 15489;
" | psql mydatabase
```
Here is how to do it without superuser privileges (such as on Heroku).
```sql
\lo_import '/cygdrive/c/Users/Chloe/Downloads/Contract.pdf'
update contracts set contract = lo_get(:LASTOID) where id = 77;
```
You can use `\lo_list` to see the large objects and `\lo_unlink` to delete them after the update. The `contract` field in my example is:

```
  Column  | Type
----------+-------
 contract | bytea
```
Thanks. The link you gave indicates that the data must either be in ASCII or PostgreSQL's binary table format. Further down the page, mention is made that the binary table format is first created with a COPY TO command. Would either of these approaches allow me to insert a binary file (PDF, document, spreadsheet) into a `bytea` column?
The PostgreSQL documentation on COPY BINARY (http://www.postgresql.org/docs/8.4/interactive/sql-copy.html) indicates that a special file header is required when inserting binary data. Do I need to build this header myself and prepend it to the binary data? That does seem somewhat complex for simply storing a string of binary data.
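To illustrate what that header involves, here is a sketch of building a COPY BINARY payload for a single row with a single `bytea` column, following the file-format layout described in the COPY documentation (an illustration of the bookkeeping, not a production loader):

```python
import struct

def copy_binary_payload(field_bytes: bytes) -> bytes:
    """Build a COPY BINARY stream: fixed signature, flags word,
    header-extension length, one tuple, then the end-of-data marker."""
    signature = b"PGCOPY\n\xff\r\n\x00"            # 11-byte fixed signature
    header = signature + struct.pack("!II", 0, 0)  # flags, extension length
    # One tuple: 16-bit field count, then per field a 32-bit length + data
    tuple_data = (struct.pack("!h", 1)
                  + struct.pack("!i", len(field_bytes))
                  + field_bytes)
    trailer = struct.pack("!h", -1)                # end-of-data marker
    return header + tuple_data + trailer

payload = copy_binary_payload(b"\x89PNG\r\n")
# `payload` is what COPY my_table (bytea_data) FROM ... BINARY would expect
```

So yes, the header (and per-tuple framing) would have to be built by hand, which is why the parameterized-query and `lo_import`-based approaches above are usually simpler for a single value.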