How to Access a PostgreSQL Database from Any ... - This is My Page

footmouthΔιαχείριση Δεδομένων

16 Δεκ 2012 (πριν από 4 χρόνια και 6 μήνες)

217 εμφανίσεις

How to Access a PostgreSQL Database from Any Language
If you’re a system administrator, chances are you use programs that interface with data-
bases every day. One great advantage of open source software is that you can modify
your applications’ code to customize it for your needs. If your application uses
Post-
greSQL
on the back end, it’s not hard to access your database from a variety of lan-
guages. Here’s how to get started, whether your code is in C, C++, PHP, Tcl, Python, or
Perl.
If you’re a complete neophyte when it comes to programming, this process probably
isn’t for you. However, if you have at least basic knowledge of SQL and of at least one
programming language, I’ll walk you past the first hurdles. You can also turn to the
PostgreSQL manual
for specific information. While you’re probably running Linux, the
same program code should work on other operating systems, as long the same client in-
terfaces are available there. My examples rely on Postgres 9.1, the latest stable release,
being already installed and tested. My setup has the server running on a different ma-
chine from the client, but on the same LAN; in such a setup, you can easily connect to
the server with the command
pgsql -U postgres -h
x.x.x.x
.
Once you know your server accepts connections, make sure you have the rights to in-
stall software on the client machine. Also make sure that the database you’re working
with is for testing only, so as not to accidentally corrupt anything crucial.
C Interface
C is a popular language when it comes to interfacing with a database. Many database
systems are themselves written in C. The language offers speed and flexibility, so if you
want to write a client interface, be it console-only or GUI-based, and you don’t want to
hear about browsers, C might be your best choice.
The software you need to interface C with Postgres is named libpq. It comes bundled
with the PostgreSQL source tree. If you installed the database system from binaries
rather than from source, you can always get libpq separately, but keep in mind that you
need the -dev package as well (or -devel, depending on your Linux distribution). On De-
bian and derivatives, the command to install it is
# aptitude install libpq-dev
.
On Red Hat-based systems like
CentOS
, you can find libpq in the postgresql
xx
-libs
package, where
xx
is the major and minor version number. So if you want to interface
with a PostgreSQL 9.1 database, you will need to install postgresql91-libs, or postgresql-
devel if you’re running Fedora 16. The difference in the names is because RHEL/Cen-
tOS users use mostly the Postgres repository, while Fedora users have more up-to-date
packages and need a third-party repository only if they’re after an older version. Long
story short, any system that supports PostgreSQL will have libpq available, one way or
another.
libpq is the
API
engine for not only C, but also C++, Perl, and Tcl. It gives you the basic
functions to connect, query, and alter a database. Most of the functions that you’ll use
frequently have names that start with “PQ,” such as PQconnectdb or PQerrormessage.
For more examples, turn to the PostgreSQL documentation, or look in src/test/exam-
ples. In your C program files, include libpq-fe.h and add the appropriate linker flag,
which is
-lpq
when compiling.
The first thing you’ll want to do is connect to an existing database. PQconnectdb() takes
an argument in the form of
const char *conninfo
that contains the keywords you
want, such as
dbname=[
database_name
]
, and you use this pairing format with the
other keywords as well. The most used ones are host, hostaddr (numeric form, to avoid
useless DNS lookups), port, user, password, and sslmode. If you don’t provide any ar-
guments, the default options will be used, which is probably not what you want. Sup-
pose your database server has the address 192.168.0.101, the username is postgres, and
the database name is testdb1. A first attempt to connect might look like this:
#include <stdio.h>
#include <libpq-fe.h>
int main(int argc,char argv[])
{
const char *conninfo;
PGconn *conn;
if (argc > 1)
conninfo = argc[1];
else
{
printf("Not enough arguments, exiting...");
return 1;
}
conn = PQconnectdb(conninfo);
/*Check to see how I did */
if(PQstatus(conn) = CONNECTION_OK)
printf("Connection succeeded.\n");
else
{
/*Do something to deal with the error*/
}
}
Save this code as testlibpq, compile it, and use it like this:
$ testlibpg "hostaddr=192.168.0.101 user=postgres dbname=testdb1"
If everything goes well, you should see “Connection succeeded.” on the screen. That’s
gratifying, but so far not very practical. To get anything useful, you need to query the
database – but since I started with connecting, I should tell you that in order to discon-
nect, you call PQfinish, which returns a void and takes
*conn
as the sole argument.
PQexec, the function for executing queries, returns a PGresult and takes
*conn
and
const char *command
as arguments. In our example, let’s just declare a PGresult
variable and send a command to the server. I leave to you the part where you check
whether the connection is there, and error checking.
PGresult *res;
res = PQexec(conn, "SELECT * FROM mydatabase");
PQclear(res);
Needless to say, the code presented here is nowhere near complete. Its purpose is to
show you what the library offers, not to offer ready-to-use code. Nevertheless, with this
code,
res
contains the result of the query, and you can parse it at will. PQresultStatus
lets you know how the command went, and returns a status of the form
PGRES_COM

MAND_OK
or
PGRES_FATAL_ERROR
. You can find
a comprehensive list of the exec func-
tions
on the PostgreSQL project’s website.
Let’s move on to some more useful functions. For example, PQntuples, given
res
as an
argument, returns the number of columns in the table as an
int
. It takes a PGresult ob-
ject as an argument
if
the query’s status is
PGRES_TUPLES_OK
, and returns an integer
value. PQnfields gives you the number of columns in each row. PQfname returns the
column name associated with a number, while PQfnumber does the opposite. To get the
value of a single cell, pass to PQgetvalue the usual PG result, along with the cell’s col-
umn and row numbers.
All this is good, but PQexec isn’t a genius when it comes to handling multiple SQL com-
mands. Because it returns only one structure, if you have more than one command, it
will take into consideration only the results from the last command. Another drawback
is that PQexec waits for a command until it’s completed, so you must take extra care
when it comes to blocking execution. You can use a series of functions if these shortcom-
ings affect you, such as PQsendQuery and PQgetResult, and use them intelligently for
async-friendly code.
C++ Interface
The official PostgreSQL client API for C++ is called libpqxx, but it doesn’t come bun-
dled with the PostgreSQL source. Instead, you can install it from your distribution’s
repository or just
download it
. The README file says you can use libpqxx on Linux,
BSD, Solaris, Irix, HP-UX, AIX, and Windows, with Cygwin. You must install libpq be-
fore installing libpqxx, as the latter is based on the former.
In C++, the namespace is named pqxx, and you must include a header with that name.
Here’s a simple connection and query:
#include <iostream>
#include <pqxx/pqxx>
using namespace std;
int main()
{
pqxx::connection conn;
pqxx::work w(conn);
pqxx::result res = w.exec("SELECT 1");
w.commit();
cout << res[0][0].as<int>() << endl;
}
This program will print 1 if everything goes as it should. Use and abuse the try/catch
keywords to defend against connection errors. If you want to use
argv[x]
, as I did in
the C example, you can concatenate the string like this:
pqxx::result res = w.exec("SELECT" + w.quote(argv[1]));
Again, not very useful so far, so how do you actually access results that a query returns?
Since
res
, which stores the data, is an array of tuples, you can use for loops to iterate
through the elements:
for (int rownr=0; rownr < res.size(); ++rownr)
{
const result::tuple row = res[rownr];
for (int colnr=0; colnr < row.size(); ++colnr)
{
const result::field = row[colnr];
cout << field.c_str() << \t;
}
cout << endl;
}
Remember that I used
using namespace std;
; if you don't use it, alter the lines with
cout and endl accordingly.
libpqxx offers a lot of options when it comes to string conversion. The most interesting
functions are from_string and to_string; the first basically takes a string argument and a
T & obj, which is an existing built-in type, and the second takes only the T & obj, con-
verting it to a string. The complete
documentation
on this is a useful read.
Connections, Queries, and Transactions
For the rest of this section, I recommend you insert
using namespace pqxx;
, as I did
when writing the code, in order to save some typing, so you won't have to type
pqxx::...
. With that in mind, let's open a connection to an existing database and per-
form a transaction and then a query. You will notice the code does about the same as the
C code above.
I can use a constructor that takes only one argument, again a string, which will define
the connection options (dbname, user, and so on). Since the default username is post-
gres, I won't include it in my string this time.
connection Conn("dbname=testdb1 hostaddr=192.168.0.101");
I now have an open connection, provided everything went well (again, take care of ex-
ceptions!). This, however, is not enough for executing queries – you need to open a
transaction by using a
transactor
, which will save you hassle and time when a connec-
tion drops off on you. A transactor is a
functor
, and functors are handy when dealing
with problems that create connections, when you need to know the state of said connec-
tion. You can't get that information with simple functions or function pointers, given the
way the language is designed.
A functor is a function object, so you can treat it as a function that is state-aware. Let me
explain. The operator() in C++ can be overloaded, so you can put any number of ele-
ments inside the parentheses. You could say that a functor is a class that defines opera-
tor(). If you want a function that adds two values (given as arguments), you would have
to hard-code those values. Functors take care of the hardcode limitation, allowing you
to use a constructor and add a new object that "overrides" the first values to be added,
thus giving the developer more flexibility.
Now that you know what a functor is, let's see how to use the transactor.
// I already have a connection open, as above
class Transaction : public transactor<>
{
public:
void operator () (transaction<> & t)
{
t.exec ("INSERT INTO mytable VALUES(val1,val2)");
}
void on_abort (const string & msg)
{
cout << "Transaction failed with message: " << msg <<

endl;
}
};
//Inside main()
conn.perform (Transaction());
As you can see, coding queries in C++ isn't hard. You can define transactions and prop-
erly format the results you get from the queries (think stringstream), and the rest is easy
to accomplish, using loops.
PHP Interface
PHP is widely used for database integration with web pages. The language offers a sim-
ple way for programmers to interface with databases.
Make sure you have PHP installed and its PostgreSQL module, which is named php5-
pgsql on Debian and derivatives and php-pgsql if you're using CentOS 5. If you compile
from source, use the
--with-pgsql=[
DIR
]
configure option, where
DIR
is the direc-
tory where Postgres lies. You can use phpinfo() on a web page to check whether Post-
greSQL support is enabled and what options have what values. Note that the examples
below work with PHP 5, and not necessarily with older versions.
It's more straightforward to open a connection, make a query, and print out the results
using PHP than it is with C/C++:
<?php
$dbconn = pg_connect("host=192.168.0.101 dbname=testdb1

user=postgres password=
xxx
")
or die('Could not connect: ' . pg_last_error());
$query = 'SELECT * FROM mytable';
$result = pg_query($query) or die('Query failed: ' . pg_last_er

ror());
echo "
\n";
while ($line = pg_fetch_array($result, null, PGSQL_ASSOC)) {
echo "\t
\n";
foreach ($line as $col_value) {
echo "\t\t
\n";
}
echo "\t
\n";
}
echo "
\n";
//I printed the result in a nice table; thanks to the PHP manual
pg_free_result($result);
pg_close($dbconn);
?>
As you can see, connecting, making a query, and using the results are all a breeze. How-
ever, beware when you get user/password information interactively from users. A SQL
injection attack might gain a malicious individual access to your database, so make sure
you keep informed about security issues.
When working with arrays, and you will need them sooner or later, you can start with
the pg_fetch_array() function and move from there:
//I presume I have a connection and a result/query that returns

something suitable
$arr = pg_fetch_array($result);
echo $arr[0];
Look at other pg_fetch* functions and use one that fits the task you need to do.
pg_num_fields and pg_num_rows can be useful when you want to alter the way you
display the results from a query depending on their size.
Tcl Interface
The Tcl language is known for its string processing capabilities, so it might be a good
choice if you want powerful parsing of query results. Another cool feature is security,
because PL/Tcl (yes, we're talking about an integrated Tcl interface provided by Post-
greSQL) has no means to access the inner workings of the database server. If, however,
you do need more capabilities, you can look at PL/TclU, where U stands for untrusted.
You need superuser access to the database to use it, and the usual extra care is recom-
mended.
Because PL/Tcl is integrated, you don't need to install anything before you use it, but
you do need to add to your code the line
CREATE EXTENSION pltcl #or pltclu
Since PL/Tcl is integrated into the Postgres system, it's more useful when it comes to
functions and triggers rather than queries and transactions. Generally speaking, lan-
guages starting with PL/ are more useful for extending some capabilities instead of pro-
viding a complete database-interfacing environment. The general syntax to create your
own functions takes the form
CREATE FUNCTION
name
(
argtypes
) RETURNS
type
AS $$
#here goes the function body
$$ LANGUAGE pltcl;
So, for instance, here's a simple add function:
CREATE FUNCTION tcl_add (integer, integer) RETURNS integer AS $$
return $1 + $2
$$ LANGUAGE pltcl;
If you use STRICT at the end of the line with the LANGUAGE instruction, you don't
have to check for null input. If you don't want to use STRICT, use some if blocks and the
argisnull keyword (the manual has
more on this
).
Using values from the database is easy. Say you have a table with computer specs and
you want a function to see whether the hardware in question is obsolete and should be
changed. You could use code like this:
CREATE TABLE machine (
name text,
cpuspeed integer #Mhz
memory integer #MB
hard_drive integer #GB
);
CREATE FUNCTION isObsolete (machine) RETURNS boolean AS $$
if {$1(cpuspeed) < 266 && $1(memory) < 256 && $1(hard_drive)

< 2} {
return "t"
}
return "f"
$$ LANGUAGE pltcl;
As a small mnemonic, remember the string "spi" (server programming interface) for
functions you can use to select, insert, and update objects in a Postgres database. spi
functions, such as spi_exec, spi_prepare, spi_execp, and quote, are somewhat similar to
PQ functions in libpq. spi_exec takes a SQL command and executes it, and returns the
number of rows it processed. You can specify the max number of rows to be returned,
and save the values of the selected columns inside Tcl variables, as follows.
spi_exec "SELECT count(*) AS cnt FROM mytable"
spi_prepare prepares and stores a query for later use. When you're ready to use it, you
can do that with spi_execp. spi_prepare returns a query ID, so that spi_execp knows
what to execute. Finally, quote is useful when single quotes and/or escape chars give
you nothing but trouble. Here's an example:
"SELECT $myvar AS ret"
# myvar contains the string "won't"
# when expanding, the parser will give you a nice error message.
"SELECT '[ quote $val ]' AS ret" # this will do just fine
Python Interface
To use PL/Python, make sure you have the following line in the database code you
want to work with:
CREATE EXTENSION plpythonu
The final "u" in the above statement stands for untrusted, as we saw with PL/Tcl. Since
PostgreSQL 7.4 it's the only available variant. With it, your code can execute with the
privileges of a DBA, which again means you must take extra steps to ensure everything
is right. PL/Python supports Python 2 and 3, with 2 as the default, which is to say that
the plpythonu extension refers to Python 2. If you need one or the other explicitly, use
plpython2u or plpython3u.
Function creation in PL/Python has the exact same syntax as the Tcl example, except
that you substitute pltcl with plpythonu. Since I don't want to be redundant, I won't re-
peat the code here. Of course, the function body is simply Python code, as you can see
in this reiteration of the add function:
CREATE FUNCTION pyadd (x integer, y integer)
RETURNS integer
AS $$
return x + y
$$ LANGUAGE plpythonu;
Before you move on, I recommend you see the subchapter in the manual (
42.3.1 and fol-
lowing
) on data mapping between PostgreSQL and Python. I guarantee this will save
you lots of headaches later.
Using Postgres with Python and even the function names will seem familiar if you read
the Tcl section above. Here, the module is called plpy, it's imported automatically, and it
offers two functions, execute and prepare:
myobj = plpy.execute("SELECT * FROM mytable", 10)
After you execute this statement, you can access myobj by indexes, as you'd do with any
array-type variable. Just as in Tcl, plpy.execute can limit the number of rows it returns; I
specified 10.
Some other functions you might find useful are easy to understand because they're
named appropriately: plpy.debug(message), plpy.log(message), plpy.info(message),
plpy.notice(message), plpy.warning(message), and plpy.error(message). Quoting func-
tions, similar to those in the Tcl example, include quote_nullable(string) and
quote_ident(string).
Perl Interface
Perl's string-processing capabilities are second to none, but it's also adept at handling
complex query results. Remember to use
CREATE EXTENSION plperl
in your data-
base code.
Creating a function with PL/Perl is just as simple as it would be with Tcl or Python; just
replace the last line with
$$ LANGUAGE plperl;
. Here's a simple function that re-
turns the bigger of two integers:
CREATE FUNCTION perlmax (integer, integer) RETURNS integer AS $$
if ($_[0] > $_[1]) { return $_[0]; }
return $_[1];
$$ LANGUAGE plperl;
If you choose to use STRICT, and I personally recommend it, the function will not get
executed at all if it receives null input, and you'll have to write some conditional code to
take care of that possibility:
CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
my ($x, $y) = @_;
if (not defined $x) {
return undef if not defined $y;
return $y;
}
return $x if not defined $y;
return $x if $x > $y;
return $y;
$$ LANGUAGE plperl;
Remember spi_exec from PL/Tcl? In PL/Perl it's called spi_exec_query and, you
guessed it, it does the same thing with the same arguments, the limiter on the rows re-
turned being optional. However, the manual warns us of shortcomings when having
huge result sets, so the query I used as an example for Tcl becomes:
$myobject = spi_exec_query('SELECT * FROM mytable', 10);
#Let's get the value of row $i corresponding to mycolumn
$myrow = $myobject->{rows}[$i]->{mycolumn};
#Here's the total number of rows resulted from a query
$nrows = $myobject->{processed};
#What's the command's status?
$rp = $myobject->{status};
As you would expect, the same logging, error message, and quoting functionalities
we've seen before are present here as well: elog(level, message) emits a log/warning/er-
ror message, quote_literal(string) and quote_ident(string) do what you'd expect,
looks_like_number(string) returns true if the arguments seems like a string, and is_ar-
ray_ref(arg) returns true if the argument can be treated as an array reference.
Had enough? I'll stop here. This should be enough to get you started and to give you an
idea of how simple program interfaces to a PostgreSQL database can be.