boreddizzyΔιαχείριση Δεδομένων

16 Δεκ 2012 (πριν από 5 χρόνια και 7 μήνες)

290 εμφανίσεις


Roman Tekhov

“Computer Security” course report

Lecturer: Meelis Roos

TARTU 2006

The described problem affects all “PostgreSQL” database server versions before 8.1.4,
8.0.8, 7.4.13, 7.3.15. It is based on problems connected with multibyte character processing,
that means the characters, which are encoded using two or more bytes. The risk was first
discovered by Akio Ishida and Yasuo Ohgaki.

The security hole occures in “PostgreSQL” if the client (most commonly an application
using the database for persistence and queries) and the database server behave differently
while processing the multibyte characters used in SQL commands, that are formed by
inserting the user input into the query string (and the practice shows that they behave
differently in almost every application). More precisely, it is most often the case that client
code tries to escape the special characters in user input (in order to make the query safe)
without being aware of the used character encoding, which can result in forming the
commands, that can lead to the injection of any SQL provided by the attacker, because some
single quotes can just be skipped by the escaping process. Then the classical SQL-injection
scenario takes place.

An example: suppose that the used multibyte supporting encoding is UTF-8. In that case the
byte 0xc8 indicates the beginning of the two-byte character, where the second byte must be in
range 0xA0-0xFF. The attacker submits the string that contains 0x08’text, where 0x08
represents a single byte by its hex value. Because the “ ‘ ” (ASCII hex code 0x27) is not in the
valid range the whole string is therefore not validly encoded in the scope of UTF-8. But the
client is unaware of the encoding and will just ignore that fact. It will try to escape all special
characters before passing the string to the database server. That will result in either 0x08\’text
or 0x08’’text (SQL standard single quote escaping) depending on which escaping technique is
used, “PostgreSQL” supports both. But the server will then treat the 0x08\ or the 0x08’ as a
single multibyte character, as the UTF-8 insists, leaving the “ ’ ” unescaped and text injected.
Of course the server will know that the multibyte character is invalid (the second byte is not
in the valid range), but it will only generate a warning and the execution of such command
will not be prevented. In some cases even the warning is omitted.
The fix is quite straightforward and effects only the part of the scenario which is about the
server warning. In case of invalid multibyte character the server must generate an error
instead of a warning, which will stop the query from being executed. In case of application
client this error may be then used to inform the user that the input is in illegal form. In all new
versions of “PostgreSQL” the server behaves this way, which removes the problem for most
multibyte supporting encodings, like UTF-8.

The problem is that there are some encodings, which allow the 0x5c (backslash, “\”) to be
the second byte in a two-byte character, for example SJIS, BIG5, GBK, GB18030, UHC and
some other Far Eastern encodings. If SJIS is used and the attacker submits the string 0x95’text
(0x95 means the beginning of a two-byte character in SJIS), and if the client escapes the
single quotes with backslashes, then escaping will produce the totally valid SJIS two-byte
character followed by an unescaped single quote – 0x095\’text. The server will then execute
this valid command and text will be executed as well.
The usage of such encodings produces two additional security risks, which are described
separately as CVE-2006-2314. Because they are rather closely connected, a brief description
of that problem will be provided here as well. If the used encoding is SJIS and the attacker
submits a valid string 0x95\’text (as already said, 0x95\ or 0x95 0x5c is a valid SJIS two-byte
character), the client will try to escape the single quote and the backslash as well, because it
treats the backslash as a single character, not as part of the multibyte character. This will
produce either 0x95\\\’text or 0x95\\’’text depending on escaping standard used. The server
will then treat 0x95\ as a valid two-byte character, “ \\ ” or “ \’ ” will be treated literally and

the last “ ’ ” will remain unescaped leaving the text injected. The second risk appears if an
attacker submits the string 0x95’text. The client escapes the single quote resulting in
0x95\’text, the server will then interprete it as a valid two-byte character 0x95\ followed by an
unescaped single quote, and so text will be successfully injected once again.

The possible solution for such cases is to modify the client itself. The single quote escaping
must be done in the form of “ ’’ ” (which is actually the standard) not the “ \’ ”. Because there
are no encodings where the 0x27 (a single quote) is a valid second byte value in a two-byte
character, the problem will not occure (of course only in case the server validates the
multibyte characters as described above). Not to expect all the clients to modify their escaping
code, the “PostgreSQL” developers have also modified the server, so now it will reject the
sequence “ \’ ” if an encoding such as SJIS is used. This will partially solve the CVE-2006-
2314 problem as well, but again, only if the client uses the “ ’’ ” technique to escape single
quotes. It must be said that actually it is not right to escape single quotes with backlashes
anyway, because this technique is considered deprecated due to SQL standard. That is the
reason why, for example, widely used PHP addslashes function is deprecated after version 4.0
as well. But because it may not be possible for all clients to modify their escaping code,
“PostgreSQL” developers have added a server configuration parameter “backslash_quote”,
which can accept three values: “on” (always allow “ \’ ”, insecure and not recommended,
must be used only if it is absolutely necessary to keep the application working), “off” (always
reject “ \’ ”, most secure but can break existing application code if modification is not
possible), “safe_encoding” (default, allow “ \’ ” if clients encoding is safe).
Generally speaking, if for some reason the application needs to keep inserting user input as
part of the query directly, then escaping should not be done “by hand”. Instead “PostgreSQL”
offers the C-language client library called “pqlib”, which includes the new functions:
PQescapeStringConn and PQescapeByteaConn, for character and binary data respectively,
which can be used to perform the escaping in the secure way.

So, the solutions for CVE-2006-2313 include: an update of the database server to a version,
where the described problem is solved on the server side. It may, however, require to
introduce some processing of the generated errors to inform the user of the invalid input he
maid. It may also require some additional changes, because there may be other modifications
in the new “PostgreSQL” version. Generally it is most likely, that if the application is
vulnerable to this risk, then the server update will require some changes in the client code as
well. So, the update is needed, but it is not enough for the majority of cases.
Secondly, developers should modify their escaping, so that it would be done in encoding-
aware manner, for example the 0x5c bytes (“backslashes”), that are part of the multibyte
character, must not be escaped in case of SJIS and similar encodings. As already said,
“PostgreSQL” offers solutions for that. The developers may also choose to make their input
validation this way, not to let the database server decide if the query is validly encoded or not.
For example in case of UTF-8 and the input 0x08’text the client must treat the 0x08’ as a
single multibyte character and reject it because it is not valid.
The third solution is not to embed user input in the queries at all and instead use prepared
statements with placeholders. In such case the used SQL engine would generate an error, if
the given parameter contains something more than the expected data for that particular
placeholder. This is actually the best solution that web application developers should use,
because it is considered not right to put user input in database queries directly at the first
place. All modern web frameworks, written in .NET, Java, Python or other languages, support
this functionality.
The application is of cource not vulnerable to this problem if it uses a single-byte encoding,
such as any LATINx family encoding.

Used material
1. “PostgreSQL” documentation:
a. risk description: -

b. pqlib library -

c. fixes -

2. Article “SQL injection vulnerabilities in PostgreSQL”-