On Dec. 26, 2007, Albert Gonzalez, a 28-year-old resident of Miami, launched an attack against the database servers of Heartland Payment Systems, kicking off one of the most successful computer crimes in history. By the end of the attack, Gonzalez had walked away with more than 130 million credit card numbers, pilfered directly from the systems of Heartland, one of the largest payment processors in the United States.
How did Gonzalez carry out such a massive attack? He exploited a web vulnerability known as SQL injection. This attack lets a malicious user manipulate a website into sending unintended commands to the underlying database. Although it is fairly easy to protect websites against SQL injection attacks, many sites remain vulnerable today.
SQL Injection Explained
Modern websites often rely upon databases to help create dynamic content based upon user requests. In the early days of the web, static websites were the rule. The webmaster simply created all of the content that users might want to view, and then users downloaded those files using a web browser. The web quickly grew beyond that as users began to demand content that was based upon their earlier input. For example, a website's shopping cart functionality can't be created as a static web page, because the webmaster would have no idea what users might want to order.
Dynamic websites accept information from users and then build customized web pages on the fly. This often involves accessing a backend database that stores both content and details about website users. The web server and database server communicate using a language called the Structured Query Language (SQL). For example, when you log into a website, the web server might retrieve information about you from the database using an SQL query such as this:
SELECT * FROM users WHERE username = 'your_username';
You don't need to understand the details of the syntax to understand SQL injection, but there are a few important points that you'll need to know. First, this query retrieves all of the data stored about a specific user from a table (similar to a spreadsheet) in the database called "users". Second, semicolons are used to indicate the end of an SQL command. Finally, the text between the single quotation marks is treated as a variable and often comes from user input. For example, you might type your username into a form on the website and the web server would then plug that username into a query template that looks something like this:
SELECT * FROM users WHERE username = 'USER_INPUT';
The keyword USER_INPUT is replaced with the information provided by the website user. This is where things might go awry. Imagine if instead of a username, the website user entered the following text:
mike' OR '1'='1
When this gets plugged into the query template, the following command gets sent to the database:
SELECT * FROM users WHERE username = 'mike' OR '1'='1';
As the string '1' will always equal the string '1', this basically tells the database to retrieve all user records. If the web server then displays the user information on the screen, this simple tweak to a query might result in the attacker seeing the records corresponding to all of your website users.
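The substitution described above can be reproduced in a few lines. The following is a minimal sketch, assuming Python with its built-in sqlite3 module and an in-memory database. The table contents and the helper name build_query are made up for illustration and are not taken from any real site.

```python
import sqlite3

# Hypothetical in-memory database standing in for the website's backend.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [
        ("mike", "mike@example.com"),
        ("alice", "alice@example.com"),
        ("bob", "bob@example.com"),
    ],
)


def build_query(user_input):
    # Vulnerable on purpose: the input is pasted straight into the template.
    return "SELECT * FROM users WHERE username = '" + user_input + "';"


normal_rows = conn.execute(build_query("mike")).fetchall()
injected_rows = conn.execute(build_query("mike' OR '1'='1")).fetchall()

print(len(normal_rows))    # rows returned for an ordinary username
print(len(injected_rows))  # rows returned for the injected input
```

With the ordinary username the query matches a single record. With the injected text, the WHERE clause is true for every row, so the whole table comes back.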
The attacker is typically only able to view data from your database in cases where the input is reflected, meaning that the resulting web page displays the data from the database rather than just making a blind update. However, even pages that don't reflect input can be dangerous. Imagine the case where the attacker enters the following string:
mike'; DELETE FROM users WHERE '1'='1
Plugging this into the query template (and reformatting slightly) yields:
SELECT * FROM users WHERE username = 'mike';
DELETE FROM users WHERE '1'='1';
This technique, sometimes described as blind SQL injection because the attacker never sees the query output, stacks a second statement onto the original query and allows the attacker to execute arbitrary commands against the database. In this particular example, a successful attacker would delete all of the user records from your database.
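The effect of the stacked DELETE can be sketched the same way. This example again assumes Python's sqlite3 module and a made-up users table. It uses executescript(), which runs every statement in a string, to stand in for a backend that accepts more than one statement per request. Some drivers, including sqlite3's own execute(), reject multiple statements, so whether this works against a real site depends on its database and driver.

```python
import sqlite3

# Hypothetical in-memory database with two user records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("mike",), ("alice",)])
conn.commit()

user_input = "mike'; DELETE FROM users WHERE '1'='1"
query = "SELECT * FROM users WHERE username = '" + user_input + "';"

# Assumption: the backend runs every statement it receives.
# executescript() models that; the SELECT result is simply discarded.
conn.executescript(query)

remaining = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(remaining)  # number of user records left after the injected DELETE
```

Nothing from the SELECT is ever shown to the attacker, yet the table ends up empty.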
Defending Against Injection Attacks
Fortunately, the defense against SQL injection attacks is quite straightforward: It's a technique called input validation that requires developers to carefully check the input received from web users before passing it along to the database. There are many factors to consider when performing input validation that will vary depending upon your website's purpose and the underlying technology, but there are some basic rules that will get you started.
First, ensure that input matches the intended format. For example, if you're asking the user to provide a U.S. Postal Service ZIP code, make sure that the input provided is exactly five digits. If it varies from the expected format, return an error to the user without passing the command to the database.
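A format check of this kind can be very small. Here is a minimal sketch in Python. The function name is_valid_zip is hypothetical, and the rule of exactly five digits is the one stated above (it does not cover ZIP+4 codes).

```python
import re

# Exactly five ASCII digits. [0-9] is used instead of \d because \d in
# Python also matches digits from other scripts.
ZIP_PATTERN = re.compile(r"[0-9]{5}")


def is_valid_zip(value):
    # fullmatch() requires the whole string to fit the pattern, so extra
    # characters before or after the digits cause a rejection.
    return ZIP_PATTERN.fullmatch(value) is not None


for candidate in ["33101", "3310", "33101' OR '1'='1"]:
    print(candidate, is_valid_zip(candidate))
```

Only input that passes the check would be handed to the database. Anything else gets an error message instead.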
Second, escape single quotation marks. As you saw in the earlier examples, the single quotation mark is the most dangerous character in an SQL statement, potentially allowing an attacker to "break out" of a query and change the underlying command syntax. The exact procedure varies from language to language, but developers always have the ability to tell the database to interpret a character literally by "escaping" it. For example, PHP provides a function called addslashes() that performs this operation for you.
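To make the idea concrete, here is a minimal sketch in Python that escapes quotation marks by doubling them, which is how standard SQL writes a literal single quote inside a string. The helper names are hypothetical, and this is not how addslashes() works internally (that function adds backslashes instead). Real applications would normally rely on the escaping routine supplied by their database library rather than a hand-written one.

```python
import sqlite3


def escape_single_quotes(value):
    # In standard SQL, '' inside a quoted string means one literal quote.
    return value.replace("'", "''")


def build_query(user_input):
    escaped = escape_single_quotes(user_input)
    return "SELECT * FROM users WHERE username = '" + escaped + "';"


# Hypothetical in-memory table to show the effect of the escaped query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("mike",), ("alice",)])

query = build_query("mike' OR '1'='1")
rows = conn.execute(query).fetchall()

print(query)
print(rows)  # the whole input is now compared as one odd-looking username
```

Because the quotation marks no longer end the string, the database looks for a user literally named mike' OR '1'='1 and finds none.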
Third, use stored procedures instead of passing full SQL statements to the database. Stored procedures keep the SQL query template stored on the database server rather than on the web server. The web application then merely needs to provide the user variables to the stored procedure, which greatly reduces the risk of an SQL injection attack, provided the procedure itself does not assemble SQL by concatenating those variables.
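The key idea is that the query template and the user variables travel separately. SQLite, used in this sketch through Python's sqlite3 module, has no stored procedures, so the example substitutes a parameterized query, a closely related technique that separates template and data in the same way. The table and the function name find_user are hypothetical.

```python
import sqlite3

# Hypothetical in-memory database with two user records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("mike",), ("alice",)])


def find_user(conn, username):
    # The ? placeholder keeps the template fixed. The value is passed
    # separately and is never parsed as part of the SQL command.
    cursor = conn.execute("SELECT * FROM users WHERE username = ?", (username,))
    return cursor.fetchall()


print(find_user(conn, "mike"))
print(find_user(conn, "mike' OR '1'='1"))
```

The injected text is treated purely as a username to look up, so it matches no records and cannot alter the command.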
SQL injection attacks do pose a significant risk to unprotected database-driven web applications. However, with careful coding and the use of input validation, web developers can protect their applications from this insidious attack.