We have lots of corporate clients. They have important websites, sometimes with particular intellectual property and information that they’d rather not share with rivals. Consequently, it is important to know what information is being sent by the software underpinning your website. WordPress has been described as chatty in earlier articles, such as Lynne Pope’s on WordPress privacy, and in various discussions on the WordPress support forums and mailing lists.
Now, it’s important to note that we had no awareness of the above articles prior to writing this piece, and the links were added in editing afterwards. The idea was to keep an open mind on the subject and not cover prior literature. All we wanted to know was – what information is being shared by a default WordPress install?
To see what information, if any, was sent out to Automattic by WordPress and by Akismet, the popular commercial anti-spam plug-in included by default in all WordPress installs, we decided to carry out a few tests. This article details the testing environment used, the testing method employed and the results.
Testing Environment
A copy of WordPress 3.0.5, obtained from the WordPress SVN repository, was installed on a local Windows 7 PC running XAMPP as a single-instance setup. The domain name networktest.dev was used for the site, with an entry in the local machine’s ‘hosts’ file standing in for a DNS server.
The ‘Wireshark’ network analysis tool was used to capture all HTTP traffic sent and received by WordPress to and from external sites. Firefox was used to perform the tests and was run in “safe mode” (all add-ons disabled) in order to minimise the amount of extraneous traffic captured.
Testing Method
In order to see what information was being sent by WordPress itself, the following user activities were performed with Wireshark running on our test machine and recording all HTTP and HTTPS traffic:
- Visiting the Dashboard login page immediately after installation.
- Installing, activating and deleting an additional theme.
- Activating and De-activating the “Hello Dolly” plug-in.
- Activating Akismet.
- Posting a comment with Akismet activated.
- Marking a comment as “spam” using Akismet.
Results
Visiting the Dashboard Login Page Immediately after Installation
The first thing we noticed is that information is sent out to api.wordpress.org (72.233.56.138) when the page is loaded:
================HEADERS====================
GET /core/version-check/1.5/?version=3.0.5&php=5.3.1&locale=en_US&mysql=5.1.41&local_package=&blogs=1&users=1&multisite_enabled=0 HTTP/1.0
Host: api.wordpress.org
User-Agent: WordPress/3.0.5; http://networktest.dev/
wp_install: http://networktest.dev/
wp_blog: http://networktest.dev/
Accept-Encoding: deflate;q=1.0, compress;q=0.5
==============END OF HEADERS================
This request appears to be checking for updates to the WordPress core. On the top line, you can see that a GET request is sent which contains information on:
- Version of WordPress in use.
- PHP and MySQL versions in use on the server.
- Language/Locale used on the server.
- Whether the ‘multisite’ functionality is enabled on the site.
- Number of blogs on the site.
- Number of users (that is, WordPress users) configured on the site.
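The fields listed above can be pulled out of the captured request line mechanically. The following is a minimal Python sketch, not part of our original test procedure; the helper name `disclosed_fields` is our own, and the request target is copied from the capture.

```python
from urllib.parse import urlsplit, parse_qs

# Request target copied from the captured version-check request.
REQUEST_TARGET = (
    "/core/version-check/1.5/?version=3.0.5&php=5.3.1&locale=en_US"
    "&mysql=5.1.41&local_package=&blogs=1&users=1&multisite_enabled=0"
)


def disclosed_fields(target):
    """Return the query parameters of a request target as a name -> value dict.

    Hypothetical helper for illustration; it is not part of WordPress.
    """
    query = urlsplit(target).query
    # keep_blank_values=True so empty fields such as local_package are kept.
    parsed = parse_qs(query, keep_blank_values=True)
    return {name: values[0] for name, values in parsed.items()}


if __name__ == "__main__":
    for name, value in disclosed_fields(REQUEST_TARGET).items():
        print(f"{name:18} {value!r}")
```

Running the script prints one line per parameter, which makes it easy to compare captures taken from different installs.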
The reply from wordpress.org can be found below:
================HEADERS====================
HTTP/1.1 200 OK
Server: nginx
Date: Wed, 09 Feb 2011 15:14:45 GMT
Content-Type: text/plain; charset=utf-8
Connection: close
Content-Length: 102
==============END OF HEADERS================
latest
Download
http://wordpress.org/wordpress-3.0.5.zip
3.0.5
en_US
4.3
4.1.2
Nothing sinister going on here. The first line of text in the reply after the headers, ‘latest’, serves as an indicator that we have the latest version of WordPress installed. When we repeated the test using an earlier version of WordPress, the text ‘upgrade’ was present in the same place instead.
The two URLs on the second and third lines tell WordPress where to find the latest version, with the fourth being language/locale information. The last two lines refer to the minimum PHP and MySQL version requirements, respectively, needed to run the current version of WordPress.
Other information sent to api.wordpress.org related to the plugins present on our test install of WordPress. The packet capture can be seen below:
==================HEADERS==================
POST /plugins/update-check/1.0/ HTTP/1.0
Host: api.wordpress.org
User-Agent: WordPress/3.0.5; http://networktest.dev
Accept-Encoding: deflate;q=1.0, compress;q=0.5
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Content-Length: 2061
==============END OF HEADERS================
plugins=O%3A8%3A%22stdClass%22%3A2%3A%7Bs%3A7%3A%22plugins%22%3Ba%3A2%3A%7Bs%3A19%3A%22akismet%2Fakismet.php%22%3Ba%3A10%3A%7Bs%3A4%3A%22Name%22%3Bs%3A7%3A%22Akismet%22%3Bs%3A9%3A%22PluginURI%22%3Bs%3A19%3A%22http%3A%2F%2Fakismet.com%2F%22%3Bs%3A7%3A%22Version%22%3Bs%3A5%3A%222.4.0%22%3Bs%3A11%3A%22Description%22%3Bs%3A409%3A%22Akismet+checks+your+comments+against+the+Akismet+web+service+to+see+if+they+look+like+spam+or+not.+You+need+an+%3Ca+href%3D%22http%3A%2F%2Fakismet.com%2Fget%2F%22%3EAPI+key%3C%2Fa%3E+to+use+it.+You+can+review+the+spam+it+catches+under+%22Comments.%22+To+show+off+your+Akismet+stats+just+put+%3Ccode%3E%26lt%3B%3Fphp+akismet_counter%28%29%3B+%3F%26gt%3B%3C%2Fcode%3E+in+your+template.+See+also%3A+%3Ca+href%3D%22http%3A%2F%2Fwordpress.org%2Fextend%2Fplugins%2Fstats%2F%22%3EWP+Stats+plugin%3C%2Fa%3E.%22%3Bs%3A6%3A%22Author%22%3Bs%3A10%3A%22Automattic%22%3Bs%3A9%3A%22AuthorURI%22%3Bs%3A40%3A%22http%3A%2F%2Fautomattic.com%2Fwordpress-plugins%2F%22%3Bs%3A10%3A%22TextDomain%22%3Bs%3A0%3A%22%22%3Bs%3A10%3A%22DomainPath%22%3Bs%3A0%3A%22%22%3Bs%3A7%3A%22Network%22%3Bb%3A0%3Bs%3A5%3A%22Title%22%3Bs%3A7%3A%22Akismet%22%3B%7Ds%3A9%3A%22hello.php%22%3Ba%3A10%3A%7Bs%3A4%3A%22Name%22%3Bs%3A11%3A%22Hello+Dolly%22%3Bs%3A9%3A%22PluginURI%22%3Bs%3A22%3A%22http%3A%2F%2Fwordpress.org%2F%23%22%3Bs%3A7%3A%22Version%22%3Bs%3A5%3A%221.5.1%22%3Bs%3A11%3A%22Description%22%3Bs%3A295%3A%22This+is+not+just+a+plugin%2C+it+symbolizes+the+hope+and+enthusiasm+of+an+entire+generation+summed+up+in+two+words+sung+most+famously+by+Louis+Armstrong%3A+Hello%2C+Dolly.+When+activated+you+will+randomly+see+a+lyric+from+%3Ccite%3EHello%2C+Dolly%3C%2Fcite%3E+in+the+upper+right+of+your+admin+screen+on+every+page.%22%3Bs%3A6%3A%22Author%22%3Bs%3A14%3A%22Matt+Mullenweg%22%3Bs%3A9%3A%22AuthorURI%22%3Bs%3A13%3A%22http%3A%2F%2Fma.tt%2F%22%3Bs%3A10%3A%22TextDomain%22%3Bs%3A0%3A%22%22%3Bs%3A10%3A%22DomainPath%22%3Bs%3A0%3A%22%22%3Bs%3A7%3A%22Network%22%3Bb%3A0%3Bs%3A5%3A%22Title%22%3Bs%3A11%3A%22Hello+Dolly%22%3B%7D%7Ds%3A6%3A%22active%22%3Ba%3A0%3A%7B%7D%7D
Obviously, this needed clearing up a bit before we could look at it properly. After running the output through an online URL-encoded text un-escaping tool we got the following serialised string:
plugins=O:8:"stdClass":2:{s:7:"plugins";a:2:{s:19:"akismet/akismet.php";a:10:{s:4:"Name";s:7:"Akismet";s:9:"PluginURI";s:19:"http://akismet.com/";s:7:"Version";s:5:"2.4.0";s:11:"Description";s:409:"Akismet+checks+your+comments+against+the+Akismet+web+service+to+see+if+they+look+like+spam+or+not.+You+need+an+<a+href="http://akismet.com/get/">API+key</a>+to+use+it.+You+can+review+the+spam+it+catches+under+"Comments."+To+show+off+your+Akismet+stats+just+put+<code>&lt;?php+akismet_counter();+?&gt;</code>+in+your+template.+See+also:+<a+href="http://wordpress.org/extend/plugins/stats/">WP+Stats+plugin</a>.";s:6:"Author";s:10:"Automattic";s:9:"AuthorURI";s:40:"http://automattic.com/wordpress-plugins/";s:10:"TextDomain";s:0:"";s:10:"DomainPath";s:0:"";s:7:"Network";b:0;s:5:"Title";s:7:"Akismet";}s:9:"hello.php";a:10:{s:4:"Name";s:11:"Hello+Dolly";s:9:"PluginURI";s:22:"http://wordpress.org/#";s:7:"Version";s:5:"1.5.1";s:11:"Description";s:295:"This+is+not+just+a+plugin,+it+symbolizes+the+hope+and+enthusiasm+of+an+entire+generation+summed+up+in+two+words+sung+most+famously+by+Louis+Armstrong:+Hello,+Dolly.+When+activated+you+will+randomly+see+a+lyric+from+<cite>Hello,+Dolly</cite>+in+the+upper+right+of+your+admin+screen+on+every+page.";s:6:"Author";s:14:"Matt+Mullenweg";s:9:"AuthorURI";s:13:"http://ma.tt/";s:10:"TextDomain";s:0:"";s:10:"DomainPath";s:0:"";s:7:"Network";b:0;s:5:"Title";s:11:"Hello+Dolly";}}s:6:"active";a:0:{}}
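The un-escaping step does not need an online tool. As a rough sketch (we did not use this during the tests), Python's standard `urllib.parse` module offers two decoders, and the difference between them explains why the string above still contains `+` characters: `unquote` only decodes `%XX` escapes, while `unquote_plus` also turns `+` back into a space. The fragment below is a short piece of the captured body.

```python
from urllib.parse import unquote, unquote_plus

# A short fragment of the captured POST body (the "Hello Dolly" name field).
FRAGMENT = "s%3A4%3A%22Name%22%3Bs%3A11%3A%22Hello+Dolly%22%3B"

# Decodes %XX escapes only; '+' is left untouched.
percent_only = unquote(FRAGMENT)

# Decodes %XX escapes and also converts '+' to a space, which is how
# application/x-www-form-urlencoded bodies are meant to be read.
form_decoded = unquote_plus(FRAGMENT)

if __name__ == "__main__":
    print(percent_only)   # s:4:"Name";s:11:"Hello+Dolly";
    print(form_decoded)   # s:4:"Name";s:11:"Hello Dolly";
```

Either way the declared string length of 11 matches, since a `+` and a space each count as one character.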
Looking at the output above, a list of all plugins installed and information about them (version numbers, authors, descriptions, etc.) is sent to wordpress.org. The code that generates the string seen above can be found in the wp_update_plugins() function in wp-includes/update.php, on lines 126-196.
Obviously, WordPress would need to know the plugin names and version numbers to check for updates, so there is nothing sinister going on there.
What is less obvious, however, is why wordpress.org would need to know whether a plugin was activated on a site or not (via the value of the “active” field in the serialised string) when it’s just checking for updates.
This information could be used to build up statistics on plugins installed and/or activated on sites. Combined with the site URL in the ‘User-Agent’ field of the headers (see the packet capture, above), wordpress.org could potentially identify each site and the plugins installed and/or activated on it uniquely.
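To illustrate how easily this payload could be turned into per-site statistics, here is a rough Python sketch that pulls plugin names, versions and the size of the “active” list out of a serialised string. It uses regular expressions rather than a real PHP unserialiser, the function names are our own, and the sample is a hand-trimmed string in the same shape as the capture (only the Name and Version fields are kept), so treat it as an illustration under those assumptions.

```python
import re

# Hand-trimmed sample in the same shape as the captured payload.
# Only Name and Version are kept for each plugin (hence a:2 instead of a:10).
SAMPLE = (
    'O:8:"stdClass":2:{s:7:"plugins";a:2:{'
    's:19:"akismet/akismet.php";a:2:{s:4:"Name";s:7:"Akismet";'
    's:7:"Version";s:5:"2.4.0";}'
    's:9:"hello.php";a:2:{s:4:"Name";s:11:"Hello Dolly";'
    's:7:"Version";s:5:"1.5.1";}'
    '}s:6:"active";a:0:{}}'
)

# A Name field followed (lazily) by the next Version field.
NAME_VERSION = re.compile(
    r's:4:"Name";s:\d+:"([^"]*)";.*?s:7:"Version";s:\d+:"([^"]*)";',
    re.S,
)
# The declared element count of the "active" array.
ACTIVE_COUNT = re.compile(r's:6:"active";a:(\d+):')


def plugin_versions(serialised):
    """Return (name, version) pairs found in the serialised string."""
    return NAME_VERSION.findall(serialised)


def active_count(serialised):
    """Return how many plugins the payload reports as active, or None."""
    match = ACTIVE_COUNT.search(serialised)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(plugin_versions(SAMPLE))
    print(active_count(SAMPLE))
```

A production tool would want a proper parser for PHP's serialisation format; the regular expressions here only work because plugin names and versions do not contain double quotes.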
On our test instance, the Akismet plugin needed updating. We received the following serialised string as a reply from wordpress.org containing information about the new version of the plugin and a URL for downloading it:
================HEADERS====================
HTTP/1.1 200 OK
Server: nginx
Date: Wed, 09 Feb 2011 15:14:45 GMT
Content-Type: text/plain; charset=utf-8
Connection: close
Content-Length: 265
==============END OF HEADERS================
a:1:{s:19:"akismet/akismet.php";O:8:"stdClass":5:{s:2:"id";s:2:"15";s:4:"slug";s:7:"akismet";s:11:"new_version";s:5:"2.5.3";s:3:"url";s:44:"http://wordpress.org/extend/plugins/akismet/";s:7:"package";s:55:"http://downloads.wordpress.org/plugin/akismet.2.5.3.zip";}}
A check for updates to installed themes was also carried out. The cleaned-up output from Wireshark can be seen below. If you want to see how WordPress generates it, take a look at the wp_update_themes() function on lines 211-295 of wp-includes/update.php.
==================HEADERS==================
POST /themes/update-check/1.0/ HTTP/1.0
Host: api.wordpress.org
User-Agent: WordPress/3.0.5; http://networktest.dev
Accept-Encoding: deflate;q=1.0, compress;q=0.5
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Content-Length: 582
==============END OF HEADERS================
themes=a:2:{s:13:"current_theme";s:9:"twentyten";s:9:"twentyten";a:9:{s:4:"Name";s:10:"Twenty+Ten";s:7:"Version";s:3:"1.1";s:5:"Title";s:10:"Twenty+Ten";s:6:"Author";s:18:"the+WordPress+team";s:11:"Author+Name";s:18:"the+WordPress+team";s:10:"Author+URI";s:0:"";s:8:"Template";s:9:"twentyten";s:10:"Stylesheet";s:9:"twentyten";s:12:"Parent+Theme";s:0:"";}}
The serialised string above contains information on the currently active theme’s sub-directory as well as information on all the themes currently installed on our test instance. We only had the “Twenty Ten” theme installed and activated for this test.
Installing and Activating an Additional Theme Manually
The “Blend” theme was used for this test and was installed manually in a sub-directory named ‘blend’. When the theme was installed but not yet activated, information in the form of a serialised string was sent to wordpress.org. This is the result of WordPress checking for updates to installed themes, as was seen in the last test:
==================HEADERS==================
POST /themes/update-check/1.0/ HTTP/1.0
Host: api.wordpress.org
User-Agent: WordPress/3.0.5; http://networktest.dev
Accept-Encoding: deflate;q=1.0, compress;q=0.5
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Content-Length: 1237
==============END OF HEADERS================
themes=a:3:{s:13:"current_theme";s:9:"twentyten";s:5:"blend";a:9:{s:4:"Name";s:5:"Blend";s:7:"Version";s:5:"2.0.0";s:5:"Title";s:5:"Blend";s:6:"Author";s:105:"Interconnect+IT,+James+R+Whitehead";s:11:"Author+Name";s:34:"Interconnect+IT,+James+R+Whitehead";s:10:"Author+URI";s:26:"https://interconnectit.com/";s:8:"Template";s:5:"blend";s:10:"Stylesheet";s:5:"blend";s:12:"Parent+Theme";s:0:"";}s:9:"twentyten";a:9:{s:4:"Name";s:10:"Twenty+Ten";s:7:"Version";s:3:"1.1";s:5:"Title";s:10:"Twenty+Ten";s:6:"Author";s:18:"the+WordPress+team";s:11:"Author+Name";s:18:"the+WordPress+team";s:10:"Author+URI";s:0:"";s:8:"Template";s:9:"twentyten";s:10:"Stylesheet";s:9:"twentyten";s:12:"Parent+Theme";s:0:"";}}
When the newly installed theme was activated, an updated version of the same serialised string was sent to wordpress.org to reflect the change in theme on the site. No additional information was sent:
=============HEADERS===================
POST /themes/update-check/1.0/ HTTP/1.0
Host: api.wordpress.org
User-Agent: WordPress/3.0.5; http://networktest.dev
Accept-Encoding: deflate;q=1.0, compress;q=0.5
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Content-Length: 1233
==============END OF HEADERS================
themes=a:3:{s:13:"current_theme";s:5:"blend";s:5:"blend";a:9:{s:4:"Name";s:5:"Blend";s:7:"Version";s:5:"2.0.0";s:5:"Title";s:5:"Blend";s:6:"Author";s:105:"Interconnect+IT,+James+R+Whitehead";s:11:"Author+Name";s:34:"Interconnect+IT,+James+R+Whitehead";s:10:"Author+URI";s:26:"https://interconnectit.com/";s:8:"Template";s:5:"blend";s:10:"Stylesheet";s:5:"blend";s:12:"Parent+Theme";s:0:"";}s:9:"twentyten";a:9:{s:4:"Name";s:10:"Twenty+Ten";s:7:"Version";s:3:"1.1";s:5:"Title";s:10:"Twenty+Ten";s:6:"Author";s:18:"the+WordPress+team";s:11:"Author+Name";s:18:"the+WordPress+team";s:10:"Author+URI";s:0:"";s:8:"Template";s:9:"twentyten";s:10:"Stylesheet";s:9:"twentyten";s:12:"Parent+Theme";s:0:"";}}
Activating and De-activating the “Hello Dolly” Plug-in
OK, so you may be asking yourself at this point: what could this plugin possibly be sending out? We chose this plug-in because we wanted to see what the WordPress core (and not an individual plug-in) sent out when a plug-in was activated and de-activated.
Nothing was sent to wordpress.org (or anywhere else, for that matter) when this plug-in was activated or de-activated.
Akismet Plugin Tests
When Akismet was activated, the following information, sent to akismet.com (72.233.69.2) in the form of a POST request, was captured by Wireshark:
=============HEADERS===============
POST /1.1/verify-key HTTP/1.0
User-Agent: WordPress/3.0.5 | Akismet/2.5.3
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Host: rest.akismet.com
Accept-Encoding: deflate;q=1.0, compress;q=0.5
Content-Length: 50
==========END OF HEADERS============
key=1234567890ab&blog=http%3A%2F%2Fnetworktest.dev
Here, we can see that two items of information are sent: the API key used on the site and the URL of the site.
Akismet is just checking that we have a valid API key configured on our site. As we had just activated and not yet configured Akismet at this point, it would appear that a dummy value of ‘1234567890ab’ is sent instead.
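As a sanity check on the capture, the request body can be rebuilt from its two fields. This is a minimal Python sketch, assuming standard form encoding; the function name `build_verify_key_body` is ours and is not part of Akismet. The rebuilt body is 50 bytes long, which agrees with the Content-Length header in the captured request.

```python
from urllib.parse import urlencode


def build_verify_key_body(api_key, blog_url):
    """Form-encode the two fields seen in the captured verify-key request.

    Hypothetical helper for illustration only.
    """
    return urlencode({"key": api_key, "blog": blog_url})


# Values copied from the capture: the dummy key and our test site URL.
body = build_verify_key_body("1234567890ab", "http://networktest.dev")

if __name__ == "__main__":
    print(body)                        # key=1234567890ab&blog=http%3A%2F%2Fnetworktest.dev
    print(len(body.encode("ascii")))   # 50
```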
As we expected the API key value passed back to akismet.com to be invalid, the reply we observed from akismet.com came as no surprise:
=============HEADERS===============
HTTP/1.1 200 OK
Server: nginx
Date: Thu, 10 Feb 2011 11:11:53 GMT
Content-Type: text/plain; charset=utf-8
Connection: close
X-akismet-server: 192.168.6.48
Content-length: 7
==========END OF HEADERS============
invalid
Things got a little bit more interesting when we started to post comments on the test installation with Akismet activated and configured. When a test comment was submitted, without being logged into the WordPress Dashboard, the following data was sent to akismet.com:
=============HEADERS===============
POST /1.1/comment-check HTTP/1.0
User-Agent: WordPress/3.0.5 | Akismet/2.5.3
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Host: 47d9da91cd8f.rest.akismet.com
Accept-Encoding: deflate;q=1.0, compress;q=0.5
Content-Length: 3217
==========END OF HEADERS============
comment_post_ID=358&comment_author=peter&[email protected]&comment_author_url=http//site&comment_content=test+comment&comment_type=&comment_parent=0&user_ID=0
&user_ip=127.0.0.1&user_agent=Mozilla/5.0+(Windows;+U;+Windows+NT+6.1;+en-GB;+rv:1.9.2.13)+Gecko/20101203+Firefox/3.6.13
&referrer=http://networktest.dev/?p=358&blog=http://networktest.dev&blog_lang=en_US&blog_charset=UTF-8
&permalink=http://networktest.dev/?p=358&user_role=&akismet_comment_nonce=passed&POST_author=peter&[email protected]&POST_url=http//site&POST_comment=test+comment
&POST_submit=Post+Comment&POST_comment_post_ID=358&POST_comment_parent=0&POST_akismet_comment_nonce=3e4d8f4d4b
&SERVER_SOFTWARE=Apache/2.2.14+(Win32)+DAV/2+mod_ssl/2.2.14+OpenSSL/0.9.8l+mod_autoindex_color+PHP/5.3.1+mod_apreq2-20090110/2.7.1+mod_perl/2.0.4+Perl/v5.10.1&REQUEST_URI=/wp-comments-post.php&MIBDIRS=/xampp/php/extras/mibs&MYSQL_HOME=\xampp\mysql\bin&OPENSSL_CONF=/xampp/apache/bin/openssl.cnf&PHP_PEAR_SYSCONF_DIR=\xampp\php&PHPRC=\xampp\php&TMP=\xampp\tmp&HTTP_HOST=networktest.dev&HTTP_USER_AGENT=Mozilla/5.0+(Windows;+U;+Windows+NT+6.1;+en-GB;+rv:1.9.2.13)+Gecko/20101203+Firefox/3.6.13&HTTP_ACCEPT=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8&HTTP_ACCEPT_LANGUAGE=en-gb,en;q=0.5&HTTP_ACCEPT_ENCODING=gzip,deflate&HTTP_ACCEPT_CHARSET=ISO-8859-1,utf-8;q=0.7,*;q=0.7&HTTP_KEEP_ALIVE=115&HTTP_CONNECTION=keep-alive&HTTP_REFERER=http://networktest.dev/?p=358&HTTP_COOKIE=&CONTENT_TYPE=application/x-www-form-urlencoded&CONTENT_LENGTH=160&PATH=C:\Perl\site\bin;C:\Perl\bin;C:\Program+Files+(x86)\ActiveState+Komodo+Edit+5\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program+Files+(x86)\ATI+Technologies\ATI.ACE\Core-Static;C:\Program+Files\TortoiseSVN\bin;c:\putty\;C:\Program+Files+(x86)\WinSCP\;C:\Program+Files+(x86)\Graphviz2.26.3\bin;C:\Program+Files+(x86)\GmoteServer\bin\vlc\;C:\Program+Files+(x86)\Google\Google+Apps+Sync\&SystemRoot=C:\Windows&COMSPEC=C:\Windows\system32\cmd.exe&PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC&WINDIR=C:\Windows&SERVER_SIGNATURE=Apache/2.2.14+(Win32)+DAV/2+mod_ssl/2.2.14+OpenSSL/0.9.8l+mod_autoindex_color+PHP/5.3.1+mod_apreq2-20090110/2.7.1+mod_perl/2.0.4+Perl/v5.10.1+Server+at+networktest.dev+Port+80&SERVER_NAME=networktest.dev
&SERVER_ADDR=127.0.0.1&SERVER_PORT=80&REMOTE_ADDR=127.0.0.1&DOCUMENT_ROOT=T:/hosts/networktest&[email protected]&SCRIPT_FILENAME=T:/hosts/networktest/wp-comments-post.php
&REMOTE_PORT=51573&GATEWAY_INTERFACE=CGI/1.1&SERVER_PROTOCOL=HTTP/1.1&REQUEST_METHOD=POST&QUERY_STRING=&SCRIPT_NAME=/wp-comments-post.php&PHP_SELF=/wp-comments-post.php&REQUEST_TIME=1297355111&argv=&argc=0&
While it is understandable that Akismet would need information about the comment (author details, origin IP address, the user agent sent by the comment poster’s web browser, etc.), additional information is sent as well. We also repeated this test while logged in and got the same result.
What is especially worrying here is that the contents of our test system’s PATH variable, as well as other information on our server configuration, were also sent in the clear over HTTP. This has obvious privacy and security implications. Anyone could intercept this information in transit (as we did on our test machine), and it could be useful to anyone looking for vulnerabilities in a server.
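To separate the comment data from the server details in a capture like this, the body can be split into fields and sorted into rough groups. The Python sketch below is one way to do that. The grouping rules and the `COMMENT_FIELDS` set are our own assumptions, based on the field names visible in the capture, and the sample body is a shortened, hand-encoded subset of the real one.

```python
from urllib.parse import parse_qsl

# A few fields in the same shape as the captured comment-check body,
# re-encoded by hand for this sketch (values shortened).
SAMPLE_BODY = (
    "comment_author=peter&comment_content=test+comment&user_ip=127.0.0.1"
    "&blog=http%3A%2F%2Fnetworktest.dev&POST_comment=test+comment"
    "&HTTP_HOST=networktest.dev&SERVER_ADDR=127.0.0.1"
    "&DOCUMENT_ROOT=T%3A%2Fhosts%2Fnetworktest"
    "&PATH=C%3A%5CWindows%5Csystem32%3BC%3A%5CWindows"
    "&SystemRoot=C%3A%5CWindows"
)

# Assumption: these are the fields describing the comment and the blog,
# taken from the names at the start of the captured body.
COMMENT_FIELDS = {
    "comment_post_ID", "comment_author", "comment_author_url",
    "comment_content", "comment_type", "comment_parent", "user_ID",
    "user_ip", "user_agent", "referrer", "blog", "blog_lang",
    "blog_charset", "permalink", "user_role", "akismet_comment_nonce",
}


def classify(name):
    """Put a field name into one of four rough groups (our own grouping)."""
    if name in COMMENT_FIELDS:
        return "comment data"
    if name.startswith("POST_"):
        return "form copy"
    if name.startswith("HTTP_"):
        return "request header"
    return "server environment"


def group_fields(body):
    """Return a dict mapping group name -> list of field names, in order."""
    groups = {}
    for name, _value in parse_qsl(body, keep_blank_values=True):
        groups.setdefault(classify(name), []).append(name)
    return groups


if __name__ == "__main__":
    for group, names in group_fields(SAMPLE_BODY).items():
        print(f"{group}: {', '.join(names)}")
```

Anything that lands in the "server environment" group is a candidate for closer inspection, since it describes the machine rather than the comment.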
Automattic’s Privacy Policy does say that information about comments marked as spam using Akismet is sent to and used by Automattic. However, no mention is made of information on the server environment (the contents of the system PATH variable, for example) being sent, or of how or whether it is used by Automattic.
Conclusion
The information we observed being sent out by the WordPress core itself, when checking for updates, was fairly innocuous. The information sent out about plugins and themes installed and/or activated on the site was not particularly sensitive and, although useful to Automattic and WordPress when compiling usage statistics, would not represent a privacy or security risk.
The most worrying results came from the Akismet tests. We can understand why it sends out some of the information it does: in part it’ll be because there will be attempts to poison the service by sending fake approvals. We can imagine that some spammers, somewhere, set up little mini WP servers (perhaps as trojans on old unpatched Windows machines) which are configured to approve whole streams of spam comments, so that they can fool Akismet into believing the spam is legitimate. However, there will be a pattern to these mini servers, and this could be revealed by the path information that is sent out. The Bayesian filters we’re sure Akismet uses will therefore be harder to poison.
We believe that Automattic need to be careful in that area to ensure that this information is tightly managed. We also believe that both Automattic and WordPress.org need to be open about the information they retain, and for how long. That will fuel the inevitable debates, of course, but if we all know then we can make our own decisions on whether or not Akismet’s model is acceptable to us. It would be hard to recommend Akismet for a super-high security website, but then those kinds of websites don’t tend to run WordPress anyway. For most sites it’s unlikely to be a major issue, and many of us at interconnect/it will continue to use Akismet for our personal blogs.
Addenda
David Coveney, 13 Sept 2011 – We’ve recently spotted that the text editor uses Google’s Spell API for spell checking. Although the information is sent via HTTPS and is protected from eavesdropping, it does mean that spell-checked content will be available to Google. What’s worrying is that the API is not properly documented and it’s not known what information Google retains. For a public-facing website this is unlikely to be a significant concern, but for private intranets, where the information stored may be confidential, this would need careful consideration and possibly blocking at the firewall.