<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Linux Sysadmin Blog &#187; Performance</title>
	<atom:link href="http://linuxsysadminblog.com/category/performance/feed/" rel="self" type="application/rss+xml" />
	<link>http://linuxsysadminblog.com</link>
	<description></description>
	<lastBuildDate>Tue, 10 May 2011 03:23:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.2</generator>
		<item>
		<title>A Day in the Life of Facebook Operations</title>
		<link>http://linuxsysadminblog.com/2010/09/a-day-in-the-life-of-facebook-operations/</link>
		<comments>http://linuxsysadminblog.com/2010/09/a-day-in-the-life-of-facebook-operations/#comments</comments>
		<pubDate>Thu, 30 Sep 2010 19:23:51 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[Performance]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=1145</guid>
		<description><![CDATA[Notes from the &#8220;A Day in the Life of Facebook Operations&#8221; presentation by Tom Cook, Systems Engineer, Facebook at Surge2010 conference. So far this is the best-attended session. It was standing room only before it even started. What do Facebook sysadmins have to support? Monthly 700 million minutes of time spent on fb 6 billion pieces [...]]]></description>
			<content:encoded><![CDATA[<p>Notes from the &#8220;A Day in the Life of Facebook Operations&#8221; presentation by Tom Cook, Systems Engineer, Facebook at <a href="http://omniti.com/surge/2010">Surge2010</a> conference.</p>
<p>So far this is the best-attended session.  It was standing room only before it even started.</p>
<p>What do Facebook sysadmins have to support?</p>
<ul>
<li>Monthly 700 million minutes of time spent on fb</li>
<li>6 billion pieces of content updated</li>
<li>3 billion photos</li>
<li>1 million connect implementations</li>
<li>1/2 billion active users</li>
</ul>
<p>Infrastructure Growth</p>
<ul>
<li>fb reached a limit on leasing datacenter space</li>
<li>fb is building its own datacenter: <a href="http://www.facebook.com/prinevilledatacenter">facebook.com/prinevilledatacenter</a></li>
<li>currently serving out of California and Virginia</li>
</ul>
<p>Initially a LAMP stack.  LB -&gt; Web Servers -&gt; Services/Memcached/Databases</p>
<p>Originally facebook was a simple Apache PHP site.  When fb started hitting a limit on this, they started compiling PHP into C++ (<a href="http://developers.facebook.com/blog/post/358">HipHop</a> for PHP).</p>
<p>FB claims to be the biggest memcached deployment in the world.  They serve 300 terabytes of memcached data out of memory.</p>
<p>Among the <a href="http://www.facebook.com/MySQLatFacebook">MySQL improvements</a> contributed back is flashcache.</p>
<p>Services supported</p>
<ul>
<li>News Feed</li>
<li>Search</li>
<li>Cache</li>
</ul>
<p>Service implementation languages</p>
<ul>
<li>C++</li>
<li>PHP &#8211; front end</li>
<li>python</li>
<li>Ruby</li>
<li>Java</li>
<li>erlang (chat room)</li>
</ul>
<p>How do they talk between these?  JSON?  SOAP?  No, fb implemented Thrift &#8211; a lightweight software framework for cross-language development, a common glue behind all facebook systems.</p>
<p>For Systems, what does fb have to worry about on a daily basis?</p>
<ul>
<li>deployment</li>
<li>monitoring</li>
<li>data management</li>
<li>Core operating system updates</li>
</ul>
<p>Facebook OS is&#8230;. CentOS!</p>
<p>Systems Management</p>
<ul>
<li>Configuration Management</li>
<li>CFengine for system management</li>
<li>On Demand</li>
</ul>
<p>Deployments</p>
<ul>
<li>Web Push &#8211; new code gets deployed to fb at least once a day.  It&#8217;s a coordinated push: everyone is aware, and the dev team is notified.  Everyone sits on IRC during the push.  The process is understood by engineers and the rest of the company
<ul>
<li>push software built over on-demand control tools</li>
<li>code distributed via internal BitTorrent swarm</li>
<li>PHP gets compiled, and the resulting binary of a few hundred MB gets rapidly pushed via BitTorrent.</li>
<li>it takes one minute to push across the entire network</li>
</ul>
</li>
<li>Backend Deployments &#8211; only Engineering and Operations.  Engineers write, test and deploy
<ul>
<li>Quickly make performance decisions</li>
<li>Expose changes to subset of real traffic</li>
<li>No &#8216;commit and quit&#8217;</li>
<li>Deeply involved in moving services to production</li>
<li>Ops &#8216;embedded&#8217; into engineering teams</li>
</ul>
</li>
<li>Heavy change logging &#8211; pinpointing the code behind every push and change</li>
</ul>
<p>Monitoring and Metrics of servers and performance at facebook</p>
<ul>
<li><a href="http://ganglia.sourceforge.net/">Ganglia</a> &#8211; aggregated metrics
<ul>
<li>fast</li>
<li>straightforward</li>
<li>nested grids &amp; pools</li>
<li>over 5 million monitored metrics</li>
</ul>
</li>
<li>facebook inhouse monitoring system</li>
</ul>
<p>Monitoring &#8211; facebook still uses <a href="http://www.nagios.org/">Nagios</a>!</p>
<p>To manage the complexity and the sheer number of alarms and monitored systems, the fb team uses aggregation.  Initially alarms were managed by email.</p>
<p>Scribe &#8211; high performance logging application.  Initially used syslog.  Also used Hadoop and Hive.</p>
<p>How does it all work and get done?</p>
<ul>
<li>clear delineation of dependencies and responsibilities</li>
<li>Constant Failure</li>
<li>Servers were the first line of defense; then the focus moved to racks</li>
<li>The focus is now on clusters, logically delineated by function (web, db, feed, etc)</li>
<li>Next stage is datacenters &#8211; what to do if a natural disaster strikes?</li>
<li>Constant Communication &#8211; information is shared constantly.
<ul>
<li>IRC</li>
<li>lots of automated bots, get and set data</li>
<li>internal news updates</li>
<li>&#8220;Headers&#8221; on internal tools</li>
<li>Change log/feeds</li>
</ul>
</li>
<li>Small teams</li>
</ul>
<p>Interesting fact &#8211; each fb server gets an update on average every eight minutes.</p>
<p>The busiest day for FB is the day after Halloween <img src='http://linuxsysadminblog.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>URLs to check out:</p>
<p><a href="http://www.facebook.com/Engineering">facebook.com/engineering</a></p>
<p><a href="http://developers.facebook.com/opensource/">facebook.com/opensource</a></p>
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2010/09/a-day-in-the-life-of-facebook-operations/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Drupal Performance talk video from Drupal Con 2010</title>
		<link>http://linuxsysadminblog.com/2010/07/drupal-performance-talk-video-from-drupal-con-2010/</link>
		<comments>http://linuxsysadminblog.com/2010/07/drupal-performance-talk-video-from-drupal-con-2010/#comments</comments>
		<pubDate>Fri, 30 Jul 2010 04:01:17 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[drupal]]></category>
		<category><![CDATA[Performance]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=1122</guid>
		<description><![CDATA[I have already mentioned that Google will start penalizing your site in search if it is not fast enough. I recently gave a presentation at Drupal Con about Drupal performance and page speed. Below is a video of our session &#8211; it&#8217;s currently the fifth most watched session from the conference according to archive.org. [...]]]></description>
			<content:encoded><![CDATA[<p>I have already mentioned that <a href="http://linuxsysadminblog.com/2010/04/google-will-use-site-performance-and-page-load-speed-in-serp-ranking-sysadmin-seo-here-we-come/">Google will start penalizing your site</a> in search if it is not fast enough.  I recently gave a presentation at Drupal Con about Drupal performance and page speed.  Below is a video of our session &#8211; it&#8217;s currently the fifth most watched session from the conference according to <a href="http://www.archive.org/search.php?query=DrupalCon%20SF%202010&amp;sort=-downloads">archive.org</a>.  Alternatively, there are several different formats <a href="http://www.archive.org/details/MakeDrupalRunFast-IncreasePageLoadSpeed">there as well</a>.</p>
<p><embed src="http://blip.tv/play/hpMogfiKHQA" type="application/x-shockwave-flash" width="560" height="385" allowscriptaccess="always" allowfullscreen="true"></embed></p>
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2010/07/drupal-performance-talk-video-from-drupal-con-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google will use site performance and page load speed in SERP ranking &#8211; sysadmin SEO here we come</title>
		<link>http://linuxsysadminblog.com/2010/04/google-will-use-site-performance-and-page-load-speed-in-serp-ranking-sysadmin-seo-here-we-come/</link>
		<comments>http://linuxsysadminblog.com/2010/04/google-will-use-site-performance-and-page-load-speed-in-serp-ranking-sysadmin-seo-here-we-come/#comments</comments>
		<pubDate>Wed, 14 Apr 2010 04:48:56 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[page load speed]]></category>
		<category><![CDATA[SEO]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=1075</guid>
		<description><![CDATA[Page Load speed just became a lot more important &#8211; Google announced recently that it will use page speed in its SERP rankings. Here is a quote that will make the SEO and marketing folks knock on sysadmin doors: We encourage you to start looking at your site&#8217;s speed (the tools above provide a great [...]]]></description>
			<content:encoded><![CDATA[<p>Page Load speed just became a lot more important &#8211; Google announced recently that it will use <a href="http://googlewebmastercentral.blogspot.com/2010/04/using-site-speed-in-web-search-ranking.html">page speed in its SERP</a> rankings.  Here is a quote that will make the SEO and marketing folks knock on sysadmin doors:</p>
<blockquote><p>We encourage you to start looking at your site&#8217;s speed (the tools above provide a great starting point) — not only to improve your ranking in search engines, but also to improve everyone&#8217;s experience on the Internet.</p></blockquote>
<p>The post lists a number of tools everyone should be using already, such as <a href="http://developer.yahoo.com/yslow/" target="_blank">YSlow</a>, Google&#8217;s own <a href="http://code.google.com/speed/page-speed/">PageSpeed</a>, the <a href="http://www.webpagetest.org/">online speed waterfall diagram</a> tool, and Webmaster Tools.  Webmaster Tools recently added a beta feature which provides data about your site&#8217;s speed relative to other sites on the internet.</p>
<p>Here is a sample report:</p>
<div id="attachment_1076" class="wp-caption aligncenter" style="width: 760px"><a href="http://linuxsysadminblog.com/wp-content/uploads/2010/04/chart.png"><img class="size-full wp-image-1076" title="webmaster tools page speed chart" src="http://linuxsysadminblog.com/wp-content/uploads/2010/04/chart.png" alt="google webmaster tools page load speed chart" width="750" height="150" /></a><p class="wp-caption-text">webmaster tools page speed chart</p></div>
<blockquote><p>Performance overview<br />
On average, pages in your site take 6.3 seconds to load (updated on Apr 11, 2010). This is slower than 83% of sites. These estimates are of medium accuracy (between 100 and 1000 data points). The chart below shows how your site&#8217;s average page load time has changed over the last few months. For your reference, it also shows the 20th percentile value across all sites, separating slow and fast load times.</p></blockquote>
<p>Linux Sysadmin Blog will have a series on page speed improvement.  Stay tuned!</p>
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2010/04/google-will-use-site-performance-and-page-load-speed-in-serp-ranking-sysadmin-seo-here-we-come/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Enable/Disable APC on virtual host level</title>
		<link>http://linuxsysadminblog.com/2010/03/enabledisable-apc-on-virtual-host-level/</link>
		<comments>http://linuxsysadminblog.com/2010/03/enabledisable-apc-on-virtual-host-level/#comments</comments>
		<pubDate>Thu, 18 Mar 2010 21:53:29 +0000</pubDate>
		<dc:creator>Marius</dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[Tips and Tricks]]></category>
		<category><![CDATA[apc]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=1069</guid>
		<description><![CDATA[APC (Alternative PHP Cache) is a free, open, and robust framework for caching and optimizing PHP intermediate code. APC is a great tool to speed up a php driven site and I can&#8217;t even think of a big site running on a php framework without an opcode cache (other good choices are eaccelerator or xcache). [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://pecl.php.net/package/APC" target="_blank"><strong>APC</strong></a> (Alternative PHP Cache) is a free, open, and robust framework for caching and optimizing PHP intermediate code. <strong>APC</strong> is a great tool to speed up a PHP-driven site, and I can&#8217;t even think of a big site running on a PHP framework without an <em>opcode cache</em> (other good choices are <strong>eaccelerator</strong> or <strong>xcache</strong>). Why wouldn&#8217;t everyone want to use this? The reason it is not enabled by default everywhere is that in certain situations it can break things. Most people will not see any problems, but if you run a server with many clients sharing the same Apache service, this might be a problem (because loading the APC module is a server-wide setting). This post shows how to use APC globally and disable it for some vhosts (that might have a problem with APC) or, the reverse, enable it only on a particular vhost that needs it.</p>
<p>I&#8217;ll assume that you have already installed APC. If not, it is probably as simple as running<br />
<code>pecl install apc</code><br />
or downloading the archive from pecl and running:<br />
<code>phpize; ./configure; make; make install</code></p>
<p>The APC extension needs to be enabled either in <strong>php.ini</strong> or in one of its included files with a line like this:<br />
<code>extension=apc.so</code><br />
There are many other parameters with which APC can be fine-tuned (see the official documentation for more info), but with this line alone APC will be enabled on all the vhosts on the server.</p>
<p><strong>Disabling some vhosts from using APC</strong><br />
- to disable APC for a particular vhost, we just add this to the vhost config or to .htaccess:<br />
<code>php_flag apc.cache_by_default Off</code></p>
<p><strong>Enabling APC only on some vhosts</strong><br />
- to have APC disabled by default globally, we put this in php.ini:</p>
<pre><code>extension=apc.so
[apc]
apc.cache_by_default=0 ; disable by default
... other apc settings...</code></pre>
<p>and enable APC for a particular vhost, in its config or in .htaccess, with:<br />
<code>php_flag apc.cache_by_default On</code></p>
<p>Hopefully you found this post useful and this will give you a reason to use APC with more confidence knowing that you have the granularity to disable/enable it as needed in a shared environment.</p>
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2010/03/enabledisable-apc-on-virtual-host-level/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Google to offer free DNS service</title>
		<link>http://linuxsysadminblog.com/2009/12/google-to-offer-free-dns-service/</link>
		<comments>http://linuxsysadminblog.com/2009/12/google-to-offer-free-dns-service/#comments</comments>
		<pubDate>Fri, 04 Dec 2009 02:55:05 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[cloud computing]]></category>
		<category><![CDATA[Performance]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=1024</guid>
		<description><![CDATA[Google will start pushing for a faster web next year, and there have been several rumors in the SEO and marketing world that google will add page speed to its SEO rankings algorithm.  Yesterday they have]]></description>
			<content:encoded><![CDATA[<p>Google will start pushing for a faster web next year, and there have been several rumors in the SEO and marketing world that Google will add <a href="http://www.steverenner.com/ranking-web-site-speed">page speed to its SEO rankings algorithm</a>.  Yesterday they <a href="http://googleblog.blogspot.com/2009/12/introducing-google-public-dns.html">announced</a> that Google will offer a <a href="http://code.google.com/speed/public-dns/" target="_blank">free DNS service</a>.</p>
<p>First off, this is great.  It should improve the speed of looking up the DNS info of many sites, and if the service takes off, it should take the load off your NS.</p>
<p>The focus on speed is very clear: the Google Public DNS site lists it first among the advantages.  It also points to the speed problems caused by <a href="http://code.google.com/speed/public-dns/docs/performance.html">DNS latency</a>.</p>
<h3>Google Public DNS IP addresses</h3>
<p>The Google Public DNS IP addresses are as follows:</p>
<ul>
<li>8.8.8.8</li>
<li>8.8.4.4</li>
</ul>
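<p>On a typical Linux system the resolver reads its servers from <strong>/etc/resolv.conf</strong>, one <code>nameserver</code> line per address. As a minimal sketch (the function name is made up for illustration, and this assumes the file is not being rewritten by a DHCP client or another network manager), the lines to add can be generated like this:</p>
<pre><code>#!/bin/bash
# Sketch: print resolv.conf entries for the two Google Public DNS addresses.
# google_dns_resolv_conf is a hypothetical helper name, not an existing tool.
google_dns_resolv_conf() {
    for ip in 8.8.8.8 8.8.4.4; do
        echo "nameserver ${ip}"
    done
}

# Prints two lines: "nameserver 8.8.8.8" and "nameserver 8.8.4.4"
google_dns_resolv_conf</code></pre>
<p>Review the output before pasting it into /etc/resolv.conf, since the order of the lines decides which server is tried first.</p>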
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2009/12/google-to-offer-free-dns-service/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Tracing memory leaks with pidstat</title>
		<link>http://linuxsysadminblog.com/2009/06/tracing-memory-leaks-with-pidstat/</link>
		<comments>http://linuxsysadminblog.com/2009/06/tracing-memory-leaks-with-pidstat/#comments</comments>
		<pubDate>Tue, 16 Jun 2009 19:53:10 +0000</pubDate>
		<dc:creator>max</dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[pidstat]]></category>
		<category><![CDATA[script]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=734</guid>
		<description><![CDATA[Finding application memory leaks is an important part of keeping systems stable, and leaks are often very hard to track down. Monitoring application memory consumption can be done in a few different ways; the easiest is to capture ps output and append it to a log file, triggered via cron at the desired interval. In this example we will track [...]]]></description>
			<content:encoded><![CDATA[<p>Finding application memory leaks is an important part of keeping systems stable, and leaks are often very hard to track down. Monitoring application memory consumption can be done in a few different ways; the easiest is to capture ps output and append it to a log file, triggered via cron at the desired interval. In this example we will track sshd memory usage via a shell script.</p>
<p><code>
<pre>
#!/bin/bash

PID=`cat /var/run/sshd.pid`
ps -p $PID -o pid -o rss -o %mem -o cmd &gt;&gt; logname
exit
</pre>
<p></code><br />
<span id="more-734"></span><br />
The log file output will look like this:<br />
<code>
<pre>
PID RSS %MEM CMD
2607  1036  0.0 /usr/sbin/sshd</pre>
<p></code></p>
<p>Note the RSS column: if the values keep increasing as the application is used, that would indicate a memory leak.</p>
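<p>Comparing the first and last RSS samples in the log can also be scripted. This is a minimal sketch and not part of the original script: <code>rss_growth</code> is a hypothetical helper name, the sample rows are made up, and it assumes the log keeps the column order produced by the ps command above (PID, RSS, %MEM, CMD).</p>
<pre><code>#!/bin/bash
# Sketch: print how much RSS (in kB) changed between the first and last sample.
# Reads ps-style rows from the files given as arguments, or from stdin if none.
rss_growth() {
    # Keep only data rows (numeric first field); ps repeats its header on every run.
    awk '$1 ~ /^[0-9]+$/ { if (first == "") first = $2; last = $2 }
         END { if (first != "") print last - first }' "$@"
}

# Demo on three made-up samples; a real run would be: rss_growth logname
printf 'PID RSS %%MEM CMD\n2607 1036 0.0 /usr/sbin/sshd\nPID RSS %%MEM CMD\n2607 1036 0.0 /usr/sbin/sshd\nPID RSS %%MEM CMD\n2607 1100 0.0 /usr/sbin/sshd\n' | rss_growth   # prints 64</code></pre>
<p>A single positive number is not proof of a leak on its own; what matters is growth that continues across many samples.</p>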
<p>Another way to do the same thing is with <a href="http://pagesperso-orange.fr/sebastien.godard/man_pidstat.html">pidstat</a>, which is part of the <a href="http://pagesperso-orange.fr/sebastien.godard/">sysstat</a> package. The package can be installed via yum or aptitude, but the packaged version may not include pidstat. If so, download and install or build the latest version from <a href="http://pagesperso-orange.fr/sebastien.godard/download.html">here</a>.</p>
<p>Pidstat is a better way to track the resources of an application because it has built-in polling and can combine the statistics of a process&#8217;s children into the total. See the pidstat man page for options. In the script below we will use pidstat to track memory usage for sshd by taking 12 samples at 5-minute intervals, writing them to a log file, and e-mailing a report. This script can also be run via cron.</p>
<p><code>
<pre>#!/bin/bash
# track memory usage of sshd using pidstat and send report
# http://www.linuxsysadminblog.com - MaxV

export PID=/var/run/sshduse.pid
export TIMESTAMP=`date +%Y%m%d_%H%M%S`
export LOGDIR=/var/log/
export SSHD_LOG="${LOGDIR}sshd_memUsage_${TIMESTAMP}"
export SSHD_PID=`cat /var/run/sshd.pid`
export MAILTO=user@domain.com

if [ ! -e ${PID} ]; then

#create pid file
echo $$ &gt; ${PID}

#log begin of script to /var/log/messages
/usr/bin/logger "Starting SSHD Memory Usage Tracker"

# pidstat portion: take 12 samples, 5 minutes (300 seconds) apart
/usr/bin/pidstat -r -p ${SSHD_PID} 300 12 &gt;&gt; ${SSHD_LOG}

#e-mail report
mail -s "SSHD memory usage ${TIMESTAMP}" ${MAILTO} &lt; ${SSHD_LOG}

#clean up pid file
if [ -f ${PID} ]; then
rm -rf ${PID}

#log end of script to /var/log/messages
/usr/bin/logger "Ending SSHD Memory Usage Tracker"
fi
exit 0
else

exit 0
fi</pre>
<p></code></p>
<p>Sample output of this script (the timestamps show this run was captured with a 30-second interval rather than 300):<br />
<code>
<pre>
Linux 2.6.18-6-686 (hostname)       06/16/09        _i686_  (2 CPU)

14:21:26          PID  minflt/s  majflt/s     VSZ    RSS   %MEM  Command
14:21:56         2564      0.00      0.00    4932   1108   0.05  sshd
14:22:26         2564      0.00      0.00    4932   1108   0.05  sshd
14:22:56         2564      0.00      0.00    4932   1108   0.05  sshd
14:23:26         2564      0.00      0.00    4932   1108   0.05  sshd
14:23:56         2564      0.00      0.00    4932   1108   0.05  sshd
14:24:26         2564      0.00      0.00    4932   1108   0.05  sshd
14:24:56         2564      0.00      0.00    4932   1108   0.05  sshd
14:25:26         2564      0.00      0.00    4932   1108   0.05  sshd
14:25:56         2564      0.00      0.00    4932   1108   0.05  sshd
14:26:26         2564      0.00      0.00    4932   1108   0.05  sshd
14:26:56         2564      0.47      0.00    4932   1108   0.05  sshd
14:27:26         2564      0.00      0.00    4932   1108   0.05  sshd
Average:         2564      0.04      0.00    4932   1108   0.05  sshd</pre>
<p></code></p>
<p>From the output we can see that the &#8220;RSS&#8221; values are not increasing as time progresses, which suggests that sshd is not leaking memory.</p>
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2009/06/tracing-memory-leaks-with-pidstat/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MySQL Query Cache, Good or Bad?</title>
		<link>http://linuxsysadminblog.com/2009/04/mysql-query-cache-good-or-bad/</link>
		<comments>http://linuxsysadminblog.com/2009/04/mysql-query-cache-good-or-bad/#comments</comments>
		<pubDate>Wed, 22 Apr 2009 01:55:06 +0000</pubDate>
		<dc:creator>Pim</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[cache]]></category>
		<category><![CDATA[tuning]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=536</guid>
		<description><![CDATA[MySQL has a number of different caches. Most of those are dependent on the storage engine that is used. The key buffer for example caches the indexes for MyISAM tables while the caching of data is left to the OS. InnoDB has the buffer pool for both data and indexes and so on. The query [...]]]></description>
			<content:encoded><![CDATA[<p>MySQL has a number of different caches. Most of those are dependent on the storage engine that is used. The key buffer, for example, caches the indexes for MyISAM tables while the caching of data is left to the OS. InnoDB has the buffer pool for both data and indexes, and so on. The query cache, however, is independent of the storage engine that is used. Unlike most caches it does not store records or pages of data but complete result sets and the queries that caused those results to be returned. This is a debatable design, because if any of the tables used in a result set is modified, the whole cached result set is thrown out of the cache.</p>
<p><span id="more-536"></span>The good news is that if you have data that does not change very much, the query cache can give you an enormous performance boost. It even bypasses the query optimizer, so if the query is complex even more CPU time is saved. Knowing this, you can optimize your application by chopping complex queries into smaller queries that only use data that rarely changes.</p>
<p>Of course there are some tricks to using the query cache. The first one is the size of the query cache. The default is 16MB, which doesn&#8217;t do much. However, keep in mind that any memory assigned to the query cache is memory that cannot go to another cache, so it&#8217;s very important to strike a good balance. Of course the balance is very application dependent. The second parameter is the maximum allowed result set size. It really doesn&#8217;t do any good to allow 16MB result sets into the cache, because it would take only one badly written query to flush out the entire cache. 1MB is standard, but in my personal experience queries that return 1MB of results on a frequent basis usually indicate that the software needs to be optimized.</p>
<p>So when is the query cache a bad thing? In short, when the cache gets flushed out all the time it only adds overhead, and it&#8217;s usually better to assign the memory to a storage engine dependent cache. If there are constant updates and inserts to most of your tables, they will invalidate the results in the query cache pretty quickly, and assigning memory to it is a waste of resources.</p>
<p>Useful tools like <a title="MySQL Tuner" href="http://wiki.mysqltuner.com/MySQLTuner" target="_blank">MySQL Tuner</a> will give some quick information about the efficiency of the query cache but I do think it is a bit quick in suggesting more memory for the cache.</p>
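<p>A rough efficiency check can also be done by hand from the server&#8217;s status counters. The sketch below assumes the commonly used hit rate formula Qcache_hits / (Qcache_hits + Com_select), which works because selects answered from the cache increment Qcache_hits and not Com_select. The helper name <code>qcache_hit_rate</code> and the sample counters are made up for illustration.</p>
<pre><code>#!/bin/bash
# Sketch: query cache hit rate (percent) from two status counters.
# Usage: qcache_hit_rate QCACHE_HITS COM_SELECT   (hypothetical helper name)
qcache_hit_rate() {
    awk -v hits="$1" -v selects="$2" 'BEGIN {
        total = hits + selects
        if (total == 0) { print "0.0"; exit }
        printf "%.1f\n", 100 * hits / total
    }'
}

# The counters come from:  mysql -e "SHOW GLOBAL STATUS LIKE 'Qcache%'"
# and:                     mysql -e "SHOW GLOBAL STATUS LIKE 'Com_select'"
qcache_hit_rate 7500 2500   # made-up counters, prints 75.0</code></pre>
<p>A low hit rate together with a steadily climbing Qcache_lowmem_prunes counter is a hint that the cache is being flushed out constantly, which is the situation described above.</p>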
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2009/04/mysql-query-cache-good-or-bad/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Drupal performance tips from DrupalCon</title>
		<link>http://linuxsysadminblog.com/2009/03/drupal-performance-tips-from-drupalcon/</link>
		<comments>http://linuxsysadminblog.com/2009/03/drupal-performance-tips-from-drupalcon/#comments</comments>
		<pubDate>Thu, 05 Mar 2009 17:27:21 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[drupal]]></category>
		<category><![CDATA[monitoring]]></category>
		<category><![CDATA[Performance]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=397</guid>
		<description><![CDATA[increasing drupal performance notes from drupal con dc 2009]]></description>
			<content:encoded><![CDATA[<p>Still reporting from DrupalCon. So far there have been a number of sessions I have attended. Here are some highlights from those sessions on how to increase performance on your drupal site.</p>
<ul>
<li>Look at the number of requests a page makes to the server</li>
<li>Use YSlow to measure page rendering (page performance is often a matter of perception, not just of server response time)</li>
</ul>
<ul>
<li>Remove search, use alternate solutions such as Apache Solr or Google Search API</li>
<li>Use CDN as much as possible</li>
<li>Use Reverse Proxy Cache and memcache</li>
<li>Obviously use drupal cache</li>
</ul>
<p>Some other notes that are somewhat related to drupal performance and site performance management in a clustered hosting environment.</p>
<p><strong>Manual updates and rollback</strong></p>
<p>OLD WAY: tar, move/copy, untar, restart services</p>
<p>OLD WAY: rsync</p>
<p>BETTER WAY: <a href="http://www.capify.org/" target="_blank">Capistrano </a></p>
<p><strong>Managing systems:</strong></p>
<p>BETTER WAY: <a href="http://trac.mcs.anl.gov/projects/bcfg2" target="_blank">bcfg2</a></p>
<p><strong>Monitoring Tools</strong></p>
<ul>
<li>Capacity Load: analyzing trends, predicting load, checking results of configuration and software changes (<a href="http://www.cacti.net/" target="_blank">cacti</a>, <a href="http://munin.projects.linpro.no/" target="_blank">munin</a>)</li>
<li>Failure: analyzing downtime, notification (<a href="http://www.nagios.org/" target="_blank">nagios</a> &#8211; using <strong>nrpe</strong> agents to monitor diverse services (do we use it this way?) , hyperin)</li>
</ul>
<p>Use monitoring tools to closely observe cluster replication and caching, as failures in this area are the most difficult to solve.</p>
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2009/03/drupal-performance-tips-from-drupalcon/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Cloud computing scenarios for database servers</title>
		<link>http://linuxsysadminblog.com/2009/02/cloud-computing-scenarios-for-database-servers/</link>
		<comments>http://linuxsysadminblog.com/2009/02/cloud-computing-scenarios-for-database-servers/#comments</comments>
		<pubDate>Tue, 17 Feb 2009 15:09:35 +0000</pubDate>
		<dc:creator>Pim</dc:creator>
				<category><![CDATA[Down Time]]></category>
		<category><![CDATA[hosting]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Replication]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[ec2]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=322</guid>
		<description><![CDATA[We&#8217;ve been investigating the possibilities of using cloud computing for our clients. Amazon EC2 in particular has the potential to be really effective in offering flexible, pay-as-you-go computing. From my own perspective I have been looking at how to use cloud computing in combination with MySQL and I must say that I&#8217;m a bit sceptical [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;ve been investigating the possibilities of using cloud computing for our clients. Amazon EC2 in particular has the potential to be really effective in offering flexible, pay-as-you-go computing. From my own perspective I have been looking at how to use cloud computing in combination with MySQL, and I must say that I&#8217;m a bit sceptical about the effectiveness of cloud computing in replacing the primary database server. First off, there does not seem to be much in the way of performance data for this type of installation. Can a cloud server really offer the I/O performance necessary to replace a dedicated database server? And even if the performance is equal, what is the main advantage? Scaling web sites is done by adding more servers in most cases, but the same approach only works for database servers when clusters are used. So in what other scenarios does cloud computing give us an edge?</p>
<p><span id="more-322"></span><strong>Temporary reporting servers<br />
</strong>Create a one time copy of an existing production database server to run specific heavy reports. This is ideal for monthly reports since the server only needs to be up and running for several hours per month.</p>
<p><strong>Backup database server<br />
</strong>This is a backup solution where the server is only allocated once there is a problem with the primary server, which makes a lot of sense because the client only pays for the server once it is used. One downside to this scenario is that the server has to be created and loaded with the latest backup, which will result in a decent amount of downtime, but at least all of this can be automated. A bigger problem is the loss of data written since the latest backup. For our high availability sites we have a standby database server replicating all changes from the master, so we can switch over at a moment&#8217;s notice without losing any data.</p>
<p><strong>Migrations<br />
</strong>Performing a migration or a system upgrade usually brings some downtime. Promoting a standby system to primary creates a single point of failure, so it makes sense to create a temporary standby of the standby.</p>
<p><strong>Development branches and testing environments<br />
</strong>For development branches we usually only need an extra database for a short amount of time, although truth be told, those databases are not very large in general, so we tend to put them on the same development database server anyway. The same is true for testing and QA. These activities usually occur in cycles, which means that they are very attractive targets for cloud-based servers.</p>
<p><strong>Alternative data center<br />
</strong>Yes, it happened to us once that our data center went offline due to a very heavy attack. Instead of finding another data center for these eventualities, it could be useful to have cloud-based backup servers defined. However, this requires the extra effort of keeping these instances up to date. Additionally, DNS caching will stop the switch from being instantaneous. A geographical load balancing solution would be the answer to that, but at that point the cost of preparing for this eventuality has to be compared to the loss due to downtime.</p>
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2009/02/cloud-computing-scenarios-for-database-servers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Extending the slow query log</title>
		<link>http://linuxsysadminblog.com/2009/02/extending-the-slow-query-log/</link>
		<comments>http://linuxsysadminblog.com/2009/02/extending-the-slow-query-log/#comments</comments>
		<pubDate>Wed, 04 Feb 2009 20:54:30 +0000</pubDate>
		<dc:creator>Pim</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[slow query log]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=302</guid>
		<description><![CDATA[Andy posted some very good links recently to videos on how to optimize your web site. Although I spend more time optimizing the database, you always have to go where the actual performance is lost. For MySQL the place to check for performance issues is the slow query log, which I have mentioned in earlier [...]]]></description>
			<content:encoded><![CDATA[<p>Andy posted some very good links recently to videos on how to optimize your web site. Although I spend more time optimizing the database, you always have to go where the actual performance is lost. For MySQL the place to check for performance issues is the slow query log, which I have mentioned in earlier posts. The limitation of this log is that a query has to take at least one second to appear in it. This skips over queries that are executed thousands if not millions of times and which take less than a second. These queries might have just as much of a performance impact as queries that last several seconds each.</p>
<p>The article below shows how to patch the slow query log to track queries that last less than a second. Obviously you don&#8217;t want to have this running continually in production, because the amount of logging would be enormous, but in test environments or for a limited time in production this can be very useful. Be prepared to analyse huge amounts of data though.</p>
<p><a href="http://www.mysqlperformanceblog.com/2006/09/06/slow-query-log-analyzes-tools/" target="_blank">http://www.mysqlperformanceblog.com/2006/09/06/slow-query-log-analyzes-tools/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2009/02/extending-the-slow-query-log/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

