April 8, 2011

Performance Engineering – SSD file systems.

Solid state devices threaten to challenge and change existing computing paradigms.

While traditional disk access times are on the order of a few milliseconds, SSD access times are under 100 microseconds for both reads and writes. That is a speedup of roughly a factor of 100, and it is significant.
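A back-of-the-envelope calculation puts that factor in perspective. The latency figures below are illustrative round numbers, not measurements:

```python
# Illustrative round numbers, not measurements.
DISK_LATENCY_S = 10e-3   # ~10 ms per random disk IO
SSD_LATENCY_S = 100e-6   # ~100 us per SSD IO

# Serial IOs per second achievable with one outstanding request.
disk_iops = 1 / DISK_LATENCY_S   # 100 IOPS
ssd_iops = 1 / SSD_LATENCY_S     # 10,000 IOPS

print(ssd_iops / disk_iops)      # 100.0
```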

Operating system components have evolved over the last four decades at a much slower pace. For an enterprise platform like IBM AIX or HP-UX it takes a few years (my guess is a minimum of six) to push a new technology. The cycle is as follows: a new hardware technology is invented; OS vendors take a few years to adopt and evangelize it; enterprise customers take even longer to test, adopt and deploy it.

SSDs promise to deliver better performance by lowering IO latency and increasing throughput. File systems have evolved toward the same goals: specialized caches have been invented to speed up performance, for example the directory name lookup cache, page cache, inode cache, large directory cache, buffer cache and so on.

Quite a lot of focus is on being clever with reads and writes of application data. Engineers go to great lengths to squeeze the last bit of performance out of the system. Sadly, performance is not a main consideration during implementation (functionality is) and is often applied as an afterthought.

The result is hacks rather than elegant solutions to performance issues.

Coming back to SSDs: a large portion of the file system implementation has to be re-examined, and parts of it have to be thrown away completely. This is especially true when complete file systems are laid out on SSD. We need to look at how file systems can take advantage of SSDs.

April 7, 2011

Putting bugzilla to work – automating reports

If you are stuck with Bugzilla like us and miss having an executive dashboard — how many issues were reported, resolved, verified, etc. in a specified date range — you will find it is not possible in Bugzilla 3.

We started with version 3 a few years back and stayed with it. It is not easy to upgrade and we would not risk it.

Since we have a fair bit of web expertise, especially with Python, it was fairly simple to put together a Python script which does the following:

– login to bugzilla via the web
– run a query and collect csv results
– process the results into required statistics
– send html formatted mail
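The steps above can be sketched roughly as follows. This is a minimal illustration, not our production script: the host name is hypothetical, the login step is omitted, and the CSV column names assume Bugzilla's default `buglist.cgi` CSV export.

```python
import csv
import io
import smtplib
import urllib.parse
import urllib.request
from collections import Counter
from email.mime.text import MIMEText

BASE = "http://bugzilla.example.com"   # hypothetical Bugzilla host

def fetch_csv(params):
    """Run a buglist query and return Bugzilla's CSV export as text."""
    query = urllib.parse.urlencode({**params, "ctype": "csv"})
    with urllib.request.urlopen(f"{BASE}/buglist.cgi?{query}") as resp:
        return resp.read().decode("utf-8")

def summarize(csv_text):
    """Count bugs by status from a Bugzilla CSV export."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["bug_status"] for row in rows)

def send_report(counts, sender, recipient):
    """Mail the counts as a small HTML table."""
    cells = "".join(f"<tr><td>{s}</td><td>{n}</td></tr>"
                    for s, n in sorted(counts.items()))
    msg = MIMEText(f"<table>{cells}</table>", "html")
    msg["Subject"] = "Bugzilla metrics"
    msg["From"], msg["To"] = sender, recipient
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)
```

Run from cron, the same script serves daily, weekly, monthly and quarterly reports — only the query's date-range parameters change.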

Now we get a daily, weekly, monthly and quarterly email report indicating changes in key metrics.

In a subsequent post I will share some of the key code snippets.

April 7, 2011

Balance File System Caching and Flushing policies

Most file-systems have intelligent caching policies. The policies are designed to increase the throughput and decrease IO latency rates thereby providing faster service times to the users and applications. Based on the nature of the workload, appropriate caching policies can be set to achieve maximum cache-hit rates.

However, this works only as long as there is enough memory to store revisited pages. Once the number of cached pages exceeds the set memory limit, the file-system reclaims space by flushing older pages to the storage. If flushing is not done frequently enough, a lot of data may suddenly be dumped on the disk, creating a storage bottleneck.

Depending on storage bandwidth, flushing a large amount of file-system data can lead to very large service times for users or applications. For all-round good performance one thus also needs to look at how frequently, and how much, data is flushed to the storage. Just as caching policies should take into account the nature of the workload, flushing policies should take into account the nature of the storage and its IO bandwidth.
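A toy simulation (my own illustration, not any real file-system's algorithm) shows why lazy flushing produces bursts:

```python
def simulate(writes_per_tick, ticks, policy, limit=1000, batch=50):
    """Return the largest single flush (in pages) under the given policy.

    'steady' flushes up to `batch` dirty pages every tick;
    'lazy' flushes everything, but only once `limit` pages are dirty.
    """
    dirty, largest = 0, 0
    for _ in range(ticks):
        dirty += writes_per_tick
        if policy == "steady":
            flushed = min(dirty, batch)
        else:  # "lazy"
            flushed = dirty if dirty >= limit else 0
        largest = max(largest, flushed)
        dirty -= flushed
    return largest

print(simulate(40, 100, "steady"))  # 40: small, even flushes
print(simulate(40, 100, "lazy"))    # 1000: one huge burst hits the disk
```

Both policies move the same total number of pages; the lazy one just concentrates the work into bursts the storage may not be able to absorb.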

Good sustained file-system performance is possible only when both caching and flushing policies are set optimally.

April 7, 2011

Performance Engineering: Watch out for dynamic CPU revving.

These days, most machines come with advanced power-saving options. To save power, a machine reduces its CPU speed whenever the CPU is not in great demand. Even though the CPU may have a specified speed of 2 or 3 GHz, those speeds are reached only when there is enough load on the system and full processing power is needed.

This is very important to remember when discussing or measuring system performance. Since performance can change drastically with CPU speed, CPU speed must be held constant while doing any performance measurement. Typically, you will need to switch off dynamic CPU speed revving by setting the appropriate flags in the power-saving setup.
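On Linux, for example, this usually means pinning the cpufreq governor before the benchmark run. A sketch — the sysfs paths below are the common cpufreq interface, but they vary by kernel and driver, and writing to them needs root:

```shell
# Force every core to the 'performance' governor so the clock stays
# at maximum frequency for the duration of the benchmark.
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    echo performance > "$g"
done

# Spot-check the current frequency of one core while the test runs.
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
```

Remember to restore the original governor (often `ondemand` or similar) afterwards.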

Measuring the maximum performance of a system, or comparing the performance of two software products, is meaningful only when CPU speed is held constant.

April 5, 2011

Performance Engineering: Best Practices

Documenting best practices is often a sign of failure. In a software development environment one often starts to document best practices. The intention is to make everyone in the team aware of how good work can be done.

However, over time, writing down best practices becomes a norm, and they become a set format in which the work needs to be done. As the team grows, best practices are used as rules, even though the scope and magnitude of the team's work may have far outgrown the original mandate, so the best practices may no longer be applicable. In such cases, writing down best practices sounds like admitting that the team members cannot evolve appropriate strategies to deal with the contemporary nature of projects.

April 5, 2011

Performance Engineering: True measure of code performance

If you understand throughput as the effective number of transactions per second that users experience from your hardware/software stack, then you would ideally want the maximum possible transactions per second from your setup.

We understand that any software stack ultimately uses CPU cycles to process all these transactions. Hence, the transactions per second delivered per CPU cycle is the true measure of the performance of your system.

Ideally we should be measuring the throughput of a system and how many CPU cycles it takes to deliver that throughput. Inefficient code spends many more CPU cycles to deliver X transactions per second (tps). Optimized, efficient code delivers the same transactions per second using far fewer CPU cycles.

Thus a good measure of system performance is throughput per CPU usage. This is the number to watch.
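As a sketch with made-up numbers: two builds that deliver the same raw throughput can differ sharply on this metric.

```python
def efficiency(tps, cpu_percent):
    """Transactions per second delivered per percent of CPU consumed."""
    return tps / cpu_percent

# Hypothetical measurements: both builds deliver 500 tps,
# but build B leaves far more CPU headroom.
build_a = efficiency(500, 80)   # 6.25 tps per CPU percent
build_b = efficiency(500, 50)   # 10.0 tps per CPU percent
print(build_b > build_a)        # True: build B is the more efficient code
```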

April 5, 2011

Performance Engineering: IO Latency Vs Bandwidth

It is important to understand whether a performance problem is due to latency or bandwidth. These are very different phenomena, but they may have similar-looking outcomes.

In the case of latency, a transaction or an IO is taking a long time to complete. The question would be: is the time taken to complete the transaction reasonable, or is it longer than desired? In many cases the latency is unavoidable, but one can use several techniques, such as filling the pipeline, sending the IO in blocks, or buffering, to hide the delay due to latency.

In the case of bandwidth, performance is low because the bandwidth is either not utilized fully or inadequate. To improve performance that is limited by bandwidth, one has to understand and match the bandwidths of the various devices along the IO path. Several techniques can be used to optimize bandwidth use, such as load balancing, multipathing and adaptively adjusting parameters.
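A rough model (my own illustration, with made-up figures) makes the distinction concrete: with queue depth Q, block size B and per-IO latency L, a device can stream at most Q*B/L bytes per second, capped by its raw bandwidth. The same disappointing number can therefore come from either term.

```python
def effective_throughput(queue_depth, block_bytes, latency_s, bandwidth_bps):
    """Achievable bytes/s: either pipeline-limited or bandwidth-limited."""
    return min(queue_depth * block_bytes / latency_s, bandwidth_bps)

# 4 KiB IOs, 1 ms latency, 100 MB/s link (illustrative numbers).
print(effective_throughput(1, 4096, 1e-3, 100e6))   # 4096000.0: latency-bound
print(effective_throughput(32, 4096, 1e-3, 100e6))  # 100000000.0: bandwidth-bound
```

With one outstanding IO the link crawls despite ample bandwidth (a latency problem, fixed by filling the pipeline); at queue depth 32 the link itself is the ceiling (a bandwidth problem).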
July 2, 2010

Integrating Plone and Dojo

Note: This makes sense only if you want to ship a custom dojo build with your Plone application or if you find it slow and annoying to go get dojo from one of the CDNs. If you don’t have a custom dojo build (if you don’t know what I am talking about, then you certainly don’t need one) just get dojo from the CDN – works beautifully with Plone.

We recently had to integrate the amazing dojo toolkit with Plone. The problem we encountered was pretty simple: Plone does not serve any files/objects whose names start with an underscore character ‘_’, and Dojo has plenty of javascript modules beginning with ‘_’.

The existing solutions ranged from the brute force to the exotic and excessive.

The solution we settled on (after a little bit of thinking) is to set up dojo behind apache and serve the files from there. Here’s how to do it.

  1. Get the dojo SDK.
  2. Extract it to a directory like /home/xxx/dojo and do a cross-domain build (see the dojo documentation on how to do this).
  3. Set up apache if you haven’t already done so. Don’t be scared – setting up apache these days just means installing the package from your favorite distribution.
  4. Open up /etc/apache2/apache2.conf (this is where it resides on my ubuntu server) and add the following at the end:
Alias /dojo/ "/home/xxx/dojo/"
<Directory "/home/xxx/dojo">
    Options Indexes MultiViews FollowSymLinks
    AllowOverride None
    Order deny,allow
    Deny from all
    Allow from 127.0.0.0/255.0.0.0 ::1/128
</Directory>

That will set up your dojo files to be served to localhost – fine for development purposes. You can eventually change it to serve your intranet/internet users as well.

May 27, 2010

Python and Atomic Bomb

  • One is infinitely simpler than the other.
  • Both are devastating in the wrong hands.

May 27, 2010

Plone Vs Drupal Vs *

Python is a religion at BYGSoft. We love everything about it, period. Given that, we naturally propose Plone to anyone who wants us to develop a CMS-based app.

Normally our clients don’t challenge our technology proposals. In this one case the client asked me: Plone? Python? Isn’t that going to be expensive to maintain? Why not PHP? I knew exactly where he was coming from.

Naturally there is varied opinion, especially on the Internet – almost like vi vs emacs – and no prizes for guessing what we like at BYGSoft.

I am not sure what convinced the client, but we believe the best answer out there, and the one we concur with, is:

“irrespective of the language or the framework or any other tool – the solution is only going to be as good as the people mean it to be”.

If you have a talented bunch of programmers it does not matter – you will still get a good solution out.

Anyone who is familiar with Solaris or AIX internals will tell you that long long before C++ took center stage as an OOD platform, they had all the OOD covered in C – in the kernel.