The Open Source Technologies that Power BisManOnline

BisManOnline.com uses a number of open source software components to power its growing platform. The primary benefits of open source are cost, flexibility, documentation, and a large support community.

Follow us on our BisManOnline Developers Facebook Page, the page dedicated to our geeky tech stuff.

Open source allows us to scale faster and at a lower cost than if we were to choose off-the-shelf, closed-source software.

Below are a few of the technologies we use, how we use them and why:


Apache Web Server

Our front-end web servers and our ad server run PHP on an Apache web server on Red Hat Linux. We chose Red Hat because of the level of support our hardware technology partner BTINet offers for it.

Apache is an extremely flexible web server package and is utilized by over 66% of all websites on the Internet. Our implementation is customized to reduce memory load by loading only the modules our service requires. In addition, in order to support PHP, we use Apache's prefork MPM.


Lighttpd

Serving images and other static content is not a strong point of the Apache web server. We evaluated a number of platforms and decided on Lighttpd as the solution for serving all static content, including images, JavaScript and CSS. Lighttpd is a very fast and efficient web server designed with a low memory footprint. You can view the results of their performance comparison to Apache here.

Our Lighttpd implementation uses a dedicated server for serving our static content. We keep the server heavy on RAM to increase the percentage of items that can be cached in memory and so reduce disk reads.


PHP

From the ground up and since its early days, BisManOnline.com has run on PHP for its code base.  A whopping 77% of all websites on the Internet use some version of PHP as their scripting language.   By design, PHP is extremely customizable, flexible, and has broad community support and documentation.

Although many websites use an off-the-shelf platform or framework such as WordPress, Drupal, CakePHP, etc., we chose to build our own framework from scratch. Our custom framework is loosely built around the MVC (Model-View-Controller) concept and gives us more flexibility in tuning the system for performance and for our specific requirements.


MySQL

The MySQL database system is the primary data store for BisManOnline's back-end. MySQL is also used by our ad serving platform for banner ad deliveries and statistics.

We use a combination of MyISAM and InnoDB table types depending on various factors (see this comparison of MyISAM vs InnoDB), and we give the DB server a large amount of RAM so that as many reads as possible are served from memory rather than from disk. In fact, even though our data store is huge and we sometimes hit it with thousands of queries per second, disk activity on the server barely registers on our real-time performance graphs.

We also implement a custom caching solution within MySQL for handling the majority of our ad list and search requests. Only active data with high use rates is copied into large cache tables, which saves us from running SQL joins in real time. Because the majority of our read and search queries are served from the cache tables, updates and inserts on the base tables are faster as well.
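
The idea of a cache table can be sketched in a few lines. This is only an illustration, written in Python with an in-memory SQLite database rather than PHP and MySQL, and every table, column and function name in it is hypothetical. It shows the general pattern of running the join once when the cache is refreshed, so that list requests read a single flat table.

```python
import sqlite3


def build_schema(conn):
    """Create two base tables and one cache table (all names are hypothetical)."""
    conn.executescript("""
        CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE ads (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            category_id INTEGER NOT NULL REFERENCES categories(id),
            active INTEGER NOT NULL DEFAULT 1
        );
        -- Cache table: one pre-joined row per active ad.
        CREATE TABLE ad_list_cache (
            ad_id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            category_name TEXT NOT NULL
        );
    """)


def refresh_cache(conn):
    """Rebuild the cache table; the join runs here, not on every page view."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM ad_list_cache")
        conn.execute("""
            INSERT INTO ad_list_cache (ad_id, title, category_name)
            SELECT a.id, a.title, c.name
            FROM ads a JOIN categories c ON c.id = a.category_id
            WHERE a.active = 1
        """)


def list_ads(conn, category_name):
    """Serve a list request from the cache table only, with no join."""
    rows = conn.execute(
        "SELECT ad_id, title FROM ad_list_cache "
        "WHERE category_name = ? ORDER BY ad_id",
        (category_name,),
    )
    return rows.fetchall()
```

A real system would refresh the cache incrementally rather than rebuilding it in full; the full rebuild here just keeps the sketch short.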


APC PHP Cache

One of the biggest performance improvements we made early on was implementing the APC PHP caching system. PHP is a scripting language, and by default the PHP processor reads, compiles and executes the PHP code on every request. Re-reading and re-compiling a script on every load wastes resources, because the PHP files themselves rarely change except when code revisions are deployed. APC stores the compiled code in memory, so on subsequent requests the PHP processor does not need to read the files from disk and re-compile them. Read more on PHP accelerators here.
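
To make the compile-once idea concrete, here is a minimal sketch in Python rather than PHP. It is not how APC is implemented; the OpcodeCache class and its method names are hypothetical, and checking the file's modification time to detect changes is an assumption made for the sketch.

```python
import os


class OpcodeCache:
    """Keeps compiled code objects in memory, keyed by script path (hypothetical helper)."""

    def __init__(self):
        self._cache = {}   # path -> (mtime_ns, compiled code object)
        self.compiles = 0  # how many times a script was actually compiled

    def load(self, path):
        """Return compiled code for path, compiling only when the file changed."""
        mtime = os.stat(path).st_mtime_ns
        entry = self._cache.get(path)
        if entry is not None and entry[0] == mtime:
            return entry[1]  # cache hit: no file read, no compile
        with open(path, "r", encoding="utf-8") as f:
            source = f.read()
        code = compile(source, path, "exec")
        self.compiles += 1
        self._cache[path] = (mtime, code)
        return code

    def run(self, path):
        """Execute the (possibly cached) script and return its namespace."""
        namespace = {}
        exec(self.load(path), namespace)
        return namespace
```

Running the same script twice through one OpcodeCache instance compiles it only once; the second request reuses the in-memory code object.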

This did add some complexity to our code-change process, but the performance improvement and drop in CPU usage on the web server and the ad server was astronomical.


Memcached

Caching, caching and more caching. Usually the biggest bottlenecks in any web application are disks and databases. Most relational databases are designed for storing and retrieving massive amounts of data. Unfortunately, under heavy load with many reads, writes and joins, requests start to queue while waiting for row and table locks, which causes serious performance problems. Most web developers will find, over time, that they are issuing queries that return the exact same data over and over.

Caching solutions allow the developer to cache the result of a query, and store it for later retrieval without having to make a request to the database.
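
The pattern described above, check the cache first and only query the database on a miss, might look roughly like the following. This is a sketch in Python with a plain in-process dictionary standing in for the cache; the QueryCache class, its parameters and the expiry time are all assumptions for illustration.

```python
import time


class QueryCache:
    """Tiny in-process result cache with expiry (hypothetical helper)."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}     # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() only on a miss."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the database entirely
        value = compute()    # miss or expired: run the expensive query
        self._store[key] = (now + self._ttl, value)
        return value
```

In use, the compute argument would be a function that runs the SQL query, and the key would typically be derived from the query text and its parameters.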

For years our system used a disk-based file-caching solution.  This, however, introduced new performance problems as we had to continue to read from disk.  Also, a file-caching solution doesn't work when you load-balance your webservers in a cluster and you need to share the cache across more than one node.

Memcached allows us to cache data, counters, etc. in memory on a centralized shared server that our web server nodes can access. In fact, by using memcached to cache an entire page, we can process and deliver a fully cached version of our home page in under 4 milliseconds of server time. That's 0.004 seconds! Roughly 25% of our page views are delivered fully from a memcached store.

Things like SQL query results, unread message counters (which need to be checked in real time), ad view counters and other statistics are now stored in memory for super fast retrieval. Each memcached server, in a proper configuration, can handle anywhere from 50,000 to 120,000 requests per second. Memcached also scales out by adding nodes to the cluster. Routing each request to the right node is handled by the memcached client library, meaning you can scale to 20 memcached servers with a single code change.
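
Memcached servers do not talk to each other; the client decides which node holds a given key. One common way to do that is consistent hashing, sketched below in Python. This is an illustration of the general technique, not the algorithm of any particular client library or of our own setup, and the HashRing class, node names and replica count are hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Maps cache keys to nodes with consistent hashing (hypothetical helper)."""

    def __init__(self, nodes, replicas=100):
        # Each node is placed on the ring at many points to even out the load.
        self._ring = []  # sorted list of (hash value, node name)
        for node in nodes:
            for i in range(replicas):
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(text):
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        """Return the node owning key: the first ring point after the key's hash."""
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

With this scheme, adding a fourth node to a three-node ring moves only a fraction of the keys, and every key that moves lands on the new node, so most cached data stays where it already is.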



OpenX

Our ad serving platform is based on the open source technologies from OpenX, although we have long since forked the OpenX source into our own customized version. Our enhancements include updates to their ad allocation algorithm (math-geek alert) as well as enhancements for mobile ad serving.


More

In addition to these major technologies, we use a number of other open source products, including ImageMagick, FCKeditor, Sendmail, and more. And although they are not open source, we use products such as Google Analytics and Quantcast for data measurement, as well as API implementations from Google Maps, the Facebook API, the eBay Developer Network, and YouTube.


For even more geeky tech updates, follow us on our BisManOnline Developers Facebook Page.