Showing entries 1 to 10 of 31
10 Older Entries »
Displaying posts with tag: cmt (reset)
Scaling Memcached: 500,000+ Operations/Second with a Single-Socket UltraSPARC T2

A software-based distributed caching system such as memcached is an important piece of today's largest Internet sites that support millions of concurrent users and deliver user-friendly response times. The distributed nature of memcached design transforms 1000s of servers into one large caching pool with gigabytes of memory per node. This blog entry explores single-instance memcached scalability for a few usage patterns.

Table below shows out-of-the-box (no custom OS rewrites or networking tuning required) performance with 10G networking hardware and one single-socket UltraSPARC T2-based server with 8 cores and 8 threads per core (64 threads on a chip). All runs are done with a single memcached instance and 40 worker threads so that about 3 cores (24 threads) are used for the critical networking stack that is also heavily parallelized. 40+24 threads is a nice balance for this …

[Read more]
Scaling Memcached: 500,000+ Operations/Second with a Single-Socket UltraSPARC T2

A software-based distributed caching system such as memcached is an important piece of today's largest Internet sites that support millions of concurrent users and deliver user-friendly response times. The distributed nature of memcached design transforms 1000s of servers into one large caching pool with gigabytes of memory per node. This blog entry explores single-instance memcached scalability for a few usage patterns.

Table below shows out-of-the-box (no custom OS rewrites or networking tuning required) performance with 10G networking hardware and one single-socket UltraSPARC T2-based server with 8 cores and 8 threads per core (64 threads on a chip). All runs are done with a single memcached instance and 40 worker threads so that about 3 cores (24 threads) are used for the critical networking stack that is also heavily parallelized. 40+24 threads is a nice balance for this …

[Read more]
Sequential Web Frontends/Browsers are the Killer

Response times of any web application are very critical for the end-user experience. Steve Souders takes a detailed look at several large Web sites and concludes that 80-90% of the end-user response time is spent on the frontend, i.e., program code that is running inside your Web browser.

Traditional parallelization techniques and caching are without a doubt very effective in the design of scalable Web servers, databases, operating systems and other mission-critical software and hardware components. Assume that all these components are perfectly parallel and optimized, Amdhal's law still suggests that response time improvements will be very modest, or barely measurable.

Sequential Web Frontends/Browsers are the Killer

Response times of any web application are very critical for the end-user experience. Steve Souders takes a detailed look at several large Web sites and concludes that 80-90% of the end-user response time is spent on the frontend, i.e., program code that is running inside your Web browser.

Traditional parallelization techniques and caching are without a doubt very effective in the design of scalable Web servers, databases, operating systems and other mission-critical software and hardware components. Assume that all these components are perfectly parallel and optimized, Amdhal's law still suggests that response time improvements will be very modest, or barely measurable.

Real-World Concurrency

One interesting and useful paper on real-world concurrency by Bryan Cantrill and Jeff Bonwick.

Abstract: In this look at how concurrency affects practitioners in the real world, Cantrill and Bonwick argue that much of the anxiety over concurrency is unwarranted. Most developers who build typical MVC systems can leverage parallelism by combining pieces of already concurrent software such as database and operating systems (i.e., concurrency through architecture), rather than by writing multithreaded code themselves. And for those who actually must deal with threads and locks, the authors include a helpful list of best practices to help minimize the pain.

Real-World Concurrency

One interesting and useful paper on real-world concurrency by Bryan Cantrill and Jeff Bonwick.

Abstract: In this look at how concurrency affects practitioners in the real world, Cantrill and Bonwick argue that much of the anxiety over concurrency is unwarranted. Most developers who build typical MVC systems can leverage parallelism by combining pieces of already concurrent software such as database and operating systems (i.e., concurrency through architecture), rather than by writing multithreaded code themselves. And for those who actually must deal with threads and locks, the authors include a helpful list of best practices to help minimize the pain.

MySQL 5.4 Sysbench Scalability on 64-way CMT Servers

As a followup to my MySQL 5.4 Scalability on 64-way CMT Servers blog, I'm posting MySQL 5.4 Sysbench results on the same platform. The tests were carried out using the same basic approach (i.e. turning off entire cores at a time) - see my previous blog for more details.

The Sysbench version used was 0.4.8, and the read-only runs were invoked with the following command:

sysbench --max-time=300 --max-requests=0 --test=oltp --oltp-dist-type=special --oltp-table-size=10000000 \
   --oltp-read-only=on --num-threads=[NO_THREADS] run

The "oltp-read-only=on" parameter was omitted for the read-write tests. The my.cnf file listed in my previous blog was also used unchanged for these tests.

Here is the data presented graphically. Note that the number of vCPUs is the same as the number of active threads up to 64. Active threads beyond …

[Read more]
MySQL 5.4 Scalability on 64-way CMT Servers

Today Sun Microsystems announced MySQL 5.4, a release that focuses on performance and scalability. For a long time it's been possible to escape the confines of a single system with MySQL, thanks to scale-out technologies like replication and sharding. But it ought to be possible to scale-up efficiently as well - to fully utilize the CPU resource on a server with a single instance.

MySQL 5.4 takes a stride in that direction. It features a number of performance and scalability fixes, including the justifiably-famous Google SMP patch along with a range of other fixes. And there's plenty more to come in future releases. For specifics about the MySQL 5.4 fixes, check out Mikael Ronstrom's blog.

So how well does MySQL 5.4 scale? To help answer the question I'm going to take a look at some performance data from one of Sun's CMT systems based on the UltraSPARC T2 chip. This …

[Read more]
MySQL 5.4 Sysbench Scalability on 64-way CMT Servers

As a followup to my MySQL 5.4 Scalability on 64-way CMT Servers blog, I'm posting MySQL 5.4 Sysbench results on the same platform. The tests were carried out using the same basic approach (i.e. turning off entire cores at a time) - see my previous blog for more details.

The Sysbench version used was 0.4.8, and the read-only runs were invoked with the following command:

sysbench --max-time=300 --max-requests=0 --test=oltp --oltp-dist-type=special --oltp-table-size=10000000 \\
   --oltp-read-only=on --num-threads=[NO_THREADS] run

The "oltp-read-only=on" parameter was omitted for the read-write tests. The my.cnf file listed in my previous blog was also used unchanged for these tests.

Here is the data presented graphically. Note that the number of vCPUs is the same as the number of active threads up to 64. Active threads beyond …

[Read more]
MySQL 5.4 Sysbench Scalability on 64-way CMT Servers

As a followup to my MySQL 5.4 Scalability on 64-way CMT Servers blog, I'm posting MySQL 5.4 Sysbench results on the same platform. The tests were carried out using the same basic approach (i.e. turning off entire cores at a time) - see my previous blog for more details.

The Sysbench version used was 0.4.8, and the read-only runs were invoked with the following command:

sysbench --max-time=300 --max-requests=0 --test=oltp --oltp-dist-type=special --oltp-table-size=10000000 \\
   --oltp-read-only=on --num-threads=[NO_THREADS] run

The "oltp-read-only=on" parameter was omitted for the read-write tests. The my.cnf file listed in my previous blog was also used unchanged for these tests.

Here is the data presented graphically. Note that the number of vCPUs is the same as the number of active threads up to 64. Active threads beyond …

[Read more]
Showing entries 1 to 10 of 31
10 Older Entries »