Showing entries 11 to 17
« 10 Newer Entries
Displaying posts with tag: parallelism
Intra-query parallelism for MySQL queries without an appliance or closed source database

*edit* I want to point out that this test was done on a single database server that used MySQL partitioning. It demonstrates how Shard-Query can improve performance on non-sharded databases too. *edit*

Over the weekend I spent a lot of time improving my new Shard-Query tool (code.google.com/p/shard-query), and the improvements can translate into big performance gains on partitioned data sets compared with executing the query directly on MySQL.


I'll explain this graph below, but lower is better (response time) and Shard-Query is the red line.

MySQL understands that queries that access data in only certain partitions don't have to read the rest of the table. This partition elimination (pruning) works well, but MySQL left a big optimization out of partitioning: …
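
To make the missing optimization concrete, here is a minimal conceptual sketch (in Python, not Shard-Query's actual code) of intra-query parallelism over partitions: because an aggregate such as COUNT(*) is distributive, the query can be rewritten as one sub-query per partition, the sub-queries run concurrently on separate connections, and the partial results combined at the end. The table name, partition ranges, and run_subquery() stub below are hypothetical stand-ins.

# Conceptual sketch of intra-query parallelism over partitions.
# Not Shard-Query's real code: the partition ranges and run_subquery()
# stub are hypothetical stand-ins for real MySQL connections.
from concurrent.futures import ThreadPoolExecutor

# One (lo, hi) year range per partition of a hypothetical fact table.
PARTITION_RANGES = [(2000, 2004), (2005, 2007), (2008, 2010)]

def run_subquery(year_range):
    lo, hi = year_range
    sql = ("SELECT COUNT(*) FROM fact "
           f"WHERE year >= {lo} AND year <= {hi}")
    # A real implementation would send `sql` to MySQL on its own
    # connection; partition pruning keeps each sub-query inside a
    # single partition.  Here we just simulate a partial count.
    simulated = {(2000, 2004): 1200, (2005, 2007): 3400, (2008, 2010): 950}
    return simulated[year_range]

# Run the sub-queries concurrently, then combine the distributive
# aggregate: the total COUNT(*) is the sum of the per-partition counts.
with ThreadPoolExecutor(max_workers=len(PARTITION_RANGES)) as pool:
    partials = list(pool.map(run_subquery, PARTITION_RANGES))

print(sum(partials))  # 5550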

[Read more]
Scaling Memcached: 500,000+ Operations/Second with a Single-Socket UltraSPARC T2

A software-based distributed caching system such as memcached is an important piece of today's largest Internet sites, which support millions of concurrent users and deliver user-friendly response times. The distributed nature of memcached's design transforms thousands of servers into one large caching pool with gigabytes of memory per node. This blog entry explores single-instance memcached scalability for a few usage patterns.

The table below shows out-of-the-box (no custom OS rewrites or networking tuning required) performance with 10G networking hardware and a single-socket UltraSPARC T2-based server with 8 cores and 8 threads per core (64 threads on a chip). All runs are done with a single memcached instance and 40 worker threads, so that about 3 cores (24 threads) are used for the critical networking stack, which is also heavily parallelized. 40+24 threads is a nice balance for this …
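
As a side note on the "one large caching pool" point above, here is a minimal sketch of the client-side key distribution that makes many memcached nodes behave like a single cache. The node list is hypothetical, and real clients (e.g. libmemcached) typically use consistent hashing rather than the simple modulo shown here.

# Minimal sketch of client-side key distribution, the mechanism that
# lets many memcached nodes act as one pool.  Node names are made up;
# production clients usually use consistent hashing, not modulo.
import hashlib

NODES = ["cache01:11211", "cache02:11211", "cache03:11211"]

def node_for(key: str) -> str:
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

# The same key always maps to the same node, so a later get() finds
# the value stored by an earlier set() on that node.
for key in ("user:42", "session:abc", "page:/index"):
    print(key, "->", node_for(key))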

[Read more]
Sequential Web Frontends/Browsers are the Killer

Response times of any web application are critical for the end-user experience. Steve Souders takes a detailed look at several large Web sites and concludes that 80-90% of the end-user response time is spent on the frontend, i.e., the program code running inside your Web browser.

Traditional parallelization techniques and caching are without a doubt very effective in the design of scalable Web servers, databases, operating systems, and other mission-critical software and hardware components. But even if all these components were perfectly parallel and optimized, Amdahl's law suggests that response-time improvements would be very modest, or barely measurable.
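
As a rough back-of-the-envelope check, take 85% as the midpoint of the 80-90% frontend figure quoted above and apply Amdahl's law, where $p$ is the fraction of end-user response time that backend work accounts for and $s$ is the speedup applied to that fraction:

$$
S(s) = \frac{1}{(1 - p) + \frac{p}{s}}, \qquad p = 0.15 \;\Longrightarrow\; \lim_{s \to \infty} S(s) = \frac{1}{0.85} \approx 1.18
$$

Even an infinitely fast, perfectly parallel backend would therefore improve end-user response time by a factor of at most about 1.18, i.e. roughly a 15% reduction, which is why the sequential frontend dominates.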

Real-World Concurrency

Here is an interesting and useful paper on real-world concurrency by Bryan Cantrill and Jeff Bonwick.

Abstract: In this look at how concurrency affects practitioners in the real world, Cantrill and Bonwick argue that much of the anxiety over concurrency is unwarranted. Most developers who build typical MVC systems can leverage parallelism by combining pieces of already concurrent software such as databases and operating systems (i.e., concurrency through architecture), rather than by writing multithreaded code themselves. And for those who actually must deal with threads and locks, the authors include a helpful list of best practices to help minimize the pain.
