Displaying posts with tag: squid
Advanced Squid Caching in Scribd: Cache Invalidation Techniques

Having a reverse-proxy web cache as one of the major infrastructure elements brings many benefits for large web applications: it reduces the load on your application servers, lowers average response times on your site, and so on. But there is one problem every developer experiences when working with such a cache: cached content invalidation.

It is a complex problem that usually consists of two smaller ones: invalidation of individual cache elements (you need to keep an eye on your data and invalidate cached pages when the related data changes) and full cache purges (sometimes your site layout or page templates change and you need to purge all the cached pages to make sure users get the new visual elements). In this post I’d like to look at a few techniques we use at …
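
As background for the first of those two sub-problems, here is a minimal sketch of purging a single page from Squid over HTTP. This is a common technique rather than necessarily the one the post goes on to describe, and the helper name, host, and port below are assumptions; Squid only honors PURGE when squid.conf allows it (an "acl purge method PURGE" line plus a matching "http_access" rule).

    require 'net/http'
    require 'uri'

    # Hypothetical helper: asks Squid to drop one cached page via an HTTP
    # PURGE request. Squid must be configured to allow it, e.g.:
    #   acl purge method PURGE
    #   http_access allow purge localhost
    class SquidPurge
      # Net::HTTP builds request classes from these three constants
      class Purge < Net::HTTPRequest
        METHOD = 'PURGE'
        REQUEST_HAS_BODY = false
        RESPONSE_HAS_BODY = false
      end

      def self.invalidate(url, host = '127.0.0.1', port = 80)
        uri = URI.parse(url)
        Net::HTTP.start(host, port) do |http|
          # 200 means the object was purged, 404 means it was not cached
          http.request(Purge.new(uri.request_uri))
        end
      end
    end

    # Example: purge a document page right after its data changes
    SquidPurge.invalidate('http://www.scribd.com/doc/1/Improved-Statistical-Test')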

[Read more]
Solaris On Demand for Sun Partners

Are you an Independent Software Vendor who wants to develop, port, or test your application on Solaris or OpenSolaris? Sun's online Virtual Lab environment, EZQual, makes it easy for you, and it's free!

The Lab features pre-installed SPARC or x86 processor-based Sun servers with development tools (Sun Studio and NetBeans), Java, AMP, memcached, Squid, lighttpd, PostgreSQL, Solaris or OpenSolaris, and more.

In addition, thanks to Sun's Secure Global Desktop, accessing this secure development environment over the Internet is just like running Solaris on your own laptop.

Want to know more? Check out the EZQual web page.

Advanced Squid Caching in Scribd: Hardware + Software Used

After the previous post in this caching-related series I’ve received many questions about the hardware and software configuration of our servers, so in this post I’ll describe our servers’ configs and the motivation behind them.

Hardware Configuration

Since in our setup the Squid server uses a one-process model (with asynchronous request processing), there was no point in ordering multi-core CPUs for our boxes; and since we have lots of pages on the site and the cache is pretty huge, all the servers ended up being highly I/O bound. Considering these facts, we’ve decided to use the following hardware specs for the servers:

CPU: one pretty cheap dual-core Intel Xeon 5148 (no need for multiple cores or really high frequencies; even these CPUs sit at ~1% average load)
RAM: 8 GB (basically …

[Read more]
WebStack 1.5 - Your (L)AMP Stack

Sun's LAMP support is assembled from two pieces: the L comes from our Linux/GNU support (see the SunSolve entry), while the AMP comes from the GlassFish WebStack, which, in its latest incarnation, includes Apache HTTP Server, lighttpd, memcached, MySQL, PHP, Python, Ruby, Squid, Tomcat, GlassFish (v2.1), and Hudson.

The inclusion of Hudson is a bit of an opportunistic move (more on that in a bit); the rest comprises a well-tested, integrated, …

[Read more]
Advanced Squid Caching in Scribd: Logged In Users and Complex URLs Handling

It’s been a while since my first post about the way we cache document pages at Scribd, and this approach has definitely proven to be really effective since then. In this second post of the series I’d like to explain how we handle our complex document URLs and logged-in users in the caching architecture.

First of all, let’s take a look at a typical Scribd’s document URL: http://www.scribd.com/doc/1/Improved-Statistical-Test.

As we can see, it consists of a document-specific part (/doc/1) and a non-unique human-readable slug part (/Improved-Statistical-Test). When a user comes to the site with a wrong slug in the document URL, we need to make sure we send the user to the correct …
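
The excerpt cuts off before showing the fix, but a typical Rails approach to this kind of canonical-URL enforcement is a permanent redirect in the controller. The sketch below is an assumption along those lines, with hypothetical names (DocumentsController, params[:slug], title.parameterize), not Scribd’s actual code:

    class DocumentsController < ApplicationController
      def show
        @document = Document.find(params[:id])        # the unique /doc/1 part
        correct_slug = @document.title.parameterize   # assumed slug source

        if params[:slug] != correct_slug
          # A 301 keeps one canonical, cacheable URL per document
          redirect_to "/doc/#{@document.id}/#{correct_slug}",
                      :status => :moved_permanently
        end
      end
    end

A permanent (301) redirect also lets the cache and search engines converge on a single cached copy per document, which matters when a reverse proxy keys its cache on the full URL.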

[Read more]
Advanced Squid Caching for Rails Applications: Preface

Since day one at Scribd, I have been thinking about the fact that 90+% of our traffic goes to the document view pages, which is a single action in our documents controller. I was wondering how we could improve this action’s responsiveness and make our users happier.

A few times I created a git branch and hacked on this action, trying to implement some sort of page-level caching to make things faster. But each time the results weren’t as good as I’d like them to be, so the branches sat there waiting for a better idea.

A few months ago a good friend of mine joined Scribd, and we started thinking about this problem together. As a result of our brainstorming we managed to figure out which problems were preventing us from doing efficient caching: …

[Read more]