Showing entries 11 to 20 of 27
« 10 Newer Entries | 7 Older Entries »
Displaying posts with tag: Admin-tips (reset)
Advanced Squid Caching in Scribd: Logged In Users and Complex URLs Handling

It’s been a while since I published my first post about how we cache document pages at Scribd, and the approach has definitely proven really effective since then. In this second post of the series I’d like to explain how our caching architecture handles complex document URLs and logged-in users.

First of all, let’s take a look at a typical Scribd document URL: http://www.scribd.com/doc/1/Improved-Statistical-Test.

As we can see, it consists of a document-specific part (/doc/1) and a non-unique human-readable slug part (/Improved-Statistical-Test). When a user comes to the site with a wrong slug in the document URL, we need to make sure we send the user to the correct …
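
The excerpt is cut off before it says how the wrong-slug case is handled, so the following is only a minimal sketch of the idea, assuming the user is redirected to the canonical URL. The functions canonical_slug_for and check_doc_path are hypothetical names, and the hard-coded lookup stands in for whatever the real site queries.

```shell
#!/bin/sh
# Hypothetical lookup: maps a document id to its canonical slug.
# A real site would query its database here.
canonical_slug_for() {
    case "$1" in
        1) echo "Improved-Statistical-Test" ;;
        *) return 1 ;;
    esac
}

# Decide what to do with a request path of the form /doc/<id>/<slug>.
# Prints "200 <path>" when the slug is correct, "301 <canonical path>"
# when it is wrong or missing (assumed behaviour), "404" otherwise.
check_doc_path() {
    path="$1"
    rest="${path#/doc/}"
    [ "$rest" != "$path" ] || { echo "404"; return; }

    id="${rest%%/*}"
    if [ "$id" = "$rest" ]; then
        slug=""                 # no slug part at all
    else
        slug="${rest#*/}"
    fi

    canonical="$(canonical_slug_for "$id")" || { echo "404"; return; }

    if [ "$slug" = "$canonical" ]; then
        echo "200 $path"
    else
        echo "301 /doc/$id/$canonical"
    fi
}

check_doc_path "/doc/1/Wrong-Slug"
```

Only the /doc/1 part identifies the document here; the slug is compared after the lookup, which is why many different URLs can point at the same page.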

[Read more]
ActiveMQ Tips: Flow Control and Stalled Producers Problem

It’s been a few months since we started actively using the ActiveMQ queue server in our project. For some time we had pretty weird problems with it and even started thinking about switching to something else, or even writing our own queue server that would meet our requirements. The most annoying problem was this: for some time after an ActiveMQ restart everything worked really well, and then ActiveMQ started lagging, the queue started growing and all producer processes stalled on push() operations. We rewrote our producers from Ruby to JRuby, then to Java, and still, after some time, everything was in bad shape until we restarted the queue server.

So, long story short, after reading a lot of docs and source code we found a really interesting thing. There is a “feature” added in the recent ActiveMQ release …

[Read more]
Using SSH tunnel connection as a SOCKS5 proxy

A month ago I was on vacation and, as usual, even though our hotel provided an internet connection at pretty decent speeds, I wasn’t able to work there because they had blocked all TCP ports except some major ones (like 80, 21, etc.), and I needed to use ssh, mysql, IMs and other non-web software.

After a bit of research I found an approach to such connection problems that is simple to set up and easy to use, and I’d like to describe it here.

First, you’ll need someone to start an ssh daemon on port 80 on one of your servers (or you can do it yourself before leaving home). I use one of my Slicehub slices for this so that it is always available. You can do it like this (if it is a temporary solution):

# `which sshd` -p 80

Note: `which sshd` is used here because on some OSes sshd does not want to start w/o an …
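
The excerpt stops before the client side, so here is a minimal sketch of what that might look like, assuming a stock OpenSSH client. The -D option asks ssh to act as a local SOCKS proxy, and -N keeps the session open without running a remote command. The user name, host name and local port 1080 are placeholders, not values from the post.

```shell
#!/bin/sh
# Build the ssh command that opens a local SOCKS proxy through the
# sshd listening on port 80. Arguments: user, host, optional local port.
socks_tunnel_cmd() {
    user="$1"
    host="$2"
    socks_port="${3:-1080}"
    # -p 80  connect to the sshd started on port 80
    # -N     do not run a remote command, only forward traffic
    # -D     dynamic (SOCKS) forwarding on the given local port
    printf 'ssh -p 80 -N -D %s %s@%s\n' "$socks_port" "$user" "$host"
}

# Placeholder account and server; prints the command instead of running it.
socks_tunnel_cmd admin server.example.com
```

Once the printed command is running, any application that supports SOCKS5 can be pointed at localhost and the chosen port, for example with curl --socks5-hostname localhost:1080.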

[Read more]
Lighttpd Book from Packt – Great Thanksgiving Present

Many people know me as an nginx web server evangelist. But, like (IMHO) any professional, I think it is really rewarding to know as much as possible about all the tools available on the market, so that every time you need to make a decision on some technical issue you can weigh all the pros and cons based on your own knowledge.

This is why when I received an email from Packt company asking if I’d like to read and review their book on Lighttpd I decided to give it a shot (I usually do not review any books because I do not always have enough time to read a book thoroughly to be able to write a review). So, here are my impressions from this book.

First, when I received the book, I was in doubt: how could such a small book cover such a flexible and multi-purpose piece of software like …

[Read more]
ActiveMQ + Ruby Stomp Client: How to process elements one by one

A few months ago I switched one of our internal projects from synchronous database saves of analytics data to asynchronous processing using starling + a pool of workers. That was the day I really understood the power of specialized queue servers. I had been using a database (mostly MySQL) for this kind of task for years, and sometimes (especially under highly concurrent load) it was not so fast… A few times I had worked with queue servers, but either the tasks were small or I didn’t have time to really get the idea that specialized queue servers were created to do exactly these tasks quickly and efficiently.

While using starling all this time (a few months now), I noticed a really bad thing in how it works: if workers die (really die, or lock on something for a long time, or just start lagging) …

[Read more]
Found an Ideal I/O Scheduler for my MySQL boxes

Today I was doing some work on one of our database servers (each of them has 4 SAS disks in RAID10 on an Adaptec controller), and it involved a huge multi-threaded, I/O-bound read load. Basically it was a set of parallel full-scan reads from a 300 GB compressed InnoDB table (yes, we use the InnoDB plugin). Looking at iostat I saw pretty much the expected results: 90-100% disk utilization and lots of read operations per second. Then I decided to play around with Linux I/O schedulers and try to increase disk subsystem throughput. Here are the results:

Scheduler    Reads per second
cfq          20000-25000
noop         35000-60000
deadline     33000-45000
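
The excerpt does not show how the schedulers were switched. On Linux kernels that expose the scheduler through sysfs, it can be read and changed per device at runtime, roughly as sketched below. The device name sda is a placeholder, and the function names are made up for this example.

```shell
#!/bin/sh
# Print the active scheduler, i.e. the name the kernel shows in brackets,
# from a line such as "noop deadline [cfq]".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Show the active scheduler of a block device, e.g. "sda".
show_scheduler() {
    active_scheduler "$(cat "/sys/block/$1/queue/scheduler")"
}

# Switch the scheduler at runtime (needs root, not persistent across reboots).
set_scheduler() {
    echo "$2" > "/sys/block/$1/queue/scheduler"
}

# Example on a real machine (not run here):
#   show_scheduler sda
#   set_scheduler sda deadline

# Parsing demo on a sample sysfs line.
active_scheduler "noop deadline [cfq]"   # prints: cfq
```

Because the change takes effect immediately, it is easy to repeat the same read load under each scheduler and compare the iostat numbers.
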
[Read more]
Using Sphinx for Non-Fulltext Queries

How often do you think about the reasons why your favorite RDBMS sucks? Over the last few months I have been doing this quite often, and yes, my favorite RDBMS is MySQL. The reason is that one of my recent tasks at Scribd was fixing scalability problems in document browsing.

The problem with browsing was as simple to describe as it was hard to fix: we have a large data set consisting of a few tables with many fields that have really bad selectivity (flag fields like is_deleted, is_private, etc.; file_type, language_id, category_id and others). As a result, it becomes really hard (if possible at all) to display document lists like “most popular 1-10 page PDF documents in Italian from the category “Business” (of course, non-deleted, …

[Read more]
32bit VS 64bit - what do you use?

Hello my dear readers.

Today I have a question for all of you. What platforms (32-bit or 64-bit) do you use for your servers with more than 4 GB of RAM? I’m asking because recently we’ve hit a few really weird bugs in Linux kernels 2.6.18 to 2.6.22, and all those bugs were PAE-related. Now I’d really love to move all machines to 64-bit, but I’m in doubt because we don’t know much about the Rails stack (ruby, mongrel, haproxy) on 64-bit platforms (all our DB boxes are 64-bit, of course).

So, please drop me a line if you have any experience (negative or positive) with the Rails platform on 64-bit machines. I’d really appreciate your help.

Puppet - Admin’s Best Friend

If you’ve ever worked at a company with 5-10+ servers where it was your responsibility to install new boxes, change configuration files and install new software on many boxes, you definitely know how painful this work is. Every time you need to change something on 3-5-100 boxes, you go to each one and make the change. The most experienced among us used some weird scripts to perform a task on many boxes, or used tools like dsh. Even with those tricks I’d never wish this work on anyone.

While I was working at Galt, I asked our junior admin to check out puppet and try to use it on our servers. After a week of screaming he managed to install and configure it and …

[Read more]
Small Tip: How to set up two interface Xen machine

This will be one of those posts I’d like to publish primarily to be able to come back later and check it out instead of reading the docs again.

So, we have a server with two (or more) network interfaces and we need to be able to use more than one interface in our VDS machines. How do we set it up?

(more…)
