Showing entries 1 to 10 of 61
10 Older Entries »
Displaying posts with tag: Architecture
Peloton: Uber’s Unified Resource Scheduler for Diverse Cluster Workloads

By Min Cai & Mayank Bansal

Cluster management, a common software infrastructure among technology companies, aggregates compute resources from a collection of physical hosts into a shared resource pool, amplifying compute power and allowing for the flexible use of data …

The post Peloton: Uber’s Unified Resource Scheduler for Diverse Cluster Workloads appeared first on Uber Engineering Blog.

No-Downtime Cluster Software Upgrades

One important way to protect your data is to keep your Continuent Clustering software up-to-date.

A standard cluster deployment uses three nodes, which allows for no-downtime upgrades along with the ability to have a fully available cluster during maintenance.

Please note that with only two database cluster nodes, there is a window of vulnerability created by leaving zero failover candidates available when the lone slave is taken down for service.
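The two-node versus three-node difference comes down to simple counting. The sketch below is illustrative only: the function name is hypothetical, and it assumes one master per cluster with every other node acting as a slave.

```python
def failover_candidates(total_nodes: int, slaves_in_maintenance: int = 0) -> int:
    """Count the slaves still able to take over for the master.

    Assumption (not from the original post): exactly one master per cluster,
    and every other node is a slave.
    """
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    slaves = total_nodes - 1
    if not 0 <= slaves_in_maintenance <= slaves:
        raise ValueError("cannot service more slaves than the cluster has")
    return slaves - slaves_in_maintenance


for nodes in (2, 3):
    remaining = failover_candidates(nodes, slaves_in_maintenance=1)
    print(f"{nodes} nodes, one slave down: {remaining} failover candidate(s)")
```

With three nodes, one slave can be serviced while the other remains a failover candidate; with two nodes, servicing the lone slave leaves none.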

The Best Practices: Staging

Performing a No-Downtime Upgrade for a Staging Deployment

When upgrading a Staging-style deployment, all nodes are upgraded at once in parallel via the tools/tpm update command run from inside the staging directory on the staging host.

No master switch occurs, and all layers are restarted to use the new code. …

Picking a Deployment Method: Staging versus INI

Continuent Clustering is an extraordinarily flexible tool, with options at every layer of operation.

In this blog post, we will describe and discuss the two different methods for installing, updating and upgrading Continuent Clustering software.

When first designing a deployment, the question of installation methodology is answered by inspecting the environment and reviewing the customer’s specific needs.

Staging Deployment Methodology

All for One and One for All

Staging deployments were the original method of installing Continuent Clustering, and relied upon command-line tools to configure and install all cluster nodes at once from a central location called the staging server.

This staging server (which could be one of the cluster nodes) requires SSH access to …

Uber’s Big Data Platform: 100+ Petabytes with Minute Latency

By Reza Shiftehfar

Uber is committed to delivering safer and more reliable transportation across our global markets. To accomplish this, Uber relies heavily on making data-driven decisions at every level, from forecasting rider demand during high traffic events to identifying …

The post Uber’s Big Data Platform: 100+ Petabytes with Minute Latency appeared first on Uber Engineering Blog.

Cluster Performance Validation via Load Testing

Your database cluster contains your most business-critical data, so proper performance under load is critical to business health. If response time is slow, customers (and staff) get frustrated and the business suffers a slowdown.

If the database layer is unable to keep up with demand, all applications can and will suffer slow performance as a result.

To prevent this situation, use load tests to determine the throughput as objectively as possible.

In the sample load.pl script below, increase load by increasing the thread quantity.

You could also run this on a database with data in it without polluting the existing data since new test databases are created to match each node’s hostname for uniqueness.
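As a rough illustration of the idea of raising load by raising the thread count, here is a minimal Python sketch. It is not the load.pl script from the original post: `run_query` is a hypothetical stand-in that only sleeps, and the thread counts and query counts are arbitrary. In a real test it would be replaced by an actual query against a dedicated test database.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_query() -> None:
    """Hypothetical stand-in for one database round trip (just sleeps 1 ms)."""
    time.sleep(0.001)


def worker(queries: int) -> int:
    """Run a fixed number of queries on one thread and report how many ran."""
    for _ in range(queries):
        run_query()
    return queries


def measure_throughput(threads: int, queries_per_thread: int = 50) -> float:
    """Return completed queries per second for the given thread count."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        completed = sum(pool.map(worker, [queries_per_thread] * threads))
    elapsed = time.perf_counter() - start
    return completed / elapsed


# Step the load up by increasing the number of concurrent threads.
for threads in (1, 2, 4, 8):
    print(f"{threads:>2} threads: {measure_throughput(threads):8.0f} queries/s")
```

Throughput that stops rising as threads are added suggests the point at which the layer under test has saturated.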

Note: The examples in this blog post assume that a Connector is …

Essential Cluster Monitoring Using Nagios and NRPE

In a previous post we went into detail about how to implement Tungsten-specific checks. In this post we will focus on the other standard Nagios checks that would help keep your cluster nodes healthy.

Your database cluster contains your most business-critical data. The slave nodes must be online, healthy and in sync with the master in order to be viable failover candidates.

This means keeping a close watch on the health of the database nodes from many perspectives, from ensuring sufficient disk space to testing that replication traffic is flowing.

A robust monitoring setup is essential for cluster health and viability – if your replicator goes offline and you do not know about it, then that slave becomes effectively useless because it has stale data.
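To make the disk-space case concrete, below is a minimal sketch of a Nagios-style check in Python. It follows the standard Nagios plugin convention of status codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). The function names and the 80%/90% thresholds are assumptions for illustration, not values from the original post, and this is not one of the Tungsten-specific checks.

```python
import shutil

# Standard Nagios plugin status codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify(percent_used: float, warn: float = 80.0, crit: float = 90.0) -> int:
    """Map a disk usage percentage to a status code (thresholds are assumed)."""
    if percent_used >= crit:
        return CRITICAL
    if percent_used >= warn:
        return WARNING
    return OK


def check_disk(path: str = "/") -> tuple[int, str]:
    """Return (status, one-line message) for the filesystem holding `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - no capacity reported for {path}"
    percent_used = 100.0 * usage.used / usage.total
    status = classify(percent_used)
    return status, f"DISK {LABELS[status]} - {percent_used:.1f}% used on {path}"


status, message = check_disk("/")
print(message)
# A real plugin would finish with sys.exit(status) so Nagios/NRPE can read the result.
```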

Nagios Checks

The Power of Persistence

One …

Global Multimaster Cluster Monitoring Using Nagios and NRPE

Your database cluster contains your most business-critical data. The slave nodes must be online, healthy and in sync with the master in order to be viable failover candidates.

This means keeping a close watch on the health of the database nodes from many perspectives, from ensuring sufficient disk space to testing that replication traffic is flowing.

A robust monitoring setup is essential for cluster health and viability – if your replicator goes offline and you do not know about it, then that slave becomes effectively useless because it has stale data.

Big Brother is Watching You!

The Power of Nagios

Even while you sleep, your servers are busy, and you simply cannot keep watch all the time. Now, more than ever, with global deployments, it is literally impossible to watch everything all the time.

Enter Nagios, your best big brother ever. As a long-time player in the monitoring market, Nagios has both …

Worldwide Multimaster Cluster Administration Using Tungsten Dashboard

Continuent Clustering supports true distributed multimaster clustering. In this topology, there are cross-site replicator services for each remote site. In a 3-site configuration, there are a total of 9 replication streams to manage.
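One way to arrive at the figure of 9 is sketched below. The breakdown is an assumption, since the text only gives the total: each site is taken to run one local replication service plus one cross-site service per remote site. The function name is hypothetical.

```python
def replication_streams(sites: int) -> int:
    """Count replication streams under the assumed breakdown.

    Assumption: one local service per site, plus one cross-site
    service at each site for every remote site.
    """
    if sites < 1:
        raise ValueError("need at least one site")
    local = sites
    cross_site = sites * (sites - 1)
    return local + cross_site


for n in (2, 3, 4):
    print(f"{n} sites: {replication_streams(n)} replication streams")
```

Under that assumption the count grows with the square of the number of sites, which is why a summary view of all streams becomes useful as sites are added.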

Continuent Clustering also offers a graphical administration tool called the Tungsten Dashboard to help with your management burden. The GUI makes the deployment much easier to visualize and administer.

For our example, we will have a Composite Multimaster dataservice called global with three active, writable member clusters (one per site), east, west and north.

Dashboard Summary View

In the summary, collapsed view, the composite service and all member clusters are listed with associated information and controls. Note that the Type for the composite dataservice global is CompMM

Databook: Turning Big Data into Knowledge with Metadata at Uber

By Luyao Li, Kaan Onuk, Lauren Tindal

From driver and rider locations and destinations, to restaurant orders and payment transactions, every interaction on Uber’s transportation platform is driven by data. Data powers Uber’s global marketplace, enabling more reliable and seamless …

The post Databook: Turning Big Data into Knowledge with Metadata at Uber appeared first on Uber Engineering Blog.

Donkey System

The Donkey system is a fully automated MySQL database change system.
It is a great help both for business releases and for the company's automated operations and maintenance.

Donkey.pptx
Donkey_intro.pdf
