Planet MySQL

Displaying posts with tag: Nagios

The Important Role of a Tungsten Rollback Error

Posted by Continuent on Fri 24 May 2019 21:01 UTC
Tags: monitoring, ha, High Availability, Architecture, Nagios, MySQL, multisite, Multimaster, NRPE, Mastering Tungsten Replicator

The Question

Recently, a customer asked us:

What is the meaning of this error message found in trepsvc.log?

2019/05/14 01:48:04.973 | mysql02.prod.example.com | [east - binlog-to-q-0] INFO pipeline.SingleThreadStageTask Performing rollback of possible partial transaction: seqno=(unavailable)

Simple Overview

The Skinny

This message is an indication that we are dropping any uncommitted or incomplete data read from the MySQL binary logs due to a pending error.

The Answer

Safety First

This message often appears just before another error and indicates that we are rolling back anything uncommitted, for safety. On a master this is normally very little, most likely internal transactions in the trep_commit_seqno table, for example.
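
To spot this notice when scanning trepsvc.log, a small parser can help. The following is a minimal sketch in Python, assuming the field layout seen in the single sample line quoted above; the group names service and stage for the bracketed fields are my own labels, not terms confirmed by the source, and other versions may format the line differently.

```python
import re

# Pattern inferred from the one sample line in this post; the layout is an
# assumption and may differ between versions.
ROLLBACK_RE = re.compile(
    r"^(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \| "
    r"(?P<host>\S+) \| "
    r"\[(?P<service>\S+) - (?P<stage>[^\]]+)\] "  # hypothetical field names
    r"INFO\s+pipeline\.SingleThreadStageTask "
    r"Performing rollback of possible partial transaction: "
    r"seqno=(?P<seqno>.+)$"
)


def parse_rollback(line):
    """Return the fields of a rollback notice, or None for any other line."""
    match = ROLLBACK_RE.match(line.strip())
    return match.groupdict() if match else None


sample = (
    "2019/05/14 01:48:04.973 | mysql02.prod.example.com | "
    "[east - binlog-to-q-0] INFO pipeline.SingleThreadStageTask "
    "Performing rollback of possible partial transaction: seqno=(unavailable)"
)
fields = parse_rollback(sample)
print(fields["host"], fields["stage"], fields["seqno"])
```

Because this notice tends to precede another error, a natural follow-up is to print the next few log lines after each match.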

As you may know, the replicator always extracts complete transactions, and so this particular message is …

Performance Tuning Tungsten Replication to MySQL

Posted by Continuent on Tue 21 May 2019 15:57 UTC
Tags: monitoring, ha, High Availability, Architecture, Nagios, MySQL, multisite, Multimaster, NRPE, Mastering Tungsten Replicator

The Question

Recently, a customer asked us:

Why would Tungsten Replicator be slow to apply to MySQL?

The Answer

Performance Tuning 101

When you run trepctl status on a slave and see a high value such as:
appliedLatency : 7332.394
it is almost always because the target database cannot keep up with the applier.
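
As a rough illustration of watching this value, the sketch below pulls appliedLatency out of captured trepctl status text and compares it with a threshold. It assumes the "name : value" layout shown above and treats the value as seconds; the 60-second default threshold and the function names are hypothetical choices for the example, not recommendations from the post.

```python
import re


def applied_latency(status_text):
    """Extract appliedLatency from `trepctl status` text, or None if absent."""
    match = re.search(
        r"^\s*appliedLatency\s*:\s*(-?\d+(?:\.\d+)?)\s*$",
        status_text,
        re.MULTILINE,
    )
    return float(match.group(1)) if match else None


def is_lagging(status_text, threshold_seconds=60.0):
    """True when a latency value is present and exceeds the (hypothetical) threshold."""
    latency = applied_latency(status_text)
    return latency is not None and latency > threshold_seconds


sample = "appliedLatency : 7332.394\n"
print(applied_latency(sample))
print(is_lagging(sample))
```

In practice the text would come from running trepctl status on the slave and capturing its output; here a literal sample keeps the example self-contained.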

This means that we often need to look first to the database layer for the solution.

Here are some of the things to think about when dealing with this issue:

Architecture and Environment
√ Are you on bare metal?
√ Using the cloud?
√ Dev or Prod?
√ Network speed and latency?
√ Distance the data needs to travel?
√ Network round trip times? Is the …

SSH Differences Between Staging and INI Configuration Methods

Posted by Continuent on Tue 07 May 2019 19:54 UTC
Tags: monitoring, ha, High Availability, Architecture, Nagios, MySQL, multisite, Multimaster, NRPE, Mastering Tungsten Clustering

The Question

Recently, a customer asked us:

If we move to using the INI configuration method instead of staging, would password-less SSH still be required?

The Answer

The answer is both “Yes” and “No”

No, for installation and updates/upgrades specifically. Since INI-based configurations force the tpm command to act upon the local host only for installs and updates/upgrades, password-less SSH is not required.

Yes, because there are certain commands that do rely upon password-less SSH to function. These are:

tungsten_provision_slave
prov-sl.sh
multi_trepctl
tpm diag (pre-6.0.5)
tpm diag --hosts (>= 6.0.5)
Any tpm-based backup and restore operations that involve a remote node
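
The list above can be encoded as a small lookup, for example in a pre-flight script. This is only a sketch under stated assumptions: the function name and parameters are hypothetical, a False result means "not on the list above" rather than a guarantee, and the tpm-based backup and restore case is left out because the exact commands are not named here.

```python
# Commands the list above names as always relying on password-less SSH.
SSH_DEPENDENT_COMMANDS = {"tungsten_provision_slave", "prov-sl.sh", "multi_trepctl"}


def parse_version(text):
    """Turn a dotted version such as '6.0.5' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def needs_passwordless_ssh(command, version, uses_hosts_flag=False):
    """Return True if the invocation appears on the list above (hypothetical helper)."""
    if command in SSH_DEPENDENT_COMMANDS:
        return True
    if command == "tpm diag":
        # Before 6.0.5 the list names plain `tpm diag`; from 6.0.5 on,
        # only `tpm diag --hosts` is listed.
        if parse_version(version) < (6, 0, 5):
            return True
        return uses_hosts_flag
    return False


print(needs_passwordless_ssh("tpm diag", "6.0.4"))
print(needs_passwordless_ssh("tpm diag", "6.0.5"))
print(needs_passwordless_ssh("tpm diag", "6.0.5", uses_hosts_flag=True))
```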

Summary

The Wrap-Up

In …

How to Integrate Tungsten Clustering Monitoring Tools with PagerDuty Alerts

Posted by Continuent on Tue 23 Apr 2019 20:07 UTC
Tags: monitoring, ha, High Availability, Architecture, Nagios, MySQL, multisite, Multimaster, NRPE, Mastering Tungsten Clustering

Overview

The Skinny

In this blog post we will discuss how to best integrate various Continuent-bundled cluster monitoring solutions with PagerDuty (pagerduty.com), a popular alerting service.

Agenda

What’s Here?

Briefly explore the bundled cluster monitoring tools
Describe the procedure for establishing alerting via PagerDuty
Examine some of the multiple monitoring tools included with the Continuent Tungsten Clustering software, and provide examples of how to send an email to PagerDuty from each of the tools.

Exploring the Bundled Cluster Monitoring Tools

A Brief Summary

Continuent provides multiple methods out of the box to monitor the cluster health. The most popular is the suite of Nagios/NRPE scripts (i.e. cluster-home/bin/check_tungsten_*). We also have Zabbix scripts (i.e. cluster-home/bin/zabbix_tungsten_*). Additionally, there is …

No-Downtime Cluster Software Upgrades

Posted by Continuent on Tue 23 Oct 2018 14:30 UTC
Tags: monitoring, ha, High Availability, Architecture, Nagios, MySQL, multisite, Multimaster, NRPE, Mastering Tungsten Clustering

One important way to protect your data is to keep your Tungsten Clustering software up-to-date.

A standard cluster deployment uses three nodes, which allows for no-downtime upgrades along with the ability to have a fully available cluster during maintenance.

Please note that with only two database cluster nodes, there is a window of vulnerability created by leaving zero failover candidates available when the lone slave is taken down for service.

The Best Practices: Staging

Performing a No-Downtime Upgrade for a Staging Deployment

When upgrading a Staging-style deployment, all nodes are upgraded at once in parallel via the tools/tpm update command run from inside the staging directory on the staging host.

No Master switch happens, and all layers are restarted to use the new code. …

Picking a Deployment Method: Staging versus INI

Posted by Continuent on Thu 18 Oct 2018 15:05 UTC
Tags: monitoring, ha, High Availability, Architecture, Nagios, MySQL, multisite, Multimaster, NRPE, Mastering Tungsten Clustering

Tungsten Clustering is an extraordinarily flexible tool, with options at every layer of operation.

In this blog post, we will describe and discuss the two different methods for installing, updating and upgrading Tungsten Clustering software.

When first designing a deployment, the question of installation methodology is answered by inspecting the environment and reviewing the customer’s specific needs.

Staging Deployment Methodology

All for One and One for All

Staging deployments were the original method of installing Tungsten Clustering, and relied upon command-line tools to configure and install all cluster nodes at once from a central location called the staging server.

This staging server (which could be one of the cluster nodes) requires SSH access to all …

Cluster Performance Validation via Load Testing

Posted by Continuent on Tue 16 Oct 2018 13:20 UTC
Tags: monitoring, ha, High Availability, Architecture, Nagios, MySQL, multisite, Multimaster, Mastering Continuent Clustering, NRPE

Your database cluster contains your most business-critical data, so proper performance under load is critical to business health. If response time is slow, customers (and staff) get frustrated and the business suffers a slow-down.

If the database layer is unable to keep up with demand, all applications can and will suffer slow performance as a result.

To prevent this situation, use load tests to determine the throughput as objectively as possible.

In the sample load.pl script below, increase load by increasing the thread quantity.

You can also run this against a database that already holds data without polluting it, since new test databases are created and named after each node’s hostname for uniqueness.
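
The load.pl script itself is not reproduced in this excerpt. Purely to illustrate the idea of raising load by raising the thread count, here is a minimal Python sketch; the run_load helper is hypothetical, and the stub operation stands in for whatever query a real test would send to the database.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(operation, threads, ops_per_thread):
    """Run `operation` from several threads; return (completed_ops, ops_per_second)."""

    def worker():
        for _ in range(ops_per_thread):
            operation()
        return ops_per_thread

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(worker) for _ in range(threads)]
        completed = sum(future.result() for future in futures)
    elapsed = time.perf_counter() - start
    throughput = completed / elapsed if elapsed > 0 else float("inf")
    return completed, throughput


def stub_operation():
    """Placeholder for a real database round trip (assumption for the example)."""
    time.sleep(0.001)


for thread_count in (1, 2, 4):
    done, rate = run_load(stub_operation, thread_count, ops_per_thread=20)
    print(f"threads={thread_count} completed={done} ops/sec={rate:.0f}")
```

Stepping the thread count upward and watching where throughput stops improving gives a rough picture of the saturation point.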

Note: The examples in this blog post assume that a Connector is …

Essential Cluster Monitoring Using Nagios and NRPE

Posted by Continuent on Thu 11 Oct 2018 21:21 UTC
Tags: monitoring, ha, High Availability, Architecture, Nagios, MySQL, multisite, Multimaster, Mastering Continuent Clustering, NRPE

In a previous post we went into detail about how to implement Tungsten-specific checks. In this post we will focus on the other standard Nagios checks that would help keep your cluster nodes healthy.

Your database cluster contains your most business-critical data. The slave nodes must be online, healthy and in sync with the master in order to be viable failover candidates.

This means keeping a close watch on the health of the database nodes from many perspectives, from ensuring sufficient disk space to testing that replication traffic is flowing.

A robust monitoring setup is essential for cluster health and viability – if your replicator goes offline and you do not know about it, then that slave becomes effectively useless because it has stale data.
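
As one example of such a check, the sketch below follows the standard Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN) for a disk-space test. It is not one of the bundled checks; the function names and the 80/90 percent thresholds are assumptions chosen for illustration.

```python
import shutil

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_usage(used_percent, warn=80.0, crit=90.0):
    """Map a disk-usage percentage to an (exit_code, message) pair."""
    if used_percent >= crit:
        return CRITICAL, f"DISK CRITICAL - {used_percent:.1f}% used"
    if used_percent >= warn:
        return WARNING, f"DISK WARNING - {used_percent:.1f}% used"
    return OK, f"DISK OK - {used_percent:.1f}% used"


def check_disk(path="/", warn=80.0, crit=90.0):
    """Measure usage of the filesystem holding `path` and classify it."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    return classify_usage(100.0 * usage.used / usage.total, warn, crit)


code, message = check_disk()
print(code, message)
```

A real plugin would print the message and then exit with the returned code (for example via sys.exit) so that Nagios or NRPE can read the status.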

Nagios Checks

The Power of Persistence

One …

Global Multimaster Cluster Monitoring Using Nagios and NRPE

Posted by Continuent on Tue 02 Oct 2018 14:39 UTC
Tags: monitoring, ha, High Availability, Architecture, Nagios, MySQL, multisite, Multimaster, Mastering Continuent Clustering, NRPE

Your database cluster contains your most business-critical data. The slave nodes must be online, healthy and in sync with the master in order to be viable failover candidates.

This means keeping a close watch on the health of the database nodes from many perspectives, from ensuring sufficient disk space to testing that replication traffic is flowing.

Big Brother is Watching You!

The Power of Nagios

Even while you sleep, your servers are busy, and you simply cannot keep watch all the time. Now, more than ever, with global deployments, it is literally impossible to watch everything all the time.

Enter Nagios, your best big brother ever. As a long-time player in the monitoring market, Nagios has both …

How to Add Remote MySQL Server to Nagios Monitoring

Posted by Kedar Vaijanapurkar on Mon 17 Aug 2015 11:47 UTC
Tags: Nagios, MySQL monitoring, mysql tools, MySQL, MySQL-Articles, Monitor MySQL with Nagios

We have already seen two articles on setting up MySQL Monitoring with Nagios and Percona Monitoring Tools for Nagios. Those posts cover configuration of Nagios on a single instance. Though following…

The post How to Add Remote MySQL Server to Nagios Monitoring first appeared on Change Is Inevitable.
