Showing entries 31 to 40 of 1008
Displaying posts with tag: Uncategorized
LinkedIn China new Social Platform Chitu. Interview with Dong Bin.

“Complicated queries, like looking for second degree friends, is really hard to traditional databases.” –Dong Bin

I have interviewed Dong Bin, Engineer Manager at LinkedIn China. The LinkedIn China development team launched a new social platform — known as Chitu — to attract a meaningful segment of the Chinese professional networking market.

RVZ

Q1. What is your role at LinkedIn China?

Dong Bin: I am an Engineer Manager in charge of the backend services for Chitu. The backend includes all of Chitu's consumer-based features, like feeds, chat, events, etc.

Q2. You recently launched a new social platform, called Chitu. Which segment of the …

[Read more]
Optimized State Snapshot Transfers in a WAN Environment

Introduction

Galera Cluster is a robust product that allows you to create geo-distributed database clusters where the nodes are located in geographically separate locations. WAN links tend to be slow while database sizes tend to grow, and the disparity may become painfully obvious in cases where the entire dataset needs to be shipped over the network.

If a node joins the cluster either for the first time or after a period of prolonged downtime, it may need to obtain a complete snapshot of the database from some other node. This operation is called State Snapshot Transfer or SST, and is often reasonably quick in a LAN environment.

In a geo-distributed cluster, however, the dataset may need to travel over a slow WAN link. A transfer that takes seconds over a 10Gb network can take hours over a cable modem.
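The gap between "seconds" and "hours" is easy to check with back-of-the-envelope arithmetic. The sketch below is illustrative only: the 10 GB dataset size and the 10 Mbit/s cable-modem uplink are assumptions, not figures from the post, and the calculation ignores protocol overhead, latency and compression.

```python
def transfer_seconds(dataset_bytes: float, link_bits_per_second: float) -> float:
    """Idealized transfer time: payload bits divided by link speed."""
    return dataset_bytes * 8 / link_bits_per_second


GB = 10**9
dataset = 10 * GB        # assumed dataset size: 10 GB
lan = 10 * 10**9         # 10 Gbit/s LAN
wan = 10 * 10**6         # assumed cable-modem uplink: 10 Mbit/s

lan_seconds = transfer_seconds(dataset, lan)
wan_hours = transfer_seconds(dataset, wan) / 3600

print(f"LAN: {lan_seconds:.0f} s, WAN: {wan_hours:.1f} h")
# prints "LAN: 8 s, WAN: 2.2 h"
```

Under these assumptions the same snapshot is a thousand times slower over the WAN link, which is why the size of an SST matters far more in a geo-distributed cluster.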

SST does not happen during the normal operation of the cluster, but may be needed during an outage situation …

[Read more]
Faceted search, why the DevAPI could matter one day

Faceted search or faceted navigation is a highly praised and widely used search pattern. And it is a great reply to an off-the-record sales engineering question. MySQL finally has some document store features built in. A bit of a yawn in 2016. There is a new X DevAPI available with some Connectors. A bit of a yawn technically. But it is a non-technical change of mind: developer centric counts! Sales, all, technical value could show at non-trivial developer tasks, like faceted search.

Today's X DevAPI does not get you very far

There are great stories to tell about the X …

[Read more]
On Using HPE Vertica. Interview with Eva Donaldson.

“After you have built out your data lake, use it. Ask it questions. You will begin to see patterns where you want to dig deeper. The Hadoop ecosystem doesn’t allow for that digging and not at a speed that is customer facing. For that, you need some sort of analytical database.”– Eva Donaldson.

I have interviewed Eva Donaldson, software engineer and data architect at iContact. The main topic of the interview is her experience in using HPE Vertica.

RVZ

Q1. What is the business of iContact?

Eva Donaldson: iContact is a provider of cloud based email marketing, marketing automation and social media marketing products. We offer expert advice, design services, and an award-winning Salesforce email integration and Google Analytics tracking features specializing in …

[Read more]
MySQL Workbench 6.3.7 GA has been released

Dear MySQL users,

The MySQL developer tools team announces 6.3.7 as our GA release for
MySQL Workbench 6.3.

For the full list of changes in this revision, visit
http://dev.mysql.com/doc/relnotes/workbench/en/changes-6-3.html

For discussion, join the MySQL Workbench Forums:
http://forums.mysql.com/index.php?152

Download MySQL Workbench 6.3.7 GA now, for Windows, Mac OS X 10.9+,
Oracle Linux 6 and 7, Fedora 23 and Fedora 24, Ubuntu 16.04
or sources, from:

http://dev.mysql.com/downloads/tools/workbench/

Downsizing to SSDs

System management can be a big deal. At Etsy, we DBAs have been feeling the pain of getting spread too thin. You get a nice glibc vulnerability and have to patch and reboot hundreds of servers. There go your plans for the week.

We decided last year to embark on a 2016 mission to get better performance, easier management and reduced power utilization through a farm-wide reduction in server count for our user generated, sharded data.

Scoping the Problems

Taking on a massive project to replace your entire fleet of hardware is not an easy sell, especially when on the surface it seems like everything is fine. As DBAs, we experienced challenges that were not readily visible to the rest of the organization or even to end users. The first challenge was to quantify the reasons for doing such a large uplift.

Server Quantity

In the beginning, Etsy had one instance of mysqld per server, and one …

[Read more]
Source of Truth or Source of Madness?

This year at Etsy, we spun up a “Database Working Group” that talks about all things data. It’s made up of members from many teams: DBA, core development, development tools and data engineering (Hadoop/Vertica). At our last two meetings, we started talking about how many “sources of information” we have in our environment. I hesitate to call them “sources of truth” because in many cases, we just report information to them, not action data based on them. We spent a session whiteboarding all of these sources and drawing the relationships between them. It was a bit overwhelming to actually visualize the madness.

A few examples:

  • We use Chef for configuration management and Chef knows about all database servers. It made sense for us to build out our monitoring to generate Nagios configuration based on that data from Chef. When …
[Read more]
on removing files

If you remove a file, the file system generally just marks in its metadata that the previously occupied blocks can now be used for other files. That operation is usually cheap, unless the file has millions of segments (a rare case, only seen in experimental InnoDB features that Oracle thought were a good idea).

This changes a bit with SSDs – if you update the underlying device metadata, the device can do smarter compaction / grooming / garbage collection underneath. Linux file systems have a ‘discard’ option that one should use on top of SSDs – it will extend the lifetime of the storage quite a bit by TRIM’ing the underlying blocks.

Now, each type of storage device will react differently to that: some support large TRIM commands, some support a high rate of them, some support neither, and so on. One has to take that into account when removing files in production environments.
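One common way to take this into account is to shrink a large file in steps before unlinking it, so that the blocks are freed a piece at a time rather than all at once. The following is a minimal sketch of that idea, not the approach described in the original post: the function name, the 1 GiB default step and the optional pause callback are all hypothetical, and whether smaller truncations actually smooth out the discard load depends on the file system and the device, so it should be measured.

```python
import os


def remove_gradually(path, step_bytes=1 << 30, pause=None):
    """Shrink a file in steps, then unlink it.

    Returns the number of truncate calls made. `pause` is an optional
    callable invoked after each step (for example a short sleep).
    """
    size = os.path.getsize(path)
    steps = 0
    while size > 0:
        # Cut at most step_bytes off the end of the file.
        size = max(0, size - step_bytes)
        os.truncate(path, size)
        steps += 1
        if pause is not None:
            pause()
    os.remove(path)
    return steps
```

A call such as `remove_gradually("/path/to/big.ibd", pause=lambda: time.sleep(0.1))` (with `time` imported and a hypothetical path) would free the file one gigabyte at a time with a short rest between steps.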

Currently Linux block …

[Read more]
A Grand Tour of Big Data. Interview with Alan Morrison

“Leading enterprises have a firm grasp of the technology edge that’s relevant to them. Better data analysis and disambiguation through semantics is central to how they gain competitive advantage today.”–Alan Morrison.

I have interviewed Alan Morrison, senior research fellow at PwC, Center for Technology and Innovation.
The main topic of the interview is how the Big Data market is evolving.

RVZ

Q1. How do you see the Big Data market evolving? 

Alan Morrison: We should note first of all how true Big Data and analytics methods emerged and what has been disruptive. Over the course of a decade, web companies have donated IP and millions of lines of code that serves as the foundation for what’s being built on top.  In the …

[Read more]
MySQL 5.7 multi-source replication – automatically combining data from multiple databases into one

MySQL’s multi-source replication allows you to replicate data from multiple databases into one database in parallel (at the same time). This post will explain and show you how to set up multi-source replication. (WARNING: This is a very long and detailed post. You might want to grab a sandwich and a drink.)

In most replication environments, you have one master database and one or more slave databases. This topology is used for high-availability scenarios, where the reads and writes are split between multiple servers. Your application sends the writes to the master, and reads data from the slaves. This is one way to scale MySQL horizontally for reads, as you can have more than one slave. Multi-source replication allows you to write to multiple MySQL instances, and then combine the data into one server.
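In MySQL 5.7, each source that feeds the combined server is attached through its own named replication channel. As a rough sketch of what that involves, the helper below builds the per-channel statements for a GTID-based setup. The helper itself, the channel names, host names and replication user are hypothetical; the sketch assumes GTIDs are enabled on every server and that the replica runs with `master_info_repository=TABLE` and `relay_log_info_repository=TABLE`. The password is deliberately left out and has to be supplied separately, and the values are interpolated without escaping, so they must come from a trusted source.

```python
def channel_statements(channel, host, user, port=3306):
    """Build the SQL that attaches one source to the replica as a named channel."""
    return [
        # MASTER_AUTO_POSITION=1 relies on GTID-based replication.
        f"CHANGE MASTER TO MASTER_HOST='{host}', MASTER_PORT={port}, "
        f"MASTER_USER='{user}', MASTER_AUTO_POSITION=1 "
        f"FOR CHANNEL '{channel}';",
        f"START SLAVE FOR CHANNEL '{channel}';",
    ]


# Hypothetical sources: two masters feeding one combined server.
sources = {
    "master-1": "db1.example.com",
    "master-2": "db2.example.com",
}

for name, host in sources.items():
    for statement in channel_statements(name, host, user="repl"):
        print(statement)
```

Each channel keeps its own replication position, so the sources can be started, stopped and monitored independently on the combined server.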

Here is a quick overview of …

[Read more]