Showing entries 1 to 10 of 93
10 Older Entries »
Displaying posts with tag: data (reset)
Black-Box Auditing: Verifying End-to-End Replication Integrity between MySQL and Redshift

Since Yelp introduced its real-time streaming data infrastructure, “Data Pipeline”, it has grown in scope and matured vastly. It now supports some of Yelp’s most critical business requirements in its mission to connect people with great local businesses. Today, it has expanded into a diverse ecosystem of connectors sourcing data from Kafka and MySQL, and sinking data into Cassandra, Elasticsearch, Kafka, MySQL, Redshift, and S3. To ensure that the whole ecosystem is functioning correctly, Yelp’s Data Pipeline infrastructure is continually growing its repertoire of reliability techniques such as write-ahead logging, two-phase commit, fuzz testing, monkey testing, and black-box auditing to...

Array Ranges in MySQL JSON

Pretend you have a JSON array of data that looks roughly like the following.

mysql> insert into x(y) values('["a","b","c","d"]');
Query OK, 1 row affected (0.10 sec)


You could get all the values from that array using $[*]


mysql> select y->"$[*]" from x;
+----------------------+
| y->"$[*]" |
+----------------------+
| ["a", "b", "c", "d"] |
+----------------------+
1 row in set (0.00 sec)

Or the individual members of the array with an index that starts with zero.


mysql> select y->"$[0]" from x;
+-----------+
| y->"$[0]" |
+-----------+
| "a" |
+-----------+
1 row in set (0.00 sec)


But what about the times you want the last item in the array and really do not want to loop through all the items? How about using …

[Read more]
Two New MySQL Books!

There are two new MySQL books both from Apress Press. One is an in depth master course on the subject and the other is a quick introduction.


ProMySQL NDB Cluster is subtitled Master the MySQL Cluster Lifecycle and at nearly 700 pages it is vital resource to anyone that runs or is thinking about running NDB Cluster. The authors, Jesper Wisborg Krogh and Mikiya Okuno, have distilled their vast knowledge of this difficult subject in a detail packed but easily readable book.  MySQL Cluster is much more complex in many areas than a regular MySQL server and here you will find all those details. If you run MySQL NDB Cluster then you need this book. The partitioning information in chapter 2 is worth the price of the book alone.  I am only a third of the way …

[Read more]
Handy JSON to MySQL Loading Script

JSON in Flat File to MySQL DatabaseSo how do you load that JSON data file into MySQL. Recently I had this question presented to me and I thought I would share a handy script I use to do such work. For this example I will use the US Zip (postal) codes from JSONAR. Download and unzip the file. The data file is named zips.json and it can not be bread directly into MySQL using the SOURCE command. It needs to have the information wrapped in a more palatable fashion.

head zips.json 
{ "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA", "_id" : "01001" }
{ "city" : "CUSHMAN", "loc" : [ -72.51564999999999, 42.377017 ], "pop" : 36963, "state" : "MA", "_id" : "01002" }
{ "city" : "BARRE", "loc" : [ -72.10835400000001, 42.409698 ], "pop" : 4546, "state" : "MA", "_id" : "01005" }
{ "city" : "BELCHERTOWN", "loc" : [ -72.41095300000001, …
[Read more]
How to interview an amazon database expert

via GIPHY Amazon releases a new database offering every other day. It sure isn’t easy to keep up. Join 35,000 others and follow Sean Hull on twitter @hullsean. Let’s say you’re hiring a devops & you want to suss out their database knowledge? Or you’re hiring a professional services firm or freelance consultant. Whatever the … Continue reading How to interview an amazon database expert →

Using find() with the MySQL Document Store

The VideoThe find() function for the MySQL Document Store is a very powerful tool and I have just finished a handy introductory video. By the way -- please let me have feed back on the pace, the background music, the CGI special effects (kidding!), and the amount of the content. The ScriptFor those who want to follow along with the videos, the core examples are below. The first step is to connect to a MySQL server to talk to the world_x schema (Instructions on loading that schema at the first link above).

\connect root@localhost/world_x

db is an object to points to the world_x schema. To find the records in the countryinfo collection, use db.countryinfo.find(). But that returns 237 JSON documents, too many! So lets cut it down to …

[Read more]
MySQL Document Store Video Series

I am starting a series of videos on the MySQL Document Store. The Document Store allows those who do not know Structured Query Language (SQL) to use a database without having to know the basics of relational databases, set theory, or data normalization. The goal is to have sort 2-3 minute episodes on the various facets of the Document Store including the basics, using various programming languages (Node.JS, PHP, Python), and materializing free form schemaless, NoSQL data into columns for use with SQL.

The first Episode, Introduction, can be found here.

Please provide feedback and let me know if there are subjects you would want covered in the near future.

Random human recognizable dataset

We all do need sometimes to generate raw valid dummy data for our use cases and applications as we start them. Obviously, one can write their own scripts to generate random data, but it is much better to have data, to which human beings can associate with like names, addresses instead of having them filled with random "lorem ipsum" string data :)

While searching for such a tool, I found a site which does exactly this: http://www.generatedata.com/

Documentation: http://benkeen.github.io/generatedata/

This can also be downloaded and installed locally. It supports three types of installations:
- A single, anonymous user account
- A single user account, requires login
- Multiple accounts

Below is the set of wide varied data types it supports for …

[Read more]
A roughneck walk down database alley

via GIPHY I was just responding to some Disqus comments on a recent blog post. Admittedly it had a provocative title Will SQL databases just die already. What do you think? Join 34,000 others and follow Sean Hull on twitter @hullsean. A reader pointed out that some No-SQL databases do support joins. Huh? My face … Continue reading A roughneck walk down database alley →

Will SQL just die already?

With tons of new No-SQL database offerings everyday, developers & architects have a lot of options. Cassandra, Mongodb, Couchdb, Dynamodb & Firebase to name a few. Join 33,000 others and follow Sean Hull on twitter @hullsean. What’s more in the data warehouse space, you have Hadoop, which can churn through terabytes of data and get … Continue reading Will SQL just die already? →

Showing entries 1 to 10 of 93
10 Older Entries »