“Application designers need to start by thinking about what level of data integrity they need, rather than what they want, and then design their technology stack around that reality. Everyone would like a database that guarantees perfect availability, perfect consistency, instantaneous response times, and infinite throughput, but it's not possible to create a product with [...]
Pivotal launches. SkySQL and Monty Program merge. And much, much more
Our report on the changes in the MySQL ecosystem is now available for 451 clients and non-clients alike at bit.ly/451mysql
— Matt Aslett (@maslett) April 25, 2013
For 451 Research clients: VMware expands Serengeti’s horizons with updated Hadoop virtualization project bit.ly/17muQFI
— Matt Aslett (@maslett) April 26, 2013
For 451 Research clients: SkySQL, Monty Program merge to support MariaDB following formation of MariaDB Foundation bit.ly/10dsdjf
— Matt Aslett (@maslett) …
Continuing on from yesterday, the biggest news that I’ve noted in the past 24 hours:
- The commitment from Oracle’s MySQL team to release a new GA version about once every 24 months, with a Developer Milestone Release (DMR) of “GA quality” every 4-6 months. Tomas Ulin announced MySQL 5.7 DMR1 (milestone 11) [download, release notes, manual]. He also announced MySQL Cluster 7.3 DMR2 [download, …
MySQL replication enables data to be
replicated from one MySQL database server (the master) to one or
more MySQL database servers (the slaves). Now imagine how many more
use cases could be served if the slave (to which data is
replicated) were not restricted to being a MySQL server, but could be
any other database server or platform, with replication events
applied in real time!
This is what the new Hadoop Applier empowers you to
do.
An example of such a slave could be a data warehouse system such
as Apache
Hive, which uses HDFS as a data store. If you have a Hive
metastore associated with HDFS (Hadoop Distributed File System), the Hadoop
Applier can populate Hive tables in real time. Data is …
This is a follow-up post describing the implementation details
of Hadoop Applier and the steps to configure and install it.
Hadoop Applier integrates MySQL with Hadoop, providing
real-time replication of INSERTs to HDFS, so that the data can be
consumed by the data stores working on top of Hadoop. You can
learn more about the design rationale and prerequisites in the
previous post.
Design and Implementation:
Hadoop Applier replicates rows inserted into a table in MySQL to
the Hadoop Distributed File System (HDFS). It uses an API provided by libhdfs,
a C library to manipulate files in HDFS.
The library comes pre-compiled with Hadoop distributions. It
connects to the MySQL master (or read …
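To make the idea concrete, here is a minimal Python sketch of what "replicating inserted rows to a file store" could look like. It is not the Hadoop Applier itself, which is written against libhdfs in C: a local directory stands in for HDFS, and the names `format_row`, `apply_insert`, and `datafile1.txt` are hypothetical. The only detail borrowed from the Hadoop ecosystem is the Ctrl-A field separator, which is Hive's default for text tables.

```python
import os
import tempfile

# Hive's default field separator for text tables (Ctrl-A).
FIELD_DELIMITER = "\x01"


def format_row(values):
    """Render one inserted row as a single delimited text line."""
    return FIELD_DELIMITER.join(str(v) for v in values) + "\n"


def apply_insert(data_dir, table, values):
    """Append a row to a per-table data file.

    Assumption: a local directory stands in for HDFS, and the file name
    'datafile1.txt' is a hypothetical placeholder.
    """
    table_dir = os.path.join(data_dir, table)
    os.makedirs(table_dir, exist_ok=True)
    path = os.path.join(table_dir, "datafile1.txt")
    # Append mode mirrors the append-only nature of replicated INSERTs.
    with open(path, "a", encoding="utf-8") as f:
        f.write(format_row(values))
    return path


if __name__ == "__main__":
    # Replay two hypothetical INSERT events into a temporary directory.
    root = tempfile.mkdtemp()
    apply_insert(root, "employees", [1, "alice"])
    target = apply_insert(root, "employees", [2, "bob"])
    with open(target, encoding="utf-8") as f:
        print(repr(f.read()))
```

Each INSERT becomes one appended line, so a tool that reads delimited text files from the table's directory would see new rows without any reload step.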
Enabling Real-Time MySQL to HDFS Integration
Batch processing delivered by Map/Reduce remains central to Apache Hadoop, but as the pressure to gain competitive advantage from “speed of thought” analytics grows, Hadoop itself is undergoing significant evolution. Technologies allowing real-time queries, such as Apache Drill, Cloudera Impala and the Stinger Initiative, are emerging, supported by new generations of resource management with Apache YARN.
To support this growing emphasis on real-time operations, we are releasing a new …
A little while ago I blogged about (and open sourced) an Impala-powered soccer visualization demo, designed to demonstrate just how responsive Impala queries can be. Since not everyone has the time or resources to run the project themselves, we’ve decided to host it ourselves on an EC2 instance. You can try the visualization; we’ve also opened up the Impala web interface, where you can see query profiles and performance numbers, and Hue (username and password are both ‘test’), where you can run your own queries on the dataset.
Deploying Impala on EC2
While there are …
ClearStory sheds light on data analysis service. Illuminating ‘dark data’. More.
For 451 clients: ClearStory bags $9m in series A funding, sheds light on its data analysis service bit.ly/Y6v8sV By Krishna Roy
— Matt Aslett (@maslett) February 12, 2013
For 451 clients: Global IDs makes ‘big data’ MDM play via cloud and Hadoop, touts profitable growth bit.ly/Y6v6kL By Krishna Roy
— Matt Aslett (@maslett) February 12, 2013
ScaleBase releases version 2.0 of its MySQL database scalability software bit.ly/WGtEtN
— Matt Aslett (@maslett) …
“With MySQL 5.6, developers can now commingle the “best of both worlds” with fast key-value look up operations and complex SQL queries to meet user and application specific requirements” –Tomas Ulin. On February 5, 2013, Oracle announced the general availability of MySQL 5.6. I have interviewed Tomas Ulin, Vice President for the MySQL Engineering team [...]
Teradata results. Funding for DataXu. The chemistry of data. And more.
For 451 Research clients: Oracle launches major update to MySQL open source database bit.ly/TSONAt
— Matt Aslett (@maslett) February 8, 2013
For 451 clients: Analyzing the chemistry of data bit.ly/TSOV2R By @451wendy Treating sensitive data like dangerous chemicals
— Matt Aslett (@maslett) February 8, 2013
Teradata: Q4 net income $112m on revenue up 10% to $740m, FY net income $419m on revenue up 13% to $2.7bn. bit.ly/14FNS8L (PDF)
— Matt …