Showing entries 1 to 10 of 18
8 Older Entries »
Displaying posts with tag: ETL
What products & improvements are new on AWS?

Amazon is releasing new products & services to its global cloud compute network at a rate that has all of our heads spinning. Join 32,000 others and follow Sean Hull on Twitter @hullsean. Here’s new stuff worth mentioning around databases & data. 1. For ETL – AWS GLUE Moving data from your transactional MySQL or … Continue reading What products & improvements are new on AWS? →

Using JSON’s Arrays for MariaDB Dynamic Columns

The JSON format includes the concept of an array. A JSON object can contain an attribute of array type. We have seen that we can use the UDFs (user-defined functions) provided by the MariaDB CONNECT Storage Engine to implement dynamic columns.

Let us create a table with a text column containing a JSON string and let [...]

MariaDB CONNECT Storage Engine JSON Autodiscovery

The MariaDB CONNECT Storage Engine offers access to JSON files and allows you to see an external JSON file as a MariaDB table. A nice feature of the CONNECT Storage Engine is its ability to auto-discover a table structure when the table corresponds to external data. In our case the CONNECT Storage Engine will automatically [...]

Log Buffer #429: A Carnival of the Vanities for DBAs

This Log Buffer Edition gathers a wide sample of blogs and then distills the best ones from Oracle, SQL Server and MySQL.

Oracle:

  • If you take a look at the “alter user” command in the old 9i documentation, you’ll see this: DEFAULT ROLE Clause.
  • There’s been an interesting recent discussion on the OTN Database forum regarding “Index blank blocks after a large update that was rolled back.”
  • 12c Parallel Execution New Features: 1 SLAVE distribution
  • Index Tree Dumps in Oracle 12c …
[Read more]
Resources for Database Clusters: Performance Tuning for HAProxy, Support for MariaDB 10, Technical Blogs & More

August 28, 2014 By Severalnines

Check Out Our Latest Resources for MySQL, MariaDB & MongoDB Clusters

Here is a summary of resources & tools that we’ve made available to you over the past few weeks. If you have any questions about these, feel free to contact us!

New Technical Webinars

Performance Tuning of HAProxy for Database Load Balancing

09 September 2014 - with Baptiste Assmann of HAProxy Technologies

Do you know what HAProxy can tell you about your application and database instances? Do you know the difference …

[Read more]
Big Data Integration & ETL - Moving Live Clickstream Data from MongoDB to Hadoop for Analytics

June 16, 2014 By Severalnines

MongoDB is great at storing clickstream data, but using it to analyze millions of documents can be challenging. Hadoop provides a way of processing and analyzing data at large scale. Since it is a parallel system, workloads can be split across multiple nodes and computations on large datasets can be done in relatively short timeframes. MongoDB data can be moved into Hadoop using ETL tools like Talend or Pentaho Data Integration (Kettle).

In this blog, we’ll show you how to integrate your MongoDB and Hadoop datastores using Talend. We have a MongoDB database collecting clickstream data from several websites. We’ll create a job in Talend to extract the documents from MongoDB, transform them, and then load them into HDFS. We will also show you how to schedule this job to be executed every 5 minutes.

Test Case

We have an application …

[Read more]
MariaDB CONNECT Storage Engine as an ETL (or ELT)?

The MariaDB CONNECT Storage Engine allows access to heterogeneous data sources. In my previous post I showed you how to use the MariaDB CONNECT Storage Engine to access an Oracle database. This is quite easy through the CONNECT Storage Engine ODBC table type.

For most architectures where heterogeneous databases are involved, an ETL (Extract-Transform-Load) is [...]

Exploring SAP HANA – Powering Next Generation Analytics

SAP HANA, having entered the data 2.0/3.0 space at the right time, has been gaining traction lately, and there will be a lot of users like me who want to [...]

Take the time now for gains later.

Regardless of which data warehouse paradigm you follow or have heard of, Kimball or Inmon, we should all agree that the data warehouse is often a business requirement. Different people want different things, and they all want them from your data. The data warehouse is not a new concept, and yet it is overlooked at times. A warehouse is never complete; it is an evolving entity that adjusts to the requirements it is given. It is up to us to make sure that accurate, timely access to enterprise data is easy and is the standard. MySQL can handle a data warehouse perfectly.
MySQL databases are designed in numerous ways, some good, some bad. A warehouse can take that data and organize it for the best use by others. What concerns or issues do you often hear when it comes to gathering data from your database? Is it easy for all of your developers to query and get the same data? How many ways does your company slice and dice data? …

[Read more]
HPCC vs Hadoop at a glance

Update

Since this article was written, HPCC has undergone a number of significant changes and updates. These address some of the criticism voiced in this blog post, such as the license (updated from AGPL to Apache 2.0) and integration with other tools. For more information, refer to the comments placed by Flavio Villanustre and Azana Baksh.

The original article can be read unaltered below:

Yesterday I noticed a tweet by Andrei Savu. This prompted me to read the related GigaOM article and then check out the HPCC Systems …

[Read more]