Showing entries 1 to 30 of 45 Next 15 Older Entries

Displaying posts with tag: analytics

MariaDB CONNECT Storage Engine replay & slides available

The slides and replay of yesterday’s webinar on the MariaDB CONNECT storage engine have just been posted. First I want to thank the numerous attendees. You have shown great interest in the parallel execution of queries on distributed MySQL servers. I agree this is cool. The ODBC capabilities also seem to generate interest. This makes [...]

MariaDB CONNECT Storage Engine and parallelism

The CONNECT storage engine implements the concept of a table made of multiple tables. These underlying tables can be distributed remotely. For example, the underlying remote tables can be of the ODBC or MySQL table type. This allows distributed queries to be executed. What is nice is that we can execute such a distributed query in parallel.

How [...]
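
As a rough illustration of the idea above, a table made of remote tables might be declared along these lines. This is a sketch based on my reading of the CONNECT documentation, not the statements used in the post: the host names, user, database, table and column names are all hypothetical, and the `thread=yes` option for parallel execution is an assumption that should be checked against the CONNECT version you run.

```sql
-- Hypothetical proxy tables, each pointing at a table on a remote MySQL server.
-- Hosts (node1, node2), user (app_user), database (sales) and table (orders)
-- are placeholders, not names from the original post.
CREATE TABLE orders_node1 (
  customer_id INT NOT NULL,
  amount DECIMAL(10,2) NOT NULL
) ENGINE=CONNECT TABLE_TYPE=MYSQL
  CONNECTION='mysql://app_user@node1/sales/orders';

CREATE TABLE orders_node2 (
  customer_id INT NOT NULL,
  amount DECIMAL(10,2) NOT NULL
) ENGINE=CONNECT TABLE_TYPE=MYSQL
  CONNECTION='mysql://app_user@node2/sales/orders';

-- A TBL table presents the listed tables as a single table.
-- thread=yes (assumed option name) asks CONNECT to query the
-- remote tables in parallel rather than one after another.
CREATE TABLE orders_all (
  customer_id INT NOT NULL,
  amount DECIMAL(10,2) NOT NULL
) ENGINE=CONNECT TABLE_TYPE=TBL
  TABLE_LIST='orders_node1,orders_node2'
  OPTION_LIST='thread=yes';

-- One query, answered from every underlying remote table.
SELECT customer_id, SUM(amount) AS total_amount
FROM orders_all
GROUP BY customer_id;
```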

InfiniDB column store moves to open source! Congrats!

Like TokuDB, InfiniDB is now a fully open source server product. In the past InfiniDB was “almost open source”. The open source version was an old release with no access to the advanced functions like MPP multi-server execution. This is no longer the case. With InfiniDB 4 the open source version is the latest release [...]

Data Analytics at NBCUniversal. Interview with Matthew Eric Bassett.
“The most valuable thing I’ve learned in this role is that judicious use of a little bit of knowledge can go a long way. I’ve seen colleagues and other companies get caught up in the “Big Data” craze by spending hundreds of thousands of pounds sterling on a Hadoop cluster that sees a few megabytes [...]
MySQL webinar: ‘Introduction to open source column stores’

Join me Wednesday, September 18 at 10 a.m. PDT for an hour-long webinar where I will introduce the basic concepts behind column store technology. The webinar’s title is: “Introduction to open source column stores.”

What will be discussed?

This webinar will talk about Infobright, LucidDB, MonetDB, Hadoop (Impala) and other column stores

  • I will compare features between major column stores (both open and closed source).
  • Some benchmarks will be used to demonstrate the basic
  [Read more...]
On PostgreSQL. Interview with Tom Kincaid.
“Application designers need to start by thinking about what level of data integrity they need, rather than what they want, and then design their technology stack around that reality. Everyone would like a database that guarantees perfect availability, perfect consistency, instantaneous response times, and infinite throughput, but it’s not possible to create a product with [...]
What’s the data on the 3Ci Data Team?
3Ci processes over a billion transactions a month. More than 100 million unique U.S. consumers have engaged with a business through our platform. All that activity creates massive amounts of data. The Data Team at 3Ci is responsible for keeping our offerings running at optimal performance and for making sense of our data. They manage MySQL [...]
Latest Addition to MySQL & Cloud Database Solutions Day: Calpont CTO Jim Tommaney joins as guest speaker

Join us next Friday in Santa Clara for a free day of learning and fun from the SkySQL & MariaDB gang & their partners

We’re proud to announce that Jim Tommaney, CTO of Calpont, has just signed on to speak at the MySQL & Cloud Database Solutions Day, hosted by SkySQL and MariaDB - taking place next Friday, April 26, directly after Percona Live: MySQL Conference & Expo.


Big Data for Genomic Sequencing. Interview with Thibault de Malliard.
“Working with empirical genomic data and modern computational models, the laboratory addresses questions relevant to how genetics and the environment influence the frequency and severity of diseases in human populations” –Thibault de Malliard. Big Data for Genomic Sequencing. On this subject, I have interviewed Thibault de Malliard, researcher at the University of Montreal’s Philip Awadalla [...]
Super Python: three applications involving IRC bot master, MySQL optimization, and Website stress testing.

In my ongoing efforts to migrate my fun side projects and coding experiments from SVN to Git, I’ve come across some of my favorite Python-based apps – which are all available in their respective repos on BitBucket, as follows:

IRC Bot Commander

  • What it does: it’s an IRC bot that takes commands and does your bidding on whichever remote server the bot is installed on.
  • How it does it: the bot runs on whichever server you install it on, connects to the IRC server and channel you configured, and waits for you to give it commands. It then executes each command and returns the output to your IRC chat window.
  [Read more...]
Data Science vs. Data Analytics
As this topic came up a few times this week for discussion at various places, I thought of composing a post on “Data Scientist vs. Data Analytics Engineer”; even though[...]
Bash scripting: ElasticSearch and Kibana init.d scripts

As a follow-up to the previous post about logstash, here are a couple of related init scripts for anyone implementing the open source log analytics setup that is explained over at divisionbyzero. These have been tested on CentOS 6.3 and are based on the generic RC functions from Red Hat, so they will work with Red Hat, CentOS, Fedora, Scientific Linux, etc.

  [Read more...]
On Big Data, Analytics and Hadoop. Interview with Daniel Abadi.
“Some people even think that “Hadoop” and “Big Data” are synonymous (though this is an over-characterization). Unfortunately, Hadoop was designed based on a paper by Google in 2004 which was focused on use cases involving unstructured data (e.g. extracting words and phrases from Webpages in order to create Google’s Web index). Since it was not [...]
Two Cons against NoSQL. Part I.
Two cons against NoSQL data stores read like this: 1. It’s very hard to move data out from one NoSQL store to some other system, even another NoSQL store. There is a very hard lock-in when it comes to NoSQL. If you ever have to move to another database, you basically have to re-implement a lot [...]
On Eventual Consistency– Interview with Monty Widenius.
“For analytical things, eventual consistency is ok (as long as you can know after you have run them if they were consistent or not). For real world involving money or resources it’s not necessarily the case.” — Michael “Monty” Widenius. In a recent interview, I asked Justin Sheehy, Chief Technology Officer at Basho Technologies, maker [...]
Scaling MySQL and MariaDB to TBs: Interview with Martín Farach-Colton.
“While I believe that one size fits most, claims that RDBMS can no longer keep up with modern workloads come in from all directions. When people talk about performance of databases on large systems, the root cause of their concerns is often the performance of the underlying B-tree index”– Martín Farach-Colton. Scaling MySQL and MariaDB [...]
So now Hadoop's days are numbered?
Earlier this week we all read GigaOM's article with this title:
"Why the days are numbered for Hadoop as we know it"
I know GigaOM likes to provoke scandals sometimes (we all remember some other unforgettable piece), but there is something behind it...

Hadoop today (after SOA not so long ago) is one of the worst cases of an abused buzzword ever known to man. It's everything, everywhere, can cure illnesses and do "big-data" at the same time! Wow! Actually, Hadoop is a software framework that supports data-intensive distributed applications, derived from Google's MapReduce and Google File System (GFS) papers.

My take from the article is

  [Read more...]
Scale differences between OLTP and Analytics

In my previous post, http://database-scalability.blogspot.com/2012/05/oltp-vs-analytics.html, I reviewed the differences between OLTP and Analytics databases.

The scale challenges differ between these two worlds of databases.

Scale challenges in the Analytics world come from the growing amounts of data. Most solutions leverage three main techniques: columnar storage, RAM and parallelism.
Columnar storage makes scans and data filtering more precise and focused. After that – it all comes down to

  [Read more...]
MySQL Community – what do you want in a load testing framework?

So I’ve been doing a fair number of automated load tests these past six months, primarily with Sysbench, which is a fine, fine tool. I started with some simple Bash loop controls to automate my overnight testing, but as usually happens with shell scripts, they grew unwieldy and I rewrote them in Python. Now I have some flexible and easily configurable code for Sysbench-based MySQL benchmarking to offer the community. I’ve always been a fan of giving back to such a helpful group of people – you’ll never hear me complain that “my time isn’t free”. So, let me know what you want in an ideal testing environment (from a load testing framework automation standpoint) and I’ll integrate it into my existing framework and then release it under the BSD license. The main goal here is to have a standardized modular framework, based on sysbench,

  [Read more...]
Outliers and coexistence are the new normal for big data

Letting data speak for itself through analysis of entire data sets is eclipsing modeling from subsets. In the past, all too often what were disregarded as "outliers" on the far edges of a data model turned out to be the telltale signs of a micro-trend that became a major event. To enable these advanced analytics and integrate them in real time with operational processes, companies and public sector organizations are evolving their enterprise architectures to incorporate new tools and approaches.

Whether you prefer "big," "very large," "extremely large," "extreme," "total," or another adjective for the "X" in the "X Data" umbrella term, what's important is accelerated growth in three dimensions: volume, complexity and speed.

Big data is not without its limitations. Many organizations need to revisit business processes, solve data silo

  [Read more...]
MySQL Analytics: updated query for table engine data statistics

This is a follow-up to my previous post titled “MySQL analytics: information_schema polling for table engine percentages”. Here’s an updated query with more output and quicker execution time. What you get: InnoDB tablespace utilization percentage, data+index usage in total and per engine (InnoDB/MyISAM), InnoDB data/index/percentage, MyISAM data/index/percentages, and overall percentage values. Rather useful for profiling your table engine usage.

Sample output:
innodb_tablespace_utilization_perc: 100
total_size_gb: 26.275011910126
index_size_gb: 2.994891166687
data_size_gb: 23.280120743439
innodb_total_size_gb: 6.751220703125
innodb_data_size_gb: 5.2576751708984
innodb_index_size_gb: 1.4935455322266
myisam_total_size_gb: 19.523791207001

  [Read more...]
Not excited about paying for MySQL monitoring for your enterprise?
I think most people will agree that one of the biggest advantages of MySQL Community Server is that it’s free. Being free doesn’t get you a multi-million user community though; MySQL offers a great array of transactional engines, advanced high-availability features, robust I/O performance, and it powers many of the top-500 internet sites. When it […]
MySQL analytics: information_schema polling for table engine percentages

If you’ve ever needed to know how the data and index percentages per table engine were laid out on your MySQL server, but didn’t have the time to write out a query… here it is!

select
(select (sum(DATA_LENGTH)+sum(INDEX_LENGTH))/(POW(1024,3)) as total_size from tables) as total_size_gb,
(select sum(INDEX_LENGTH)/(POW(1024,3)) as index_size from tables) as total_index_gb,
(select sum(DATA_LENGTH)/(POW(1024,3)) as data_size from tables) as total_data_gb, 

(select ((sum(INDEX_LENGTH) / ( sum(DATA_LENGTH) + sum(INDEX_LENGTH)))*100) as perc_index from tables) as perc_index,
(select ((sum(DATA_LENGTH) / ( sum(DATA_LENGTH) + sum(INDEX_LENGTH)))*100) as perc_data from tables) as perc_data,

(select ((sum(INDEX_LENGTH) / ( sum(DATA_LENGTH) + sum(INDEX_LENGTH)))*100) as perc_index from tables where ENGINE='innodb') as innodb_perc_index,
(select ((sum(DATA_LENGTH) / (
  [Read more...]
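
Because the query above is cut off in this listing, here is a minimal sketch of the same idea rather than the author's full query: it groups `information_schema.TABLES` by engine and reports data size, index size and the index share for each engine. The column aliases are my own, and the `WHERE` filter that skips rows with a NULL engine (views, for instance) is an assumption about what you want to count.

```sql
-- Per-engine data/index footprint in GB, read from information_schema.
SELECT
  ENGINE,
  SUM(DATA_LENGTH)  / POW(1024, 3) AS data_size_gb,
  SUM(INDEX_LENGTH) / POW(1024, 3) AS index_size_gb,
  (SUM(DATA_LENGTH) + SUM(INDEX_LENGTH)) / POW(1024, 3) AS total_size_gb,
  -- Share of each engine's footprint taken by indexes;
  -- NULLIF guards against dividing by zero for empty engines.
  SUM(INDEX_LENGTH)
    / NULLIF(SUM(DATA_LENGTH) + SUM(INDEX_LENGTH), 0) * 100 AS perc_index
FROM information_schema.TABLES
WHERE ENGINE IS NOT NULL          -- views report a NULL engine
GROUP BY ENGINE
ORDER BY total_size_gb DESC;
```

Qualifying the table as `information_schema.TABLES` lets the sketch run from any default database, whereas the original `from tables` assumes `information_schema` has already been selected.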
Kontrollbase – new version available with improved analytics
A new version of Kontrollbase – the enterprise monitoring, analytics, reporting, and historical analysis webapp for MySQL database administrators and advanced users of MySQL databases – is available for download. There are several upgrades to the reporting code with improved alert algorithms as well as a new script for auto-archiving of the statistics table based […]
InfiniDB Subquery Performance Profile - New with 1.1.1 Alpha

Let's take a quick look at the performance of the new InfiniDB Subquery processing available with the 1.1.1 Alpha. The arrow was added to be sure our timings weren't confused with the axis.



This was against a relatively small dataset, the Star Schema Benchmark with 6 million rows in the fact table.  A base query was run where the outer query...

Kontrollbase – revision 297 fixes Reporter-CLI “alert_22” sub-routine
Quick note to let our users know that there was an XML tag closure error on the “alert_22” subroutine in the “bin/kontroll-reporter-cli.pl” script. This does not affect the webapp portion of Kontrollbase – only reports generated via the command line reporter script. It is not a fatal error but will cause the XML file to […]
Kontrollbase – graph “no data to display” on new install has been fixed
If you have been wondering why the overview and graphs pages say “no data to display” on the graphs when you first install Kontrollbase, it’s because the queries that generate the graphs return no data: a new install has nothing to graph yet. This has […]
Having an issue with a Kontrollbase upgrade?
If you’ve noticed that your recent upgrade did not go as planned and now the application does not load – please check this page: http://wiki.kontrollsoft.com/wiki/UpgradingReleases for notes on upgrades between versions. Typically you need to execute a SQL file against the current schema to bring it up to date. If you have any questions please [...]

Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.