Showing entries 1 to 30 of 30

Displaying posts with tag: analytics

What’s the data on the 3Ci Data Team?
3Ci processes over a billion transactions a month. More than 100 million unique U.S. consumers have engaged with a business through our platform. All that activity creates massive amounts of data. The Data Team at 3Ci is responsible for keeping our offerings running at optimal performance and for making sense of our data. They manage MySQL [...]
Latest Addition to MySQL & Cloud Database Solutions Day: Calpont CTO Jim Tommaney joins as guest speaker

Join us next Friday in Santa Clara for a free day of learning and fun from the SkySQL & MariaDB gang & their partners

We’re proud to announce that Jim Tommaney, CTO of Calpont, has just signed on to speak at the MySQL & Cloud Database Solutions Day, hosted by SkySQL and MariaDB - taking place next Friday, April 26, directly after Percona Live: MySQL Conference & Expo.


Big Data for Genomic Sequencing. Interview with Thibault de Malliard.
“Working with empirical genomic data and modern computational models, the laboratory addresses questions relevant to how genetics and the environment influence the frequency and severity of diseases in human populations” –Thibault de Malliard. Big Data for Genomic Sequencing. On this subject, I have interviewed Thibault de Malliard, researcher at the University of Montreal’s Philip Awadalla [...]
Super Python: three applications involving IRC bot master, MySQL optimization, and Website stress testing.

In my ongoing efforts to migrate my fun side projects and coding experiments from SVN to Git, I’ve come across some of my favorite Python-based apps, which are all available in their respective repos on BitBucket, as follows:

IRC Bot Commander

  • What it does: it’s an IRC bot that takes commands and does your bidding on whichever remote server the bot is installed on.
  • How it does it: the bot runs on the server where you install it and connects to the IRC server and channel you configured. It then waits for you to give it commands, executes them, and returns the output to your IRC chat window.
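
The command loop described above can be sketched roughly as follows. This is a minimal illustration and not the code from the BitBucket repo: the `!run` trigger and the helper names `parse_privmsg`, `run_command`, `format_reply`, and `handle_line` are assumptions made for the example. It covers only parsing an incoming IRC line, executing the command, and formatting the reply; the socket connection to the IRC server is left out.

```python
import shlex
import subprocess

PREFIX = "!run "  # hypothetical trigger; the real bot's syntax may differ


def parse_privmsg(line):
    """Split ':nick!user@host PRIVMSG target :text' into (nick, target, text)."""
    if not line.startswith(":") or " PRIVMSG " not in line:
        return None
    source, rest = line[1:].split(" PRIVMSG ", 1)
    target, _, text = rest.partition(" :")
    nick = source.split("!", 1)[0]
    return nick, target, text.rstrip("\r\n")


def run_command(command, timeout=10):
    """Run the command without a shell and return its output as a list of lines."""
    result = subprocess.run(
        shlex.split(command),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return (result.stdout + result.stderr).splitlines()


def format_reply(target, lines):
    """Build one PRIVMSG per output line, each terminated with CRLF."""
    return [f"PRIVMSG {target} :{text}\r\n" for text in lines]


def handle_line(line):
    """Return the IRC replies for one incoming line, or [] if it is not a command."""
    parsed = parse_privmsg(line)
    if parsed is None or not parsed[2].startswith(PREFIX):
        return []
    _nick, target, text = parsed
    return format_reply(target, run_command(text[len(PREFIX):]))
```

In this sketch a line such as `:alice!a@host PRIVMSG #ops :!run uptime` would be answered in `#ops`, one message per output line. Since the bot executes whatever it is told, limiting which nicks may issue commands would be a sensible addition.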
Data Science vs. Data Analytics
As this topic came up a few times this week for discussion at various places, I thought of composing a post on “Data Scientist vs. Data Analytics Engineer”; even though [...]
Bash scripting: ElasticSearch and Kibana init.d scripts

As a follow-up to the previous post about logstash, here are a couple of related init scripts for anyone implementing the open source log analytics setup that is explained over at divisionbyzero. These have been tested on CentOS 6.3 and are based on the generic RC functions from Red Hat, so they will work with Red Hat, CentOS, Fedora, Scientific Linux, etc.
On Big Data, Analytics and Hadoop. Interview with Daniel Abadi.
“Some people even think that “Hadoop” and “Big Data” are synonymous (though this is an over-characterization). Unfortunately, Hadoop was designed based on a paper by Google in 2004 which was focused on use cases involving unstructured data (e.g. extracting words and phrases from Webpages in order to create Google’s Web index). Since it was not [...]
Two Cons against NoSQL. Part I.
Two cons against NoSQL data stores read like this: 1. It’s very hard to move data out of one NoSQL store into some other system, even another NoSQL store. Lock-in is very strong when it comes to NoSQL. If you ever have to move to another database, you basically have to re-implement a lot [...]
On Eventual Consistency – Interview with Monty Widenius.
“For analytical things, eventual consistency is ok (as long as you can know after you have run them if they were consistent or not). For real world involving money or resources it’s not necessarily the case.” — Michael “Monty” Widenius. In a recent interview, I asked Justin Sheehy, Chief Technology Officer at Basho Technologies, maker [...]
Scaling MySQL and MariaDB to TBs: Interview with Martín Farach-Colton.
“While I believe that one size fits most, claims that RDBMS can no longer keep up with modern workloads come in from all directions. When people talk about performance of databases on large systems, the root cause of their concerns is often the performance of the underlying B-tree index”– Martín Farach-Colton. Scaling MySQL and MariaDB [...]
So now Hadoop's days are numbered?
Earlier this week we all read GigaOM's article with this title:
"Why the days are numbered for Hadoop as we know it"
I know GigaOM likes to provoke scandals sometimes (we all remember some other unforgettable piece), but there is something behind it...

Hadoop today (after SOA not so long ago) is one of the worst cases of an abused buzzword ever known to man. It's everything, everywhere, can cure illnesses and do "big-data" at the same time! Wow! Actually, Hadoop is a software framework that supports data-intensive distributed applications, derived from Google's MapReduce and Google File System (GFS) papers.

My take from the article is [...]
Scale differences between OLTP and Analytics

In my previous post, http://database-scalability.blogspot.com/2012/05/oltp-vs-analytics.html, I reviewed the differences between OLTP and Analytics databases.

Scale challenges differ between these two worlds of databases.

In the Analytics world, the scale challenge is the growing amount of data. Most solutions leverage three main aspects: columnar storage, RAM, and parallelism.
Columnar storage makes scans and data filtering more precise and focused. After that – it all comes down to [...]
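
To illustrate why columnar storage narrows a scan, here is a small sketch contrasting a row layout with a column layout for the same data. The table, column names, and values are hypothetical and chosen only for the example. Summing one column in the column layout touches only the columns the query needs, whereas the row layout visits every field of every row.

```python
# The same three records in two layouts (hypothetical data).
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 75.5},
    {"order_id": 3, "region": "east", "amount": 30.0},
]

# Column layout: one contiguous list per column.
columns = {
    "order_id": [1, 2, 3],
    "region": ["east", "west", "east"],
    "amount": [120.0, 75.5, 30.0],
}


def total_amount_row_store(rows, region):
    """Row store: each whole row is visited even though only two fields are used."""
    return sum(row["amount"] for row in rows if row["region"] == region)


def total_amount_column_store(columns, region):
    """Column store: only the 'region' and 'amount' columns are scanned."""
    return sum(
        amount
        for reg, amount in zip(columns["region"], columns["amount"])
        if reg == region
    )
```

Both functions return the same total; the difference lies in how much data has to be read to get there, which is what matters once tables grow large.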
MySQL Community – what do you want in a load testing framework?

So I’ve been doing a fair number of automated load tests these past six months, primarily with Sysbench, which is a fine, fine tool. First I started using some simple bash-based loop controls to automate my overnight testing, but as usually happens with shell scripts they grew unwieldy and I rewrote them in Python. Now I have some flexible and easily configurable code for sysbench-based MySQL benchmarking to offer the community. I’ve always been a fan of giving back to such a helpful group of people – you’ll never hear me complain that “my time isn’t free”. So, let me know what you want in an ideal testing environment (from a load testing framework automation standpoint) and I’ll integrate it into my existing framework and then release it under the BSD license. The main goal here is to have a standardized modular framework, based on sysbench, [...]
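
As a rough idea of what such loop automation might look like in Python, here is a minimal sketch. It is not the author's framework: the function names and default connection values are hypothetical, and the command line assumes sysbench 1.0-style options (`oltp_read_write`, `--threads`, `--time`); older 0.4 releases use different option names.

```python
import subprocess


def build_sysbench_command(threads, seconds, host="127.0.0.1", user="sbtest", db="sbtest"):
    """Build one sysbench invocation (sysbench 1.0-style options assumed)."""
    return [
        "sysbench",
        "oltp_read_write",
        f"--mysql-host={host}",
        f"--mysql-user={user}",
        f"--mysql-db={db}",
        f"--threads={threads}",
        f"--time={seconds}",
        "run",
    ]


def plan_runs(thread_counts, seconds):
    """One command per concurrency level, in the order given."""
    return [build_sysbench_command(threads, seconds) for threads in thread_counts]


def execute(commands, runner=subprocess.run):
    """Run each command in order and collect (command, stdout) pairs."""
    results = []
    for command in commands:
        completed = runner(command, capture_output=True, text=True, check=True)
        results.append((command, completed.stdout))
    return results
```

Keeping command construction separate from execution makes the plan easy to inspect or log before an overnight run, and the `runner` parameter lets the loop be exercised without a database.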
Outliers and coexistence are the new normal for big data

Letting data speak for itself through analysis of entire data sets is eclipsing modeling from subsets. In the past, all too often what were once disregarded as "outliers" on the far edges of a data model turned out to be the telltale signs of a micro-trend that became a major event. To enable these advanced analytics and integrate them in real time with operational processes, companies and public sector organizations are evolving their enterprise architectures to incorporate new tools and approaches.

Whether you prefer "big," "very large," "extremely large," "extreme," "total," or another adjective for the "X" in the "X Data" umbrella term, what's important is accelerated growth in three dimensions: volume, complexity and speed.

Big data is not without its limitations. Many organizations need to revisit business processes, solve data silo [...]
MySQL Analytics: updated query for table engine data statistics

This is a follow-up to my previous post titled “MySQL analytics: information_schema polling for table engine percentages”. Here’s an updated query with more output and quicker execution time. What you get: InnoDB tablespace utilization percentage, data+index usage in total and per engine (InnoDB/MyISAM), InnoDB data/index/percentage values, MyISAM data/index/percentage values, and overall percentage values. Rather useful for profiling your table engine usage.

Sample output:
innodb_tablespace_utilization_perc: 100
total_size_gb: 26.275011910126
index_size_gb: 2.994891166687
data_size_gb: 23.280120743439
innodb_total_size_gb: 6.751220703125
innodb_data_size_gb: 5.2576751708984
innodb_index_size_gb: 1.4935455322266
myisam_total_size_gb: 19.523791207001
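
As a quick sanity check on how these figures relate, the sample values above can be cross-checked: data plus index should equal the total, both overall and for InnoDB, and the InnoDB and MyISAM totals should add up to the overall total. The sketch below uses the numbers from the sample output; the names `sample`, `close`, and `checks` are only for illustration, and the last check assumes the server holds no tables in other engines.

```python
import math

# Values copied from the sample output above.
sample = {
    "total_size_gb": 26.275011910126,
    "index_size_gb": 2.994891166687,
    "data_size_gb": 23.280120743439,
    "innodb_total_size_gb": 6.751220703125,
    "innodb_data_size_gb": 5.2576751708984,
    "innodb_index_size_gb": 1.4935455322266,
    "myisam_total_size_gb": 19.523791207001,
}


def close(a, b):
    """Compare two GB figures, allowing for rounding in the printed output."""
    return math.isclose(a, b, rel_tol=0.0, abs_tol=1e-9)


checks = {
    "data + index == total": close(
        sample["data_size_gb"] + sample["index_size_gb"],
        sample["total_size_gb"],
    ),
    "innodb data + index == innodb total": close(
        sample["innodb_data_size_gb"] + sample["innodb_index_size_gb"],
        sample["innodb_total_size_gb"],
    ),
    "innodb + myisam == total": close(
        sample["innodb_total_size_gb"] + sample["myisam_total_size_gb"],
        sample["total_size_gb"],
    ),
}

# Share of the overall footprint held by InnoDB, as a percentage.
innodb_share_perc = 100 * sample["innodb_total_size_gb"] / sample["total_size_gb"]
```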
Not excited about paying for MySQL monitoring for your enterprise?
I think most people will agree that one of the biggest advantages of MySQL Community Server is that it’s free. Being free doesn’t get you a multi-million user community though; MySQL offers a great array of transactional engines, advanced high-availability features, robust I/O performance, and it powers many of the top-500 internet sites. When it [...]
MySQL analytics: information_schema polling for table engine percentages

If you’ve ever needed to know how the data and index percentages per table engine were laid out on your MySQL server, but didn’t have the time to write out a query… here it is!

-- Sizes are read from information_schema.tables; DATA_LENGTH and INDEX_LENGTH are in bytes.
select
-- overall totals in GB
(select (sum(DATA_LENGTH)+sum(INDEX_LENGTH))/(POW(1024,3)) as total_size from information_schema.tables) as total_size_gb,
(select sum(INDEX_LENGTH)/(POW(1024,3)) as index_size from information_schema.tables) as total_index_gb,
(select sum(DATA_LENGTH)/(POW(1024,3)) as data_size from information_schema.tables) as total_data_gb,

-- index vs. data share across all tables
(select ((sum(INDEX_LENGTH) / ( sum(DATA_LENGTH) + sum(INDEX_LENGTH)))*100) as perc_index from information_schema.tables) as perc_index,
(select ((sum(DATA_LENGTH) / ( sum(DATA_LENGTH) + sum(INDEX_LENGTH)))*100) as perc_data from information_schema.tables) as perc_data,

-- index vs. data share for InnoDB tables only
(select ((sum(INDEX_LENGTH) / ( sum(DATA_LENGTH) + sum(INDEX_LENGTH)))*100) as perc_index from information_schema.tables where ENGINE='innodb') as innodb_perc_index,
(select ((sum(DATA_LENGTH) / ( [...]
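
The arithmetic behind the query above can also be sketched outside the database. The example below is an illustration, not the author's code: it assumes the rows come from something like `SELECT ENGINE, DATA_LENGTH, INDEX_LENGTH FROM information_schema.TABLES`, and the function name `engine_stats` and the `"none"` bucket for rows without an engine (such as views) are hypothetical choices.

```python
from collections import defaultdict

GB = 1024 ** 3  # same divisor as POW(1024,3) in the query


def engine_stats(rows):
    """Aggregate (engine, data_length, index_length) rows into per-engine figures."""
    data = defaultdict(int)
    index = defaultdict(int)
    for engine, data_length, index_length in rows:
        key = (engine or "none").lower()  # views report NULL for ENGINE
        data[key] += data_length or 0
        index[key] += index_length or 0

    stats = {}
    for key in data:
        total = data[key] + index[key]
        stats[key] = {
            "total_gb": total / GB,
            "perc_index": 100 * index[key] / total if total else 0.0,
            "perc_data": 100 * data[key] / total if total else 0.0,
        }
    return stats
```

Fetching the raw rows once and aggregating in a single pass avoids repeating a subquery per output column, at the cost of moving one row per table to the client.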
Kontrollbase – new version available with improved analytics
A new version of Kontrollbase – the enterprise monitoring, analytics, reporting, and historical analysis webapp for MySQL database administrators and advanced users of MySQL databases – is available for download. There are several upgrades to the reporting code with improved alert algorithms as well as a new script for auto-archiving of the statistics table based [...]
InfiniDB Subquery Performance Profile - New with 1.1.1 Alpha

Let's take a quick look at the performance of the new InfiniDB subquery processing available with the 1.1.1 Alpha. The arrow was added to be sure our timings weren't confused with the axis.

This was against a relatively small dataset, the Star Schema Benchmark with 6 million rows in the fact table. A base query was run where the outer query...

Kontrollbase – revision 297 fixes Reporter-CLI “alert_22” sub-routine
Quick note to let our users know that there was an XML tag closure error on the “alert_22” subroutine in the “bin/kontroll-reporter-cli.pl” script. This does not affect the webapp portion of Kontrollbase – only reports generated via the command line reporter script. It is not a fatal error but will cause the XML file to [...]
Kontrollbase – graph “no data to display” on new install has been fixed
If you have been wondering why the overview and graphs pages say “no data to display” on the graphs when you first install Kontrollbase, it’s because there’s no data in the database being returned from the queries that generate the graphs – this is because a new install has no data to graph. This has [...]
Having an issue with a Kontrollbase upgrade?
If you’ve noticed that your recent upgrade did not go as planned and now the application does not load – please check this page: http://wiki.kontrollsoft.com/wiki/UpgradingReleases for notes on upgrades between versions. Typically you need to execute a SQL file against the current schema to bring it up to date. If you have any questions please [...]
Kontrollbase rev292 gets important UI layout fixes
This is a small revision and will only be available through SVN. However, it is an important one to speak about as it solves a former issue when running the application on a screen smaller than 1024px wide. While most users may not have noticed this since they have larger monitors it has been noticed [...]
Kontrollbase wiki being migrated to Trac
Just a quick bit of news to let you all know that additions to the standard Kontrollbase and Kontrollkit userguides are being halted while we migrate the documentation to a new wiki system run by the very nice Trac software. You will be able to access the Kontrollbase and Kontrollkit documentation at http://wiki.kontrollsoft.com when it [...]
Kontrollbase – queries to update your max_connections alert
If you have been reading the Kontrollbase performance reports and noticed that one alert says your connection usage vs max connections ratio is too high but then recommends that you decrease the max_connections variable, then you will find this fix handy. It’s two simple queries that execute on the Kontrollbase schema to update the max_connections [...]
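
To make the intended alert logic concrete, here is a minimal sketch of a connection-usage check. It is not Kontrollbase's code and not the two queries mentioned above: the function name, the 0.85 threshold, and the wording of the advice are assumptions for illustration. The point is only that a high usage ratio should lead to a recommendation to raise the limit (or reduce usage), not to lower it.

```python
def connection_advice(max_used_connections, max_connections, threshold=0.85):
    """Return (usage ratio, advice) for the given connection counters.

    The threshold is a hypothetical value, not one taken from Kontrollbase.
    """
    ratio = max_used_connections / max_connections
    if ratio >= threshold:
        return ratio, "increase max_connections or reduce connection usage"
    return ratio, "no change needed"
```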
Kontrollbase – scripts being rewritten in Python, request improvements now!
The time has come for major performance improvements to the reporter, stats-gather, alerter, and client scripts. This means that I will be rewriting the scripts in Python. There are a couple of reasons for this: to cut down on the number of modules that are required for the installation process (which also makes distributing the client script [...]
Kontrollbase users’ group on Brijj.com
We have a new users’ group on Brijj for anyone who wants to keep up to date with discussions and increase their network profile on the site. Join now: http://www.brijj.com/group/kontrollbase-users
Kontrollbase – a simple way to install module requirements
I’ve been looking over the documentation lately and trying to find ways to improve the installation experience for new users. That said, I’ve written a short but useful description of the easiest way to install all of the Perl and PHP requirements for Kontrollbase. You can find it here: http://kontrollsoft.com/kontrollbase/userguide/installation-install_overview.php#simple – or in the Installation [...]
Kontrollbase 2.0.1 revision 281 available for download
A new version of Kontrollbase – the enterprise monitoring, analytics, reporting, and historical analysis webapp for MySQL database administrators and advanced users of MySQL databases – is available for download. See the downloads page or run “svn update” to get your new version today. http://kontrollsoft.com/software-downloads

Planet MySQL © 1995, 2013, Oracle Corporation and/or its affiliates

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.