Showing entries 19863 to 19872 of 44109
We’ll Bring the Pie

Yesterday, Tokutek was invited (thanks to David Hughson, Vice-Consul of the British Consulate) to the Massachusetts State House for a forum on international business, hosted by Governor Deval Patrick. I felt pretty good about our Commonwealth – apparently we are the only state whose bond rating has improved since 2007. And, even though big firms like Google, Microsoft and IBM have headquarters elsewhere, it’s their MA offices that are their fastest growing ones. We also get along well with European tech centers. Despite some unpleasantness for a brief period in the past, the UK is the number one country for Massachusetts exports and 250 Massachusetts companies have operations there. I was happy to hear this, given all the expertise we see in MySQL in the UK.

[Read more]
Incorrect Information in .FRM File of InnoDB Table

MySQL organizes all data as tables, irrespective of the storage engine used. If you are using MySQL with InnoDB tables, these tables might become corrupted due to hardware faults, unexpected power failures, MySQL code errors, kernel bugs, and similar causes. In such cases, InnoDB will typically report errors indicating table corruption. To restore the data, use your latest database backup. If the backup fails to restore the required information, or no backup exists, you should scan the damaged database using third-party MySQL Repair or MySQL Recovery tools.

You might encounter an error message similar to the following while accessing an InnoDB table:

“#1033 - Incorrect information in file: '<table name>.frm'”

MySQL crashes after you receive this error message.

Cause: You receive this error message if …

[Read more]
How to compare the record differences of two similar tables - Part 1 of 2

Permalink: http://bit.ly/1z7MNGQ




The CHECKSUM TABLE result sets of two similar tables only indicate whether the two tables differ. They do not tell you what exactly the differences are. Does `tableA` have an updated value for one of its records that `tableB` does not have? Or does `tableA` have an extra row that `tableB` does not have? Depending on specific business requirements, the CHECKSUM TABLE statement may be sufficient. If you need to determine what the actual differences are, you can do so automatically and dynamically by creating a stored procedure.

This SELECT statement that uses a UNION ALL clause …
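One common shape for a UNION ALL comparison is sketched below in Python against an in-memory SQLite database; it is not necessarily the query the post goes on to show. Rows from both tables are stacked, grouped on every compared column, and only the groups that occur once are kept. The helper `table_diff`, the sample schema, and the sample rows are all hypothetical, and the sketch assumes that neither table contains duplicate rows (for example, because `id` is a primary key).

```python
import sqlite3


def table_diff(conn, table_a, table_b, columns):
    """Return rows that appear in only one of the two tables.

    Table and column names are interpolated directly into the SQL,
    so they must come from trusted code, never from user input.
    Assumes neither table contains duplicate rows.
    """
    cols = ", ".join(columns)
    query = f"""
        SELECT MIN(src) AS src, {cols}
        FROM (
            SELECT '{table_a}' AS src, {cols} FROM {table_a}
            UNION ALL
            SELECT '{table_b}' AS src, {cols} FROM {table_b}
        ) AS combined
        GROUP BY {cols}
        HAVING COUNT(*) = 1   -- the row exists in exactly one table
        ORDER BY {cols}
    """
    return conn.execute(query).fetchall()


# Hypothetical sample data: one changed row (id 2) and one extra row (id 3).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE tableB (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO tableA VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "carol")])
conn.executemany("INSERT INTO tableB VALUES (?, ?)",
                 [(1, "alice"), (2, "robert")])

diff = table_diff(conn, "tableA", "tableB", ["id", "name"])
for row in diff:
    print(row)
```

A changed record shows up as two result rows with the same `id` but different sources, while a row that exists in only one table shows up once. The same SELECT should carry over to MySQL largely unchanged, although that has not been verified here.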

[Read more]
Removing Mondrian's 'high cardinality dimension' feature

I would like to remove the 'high cardinality dimension' feature in mondrian 4.0.

To specify that a dimension is high-cardinality, you set the highCardinality attribute of the Dimension element to true. This will cause mondrian to scan over the dimension, rather than trying to load all of the children of a given parent member into memory.

The goal is a worthy one, but the implementation — making iterators look like lists — has a number of architectural problems: it duplicates code; because it allows backtracking for a fixed amount, it works with small dimensions but unpredictably fails with larger ones; and because lists are based on iterators, re-starting an iteration multiple times (e.g. from within a crossjoin) can re-execute complex SQL statements.
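The re-execution problem can be illustrated with a toy model in Python. This is not Mondrian code: `IteratorBackedList` and `expensive_member_query` are hypothetical stand-ins for a list view over a lazily read dimension and for the SQL statement behind it. The sketch only counts how often the "query" runs when a crossjoin-style nested loop restarts the iteration.

```python
class IteratorBackedList:
    """Toy list-like view that re-runs its source on every new iteration."""

    def __init__(self, run_query):
        self._run_query = run_query

    def __iter__(self):
        # Each restart of the iteration executes the query again.
        return iter(self._run_query())


executions = 0


def expensive_member_query():
    """Hypothetical stand-in for a complex SQL statement."""
    global executions
    executions += 1
    return ["m1", "m2", "m3"]


members = IteratorBackedList(expensive_member_query)
outer = ["a", "b", "c", "d"]

# Crossjoin-style nested loop: the inner iteration restarts per outer member.
pairs = [(o, m) for o in outer for m in members]
lazy_runs = executions

# Materializing the members once runs the query a single time.
executions = 0
materialized = list(members)
pairs_eager = [(o, m) for o in outer for m in materialized]
eager_runs = executions

print(lazy_runs, eager_runs)
```

In this model the lazy view runs the query once per outer member, while the materialized list runs it once in total, at the cost of holding every member in memory, which is exactly what the feature was meant to avoid.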

There are other architectural features designed to …

[Read more]
MySQL for Big Data

An excerpt from an article on MySQL for big data, published in Dow Jones Venture Wire by Scott Denne.

There is one possible solution to the problem that doesn't include companies having to buy new software tools or even an all-new database: With the right expertise, MySQL can be engineered to handle almost any data-intensive application. The only problem is that there's a shortage of people who have the expertise to make it work.

"There's a big time gap until we, as an industry, think we have data under control," said Frank Mashraqi, chief technology officer at MyLawsuit.com and former database chief at Fotolog Inc., a photo blogging site. "The roadmap to getting that expertise is very difficult and time doesn't allow for it."

Why SQL_MODE is important

Today was another example of a correct SQL_MODE saving customer data from being corrupted. By default, MySQL does not enforce data integrity. It allows what are called silent truncations, where the result of what you INSERT or UPDATE does not represent the truth. NOTE: I see very few customers who have this correctly configured; those that do have actually listened to my advice.

If you do not read any further: your production MySQL environments should be running with, at the bare minimum, SQL_MODE=STRICT_ALL_TABLES; however, I would also advocate for additional SQL_MODE settings.

For this example, some modified, undesirable code attempted to reduce a counter by 1; however, because of an UNSIGNED data type and a correctly set SQL_MODE, the application produced an error and the data was not corrupted.

This is what should happen with your SQL.

mysql> update stats set loss_count=loss_count - 1 where user_id=42; …
[Read more]
Connection Pool: MySQL Communications link failure

The Problem And The Solution

While using a MySQL connection pool in Java, I received a MySQL Communications link failure exception (see below).

To solve the communications link failure exception, I removed the JDBC property autoReconnect=true and kept only the JDBC property autoReconnectForPools=true. I also added the connection properties testOnBorrow, testWhileIdle, timeBetweenEvictionRunsMillis, and minEvictableIdleTimeMillis. See […]

Memory tuning fast paced ETL

Dear Kettle friends,

on occasion we need to support environments where not only does a lot of data need to be processed, but it also arrives in frequent batches. For example, a new data file with hundreds of thousands of rows arrives in a folder every few seconds.

In this setting we want to use clustering to put “commodity” computing resources to work in parallel. In this blog post I’ll detail what the general architecture looks like and how to tune memory usage in this environment.

Clustering was first created around the end of 2006.  Back then it looked like this.

The master

This is the most important part of our cluster.  It takes care of administrating network configuration and topology.  It also keeps track of the state of dynamically added slave servers.

The master …

[Read more]
If your data lacks integrity you are, in a word, doomed.

Recently I ran across a website and its related blog ( http://channelmeter.com/ & http://insidechannelmeter.wordpress.com/ ). The company's focus is the analytical data of YouTube. Naturally, given the size of YouTube, the analytical data is going to be big. So I knew this would not be a company that just takes a few hundred names from a web form each day. I saw from their blog posting that they use MySQL community edition 5.5. I reached out to ask them a few questions. “We are storing tens of millions of rows of data, and adding millions more every day.” – Dave Storrs, Co-Founder, ChannelMeter.com. To me this is showing, once again, how well a company can be built …

[Read more]
451 CAOS Links 2011.05.31

Linus announces Linux 3.0. Attachmate maintains commitment to SUSE Linux. And more.

# Linus Torvalds announced the release candidate of Linux 3.0.

# Attachmate CEO Jeff Hawn maintained that the company is committed to SUSE Linux.

# OpenX raised $20m series D funding.

# Cloudera proposed Flume as an Apache incubator project.

# Isidorey unveiled CloudSandra: a NoSQL database-as-a-service based on Apache Cassandra.

# Wayne Beaton …

[Read more]