Displaying posts with tag: csv
The Fast Way to Import CSV Data Into a Tungsten Cluster

The Question

Recently, a customer asked us:

After importing a new section of user data into our Tungsten cluster, we are seeing perpetually rising replication lag. We are sitting at 8.5 hours of estimated convergence time after importing around 50 million rows, and this lag is climbing continuously. We are currently migrating some of our users from a NoSQL database into our Tungsten cluster. We have a procedure that writes out a bunch of CSV files after translating our old data into columns, and then we recursively send them to the write master using the mysql client. Specifically, our import SQL is doing LOAD DATA LOCAL INFILE, reading in a large CSV file to do the import. We have 20k records per CSV file, and 12 workers insert them in parallel.
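For reference, each worker's import statement would look roughly like the sketch below; the file name, table, and column list are illustrative assumptions, not details from the customer's actual procedure.

LOAD DATA LOCAL INFILE '/data/users_chunk_0001.csv'  -- hypothetical file
INTO TABLE users                                     -- hypothetical table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(user_id, user_name, created_at);                    -- hypothetical columns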

Simple Overview

The Skinny

In cases like this, the slaves are having trouble because the database is unable to keep up with the apply stage …

[Read more]
Loading Unknown CSV Data into a Table Easily

Peter Zaitsev of Percona posted a Tweet about Select Star SQL, an interactive book that 'aims to be the best place on the internet for learning SQL'. The book uses a data set that can be downloaded, or you can work through the queries online. I showed the website to a neighbor who was asking 'what is the best way to learn SQL'. The neighbor wanted their own copy of the data and asked how to turn a raw CSV file into a table.

Well, that can get complicated.
CSV
CSV files use commas to separate the fields in each record. This is a great idea until someone plants a comma in the middle of a field, but it is more or less a least-common-denominator way of passing data between systems.
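For example, a record like the following (an illustrative row, not from any particular data set) only loads correctly if the parser honors the quotes around the field containing the embedded comma:

1,"Doe, Jane",Texas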

But a CSV file does not directly fit into a relational table.  You could …
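One common first step (a sketch of a generic approach, not necessarily the method the post goes on to describe) is to stage everything into a table of generous text columns and refine the types later:

-- Staging table with generic text columns; names and widths are
-- placeholder guesses to be tightened after inspecting the data.
CREATE TABLE csv_staging (
    col1 VARCHAR(255),
    col2 VARCHAR(255),
    col3 VARCHAR(255)
);

-- Load the raw file, skipping the header row:
LOAD DATA LOCAL INFILE 'data.csv'
INTO TABLE csv_staging
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;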

[Read more]
How To Convert MySQL Two Digit Year To Four Digit Year

create table csv_date (date varchar(20));
insert into csv_date values ('2030-05-18 14:57:19');
insert into csv_date values ('2030-05-18 14:55:15');
insert into csv_date values ('2019-05-18 04:15:15');
insert into csv_date values ('2018-05-18 02:11:53');
insert into csv_date values ('2017-05-18 22:14:24');

Convert two digit to four digit in MySQL: Create a temporary table same as the original table but make the column …
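The excerpt cuts off there, but the core of any such conversion (a sketch, assuming input strings with two-digit years such as '18-05-22 02:11:53') is MySQL's STR_TO_DATE with the %y specifier, which maps 00-69 to 2000-2069 and 70-99 to 1970-1999:

-- Illustrative only: parse a two-digit-year string into a DATETIME.
SELECT STR_TO_DATE('18-05-22 02:11:53', '%y-%m-%d %H:%i:%s');
-- returns 2018-05-22 02:11:53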

The post How To Convert MySQL Two Digit Year To Four Digit Year appeared first on SQLgossip.

Testing (again) LOAD DATA on MySQL 5.6, 5.7, 8.0 (non-GA) and MariaDB 10.0, 10.1 and 10.2 (non-GA)

I’ve lately been compiling and generating .deb packages for several recent MySQL and MariaDB versions, and I wanted to try them more in depth, especially MySQL 8.0 and MariaDB 10.2, both of which are still in development.

Several people have already given their first impressions (or will do so soon), and testing early is the best way to catch bugs and regressions and get them fixed before the official release. In fact, as I will comment later, I ran into breaking bugs on both MySQL 8.0 and MariaDB 10.2, which I …

[Read more]
MySQL CSV to Rows (Fun with Numbers)

Making The Numbers Table Useful

It’s not easy to find a solution to a very simple problem in MySQL: converting a comma-separated list of values into rows. Oracle database users can find gobs of tutorials on using REGEXP and CONNECT BY LEVEL to make this happen. MySQL doesn’t have that. So, use the numbers table from the previous post!

The transposing is made possible by (ab)using the SUBSTRING_INDEX function. I love this function. It is right up there with GROUP_CONCAT when mixing NULL and non-null strings.

I will be using comma-separated values, but you can use any delimiter character you like.

Bonus: This works with empty strings, strings with one value only, and empty values between delimiters (e.g. “my value,,previous is empty”). No extra code needed.

Making Magic Happen

SET @mycsv = …
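The code is truncated above, but a minimal sketch of the technique (assuming the numbers table from the previous post is named numbers and has an integer column n starting at 1) looks like this:

SET @mycsv = 'first,second,third';

-- One output row per value: the count of commas plus one.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(@mycsv, ',', n.n), ',', -1) AS value
FROM numbers AS n
WHERE n.n <= 1 + LENGTH(@mycsv) - LENGTH(REPLACE(@mycsv, ',', ''))
ORDER BY n.n;

The inner SUBSTRING_INDEX keeps the first n values; the outer call, with its negative count, keeps only the last of those.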
[Read more]
Testing the Fastest Way to Import a Table into MySQL (and some interesting 5.7 performance results)

As I mentioned in my last post, where I compared the default configuration options in 5.6 and 5.7, I have been doing some testing for a particular load in several versions of MySQL: different ways to load a CSV file (the same file I used for testing the compression tools) into MySQL. Seasoned MySQL DBAs and programmers probably know the answer already, and can jump over to my 5.6 versus 5.7 results. The first part of this post, however, is dedicated to developers and MySQL beginners who want to know the answer to the title question, in a step-by-step fashion. I must say I also learned something, as I under- and over-estimated some of the effects of certain …
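(As a hint at the kind of tuning such tests involve, these are session settings commonly toggled during bulk loads; an illustrative sketch, not the post's actual benchmark script:)

SET SESSION unique_checks = 0;       -- defer unique-key checking during the load
SET SESSION foreign_key_checks = 0;  -- skip foreign-key validation during the load
SET SESSION sql_log_bin = 0;         -- skip binary logging, only if that is safe here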

[Read more]
Which Compression Tool Should I Use for my Database Backups? (Part II: Decompression)

In my post last week, I analysed some of the most common compression tools and formats, and their compression speed and ratio. While that gives a good idea of the performance of those tools, the analysis would be incomplete without researching decompression. This is particularly true for database backups: in those cases where the compression process is performed outside of the production boxes, you may not care too much about compression times, as even a relatively slow compressor will not affect the performance of your MySQL server (or whatever you are using). The decompression time, however, can be critical, as in many cases it directly influences the MTTR (mean time to recovery) of your whole system.
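To put an illustrative number on that (my arithmetic, not a figure from the benchmarks): restoring a 360 GB backup with a tool that decompresses at 100 MB/s adds a full hour to recovery, while one sustaining 400 MB/s adds only 15 minutes.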

Testing …

[Read more]
Where are they now: MySQL Storage Engines

There was once a big hoopla about the MySQL Storage Engine Architecture and how easy it was to just slot in some other method of storage instead of the provided ones. Over the years I’ve repeatedly mentioned how this wasn’t really …

[Read more]
Check (Rough) Progress of Your CSV Import to MySQL

If you are importing large CSV or SQL dumps into MySQL, chances are you have looked for ways to see how far the import has gone. If you know how many rows there are in the file being imported, you can do a SELECT COUNT(*), but that query would take some time to finish, especially on really big imports.

Using lsof with the -o option, you can monitor the current file offset from which a process is reading. Knowing the size of the file and taking a few snapshots of the offset, you can get a rough idea of how fast the import is going. Note, though, that this is only the file-read pace, not the actual import speed, as MySQL import throughput varies with a number of conditions, e.g. table growth, secondary indexes, etc.

Let’s say I am importing a 1.1G CSV file into a table.

[revin@forge msb_5_5_300]$ ls -al /wok/dta/samples/ft_history.csv 
-rw-rw-r--. 1 revin revin 1075456654 Nov 8 23:25 /wok/dta/samples/ft_history.csv …
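From there, a minimal illustration of the approach (the command only; the OFFSET column of the output, printed as 0t followed by a byte count, is the number to track between snapshots):

[revin@forge msb_5_5_300]$ lsof -o /wok/dta/samples/ft_history.csv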
[Read more]
Tech Messages | 2013-03-10

A special extended edition of Tech Messages for 2013-03-07 through 2013-03-10:
