Planet MySQL

Displaying posts with tag: dataset

Jun

2017

Posted by Satej Sahu on Tue 06 Jun 2017 05:50 UTC
Tags:

Random, Programming, Language, data, dataset, human, MySQL, generate, large, local, usable

We all sometimes need to generate valid dummy data for our use cases and applications when we start building them. One can of course write scripts to generate random data, but it is much better to have data that human beings can relate to, such as names and addresses, instead of fields filled with random "lorem ipsum" strings :)
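To make the idea of human-relatable dummy data concrete, here is a minimal Python sketch. It is not how generatedata.com works internally; the name and street pools and the helper names `make_person` and `make_people` are made up for illustration.

```python
import random

# Hypothetical sample pools; a real tool ships far larger lists.
FIRST_NAMES = ["Alice", "Bruno", "Chen", "Divya"]
LAST_NAMES = ["Garcia", "Khan", "Novak", "Okafor"]
STREETS = ["Oak Street", "Mill Lane", "Station Road"]


def make_person(rng):
    """Return one human-readable dummy record as a dict."""
    return {
        "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
        "address": f"{rng.randint(1, 999)} {rng.choice(STREETS)}",
    }


def make_people(count, seed=0):
    """Return `count` records; a fixed seed makes the output repeatable."""
    rng = random.Random(seed)
    return [make_person(rng) for _ in range(count)]


people = make_people(5)
```

Seeding the generator keeps test fixtures stable between runs, which is usually what you want for dummy data.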

While searching for such a tool, I found a site which does exactly this: http://www.generatedata.com/

Documentation: http://benkeen.github.io/generatedata/

This can also be downloaded and installed locally. It supports three types of installations:
- A single, anonymous user account
- A single user account, requires login
- Multiple accounts

Below is the wide variety of data types it supports for …


Jun

2011

Advantages of weighted lists in RDBMS processing

Posted by Justin Swanhart on Fri 17 Jun 2011 21:17 UTC
Tags:

database, data warehousing, computing, dataset, aggregation, MySQL

A list is simply a collection of items. It has no structure, although in some cases its length may be known. A list may contain duplicate items. In the following example the number 1 is included twice.

Example list:

A set is similar to a list, but has the following differences:

- The size of the set is always known
- A set may not contain duplicates

You can convert a list to a set by creating a 'weighted list'. The weighted list includes a count column so that you can determine when an item in the list appears more than once:

1,2
2,1
3,1

Notice that the count column shows that the number 1 appears twice in the original list. In order to make insertions into such a list scalable, consider using partitioning to avoid large indexes.
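As a rough illustration of the conversion, the following Python sketch collapses a list into (value, count) pairs and expands it back again. The input list `[1, 1, 2, 3]` is an assumption chosen to match the weighted list shown above, and the function names are hypothetical; in a database the same idea would typically be a table with a value column and a count column.

```python
from collections import Counter


def to_weighted_list(items):
    """Collapse a list into sorted (value, count) pairs."""
    return sorted(Counter(items).items())


def from_weighted_list(weighted):
    """Expand (value, count) pairs back into a flat list."""
    return [value for value, count in weighted for _ in range(count)]


example = [1, 1, 2, 3]  # assumed input; the number 1 is included twice
weighted = to_weighted_list(example)
restored = from_weighted_list(weighted)
```

Each value now appears exactly once, so the weighted list behaves like a set, while the counts preserve enough information to rebuild the original list (up to ordering).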

…


May

2011

MySQL: Using Views as Performance Improvement Tools

Posted by Ovais Tariq on Mon 30 May 2011 19:16 UTC
Tags:

sql, database, views, DBA, joins, planning, slow log, dataset, patterns, mysql resources, MySQL, Performance, data access, suboptimal

The most basic and most often repeated task for a DBA is to review the slow logs and pick out suboptimal queries that consume resources unnecessarily and hence slow down the database server. This post looks at why and how VIEWs can help against such suboptimal operations.

May

2008

Tools to generate large synthetic data sets for testing?

Posted by Justin Swanhart on Mon 12 May 2008 22:29 UTC
Tags:

data warehousing, Testing, dataset, MySQL

I need to generate large (1TB-3TB) synthetic MySQL datasets for testing, with a number of requirements:

a) custom output formatting (SQL, CSV, fixed-length rows, etc.)
b) referential integrity support (i.e., child tables should reference existing PK values, no orphans, etc.)
c) able to generate multiple tables in parallel
d) preferably able to operate without a GUI and/or manual intervention
e) uses a well defined templating construct for data generation
f) preferably open source

Does anyone out there know of a product that meets at least most of these requirements?

*edit*
I found a PHP-based data generation script (www.generatedata.com) that is extensible in its output formatting, so it should do everything I need it to do.
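For readers weighing a hand-rolled script instead, requirements (a) and (b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `customers` and `orders` tables, their columns, and the function names are hypothetical, and everything is built in memory, so a generator for terabyte-scale data would need to stream rows to disk instead.

```python
import csv
import io
import random


def generate_tables(num_customers, num_orders, seed=0):
    """Build parent and child rows; every order references an existing customer."""
    rng = random.Random(seed)
    customers = [(cid, f"customer_{cid}") for cid in range(1, num_customers + 1)]
    customer_ids = [cid for cid, _ in customers]
    orders = [
        # The foreign key is drawn only from existing primary keys: no orphans.
        (oid, rng.choice(customer_ids), round(rng.uniform(1, 500), 2))
        for oid in range(1, num_orders + 1)
    ]
    return customers, orders


def to_csv(header, rows):
    """Render rows as CSV text; another formatter could be swapped in here."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


customers, orders = generate_tables(num_customers=3, num_orders=5)
customers_csv = to_csv(["id", "name"], customers)
orders_csv = to_csv(["id", "customer_id", "amount"], orders)
```

Keeping row generation separate from formatting means an SQL or fixed-length writer could replace `to_csv` without touching the referential-integrity logic.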
