(Author's note: not necessarily actually a practical idea. But fun!)
So pictured here is a histogram of a moderately large set of random integers. Each vertical line represents the total number of entries at each particular integer. Since each number is made up of multiple random factors (10 different random numbers, each between 0 and 100, added together), the distribution tends toward a bell curve.
So how did I build the graph? Excel? PHP? Nope. Just a MySQL query. ...
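The query itself is cut off in this excerpt, so what follows is not the author's MySQL. It is a rough Python stand-in for the same idea: sum 10 random integers between 0 and 100 many times, count the results, and draw one bar per bucket. The names `sample_sums` and `text_histogram`, the fixed seed, and the 25-wide buckets are assumptions made for illustration (the post counts every integer separately).

```python
import random
from collections import Counter


def sample_sums(n_samples, n_terms=10, low=0, high=100, seed=0):
    """Return n_samples values, each the sum of n_terms random integers in [low, high]."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    return [
        sum(rng.randint(low, high) for _ in range(n_terms))
        for _ in range(n_samples)
    ]


def text_histogram(values, bucket_width=25, max_bar=50):
    """Build one text line per bucket; bar length is scaled to the fullest bucket."""
    # Group values into buckets labeled by their lower bound.
    counts = Counter((v // bucket_width) * bucket_width for v in values)
    peak = max(counts.values())
    lines = []
    for bucket in sorted(counts):
        # Every non-empty bucket gets at least one star so it stays visible.
        bar = "*" * max(1, round(counts[bucket] * max_bar / peak))
        lines.append(f"{bucket:>5} {bar}")
    return lines


if __name__ == "__main__":
    for line in text_histogram(sample_sums(10000)):
        print(line)
```

With enough samples the bars should bulge around 500, the midpoint of the possible range 0 to 1000, which is the bell shape described above.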
Hi all,
The schedule for PyWorks has been posted! I’m really excited about three things:
1) there are some really cool talks that I’m looking forward to attending. There are a couple of sysadmin-related talks, AppEngine, TurboGears, Django, and an area I’ve been especially slow to move into: testing (I know, shame on me). There’s lots more so be sure to check it out.
2) the conference scheduling process is over
3) I get to meet a lot of people face-to-face that I’ve worked with in the past on Python Magazine developing articles, or interacted with on IRC, etc. One thing I like about conferences surrounding open source technologies is you get to thank people face-to-face for the sweat they poured into some of the tools I use regularly. Mark Ramm, Kevin Dangoor, Michael Foord, Brandon Rhodes, and a collection of Python Magazine authors …
OpenSQL Camp 2008 is coming! When and where: Charlottesville,
Virginia, USA, November 14, 15, and 16, 2008!
Organised by Baron Schwartz & others, and attended by loads of
cool and interesting people (Brian, Monty and Baron are already
on the attendee list!), so you'd better get ready for a dynamite
weekend of learning, contributing, and having fun! I'll be there too.
Some Key facts:
- It is of, by and for the community (you).
- At this event, all Open Source databases are created equal.
We’ll learn together and grow together.
- It’s a combination conference and hackathon.
- It’s free.
- It is Friday night Nov 14, 18:00 through Sun the 16th at
18:00 in Charlottesville, Virginia USA in a very cool
location.
- The website, …
A couple of articles have been published recently that point to a growing realisation/admission about the role that open source will play in the future of enterprise software.
In “The Commercial Bear Hug of Open Source” Dan Woods details the various methods by which open source has become increasingly commercial in recent years, while in “The Microsoft-Novell Deal and Trust in Princes” Bruce Byfield discusses the relationship between business and open source.
Neither article is perfect. Woods, in particular, appears to paint open source in the role of the glorious failure - failing to surpass traditional licensing models and being subsumed into the mainstream (a subject …
Aka, colliding MD5, but in a very cool 12-way
demonstration:
We have used a Sony Playstation 3 to correctly predict the
outcome of the 2008 US presidential elections. In order not to
influence the voters we keep our prediction secret, but commit to
it by publishing its cryptographic hash on this website. The
document with the correct prediction and matching hash will be
revealed after the elections.
Read all about it at http://www.win.tue.nl/hashclash/Nostradamus/
(yes I share my first name with one of the authors, but
you'll notice that the last name, while similar, is not identical
- it's really not me. I'm not that much of a maths or crypto whiz
;-)
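The commit-then-reveal scheme in the quoted text can be sketched roughly as below. This is a minimal illustration, not the researchers' code: the function names `commit` and `verify` and the sample documents are hypothetical. The commitment is only binding while nobody can produce two documents with the same hash, and that is exactly the property a colliding-MD5 demonstration undermines.

```python
import hashlib


def commit(document: bytes) -> str:
    """Return the hex MD5 digest that would be published as the commitment."""
    return hashlib.md5(document).hexdigest()


def verify(document: bytes, published_digest: str) -> bool:
    """Check a revealed document against the digest published earlier."""
    return commit(document) == published_digest


if __name__ == "__main__":
    # Hypothetical prediction text, used only to exercise the two functions.
    prediction = b"The winner will be candidate X."
    digest = commit(prediction)   # published before the event
    print(digest)
    print(verify(prediction, digest))                        # revealed later
    print(verify(b"The winner will be candidate Y.", digest))
```

If several different documents share one MD5 digest, each of them passes `verify`, so whoever published the digest can reveal whichever one turns out to be right.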
You will need:
- CMake (at least 2.4.7)
- Bazaar (the newer the better - 1.6 was just released - at least use that)
- GNU Bison
- Visual Studio (Express works, but I’m talking about 2005 here)
- … and all this installed on a Microsoft Windows machine.
- … and to hate yourself, you are going to be using Windows after all.
Then, get and build it:
- Get the source:
bzr branch lp:~mysql/mysql-server/mysql-5.1-telco-6.4-win
- Run CMake. The CMake GUI can now be used to select compile options! You’ll have to set the path “where is the source code” to where you put the source code in step 1.
- Hit “Configure” in …
Along with some others, I have arranged a conference for open-source database users and developers.
Key facts:
- It is of, by and for the community (you).
- At this event, all open-source databases are created equal. We’ll learn together and grow together.
- It’s a combination conference and hackathon.
- It’s free.
- It is Friday night Nov 14, 18:00 through Sun the 16th [...]
I came across this behavior today while writing some additional
comprehensive ALTER TABLE tests for use in the test suite that we
run against the Kickfire database appliance to ensure
quality.
mysql> CREATE TABLE `t1` (
-> `col1` tinyint(4) DEFAULT NULL,
-> `id` int(11) NOT NULL DEFAULT '0' COMMENT 'min=1,max=65535',
-> PRIMARY KEY (`id`)
-> ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.00 sec)
mysql> alter table t1 drop column id,
-> add column id int auto_increment primary key;
ERROR 1089 (HY000): Incorrect prefix key; the used key part isn't
a string, the used length is longer than the key part, or the
storage engine doesn't support unique prefix keys
This is a regression in 5.0 (and 5.1, I didn't try 6) that was
reported almost one year ago. This alter table works just fine in
4.1 and there isn't any reason it …
After last week’s post on agents versus agentless monitoring systems, I got a lot of email. One customer, whose name I am not permitted to mention, sent me the following action shot (posted with permission):
Over half a gigabyte; more than twice what MySQL itself is using. So, that raises an interesting [...]
With most large sets of data, especially numerical data, statistical analysis plays a key role. You can't be bothered to look at every record yourself; that's what computers are for.
One useful tool in any statistical analysis is the identification of outliers. Assuming you have a normally distributed set of data, outliers can help to identify user error in the data entry process, or genuine spikes in the data. Once found, these numbers can be set aside for closer analysis or eliminated to normalize the data set.
There are many different methods for identifying outliers, with varying levels of rigor. Here I'll just demonstrate one of the simplest definitions: an outlier is any value more than three standard deviations away from the mean. ...
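The rest of the post is cut off here, so the following is only a small Python sketch of the three-standard-deviation rule, not the author's own demonstration. The function name `find_outliers`, the sample data, and the use of the population standard deviation (rather than the sample one) are assumptions.

```python
import statistics


def find_outliers(values, threshold=3.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation (an assumption)
    if stdev == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs(v - mean) > threshold * stdev]


if __name__ == "__main__":
    # Made-up readings with one obvious data-entry slip.
    readings = [10] * 20 + [12] * 20 + [500]
    print(find_outliers(readings))
```

The rule needs a reasonably large data set to be useful: in a very small sample, one extreme value inflates the standard deviation so much that it can never be three deviations from the mean.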