The Dallas / Fort Worth Unix User Group asked me to present on Open Source BI tools on July 7th. They meet at 7 PM at the IBM Innovation Center at 13800 Diplomat Drive (see website for details) and will serve pizza! All are welcome; see you there!
By far the most popular way for PDI users to load data into LucidDB is the PDI Streaming Loader. The streaming loader is a native PDI step that:
- Enables high performance loading, directly over the network without the need for intermediate IO and shipping of data files.
- Lets users choose more interesting (from a DW perspective) load types for tables. In addition to simple INSERTs, it supports MERGE (aka UPSERT) and UPDATE, all in the same bulk loader.
- Enables the metadata for the load to be managed, scheduled, and run in PDI.
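To make the three load types concrete, here is a minimal in-memory sketch in Python. It is not the PDI step or LucidDB code: the `load` function, the two-column row shape, and the rule that INSERT rejects duplicate keys (which assumes a unique key on the target table) are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical row shape for illustration: (primary key, value).
Row = Tuple[int, str]


def load(table: Dict[int, str], batch: Iterable[Row], mode: str) -> Dict[int, str]:
    """Apply one bulk batch to a copy of `table` using the given load type.

    insert: add new keys only; a duplicate key is an error (assumed unique key).
    update: change rows that already exist; ignore keys that do not.
    merge:  update existing rows and insert missing ones (UPSERT).
    """
    result = dict(table)
    for key, value in batch:
        exists = key in result
        if mode == "insert":
            if exists:
                raise KeyError(f"duplicate key {key}")
            result[key] = value
        elif mode == "update":
            if exists:
                result[key] = value
        elif mode == "merge":
            result[key] = value
        else:
            raise ValueError(f"unknown load type: {mode}")
    return result


target = {1: "alice", 2: "bob"}
batch = [(2, "robert"), (3, "carol")]

merged = load(target, batch, "merge")    # row 2 changed, row 3 added
updated = load(target, batch, "update")  # row 2 changed, row 3 skipped
```

The same batch produces different tables depending on the load type, which is why having all three available in one bulk loader is handy for dimension and fact table maintenance.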
However, we’ve had some known issues. In fact, until PDI 4.2 GA and LucidDB 0.9.4 GA it’s pretty problematic unless you run through the process of patching LucidDB outlined on this page: …
Our last LucidDB release was just over 12 months ago, on June 16, 2010. We were really, really trying to beat the one-year mark for our 0.9.4 release, but we just couldn’t. A tenet of good open source development is to release early and often, and we need to do better. Since the 0.9.3 release we’ve:
- Built out an entire Web Services infrastructure
- Developed a wicked cool Admin user interface
- Developed cool connectors to Hive and CouchDB
- Built a whole ton of extensions (auto indexing, DDL generation, improved load routines)
- Scriptable …
Update
Since this article was written, HPCC has undergone a number of significant changes and updates. These changes address some of the criticism voiced in this blog post, such as the license (updated from AGPL to Apache 2.0) and integration with other tools. For more information, refer to the comments posted by Flavio Villanustre and Azana Baksh.
The original article can be read unaltered below:
Yesterday I noticed a tweet by Andrei Savu. This prompted me to read the related GigaOM article and then check out the HPCC Systems …
Today, more and more proprietary software vendors are choosing to go Open Source. Doing this enables them to leverage the community benefits of Open Source, shorten the sales cycle, and gain a competitive advantage over other proprietary products.
However, for those firms considering a switch to Open Source, there are some hard decisions to make with regard to their product architecture. Should they provide only a single Open Source product, and earn revenue from add-on services like support and consulting (RedHat)? Or should they adopt the Open Core model, offering their product under both Open Source and proprietary licenses (MySQL)? Or …
- OMG Text -- a plugin for the CSS framework Compass for directional text shadows. (via David Kaneda)
- Build a Cheap Bitcoin Mine -- some day it will be revealed that the act of generating a bitcoin token is helping the Russian mafia to crack nuclear missile launch codes and Afghan druglords built the Bitcoin system to destabilize the US dollar.
- Polycode -- a free, open-source, cross-platform framework for creative code. You can use it as a C++ API or as a standalone scripting language to get easy and simple access to accelerated 2D and 3D graphics, hardware shaders, sound and network …
“Our experience from PNUTS also tells that these systems are hard to build: performance, but also scaleout, elasticity, failure handling, replication. You can’t afford to take any of these for granted when choosing a system. We wanted to find a way to call these out.” – Adam Silberstein and Raghu Ramakrishnan, Yahoo! Research. A [...]
I wrote this up a while ago and decided that I didn’t want to lose it in a shuffle of documents during my transition to a new workstation. It covers the basics of setting up Heartbeat (LVS) + DRBD (block replication between active/passive master servers) + MySQL. This should give you the basics of an H/A system without the benefits of a SAN, but also without the associated cost. The validity of this setup for H/A purposes is highly dependent on your workload and environment. You should know the ins and outs of your H/A solution before deciding to blame the system for not performing as expected. As with all production systems, you should test, test, test, and test some more before going live.
When I get around to it later I’ll post my How-To for setting up RHCS + SAN + MySQL. You can download the DRBD document PDF here: …
So I’ve been doing a fair number of automated load tests these past six months, primarily with Sysbench, which is a fine, fine tool. I started with some simple bash-based loop controls to automate my overnight testing, but, as usually happens with shell scripts, they grew unwieldy and I rewrote them in Python. Now I have some flexible and easily configurable code for sysbench-based MySQL benchmarking to offer the community. I’ve always been a fan of giving back to such a helpful group of people – you’ll never hear me complain that “my time isn’t free”. So, let me know what you want in an ideal testing environment (from a load testing framework automation standpoint) and I’ll integrate it into my existing framework and then release it under the BSD license. The main goal here is to have a standardized modular framework, based on sysbench, that allows anyone to compare their server performance via repeatable tests. It’s fun to see …
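As a rough idea of what such automation can look like, here is a minimal Python sketch that sweeps a list of thread counts. It is not the framework described in the post: the function names and the connection values are hypothetical, and the flags assume sysbench 1.0 syntax with its bundled `oltp_read_write` script (older 0.4 builds use `--test=oltp` and `--num-threads` instead).

```python
import subprocess
from typing import Callable, Dict, List, Sequence


def sysbench_command(threads: int, seconds: int, mysql: Dict[str, str]) -> List[str]:
    """Build one sysbench 1.0 command line for the oltp_read_write workload."""
    return [
        "sysbench", "oltp_read_write",
        f"--threads={threads}",
        f"--time={seconds}",
        f"--mysql-host={mysql['host']}",
        f"--mysql-user={mysql['user']}",
        f"--mysql-db={mysql['db']}",
        "run",
    ]


def build_plan(thread_counts: Sequence[int], seconds: int,
               mysql: Dict[str, str]) -> List[List[str]]:
    """One command per thread count, so every sweep is repeatable."""
    return [sysbench_command(t, seconds, mysql) for t in thread_counts]


def run_plan(plan: List[List[str]], runner: Callable = subprocess.run) -> List[str]:
    """Run each command in order and collect its raw text output."""
    outputs = []
    for cmd in plan:
        completed = runner(cmd, capture_output=True, text=True, check=True)
        outputs.append(completed.stdout)
    return outputs


# Hypothetical connection settings; password handling is left out on purpose.
mysql = {"host": "127.0.0.1", "user": "sbtest", "db": "sbtest"}
plan = build_plan([1, 8, 64], seconds=60, mysql=mysql)
# run_plan(plan) would execute the sweep; it needs sysbench and a prepared schema.
```

Keeping command construction separate from execution makes the plan easy to log, diff between runs, and test without a database.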
I split my time last week between the IOUG’s Collaborate conference in Orlando, Florida and O’Reilly’s MySQL Conference & Expo in California. The contrast was stark. For me as a MySQLer, Collaborate was a dud. On the other hand, the MySQL conference O’Reilly puts on is superb. It is vital to MySQL as a project and as a community, and it follows that it’s vital to MySQL’s business success. Oracle needs to participate to make it a success in the future.
MySQL at Collaborate had good speakers and content, but no one there is interested in MySQL. MySQL is just from a different world — it is a curiosity at an Oracle conference. Also, as a speaker, sponsor, and attendee, Collaborate was a giant frustration. I can’t recommend it to anyone. (These comments do not reflect on the work that MySQL community members did in recruiting and organizing the MySQL content at the Collaborate conference.) In particular, the experience of …