Showing entries 1 to 6

Displaying posts with tag: data mining

Getting Data into Hadoop in real-time

Moving data between databases is hard. Without ever intending it, I seem to have spent a lifetime working on solutions for getting data into and out of databases, but more frequently between them. In fact, my first job out of university was migrating data from BRS/Text, a free-text database (probably what we would now call a NoSQL database), into a more structured Oracle database.

Today I spend some of my time working in Big Data, more often than not migrating information from existing data stores into Big Data platforms so that it can be analysed, something I covered in more detail here:

http://www.ibm.com/developerworks/library/bd-sqltohadoop1/index.html
http://www.ibm.com/developerworks/library/bd-sqltohadoop2/index.html


  [Read more...]
Four short links: 21 October 2010

  • Using MySQL as NoSQL -- 750,000+ qps on a commodity MySQL/InnoDB 5.1 server from remote web clients.
  • Making an SLR Camera from Scratch -- amazing piece of hardware devotion. (via hackaday.com)
  • Mac App Store Guidelines -- Apple announce an app store for the Macintosh, similar to its app store for iPhones and iPads. "Mac App" no longer means generic "program", it has a new and specific meaning, a program that must be installed through the App store and which has limited functionality (only
  [Read more...]
Four short links: 10 December 2009

  • Scriblio -- open source CMS and catalogue built on WordPress, with faceted search and browse. (via titine on Delicious)
  • Useful Temporal Functions and Queries -- SQL tricksies for those working with timeseries data. (via mbiddulph on Delicious)
  • Optimal Starting Prices for Negotiations and Auctions -- Mind Hacks discussion of a research paper on whether high or low initial prices lead to higher price outcomes in negotiations and online auctions. Many negotiation books recommend waiting for the other side to offer first. However, existing
  [Read more...]
Four short links: 1 December 2009

  • Apertus -- open source cinema camera. (via joshua on Delicious)
  • A Survey of Collaborative Filtering Techniques -- From basic techniques to the state-of-the-art, we attempt to present a comprehensive survey for CF techniques, which can be served as a roadmap for research and practice in this area. (via bos on Delicious)
  • Drizzle Replication using RabbitMQ as Transport -- we're watching the growing use of message queues in web software, and here's an interesting application. (via sogrady on
  [Read more...]
Four short links: 26 October 2009

  • Toiling in the Data Mines -- Tom Armitage describes the process that Berg calls "material exploration". Programmers very rarely talk about what their work feels like to do, and that's a shame. Material explorations are something I've really only done since I've joined BERG, and both times have felt very similar - in that they were very, very different to writing production code for an understood product. They demand code to be used as a sculpting tool, rather than as an engineering material, and I wanted to explain the knock-on effects of that: not just in terms of what I do, and the kind of code that's appropriate for that, but also in terms of how I feel as I work on these explorations. Even if the section on the code itself
  [Read more...]
SQL Puzzle
    Dear lazyweb,

    I want to mine a code repository for data to map past bugs to source code files.

    I have written a small PHP script (the initial version of the script can be found here) to import the relevant data from a Subversion repository into the following tables of a relational database:
    bugs            changes         paths
    --------        --------        -------
    bug_id          path_id         path_id
    revision        revision        path
    What I need now is two queries to ask the database for
    • paths that are most commonly changed during bugfix commits and
    • paths that are commonly changed together
    Your suggestions are most welcome in the comments to this posting :-)
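
One possible answer to the two questions above, as a minimal sketch rather than a definitive solution. It assumes that a row in `bugs` marks its `revision` as a bugfix commit and that `changes` holds one row per path touched in a revision; the column types, the sample rows, and the names `build_demo_db`, `bugfix_hotspots` and `co_changed_paths` are made up for illustration. The queries are run through Python's `sqlite3` module so they can be tried without a server, but they use only plain joins and `GROUP BY` and should carry over to MySQL largely unchanged.

```python
import sqlite3

# Paths most commonly changed during bugfix commits.
# Assumption: a revision is a bugfix commit if it appears in the bugs table.
BUGFIX_HOTSPOTS = """
    SELECT p.path, COUNT(DISTINCT c.revision) AS bugfix_changes
    FROM changes c
    JOIN paths p ON p.path_id = c.path_id
    WHERE c.revision IN (SELECT revision FROM bugs)
    GROUP BY p.path_id, p.path
    ORDER BY bugfix_changes DESC, p.path
"""

# Pairs of paths changed in the same revision.
# The a.path_id < b.path_id condition reports each pair once
# and stops a path from being paired with itself.
CO_CHANGED_PATHS = """
    SELECT p1.path, p2.path, COUNT(DISTINCT a.revision) AS together
    FROM changes a
    JOIN changes b ON b.revision = a.revision AND a.path_id < b.path_id
    JOIN paths p1 ON p1.path_id = a.path_id
    JOIN paths p2 ON p2.path_id = b.path_id
    GROUP BY a.path_id, b.path_id, p1.path, p2.path
    ORDER BY together DESC, p1.path, p2.path
"""


def build_demo_db():
    """Create an in-memory database with the three tables and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE bugs    (bug_id INTEGER, revision INTEGER);
        CREATE TABLE changes (path_id INTEGER, revision INTEGER);
        CREATE TABLE paths   (path_id INTEGER PRIMARY KEY, path TEXT);
    """)
    conn.executemany(
        "INSERT INTO paths VALUES (?, ?)",
        [(1, "src/a.php"), (2, "src/b.php"), (3, "README")],
    )
    conn.executemany(
        "INSERT INTO changes VALUES (?, ?)",
        [(1, 10), (2, 10), (1, 11), (1, 12), (2, 12), (3, 12), (3, 13)],
    )
    # Revisions 10 and 11 are the (hypothetical) bugfix commits.
    conn.executemany("INSERT INTO bugs VALUES (?, ?)", [(101, 10), (102, 11)])
    return conn


def bugfix_hotspots(conn):
    """Return (path, number of bugfix revisions touching it), busiest first."""
    return conn.execute(BUGFIX_HOTSPOTS).fetchall()


def co_changed_paths(conn):
    """Return (path, path, number of shared revisions), most frequent first."""
    return conn.execute(CO_CHANGED_PATHS).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    print(bugfix_hotspots(demo))
    print(co_changed_paths(demo))
```

With the sample rows above, the first query ranks `src/a.php` (two bugfix revisions) ahead of `src/b.php` (one), and the second reports `src/a.php` and `src/b.php` as the pair changed together most often (twice). On a large repository the self-join on `changes` can become expensive, so an index on `changes(revision, path_id)` and a `LIMIT` clause are probably worth adding.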

    Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates

    Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.