Sphinx is a free, open-source search server that integrates nicely with MySQL. It provides a fast, scalable, and pluggable search framework. The Sphinx engine possesses a variety of tools enabling you to customize how searching/indexing interacts with or becomes a part of your environment.
Join me and Sphinx Search CEO/CTO Andrew Aksyonoff, the founder and creative force behind Sphinx, on Wednesday, November 20th at 10 a.m. PST as we discuss how to get started with Sphinx and seamlessly integrate it into your applications and MySQL. The title of our webinar is, “How to Optimally Configure Sphinx Search for MySQL” and[Read more...]
Quite frequently, especially with large-scale or complicated applications, we use MySQL alongside other technologies for tasks such as reporting and caching, or even as the main data store for portions of the application.
What technologies for data storage and processing do you use alongside MySQL in your environment? Please feel free to elaborate in the comments about your use case and experiences! [Read more...]
Queries in MySQL, Sphinx and many other database or search engines are typically single-threaded. That is, when you issue a single query on your brand new r910 with 32 CPU cores and 16 disks, the most that will be used to process this query at any given point is 1 CPU core and 1 disk. In fact, only one or the other.
Seriously, if a query is CPU-intensive, it is only going to use about 3% of the available CPU capacity (for the same 32-core machine). If it is disk-IO-intensive, about 6% of the available IO capacity (for the 16-disk RAID10, or RAID0 for that matter).
Let me put it another way. If your MySQL or Sphinx query takes 10s to run on a machine with a single CPU core and single disk, putting it on a machine with 32 such cores and 16 such disks will not make it any better.
But you knew this already. Question is[Read more...]
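The percentages quoted above follow from simple division. Here is a minimal sketch of that arithmetic, assuming a single query can keep at most one core or one disk busy at a time; the function name is illustrative and does not come from the post.

```python
def single_query_utilization(units_in_use: int, units_available: int) -> float:
    """Return the share of a resource, in percent, that one query can occupy."""
    if units_available <= 0:
        raise ValueError("units_available must be positive")
    return 100.0 * units_in_use / units_available


# Assumption from the post: a single-threaded query uses one core or one disk.
cpu_share = single_query_utilization(1, 32)   # one core out of 32
disk_share = single_query_utilization(1, 16)  # one disk out of 16

print(f"CPU-bound query:  {cpu_share:.1f}% of CPU capacity")
print(f"Disk-bound query: {disk_share:.2f}% of IO capacity")
```

The same function shows why adding cores does not help a single query: the numerator stays at one while the denominator grows.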
One of the most common causes of poor Sphinx search performance that I find our customers face is misuse of search filters. In this article I will cover how Sphinx attributes (which are normally used for filtering) work, when they are a good idea to use, and what to do when they are not but you still want to take advantage of otherwise superb Sphinx performance.
While Sphinx is great for full text search, you can certainly go beyond full text search, but before you go there, it is a good idea to make sure you’re doing it the right way.
In Sphinx, columns are basically one of two kinds:
a) full text
You may have already seen the announcement MariaDB Foundation to Safeguard Leading Open Source Database. We at Open Query wholeheartedly support this (r)evolution of the MySQL ecosystem, which appears to be increasingly necessary as Oracle Corp is seriously dropping the ball with security updates and actually just general development and innovation. Oracle has actually done some very good work, I happily acknowledge that – but security issues are critical, having crashing bugs and incorrect query results in a .28 of a GA release is uncool, and not incorporating awesome development efforts by the community is just astonishing.
MariaDB is where the[Read more...]
Tomorrow, August 22 at 10:00am PDT, I’ll present a webinar called Full Text Search Throwdown. This is a no-nonsense performance comparison of solutions for full text indexing for MySQL applications, including:
I’ll compare the performance for building indexes and querying indexes.
If you’re developing an application with text search features, this will be a very practical and informative overview of your technology options!
Register for this free webinar at http://www.percona.com/webinars/2012-08-22-full-text-search-throwdown
Are you looking to expand your knowledge about MySQL and MariaDB database solutions?
Well, you’re in luck! SkySQL is introducing an exclusive collection of educational videos featuring some of the industry’s leading experts on the MySQL database and related technologies. View informative, technical talks on a variety of topics, from the experts at SkySQL, MariaDB, Calpont InfiniDB, Continuent, ScaleDB, Severalnines, Sphinx, Webyog, and others.
I'm sharing some thoughts on the viability of MySQL plugins. This is due to a particular experience I've had, which is thankfully solved. However, it left a bitter taste in my mouth.
MySQL plugins are a tricky business. To create a plugin, you must compile it against the MySQL version you wish the users to use it with. Theoretically, you should compile it against any existing MySQL version, minors as well (I'm not sure whether it may sometimes or most times work across minor versions).
But, most important, you must adapt your plugin to major versions.
Another option for plugin makers is not to compile the plugin at all, but rather to provide the source code and let the end user compile it against her own MySQL version. But here, too, the code must be compatible with whatever changes the new MySQL version may have.
I've written a patch which completes Sphinx's integration with MySQL 5.5.
Up until a couple months ago, Sphinx would not compile with MySQL 5.5 at all. This is, thankfully, resolved as of Sphinx 2.0.3.
However, to my dismay, I found that it only partially works: the sphinx_snippets() user defined function is not included within the plugin library. After some quick poking I discovered that it was not added to the build, and when added, would not compile.
I rely on sphinx_snippets() quite a lot, and like it. Eventually I wrote a fix to snippets_udf.cc that allows it to run in a MySQL 5.5 server.
Here are the changes for the 2.0.4 version of[Read more...]
Now that the snow is melting and spring is in the air, the SkySQL Team is hitting the road and making the rounds of key industry events, trade shows, and meetups around the globe. Come meet the team, pick up a few tips and tricks for using the MySQL database, network with your peers, and learn more about SkySQL’s products and services. Here are some of the events we’ll be at this spring:
BIG Data, A New Horizon for Data Analysis
March 20 - 21, 2012
Cité Internationale Universitaire de Paris, Paris, France
March 28-29, 2012
Columbia Metropolitan Convention Center, Columbia, South Carolina
Vladimir Fedorkov of Sphinx.
Presentation started out with a very nice presentation of candies to all the audience members.
What is Sphinx? Another (C++) daemon on your boxes. Can be queried via API (PHP, Python, etc.) or MySQL-compatible protocol and SQL queries (SphinxQL). Some query examples are in the slides, here’s one about SphinxSE in the KB.
MyISAM FTS is good but becomes slow with half a million documents. InnoDB has FTS now but he’s not tried it (and neither has anyone in the audience, so nobody could say how it compares with MyISAM FTS).
Geographical distance is the distance measured along the surface of the earth between two points, each given as a pair of float values (latitude, longitude). In Sphinx, there is support for [Read more...]
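To make the idea of a distance between two latitude/longitude pairs concrete, here is a minimal sketch using the haversine formula. It is a plain Python illustration, not Sphinx's own implementation; the Earth radius constant and the sample coordinates are assumptions added for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant, not from the talk


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate city-centre coordinates, chosen only for illustration.
moscow = (55.7558, 37.6173)
paris = (48.8566, 2.3522)
print(f"{haversine_km(*moscow, *paris):.0f} km")
```

The formula treats the earth as a sphere, which is usually accurate enough for "find results near me" style filtering and sorting.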
Stephane Varoqui, Field Services SkySQL, Vlad Fedorkov, Director of PS, Sphinx Inc, Christophe Gesche, LAMP Expert, Delcampe, Herve Seignole, Web Architect, Groupe Pierre & Vacances Center Parcs – this is a big talk!
Pros: Filtering takes place on attributes in separate tables. Rely on the optimizer choice. HASH JOIN can help (MariaDB 5.3). Table elimination can help (MariaDB 5.2). ICP Index Condition Pushdown can help (MariaDB 5.3/MySQL 5.6). Max 80M documents at Pixmania, all queries come in less than 1s using 128GB of RAM (MariaDB 5.2). At PAP.fr, there is 16GB RAM with MariaDB 5.2.
Cons: CPU intensive (replication with many slaves). Need covering indexes to cover various !filter !order. Join & sorting cost on lazy filtering.
The more indexes[Read more...]
At Percona, we’re now using Sphinx for our documentation. We’re also using Jenkins for our continuous integration. We have compiler warnings from GCC being parsed by Jenkins using the built-in filters, but there isn’t one for the Sphinx warnings.
Luckily, in the configuration page for Jenkins, the Warnings plugin allows you to specify your own filters. I’ve added the following filter to process warnings from Sphinx: [Read more...]
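The filter from the post is not shown in this excerpt, so the following is only a sketch of what such a filter has to do: pick the file, line number and message out of each warning line. It is written in Python for illustration, not in the Jenkins plugin's own configuration format, and it assumes warning lines shaped like `<file>:<line>: WARNING: <message>`; the sample log is made up.

```python
import re

# Assumed line shape: "<file>:<line>: WARNING: <message>"; the line number may be absent.
SPHINX_WARNING = re.compile(
    r"^(?P<file>.+?):(?:(?P<line>\d+):)?\s*(?P<level>WARNING|ERROR):\s*(?P<message>.*)$"
)


def parse_sphinx_warnings(log_text):
    """Return a list of dicts, one per warning or error line found in the log."""
    found = []
    for raw in log_text.splitlines():
        match = SPHINX_WARNING.match(raw.strip())
        if match:
            entry = match.groupdict()
            # Convert the optional line number to an int, or None when missing.
            entry["line"] = int(entry["line"]) if entry["line"] else None
            found.append(entry)
    return found


# Hypothetical build output used only to exercise the pattern.
sample_log = """\
building [html]: targets for 3 source files that are out of date
doc/source/install.rst:42: WARNING: Title underline too short.
doc/source/orphan.rst: WARNING: document isn't included in any toctree
"""

for warning in parse_sphinx_warnings(sample_log):
    print(warning["file"], warning["line"], warning["message"])
```

Lines that do not carry a WARNING or ERROR level, such as the progress line in the sample, are skipped.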
Introduction to Search with Sphinx, by Andrew Aksyonoff, O’Reilly Media 2011. About 146 pages. (Here’s a link to the publisher’s page.)
This is an engaging short introduction to Sphinx. At 146 pages, you shouldn’t expect it to go into every detail, and it doesn’t. There are major topics that it omits entirely or mentions only tangentially, such as distributed searching across a cluster of machines,[Read more...]
Sphinx search is a full text search engine, commonly used with MySQL.
There are some misconceptions about Sphinx and its usage. Following is a list of some of Sphinx’ properties, hoping to answer some common questions.
While the majority of the Percona gang travelled to California for the MySQL event of the year, I headed in the opposite direction to Moscow for the RIT++ 2010 conference, where I presented a talk on Sphinx. You can get the PDF file here - Improving MySQL-based applications performance with Sphinx.
I have been invited to talk at the Open Source Data Center Conference in Nürnberg, Germany in June this year, so I hope I can meet some of you there. [Read more...]
OpenSQLCamp was a huge success! I took videos of most of the sessions (we only had 3 video cameras, and 4 rooms, and 2 sessions were not recorded). Unfortunately, I was busy doing administrative stuff for OpenSQLCamp during the opening keynote and the first 15 minutes of the session organizing, and when I got to the planning board, it was already full, so I was not able to give a session.
OpenSQLCamp was a huge success! Not many folks have blogged about what they learned there….if you missed it, all is not lost. We did take videos of most of the sessions (we only had 3 video cameras, and 4 rooms, and 2 sessions were not recorded).
All the videos have been processed, and I am working on uploading them to YouTube and filling in details for the video descriptions. Not all the videos are up yet; so far, all the lightning talks are up.
There is a serious bug with the Sphinx storage engine, introduced in 0.9.9-RC2 (and which has not been fixed in the latest revisions as yet; last checked with rev 2006).
I would usually just revert to an older version (0.9.9-RC1 does not contain this bug), except that RC2 introduces an important feature: the sphinx_snippets() function, which allows snippets to be created from within MySQL and which makes the Sphinx integration with MySQL complete, as far as the application is concerned.
In the past few weeks I've been implementing advanced search at Plaxo, working quite closely with Solr enterprise search server. Today, I saw this relatively detailed comparison between Solr and its main competitor Sphinx (full credit goes to StackOverflow user mausch who had been using Solr for the past 2 years). For those still confused, Solr and Sphinx are similar to MySQL FULLTEXT search, or for those even more confused, think Google (yeah, this is a bit of a stretch, I know).
Suppose you have a MyISAM table containing a column with a full text index. This table starts to grow to a significant size (millions of rows) and gets updated fairly frequently. Chances are that you’ll start to see some bottlenecks when accessing this table, since without row level locking, the reading and writing operations will be blocking each other.
A solution that many people would suggest right away is to use the master for writes and a slave for reads, but this only masks the problem, and it won’t take long before enough read traffic on the slave starts causing slave lag.
The main difference between the Sphinx search engine and other alternatives is its close integration with MySQL. For example, it can be used as a storage[Read more...]
I just filed a very annoying bug when trying to compile with plugin engines using the 5.1.xx source tarball.
I am trying to test SphinxSE as a plugin instead of having it statically linked, and came across an annoying bug. When the configure --with-plugins option is used only once, the engine is statically linked. When it is used twice, the first engine is built as a plugin, and the second one is linked statically. Here are a couple of examples:
./configure --prefix=/usr/local/mysql-5.1.33 --with-plugins=innobase --with-plugins=sphinx
plugin_innobase_shared_target='ha_innodb.la' <-- plugin
plugin_innobase_static_target=''
plugin_sphinx_shared_target=''
plugin_sphinx_static_target='libsphinx.a' <-- static
[Read more...]
sql_query_info and found that no results came from the database. That's because "search" only has MySQL support, whereas the main Sphinx code base has support for a number of databases, and uses inheritance to easily support them all so that whatever database you use, it automatically uses the correct connection. So, I added Drizzle support to "search". The one issue is that, unlike the indexer, "search" can only support one database at a time. So, if you want to compile in Drizzle support, you have to disable MySQL support for "search" to work correctly:
Peter wrote about this recently, but I don’t know if it was really clear what was going on.
Point One: Sphinx can be contacted by the MySQL protocol. Not “as a MySQL storage engine.” Not “from MySQL.” It understands the MySQL protocol itself. So from the protocol point of view, the Sphinx search daemon can look just like a MySQL server.
Point Two: Sphinx understands a SQL-like query language. Don’t be fooled. You’re not writing SQL. It just looks like you are.
Point Three: Because of point One and point Two, you can use the mysql command-line client program to talk directly to[Read more...]
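As a rough illustration of Point Two, the sketch below only composes a SQL-like query string of the kind Sphinx accepts; it does not open a connection. The index name `articles` and the helper name are hypothetical, and the escaping covers only the string literal, not the operators of the full-text query syntax itself.

```python
def build_sphinxql(index, text, limit=10):
    """Compose a SphinxQL full-text query string for the given index."""
    # Hypothetical helper: restrict index names to letters, digits and underscores.
    if not index.replace("_", "").isalnum():
        raise ValueError("index name must be alphanumeric or underscore")
    # Escape backslashes and single quotes so the text stays inside the string literal.
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"SELECT id FROM {index} WHERE MATCH('{escaped}') LIMIT {int(limit)}"


query = build_sphinxql("articles", "sphinx search")
print(query)
```

Because the daemon speaks the MySQL protocol, a string like this could be sent with any MySQL client or client library pointed at the port on which the Sphinx daemon listens for that protocol, which depends on its configuration.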
In the recently released Sphinx version 0.9.9-rc2 there is support for the MySQL wire protocol and SphinxQL, a SQL-like language for querying Sphinx indexes. This support is currently in its early preview stage, but it is still fun to play with.
A thing to mention: unlike MySQL storage engines, some of which (such as InfoBright or KickFire) take over execution after parsing, Sphinx's MySQL support has nothing to do with MySQL. It is an implementation of the wire protocol from scratch.
For this test I was not interested in full text search performance; we already know Sphinx is much faster than MySQL's built-in full text search. I was rather interested in looking at the performance of other queries that do not use full text search. [Read more...]
This article accompanies the slides from a presentation on database sharding. Sharding is a technique for horizontal scaling of databases that we are using at Netlog. If you’re interested in high performance, scalability, MySQL, PHP, caching, partitioning, Sphinx, federation or Netlog, read on …
This presentation was given on the second day of FOSDEM 2009 in Brussels. FOSDEM is an annual conference on open source software with about 5000 hackers. I was invited by Kris Buytaert and Lenz Grimmer to give a talk in the MySQL Dev [Read more...]
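For readers new to the term, sharding means splitting rows across several database servers by some key, so that each server holds only part of the data. The sketch below shows one simple way to pick a shard, a fixed hash of the user ID. This is an assumption made for illustration and not necessarily the scheme described in the presentation; the host names are hypothetical.

```python
import zlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]  # hypothetical hosts


def shard_for(user_id: int) -> str:
    """Pick the shard that holds all rows belonging to this user."""
    # crc32 gives a hash that is stable across processes, unlike Python's built-in hash().
    key = str(user_id).encode("ascii")
    return SHARDS[zlib.crc32(key) % len(SHARDS)]


for uid in (1, 2, 42, 1001):
    print(uid, "->", shard_for(uid))
```

A fixed modulo mapping like this is easy to reason about but makes adding shards painful, since most keys move; a lookup table from key to shard is a common alternative.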
Here are the slides from yesterday’s presentation about horizontal database scaling through sharding at the MySQL dev room at FOSDEM 2009.
I’ve got a ton of notes and remarks on these slides, which will become available here soon.