iOS full-text search with Core Data and SQLite
This article will discuss the implementation of SQLite’s FTS engine in an Core Data-based iOS application.
Introduction
In a full-text search, FTS for short, the search engine examines all of the words in every stored document as it tries to match search criteria. Unfortunately, Core Data, iOS’s native persistence framework, does not offer support for FTS, however Core Data internally relies on an SQLite database for storage on iOS devices. This allows for the usage of SQLite’s FTS4 engine.
To implement this solution I wrote a small app with a simplified data structure similar to that of one of our client apps. For searching, this meant it should search through at least 10,000 documents on a title and content properties where the content would be filled with HTML-rich text. A dataset that closely resembled this is one from Stackoverflow, publicly available at http://blog.stackoverflow.com/category/cc-wiki-dump. I took the users.xml file. It contains about 300,000 users where the fields displayName and aboutMe(HTML rich-text) closely resemble the content in our client’s app.
The app displays a list of Stackoverflow user accounts in a tableView
.
The users are searchable through the tableView
’s searchbar.
Global architecture
In the app I will implement FTS using SQLite’s FTS functionality. To simplify communication between the app and its database, FMDB will be used, FMDB is an Objective-C wrapper around SQLite. The virtual table is created in the same SQLite database in wich the Core Data content resides. To keep this table as light as possible only object properties relevant to the search query are inserted.
Search results are sorted on their relevance, which is calculated by
SQLite in a custom ranking function. The custom function is added to
SQLite using the provided C API.
Objects in the resultset contain an identifier corresponding to the Core
Data object. The relevant Core Data objects are fetched using an array
containg the IDs retrieved by the FTS query. Results are displayed by a
searchDisplayController
in a tableView
, ordered(DESC
) based on
their relevance ranking to the search query.
Pre-requisites:
To use the SQLite FTS engine it is necessary to include the SQLite library in the iOS project, and since there are no iOS specific versions of SQLite it’s necessary to add it as a static library and compile it. Here is how it can be done:
-
Download the SQLite source from http://www.sqlite.org/download.html
-
Open Xcode, create a new Project
-
Under the iOS section, select Framework & Library, create an new Cocoa Touch Static library.
-
Add the SQLite source files (
sqlite3.c
andsqlite3.h
) to the project -
In project build settings, make sure the compiler is set to the LLVM compiler 4.1
-
In project build settings, add the following LLVM preprocessor macros:
-
For debug:
SQLITE_DEBUG
,SQLITE_ENABLE_FTS3_PARENTHESIS
andSQLITE_ENABLE_FTS3
-
For release:
SQLITE_ENABLE_FTS3_PARENTHESIS
andSQLITE_ENABLE_FTS3
-
-
Add this Xcode framework-project as a whole to the Xcode iOS app-project.
For ease of use I also included FMDB to interface with SQLite.
-
Download FMDB from github https://github.com/ccgus/fmdb
-
Add it to the iOS project in Xcode
-
Add its dependency, FMDB requires
libsqlite3.dylib
to be included with the project.-
In project build phases under `Link binary with libraries'.
-
Implementation
This paragraph will describe the steps necessary to implement FTS. The virtual table should be created and filled simultaneously to the initial creation of the Core Data-database. When creating the virtual table specify that FTS is used:
BOOL tableCreated = [_db executeUpdate:@"CREATE VIRTUAL TABLE ZUSERINDEX USING fts4(displayName, aboutMe, id);"];
The _db
variable is an instance of FMDatabase. For convenience this
instance could be placed in an Singleton object so that is accessible
throughout the app.
From the searchDisplayController delegate method
shouldReloadTableForSearchString
I trigger the custom search method
which executes an FTS select query on the virtual table.
_result = [_db executeQuery:[NSString stringWithFormat:@"SELECT displayName, id FROM ZUSERINDEX WHERE ZUSERINDEX MATCH '*%@*' ORDER BY CustomRank(matchinfo(ZUSERINDEX)) DESC;", searchText]];
The function CustomRank
is added to SQLite using following statement:
sqlite3_create_function(db, "CustomRank", 1, SQLITE_UTF8, NULL, &CustomRank, NULL, NULL);
The db variable is the sqlite handle, which is exposed as a
property(sqliteHandle
) by FMDB. Following arguments are the name of
the function, the number of arguments it takes, and the function itself.
Note that newer versions of FMDB also support the adding of functions
via blocks, using makeFunctionNamed:maximumArguments:withBlock:
.
The ranking function takes the matchinfo()
results as argument.
Matchinfo
is an auxiliary
function provided by SQLite which contains (amongst others), the column
in which the phrase hit a match and the number of phrases:
static void CustomRank(sqlite3_context *context, int argc, sqlite3_value **argv) {
if (argc != 1)
sqlite3_result_null(context);
if (sqlite3_value_type(argv[0]) == SQLITE_NULL)
sqlite3_result_null(context);
int weight = 0;
unsigned int *blob = (unsigned int *)sqlite3_value_blob(argv[0]);
unsigned int numberOfPhrases = blob[0];
unsigned int numberOfColumns = blob[1];
// Ranking details left out.
sqlite3_result_int(context, weight);
}
For each matching phrase in each column the weight is increased. Our
ranking function CustomRank
is written in C. It is of course possible
to write your custom function in Objective-C, but it gets a bit more
tricky to convert the arguments to the right types.
This returns a FMDBResultset with the id
column sorted by relevance as
I defined it. I added the results in a NSArray
.
FMResultSet *rs = [_database executeQuery:SQLFetchRequest];
_displayNameArray = [NSMutableArray array];
_idArray = [NSMutableArray array];
while ([rs next]) {
[_displayNameArray addObject:[rs stringForColumn:@"displayname"]];
[_idArray addObject:[rs stringForColumn:@"id"]];
}
The TableViewDatasource
can be implemented to use these arrays as
datasource and display their content in cells based on indexPath.row
.
When a cell is selected I query Core Data for the actual object based on
the object id from _idArray
using the following predicate:
[NSPredicate predicateWithFormat:@"identifier = %@", [_idArray objectAtIndex:indexPath.row]];
The result is a tableView
with search option that responsively
displays full-text search results in order of their relevance.
Performance
There are some performance issues in the implementation described above. This becomes apparent when the query is fired with only a single character. All documents may contain the character `e'. Calculating the ranks of the resultset from this query can take up to 1.5 seconds on older devices. A solution would be to not rank the results of single character queries, or only to fire the query after the second character was given.
The screenshots below show the perfomance gain over the Core Data queries:
SQLites Full-text search.
performance]
Core Data with contains[cd]
predicate.
97.8% CPU usage vs 87.9% CPU usage, an almost 10% decrease. Profiler results of memory usage see an allocation decrease from 131.56 to 19.85 MB, at least a 660% decrease. In practise this means that the app will remain responsive while searching.
Conclusion
Although this implementation may not be as elegant when compared to Core Data-based search implementions, this SQLite implementation offers something Core Data does not: full-text search. Next to that, it performs almost 10% faster and at least 660% more (memory) efficiently than a comparable Core Data query.
Further thoughts
Core Data could be left out of this implementation, SQLite offers
comparable functionality, improved performance and FTS. This would mean
that the tableViewDatasource
would have to be adjusted to map the FMDB
resultset rather than the CoreData fetchRequest
resultset.