What is Manticore Search? Configuration, Setup, and Practical Application
Hello. I’m Jeongil Jeong, working as a backend developer at a PropTech platform.
“We need a search feature.” When you hear this, one name probably pops into most developers’ heads: Elasticsearch. Thanks to its powerful features and rich ecosystem, it seems to be considered the industry standard.
However, I don’t think Elasticsearch is always the answer. It may sound obvious, but every technology should be chosen to fit the situation and environment. In that sense, our question was: “Do we really need to set up Kibana and fight with Java heap memory just to add a single search feature?” Especially in a resource-constrained environment, we felt that Elasticsearch’s high resource requirements and operational complexity could become a significant burden.
At the end of this deliberation, we discovered an open-source search engine called Manticore Search. This post is a record of the process of choosing Manticore Search as our search engine, the realistic concerns and experiences we faced while applying it to a production environment, and our technical choices.
Manticore Search?
Manticore Search is a high-performance open-source full-text search engine written in C++. Interestingly, it is not a completely new project.
Its roots lie in Sphinx Search, which gained popularity in the early 2000s alongside MySQL for its lightweight, fast performance. When development of Sphinx effectively ceased in 2016, most of the core team left and forked the project in 2017 to create Manticore Search.
What’s interesting is that while Sphinx later went closed source, Manticore Search has remained 100% open source and is being developed far more actively. (Manticore Search: 3 years after forking from Sphinx)
Realistic Reasons Why We Chose Manticore Search
I personally believe that technology selection is always a series of trade-offs. The process by which our team chose Manticore Search was as follows.
1. Realistic Constraint: Server Costs and Resources
The biggest reason was cost. It is the kind of realistic constraint that cannot be left out of any technology decision.
Elasticsearch is Java-based, and for a production environment at least 8GB of RAM and 2 vCPUs are commonly recommended. That is excessive for our service scale, and we expected it to translate directly into infrastructure cost.
On the other hand, I read documentation stating that Manticore Search, written in C++, is remarkably efficient, using only about 40MB of RSS on an empty instance. A real workload obviously needs more than that, but it seemed clear that it consumes far fewer resources.
Currently, we are stably operating Manticore Search in a Kubernetes environment with specs around requests: cpu: 200m, memory: 512Mi.
The difference between 2 vCPU / 8GB RAM and 200m CPU / 512Mi memory is one that small teams such as startups or small to medium-sized projects cannot ignore.
2. Performance: Does Lightness Lead to Speed?
As important as cost was performance. According to the latest benchmarks released by Manticore Search, they claim to be several times to tens of times faster than Elasticsearch, especially in scenarios like log analysis. (Manticore Search vs Elasticsearch)
Of course, benchmarks are results obtained in an idealized environment, and since these were published by Manticore Search themselves, we also have to assume the numbers skew in their favor.
However, we expected that Manticore’s architecture, written in C++ and using system resources directly without overhead like JVM garbage collection (GC), would provide ‘predictable and consistent performance’.
3. Developer Productivity: SQL and HTTP
One of the great attractions of Manticore Search was its native support for SQL. Being compatible with the MySQL protocol, we could connect directly with existing MySQL clients and write search queries in familiar SQL.
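For instance, a query against the poi index defined later in this article might look like the sketch below; the keyword and connection details are illustrative.

```sql
-- Connect with any MySQL client over the MySQL protocol port (9306 in our config),
-- e.g. mysql -h 127.0.0.1 -P 9306, then run ordinary-looking SQL:
SELECT id, station_name, line_name
FROM poi
WHERE MATCH('강남')   -- full-text search
LIMIT 10;
```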

In addition, it natively supports an HTTP JSON API, which is essential for modern web service architectures.
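A rough sketch of an equivalent request against the HTTP JSON API (the port and field name follow the configuration shown later; adjust to your environment):

```bash
# Full-text search over HTTP on the JSON API port (9308 in our config)
curl -s -X POST http://localhost:9308/search \
  -H 'Content-Type: application/json' \
  -d '{
    "index": "poi",
    "query": { "match": { "station_name": "강남" } },
    "limit": 10
  }'
```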

Comparison with Other Search Engines: Why Manticore?
Besides Elasticsearch, we also reviewed other lightweight search engines like Meilisearch and Typesense.
| Item | Manticore Search | Meilisearch | Typesense | Elasticsearch |
|---|---|---|---|---|
| Language | C++ | Rust | C++ | Java |
| Data Sync | Pull (DB → Search Engine) | Push (App → Search Engine) | Push (App → Search Engine) | Push (App/Logstash → Search Engine) |
| Pros | Native SQL, Low Resources | Easy usability, Fast development | Fast search speed (In-Memory) | Rich ecosystem, Analytics features |
| Cons | Small ecosystem, low awareness in Korea | Limitations in large-scale data processing | Data size limited by RAM | High resources, Complexity |
I understand that Elasticsearch, Meilisearch, and Typesense are all optimized for the Push method where the application calls the API to push data.
On the other hand, Manticore’s Plain index, which we used, was more specialized for the Pull method where the indexer periodically fetches data from the DB and regenerates the index.
Our requirement was ‘when specific tables in the operational DB change, it should be periodically reflected in the search engine’. In this case, to use the Push model, separate synchronization logic that detects DB changes and calls the API must be implemented in the application. (As covered in the epilogue, we implemented this logic with Spring Batch while migrating to Elasticsearch.)
On the other hand, we judged that Manticore’s Pull model was more suitable for us because it delegates this process to the search engine, keeping the application logic simple.
Below is a diagram showing the difference between the two data synchronization methods.
Push Model
In the Push model, the responsibility for data synchronization lies with the application. The application must detect DB changes and call the search engine’s API to push data. For example, it can be implemented with a tool like Spring Batch.
```mermaid
graph LR;
    DB[(Database)] -- "1. Data Changed" --> App{"Application<br/>with Sync Logic"};
    App -- "2. Push Data" --> SE["Search Engine<br/>(ex. Elasticsearch)"];
```
Pull Model
In the Pull model, the responsibility for data synchronization lies with the search engine. The application only handles search requests, and Manticore Search’s Indexer periodically fetches (pulls) data from the DB to keep the index up to date.
```mermaid
graph LR;
    DB[(Database)];
    App{"Application<br/>(Search Only)"} -- "Search Query" --> MS["Manticore Search"];
    MS -- "Pulls Data Periodically" --> DB;
```
Practical Deployment and Indexing Automation
Now I would like to share two methods of deploying Manticore Search in a real production environment and automating indexing: using Docker Compose and Kubernetes.
Method 1: Configuring Development Environment with Docker Compose
When deploying a local development environment or a single server, using Docker Compose is very convenient. This is a simplified example of the configuration we actually used.
1. docker-compose.yml
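A minimal sketch of such a compose file (the service name, build details, and volume paths are assumptions to adapt to your environment):

```yaml
version: "3.8"
services:
  manticore:
    build:
      context: .
      dockerfile: Dockerfile.manticore   # custom image that adds cron on top of the official image
    ports:
      - "9306:9306"   # MySQL protocol
      - "9308:9308"   # HTTP JSON API
    volumes:
      - ./manticore.conf:/etc/manticoresearch/manticore.conf:ro
      - manticore-data:/var/lib/manticore
volumes:
  manticore-data:
```

The manticore.conf mounted above looks like this, simplified for the post: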
```
# 1. searchd daemon basic settings
searchd {
    listen = 9306:mysql41   # MySQL protocol
    listen = 9308:http      # HTTP API port
    pid_file = /var/run/manticore/searchd.pid
    log = /var/log/manticore/searchd.log
    query_log = /var/log/manticore/query.log
}

# 2. Data source settings example
source poi_src
{
    type = mysql
    sql_host = YOUR_DB_HOST
    sql_user = YOUR_DB_USER
    sql_pass = YOUR_DB_PASSWORD
    sql_db = YOUR_DB_NAME
    sql_query = \
        SELECT station_id AS id, station_name, line_name \
        FROM station_table_example
    sql_attr_string = line_name
}

# 3. Index settings example
index poi {
    source = poi_src
    path = /var/lib/manticore/poi

    # --- Key settings for Korean search ---
    charset_table = 0..9, A..Z->a..z, a..z
    infix_fields = station_name
    min_infix_len = 2
    ngram_len = 2
    ngram_chars = U+AC00..U+D7AF, U+1100..U+11FF, U+3130..U+318F
}
```

2. Dockerfile.manticore & start.sh
Automate indexing by setting up cron within the Manticore container.
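Below are sketches of both files; the base image tag, cron schedule, and paths are assumptions, not the exact files we ran.

```dockerfile
# Dockerfile.manticore: extend the official image with cron for periodic reindexing
FROM manticoresearch/manticore:6.2.12

USER root
RUN apt-get update && apt-get install -y --no-install-recommends cron && \
    rm -rf /var/lib/apt/lists/*

# Rebuild all indexes every 5 minutes; --rotate swaps the new index in without downtime
RUN echo '*/5 * * * * root indexer --config /etc/manticoresearch/manticore.conf --all --rotate >> /var/log/manticore/indexer.log 2>&1' \
      > /etc/cron.d/manticore-indexer && chmod 0644 /etc/cron.d/manticore-indexer

COPY start.sh /start.sh
RUN chmod +x /start.sh
ENTRYPOINT ["/start.sh"]
```

And a start.sh that runs cron alongside the search daemon:

```bash
#!/bin/sh
set -e

# Start cron in the background so the indexer runs on schedule
cron

# Run searchd in the foreground so the container stays alive
exec searchd --config /etc/manticoresearch/manticore.conf --nodetach
```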
I thought this method had the advantage of managing all infrastructure settings as code (IaC) and easily building the same development environment with a single docker-compose up command.
Method 2: Deploying to Production Environment with Kubernetes
This is our actual production environment configuration method. We assigned the responsibility of indexing automation to the Manticore container itself, not the Kubernetes CronJob resource. This was to make management easier by unifying all responsibilities related to deployment into a single Deployment.
1. manticore.conf (ConfigMap Example)
Configured simply with one source and one index to fit the blog example.
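A sketch of the ConfigMap (its name is an assumption); the embedded manticore.conf is the same one shown in the Docker Compose section, trimmed for brevity:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: manticore-config
data:
  manticore.conf: |
    searchd {
        listen = 9306:mysql41
        listen = 9308:http
        pid_file = /var/run/manticore/searchd.pid
        log = /var/log/manticore/searchd.log
        query_log = /var/log/manticore/query.log
    }
    source poi_src
    {
        type = mysql
        sql_host = YOUR_DB_HOST
        sql_user = YOUR_DB_USER
        sql_pass = YOUR_DB_PASSWORD
        sql_db = YOUR_DB_NAME
        sql_query = SELECT station_id AS id, station_name, line_name FROM station_table_example
        sql_attr_string = line_name
    }
    index poi {
        source = poi_src
        path = /var/lib/manticore/poi
        charset_table = 0..9, A..Z->a..z, a..z
        infix_fields = station_name
        min_infix_len = 2
        ngram_len = 2
        ngram_chars = U+AC00..U+D7AF, U+1100..U+11FF, U+3130..U+318F
    }
```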
2. Dockerfile and start.sh
Even in the Kubernetes environment, a Dockerfile and start.sh similar to the Docker Compose method are used to build an image, so the container handles indexing scheduling itself. We added initial indexing logic to start.sh.
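The Dockerfile is essentially the same as in Method 1; only start.sh changes, gaining an initial indexing step so a freshly started Pod has an index before the first cron run. A sketch under the same assumptions:

```bash
#!/bin/sh
set -e

CONFIG=/etc/manticoresearch/manticore.conf

# Initial indexing: build the index once at startup so the Pod can serve
# search requests immediately, before the first scheduled cron run
indexer --config "$CONFIG" --all || echo "initial indexing failed; cron will retry"

# Start cron in the background for periodic reindexing
cron

# Run searchd in the foreground
exec searchd --config "$CONFIG" --nodetach
```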
3. Deployment.yaml
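A simplified Deployment sketch (image name, labels, and storage choice are assumptions); note the small resource requests mentioned earlier and the ConfigMap mount:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: manticore
spec:
  replicas: 1
  selector:
    matchLabels:
      app: manticore
  template:
    metadata:
      labels:
        app: manticore
    spec:
      containers:
        - name: manticore
          image: your-registry/manticore-cron:latest   # image built from the Dockerfile above
          ports:
            - containerPort: 9306   # MySQL protocol
            - containerPort: 9308   # HTTP JSON API
          resources:
            requests:
              cpu: 200m
              memory: 512Mi
          volumeMounts:
            - name: config
              mountPath: /etc/manticoresearch/manticore.conf
              subPath: manticore.conf
            - name: data
              mountPath: /var/lib/manticore
      volumes:
        - name: config
          configMap:
            name: manticore-config
        - name: data
          emptyDir: {}   # acceptable here: the index is rebuilt from the DB by the indexer
```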
Korean Search, Record of Trials
In the process of Korean tokenization, we agonized over the ngram_len value. Although ngram_len=1 was often recommended in the official documentation, after analyzing the characteristics of our dataset and our major search terms, we judged that two-character searches like ‘Gangnam’ and ‘Seocho’ were more natural and accurate, so we finally settled on ngram_len=2 (splitting text into two-character units).
I think it is important to analyze actual data and user search patterns to find settings that fit our service, rather than following a set answer.
Problems We Faced
- Slow Initial Indexing Speed: The cause was a performance problem with the source DB sql_query. We solved it by tuning the query and adding indexes.
- Absence of Kibana: We are considering linking Grafana or similar tools, but haven’t found a satisfactory visualization/monitoring solution yet.
- Mixed Korean + English/Number Search: Solved by including English letters and digits in charset_table.
Pros and Cons of Manticore Search That I Felt
Pros
- Lightweight Resource Usage: It was possible to operate sufficiently with little memory and CPU.
- Pull Indexing Model: It was very convenient to configure in that indexing automation is possible without building a separate system.
- Fast Search Performance: We experienced satisfactory search speed not only in benchmark results but also in actual operating environments.
- Korean Search Support: With the help of the official documentation, we were able to apply Korean tokenization settings relatively easily.
Cons
- Small Ecosystem: Compared to Elasticsearch, there are far fewer community resources, so problem solving tended to take longer.
- Limited Advanced Features: Support for complex analytics and embedding search felt lacking. Some of these features do exist, but references are scarce, so adopting them would have meant taking on risk. (As I will cover in the epilogue, we eventually moved to Elasticsearch to build a RAG system.)
- Lack of Monitoring Tools: There was no visualization tool like Kibana, so there was difficulty in log analysis and monitoring.
Conclusion: Journey to Choose the Right Tool for the Problem
Starting from the realistic problems of cost and complexity, I think our team found a genuinely good alternative in Manticore Search.
If you don’t need all the features Elasticsearch provides, I think Manticore Search can be a way to implement excellent search features with much less cost and effort.
Since there are not many references related to Manticore Search in Korea, I hope this article will be of some help to those considering its introduction. If you have any questions or incorrect information, please feel free to leave a comment!
Epilogue: Into the Arms of Elasticsearch
Since writing this article, there has been another change in our team’s technology stack. We eventually chose Elasticsearch. Although we were satisfied using Manticore Search, technical requirements changed in the process of building a new RAG (Retrieval-Augmented Generation) system.
In terms of embedding vector search, the core of the RAG system, and future scalability, we judged that the rich features and ecosystem provided by Elasticsearch were more advantageous.
However, if your goal is still simply ‘light and fast search’ and you are looking for a tool that supports Pull-based indexing, I think Manticore Search can be an excellent choice.
Elasticsearch seems to be the better tool for our new requirement of ‘complex and diverse search support (e.g. embedding-based similarity search and multi-parameter search)’. (In this process, indexing was implemented with Spring Batch, as mentioned above.)
I will cover the experience of moving from Manticore Search to Elasticsearch and the story of building the RAG system in detail in another article next time.