FAQ
This document encompasses many of the frequently asked questions (FAQs) about Mongo Connector.
- My oplog progress file always seems really out of date. What's going on?
- Why are some fields in my MongoDB documents not appearing in Solr?
- What is the `mongodb_meta` index in Elasticsearch?
- Why are my documents empty in Elasticsearch? Why are updates not happening in Elasticsearch?
- How many threads does Mongo Connector start?
- How do I increase the speed of Mongo Connector?
- Does Mongo Connector support dynamic schemas for Solr?
- How can I load several Solr cores with Mongo Connector?
- I can't install Mongo Connector! I'm getting the error "README.rst: No such file or directory"
Mongo Connector updates the oplog progress file (called `config.txt` by default) whenever its cursor into the MongoDB oplog is closed. Note that this may come long after Mongo Connector has read and processed all entries currently in the oplog. This is due to the connector's use of a tailable cursor, which can be re-used to retrieve documents that arrive in the oplog even after the cursor is created. Thus, you cannot rely on the progress file being updated automatically after the oplog is exhausted.
Instead, Mongo Connector provides the `--batch-size` option, with which you can specify the maximum number of documents Mongo Connector may process before having to record its progress. For example, if you wanted to make sure that Mongo Connector records its progress at least every 100 operations in the oplog, you could run:
mongo-connector -m <source host/port> -t <destination host/port> --batch-size=100
Documents containing fields that are not defined in the Solr collection's schema cannot be inserted, and Solr will log an exception. To avoid sending invalid requests, Mongo Connector reads your Solr collection's schema before replicating any operations to Solr. Documents replicated to Solr from MongoDB may therefore be altered to remove fields that aren't in the schema, and the result may look as if your documents are missing certain fields.
The solution is to update your `schema.xml` file and reload the relevant Solr cores.
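As a sketch, if your documents have a field that Solr is dropping, you could declare it in `schema.xml` along these lines. The field names and types here are illustrative, not taken from your schema:

```xml
<!-- Illustrative field declaration; adjust name and type to your data -->
<field name="subtitle" type="string" indexed="true" stored="true"/>
<!-- A dynamic field can catch a whole family of similarly-named fields -->
<dynamicField name="*_txt" type="text_general" indexed="true" stored="true"/>
```

After editing the schema, reload the core (for example, via the Core Admin `RELOAD` command) so the changes take effect.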
Mongo Connector creates a `mongodb_meta` index in Elasticsearch in order to keep track of when documents were last modified. This is used to resolve conflicts in the event of a replica set rollback, but is kept in a separate index so that it can be removed easily if necessary.
Mongo Connector needs the `_source` field to be enabled in Elasticsearch in order to apply update operations. Make sure that you have it enabled.
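For reference, `_source` is enabled by default in Elasticsearch; updates can only fail this way if your mapping explicitly disables it. A minimal sketch of a mapping that keeps `_source` enabled (the type name `my_type` is a placeholder):

```json
{
  "mappings": {
    "my_type": {
      "_source": { "enabled": true }
    }
  }
}
```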
Mongo Connector starts one thread for each oplog (i.e., each replica set), and an additional thread to monitor them. Thus, if you have a three-shard cluster, where each shard is a replica set, you will have:
- 1 Connector thread (starts OplogThreads and monitors them)
- 3 OplogThreads (one for each shard)
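The arithmetic above can be sketched as a tiny helper. The function below is purely illustrative; it is not part of Mongo Connector's API:

```python
def connector_thread_count(num_replica_sets):
    """Total threads: one OplogThread per replica set, plus one monitor thread.

    Based on Mongo Connector's model of one thread per oplog being tailed.
    """
    monitor_threads = 1
    return monitor_threads + num_replica_sets

# A three-shard cluster (each shard a replica set) uses 4 threads.
print(connector_thread_count(3))  # -> 4
```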
- Increase the value for `--auto-commit-interval` (or, even better, don't specify it at all and let it default to `None`). Setting this value higher means the remote system doesn't need to be refreshed as often, which saves time. Leaving this option out entirely leaves when to refresh indexes up to the remote indexing system itself. Most indexing systems have some way to configure this.
- If you only need to replicate certain collections, use the `--namespace-set` option to specify them. You can also run separate instances of Mongo Connector, each with a single namespace to replicate, so that those namespaces are replicated in parallel. Note that this may mean some collections end up further ahead or behind others, especially if the number of operations is unbalanced across these collections.
- You can increase the value for `--batch-size`, or leave it out entirely, so that Mongo Connector records its timestamp less frequently.
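Putting these options together, an invocation might look like the following. The host placeholders are as above, and the namespace `mydb.mycoll` is just an example:

```
mongo-connector -m <source host/port> -t <destination host/port> --namespace-set mydb.mycoll --batch-size=500
```

Here `--auto-commit-interval` is deliberately omitted, so the remote indexing system controls its own refreshes.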
Mongo Connector does not currently support this. However, restarting Mongo Connector will cause it to re-read the schema definition.
There are two options:
- Use multiple `solr_doc_manager`s. When you do this, all MongoDB collections go to all cores. This isn't a very common use case.
- Use multiple instances of `mongo-connector`, passing the base URL of each core to `docManagers.XXX.targetURL`. This allows you to refine which collections, and which fields from each document, get sent to each core.
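As a sketch, a Mongo Connector JSON configuration file pointing a doc manager at a specific core might look like this. The URL and core name are placeholders; consult the connector's configuration documentation for the full format:

```json
{
  "docManagers": [
    {
      "docManager": "solr_doc_manager",
      "targetURL": "http://localhost:8983/solr/core1"
    }
  ]
}
```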
Make sure you have a recent version of `setuptools` installed. Any version after 0.6.26 should do the trick:
pip install --upgrade setuptools