pymongoimport - Import CSV files into MongoDB
pymongoimport is a collection of Python programs for importing CSV files into MongoDB.
Why do we have pymongoimport?
MongoDB already has a perfectly good (and much faster) mongoimport program that is available for free in the standard MongoDB community download.
Well, pymongoimport does a few things that mongoimport doesn’t do (yet). For people with new CSV files there is the --genfieldfile option, which automatically generates a typed field file for the specified input file. Even with a field file, pymongoimport will fall back to the string type if type conversion fails on any input column.
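The fallback behaviour described above can be pictured with a short Python sketch. This is not pymongoimport's own code: the function names `convert_value` and `guess_type`, the `CONVERTERS` table, and the type labels `int`, `float` and `str` are assumptions made purely for illustration, and the real field file format may use different names.

```python
from typing import Any

# Hypothetical converters keyed by type label (an assumption; the real
# field file format and its type names may differ).
CONVERTERS = {"int": int, "float": float, "str": str}


def convert_value(raw: str, type_name: str) -> Any:
    """Convert raw to the requested type, keeping the string on failure."""
    try:
        return CONVERTERS[type_name](raw)
    except (KeyError, ValueError):
        # Unknown type label or dirty data: fall back to the string value.
        return raw


def guess_type(raw: str) -> str:
    """Guess a type label for a sample value, trying int before float."""
    for type_name in ("int", "float"):
        try:
            CONVERTERS[type_name](raw)
            return type_name
        except ValueError:
            continue
    return "str"
```

With this sketch, `convert_value("n/a", "int")` returns the string `"n/a"` instead of raising, so one dirty value does not abort the import.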
pymongoimport allows you to use the --addlocator argument to automatically include a locator in each document that is inserted. This locator indicates the file name and the line number of the input line from which the document was generated.
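As a rough illustration of what a locator might look like, the Python sketch below attaches the file name and line number to each row read with the standard `csv` module. The key names `locator`, `f` and `n` and the function `rows_with_locator` are hypothetical; the layout that pymongoimport actually writes may differ.

```python
import csv
from typing import Any, Dict, Iterable, Iterator


def rows_with_locator(lines: Iterable[str], filename: str) -> Iterator[Dict[str, Any]]:
    """Yield one document per CSV data row, each carrying a locator."""
    reader = csv.DictReader(lines)
    for row in reader:
        doc: Dict[str, Any] = dict(row)
        # "locator", "f" and "n" are hypothetical key names.
        # reader.line_num counts source lines read so far, header included.
        doc["locator"] = {"f": filename, "n": reader.line_num}
        yield doc
```

Because the header occupies line 1, the first data row in this sketch gets line number 2.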
pymongoimport also has the ability to restart an upload from the point where it stopped. The information needed for a restart is recorded in an audit collection in the current database. An audit record is stored for each upload in progress and each completed upload. Thus the audit collection gives you a record of all uploads by filename and date time.
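One way to picture the restart mechanism is the Python sketch below. It is a simplified model under stated assumptions, not pymongoimport's implementation: a plain dict stands in for the audit collection, `insert` stands in for a bulk write to MongoDB, and the record fields `filename`, `start`, `lines_done` and `finished` are invented for the example.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Iterable, List


def upload(
    lines: Iterable[str],
    filename: str,
    audit: Dict[str, Dict[str, Any]],
    insert: Callable[[List[str]], Any],
    batch_size: int = 2,
) -> Dict[str, Any]:
    """Insert lines in batches, recording progress so a rerun can resume."""
    # Reuse the audit record of an interrupted upload, or create a new one.
    record = audit.setdefault(
        filename,
        {
            "filename": filename,
            "start": datetime.now(timezone.utc),
            "lines_done": 0,
            "finished": False,
        },
    )
    already_done = record["lines_done"]
    batch: List[str] = []
    for line_number, line in enumerate(lines, start=1):
        if line_number <= already_done:
            continue  # uploaded by an earlier run
        batch.append(line)
        if len(batch) == batch_size:
            insert(batch)
            record["lines_done"] = line_number  # progress checkpoint
            batch = []
    if batch:
        insert(batch)
        record["lines_done"] += len(batch)
    record["finished"] = True
    return record
```

If a run is interrupted, the record keeps the last checkpointed line count and `finished` stays false, so calling `upload` again with the same file skips the lines that were already inserted.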
Finally, pymongoimport is more forgiving of dirty data: if your actual data doesn’t match your field type definitions, the type converter falls back to using a string type.
On the other hand, mongoimport supports the more extensive security options of the MongoDB Enterprise Advanced product and, because it is written in Go, it can use threads more effectively and so is generally faster.
pymongoimport command-line programs: