Notes on MongoDB Backup and Restore

2022-05-04 11:33 Original MongoDB
Author: dave

1 MongoDB Backup and Restore Overview


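Starting with MongoDB 4.4, the backup and restore commands also have to be installed separately; they ship as the MongoDB Database Tools.
For the installation steps, see the earlier blog post:

Installing the Database Tools for MongoDB 4.4 and Later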
https://www.cndba.cn/dave/article/107952

http://www.cndba.cn/cndba/dave/article/107972

Five commands are mainly involved:

[dave@www.cndba.cn_3 ~]# cd /usr/local/mongodb/bin/
[dave@www.cndba.cn_3 bin]# ll
total 325212
-rwxr-xr-x 1 root root 13997360 Apr 30 11:31 bsondump
-rwxr-xr-x 1 root root    15205 Mar  2 23:24 install_compass
-rwxr-xr-x 1 root root 53691040 Mar  2 23:35 mongo
-rwxr-xr-x 1 root root 83347720 Mar  2 23:55 mongod
-rwxr-xr-x 1 root root 16769896 Apr 30 11:31 mongodump
-rwxr-xr-x 1 root root 16425456 Apr 30 11:31 mongoexport
-rwxr-xr-x 1 root root 17310776 Apr 30 11:31 mongofiles
-rwxr-xr-x 1 root root 16701752 Apr 30 11:31 mongoimport
-rwxr-xr-x 1 root root 17151264 Apr 30 11:31 mongorestore
-rwxr-xr-x 1 root root 65380248 Mar  2 23:44 mongos
-rwxr-xr-x 1 root root 16268112 Apr 30 11:31 mongostat
-rwxr-xr-x 1 root root 15931944 Apr 30 11:31 mongotop
[dave@www.cndba.cn_3 bin]#
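  • mongoexport/mongoimport: export/import data in JSON or CSV format (mongoimport also accepts TSV)
  • mongodump/mongorestore: export/import data in BSON format
  • bsondump: converts BSON files into JSON

Differences between the two pairs of commands:

  1. JSON is human-readable but larger; BSON is a binary format, so the files are smaller but practically unreadable.
  2. The BSON format can differ between some MongoDB versions, so mongodump/mongorestore across versions may fail, depending on how compatible the versions are.
  3. When BSON cannot be used for a cross-version migration, JSON via mongoexport/mongoimport is an alternative. JSON is more portable across versions, but it carries only the data, not indexes, accounts, or other metadata.
  4. mongodump/mongorestore work at the instance, database, and collection level; mongoexport/mongoimport work only at the collection level.
  5. A mongodump dump includes the index definitions (not the index data itself), which mongorestore uses to rebuild the indexes; mongoexport output contains no index information.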
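
Because mongoexport handles one collection at a time, exporting a whole database means running it once per collection. The following Python sketch only builds the command lines; the function name build_export_commands and the output file naming are assumptions made for illustration, and the authentication options used elsewhere in this article are left out for brevity.

```python
from pathlib import PurePosixPath


def build_export_commands(db, collections, out_dir, host="127.0.0.1", port=27017):
    """Return one mongoexport argument list per collection.

    mongoexport works on a single collection, so exporting a whole
    database means invoking it once for each collection name.
    """
    commands = []
    for name in collections:
        # One output file per collection (hypothetical naming scheme).
        out_file = PurePosixPath(out_dir) / f"{name}.json"
        commands.append([
            "mongoexport",
            f"--host={host}",
            f"--port={port}",
            f"--db={db}",
            f"--collection={name}",
            "--type=json",
            f"--out={out_file}",
        ])
    return commands


if __name__ == "__main__":
    # Collection names taken from the test database used in this article.
    for cmd in build_export_commands("cndba", ["user", "ustc"], "/data/mongodb/backup"):
        print(" ".join(cmd))
```

Each returned list can be passed to subprocess.run(cmd, check=True) once the credentials have been added.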

Official documentation:

https://www.mongodb.com/docs/database-tools/mongoexport/
https://www.mongodb.com/docs/database-tools/mongoimport/
https://www.mongodb.com/docs/database-tools/mongodump/
https://www.mongodb.com/docs/database-tools/mongorestore/

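2 mongodump/mongorestore Command Reference


2.1 Notes

mongodump and mongorestore have version compatibility requirements. The tool version used for the dump and for the restore should be the same, for example 100.5.1 on both sides. Version 100.5.1 of the tools currently supports the following MongoDB versions: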

  • MongoDB 5.0
  • MongoDB 4.4
  • MongoDB 4.2
  • MongoDB 4.0

The major database version on the mongodump side and the mongorestore side should also match: a dump taken from 4.4.x should be restored into 4.4.x.
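
A backup script can check this rule before it restores anything. The sketch below compares only the first two components of the server version strings; the function names are hypothetical, and it assumes the versions were already collected from each server, for example with db.version() in the mongo shell.

```python
def release_series(version):
    """Return the (major, minor) pair of a version string such as '4.4.13'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def same_release_series(source_version, target_version):
    """True when both servers belong to the same release series, e.g. 4.4.x."""
    return release_series(source_version) == release_series(target_version)


if __name__ == "__main__":
    # Hypothetical version strings, for illustration only.
    print(same_release_series("4.4.6", "4.4.13"))
    print(same_release_series("4.2.18", "4.4.13"))
```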

mongodump exports only the documents in the database, not the index data. It does record the index definitions, and mongorestore rebuilds the indexes after the import.
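
The recorded index definitions live in the <collection>.metadata.json file that mongodump writes next to each .bson file. As a rough sketch, they can be listed with Python; this assumes the file is a JSON object with an "indexes" array whose entries carry a "name" field, so check a real metadata file from your tool version before relying on it. The function name index_names and the sample content are hypothetical.

```python
import json


def index_names(metadata_text):
    """Return the index names recorded in a <collection>.metadata.json file."""
    metadata = json.loads(metadata_text)
    # Assumption: index definitions are stored under the "indexes" key.
    return [index["name"] for index in metadata.get("indexes", [])]


if __name__ == "__main__":
    # Hypothetical metadata content; a real file sits beside the .bson file,
    # e.g. /data/mongodb/backup/cndba/user.metadata.json.
    sample = '{"indexes": [{"v": 2, "key": {"_id": 1}, "name": "_id_"}]}'
    print(index_names(sample))
```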

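When the mongod instance uses the WiredTiger storage engine, mongodump writes uncompressed data; pass --gzip to compress the output.

One more caveat: starting with MongoDB 4.2, mongodump and mongorestore cannot be part of a backup strategy for sharded clusters that have sharded transactions in progress, because backups created with mongodump do not maintain the atomicity guarantees of transactions across shards.

A consistent backup of a sharded cluster requires MongoDB Enterprise or a third-party tool, for example: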

https://github.com/Percona-Lab/mongodb_consistent_backup


2.2 Command Help


[dave@www.cndba.cn_1 bin]# mongodump --help
Usage:
  mongodump <options> <connection-string>

Export the content of a running server into .bson files.

Specify a database with -d and a collection with -c to only dump that database or collection.

Connection strings must begin with mongodb:// or mongodb+srv://.

See http://docs.mongodb.com/database-tools/mongodump/ for more information.

general options:
      --help                                                print usage
      --version                                             print the tool version and exit
      --config=                                             path to a configuration file

verbosity options:
  -v, --verbose=<level>                                     more detailed log output (include multiple times for more verbosity, e.g. -vvvvv, or
                                                            specify a numeric value, e.g. --verbose=N)
      --quiet                                               hide all log output

connection options:
  -h, --host=<hostname>                                     mongodb host to connect to (setname/host1,host2 for replica sets)
      --port=<port>                                         server port (can also use --host hostname:port)

ssl options:
      --ssl                                                 connect to a mongod or mongos that has ssl enabled
      --sslCAFile=<filename>                                the .pem file containing the root certificate chain from the certificate authority
      --sslPEMKeyFile=<filename>                            the .pem file containing the certificate and key
      --sslPEMKeyPassword=<password>                        the password to decrypt the sslPEMKeyFile, if necessary
      --sslCRLFile=<filename>                               the .pem file containing the certificate revocation list
      --sslFIPSMode                                         use FIPS mode of the installed openssl library
      --tlsInsecure                                         bypass the validation for server's certificate chain and host name

authentication options:
  -u, --username=<username>                                 username for authentication
  -p, --password=<password>                                 password for authentication
      --authenticationDatabase=<database-name>              database that holds the user's credentials
      --authenticationMechanism=<mechanism>                 authentication mechanism to use
      --awsSessionToken=<aws-session-token>                 session token to authenticate via AWS IAM

kerberos options:
      --gssapiServiceName=<service-name>                    service name to use when authenticating using GSSAPI/Kerberos (default: mongodb)
      --gssapiHostName=<host-name>                          hostname to use when authenticating using GSSAPI/Kerberos (default: <remote server's
                                                            address>)

namespace options:
  -d, --db=<database-name>                                  database to use
  -c, --collection=<collection-name>                        collection to use

uri options:
      --uri=mongodb-uri                                     mongodb uri connection string

query options:
  -q, --query=                                              query filter, as a v2 Extended JSON string, e.g., '{"x":{"$gt":1}}'
      --queryFile=                                          path to a file containing a query filter (v2 Extended JSON)
      --readPreference=<string>|<json>                      specify either a preference mode (e.g. 'nearest') or a preference json object (e.g.
                                                            '{mode: "nearest", tagSets: [{a: "b"}], maxStalenessSeconds: 123}')
      --forceTableScan                                      force a table scan (do not use $snapshot or hint _id). Deprecated since this is
                                                            default behavior on WiredTiger

output options:
  -o, --out=<directory-path>                                output directory, or '-' for stdout (default: 'dump')
      --gzip                                                compress archive or collection output with Gzip
      --oplog                                               use oplog for taking a point-in-time snapshot
      --archive=<file-path>                                 dump as an archive to the specified path. If flag is specified without a value,
                                                            archive is written to stdout
      --dumpDbUsersAndRoles                                 dump user and role definitions for the specified database
      --excludeCollection=<collection-name>                 collection to exclude from the dump (may be specified multiple times to exclude
                                                            additional collections)
      --excludeCollectionsWithPrefix=<collection-prefix>    exclude all collections from the dump that have the given prefix (may be specified
                                                            multiple times to exclude additional prefixes)
  -j, --numParallelCollections=                             number of collections to dump in parallel
      --viewsAsCollections                                  dump views as normal collections with their produced data, omitting standard
                                                            collections
[dave@www.cndba.cn_1 bin]# mongorestore --help
Usage:
  mongorestore <options> <connection-string> <directory or file to restore>

Restore backups generated with mongodump to a running server.

Specify a database with -d to restore a single database from the target directory,
or use -d and -c to restore a single collection from a single .bson file.

Connection strings must begin with mongodb:// or mongodb+srv://.

See http://docs.mongodb.com/database-tools/mongorestore/ for more information.

general options:
      --help                                                print usage
      --version                                             print the tool version and exit
      --config=                                             path to a configuration file

verbosity options:
  -v, --verbose=<level>                                     more detailed log output (include multiple times for more verbosity, e.g. -vvvvv, or
                                                            specify a numeric value, e.g. --verbose=N)
      --quiet                                               hide all log output

connection options:
  -h, --host=<hostname>                                     mongodb host to connect to (setname/host1,host2 for replica sets)
      --port=<port>                                         server port (can also use --host hostname:port)

ssl options:
      --ssl                                                 connect to a mongod or mongos that has ssl enabled
      --sslCAFile=<filename>                                the .pem file containing the root certificate chain from the certificate authority
      --sslPEMKeyFile=<filename>                            the .pem file containing the certificate and key
      --sslPEMKeyPassword=<password>                        the password to decrypt the sslPEMKeyFile, if necessary
      --sslCRLFile=<filename>                               the .pem file containing the certificate revocation list
      --sslFIPSMode                                         use FIPS mode of the installed openssl library
      --tlsInsecure                                         bypass the validation for server's certificate chain and host name

authentication options:
  -u, --username=<username>                                 username for authentication
  -p, --password=<password>                                 password for authentication
      --authenticationDatabase=<database-name>              database that holds the user's credentials
      --authenticationMechanism=<mechanism>                 authentication mechanism to use
      --awsSessionToken=<aws-session-token>                 session token to authenticate via AWS IAM

kerberos options:
      --gssapiServiceName=<service-name>                    service name to use when authenticating using GSSAPI/Kerberos (default: mongodb)
      --gssapiHostName=<host-name>                          hostname to use when authenticating using GSSAPI/Kerberos (default: <remote server's
                                                            address>)

namespace options:
  -d, --db=<database-name>                                  database to use
  -c, --collection=<collection-name>                        collection to use

uri options:
      --uri=mongodb-uri                                     mongodb uri connection string

namespace options:
      --excludeCollection=<collection-name>                 DEPRECATED; collection to skip over during restore (may be specified multiple times to
                                                            exclude additional collections)
      --excludeCollectionsWithPrefix=<collection-prefix>    DEPRECATED; collections to skip over during restore that have the given prefix (may be
                                                            specified multiple times to exclude additional prefixes)
      --nsExclude=<namespace-pattern>                       exclude matching namespaces
      --nsInclude=<namespace-pattern>                       include matching namespaces
      --nsFrom=<namespace-pattern>                          rename matching namespaces, must have matching nsTo
      --nsTo=<namespace-pattern>                            rename matched namespaces, must have matching nsFrom

input options:
      --objcheck                                            validate all objects before inserting
      --oplogReplay                                         replay oplog for point-in-time restore
      --oplogLimit=<seconds>[:ordinal]                      only include oplog entries before the provided Timestamp
      --oplogFile=<filename>                                oplog file to use for replay of oplog
      --archive=<filename>                                  restore dump from the specified archive file.  If flag is specified without a value,
                                                            archive is read from stdin
      --restoreDbUsersAndRoles                              restore user and role definitions for the given database
      --dir=<directory-name>                                input directory, use '-' for stdin
      --gzip                                                decompress gzipped input

restore options:
      --drop                                                drop each collection before import
      --dryRun                                              view summary without importing anything. recommended with verbosity
      --writeConcern=<write-concern>                        write concern options e.g. --writeConcern majority, --writeConcern '{w: 3, wtimeout:
                                                            500, fsync: true, j: true}'
      --noIndexRestore                                      don't restore indexes
      --convertLegacyIndexes                                Removes invalid index options and rewrites legacy option values (e.g. true becomes 1).
      --noOptionsRestore                                    don't restore collection options
      --keepIndexVersion                                    don't update index version
      --maintainInsertionOrder                              restore the documents in the order of their appearance in the input source. By default
                                                            the insertions will be performed in an arbitrary order. Setting this flag also enables
                                                            the behavior of --stopOnError and restricts NumInsertionWorkersPerCollection to 1.
  -j, --numParallelCollections=                             number of collections to restore in parallel
      --numInsertionWorkersPerCollection=                   number of insert operations to run concurrently per collection
      --stopOnError                                         halt after encountering any error during insertion. By default, mongorestore will
                                                            attempt to continue through document validation and DuplicateKey errors, but with this
                                                            option enabled, the tool will stop instead. A small number of documents may be
                                                            inserted after encountering an error even with this option enabled; use
                                                            --maintainInsertionOrder to halt immediately after an error
      --bypassDocumentValidation                            bypass document validation
      --preserveUUID                                        preserve original collection UUIDs (off by default, requires drop)
      --fixDottedHashIndex                                  when enabled, all the hashed indexes on dotted fields will be created as single field
                                                            ascending indexes on the destination
[dave@www.cndba.cn_1 bin]#

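2.3 Usage Examples


Note that repeated backups produce files with the same names, so a later backup overwrites an earlier one. Use a script that writes each backup to a different directory.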
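
One simple way to avoid overwriting earlier backups is to put a timestamp in the output path. This is a minimal sketch rather than a complete backup script: the directory layout and both function names are assumptions, and the authentication options are omitted.

```python
from datetime import datetime
from pathlib import PurePosixPath


def timestamped_backup_dir(base_dir, now=None):
    """Return base_dir/YYYYmmdd_HHMMSS so each run gets its own directory."""
    now = now or datetime.now()
    return str(PurePosixPath(base_dir) / now.strftime("%Y%m%d_%H%M%S"))


def build_dump_command(base_dir, now=None, host="127.0.0.1", port=27017):
    """Build a mongodump command that writes into a fresh directory."""
    return [
        "mongodump",
        f"--host={host}",
        f"--port={port}",
        f"--out={timestamped_backup_dir(base_dir, now)}",
    ]


if __name__ == "__main__":
    print(" ".join(build_dump_command("/data/mongodb/backup")))
```

The resulting list can be handed to subprocess.run(cmd, check=True) from a cron job or a similar scheduler.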

2.3.1 Full Backup + oplog

Use --db to specify the database to back up; if it is omitted, all databases in the instance are backed up.

To back up the oplog as well, do not specify --db; otherwise the following error is reported:

[dave@www.cndba.cn_2 ~]# mongodump --username=root --password=root --host=127.0.0.1 --port=27017 --authenticationDatabase admin --db=cndba --oplog --out /data/mongodb/backup
2022-05-04T11:17:14.625+0800    Failed: bad option: --oplog mode only supported on full dumps
[dave@www.cndba.cn_2 ~]#
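
A wrapper script can reject this combination before mongodump is even started. The following sketch shows one way to do it; build_mongodump_args is a hypothetical helper, and the check simply mirrors the failure shown in the transcript above.

```python
def build_mongodump_args(out_dir, db=None, oplog=False):
    """Build mongodump arguments, refusing the --db plus --oplog combination."""
    if oplog and db is not None:
        # Mirrors the tool's "--oplog mode only supported on full dumps" failure.
        raise ValueError("--oplog is only supported on full dumps; drop --db")
    args = ["mongodump", f"--out={out_dir}"]
    if db is not None:
        args.append(f"--db={db}")
    if oplog:
        args.append("--oplog")
    return args


if __name__ == "__main__":
    print(" ".join(build_mongodump_args("/data/mongodb/backup", oplog=True)))
```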


Without --db, the full backup with --oplog succeeds:

[dave@www.cndba.cn_2 ~]# mongodump --username=root --password=root --host=127.0.0.1 --port=27017 --authenticationDatabase admin --oplog --out /data/mongodb/backup
2022-05-04T11:14:46.714+0800    writing admin.system.users to /data/mongodb/backup/admin/system.users.bson
2022-05-04T11:14:46.715+0800    done dumping admin.system.users (2 documents)
2022-05-04T11:14:46.715+0800    writing admin.system.version to /data/mongodb/backup/admin/system.version.bson
2022-05-04T11:14:46.718+0800    done dumping admin.system.version (2 documents)
2022-05-04T11:14:46.720+0800    writing cndba.user to /data/mongodb/backup/cndba/user.bson
2022-05-04T11:14:46.721+0800    writing cndba.ustc to /data/mongodb/backup/cndba/ustc.bson
2022-05-04T11:14:46.724+0800    done dumping cndba.ustc (2 documents)
2022-05-04T11:14:46.828+0800    done dumping cndba.user (100000 documents)
2022-05-04T11:14:46.829+0800    writing captured oplog to
2022-05-04T11:14:46.829+0800            dumped 1 oplog entry
[dave@www.cndba.cn_2 ~]#


[dave@www.cndba.cn_2 ~]# mongorestore --username=root --password=root --host=127.0.0.1 --port=27017 --authenticationDatabase admin --drop --oplogReplay /data/mongodb/backup
2022-05-04T11:19:12.382+0800    preparing collections to restore from
2022-05-04T11:19:12.382+0800    don't know what to do with file "/data/mongodb/backup/user.csv", skipping...
2022-05-04T11:19:12.382+0800    don't know what to do with file "/data/mongodb/backup/user.json", skipping...
2022-05-04T11:19:12.384+0800    reading metadata for cndba.user from /data/mongodb/backup/cndba/user.metadata.json
2022-05-04T11:19:12.384+0800    reading metadata for cndba.ustc from /data/mongodb/backup/cndba/ustc.metadata.json
2022-05-04T11:19:12.384+0800    dropping collection cndba.user before restoring
2022-05-04T11:19:12.384+0800    dropping collection cndba.ustc before restoring
2022-05-04T11:19:12.426+0800    restoring cndba.user from /data/mongodb/backup/cndba/user.bson
2022-05-04T11:19:12.440+0800    restoring cndba.ustc from /data/mongodb/backup/cndba/ustc.bson
2022-05-04T11:19:12.481+0800    finished restoring cndba.ustc (2 documents, 0 failures)

2022-05-04T11:19:15.380+0800    finished restoring cndba.user (100000 documents, 0 failures)
2022-05-04T11:19:15.380+0800    restoring users from /data/mongodb/backup/admin/system.users.bson
2022-05-04T11:19:15.450+0800    replaying oplog
2022-05-04T11:19:15.450+0800    applied 0 oplog entries
2022-05-04T11:19:15.450+0800    no indexes to restore for collection cndba.user
2022-05-04T11:19:15.450+0800    no indexes to restore for collection cndba.ustc
2022-05-04T11:19:15.450+0800    100002 document(s) restored successfully. 0 document(s) failed to restore.
[dave@www.cndba.cn_2 ~]#
[dave@www.cndba.cn_2 ~]#

2.3.2 Backing Up a Single Collection (only full backups support --oplog)

[dave@www.cndba.cn_2 ~]# mongodump --username=cndba --password=cndba --host=127.0.0.1 --port=27017 --authenticationDatabase cndba  --db=cndba --collection=user --out /data/mongodb/backup
2022-05-04T11:21:29.579+0800    writing cndba.user to /data/mongodb/backup/cndba/user.bson
2022-05-04T11:21:29.685+0800    done dumping cndba.user (100000 documents)


[dave@www.cndba.cn_2 backup]# mongorestore --username=cndba --password=cndba --host=127.0.0.1 --port=27017 --authenticationDatabase cndba --drop --db=cndba --collection=user /data/mongodb/backup/cndba/user.bson
2022-05-04T11:22:26.226+0800    checking for collection data in /data/mongodb/backup/cndba/user.bson
2022-05-04T11:22:26.226+0800    reading metadata for cndba.user from /data/mongodb/backup/cndba/user.metadata.json
2022-05-04T11:22:26.226+0800    dropping collection cndba.user before restoring
2022-05-04T11:22:26.253+0800    restoring cndba.user from /data/mongodb/backup/cndba/user.bson
2022-05-04T11:22:27.795+0800    finished restoring cndba.user (100000 documents, 0 failures)
2022-05-04T11:22:27.795+0800    no indexes to restore for collection cndba.user
2022-05-04T11:22:27.795+0800    100000 document(s) restored successfully. 0 document(s) failed to restore.
[dave@www.cndba.cn_2 backup]#
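
The restore above points mongorestore at a single .bson file. The --nsInclude option listed in the help output offers another way to pick one collection out of a whole dump directory. The sketch below only assembles such a command; build_restore_command is a hypothetical helper and the authentication options are again omitted.

```python
def build_restore_command(dump_dir, db, collection, drop=True):
    """Build a mongorestore command limited to one namespace via --nsInclude."""
    args = ["mongorestore", f"--nsInclude={db}.{collection}"]
    if drop:
        # Drop the target collection before restoring, as in the run above.
        args.append("--drop")
    # The top-level dump directory, not the per-database subdirectory.
    args.append(dump_dir)
    return args


if __name__ == "__main__":
    print(" ".join(build_restore_command("/data/mongodb/backup", "cndba", "user")))
```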

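3 mongoexport/mongoimport Command Reference


3.1 Notes

mongoexport and mongoimport have version compatibility requirements. The tool version used for the export and for the import should be the same, for example 100.5.1 on both sides. Version 100.5.1 of the tools currently supports the following MongoDB versions: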

  • MongoDB 5.0
  • MongoDB 4.4
  • MongoDB 4.2
  • MongoDB 4.0

3.2 Command Help

[dave@www.cndba.cn_1 bin]# mongoexport --help
Usage:
  mongoexport <options> <connection-string>

Export data from MongoDB in CSV or JSON format.

Connection strings must begin with mongodb:// or mongodb+srv://.

See http://docs.mongodb.com/database-tools/mongoexport/ for more information.

general options:
      --help                                      print usage
      --version                                   print the tool version and exit
      --config=                                   path to a configuration file

verbosity options:
  -v, --verbose=<level>                           more detailed log output (include multiple times for more verbosity, e.g. -vvvvv, or specify a
                                                  numeric value, e.g. --verbose=N)
      --quiet                                     hide all log output

connection options:
  -h, --host=<hostname>                           mongodb host to connect to (setname/host1,host2 for replica sets)
      --port=<port>                               server port (can also use --host hostname:port)

ssl options:
      --ssl                                       connect to a mongod or mongos that has ssl enabled
      --sslCAFile=<filename>                      the .pem file containing the root certificate chain from the certificate authority
      --sslPEMKeyFile=<filename>                  the .pem file containing the certificate and key
      --sslPEMKeyPassword=<password>              the password to decrypt the sslPEMKeyFile, if necessary
      --sslCRLFile=<filename>                     the .pem file containing the certificate revocation list
      --sslFIPSMode                               use FIPS mode of the installed openssl library
      --tlsInsecure                               bypass the validation for server's certificate chain and host name

authentication options:
  -u, --username=<username>                       username for authentication
  -p, --password=<password>                       password for authentication
      --authenticationDatabase=<database-name>    database that holds the user's credentials
      --authenticationMechanism=<mechanism>       authentication mechanism to use
      --awsSessionToken=<aws-session-token>       session token to authenticate via AWS IAM

kerberos options:
      --gssapiServiceName=<service-name>          service name to use when authenticating using GSSAPI/Kerberos (default: mongodb)
      --gssapiHostName=<host-name>                hostname to use when authenticating using GSSAPI/Kerberos (default: <remote server's address>)

namespace options:
  -d, --db=<database-name>                        database to use
  -c, --collection=<collection-name>              collection to use

uri options:
      --uri=mongodb-uri                           mongodb uri connection string

output options:
  -f, --fields=<field>[,<field>]*                 comma separated list of field names (required for exporting CSV) e.g. -f "name,age"
      --fieldFile=<filename>                      file with field names - 1 per line
      --type=<type>                               the output format, either json or csv
  -o, --out=<filename>                            output file; if not specified, stdout is used
      --jsonArray                                 output to a JSON array rather than one object per line
      --pretty                                    output JSON formatted to be human-readable
      --noHeaderLine                              export CSV data without a list of field names at the first line
      --jsonFormat=<type>                         the extended JSON format to output, either canonical or relaxed (defaults to 'relaxed')
                                                  (default: relaxed)

querying options:
  -q, --query=<json>                              query filter, as a JSON string, e.g., '{x:{$gt:1}}'
      --queryFile=<filename>                      path to a file containing a query filter (JSON)
      --readPreference=<string>|<json>            specify either a preference mode (e.g. 'nearest') or a preference json object (e.g. '{mode:
                                                  "nearest", tagSets: [{a: "b"}], maxStalenessSeconds: 123}')
      --forceTableScan                            force a table scan (do not use $snapshot or hint _id). Deprecated since this is default behavior
                                                  on WiredTiger
      --skip=<count>                              number of documents to skip
      --limit=<count>                             limit the number of documents to export
      --sort=<json>                               sort order, as a JSON string, e.g. '{x:1}'
      --assertExists                              if specified, export fails if the collection does not exist
[dave@www.cndba.cn_1 bin]# mongoimport --help
Usage:
  mongoimport <options> <connection-string> <file>

Import CSV, TSV or JSON data into MongoDB. If no file is provided, mongoimport reads from stdin.

Connection strings must begin with mongodb:// or mongodb+srv://.

See http://docs.mongodb.com/database-tools/mongoimport/ for more information.

general options:
      --help                                      print usage
      --version                                   print the tool version and exit
      --config=                                   path to a configuration file

verbosity options:
  -v, --verbose=<level>                           more detailed log output (include multiple times for more verbosity, e.g. -vvvvv, or specify a
                                                  numeric value, e.g. --verbose=N)
      --quiet                                     hide all log output

connection options:
  -h, --host=<hostname>                           mongodb host to connect to (setname/host1,host2 for replica sets)
      --port=<port>                               server port (can also use --host hostname:port)

ssl options:
      --ssl                                       connect to a mongod or mongos that has ssl enabled
      --sslCAFile=<filename>                      the .pem file containing the root certificate chain from the certificate authority
      --sslPEMKeyFile=<filename>                  the .pem file containing the certificate and key
      --sslPEMKeyPassword=<password>              the password to decrypt the sslPEMKeyFile, if necessary
      --sslCRLFile=<filename>                     the .pem file containing the certificate revocation list
      --sslFIPSMode                               use FIPS mode of the installed openssl library
      --tlsInsecure                               bypass the validation for server's certificate chain and host name

authentication options:
  -u, --username=<username>                       username for authentication
  -p, --password=<password>                       password for authentication
      --authenticationDatabase=<database-name>    database that holds the user's credentials
      --authenticationMechanism=<mechanism>       authentication mechanism to use
      --awsSessionToken=<aws-session-token>       session token to authenticate via AWS IAM

kerberos options:
      --gssapiServiceName=<service-name>          service name to use when authenticating using GSSAPI/Kerberos (default: mongodb)
      --gssapiHostName=<host-name>                hostname to use when authenticating using GSSAPI/Kerberos (default: <remote server's address>)

namespace options:
  -d, --db=<database-name>                        database to use
  -c, --collection=<collection-name>              collection to use

uri options:
      --uri=mongodb-uri                           mongodb uri connection string

input options:
  -f, --fields=<field>[,<field>]*                 comma separated list of fields, e.g. -f name,age
      --fieldFile=<filename>                      file with field names - 1 per line
      --file=<filename>                           file to import from; if not specified, stdin is used
      --headerline                                use first line in input source as the field list (CSV and TSV only)
      --jsonArray                                 treat input source as a JSON array
      --parseGrace=<grace>                        controls behavior when type coercion fails - one of: autoCast, skipField, skipRow, stop
                                                  (default: stop)
      --type=<type>                               input format to import: json, csv, or tsv
      --columnsHaveTypes                          indicates that the field list (from --fields, --fieldsFile, or --headerline) specifies types;
                                                  They must be in the form of '<colName>.<type>(<arg>)'. The type can be one of: auto, binary,
                                                  boolean, date, date_go, date_ms, date_oracle, decimal, double, int32, int64, string. For each of
                                                  the date types, the argument is a datetime layout string. For the binary type, the argument can
                                                  be one of: base32, base64, hex. All other types take an empty argument. Only valid for CSV and
                                                  TSV imports. e.g. zipcode.string(), thumbnail.binary(base64)
      --legacy                                    use the legacy extended JSON format
      --useArrayIndexFields                       indicates that field names may include array indexes that should be used to construct arrays
                                                  during import (e.g. foo.0,foo.1). Indexes must start from 0 and increase sequentially
                                                  (foo.1,foo.0 would fail).

ingest options:
      --drop                                      drop collection before inserting documents
      --ignoreBlanks                              ignore fields with empty values in CSV and TSV
      --maintainInsertionOrder                    insert the documents in the order of their appearance in the input source. By default the
                                                  insertions will be performed in an arbitrary order. Setting this flag also enables the behavior
                                                  of --stopOnError and restricts NumInsertionWorkers to 1.
  -j, --numInsertionWorkers=<number>              number of insert operations to run concurrently
      --stopOnError                               halt after encountering any error during importing. By default, mongoimport will attempt to
                                                  continue through document validation and DuplicateKey errors, but with this option enabled, the
                                                  tool will stop instead. A small number of documents may be inserted after encountering an error
                                                  even with this option enabled; use --maintainInsertionOrder to halt immediately after an error
      --mode=[insert|upsert|merge|delete]         insert: insert only, skips matching documents. upsert: insert new documents or replace existing
                                                  documents. merge: insert new documents or modify existing documents. delete: deletes matching
                                                  documents only. If upsert fields match more than one document, only one document is deleted.
                                                  (default: insert)
      --upsertFields=<field>[,<field>]*           comma-separated fields for the query part when --mode is set to upsert or merge
      --writeConcern=<write-concern-specifier>    write concern options e.g. --writeConcern majority, --writeConcern '{w: 3, wtimeout: 500, fsync:
                                                  true, j: true}'
      --bypassDocumentValidation                  bypass document validation
[dave@www.cndba.cn_1 bin]#
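
The four --mode values are easier to compare side by side. The following is a minimal Python sketch that models them on an in-memory dictionary keyed by _id. It only illustrates the behavior described in the help text above and is not how mongoimport is implemented. The function name apply_mode and the sample documents (including the city field) are hypothetical, and the sketch assumes matching on _id only, which is what mongoimport does when --upsertFields is not given.

```python
def apply_mode(collection, incoming, mode="insert"):
    """Model mongoimport --mode on a dict that maps _id -> document."""
    # Work on a copy so the caller's data is left untouched.
    result = {key: dict(doc) for key, doc in collection.items()}
    for doc in incoming:
        key = doc["_id"]
        if mode == "insert":
            # Insert only; documents whose _id already exists are skipped.
            if key not in result:
                result[key] = dict(doc)
        elif mode == "upsert":
            # Insert new documents or replace existing ones entirely.
            result[key] = dict(doc)
        elif mode == "merge":
            # Insert new documents or update only the supplied top-level fields.
            merged = dict(result.get(key, {}))
            merged.update(doc)
            result[key] = merged
        elif mode == "delete":
            # Remove matching documents; non-matching input is ignored.
            result.pop(key, None)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return result


# Hypothetical sample data, for illustration only.
existing = {1: {"_id": 1, "name": "cndba", "city": "hefei"}}
incoming = [{"_id": 1, "name": "dave"}, {"_id": 2, "name": "new"}]

for mode in ("insert", "upsert", "merge", "delete"):
    print(mode, apply_mode(existing, incoming, mode))
```

In this model, insert keeps the existing document 1 unchanged, upsert replaces it (dropping city), merge changes only name, and delete removes it.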

3.3 Usage Examples

The tests below run on a 3-node replica set:

rs0:PRIMARY> show dbs
cndba  0.000GB
rs0:PRIMARY> for(var i=1;i<=100000;i++){db.user.save({_id:i,'name':'cndba'})};
WriteResult({ "nMatched" : 0, "nUpserted" : 1, "nModified" : 0, "_id" : 100000 })
rs0:PRIMARY> show dbs
cndba  0.003GB
rs0:PRIMARY>

3.3.1 JSON Format

The export can be run on either the primary or a secondary node:

[dave@www.cndba.cn_1 ~]# mongoexport --username=cndba --password=cndba  --host=127.0.0.1 --port=27017 --authenticationDatabase cndba --type=json --db=cndba --collection=user --out /data/mongodb/backup/user.json
2022-05-04T10:51:59.055+0800    connected to: mongodb://127.0.0.1:27017/
2022-05-04T10:51:59.634+0800    exported 100000 records
[dave@www.cndba.cn_1 ~]#

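The import must run on the primary node, and when the collection already contains the exported documents it also needs --drop. Otherwise the errors below are reported: the first comes from running the import on a secondary node, the second from importing without --drop, in which case every document whose _id already exists is reported as a duplicate key error and skipped: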

[dave@www.cndba.cn_1 ~]# mongoimport --username=cndba --password=cndba  --host=127.0.0.1 --port=27017 --authenticationDatabase cndba --type=json --db=cndba --collection=user --file /data/mongodb/backup/user.json
2022-05-04T10:53:37.597+0800    connected to: mongodb://127.0.0.1:27017/
2022-05-04T10:53:37.608+0800    Failed: (NotWritablePrimary) not master
2022-05-04T10:53:37.608+0800    0 document(s) imported successfully. 0 document(s) failed to import.
[dave@www.cndba.cn_1 ~]#


2022-05-04T10:58:52.527+0800    continuing through error: E11000 duplicate key error collection: cndba.user index: _id_ dup key: { _id: 90997.0 }

[dave@www.cndba.cn_2 ~]# mongoimport --username=cndba --password=cndba  --host=127.0.0.1 --port=27017 --authenticationDatabase cndba --type=json --drop --db=cndba --collection=user --file /data/mongodb/backup/user.json
2022-05-04T11:00:11.678+0800    connected to: mongodb://127.0.0.1:27017/
2022-05-04T11:00:11.679+0800    dropping: cndba.user
2022-05-04T11:00:14.679+0800    [###################.....] cndba.user   2.44MB/2.95MB (83.0%)
2022-05-04T11:00:15.230+0800    [########################] cndba.user   2.95MB/2.95MB (100.0%)
2022-05-04T11:00:15.230+0800    100000 document(s) imported successfully. 0 document(s) failed to import.
[dave@www.cndba.cn_2 ~]#
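
Before importing a file with --drop, it can help to confirm that the export is complete, since the existing collection is removed first. The following is a minimal Python sketch, assuming the export holds one JSON document per line (mongoexport's default when --jsonArray is not used). The helper name summarize_export and the sample lines are hypothetical.

```python
import json


def summarize_export(lines):
    """Return (document count, number of distinct _id values) for JSON-lines input."""
    count = 0
    ids = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        doc = json.loads(line)
        count += 1
        # _id may be a number or an Extended JSON wrapper such as {"$oid": "..."},
        # so serialize it to obtain a hashable key.
        ids.add(json.dumps(doc.get("_id"), sort_keys=True))
    return count, len(ids)


# Hypothetical sample shaped like the exported user.json.
sample = [
    '{"_id":1,"name":"cndba"}',
    '{"_id":2,"name":"cndba"}',
    '{"_id":2,"name":"cndba"}',
]
print(summarize_export(sample))
```

To check a real file, pass an open file object, for example the one returned by open("/data/mongodb/backup/user.json"). Two equal numbers mean that every document has a distinct _id.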

3.3.2 CSV Format

A CSV export must specify the fields to export, for example --fields=_id,name. Without a field list the export fails:

[dave@www.cndba.cn_2 ~]# mongoexport --username=cndba --password=cndba  --host=127.0.0.1 --port=27017 --authenticationDatabase cndba --type=csv --db=cndba --collection=user --out /data/mongodb/backup/user.csv
2022-05-04T11:05:25.387+0800    connected to: mongodb://127.0.0.1:27017/
2022-05-04T11:05:25.388+0800    Failed: CSV mode requires a field list
[dave@www.cndba.cn_2 ~]#
[dave@www.cndba.cn_2 ~]# mongoexport --username=cndba --password=cndba  --host=127.0.0.1 --port=27017 --authenticationDatabase cndba --type=csv --db=cndba --collection=user --fields=_id,name --out /data/mongodb/backup/user.csv
2022-05-04T11:08:02.471+0800    connected to: mongodb://127.0.0.1:27017/
2022-05-04T11:08:02.832+0800    exported 100000 records
[dave@www.cndba.cn_2 ~]#

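A CSV import also needs to know the field names. Because the exported file starts with a header row, add the --headerline option. The official documentation describes it as follows:

If using --type csv or --type tsv, uses the first line as field names. Otherwise, mongoimport will import the first line as a distinct document.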

[dave@www.cndba.cn_2 ~]# mongoimport --username=cndba --password=cndba  --host=127.0.0.1 --port=27017 --authenticationDatabase cndba --type=csv --drop --db=cndba --collection=user --headerline  --file /data/mongodb/backup/user.csv
2022-05-04T11:10:02.676+0800    connected to: mongodb://127.0.0.1:27017/
2022-05-04T11:10:02.677+0800    dropping: cndba.user
2022-05-04T11:10:05.677+0800    [####################....] cndba.user   1012KB/1.13MB (87.2%)
2022-05-04T11:10:06.051+0800    [########################] cndba.user   1.13MB/1.13MB (100.0%)
2022-05-04T11:10:06.051+0800    100000 document(s) imported successfully. 0 document(s) failed to import.
[dave@www.cndba.cn_2 ~]#
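
The effect of --headerline can be illustrated without a database. The sketch below uses Python's csv module on a small in-memory sample shaped like user.csv. It only mimics the two ways of interpreting the first line and is not how mongoimport is implemented; the sample data is hypothetical, and unlike mongoimport the csv module leaves all values as strings.

```python
import csv
import io

# Hypothetical sample with the same layout as the exported user.csv.
sample = "_id,name\n1,cndba\n2,cndba\n"

# Comparable to --headerline: the first line supplies the field names.
with_header = list(csv.DictReader(io.StringIO(sample)))

# Comparable to passing the field names explicitly without --headerline:
# every line, including the header row, is treated as data.
without_header = list(
    csv.DictReader(io.StringIO(sample), fieldnames=["_id", "name"])
)

print(len(with_header), "rows when the first line is a header")
print(len(without_header), "rows when the first line is data")
print(dict(without_header[0]))
```

In the second case the header row itself becomes a record, which matches the documentation's warning that the first line would be imported as a distinct document.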

4 The bsondump Tool


bsondump converts BSON files into JSON.

Official documentation:
https://www.mongodb.com/docs/database-tools/bsondump/

Here we test directly with the backup file created earlier by mongodump:

/data/mongodb/backup/cndba/user.bson


[dave@www.cndba.cn_2 backup]# bsondump --outFile=/data/mongodb/backup/cndba/dave.json /data/mongodb/backup/cndba/user.bson
2022-05-04T11:29:16.817+0800    100000 objects found
[dave@www.cndba.cn_2 backup]# head -30 /data/mongodb/backup/cndba/dave.json
{"_id":{"$numberInt":"2"},"name":"cndba"}
{"_id":{"$numberInt":"1"},"name":"cndba"}
{"_id":{"$numberInt":"4"},"name":"cndba"}
{"_id":{"$numberInt":"12"},"name":"cndba"}
{"_id":{"$numberInt":"20"},"name":"cndba"}
{"_id":{"$numberInt":"25"},"name":"cndba"}
{"_id":{"$numberInt":"34"},"name":"cndba"}
{"_id":{"$numberInt":"14"},"name":"cndba"}
{"_id":{"$numberInt":"40"},"name":"cndba"}
{"_id":{"$numberInt":"35"},"name":"cndba"}
{"_id":{"$numberInt":"42"},"name":"cndba"}
{"_id":{"$numberInt":"45"},"name":"cndba"}
{"_id":{"$numberInt":"53"},"name":"cndba"}
{"_id":{"$numberInt":"60"},"name":"cndba"}
{"_id":{"$numberInt":"62"},"name":"cndba"}
{"_id":{"$numberInt":"66"},"name":"cndba"}
{"_id":{"$numberInt":"68"},"name":"cndba"}
{"_id":{"$numberInt":"8"},"name":"cndba"}
{"_id":{"$numberInt":"11"},"name":"cndba"}
{"_id":{"$numberInt":"16"},"name":"cndba"}
{"_id":{"$numberInt":"30"},"name":"cndba"}
{"_id":{"$numberInt":"36"},"name":"cndba"}
{"_id":{"$numberInt":"15"},"name":"cndba"}
{"_id":{"$numberInt":"37"},"name":"cndba"}
{"_id":{"$numberInt":"9"},"name":"cndba"}
{"_id":{"$numberInt":"41"},"name":"cndba"}
{"_id":{"$numberInt":"57"},"name":"cndba"}
{"_id":{"$numberInt":"59"},"name":"cndba"}
{"_id":{"$numberInt":"64"},"name":"cndba"}
{"_id":{"$numberInt":"10"},"name":"cndba"}
[dave@www.cndba.cn_2 backup]#
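
In the output above each _id appears as {"$numberInt":"2"} rather than a bare number, because the values are written as Extended JSON type wrappers. When the file is only needed for inspection or ad hoc processing, the numeric wrappers can be unwrapped with a few lines of Python. The following is a minimal sketch, assuming only the $numberInt, $numberLong and $numberDouble wrappers matter; the helper name unwrap is hypothetical, and other wrappers such as $oid or $date are left untouched.

```python
import json

# Numeric Extended JSON wrappers handled by this sketch.
_NUMERIC_WRAPPERS = {
    "$numberInt": int,
    "$numberLong": int,
    "$numberDouble": float,
}


def unwrap(value):
    """Recursively replace numeric Extended JSON wrappers with plain numbers."""
    if isinstance(value, dict):
        if len(value) == 1:
            (key, inner), = value.items()
            if key in _NUMERIC_WRAPPERS:
                return _NUMERIC_WRAPPERS[key](inner)
        return {k: unwrap(v) for k, v in value.items()}
    if isinstance(value, list):
        return [unwrap(v) for v in value]
    return value


line = '{"_id":{"$numberInt":"2"},"name":"cndba"}'
print(unwrap(json.loads(line)))  # {'_id': 2, 'name': 'cndba'}
```

For anything beyond quick inspection, a library that implements the full Extended JSON specification, such as the bson.json_util module shipped with PyMongo, is the safer choice.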

Copyright notice: This is an original article by the author and may not be reproduced without the author's permission.
