cbmigrate

    Use the cbmigrate command-line tool to migrate your data from other platforms.

    Description

    The cbmigrate tool migrates your existing data to Couchbase Server from the following platforms: MongoDB, DynamoDB, and Hugging Face.

    Installation

    1. Download the latest version of the cbmigrate package from its GitHub repository.

    2. Unpack the downloaded package to its own directory.

    3. Execute the tool by running the following from the command line:

      $ ./cbmigrate [command] [flags]

    Syntax

    $ cbmigrate [--version] [--help HELP]
    $ cbmigrate [command] [flags]
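
    As a concrete instance of the second form, the sketch below assembles a mongo migration command from flags listed under Command options. Every value (connection string, database, bucket, scope, collection, credentials) is a placeholder chosen for illustration, and the migrate_mongo wrapper and the DRY_RUN variable are hypothetical conveniences rather than part of cbmigrate. By default the sketch only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical wrapper: builds a cbmigrate mongo command line.
# All values are illustrative placeholders, not defaults of the tool.
migrate_mongo() {
  set -- mongo \
    --mongodb-uri "mongodb://localhost:27017" \
    --mongodb-database "travel" \
    --mongodb-collection "hotels" \
    --cb-cluster "couchbase://localhost" \
    --cb-username "Administrator" \
    --cb-password "password" \
    --cb-bucket "travel" \
    --cb-scope "inventory" \
    --cb-collection "hotels" \
    --cb-generate-key "hotel::%_id%"

  if [ "${DRY_RUN:-1}" = "0" ]; then
    # Run the migration for real.
    ./cbmigrate "$@"
  else
    # Dry run: print the command instead of executing it.
    printf '%s\n' "./cbmigrate $*"
  fi
}

migrate_mongo
```

    Setting DRY_RUN=0 in the environment runs the command instead of printing it. For an SSL encrypted connection, either --cb-cacert or --cb-no-ssl-verify would also need to be added.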

    Command options

    cbmigrate takes one of three optional commands, one for each source platform. Depending on the command used, cbmigrate also accepts a range of flags that supply the additional information required for the migration.

    • mongo (MongoDB)

    • dynamodb (DynamoDB)

    • hugging-face (Hugging Face)

    Table 1. Command options
    Command Flags

    mongo

    Migrate the data from a MongoDB installation to Couchbase Server.

    --mongodb-uri string

    The MongoDB connection string.

    --mongodb-database string

    The name of the database that you wish to migrate.

    --mongodb-collection string

    The name of the collection within the database you are migrating.

    --cb-username string

    The username granting access to the target cluster.

    --cb-password string

    The password (attached to --cb-username) for accessing the target cluster.

    --cb-cluster string

    The URL of the target cluster node for the import.

    --cb-bucket string

    The name of the target bucket.

    --cb-scope string

    The target scope for the migration.

    --cb-collection string

    The target collection name for the import.

    --cb-generate-key string

    Specifies a key expression used for generating a unique key for each imported document. It allows for the creation of document keys by combining static text, field values (denoted by %fieldname%), and custom generators (like #UUID#). For example, using a combination of static text, field names, and custom generators, you can generate a unique key of the form: "key::%name%::#UUID#"
    (Default: "%_id%")

    --cb-cacert string

    Specifies a CA certificate that will be used to verify the identity of the server being connected to. Either this flag or the --cb-no-ssl-verify flag must be specified when using an SSL encrypted connection.

    --cb-no-ssl-verify

    Skips the SSL verification phase. Specifying this flag will allow a connection using SSL encryption but will not verify the identity of the server you connect to.

    You are vulnerable to a man-in-the-middle attack if you use this flag.

    Either this flag or the --cb-cacert flag must be specified when using an SSL encrypted connection.

    --cb-client-cert string

    The path to a client certificate used to authenticate when connecting to a cluster. May be supplied with --cb-client-key as an alternative to the --cb-username and --cb-password flags.

    --cb-client-cert-password string

    The password for the certificate provided to the --cb-client-cert flag. When using this flag, the certificate/key pair is expected to be in the PKCS#12 format.

    --cb-client-key string

    The path to the client private key whose public key is contained in the certificate provided to the --cb-client-cert flag. May be supplied with --cb-client-cert as an alternative to the --cb-username and --cb-password flags.

    --cb-client-key-password string

    The password for the key provided to the --cb-client-key flag. When using this flag, the key is expected to be in the PKCS#8 format.

    --cb-buffer-size int

    An integer value denoting the size of the memory buffer used during the import. (Default: 10000)

    --cb-batch-size int

    The number of documents processed as a batch during the import. (Default: 200)

    --copy-indexes

    Copy indexes for the collection. (Default: true)

    --hash-document-key string

    Hash the Couchbase document key. Can be sha256 or sha512.

    --keep-primary-key

    Keep the non-composite primary key in the document. By default, if the key is a non-composite primary key, it is deleted.

    --help

    Help for the MongoDB migration parameters and flags

    --debug

    Enable debug output.
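
    To make the --cb-generate-key expression format from the table above more concrete, the following shell sketch mimics how an expression such as "key::%name%::#UUID#" might expand for a single document. It only illustrates the syntax as described here and is not cbmigrate's actual implementation. The replace_all and expand_key helpers and the sample field values are hypothetical, and the sketch assumes that uuidgen or /proc/sys/kernel/random/uuid is available to produce a UUID.

```shell
#!/bin/sh
# Replace every literal occurrence of SEARCH ($2) in STRING ($1)
# with REPLACEMENT ($3). SEARCH must not be empty.
replace_all() {
  rest=$1
  out=
  while :; do
    case $rest in
      *"$2"*)
        out=$out${rest%%"$2"*}$3   # text before the first match, then the replacement
        rest=${rest#*"$2"}         # continue after the first match
        ;;
      *) break ;;
    esac
  done
  printf '%s' "$out$rest"
}

# Hypothetical helper: expand a key expression for one document.
# Usage: expand_key EXPRESSION field=value [field=value ...]
expand_key() {
  key=$1
  shift
  for pair in "$@"; do
    # %fieldname% is replaced by the value of that field.
    key=$(replace_all "$key" "%${pair%%=*}%" "${pair#*=}")
  done
  case $key in
    *'#UUID#'*)
      # Assumption: one UUID per key, taken from uuidgen or the Linux kernel.
      uuid=$(uuidgen 2>/dev/null || cat /proc/sys/kernel/random/uuid)
      key=$(replace_all "$key" '#UUID#' "$uuid")
      ;;
  esac
  printf '%s\n' "$key"
}

# Static text "key::", the document's name field, then a generated UUID.
expand_key 'key::%name%::#UUID#' name=alice
```

    With the default expression "%_id%", the same logic would yield just the value of the document's _id field.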

    Table 2. Command options
    Command Flags

    dynamodb

    Migrate the data from a DynamoDB installation to Couchbase Server.

    --aws-access-key-id string

    Your AWS access key ID.

    --aws-ca-bundle string

    The CA certificate bundle to use when verifying SSL certificates. Overrides config/env settings.

    --aws-endpoint-url string

    Override the default AWS endpoint URL with the given URL.

    --aws-no-verify-ssl

    By default, cbmigrate uses SSL when communicating with AWS services. For each SSL connection, cbmigrate will verify SSL certificates. This option overrides the default behavior of verifying SSL certificates.

    --aws-profile string

    Use a specific AWS profile from your credential file.

    --aws-region string

    The region to use. Overrides config/env settings.

    --aws-secret-access-key string

    The AWS secret access key.

    --dynamodb-limit int

    Specifies the maximum number of items to retrieve per page during a scan operation. Use this option to control the amount of data fetched in a single request, helping to manage memory usage and API call rates during scanning.

    --dynamodb-segments int

    Specifies the total number of segments to divide the DynamoDB table into for parallel scanning. Each segment is scanned independently, allowing multiple threads or processes to work concurrently for faster data retrieval. Use this option to optimize performance for large tables. By default, the entire table is scanned sequentially without segmentation. (Default: 1)

    --dynamodb-table-name string

    The name of the table containing the requested item. You can also provide the Amazon Resource Name (ARN) of the table in this parameter.

    --cb-username string

    The username granting access to the target cluster.

    --cb-password string

    The password (attached to --cb-username) for accessing the target cluster.

    --cb-cluster string

    The URL of the target cluster node for the import.

    --cb-bucket string

    The name of the target bucket.

    --cb-scope string

    The target scope for the migration.

    --cb-collection string

    The target collection name for the import.

    --cb-generate-key string

    Specifies a key expression used for generating a unique key for each imported document. It allows for the creation of document keys by combining static text, field values (denoted by %fieldname%), and custom generators (like #UUID#). For example, using a combination of static text, field names, and custom generators, you can generate a unique key of the form: "key::%name%::#UUID#"
    (Default: "%_id%")

    --cb-cacert string

    Specifies a CA certificate that will be used to verify the identity of the server being connected to. Either this flag or the --cb-no-ssl-verify flag must be specified when using an SSL encrypted connection.

    --cb-no-ssl-verify

    Skips the SSL verification phase. Specifying this flag will allow a connection using SSL encryption but will not verify the identity of the server you connect to.

    You are vulnerable to a man-in-the-middle attack if you use this flag.

    Either this flag or the --cb-cacert flag must be specified when using an SSL encrypted connection.

    --cb-client-cert string

    The path to a client certificate used to authenticate when connecting to a cluster. May be supplied with --cb-client-key as an alternative to the --cb-username and --cb-password flags.

    --cb-client-cert-password string

    The password for the certificate provided to the --cb-client-cert flag. When using this flag, the certificate/key pair is expected to be in the PKCS#12 format.

    --cb-client-key string

    The path to the client private key whose public key is contained in the certificate provided to the --cb-client-cert flag. May be supplied with --cb-client-cert as an alternative to the --cb-username and --cb-password flags.

    --cb-client-key-password string

    The password for the key provided to the --cb-client-key flag. When using this flag, the key is expected to be in the PKCS#8 format.

    --cb-buffer-size int

    An integer value denoting the size of the memory buffer used during the import. (Default: 10000)

    --cb-batch-size int

    The number of documents processed as a batch during the import. (Default: 200)

    --copy-indexes

    Copy indexes for the collection. (Default: true)

    --hash-document-key string

    Hash the Couchbase document key. Can be sha256 or sha512.

    --keep-primary-key

    Keep the non-composite primary key in the document. By default, if the key is a non-composite primary key, it is deleted.

    --help

    Help for the DynamoDB migration parameters and flags.

    --debug

    Enable debug output.
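
    The DynamoDB flags above can be combined as in the following sketch, which requests a parallel scan in four segments with a page size of 500 items. The table name, region, profile, and all Couchbase values are placeholders chosen for illustration, and the migrate_dynamodb wrapper and the DRY_RUN variable are hypothetical conveniences rather than part of cbmigrate. Suitable segment and limit values depend on the table and are not recommendations.

```shell
#!/bin/sh
# Hypothetical wrapper: builds a cbmigrate dynamodb command line.
# All values are illustrative placeholders, not defaults of the tool.
migrate_dynamodb() {
  set -- dynamodb \
    --aws-profile "default" \
    --aws-region "us-east-1" \
    --dynamodb-table-name "Orders" \
    --dynamodb-segments 4 \
    --dynamodb-limit 500 \
    --cb-cluster "couchbase://localhost" \
    --cb-username "Administrator" \
    --cb-password "password" \
    --cb-bucket "sales" \
    --cb-scope "imports" \
    --cb-collection "orders"

  if [ "${DRY_RUN:-1}" = "0" ]; then
    # Run the migration for real.
    ./cbmigrate "$@"
  else
    # Dry run: print the command instead of executing it.
    printf '%s\n' "./cbmigrate $*"
  fi
}

migrate_dynamodb
```

    Leaving out --dynamodb-segments keeps the default of 1, so the table is scanned sequentially.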

    Table 3. Command options
    Command Flags

    hugging-face

    Migrate the data from a Hugging Face dataset to Couchbase Server.

    --path string

    The path or name of the dataset. (Required)

    --name string

    Configuration name of the dataset. (Optional)

    --data-files string

    Path(s) to the source data file(s). (Optional)

    --split string

    The split of the data to load. (Optional)

    --cache-dir string

    The cache directory to store the datasets. (Optional)

    --download-config string

    Specific download configuration parameters. (Optional)

    --download-mode reuse_dataset_if_exists | force_redownload

    Specifies whether to reuse existing downloaded data or force a fresh download. (Optional)

    --verification-mode no_checks | basic_checks | all_checks

    Sets the level of verification during the migration. (Optional)

    --keep-in-memory

    Use this flag to keep the migrated dataset in memory.

    --save-infos

    Save the dataset information. (Default: false)

    --revision string

    The version of the dataset script to load. (Optional)

    --token string

    Authentication token for private datasets. (Optional)

    --no-streaming

    Disable streaming mode for dataset loading. (Default: false)

    --num-proc int

    Number of processes to use for the migration. (Optional)

    --storage-options string

    Storage options for remote filesystems. (Optional)

    --trust-remote-code

    Allow loading arbitrary code from the dataset repository. (Optional)

    --id-fields string

    Comma-separated list of field names to use as the document ID.

    --cb-url string

    The URL for the target Couchbase cluster (e.g., couchbase://localhost).

    --cb-username string

    The username granting access to the target cluster.

    --cb-password string

    The password (attached to --cb-username) for accessing the target cluster.

    --cb-bucket string

    The name of the target bucket.

    --cb-scope string

    The target scope for the migration.

    --cb-collection string

    The target collection name for the import.

    --cb-batch-size int

    The number of documents to insert per batch. (Default: 1000)

    --help

    Show the help screen for the Hugging Face migration.
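
    A Hugging Face migration using the flags above might be assembled as in the following sketch. The dataset path, split, ID field, and all Couchbase values are placeholders chosen for illustration, and the migrate_hugging_face wrapper and the DRY_RUN variable are hypothetical conveniences rather than part of cbmigrate. Note that this command takes the cluster address through --cb-url rather than --cb-cluster.

```shell
#!/bin/sh
# Hypothetical wrapper: builds a cbmigrate hugging-face command line.
# All values are illustrative placeholders, not defaults of the tool.
migrate_hugging_face() {
  set -- hugging-face \
    --path "my-org/my-dataset" \
    --split "train" \
    --id-fields "id" \
    --cb-url "couchbase://localhost" \
    --cb-username "Administrator" \
    --cb-password "password" \
    --cb-bucket "datasets" \
    --cb-scope "imports" \
    --cb-collection "train"

  if [ "${DRY_RUN:-1}" = "0" ]; then
    # Run the migration for real.
    ./cbmigrate "$@"
  else
    # Dry run: print the command instead of executing it.
    printf '%s\n' "./cbmigrate $*"
  fi
}

migrate_hugging_face
```

    For a private dataset, the --token flag would be added with an authentication token.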