Data Migration from Dell ECS to MinIO
2024-9-18 17:19:38 Author: hackernoon.com

Dell ECS clusters allow you to migrate your data to any S3-compatible store. Dell ECS calls this feature “Data Movement”, also known as copy-to-cloud. Introduced in ECS 3.8.0.1, it lets you copy objects from Dell ECS to MinIO, which is popular with customers and prospects who are modernizing their storage stack to support their AI data infrastructure requirements. Data Movement is built on top of the open-source ECS Sync tool, which copies data in parallel.

In this overview we’ll show you how to migrate data from Dell ECS to MinIO by specifically focusing on the following:

  • Configuring Source and Target Buckets
  • Setting up Data Movement Policies
  • Monitoring and Logging the migration to MinIO

Configure Source and Target Buckets

Before we can start creating the policy to migrate the data, let’s ensure the source and target buckets are configured for Data Movement.

Configure Dell ECS source bucket

Internally the Data Movement policy scans the source bucket to enumerate all objects for data movement using Metadata (MD) Search.

For Data Movement to actually move data, you need to ensure that MD Search is enabled on the Dell ECS source bucket and that LastModified is included as an indexed field.

Next, let's configure the target bucket in MinIO.

Configure MinIO Target Bucket

For the data to be transferred to MinIO, we need to create the following resources in MinIO beforehand:

  • Access and Secret Keys
  • Bucket Name
  • IAM policy

When creating the bucket, be sure to enable bucket versioning, unless the target bucket in MinIO is dedicated just for the data movement policy. Be sure to make note of the above details after creating them as their values are needed later.

The IAM policy should allow the following APIs:

  • s3:ListBucket
  • s3:GetObject
  • s3:PutObject
  • s3:DeleteObject
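As a minimal sketch, the four permissions above can be written as an S3-style policy document. The bucket name `ecs-migration-target` is a placeholder and not a value from this article. Splitting the statement in two (bucket-level and object-level) is one common layout, not the only valid one.

```python
import json

# Placeholder bucket name (assumption); replace with your MinIO target bucket.
TARGET_BUCKET = "ecs-migration-target"


def build_policy(bucket: str) -> dict:
    """Build an S3-style policy granting the four actions listed above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # ListBucket is a bucket-level action.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Get/Put/Delete are object-level actions on keys in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


policy = build_policy(TARGET_BUCKET)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the policy editor in the MinIO console and then attached to the access key used for the migration.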

Follow this guide to learn how to create IAM policies, Access Keys and Buckets in MinIO console.

To log in to the MinIO console:

  1. Enter the username that was set while configuring the cluster.
  2. Enter the password that was set while configuring the cluster.
  3. Click “Login”.

Once logged in, the console menu provides the following sections:

  1. Object Browser: Buckets that have been created, and the data uploaded to them, are shown here.
  2. Access Keys: AWS IAM-style access keys.
    1. Create Access Key: Click here to create an access key and secret key separate from the ones used to launch the cluster.
  3. Buckets: Lists all the buckets that are available.
    1. Create Bucket: If there are no buckets, go ahead and create a new one.
  4. Policies: IAM policies.
  5. Identity: Create and connect various IDPs such as OpenID and LDAP.
  6. Monitoring: Monitor all aspects of the cluster and even send the metrics to Prometheus.

Once the Source and Target buckets are configured, let's set up the Data Movement Policy.

Data Movement Policy

A Data Movement Policy is a definition in Dell ECS, set via either the UI or the API, that specifies which objects in a Dell ECS source bucket should be copied to a MinIO target bucket. The Data Movement policy scan jobs are triggered automatically but can be paused or resumed at any time. This is very similar to MinIO’s batch replication process. By default, the Data Movement policy migrates the data to MinIO in order of LastModified time.

We’ll show you two different Data Movement scenarios to give you an idea of how this could work, but the sky's the limit when it comes to how you would want to do the Migration.

Data Movement to MinIO

In this configuration we’ll add the necessary MinIO bits and bobs for Dell ECS to communicate with. The first step, once MD Search is enabled, is to set Data Mobility to ON.

Once the Data Mobility is set to ON, we can go ahead and configure the policy.

  • Endpoint: Set this to the MinIO endpoint http://<minio_ip>:<minio_port>
  • Access and Secret Key: This was created in the MinIO console and saved in a previous step.
  • Bucket Name: MinIO target bucket name
  • Logging Bucket: This is the bucket in Dell ECS that logs any errors during the migration.
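Before entering these values in the ECS UI, it can help to sanity-check them. The sketch below is only a pre-flight check. The class and field names are illustrative and are not ECS API parameter names, and the sample values are made up.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class DataMovementTarget:
    # Illustrative field names (assumption); they mirror the settings listed above.
    endpoint: str
    access_key: str
    secret_key: str
    target_bucket: str
    logging_bucket: str

    def problems(self) -> list:
        """Return a list of human-readable issues; an empty list means OK."""
        issues = []
        parsed = urlparse(self.endpoint)
        if parsed.scheme not in ("http", "https"):
            issues.append("endpoint must start with http:// or https://")
        if not parsed.hostname:
            issues.append("endpoint is missing a host")
        for name, value in vars(self).items():
            if not value.strip():
                issues.append(f"{name} is empty")
        return issues


# Made-up sample values for illustration only.
target = DataMovementTarget(
    endpoint="http://10.0.0.5:9000",
    access_key="example-access-key",
    secret_key="example-secret-key",
    target_bucket="ecs-migration-target",
    logging_bucket="dm-logs",
)
print(target.problems())
```

A common mistake this catches is leaving the scheme off the endpoint, for example `10.0.0.5:9000` instead of `http://10.0.0.5:9000`.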

There are other settings as well; I’ve covered only the most important ones here. Please note that if data is deleted from the Dell ECS source bucket after the migration has completed, it won’t be deleted from the MinIO target bucket.

Data Movement with Dremio to MinIO

Now let's take a look at what the migration would look like for an application that uses Dremio.

There are a few steps that take place during this migration:

  1. A customer facing application writes to a Dell ECS bucket.
  2. ECS copies the data to a staging bucket in MinIO, as configured in the Data Movement policy.
  3. Data is copied over to the staging bucket.
  4. The MinIO staging bucket will use Event Notification to send a message to RabbitMQ, which Dremio will be subscribed to.
  5. Dremio reads the message and ingests the data from the MinIO staging bucket to the Dremio bucket in MinIO.
  6. Once the data is ingested, you can clean up the MinIO staging bucket using a lifecycle policy.
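To make step 4 more concrete, here is a rough sketch of what a consumer could extract from a bucket notification message. It assumes the message body follows the S3-style event layout (`Records[].s3.bucket.name` and a URL-encoded `Records[].s3.object.key`). The sample message is made up, and how Dremio actually consumes the queue is not covered here.

```python
import json
from urllib.parse import unquote_plus


def staged_objects(message_body: str) -> list:
    """Return (bucket, key) pairs for newly created objects in one event message."""
    event = json.loads(message_body)
    created = []
    for record in event.get("Records", []):
        # Only object-creation events matter for ingestion.
        if "ObjectCreated" not in record.get("eventName", ""):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3-style events, so decode them.
        key = unquote_plus(record["s3"]["object"]["key"])
        created.append((bucket, key))
    return created


# Made-up sample message for illustration only.
sample = json.dumps({
    "Records": [
        {
            "eventName": "s3:ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "staging"},
                "object": {"key": "reports/test+file.pdf"},
            },
        },
        {
            "eventName": "s3:ObjectRemoved:Delete",
            "s3": {
                "bucket": {"name": "staging"},
                "object": {"key": "old.pdf"},
            },
        },
    ]
})
print(staged_objects(sample))
```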

These are just two examples but you can use this methodology to migrate from any application using Dell ECS to MinIO.

Data Movement Monitoring and Logging

During the data migration, it's important to keep an eye on overall progress as data moves to MinIO. The Dell ECS GUI provides overview dashboards with advanced monitoring that show total objects copied, total bytes copied, watermark lag, total errors, objects copied, and bytes copied, among others.

You can further drill down to show source/target specific information such as the object count and bucket size over a selected period during the migration process.

In the initial diagram at the start of this blog we showed a log bucket on the ECS side. This is where all the operations from Data Movement are logged. It is very helpful for debugging issues during the migration, especially when the migration takes a very long time due to hardware and physical constraints.

Here is an example of what the logs look like:

2024-08-31T11:40:51Z DM.COPY demo sourcebucket ASIAD708D0875B4F32F8 test.pdf 2022-08-31T09:30:52Z 1,951,137 5895c19c9e742a88d1bec75d40288e0f http://targetendpoint targetbucket AKIA7A04FF4B251997E0 288 SUCCESS
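For triaging a large log bucket, a small parser can help. The sketch below splits the sample line above on whitespace. The field names are a reading of that single sample and not a documented schema. In particular, `namespace` and `unlabeled_number` are guesses (the latter may be a duration), and object keys containing spaces would break this simple split.

```python
from typing import NamedTuple


class CopyLogEntry(NamedTuple):
    # Field names are inferred from one sample line (assumption).
    logged_at: str
    operation: str
    namespace: str
    source_bucket: str
    source_access_key: str
    object_key: str
    last_modified: str
    size_bytes: int
    etag: str
    target_endpoint: str
    target_bucket: str
    target_access_key: str
    unlabeled_number: int
    status: str


def parse_line(line: str) -> CopyLogEntry:
    """Parse one whitespace-separated Data Movement log line."""
    parts = line.split()
    if len(parts) != 14:
        raise ValueError(f"expected 14 fields, got {len(parts)}")
    return CopyLogEntry(
        *parts[:7],
        int(parts[7].replace(",", "")),  # size is printed with thousands separators
        *parts[8:12],
        int(parts[12]),
        parts[13],
    )


SAMPLE = (
    "2024-08-31T11:40:51Z DM.COPY demo sourcebucket ASIAD708D0875B4F32F8 "
    "test.pdf 2022-08-31T09:30:52Z 1,951,137 5895c19c9e742a88d1bec75d40288e0f "
    "http://targetendpoint targetbucket AKIA7A04FF4B251997E0 288 SUCCESS"
)
entry = parse_line(SAMPLE)
print(entry.object_key, entry.size_bytes, entry.status)
```

Filtering on `entry.status != "SUCCESS"` is a quick way to list the objects that need a second look.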

Why migrate to MinIO?

MinIO is a single Go binary that can be launched in many different types of cloud and on-prem environments. It's very lightweight, but is also feature packed with things like replication and encryption, and it provides integrations with various applications.

We’ve benchmarked it at 325 GiB/s (349 GB/s) on GETs and 165 GiB/s (177 GB/s) on PUTs with just 32 nodes of off-the-shelf NVMe SSDs, and it is used to build data lakes/lakehouses and to run analytics and AI/ML workloads.

Not only that but out of the box MinIO also includes:

  • Encryption: MinIO supports both encryption at Rest and in Transit. This ensures that data is encrypted in all facets of the transaction from the moment the call is made till the object is placed in the bucket.

  • Bitrot Protection: There are several reasons data can be corrupted on physical disks. It could be due to voltage spikes, bugs in firmware, misdirected reads and writes among other things. MinIO ensures that these are captured and fixed on the fly to ensure data integrity.

  • Erasure Coding: Rather than ensure redundancy of data using RAID which adds additional overhead on performance, MinIO uses this data redundancy and availability feature to reconstruct objects on the fly without any additional hardware or software.

  • Secure Access ACLs and PBAC: Supports IAM S3-style policies with built in IDP, see MinIO Best Practices - Security and Access Control for more information.

  • Tiering: For data that doesn’t get accessed as often you can siphon off data to another cold storage running MinIO so you can optimize the latest data on your best hardware without the unused data taking space.

  • Object Locking and Retention: MinIO supports object locking (retention), which enforces write once, read many (WORM) operations for duration-based retention and indefinite legal hold. This allows for key data retention compliance and meets SEC17a-4(f), FINRA 4511(C), and CFTC 1.31(c)-(d) requirements.

Not to mention, good software is nothing without good support. MinIO provides some of the best support out there through our SUBNET portal. Engineers who work on the MinIO core code base answer questions directly in a Slack-style interactive and collaborative medium. Rather than endlessly escalating your issue to the next-level engineer, you speak with people who are capable of resolving any issue you come across. We’ve even had customers come back to us after going with a competitor's storage platform because of its lack of proper support. You can design your storage with all the features under the sun, but if you do not promptly support your customers it's of no use. For us, supporting our customers and making them successful is priority #1.

If you have any questions on how to migrate your data from Dell ECS to MinIO, be sure to reach out to us on Slack!


Source: https://hackernoon.com/data-migration-from-dell-ecs-to-minio?source=rss