Semi-structured data sources and databases we support for ingestion into your data lake:
  • MongoDB
  • AWS DynamoDB
  • Google Bigtable
  • Kafka messages
  • Any other NoSQL database
  • Semi-structured data formats such as XML, Avro, and Parquet
  • Semi-structured files such as JSON or CSV
Our semi-structured data pipeline ETL job lets you process and transform data from sources such as MongoDB and AWS DynamoDB and load it into your centralized data lake. This enables you to gain deeper insights from your data and improve decision-making using AI/ML, data insights, or data analytics.
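To illustrate the kind of transformation such an ETL job typically performs, here is a minimal sketch that flattens a nested document into a single flat row suitable for a tabular data lake format. It is not the pipeline's actual implementation: the `flatten` helper, the dotted-key convention, and the sample document are all assumptions made for illustration.

```python
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys.

    Hypothetical helper for illustration only.
    """
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into sub-documents, carrying the key prefix along
            flat.update(flatten(value, full_key, sep))
        elif isinstance(value, list):
            # Keep arrays as JSON strings so each document stays one row
            flat[full_key] = json.dumps(value)
        else:
            flat[full_key] = value
    return flat


# Made-up document, shaped like something a NoSQL collection might hold
doc = {
    "_id": "u1",
    "name": "Ada",
    "address": {"city": "London", "geo": {"lat": 51.5}},
    "tags": ["a", "b"],
}
row = flatten(doc)
```

Serializing arrays as JSON strings is only one option; a real job might instead explode them into separate rows, depending on how the data will be queried.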

Semi-Structured Data Sources (NoSQL databases) to Data Lake high-level design flow:

RDBMS to centralized Data Lake Architecture
We fully support reading from RDBMS sources (through a JDBC connection or by parsing the binlog) and ingesting the data into a centralized data lake on cloud storage such as AWS S3, GCP GCS, or Azure Blob Storage.
More details about the offering, including pipeline orchestration and setup timeline, will be available here soon. To stay updated, feel free to reach out to us at: [email protected]
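As a rough sketch of the connection-based read path, the example below pulls a relational table in batches and writes each batch as a JSON Lines file. Everything here is an assumption for illustration: Python's built-in `sqlite3` stands in for a JDBC connection, a local temporary directory stands in for a cloud storage bucket, and `export_table`, the `orders` table, and the file naming scheme are hypothetical. Binlog parsing is not covered.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def export_table(conn, table, out_dir, batch_size=2):
    """Read a table in batches and write each batch as a JSON Lines file.

    Hypothetical helper; `table` is assumed to be a trusted identifier.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in cur.description]
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    files = []
    part = 0
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        # One output file per batch, e.g. orders-part-00000.jsonl
        path = out_dir / f"{table}-part-{part:05d}.jsonl"
        with path.open("w", encoding="utf-8") as f:
            for row in rows:
                f.write(json.dumps(dict(zip(columns, row))) + "\n")
        files.append(path)
        part += 1
    return files


# In-memory database with made-up rows, standing in for a source RDBMS
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0), (3, 3.25)]
)
files = export_table(conn, "orders", tempfile.mkdtemp())
```

Reading in fixed-size batches keeps memory use bounded for large tables; in practice the output files would be uploaded to object storage rather than left on local disk.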