Avocado Datalake
Avocado Datalake simplifies data lake management for your organization.
Avocado Datalake Architecture
This section describes the high-level architecture of our proposed solution, which brings all of your organization's data sources into a unified data lake.

For structured data sources, we support relational databases such as MySQL, PostgreSQL, SQL Server, Amazon Aurora, and Cloud SQL. For semi-structured data sources, we support systems like MongoDB, Amazon DynamoDB, and Google Cloud Bigtable. For unstructured data sources, we support Apache Kafka streams as well as file formats such as CSV, JSON, and Parquet. Avocado Datalake pipelines can ingest all of these sources into a centralized data lake. By building on open table storage formats such as Apache Hudi, Delta Lake, or Apache Iceberg, we support Change Data Capture (CDC) and enable optimized read and write operations.
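
To illustrate what CDC support means in practice, the sketch below shows how a stream of change events (inserts, updates, deletes) could be merged into a keyed table snapshot. It is a minimal pure-Python illustration of upsert semantics, not the Avocado Datalake implementation and not the API of Hudi, Delta Lake, or Iceberg. The names ChangeEvent and apply_changes, the event fields, and the sample rows are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One hypothetical CDC record: an operation on a row identified by its key."""
    op: str                               # "insert", "update", or "delete"
    key: str                              # primary key of the affected row
    ts: int                               # commit timestamp used for ordering
    row: Optional[Dict[str, Any]] = None  # new row image; None for deletes


def apply_changes(
    table: Dict[str, Dict[str, Any]],
    events: Iterable[ChangeEvent],
) -> Dict[str, Dict[str, Any]]:
    """Return a new snapshot with the events applied in timestamp order."""
    merged = dict(table)  # shallow copy so the input snapshot is left untouched
    for event in sorted(events, key=lambda e: e.ts):
        if event.op in ("insert", "update"):
            if event.row is None:
                raise ValueError(f"{event.op} event for key {event.key!r} has no row")
            merged[event.key] = event.row   # upsert: add or overwrite the row
        elif event.op == "delete":
            merged.pop(event.key, None)     # deleting a missing key is a no-op
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
    return merged


# Example: events may arrive out of order; they are applied by timestamp.
snapshot = {"1": {"name": "Ada"}, "2": {"name": "Bob"}}
events = [
    ChangeEvent("update", "1", ts=2, row={"name": "Ada L."}),
    ChangeEvent("delete", "2", ts=3),
    ChangeEvent("insert", "3", ts=1, row={"name": "Cy"}),
]
result = apply_changes(snapshot, events)
print(result)
```

Open table formats perform a comparable merge at the storage layer, so downstream readers see a consistent table rather than a raw log of changes. A real pipeline would also need to handle concerns this sketch ignores, such as schema changes and events that share a timestamp.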