For each offering, visit the respective offering page and review the implementation details and the timeline required to implement it in your organization with your chosen cloud provider.
About our ingestion solution into Data Lake
Once ingested, the data is available in the data lake in one of the following open table formats:
  • Apache Iceberg
  • Apache Hudi
  • Delta Lake
In addition, we provide interoperability between these formats using Apache XTable.
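As a rough sketch, Apache XTable syncs table metadata between formats based on a small dataset configuration file; the source/target formats, bucket, and table names below are illustrative placeholders, not part of our setup:

```yaml
# Hypothetical XTable sync config (all names are placeholders).
sourceFormat: DELTA        # format the table is currently written in
targetFormats:             # additional formats to expose the same data as
  - ICEBERG
  - HUDI
datasets:
  - tableBasePath: s3://your-bucket/lake/orders
    tableName: orders
```

A sync run would then be invoked with the XTable utilities jar, pointing it at this config file.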
Table formats and storage for a centralized Data Lake on the cloud
Storage and open table formats supported by Avocado Datalake engineers
We attach the Data Lake to a centralized data catalog for discovery and utilization, using one of the tools below as per your organization's needs:
  • AWS Glue Data Catalog
  • GCP Data Catalog
  • Apache Atlas
  • Databricks Unity Catalog
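As one illustration, a Spark job can register an Apache Iceberg catalog backed by the AWS Glue Data Catalog with a few properties; the catalog name and warehouse bucket below are placeholders, not values from our deployment:

```properties
# Hypothetical spark-defaults.conf fragment: Iceberg catalog backed by AWS Glue.
spark.sql.catalog.glue_catalog=org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.glue_catalog.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog
spark.sql.catalog.glue_catalog.warehouse=s3://your-bucket/warehouse
spark.sql.catalog.glue_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO
```

With this in place, tables in the lake become queryable as `glue_catalog.<database>.<table>` from Spark SQL.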
After the data is available through any of the above centralized catalogs, we help you build access controls using IAM or Lake Formation with the AWS Glue Data Catalog. Access controls can be provisioned through our specialized Terraform module to grant access to various groups in your organization as needed, or you can configure the access yourself. Examples of intended access-control users are:
  • AI/ML Engineers
  • LLM Engineers
  • Data Engineers
  • Data Scientists
  • Data Analysts
  • Business Intelligence
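As a minimal sketch of what such a Terraform provisioning might look like, the following grants an IAM role read access to a single Glue table via Lake Formation; the account ID, role, database, and table names are hypothetical placeholders:

```hcl
# Hypothetical grant: read-only access to one Glue table for a Data Analysts role.
resource "aws_lakeformation_permissions" "analyst_read" {
  principal   = "arn:aws:iam::123456789012:role/data-analysts" # placeholder ARN
  permissions = ["SELECT", "DESCRIBE"]

  table {
    database_name = "sales_db" # placeholder Glue database
    name          = "orders"   # placeholder Glue table
  }
}
```

A module would typically wrap resources like this and accept the principal and table names as variables, so each group in the list above gets its own scoped grant.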
Our high-level architecture design of the above centralized data lake, as utilized by end users from various tools, is as follows:
Data-driven approach through Avocado Data Lake engineers' support, and the overall architecture of the Data Lake
For more information on the Avocado Datalake codebase, or if you want bootstrapping in your organization or full access and support for any of the above proposed solutions, email us at
We will contact you shortly to set up a free half-hour call.